Fugu-MT: arxivの論文翻訳

このサイトではarxivの論文のうち、30ページ以下でCreative Commonsライセンス（CC 0, CC BY, CC BY-SA）の論文を日本語訳しています。本文がCCでない論文、長すぎる論文はメタデータのみを翻訳しています。（arxivのメタデータは CC 0です。）翻訳文のライセンスはCC BY-SA 4.0です。翻訳にはFugu-Machine Translatorを利用しています。

本サイトの運営者は本サイト（すべての情報・翻訳含む）の品質を保証せず、本サイト（すべての情報・翻訳含む）を使用して発生したあらゆる結果について一切の責任を負いません。

公開日が20210118となっている論文です。

Title	Authors	Abstract	論文公表日・翻訳日
# 最適戦略を用いた量子状態検証の標準化に向けて Towards the standardization of quantum state verification using optimal strategies ( http://arxiv.org/abs/2002.00640v2 ) ライセンス: Link先を確認	Xinhe Jiang, Kun Wang, Kaiyi Qian, Zhaozhong Chen, Zhiyu Chen, Liangliang Lu, Lijun Xia, Fangmin Song, Shining Zhu, Xiaosong Ma	(参考訳) 絡み合った状態を生成する量子デバイスは広く研究され、広く使われている。そのため、特定されたデバイスが本当に確実に、かつ効率的に動作するかどうかを確認する必要がある。本稿では,フォトニックプラットフォームを用いた局所的測定(非適応的)とアクティブフィードフォワード操作(適応的)の両方を用いて,提案する2量子ビットエンタングル状態検証手法を実験的に実現する。約3283/536のコピー(N$)は、非適応的/適応的戦略に対するターゲット量子状態を検証するために99%の信頼を得るために必要である。これらの最適戦略は、パラメータ $r=-1$ を持つ $N$$$\epsilon$$\sim$$N^r$ の関数として不忠実な $\epsilon$ のハイゼンベルクスケーリングを提供し、$r=-0.5$ の標準量子極限を超える。非適応的および適応的戦略にそれぞれ$r=-0.88\pm$0.03と$-0.78\pm$0.07のスケーリングパラメータを得る。我々の実験は量子状態の検証のための標準化された手順として機能する可能性がある。 Quantum devices for generating entangled states have been extensively studied and widely used. As so, it becomes necessary to verify that these devices truly work reliably and efficiently as they are specified. Here, we experimentally realize the recently proposed two-qubit entangled state verification strategies using both local measurements (nonadaptive) and active feed-forward operations (adaptive) with a photonic platform. About 3283/536 number of copies ($N$) are required to achieve a 99% confidence to verify the target quantum state for nonadaptive/adaptive strategies. These optimal strategies provide the Heisenberg scaling of the infidelity $\epsilon$ as a function of $N$ ($\epsilon$ $\sim$ $N^r$) with the parameter $r=-1$, exceeding the standard quantum limit with $r=-0.5$. We experimentally obtain the scaling parameter of $r=-0.88\pm$0.03 and $-0.78\pm$0.07 for nonadaptive and adaptive strategies, respectively. Our experimental work could serve as a standardized procedure for the verification of quantum states.	翻訳日:2023-06-04 20:50:50 公開日:2021-01-18
# 量子暗号を用いたセキュアな対称的個人情報検索 Provably-secure symmetric private information retrieval with quantum cryptography ( http://arxiv.org/abs/2004.13921v2 ) ライセンス: Link先を確認	Wen Yu Kon, Charles Ci Wen Lim	(参考訳) プライベート情報検索 (pir) は、ユーザが興味のあるデータベースの特定のエントリを学習できるが、そのクエリはデータセンターから隠蔽されるという、ユーザのプライバシーを提供するデータベースクエリプロトコルである。シンメトリ・プライベート情報検索(SPIR)は、ユーザがデータベースの追加エントリを学習できないデータベースプライバシを付加することで、PIRをさらに強化する。複数のデータベースを持つ無条件でセキュアなSPIRソリューションは古典的には知られているが、セキュアな通信とプロトコル内のランダムな共有のために、パーティ間で長い秘密鍵を必要とするため、非現実的である。本稿では,セキュアな通信と共有ランダム性要件の両方を実現するための実装として,量子鍵分布(QKD)を提案する。我々は、QKDがSPIRプロトコルのセキュリティを維持しており、外部の盗聴者に対しても安全であることを証明した。また,測定装置に依存しないQKDによって生成される鍵を持つ2データベースSPIRプロトコルの例を用いて,このような古典量子システムを実際に実装する方法を示す。キーレート計算により,現在のqkd技術を用いて,都市レベルで実現可能であることを示す。 Private information retrieval (PIR) is a database query protocol that provides user privacy, in that the user can learn a particular entry of the database of his interest but his query would be hidden from the data centre. Symmetric private information retrieval (SPIR) takes PIR further by additionally offering database privacy, where the user cannot learn any additional entries of the database. Unconditionally secure SPIR solutions with multiple databases are known classically, but are unrealistic because they require long shared secret keys between the parties for secure communication and shared randomness in the protocol. Here, we propose using quantum key distribution (QKD) instead for a practical implementation, which can realise both the secure communication and shared randomness requirements. We prove that QKD maintains the security of the SPIR protocol and that it is also secure against any external eavesdropper. We also show how such a classical-quantum system could be implemented practically, using the example of a two-database SPIR protocol with keys generated by measurement device-independent QKD. Through key rate calculations, we show that such an implementation is feasible at the metropolitan level with current QKD technology.	翻訳日:2023-05-21 19:46:14 公開日:2021-01-18
# 調和に閉じ込められた相互作用粒子の2体クエンチダイナミクス Two-body quench dynamics of harmonically trapped interacting particles ( http://arxiv.org/abs/2005.01235v4 ) ライセンス: Link先を確認	A. D. Kerin and A. M. Martin	(参考訳) 我々は、相互作用強度が1つの値から別の値にキューチされる3次元等方性トラップにおける相互作用原子対の量子進化を考える。静的問題の厳密な解を用いることで、初期状態と最終状態の重なりや2つの原子間の分離の期待値など、時間依存の観測可能性を評価することができる。相互作用が非相互作用的状態から強い相互作用的状態に切り替わる場合、あるいはその逆の場合、分析結果が得られる。初期状態と最終状態の重なりを調べると、相互作用が非相互作用から強相互作用状態へと縮められるとき、初期の依存ダイナミクスは単一の不純物多体極限における理論的な仕事と一致することが分かる。系が強い状態から非相互作用状態へと焼成されると、相互作用ポテンシャルのゼロレンジの性質による対数発散から生じる2つの原子の分離における大きな振動を予測する。 We consider the quantum evolution of a pair of interacting atoms in a three dimensional isotropic trap where the interaction strength is quenched from one value to another. Using exact solutions of the static problem we are able to evaluate time-dependent observables such as the overlap between initial and final states and the expectation value of the separation between the two atoms. In the case where the interaction is quenched from the non-interacting regime to the strongly interacting regime, or vice versa, we are able to obtain analytic results. Examining the overlap between the initial and final states we show that when the interaction is quenched from the non-interacting to strongly interacting regimes the early time dependence dynamics are consistent with theoretical work in the single impurity many-body limit. When the system is quenched from the strongly to non-interacting regime we predict large oscillations in the separation between the two atoms, which arises from a logarithmic divergence due to the zero-range nature of the interaction potential.	翻訳日:2023-05-21 05:30:09 公開日:2021-01-18
# 高絡み合い状態からの量子コード構築の修正法 Modifying method of constructing quantum codes from highly entangled states ( http://arxiv.org/abs/2005.01426v3 ) ライセンス: Link先を確認	Zahra Raissi	(参考訳) 古典符号、高度に絡み合った純粋状態(k-ユニフォームまたは絶対最大絡み合う(AME)状態)と量子誤り訂正符号(QECC)の間には関連がある。これにより、k一様状態または対応する古典コードから開始し、各ステップで1つのパーティをトレースして安定化器qeccを構築する体系的な方法が導かれる。我々は、古典符号の対応する生成行列に部分的トレースが原因となる変化を記述することにより、コードワード、エンコーディング手順、およびQECCの安定化形式について明示的な構成を提供する。次に、この方法を変更して、論理的なquditをAME状態に分散した部分空間にエンコードする安定化器QECCを生成する。この構成は、パーティを追跡せずにAME状態から始まる量子コードを生成する。したがって、より大きな符号空間を持つ量子安定化符号を構築することができる。 There is a connection between classical codes, highly entangled pure states (called k-uniform or absolutely maximally entangled (AME) states), and quantum error correcting codes (QECCs). This leads to a systematic method to construct stabilizer QECCs by starting from a k-uniform state or the corresponding classical code and tracing out one party at each step. We provide explicit constructions for codewords, encoding procedure and stabilizer formalism of the QECCs by describing the changes that partial traces cause on the corresponding generator matrix of the classical codes. We then modify the method to produce another set of stabilizer QECCs that encode a logical qudit into a subspace spanned by AME states. This construction produces quantum codes starting from an AME state without tracing out any party. Therefore, quantum stabilizer codes with larger codespace can be constructed.	翻訳日:2023-05-21 05:24:27 公開日:2021-01-18
# 2レベルゆらぎのアンサンブルからの正・負の周波数雑音 Positive- and negative-frequency noise from an ensemble of two-level fluctuators ( http://arxiv.org/abs/2005.03591v2 ) ライセンス: Link先を確認	Xinyuan You, Aashish A. Clerk, Jens Koch	(参考訳) 発散性2レベルゆらぎのアンサンブルのブロッホ・レッドフィールド処理に基づく電荷ノイズの解析は、一般的には発散定理に違反する。標準的なマルコフ近似(浴槽に結合した2段のゆらぎに適用される場合)は、この故障の主な原因として特定できる。結果として生じる脱コヒーレンス速度は、変動周波数でのバス応答のみを含み、周波数拡大の効果を完全に無視する。この問題を克服するための体系的かつ計算学的に便利な方法は、スペクター・キュービット法を用いることである: 補助キュービットを2レベルゆらぎのアンサンブルに結合することにより、$S({\omega})$に対する解析近似をゆらぎ散逸定理と完全に整合できる。本稿では, クロスオーバー周波数の温度依存性を考慮した1/f$から1/f^2$のクロスオーバーを含む, いくつかの周波数範囲で異なる挙動を示すノイズの特性について論じる。 The analysis of charge noise based on the Bloch-Redfield treatment of an ensemble of dissipative two-level fluctuators generally results in a violation of the fluctuation-dissipation theorem. The standard Markov approximation (when applied to the two-level fluctuators coupled to a bath) can be identified as the main origin of this failure. The resulting decoherence rates only involve the bath response at the fluctuator frequency, and thus completely neglect the effects of frequency broadening. A systematic and computationally convenient way to overcome this issue is to employ the spectator-qubit method: by coupling an auxiliary qubit to the two-level fluctuator ensemble, an analytical approximation for $S({\omega})$ fully consistent with the fluctuation-dissipation theorem can be obtained. We discuss the resulting characteristics of the noise which exhibits distinct behavior over several frequency ranges, including a $1/f$ to $1/f^2$ crossover with a $T^3$ temperature dependence of the crossover frequency.	翻訳日:2023-05-20 22:25:38 公開日:2021-01-18
# 強レーザー駆動下での励起エネルギー移動 Excitation Energy Transfer under Strong Laser Drive ( http://arxiv.org/abs/2005.04719v2 ) ライセンス: Link先を確認	Xuanhua Wang, Zhedong Zhang, Jin Wang	(参考訳) 強い分子-光相互作用は分子構造と動的過程の制御を可能にする。光キャビティにより分子が強く駆動される共鳴エネルギー伝達の分子間距離を大幅に向上させるために、強いレーザー駆動を持つモデルを提案する。エネルギー移動の最適ラビ周波数と量子収率は、双極子-双極子相互作用と分子-キャビティカップリングのトレードオフから生じる。特定のラビ周波数での強い駆動を印加すると, 共振エネルギー伝達のF\"オルスター機構と比較して, 有効エネルギー移動の空間範囲と, 距離の遅い減衰速度が観察される。我々の研究は、分子ポラリトンにおける協調エネルギー移動の分光学的研究に光を当てている。 Strong molecule-light interaction enables the control of molecular structures and dynamical processes. A model with strong laser drive is proposed to greatly enhance the intermolecular distance of resonant energy transfer, where the molecules are strongly driven by an optical cavity. The optimal Rabi frequency and quantum yield of energy transfer are observed, resulting from the trade off between dipole-dipole interaction and molecule-cavity coupling. When the strong drive at certain Rabi frequency is applied, a larger spatial range of effective energy transfer and a slower decay rate with the distance compared to the F\"orster mechanism of resonant energy transfer are observed in our model. Our work sheds light on spectroscopic study of the cooperative energy transfer in molecular polaritons.	翻訳日:2023-05-20 16:07:49 公開日:2021-01-18
# 相対論的量子情報における粒子検出器モデルの破れ共分散 Broken covariance of particle detector models in relativistic quantum information ( http://arxiv.org/abs/2006.12514v3 ) ライセンス: Link先を確認	Eduardo Mart\'in-Mart\'inez, T. Rick Perche and Bruno de S. L. Torres	(参考訳) 量子場に結合した空間的スメア粒子検出器の予測は、一般に点的極限の外側で共変ではないことを示す。この共変の欠如は、時間順序演算における曖昧さとして現れている。共分散の崩壊が、unruh-dewittモデルのような量子場理論における典型的な検出器モデルにどのように影響するかを分析する。具体的には,共分散の破れが検出器-場系の状態,検出器の形状と運動状態,時空幾何にどのように依存するかを示す。さらに,違反の大きさを明示的に評価するツールを提供し,摂動解析においてスメア検出器の予測が正確に,あるいはほぼ共変している状態を特定する。 We show that the predictions of spatially smeared particle detectors coupled to quantum fields are not generally covariant outside the pointlike limit. This lack of covariance manifests itself as an ambiguity in the time-ordering operation. We analyze how the breakdown of covariance affects typical detector models in quantum field theory such as the Unruh-DeWitt model. Specifically, we show how the violations of covariance depend on the state of the detectors-field system, the shape and state of motion of the detectors, and the spacetime geometry. Furthermore, we provide the tools to explicitly evaluate the magnitude of the violation, and identify the regimes where the predictions of smeared detectors are either exactly or approximately covariant in perturbative analyses.	翻訳日:2023-05-13 04:51:52 公開日:2021-01-18
# 量子レーダー入門 Introduction to quantum radar ( http://arxiv.org/abs/2006.14238v3 ) ライセンス: Link先を確認	Ricardo Gallego Torrom\'e, Nadya Ben Bekhti-Winkel and Peter Knott	(参考訳) 量子絡み合いと量子相関の概念を簡潔に紹介した後、量子照明と他のプロトコルに基づく量子レーダのいくつかのスキームについて論じる。我々は,レーダアプリケーションのための量子生成および/または検出量子センシングプロトコルの実装におけるいくつかの本質的な困難を克服するために導入された異なる概念をレビューする。本レビューは, 異なる概念の実現可能性の評価を事例として, 最先端の最先端技術に関する最新の批判的プレゼンテーションである。また、レビューを現場の非専門家に公開することを目標としています。そのため、いくつかの付録と技術用語集が含まれている。 After a brief introduction to the notion of quantum entanglement and quantum correlations, several schemes for a quantum radar based upon the quantum illumination and others protocols are discussed. We review different concepts that have been introduced to overcome several of the inherent difficulties in the implementation of quantum generation and/or detection quantum sensing protocols for RADAR applications. Our review is an up-to date critical presentation of the state of the art, with emphasis in the case by case assessment of the feasibility of the different concepts. We also aim that the review is accessible to non-experts in the field. Hence several appendixes and a technical glossary are included.	翻訳日:2023-05-12 20:02:37 公開日:2021-01-18
# マジックステート蒸留のための測定シーケンス Measurement sequences for magic state distillation ( http://arxiv.org/abs/2007.07929v3 ) ライセンス: Link先を確認	Jeongwan Haah, Matthew B. Hastings	(参考訳) マジック状態蒸留(magic state distillation)は入力状態のエラーを抑制するために特別な符号を使用する。本稿では,量子ビット間の誤りの独立性を仮定して,任意の部分の誤りを抑制できるマジック状態蒸留プロトコルの詳細な測定手順を提案する。入力魔法の状態と合わせて、このプロトコルは2次元の正方形グリッド上で動作し、キュービットの水平ペアに$zz$、垂直ペアに$xx$、単一キュービットに$z,x$の測定を行う。 Magic state distillation uses special codes to suppress errors in input states, which are often tailored to a Clifford-twirled error model. We present detailed measurement sequences for magic state distillation protocols which can suppress arbitrary errors on any part of a protocol, assuming the independence of errors across qubits. Provided with input magic states, our protocol operates on a two-dimensional square grid by measurements of $ZZ$ on horizontal pairs of qubits, $XX$ on vertical pairs, and $Z,X$ on single qubits.	翻訳日:2023-05-09 09:03:49 公開日:2021-01-18
# 未知量子ビットの遠方における交換自由計算 Exchange-Free Computation on an Unknown Qubit at a Distance ( http://arxiv.org/abs/2008.00841v4 ) ライセンス: Link先を確認	Hatim Salih, Jonte R. Hance, Will McCutcheon, Terry Rudolph, and John Rarity	(参考訳) 我々は任意の量子ビットを直接操作する方法を示し、粒子の交換は行わない。これは、リモート古典的ボブによるアリスにおける任意の量子状態の交換自由な準備を含む。その結果,遠隔の第三者の未知の量子ビット上での任意の計算交換自由なプログラムにより,一方の第三者が直接実行可能なプロトコルを提案することができた。さらに、これを普遍的な2量子ビットゲートの交換自由制御に利用する方法を示し、プログラム可能な量子回路上で任意の所望のアルゴリズムを直接実行することが可能であることを示す。 We present a way of directly manipulating an arbitrary qubit, without the exchange of any particles. This includes as an application the exchange-free preparation of an arbitrary quantum state at Alice by a remote classical Bob. As a result, we are able to propose a protocol that allows one party to directly enact, by means of a suitable program, any computation exchange-free on a remote second party's unknown qubit. Further, we show how to use this for the exchange-free control of a universal two-qubit gate, thus opening the possibility of directly enacting any desired algorithm remotely on a programmable quantum circuit.	翻訳日:2023-05-07 06:47:24 公開日:2021-01-18
# DLTベースのCOVID-19パスポートのためのフレームワーク Framework for a DLT Based COVID-19 Passport ( http://arxiv.org/abs/2008.01120v7 ) ライセンス: Link先を確認	Sarang Chaudhari, Michael Clear and Hitesh Tewari	(参考訳) 日常的に対話するさまざまなネットワークをまたいで個人を識別することは、私たちが住んでいるデジタル世界にとっての課題であり、セキュアで効率的なプライバシー保護id機構の開発は重要な研究分野となっている。さらに、Bitcoinのような分散型意思決定ネットワークの人気は、エンドユーザーの認証情報を保管し、安全に広めるために分散型台帳技術を使うことに大きな関心を寄せている。本稿では、新型コロナウイルスのワクチン接種の詳細を公開され、分散化され、不変なブロックチェーン上に保存し、バイオメトリック・暗号ハッシュ技術を用いて各ユーザー固有の識別子を生成する2要素認証システムを利用するメカニズムについて述べる。私たちの主な貢献は、ユーザーを認証し、匿名でブロックチェーン上の予防接種記録を見つけるのに使える虹彩抽出技術に対して、確実にセキュアで局所性に敏感なハッシュアルゴリズムを使用することです。 Uniquely identifying individuals across the various networks they interact with on a daily basis remains a challenge for the digital world that we live in, and therefore the development of secure and efficient privacy preserving identity mechanisms has become an important field of research. In addition, the popularity of decentralised decision making networks such as Bitcoin has seen a huge interest in making use of distributed ledger technology to store and securely disseminate end user identity credentials. In this paper we describe a mechanism that allows one to store the COVID-19 vaccination details of individuals on a publicly readable, decentralised, immutable blockchain, and makes use of a two-factor authentication system that employs biometric cryptographic hashing techniques to generate a unique identifier for each user. Our main contribution is the employment of a provably secure input-hiding, locality-sensitive hashing algorithm over an iris extraction technique, that can be used to authenticate users and anonymously locate vaccination records on the blockchain, without leaking any personally identifiable information to the blockchain.	翻訳日:2023-05-07 06:27:59 公開日:2021-01-18
# 離散回転対称性を持つ量子ドットアレイからのねじれ光の放出 Emission of twisted light from quantum dot arrays with a discrete rotational symmetry ( http://arxiv.org/abs/2008.03908v3 ) ライセンス: Link先を確認	H. T. Sullivan, J. H. Cole	(参考訳) 量子ドットの円形配列の光学的性質を理論的に検討する。円形エミッタアレイ(CEA)と呼ばれるこの構造は、軌道角運動量を運ぶ光と同様に、円偏光の放出と吸収を通じて光学角運動量と交換することができる。バンド間およびバンド内遷移率の両方について解析式を導出し、選択規則を決定する。 ceaをモデル化する場合、量子ドットの重要な特性のみが考慮される。これにより、我々のモデルは様々な量子ドットからなるCEAに適用可能である。最後に、CEAの特定の光学特性をチューニングするための設計原理を決定する。これにより、光学角運動量に逆転するCEAの表面からなるメタマテリアルを設計する可能性が開ける。 We theoretically explore the optical properties of a circular array of quantum dots. This structure, that we call a circular emitter array (CEA), can exchange optical angular momentum via the emission and absorption of circularly polarised light as well as light carrying orbital angular momentum. Analytical expressions are derived for both interband and intraband transition rates and selection rules are determined. Only the key properties of the quantum dots are considering when modelling the CEA. This extends the applicability of our model to CEAs composed of a variety of quantum dots. Finally design principles for the tuning of the specific optical properties of the CEA are determined. This opens up the prospect of the designing a metamaterial, consisting of a surface of CEAs, that upconverts optical angular momentum.	翻訳日:2023-05-06 16:09:21 公開日:2021-01-18
# 超強結合系における分光と臨界量子温度測定 Spectroscopy and critical quantum thermometry in the ultrastrong coupling regime ( http://arxiv.org/abs/2009.01994v2 ) ライセンス: Link先を確認	M. Salado-Mej\'ia, R. Rom\'an-Ancheyta, F. Soto-Eguibar and H. M. Moya-Cessa	(参考訳) 我々は、異方性ホップフィールドモデルの厳密な解析解を示し、2つの超強結合量子系のスペクトルおよび熱的応答を詳細に研究する。興味深いことに,結合系の初期状態によっては,真空ラビ分裂は,逆直観的疎結合効果のスペクトルシグネチャと考えられる重要な非対称性を示す。量子熱力学応用のための温度計として結合系を用い,超強結合法で有効な温度推定の究極の境界を求める。驚くべきことに、もしシステムが量子相転移を行うと、量子フィッシャー情報は周期的な発散を示し、そのような臨界量子センサに対して任意に高い温度測定精度を持つことができる。 We present an exact analytical solution of the anisotropic Hopfield model, and we use it to investigate in detail the spectral and thermometric response of two ultrastrongly coupled quantum systems. Interestingly, we show that depending on the initial state of the coupled system, the vacuum Rabi splitting manifests significant asymmetries that may be considered spectral signatures of the counterintuitive decoupling effect. Using the coupled system as a thermometer for quantum thermodynamics applications, we obtain the ultimate bounds on the estimation of temperature that remain valid in the ultrastrong coupling regime. Remarkably, if the system performs a quantum phase transition, the quantum Fisher information exhibits periodic divergences, suggesting that one can have several points of arbitrarily high thermometric precision for such a critical quantum sensor.	翻訳日:2023-05-03 20:59:26 公開日:2021-01-18
# 光学波長変換のためのニオブ酸リチウムへのシリコンフォトニックデバイスのハイブリッド集積 Hybrid integration of silicon photonic devices on lithium niobate for optomechanical wavelength conversion ( http://arxiv.org/abs/2010.08493v2 ) ライセンス: Link先を確認	Igor Marinkovi\'c, Maxwell Drimmer, Bas Hensen, Simon Gr\"oblacher	(参考訳) 量子情報プロセッサの急速な発展は、量子ネットワークを実現する技術に対する需要を加速させた。有望なアプローチの1つは、マイクロ波と光学場の中間体としてメカニカル共振器を用いる。超伝導、トポロジー、スピン量子ビットプロセッサからの信号は、通信波長で光学状態とコヒーレントに変換できる。しかし、均質な構造から作られた現在のデバイスはノイズの増加と変換効率の低下に苦しむ。異なる材料の有利な特性を不均一な設計に組み合わせることで、優れた量子トランスダクションデバイスが実現できるはずであり、これらのハイブリッドアプローチは、しかしながら複雑な製造手順によって妨げられている。そこで本研究では,異なる素材の独立したデバイス部品を1つのデバイスに統合する,従来のピック・アンド・プレイス・アイデアに基づく新たな統合手法を提案する。この方法は、プロセス中に連続的な光学的モニタリングによって精度のアライメントを可能にする。本手法を用いて, 最先端の波長変換特性を有するニオブ酸シリコンハイブリッドデバイスを作製した。 The rapid development of quantum information processors has accelerated the demand for technologies that enable quantum networking. One promising approach uses mechanical resonators as an intermediary between microwave and optical fields. Signals from a superconducting, topological, or spin qubit processor can then be converted coherently to optical states at telecom wavelengths. However, current devices built from homogeneous structures suffer from added noise and small conversion efficiency. Combining advantageous properties of different materials into a heterogeneous design should allow for superior quantum transduction devices -- so far these hybrid approaches have however been hampered by complex fabrication procedures. Here we present a novel integration method based on previous pick-and-place ideas, that can combine independently fabricated device components of different materials into a single device. The method allows for precision alignment by continuous optical monitoring during the process. Using our method, we assemble a hybrid silicon-lithium niobate device with state-of-the-art wavelength conversion characteristics.	翻訳日:2023-04-28 22:03:14 公開日:2021-01-18
# 回路量子音響力学における量子対古典構造 Quantum versus Classical Regime in Circuit Quantum Acoustodynamics ( http://arxiv.org/abs/2011.05075v2 ) ライセンス: Link先を確認	Gang-hui Zeng, Yang Zhang, Aleksey N. Bolgar, Dong He, Bin Li, Xin-hui Ruan, Lan Zhou, Le-Mang Kuang, Oleg V. Astafiev, Yu-xi Liu, Z. H. Peng	(参考訳) 超伝導人工原子からなる回路量子音響力学系を2次元表面波共振器と1次元マイクロ波伝送線路の両方に結合して実験的に検討した。人工原子と音響波共振器との強い結合は, 希釈冷凍機の基礎温度における真空ラビ分裂の観察によって確認される。マイクロ波伝送線路におけるマイクロ波光子の伝搬は、音波共振器内の数個のフォノンによって制御可能であることを示す。さらに,高励起状態からのRabi分裂および温度誘起遷移の測定に対する温度効果を実証した。その結果,Rabi分裂における2ピークのスペクトル構造はいくつかのピークに変化し,環境温度の上昇に伴い徐々に消失することがわかった。量子-古典遷移は、熱ゆらぎエネルギー$k_{B}T$と結合系の特性エネルギーレベル間隔によって決定されるクロスオーバー温度$T_{c}T$の周囲で観測される。実験結果は, 実効温度の異なる結合系の主方程式による理論シミュレーションとよく一致している。 We experimentally study a circuit quantum acoustodynamics system, which consists of a superconducting artificial atom, coupled to both a two-dimensional surface acoustic wave resonator and a one-dimensional microwave transmission line. The strong coupling between the artificial atom and the acoustic wave resonator is confirmed by the observation of the vacuum Rabi splitting at the base temperature of dilution refrigerator. We show that the propagation of microwave photons in the microwave transmission line can be controlled by a few phonons in the acoustic wave resonator. Furthermore, we demonstrate the temperature effect on the measurements of the Rabi splitting and temperature induced transitions from high excited dressed states. We find that the spectrum structure of two-peak for the Rabi splitting becomes into those of several peaks, and gradually disappears with the increase of the environmental temperature $T$. The quantum-to-classical transition is observed around the crossover temperature $T_{c}$, which is determined via the thermal fluctuation energy $k_{B}T$ and the characteristic energy level spacing of the coupled system. Experimental results agree well with the theoretical simulations via the master equation of the coupled system at different effective temperatures.	翻訳日:2023-04-24 19:04:23 公開日:2021-01-18
# ホログラフィックの絡み合ったポリトープとしてのアソシヘドロン The associahedron as a holographic entanglement polytope ( http://arxiv.org/abs/2101.03823v2 ) ライセンス: Link先を確認	P\'eter L\'evay	(参考訳) 本項では、${\rm AdS}_3/{\rm CFT}_2$対応を用いて、散乱振幅の理解に使用されるArkani-Hamed-Bai-He-Yan(ABHY)アソシアヘドロンと、絡み合いのパターンから生じる時空を理解するために用いられるアソシアヘドロンとの類似性を観察する。この類推は、アソシアヘドロンを${\rm CFT}_2$真空に付随するホログラフィック絡みポリトープとして自然な解釈を示唆している。我々の観測は、散乱振幅の分解特性が、ホログラフィック量子絡み合いの理論で用いられる時空の分離性の概念と結びついている可能性を示唆している。 By employing the ${\rm AdS}_3/{\rm CFT}_2$ correspondence in this note we observe an analogy between the structures found in connection with the Arkani-Hamed-Bai-He-Yan (ABHY) associahedron used for understanding scattering amplitudes, and the one used for understanding space-time emerging from patterns of entanglement. The analogy suggests the natural interpretation for the associahedron as a holographic entanglement polytope associated to the ${\rm CFT}_2$ vacuum. Our observations hint at the possibility that the factorization properties of scattering amplitudes are connected to the notion of separability of space-time as used in the theory of holographic quantum entanglement.	翻訳日:2023-04-17 02:52:57 公開日:2021-01-18
# ibmq-melbourne量子コンピュータにおけるschr\"odinger cat状態の絡み合いの作成と研究 Preparation and study of the entanglement of the Schr\"odinger cat state on the ibmq-melbourne quantum computer ( http://arxiv.org/abs/2101.05089v2 ) ライセンス: Link先を確認	A.R. Kuzmak, V.M. Tkachuk	(参考訳) ibmq-melbourne量子コンピュータで作製したschr\"odinger cat状態における、ある量子ビットと残りのシステムの絡み合いについて検討した。この目的のために使用されるプロトコルは、ある量子ビットに対応するスピンの平均値を決定することに基づいている。異なる数の量子ビットからなるシュリンガー猫状態のパラメータに対する絡み合いの依存性について検討する。さらに、各量子ビットのエンタングルメントを、残りのシステムで最大エンタングル化されたschr\"odinger cat状態で検討する。 We study the entanglement between a certain qubit and the remaining system in the Schr\"odinger cat state prepared on the ibmq-melbourne quantum computer. The protocol, which we use for this purpose, is based on the determination of the mean value of spin corresponding to a certain qubit. We explore the dependence of the entanglement on a parameter of the Schr\"odinger cat state which consists of different numbers of qubits. In addition, we explore the entanglement of each qubit with the remaining system in the maximum entangled Schr\"odinger cat state.	翻訳日:2023-04-15 17:42:10 公開日:2021-01-18
# バリューアライメントの挑戦 - より公正なアルゴリズムからAI安全性まで The Challenge of Value Alignment: from Fairer Algorithms to AI Safety ( http://arxiv.org/abs/2101.06060v2 ) ライセンス: Link先を確認	Iason Gabriel and Vafa Ghazavi	(参考訳) 本稿では,AIシステムと人的価値の整合性の問題に対処し,それを技術と価値に関するより広い思考範囲に位置づける。真空中に存在するのではなく、異なる価値システムをロックインするテクノロジーの能力に長年関心が寄せられている。また、参加型デザインプロセスなど、技術と特定の社会的価値を連携させる方法も検討されている。本稿では、AIの価値アライメントに関する問題をより詳しく検討し、AIシステムのパワーと自律性が、これまで遭遇したことのない価値領域における機会と課題をもたらすことを示唆する。公正性、説明責任、透明性、倫理的コミュニティの作業と、技術AI安全研究者による作業との間の重要な連続性について、我々は「社会的価値の整合性」という問題により多くの注意を払う必要があることを示唆している。 This paper addresses the question of how to align AI systems with human values and situates it within a wider body of thought regarding technology and value. Far from existing in a vacuum, there has long been an interest in the ability of technology to 'lock-in' different value systems. There has also been considerable thought about how to align technologies with specific social values, including through participatory design-processes. In this paper we look more closely at the question of AI value alignment and suggest that the power and autonomy of AI systems gives rise to opportunities and challenges in the domain of value that have not been encountered before. Drawing important continuities between the work of the fairness, accountability, transparency and ethics community, and work being done by technical AI safety researchers, we suggest that more attention needs to be paid to the question of 'social value alignment' - that is, how to align AI systems with the plurality of values endorsed by groups of people, especially on the global level.	翻訳日:2023-04-15 03:03:40 公開日:2021-01-18
# ハイパーパラレルトランジスタ、ルータ及びユニティファイパティを有する動的ランダムアクセスメモリ Hyperparallel transistor, router and dynamic random access memory with unity fidelities ( http://arxiv.org/abs/2101.06872v1 ) ライセンス: Link先を確認	Ji-Zhen Liu, Ning-Yang Chen, Wen-Qiang Liu, Hai-Rui Wei and Ming Hua	(参考訳) 理論的には、量子単一光子トランジスタ、ルータ、動的ランダムアクセスメモリ(DRAM)など、いくつかの超並列光学素子を実装している。量子ドット(qd)-キャビティ中間体の必然的な側漏れと不完全な複屈折を考慮に入れ、我々の光学素子の統一性を達成することができる。ハイパー並列構造は光子の偏光と空間自由度(DOF)に基づいており、並列効率を高め、チャネルの容量を改善し、量子資源を節約し、運転時間を短縮し、環境騒音を低減している。また, 実用的スキームは, マイクロキャビティの側漏れや結合強度制限に対して頑健である。 We theoretically implement some hyperparallel optical elements, including quantum single photon transistor, router, and dynamic random access memory (DRAM). The inevitable side leakage and the imperfect birefringence of the quantum dot (QD)-cavity mediates are taken into account, and unity fidelities of our optical elements can be achieved. The hyperparallel constructions are based on polarization and spatial degrees of freedom (DOFs) of the photon to increase the parallel efficiency, improve the capacity of channel, save the quantum resources, reduce the operation time, and decrease the environment noises. Moreover, the practical schemes are robust against the side leakage and the coupling strength limitation in the microcavities.	翻訳日:2023-04-14 21:25:48 公開日:2021-01-18
# 準周期的及びランダム駆動量子多体系における加熱速度の厳密な境界 Rigorous Bounds on the Heating Rate in Thue-Morse Quasiperiodically and Randomly Driven Quantum Many-Body Systems ( http://arxiv.org/abs/2101.07065v1 ) ライセンス: Link先を確認	Takashi Mori, Hongzheng Zhao, Florian Mintert, Johannes Knolle, Roderich Moessner	(参考訳) 閉多体系の非平衡量子力学はリッチだが挑戦的な分野である。周期駆動(フロケット)システムの最近の進歩は多くの厳密な結果をもたらしてきたが、量子多体系に対する我々の理解は急速に変化しているが、半周期駆動は限定的である。ここでは、Thue-Morse準周期駆動およびランダム多極駆動の下での量子多体系の加熱速度の厳密な非摂動境界を導出し、後者は前者の調整可能なランダム化変種である。この過程において、局所可観測物の力学を含む過渡前熱状態を記述する静的有効ハミルトニアンを導出する。 thue-morse準周期駆動のバウンドは、数値シミュレーションと一致して、加熱時間は(\omega/g)^{-c\ln(\omega/g)}$、正の定数$c$、典型的なエネルギースケールのハミルトニアン$g$であることを示唆している。 The nonequilibrium quantum dynamics of closed many-body systems is a rich yet challenging field. While recent progress for periodically driven (Floquet) systems has yielded a number of rigorous results, our understanding on quantum many-body systems driven by rapidly varying but a- and quasi-periodic driving is still limited. Here, we derive rigorous, non-perturbative, bounds on the heating rate in quantum many-body systems under Thue-Morse quasi-periodic driving and under random multipolar driving, the latter being a tunably randomized variant of the former. In the process, we derive a static effective Hamiltonian that describes the transient prethermal state, including the dynamics of local observables. Our bound for Thue-Morse quasi-periodic driving suggests that the heating time scales like $(\omega/g)^{-C\ln(\omega/g)}$ with a positive constant $C$ and a typical energy scale $g$ of the Hamiltonian, in agreement with our numerical simulations.	翻訳日:2023-04-14 21:21:36 公開日:2021-01-18
# SpiNNakerとLoihiのニューロモルフィックボード上で動くシミュレートされた街灯ロボットのためのスパイキング中央パターン生成装置 A Spiking Central Pattern Generator for the control of a simulated lamprey robot running on SpiNNaker and Loihi neuromorphic boards ( http://arxiv.org/abs/2101.07001v1 ) ライセンス: Link先を確認	Emmanouil Angelidis, Emanuel Buchholz, Jonathan Patrick Arreguit O'Neil, Alexis Roug\`e, Terrence Stewart, Axel von Arnim, Alois Knoll, Auke Ijspeert	(参考訳) 中央パターン生成器(cpgs)モデルは、動物の移動を阻害する神経機構とロボット研究の道具の両方を調べるために長い間用いられてきた。本研究では,シミュレートされたランプレーモデルを制御する手段として,スパイキングCPGニューラルネットワークとそのニューロモルフィックハードウェアの実装を提案する。 CPGモデルを構築するために、ニューラルエンジニアリング・フレームワーク(NEF)で繰り返し発生する神経集団を用いて自然に出現する力学系を用いる。モデルの背後にある数学的定式化は、高レベル信号で変調された結合抽象振動子のシステムからなり、様々な出力歩数を生成することができる。中央パターン生成モデルの数学的定式化によって、モデルがスパイクニューラルネットワーク(snn)に変換され、snシミュレータであるnengoで簡単にシミュレーションできることを示した。スパイキングcpgモデルは、様々なシナリオで模擬ランプレイロボットモデルの水泳歩行を生成するために使用される。センサ情報によって提供できるネットワークへの入力を変更することで、ロボットの方向や速度を動的に制御できることを示す。提案手法は工学的応用と科学的研究に適した他のタイプのCPGに一般化することができる。我々はspinnakerとloihiという2つのニューロモルフィック・プラットフォームでシステムをテストする。最後に、このスパイキングアルゴリズムのカテゴリは、エネルギー効率と計算速度の観点から、ニューロモルフィックハードウェアの理論的優位性を活用できる可能性を示している。 Central Pattern Generators (CPGs) models have been long used to investigate both the neural mechanisms that underlie animal locomotion as well as a tool for robotic research. In this work we propose a spiking CPG neural network and its implementation on neuromorphic hardware as a means to control a simulated lamprey model. To construct our CPG model, we employ the naturally emerging dynamical systems that arise through the use of recurrent neural populations in the Neural Engineering Framework (NEF). We define the mathematical formulation behind our model, which consists of a system of coupled abstract oscillators modulated by high-level signals, capable of producing a variety of output gaits. We show that with this mathematical formulation of the Central Pattern Generator model, the model can be turned into a Spiking Neural Network (SNN) that can be easily simulated with Nengo, an SNN simulator. The spiking CPG model is then used to produce the swimming gaits of a simulated lamprey robot model in various scenarios. We show that by modifying the input to the network, which can be provided by sensory information, the robot can be controlled dynamically in direction and pace. The proposed methodology can be generalized to other types of CPGs suitable for both engineering applications and scientific research. We test our system on two neuromorphic platforms, SpiNNaker and Loihi. Finally, we show that this category of spiking algorithms shows a promising potential to exploit the theoretical advantages of neuromorphic hardware in terms of energy efficiency and computational speed.	翻訳日:2023-04-14 21:20:59 公開日:2021-01-18
# 熱平衡から相変化材料を用いたキャビティ壁と原子とのカシミール-ポルダー相互作用 Casimir-Polder Interaction of an Atom with a Cavity Wall Made of Phase-Change Material out of Thermal Equilibrium ( http://arxiv.org/abs/2101.06995v1 ) ライセンス: Link先を確認	G. L. Klimchitskaya and V. M. Mostepanenko	(参考訳) 我々は,He$^$,Na,Cs,Rbの原子と,二酸化バナジウム膜でコーティングされたサファイアのキャビティウォールとの間の熱平衡なカシミール・ポリダー相互作用を,壁温度の増加とともに誘電-金属相転移を経ると考えている。原子壁分離と壁温度の関数としてのカシミール・ポルダー力とその勾配の数値計算は、後者が環境温度を超えるときに行う。その結果, カシミール・ポルダー力の測定実験において, 熱平衡を欠く石英ガラス壁と$^{87}$Rb原子の勾配を測定した結果と比較した。また, 相変化壁材の使用は, 誘電体壁の場合と異なり, 力の大きさ, 特に力勾配を大きく増加させることが示された。 We consider the out-of-thermal-equilibrium Casimir-Polder interaction between atoms of He$^$, Na, Cs, and Rb and a cavity wall made of sapphire coated with a vanadium dioxide film which undergoes the dielectric-to-metal phase transition with increasing wall temperature. Numerical computations of the Casimir-Polder force and its gradient as the functions of atom-wall separation and wall temperature are made when the latter exceeds the temperature of the environment. The obtained results are compared with those in experiment on measuring the gradient of the Casimir-Polder force between $^{87}$Rb atoms and a silica glass wall out of thermal equilibrium. It is shown that the use of phase-change wall material increases significantly the force magnitude and especially the force gradient, as opposed to the case of dielectric wall.	翻訳日:2023-04-14 21:20:34 公開日:2021-01-18
# 実空間時間依存schr\"odinger計算によるヘキサゴナルナノリボンの高調波スペクトル High-harmonic spectra of hexagonal nanoribbons from real-space time-dependent Schr\"odinger calculations ( http://arxiv.org/abs/2101.06970v1 ) ライセンス: Link先を確認	Helena Dr\"ueke and Dieter Bauer	(参考訳) 高ハーモニック分光法は、全ての光学的手段と前例のない時間分解能で凝縮物質の電子構造とダイナミクスを撮像する有望な候補である。本研究では, 六角形ナノリボン, グラフェン, 六角形窒化ホウ素などの高調波スペクトルをアームチェアおよびジグザグ構成で検討した。系の対称性は、放射された高調波の存在と強度を説明する。 High-harmonic spectroscopy is a promising candidate for imaging electronic structures and dynamics in condensed matter by all-optical means and with unprecedented temporal resolution. We investigate harmonic spectra from finite, hexagonal nanoribbons, such as graphene and hexagonal boron nitride, in armchair and zig-zag configuration. The symmetry of the system explains the existence and intensity of the emitted harmonics.	翻訳日:2023-04-14 21:20:18 公開日:2021-01-18
# Capitol (Pat)riots: TwitterとParlerの比較研究 Capitol (Pat)riots: A comparative study of Twitter and Parler ( http://arxiv.org/abs/2101.06914v1 ) ライセンス: Link先を確認	Hitkul, Avinash Prabhu, Dipanwita Guhathakurta, Jivitesh jain, Mallika Subramanian, Manvith Reddy, Shradha Sehgal, Tanvi Karandikar, Amogh Gulati, Udit Arora, Rajiv Ratn Shah and Ponnurangam Kumaraguru	(参考訳) 2021年1月6日、右派保守派の暴徒がアメリカ議会議事堂ヒルを襲撃し、2020年の大統領選挙結果を議会が承認した。イベント開始直後、暴動に関連する投稿がソーシャルメディアで流行し始めた。ソーシャルメディアプラットフォームは、ソーシャルメディアプラットフォームであるParlerを支持する言論の自由であり、暴動が計画され、議論されたプラットフォームとして主張されている。われわれのレポートは、暴動の前後のparlerとtwitterのトレンドコンテンツの対比を示している。トレンドハッシュタグに基づいて両プラットフォームからデータを収集し,話題の話題,プラットフォームでアクティブな人,両プラットフォームで生成されたコンテンツのオーガニック性などに基づいて比較を行った。 twitter上のコンテンツはイベントに対する強い不満を持ち、暴動やインキッターに対する行動を求めたが、パーラーのコンテンツは、攻撃的な暴徒と同様の投票者詐欺の考えを反映する強い保守的な物語を持っていた。またTwitterと比較すると、Parlerのトラフィックの操作率も非常に高い。 On 6 January 2021, a mob of right-wing conservatives stormed the USA Capitol Hill interrupting the session of congress certifying 2020 Presidential election results. Immediately after the start of the event, posts related to the riots started to trend on social media. A social media platform which stood out was a free speech endorsing social media platform Parler; it is being claimed as the platform on which the riots were planned and talked about. Our report presents a contrast between the trending content on Parler and Twitter around the time of riots. We collected data from both platforms based on the trending hashtags and draw comparisons based on what are the topics being talked about, who are the people active on the platforms and how organic is the content generated on the two platforms. While the content trending on Twitter had strong resentments towards the event and called for action against rioters and inciters, Parler content had a strong conservative narrative echoing the ideas of voter fraud similar to the attacking mob. We also find a disproportionately high manipulation of traffic on Parler when compared to Twitter.	翻訳日:2023-04-14 21:20:09 公開日:2021-01-18
# BECから量子関連相への交叉における量子性の発見 Uncover quantumness in the crossover from BEC to quantum-correlated phase ( http://arxiv.org/abs/2101.06878v1 ) ライセンス: Link先を確認	J.P. Restrepo Cuartas and H. Vinck-Posada	(参考訳) Tavis-Cummingsモデルにおける集団現象は相転移の特徴に着目して広く研究されている。多くの場合、分離された放射線マター系を考慮した変分法が用いられている。本稿では, 単一モードキャビティに結合した2レベルエミッタの集合体における量子絡み合いの役割について検討する。系の統計的性質、例えば最初の4つの統計モーメントは、光と物質の分布の構造を明確に示している。 2階相関関数はいくつかの状態において1つになるが、統計解析はコヒーレントな振る舞いから、共通理解とは対照的に急激な離脱を証明している。 Collective phenomena in the Tavis-Cummings model has been widely studied, focusing on the phase transition features. In many occasions, it has been used variational approaches that consider separated radiation-matters systems. In this paper, we examine the role of the quantum entanglement of an assembly of two-level emitters coupled to a single-mode cavity; this allows us to characterise the quantum correlated state for each regime. Statistical properties of the system, e.g., the first four statistical moments, show clearly the structure of the light and matter distributions. Even though the second order correlation function goes to one in some regimes, the statistical analysis evidence a sharp departure from coherent behaviour, contrarily to the common understanding.	翻訳日:2023-04-14 21:19:12 公開日:2021-01-18
# コロナ・アプリにおけるデータ保護効果評価 Data Protection Impact Assessment for the Corona App ( http://arxiv.org/abs/2101.07292v1 ) ライセンス: Link先を確認	Kirsten Bock, Christian R. K\"uhne, Rainer M\"uhlhoff, M\v{e}to R. Ost, J\"org Pohle, Rainer Rehak	(参考訳) SARS-CoV-2は2020年初頭にヨーロッパで普及して以来、パンデミックとの戦いや封じ込めに関する技術的な解決を求める声が強く、議論の中心に接触追跡アプリがある。 EUのGDPR(General Daten Protection Regulation)は、データ処理が権利と自由に高いリスクをもたらす可能性のあるデータ保護影響評価(DPIA)を実施するよう、管理者に要求している(第35条GDPR)。 DPIAは、基本的権利に関連するデータ処理の結果を識別し、評価する構造化されたリスク分析であり、これらのリスクに対処するために考えられた措置や、それを行うことができないことを示す。標準データ保護モデル (SDM) に基づいて, PEPP-PT, DP-3T, およびChaos Computer ClubのメンバーであるLinus Neumannによって要約された概念である, 最も"プライバシフレンドリー"であると考えられる3つの接触追跡アプリ設計を, 徹底的に検証する科学的DPIAを提案する。 DPIAは、処理コンテキストと期待されるユースケースの分析から始まります。そして、現実的な処理目的を定義することにより、処理アクティビティを記述する。続いて法的な評価としきい値分析が行われる。最後に,弱点,リスクを分析し,適切な保護策を決定する。分散化実装でさえも、多くの重大な弱点とリスクを伴うことを示している。法的には、同意は法的根拠として適さないので、データは法律に基づいて処理されなければならない。また,データ主体と影響を受ける人々の権利を実現するための対策が不十分であることがわかった。最後に、匿名化は、個人的参照を分離することを目的とした継続的プロセスとして理解され、法的、組織的、技術的措置が混在していることを示します。現在利用可能なすべての提案には、そのような明確な分離プロセスがない。 Since SARS-CoV-2 started spreading in Europe in early 2020, there has been a strong call for technical solutions to combat or contain the pandemic, with contact tracing apps at the heart of the debates. The EU's General Daten Protection Regulation (GDPR) requires controllers to carry out a data protection impact assessment (DPIA) where their data processing is likely to result in a high risk to the rights and freedoms (Art. 35 GDPR). A DPIA is a structured risk analysis that identifies and evaluates possible consequences of data processing relevant to fundamental rights and describes the measures envisaged to address these risks or expresses the inability to do so. Based on the Standard Data Protection Model (SDM), we present a scientific DPIA which thoroughly examines three published contact tracing app designs that are considered to be the most "privacy-friendly": PEPP-PT, DP-3T and a concept summarized by Chaos Computer Club member Linus Neumann, all of which process personal health data. The DPIA starts with an analysis of the processing context and some expected use cases. Then, the processing activities are described by defining a realistic processing purpose. This is followed by the legal assessment and threshold analysis. Finally, we analyse the weak points, the risks and determine appropriate protective measures. We show that even decentralized implementations involve numerous serious weaknesses and risks. Legally, consent is unfit as legal ground hence data must be processed based on a law. We also found that measures to realize the rights of data subjects and affected people are not sufficient. Last but not least, we show that anonymization must be understood as a continuous process, which aims at separating the personal reference and is based on a mix of legal, organizational and technical measures. All currently available proposals lack such an explicit separation process.	翻訳日:2023-04-14 21:12:34 公開日:2021-01-18
# コヒーレントワンウェイ量子鍵分布に対するゼロエラー攻撃 Zero-error attack against coherent-one-way quantum key distribution ( http://arxiv.org/abs/2101.07192v1 ) ライセンス: Link先を確認	R\'obert Tr\'enyi, Marcos Curty	(参考訳) コヒーレントワンウェイ(COW)量子鍵分布(QKD)は、単純な実験装置で秘密鍵を長距離に分散するという約束を守った。実際、このスキームは現在商用アプリケーションで使われている。しかし、最近、その秘密鍵レートはシステムの透過率とほぼ4分の1でスケールしており、長距離QKD伝送には適していないことが示されている。このような悲観的な結果はいわゆるゼロエラー攻撃(ゼロエラー攻撃)によって引き起こされ、盗聴器はエラーを発生させないが、システムの正統な利用者はセキュアな鍵を抽出できない。そこで本研究では,誤差のない場合,その最大到達距離を制限できないという観点から,事実上最適であるcow-qkdに対するゼロエラー攻撃を提案する。これは秘密鍵レートの上界に変換され、これは以前に知られていた上界よりも桁違いに低い。 Coherent-one-way (COW) quantum key distribution (QKD) held the promise of distributing secret keys over long distances with a simple experimental setup. Indeed, this scheme is currently used in commercial applications. Surprisingly, however, it has been recently shown that its secret key rate scales at most quadratically with the system's transmittance and, thus, it is not appropriate for long distance QKD transmission. Such pessimistic result was derived by employing a so-called zero-error attack, in which the eavesdropper does not introduce any error, but still the legitimate users of the system cannot distill a secure key. Here, we present a zero-error attack against COW-QKD that is essentially optimal, in the sense that no other attack can restrict further its maximum achievable distance in the absence of errors. This translates into an upper bound on its secret key rate that is more than an order of magnitude lower than previously known upper bounds.	翻訳日:2023-04-14 21:10:31 公開日:2021-01-18
# プレーヤーデータを生成するモバイルゲームの設計 -- 学んだ教訓 Designing a mobile game to generate player data -- lessons learned ( http://arxiv.org/abs/2101.07144v1 ) ライセンス: Link先を確認	William Wallis and William Kavanagh and Alice Miller and Tim Storer	(参考訳) ユーザフレンドリーなツールは、高品質なゲーム設計の要件を、開発経験のない研究者が独自のゲームをリリースできるレベルまで引き下げた。しかし、研究目的のゲームは少ないため、最高の実践は確立されていない。同様のプロジェクトの指導なしにモバイルゲームを開発したので、私たちは経験を共有する必要性に気づき、将来の研究者がそれに追随する道を開くことに気付きました。ゲームバランシングとシステムシミュレーションの研究は、マルチプレイヤーモバイルゲーム「RPGLite」の開発に触発された実験ケーススタディを必要とした。 RPGの作成では、開発に関する専門知識がなく、研究目的で効果的なアマチュアゲーム開発に関する一連の教訓を学びました。本稿では,開発プロセス全体を振り返り,これらの教訓を紹介する。 User friendly tools have lowered the requirements of high-quality game design to the point where researchers without development experience can release their own games. However, there is no established best-practice as few games have been produced for research purposes. Having developed a mobile game without the guidance of similar projects, we realised the need to share our experience so future researchers have a path to follow. Research into game balancing and system simulation required an experimental case study, which inspired the creation of "RPGLite", a multiplayer mobile game. In creating RPGLitewith no development expertise we learned a series of lessons about effective amateur game development for research purposes. In this paper we reflect on the entire development process and present these lessons.	翻訳日:2023-04-14 21:09:40 公開日:2021-01-18
# 批判的分析:batアルゴリズムに基づく複数の領域の探索と応用 Critical Analysis: Bat Algorithm based Investigation and Application on Several Domains ( http://arxiv.org/abs/2102.01201v1 ) ライセンス: Link先を確認	Shahla U. Umar, Tarik A. Rashid	(参考訳) 近年,2010 年に xin-she yang が提案した bat algorithm (ba) などの群最適化アルゴリズムが提案されている。このアルゴリズムのアイデアはコウモリのエコーロケーション能力から取られた。目的: 本研究の目的は, batアルゴリズムの限界, アルゴリズムが適用されている分野, 異なる領域における汎用最適化問題, および他のメタヒューリスティックアルゴリズムに対する性能を評価するすべての研究を含む, 読者にbatアルゴリズムの完全な研究を提供することである。 Approach: Bat Algorithm is given in-depth in terms of backgrounds, characteristics, limitations, it has also displayed the algorithms that hybridized with BA (K-Medoids, Back-propagation neural network, Harmony Search Algorithm, Differential Evaluation Strategies, Enhanced Particle Swarm Optimization, and Cuckoo Search Algorithm) and their theoretical results, as well as to the modifications that have been performed of the algorithm (Modified Bat Algorithm (MBA), Enhanced Bat Algorithm (EBA), Bat Algorithm with Mutation (BAM), Uninhabited Combat Aerial Vehicle-Bat algorithm with Mutation (UCAV-BAM), Nonlinear Optimization)... 発見:このアルゴリズムの長所と短所を、アルゴリズムに対処するすべての研究と、それについて科学者が理解し、開発するのに役立つことを期待した分野と応用に光を当てた。 originality/value: 研究コミュニティの知識に関しては、このアルゴリズムに関する包括的な調査は行われていません。キーワードは、swarm intelligence、nature-inspired algorithms、metaheuristic algorithms、optimize algorithms、bat algorithmである。 In recent years several swarm optimization algorithms, such as Bat Algorithm (BA) have emerged, which was proposed by Xin-She Yang in 2010. The idea of the algorithm was taken from the echolocation ability of bats. Purpose: The purpose of this study is to provide the reader with a full study of the Bat Algorithm, including its limitations, the fields that the algorithm has been applied, versatile optimization problems in different domains, and all the studies that assess its performance against other meta-heuristic algorithms. Approach: Bat Algorithm is given in-depth in terms of backgrounds, characteristics, limitations, it has also displayed the algorithms that hybridized with BA (K-Medoids, Back-propagation neural network, Harmony Search Algorithm, Differential Evaluation Strategies, Enhanced Particle Swarm Optimization, and Cuckoo Search Algorithm) and their theoretical results, as well as to the modifications that have been performed of the algorithm (Modified Bat Algorithm (MBA), Enhanced Bat Algorithm (EBA), Bat Algorithm with Mutation (BAM), Uninhabited Combat Aerial Vehicle-Bat algorithm with Mutation (UCAV-BAM), Nonlinear Optimization)... Findings: Shed light on the advantages and disadvantages of this algorithm through all the researches that dealt with the algorithm in addition to the fields and applications it has addressed in the hope that it will help scientists understand and develop it. Originality/value: As far as the research community knowledge, there is no comprehensive survey study conducted on this algorithm cover{\i}ng all its aspects. Keywords: Swarm Intelligence; Nature-Inspired Algorithms; Metaheuristic Algorithms; Optimization Algorithms; Bat Algorithm.	翻訳日:2023-04-14 21:03:08 公開日:2021-01-18
# データ資源プロファイル:2020年3月から5月にかけてのニューヨークで発生した新型コロナウイルスの感染状況 Data Resource Profile: Egress Behavior from Select NYC COVID-19 Exposed Health Facilities March-May 2020 ( http://arxiv.org/abs/2101.10079v1 ) ライセンス: Link先を確認	Debra F. Laefer, Thomas Kirchner, Haoran (Frank) Jiang, Darlene Cheong, Yunqi (Veronica) Jiang, Aseah Khan, Weiyi Qiu, Nikki Tai, Tiffany Truong, Maimunah Virk	(参考訳) ベクターコントロール戦略は、新型コロナウイルス(covid-19)の緩和と封じ込めの中心であり、公共および民間の空間および関連サービスの運用状況を制限する自治体条例の形で行われている。しかし、リスク行動の観点から特定の集団反応についてはほとんど知られていない。これらのベクターコントロール変数戦略の影響を理解するために、ニューヨーク市の最初の新型コロナウイルス波(03/22/20-05/19/20)のピーク時に、ニューヨーク市の19の医療施設の外で、複数週間にわたる多地点観測研究が行われた。本研究の目的は, 病院や救急医療センターから退院した個人の触覚, 目的地選択, PPE 利用行動の把握である。主要な目標は、人々が三次元ベクトル環境と相互作用する方法に関する将来の研究のための経験的基礎を確立することであった。匿名化されたデータはスマートフォンで収集された。各データレコードには、医療施設を離れる個人の時間、データ、場所、ルーティング、ビルド環境とのインタラクション、他の個人、そして自分自身が含まれている。 PPEの使用状況、目的地、仲介所、交通機関の選択も記録されている。この記録は施設のジップコードによる61の社会経済的要因と7つの同時気象要因に関連付けられ、ARCGISシステムで統合された形状ファイルにまとめられた。本稿では,5,100以上の公開アクセス可能な観測記録を作成するためのプロジェクトチームとプロトコルについて述べる。 Vector control strategies are central to the mitigation and containment of COVID-19 and have come in the form of municipal ordinances that restrict the operational status of public and private spaces and associated services. Yet, little is known about specific population responses in terms of risk behaviors. To help understand the impact of those vector control variable strategies, a multi-week, multi-site observational study was undertaken outside of 19 New York City medical facilities during the peak of the city's initial COVID-19 wave (03/22/20-05/19/20). The aim was to capture perishable data of the touch, destination choice, and PPE usage behavior of individuals egressing hospitals and urgent care centers. A major goal was to establish an empirical basis for future research on the way people interact with three-dimensional vector environments. Anonymized data were collected via smart phones. Each data record includes the time, data, and location of an individual leaving a healthcare facility, their routing, interactions with the build environment, other individuals, and themselves. Most records also note their PPE usage, destination, intermediary stops, and transportation choices. The records were linked with 61 socio-economic factors by the facility zip code and 7 contemporaneous weather factors and the merged in a unified shapefile in an ARCGIS system. This paper describes the project team and protocols used to produce over 5,100 publicly accessible observational records and an affiliated codebook that can be used to study linkages between individual behaviors and on-the-ground conditions.	翻訳日:2023-04-14 21:02:30 公開日:2021-01-18
# 機械的TA2: TAをサポートしたピアグレーディングシステム Mechanical TA 2: A System for Peer Grading with TA Support ( http://arxiv.org/abs/2101.10078v1 ) ライセンス: Link先を確認	Hedayat Zarkoob, Farzad Abdolhosseini, and Kevin Leyton-Brown	(参考訳) Mechanical TA 2 (MTA2) は、信頼性の高い TA グレーダを利用して高品質なピアレビューをインセンティブ化する、オープンソースの Web ベースのピアグレーティングアプリケーションである。以前のMTAのプロトタイプ実装では、コンセプトの価値は証明されていたが、スケールや拡張性には適せず、MTA2はこれらのハードルを克服したシステムを完全に再実装した。 MTA2は2つの相互接続された目的を果たす: 実用的なピアグレーディングを容易にし、異なるピアグレーディング機構の実験用のテストベッドとして機能する。このシステムの特徴は、カスタマイズを容易にするモジュラーデザイン、生徒をピアグレードの技量に基づいて異なるプールに分割する支援、自動校正とスポットチェックの仕組み、学生が格付けをアピールし、個々のレビューに対するフィードバックを与える能力などである。 Mechanical TA 2 (MTA2) is an open source web-based peer grading application that leverages trusted TA graders to incentivize high-quality peer review. A previous, prototype implementation of MTA proved the value of the concept, but was neither suitable for use at scale nor easily extensible; MTA2 is a complete reimplementation of the system that overcomes these hurdles. MTA2 serves two, interconnected purposes: facilitating practical peer grading and serving as a testbed for experimentation with different peer grading mechanisms. The system is characterized by a modular design that makes customization easy; support for dividing students into different pools based on their peer-grading prowess; mechanisms for automated calibration and spot checking; and the ability for students to appeal grades and to give feedback about individual reviews.	翻訳日:2023-04-14 21:02:02 公開日:2021-01-18
# パネル:人間とテクノロジーによる包括的プライバシーとセキュリティ Panel: Humans and Technology for Inclusive Privacy and Security ( http://arxiv.org/abs/2101.07377v1 ) ライセンス: Link先を確認	Sanchari Das and Robert S. Gutzwiller and Rod D. Roscoe and Prashanth Rajivan and Yang Wang and L. Jean Camp and Roberto Hoyle	(参考訳) コンピュータセキュリティとユーザプライバシは、ユーザの増加とデータに対する脅威の両方により、デジタル時代の重要な問題と懸念である。一般的なサイバーセキュリティガイダンス(すなわち、悪意のある脅威からすべてのユーザーデータを保護)とプライバシーの個人主義的アプローチ(すなわち、ユーザ固有のものであり、ユーザのニーズやリスク認識に依存する)の間に、別の問題が生じる。ソフトウェアバグ(Streiff, Kenny, Das, Leeth, & Camp, 2018)、安全でない認証(Das, Wang, Tingle, & Camp, 2019)、行動的(パスワードの共有(Das, Dingman, & Camp, 2018)、コンプライアンス(Das, Dev, & Srinivasan, 2018)である。このパネルの提案は、セキュリティとプライバシの非独占的な設計から生じる、社会技術的脆弱性の第3のカテゴリに対処します。本パネルでは,プライバシーに対するユーザのニーズと欲求に対処する。パネルは価値に敏感なデザインについて詳細な議論を行い、高齢者や10代、障害のある人、一般のセキュリティやプライバシーの懸念に重点を置かない人たちなど、潜在的に脆弱な人々に焦点を当てる。人的要因は、これらの領域の改善を促進することへの関心と能力を持っている。 Computer security and user privacy are critical issues and concerns in the digital era due to both increasing users and threats to their data. Separate issues arise between generic cybersecurity guidance (i.e., protect all user data from malicious threats) and the individualistic approach of privacy (i.e., specific to users and dependent on user needs and risk perceptions). Research has shown that several security- and privacy-focused vulnerabilities are technological (e.g., software bugs (Streiff, Kenny, Das, Leeth, & Camp, 2018), insecure authentication (Das, Wang, Tingle, & Camp, 2019)), or behavioral (e.g., sharing passwords (Das, Dingman, & Camp, 2018); and compliance (Das, Dev, & Srinivasan, 2018) (Dev, Das, Rashidi, & Camp, 2019)). This panel proposal addresses a third category of sociotechnical vulnerabilities that can and sometimes do arise from non-inclusive design of security and privacy. In this panel, we will address users' needs and desires for privacy. The panel will engage in in-depth discussions about value-sensitive design while focusing on potentially vulnerable populations, such as older adults, teens, persons with disabilities, and others who are not typically emphasized in general security and privacy concerns. Human factors have a stake in and ability to facilitate improvements in these areas.	翻訳日:2023-04-14 21:01:22 公開日:2021-01-18
# PLLay:永続景観に基づく効率的な地形層 PLLay: Efficient Topological Layer based on Persistence Landscapes ( http://arxiv.org/abs/2002.02778v4 ) ライセンス: Link先を確認	Kwangho Kim, Jisu Kim, Manzil Zaheer, Joon Sik Kim, Frederic Chazal, and Larry Wasserman	(参考訳) 本研究では,パーシステンスランドスケープに基づく一般ディープラーニングモデルのための新しいトポロジカルレイヤpllayを提案し,入力データ構造の基盤となるトポロジ的特徴を効率的に活用する。本研究では,任意のフィルタを持つ一般永続ホモロジーに対して,層入力に関して微分可能性を示す。したがって,提案するレイヤをネットワーク内の任意の場所に配置し,入力データのトポロジ的特徴に関する重要な情報を次のレイヤに供給することで,ネットワークの学習性を向上させることができる。 PLLayのタスク最適構造は、入力処理やデータ前処理を必要とせずに、バックプロパゲーションを通じてトレーニング中に学習される。本稿では,DTM関数に基づくフィルタの新しい適応法を提案し,安定性解析により,提案した層が雑音や外れ値に対して頑健であることを示す。各種データセットの分類実験により,本手法の有効性を示す。 We propose PLLay, a novel topological layer for general deep learning models based on persistence landscapes, in which we can efficiently exploit the underlying topological features of the input data structure. In this work, we show differentiability with respect to layer inputs, for a general persistent homology with arbitrary filtration. Thus, our proposed layer can be placed anywhere in the network and feed critical information on the topological features of input data into subsequent layers to improve the learnability of the networks toward a given task. A task-optimal structure of PLLay is learned during training via backpropagation, without requiring any input featurization or data preprocessing. We provide a novel adaptation for the DTM function-based filtration, and show that the proposed layer is robust against noise and outliers through a stability analysis. We demonstrate the effectiveness of our approach by classification experiments on various datasets.	翻訳日:2023-01-03 04:27:58 公開日:2021-01-18
# グラフニューラルネットワークを強化したランダム特徴 Random Features Strengthen Graph Neural Networks ( http://arxiv.org/abs/2002.03155v3 ) ライセンス: Link先を確認	Ryoma Sato, Makoto Yamada, Hisashi Kashima	(参考訳) グラフニューラルネットワーク(GNN)は、さまざまなグラフ学習タスクのための強力な機械学習モデルである。近年,様々なGNNモデルの表現力の限界が明らかにされている。例えば、gnnは非同型グラフを区別できず、効率的なグラフアルゴリズムを学べない。本稿では,各ノードにランダムな特徴を加えるだけで,GNNが強力になることを示す。このランダムな特徴により、GNNは最小支配セット問題と最大マッチング問題に対して近似比でほぼ最適な多項式時間近似アルゴリズムを学習できることを示す。本手法の主な利点は,市販のGNNモデルと若干の修正を加えて組み合わせることができる点である。実験により、ランダムな特徴の追加により、グラフ畳み込みネットワーク(GCN)やグラフ同型ネットワーク(GIN)など、通常のGNNが解決できない様々な問題を解決することができることを示す。 Graph neural networks (GNNs) are powerful machine learning models for various graph learning tasks. Recently, the limitations of the expressive power of various GNN models have been revealed. For example, GNNs cannot distinguish some non-isomorphic graphs and they cannot learn efficient graph algorithms. In this paper, we demonstrate that GNNs become powerful just by adding a random feature to each node. We prove that the random features enable GNNs to learn almost optimal polynomial-time approximation algorithms for the minimum dominating set problem and maximum matching problem in terms of approximation ratios. The main advantage of our method is that it can be combined with off-the-shelf GNN models with slight modifications. Through experiments, we show that the addition of random features enables GNNs to solve various problems that normal GNNs, including the graph convolutional networks (GCNs) and graph isomorphism networks (GINs), cannot solve.	翻訳日:2023-01-02 22:19:37 公開日:2021-01-18
# スケーラブルな信念伝播のための緩和スケジューリング Relaxed Scheduling for Scalable Belief Propagation ( http://arxiv.org/abs/2002.11505v2 ) ライセンス: Link先を確認	Vitaly Aksenov and Dan Alistarh and Janne H. Korhonen	(参考訳) 大規模ハードウェア並列性を活用する能力は、機械学習の急速な進歩の重要な実現要因のひとつだ。その結果、古典的機械学習アルゴリズムの効率的な並列変種の開発に多大な労力が注がれた。しかし、並列化に関する豊富な知識にもかかわらず、いくつかの古典的な機械学習アルゴリズムは収束を維持しながら効率的に並列化するのが難しいことがしばしばある。本稿では,グラフィカルモデルに基づく推論の鍵となる機械学習タスクに対する効率的な並列アルゴリズム,特に基本的な信念伝播アルゴリズムに着目した。この文脈でスケーラブルな緩和スケジューラをどのように活用するかを示すことによって、この古典的なパラダイムを効率的に並列化するという課題に対処する。本稿では,本手法が,拡張性および壁時計収束時間の観点から,様々な実用的応用において,従来の並列信念伝達実装よりも優れていることを示す,広範な実証研究を行う。 The ability to leverage large-scale hardware parallelism has been one of the key enablers of the accelerated recent progress in machine learning. Consequently, there has been considerable effort invested into developing efficient parallel variants of classic machine learning algorithms. However, despite the wealth of knowledge on parallelization, some classic machine learning algorithms often prove hard to parallelize efficiently while maintaining convergence. In this paper, we focus on efficient parallel algorithms for the key machine learning task of inference on graphical models, in particular on the fundamental belief propagation algorithm. We address the challenge of efficiently parallelizing this classic paradigm by showing how to leverage scalable relaxed schedulers in this context. We present an extensive empirical study, showing that our approach outperforms previous parallel belief propagation implementations both in terms of scalability and in terms of wall-clock convergence time, on a range of practical applications.	翻訳日:2022-12-28 20:27:10 公開日:2021-01-18
# 逆整合損失を用いた不対向画像変換 Unpaired Image-to-Image Translation using Adversarial Consistency Loss ( http://arxiv.org/abs/2003.04858v7 ) ライセンス: Link先を確認	Yihao Zhao, Ruihai Wu, Hao Dong	(参考訳) unpaired image-to-image translationは、unpaired trainingデータを使用して異なる画像ドメイン間のマッピングを見つけることを目的としたビジョン問題のクラスである。サイクル一貫性損失はそのような問題に対して広く用いられる制約である。しかし、厳密なピクセルレベルの制約のため、幾何学的な変化や大きな物体の除去、無関係なテクスチャの無視はできない。本稿では,画像から画像への変換における新たな逆抵抗損失を提案する。この損失は、翻訳された画像が特定のソースイメージに変換される必要はなく、翻訳された画像がソースイメージの重要な特徴を保持し、上記のサイクルコンシスタンスロスの欠点を克服するよう促すことができる。本手法は, 眼鏡の除去, 男性から女性への翻訳, 自撮りからアニメへの翻訳の3つの課題に対して, 最先端の成果を得る。 Unpaired image-to-image translation is a class of vision problems whose goal is to find the mapping between different image domains using unpaired training data. Cycle-consistency loss is a widely used constraint for such problems. However, due to the strict pixel-level constraint, it cannot perform geometric changes, remove large objects, or ignore irrelevant texture. In this paper, we propose a novel adversarial-consistency loss for image-to-image translation. This loss does not require the translated image to be translated back to be a specific source image but can encourage the translated images to retain important features of the source images and overcome the drawbacks of cycle-consistency loss noted above. Our method achieves state-of-the-art results on three challenging tasks: glasses removal, male-to-female translation, and selfie-to-anime translation.	翻訳日:2022-12-24 21:20:05 公開日:2021-01-18
# PiP:自律運転のための計画インフォームド軌道予測 PiP: Planning-informed Trajectory Prediction for Autonomous Driving ( http://arxiv.org/abs/2003.11476v2 ) ライセンス: Link先を確認	Haoran Song, Wenchao Ding, Yuxuan Chen, Shaojie Shen, Michael Yu Wang, Qifeng Chen	(参考訳) 特に社会的に準拠した柔軟な方法で、自動運転計画のための周辺車両の動きを予測することは重要である。しかし、運転行動の相互作用と不確実性のため、将来の予測は困難である。マルチエージェント環境での予測問題に対処するために,計画インフォームド軌道予測(PiP)を提案する。我々のアプローチは、歴史的情報のみに基づいて計画と切り離された従来の予測方法と区別される。本手法は,ego車両の計画と共に予測プロセスを通知することにより,高速道路データセットにおけるマルチエージェント予測の最先端性能を実現する。さらに,インタラクティブなシナリオにおいて,自律走行に非常に有益であるego車両の複数候補軌道にpipを条件付けすることにより,予測と計画を結合する新しいパイプラインを実現する。 It is critical to predict the motion of surrounding vehicles for self-driving planning, especially in a socially compliant and flexible way. However, future prediction is challenging due to the interaction and uncertainty in driving behaviors. We propose planning-informed trajectory prediction (PiP) to tackle the prediction problem in the multi-agent setting. Our approach is differentiated from the traditional manner of prediction, which is only based on historical information and decoupled with planning. By informing the prediction process with the planning of ego vehicle, our method achieves the state-of-the-art performance of multi-agent forecasting on highway datasets. Moreover, our approach enables a novel pipeline which couples the prediction and planning, by conditioning PiP on multiple candidate trajectories of the ego vehicle, which is highly beneficial for autonomous driving in interactive scenarios.	翻訳日:2022-12-20 03:50:16 公開日:2021-01-18
# 時空間特徴抽出のための畳み込みスパイクニューラルネットワーク Convolutional Spiking Neural Networks for Spatio-Temporal Feature Extraction ( http://arxiv.org/abs/2003.12346v2 ) ライセンス: Link先を確認	Ali Samadzadeh, Fatemeh Sadat Tabatabaei Far, Ali Javadi, Ahmad Nickabadi, Morteza Haghir Chehreghani	(参考訳) スパイキングニューラルネットワーク(SNN)は、イベントベースの性質のため、低消費電力および組み込みシステム(新しいニューロモルフィックチップなど)で使用できる。また、従来のニューラルネットワーク(anns)とは対照的に、anの特性を維持しながら計算コストが低いという利点がある。しかしながら、畳み込みスパイクニューラルネットワークやその他の種類のSNNの層における時間的符号化はまだ研究されていない。本稿では,この特性を利用した実験において,畳み込みsnsの時空間的特徴抽出について考察する。浅い畳み込みSNNは、C3DやConvLstmなどの最先端の時空間特徴抽出手法よりも優れている。さらに,NMNIST (99.6%), DVS-CIFAR10 (69.2%), DVS-Gesture (96.7%), ANN の UCF-101 (42.1%) および HMDB-51 (21.5%) のデータセットに比べて優れた性能を示した実世界の問題(特に分類タスク)に取り組むための新しいディープスパイクアーキテクチャを提案する。また,本論文で説明した時空間バックプロパゲーションの変化に基づいて,トレーニングプロセスが実施されていることも注目に値する。 Spiking neural networks (SNNs) can be used in low-power and embedded systems (such as emerging neuromorphic chips) due to their event-based nature. Also, they have the advantage of low computation cost in contrast to conventional artificial neural networks (ANNs), while preserving ANN's properties. However, temporal coding in layers of convolutional spiking neural networks and other types of SNNs has yet to be studied. In this paper, we provide insight into spatio-temporal feature extraction of convolutional SNNs in experiments designed to exploit this property. The shallow convolutional SNN outperforms state-of-the-art spatio-temporal feature extractor methods such as C3D, ConvLstm, and similar networks. Furthermore, we present a new deep spiking architecture to tackle real-world problems (in particular classification tasks) which achieved superior performance compared to other SNN methods on NMNIST (99.6%), DVS-CIFAR10 (69.2%) and DVS-Gesture (96.7%) and ANN methods on UCF-101 (42.1%) and HMDB-51 (21.5%) datasets. It is also worth noting that the training process is implemented based on variation of spatio-temporal backpropagation explained in the paper.	翻訳日:2022-12-19 04:46:24 公開日:2021-01-18
# 微分プライバシーのための離散ガウス The Discrete Gaussian for Differential Privacy ( http://arxiv.org/abs/2004.00010v5 ) ライセンス: Link先を確認	Cl\'ement L. Canonne, Gautam Kamath, Thomas Steinke	(参考訳) 微分プライベートシステムを構築するための重要なツールは、機密データセットで評価された関数の出力にガウスノイズを追加することである。残念ながら、継続的分散を使うことにはいくつかの実践的な課題がある。まず第一に、有限のコンピュータは、連続分布のサンプルを正確に表現することはできない。さらに、基礎となるデータがそれ自体が離散的(例えば人口数)である場合、連続的なノイズを加えると、結果の解釈が困難になる。これらの欠点を念頭に置いて,微分プライバシーの文脈において離散ガウシアンを導入し,分析する。具体的には,離散ガウス雑音の追加は連続ガウス雑音の追加と本質的に同一のプライバシーと精度を保証することを示す。また,この分布から正確なサンプリングを行うための簡易かつ効率的なアルゴリズムを提案する。これは、プライベートに応答するクエリ、あるいは一般的には低感度整数値クエリに適用可能であることを示している。 A key tool for building differentially private systems is adding Gaussian noise to the output of a function evaluated on a sensitive dataset. Unfortunately, using a continuous distribution presents several practical challenges. First and foremost, finite computers cannot exactly represent samples from continuous distributions, and previous work has demonstrated that seemingly innocuous numerical errors can entirely destroy privacy. Moreover, when the underlying data is itself discrete (e.g., population counts), adding continuous noise makes the result less interpretable. With these shortcomings in mind, we introduce and analyze the discrete Gaussian in the context of differential privacy. Specifically, we theoretically and experimentally show that adding discrete Gaussian noise provides essentially the same privacy and accuracy guarantees as the addition of continuous Gaussian noise. We also present an simple and efficient algorithm for exact sampling from this distribution. This demonstrates its applicability for privately answering counting queries, or more generally, low-sensitivity integer-valued queries.	翻訳日:2022-12-18 01:51:27 公開日:2021-01-18
# 深層学習に基づく無線変調分類器の可視化 Visualizing Deep Learning-based Radio Modulation Classifier ( http://arxiv.org/abs/2005.02175v2 ) ライセンス: Link先を確認	Liang Huang (Member, IEEE), You Zhang, Weijian Pan, Jinyin Chen, Li Ping Qian (Senior Member, IEEE) and Yuan Wu (Senior Member, IEEE)	(参考訳) 近年,無線特徴をエンドツーエンドに抽出・分類することで,自動変調分類に深層学習が応用されている。しかし、深層学習に基づく無線変調分類器は解釈可能性に欠けており、どの無線特徴が抽出され、分類するために選択されるかの説明や可視性はほとんどない。本稿では,クラスアクティベーションベクトルを導入することで,異なる深層学習型無線変調分類器を可視化する。具体的には、畳み込みニューラルネットワーク(CNN)ベースの分類器と長短期記憶(LSTM)ベースの分類器の両方を別々に研究し、抽出した無線特徴を可視化する。 CNNに基づく分類器とLSTMに基づく分類器は、変調基準点に関する類似の無線特徴を抽出する。特にLSTMを用いた分類器では,得られた電波特性は人間の知識と類似している。以上の結果から,深層学習に基づく分類器によって抽出された無線特徴は,無線信号の搬送内容に大きく依存しており,短い無線サンプルが誤分類につながる可能性がある。 Deep learning has recently been successfully applied in automatic modulation classification by extracting and classifying radio features in an end-to-end way. However, deep learning-based radio modulation classifiers are lack of interpretability, and there is little explanation or visibility into what kinds of radio features are extracted and chosen for classification. In this paper, we visualize different deep learning-based radio modulation classifiers by introducing a class activation vector. Specifically, both convolutional neural networks (CNN) based classifier and long short-term memory (LSTM) based classifier are separately studied, and their extracted radio features are visualized. Extensive numerical results show both the CNN-based classifier and LSTM-based classifier extract similar radio features relating to modulation reference points. In particular, for the LSTM-based classifier, its obtained radio features are similar to the knowledge of human experts. Our numerical results indicate the radio features extracted by deep learning-based classifiers greatly depend on the contents carried by radio signals, and a short radio sample may lead to misclassification.	翻訳日:2022-12-07 06:58:06 公開日:2021-01-18
# 会話エージェントのための過去の会話からの意図マイニング Intent Mining from past conversations for conversational agent ( http://arxiv.org/abs/2005.11014v4 ) ライセンス: Link先を確認	Ajay Chatterjee and Shubhashis Sengupta	(参考訳) 会話システムは、AIコミュニティにおいて主要な関心事である。チャットボットは、時間単位のサポートを提供し、顧客エンゲージメントを高めるために、ますますデプロイされている。多くの商用ボット構築フレームワークは、ユーザ入力を認識するためにインテントモデルを構築し、トレーニングする必要がある標準アプローチに従っている。インテントモデルは、テキストによる発話とインテントラベルペアの集まりで教師あり設定で訓練される。異なる意図でトレーニングデータをかなり広範囲に収集することは、ボット構築プロセスにおけるボトルネックである。さらに、100から数千の会話を意図してラベル付けするコストは、時間と労力のかかる作業である。 In this paper, we present an intent discovery framework that involves 4 primary steps: Extraction of textual utterances from a conversation using a pre-trained domain agnostic Dialog Act Classifier (Data Extraction), automatic clustering of similar user utterances (Clustering), manual annotation of clusters with an intent label (Labeling) and propagation of intent labels to the utterances from the previous step, which are not mapped to any cluster (Label Propagation); to generate intent training data from raw conversations. 我々は,不均衡データクラスタリングのための新しい密度ベースクラスタリングアルゴリズムiter-dbscanを導入した。 subject Matter Expert(ドメインの専門知識を持つアノテーション)は、手動でクラスタ化されたユーザー発話を調べ、発見のためのインテントラベルを提供する。手動アノテーションの意図範囲,正確性,時間節約の観点から,訓練した意図モデルの有効性を検証するため,ユーザ実験を行った。このシステムは会話システムのための意図モデルを構築するために開発されたが、このフレームワークは短いテキストクラスタリングやラベリングフレームワークとしても使用できる。 Conversational systems are of primary interest in the AI community. Chatbots are increasingly being deployed to provide round-the-clock support and to increase customer engagement. Many of the commercial bot building frameworks follow a standard approach that requires one to build and train an intent model to recognize a user input. Intent models are trained in a supervised setting with a collection of textual utterance and intent label pairs. Gathering a substantial and wide coverage of training data for different intent is a bottleneck in the bot building process. Moreover, the cost of labeling a hundred to thousands of conversations with intent is a time consuming and laborious job. In this paper, we present an intent discovery framework that involves 4 primary steps: Extraction of textual utterances from a conversation using a pre-trained domain agnostic Dialog Act Classifier (Data Extraction), automatic clustering of similar user utterances (Clustering), manual annotation of clusters with an intent label (Labeling) and propagation of intent labels to the utterances from the previous step, which are not mapped to any cluster (Label Propagation); to generate intent training data from raw conversations. We have introduced a novel density-based clustering algorithm ITER-DBSCAN for unbalanced data clustering. Subject Matter Expert (Annotators with domain expertise) manually looks into the clustered user utterances and provides an intent label for discovery. We conducted user studies to validate the effectiveness of the trained intent model generated in terms of coverage of intents, accuracy and time saving concerning manual annotation. Although the system is developed for building an intent model for the conversational system, this framework can also be used for a short text clustering or as a labeling framework.	翻訳日:2022-11-30 08:14:03 公開日:2021-01-18
# AIコモンズの悲劇 The Tragedy of the AI Commons ( http://arxiv.org/abs/2006.05203v2 ) ライセンス: Link先を確認	Travis LaCroix and Aydin Mohseni	(参考訳) 近年,倫理的人工知能研究の政策とガイドラインの提案が盛んである。これらは、共通の利益のために、社会的責任のあるaiの開発を導くものだ。しかしながら、通常、非協力のためのインセンティブ(つまり、そのような政策やガイドラインに従わないこと)が存在し、これらの提案は、彼ら自身の規範的主張を強制する効果的なメカニズムを欠いている。説明された状況は、社会的ジレンマ、すなわち、協力する個別のインセンティブを持たない状況を構成するが、相互協力は、すべての関係者にとって最良の結果をもたらす。本稿では,この社会ジレンマを,人工知能の倫理的発展の文脈でモデル化するために,確率論的進化ゲームダイナミクスを用いる。このフォーマリズムは、介入される可能性のある変数を分離することを可能にするため、AIの多くのステークホルダー間の協力を強化するための実用的な提案を提供する。以上の結果から,このようなシナリオにおいて,確率的効果が協調に有効であることを示す。彼らは、協力のコストが低く、失敗のリスクが高い小さなグループで共通の利益の調整を試みるべきであることを示唆している。これは、そのような倫理提案が、その範囲、規模、内容に関して成功すると期待すべき条件について洞察を与える。 Policy and guideline proposals for ethical artificial-intelligence research have proliferated in recent years. These are supposed to guide the socially-responsible development of AI for the common good. However, there typically exist incentives for non-cooperation (i.e., non-adherence to such policies and guidelines); and, these proposals often lack effective mechanisms to enforce their own normative claims. The situation just described constitutes a social dilemma; namely, a situation where no one has an individual incentive to cooperate, though mutual cooperation would lead to the best outcome for all involved. In this paper, we use stochastic evolutionary game dynamics to model this social dilemma in the context of the ethical development of artificial intelligence. This formalism allows us to isolate variables that may be intervened upon, thus providing actionable suggestions for increased cooperation amongst numerous stakeholders in AI. Our results show how stochastic effects can help make cooperation viable in such a scenario. They suggest that coordination for a common good should be attempted in smaller groups in which the cost for cooperation is low, and the perceived risk of failure is high. This provides insight into the conditions under which we should expect such ethics proposals to be successful with regard to their scope, scale, and content.	翻訳日:2022-11-23 14:28:07 公開日:2021-01-18
# クラウドソースによる呼吸音データからのCOVID-19自動診断の探索 Exploring Automatic Diagnosis of COVID-19 from Crowdsourced Respiratory Sound Data ( http://arxiv.org/abs/2006.05919v3 ) ライセンス: Link先を確認	Chlo\"e Brown, Jagmohan Chauhan, Andreas Grammenos, Jing Han, Apinan Hasthanasombat, Dimitris Spathis, Tong Xia, Pietro Cicuta, Cecilia Mascolo	(参考訳) 人体から発生する音声信号(息、呼吸、心臓、消化、振動音など)は、臨床医が病気の診断や疾患の進行を評価する指標として日常的に用いられている。最近まで、このような信号は通常、定期的な訪問の際に手動の聴取を通して収集されていた。研究は現在、身体の音(例えばデジタル聴診器から)を心臓血管や呼吸検査に収集するためにデジタル技術を使い始めており、自動分析に使用することができる。初期の研究で、声とせきからcovid-19の診断信号を検出することが期待されている。本稿では,covid-19の診断を支援するために収集された呼吸音の大規模クラウドソーシングデータセットに関するデータ分析について述べる。気管支喘息や健康管理の人たちからcovid-19の音がいかに識別できるかを理解するために、cooughsとbreathを使っています。その結果、単純なバイナリ機械学習分類器でも、正常な健康音とcovid-19音を分類できることがわかった。また、covid-19陽性者と、covid-19陽性者とを区別する方法、および、covid-19陽性者と、喘息患者と、coough患者とを区別する方法を示す。我々のモデルはすべてのタスクで80%以上のAUCを達成する。これらの結果は予備的であり、この種のデータとオーディオベースの機械学習のポテンシャルの表面のみを掻き取る。この研究は、新型コロナウイルス(COVID-19)の診断に役立つ事前スクリーニング信号として、自動で呼吸パターンを分析する方法について、さらなる調査を行うための扉を開く。 Audio signals generated by the human body (e.g., sighs, breathing, heart, digestion, vibration sounds) have routinely been used by clinicians as indicators to diagnose disease or assess disease progression. Until recently, such signals were usually collected through manual auscultation at scheduled visits. Research has now started to use digital technology to gather bodily sounds (e.g., from digital stethoscopes) for cardiovascular or respiratory examination, which could then be used for automatic analysis. Some initial work shows promise in detecting diagnostic signals of COVID-19 from voice and coughs. In this paper we describe our data analysis over a large-scale crowdsourced dataset of respiratory sounds collected to aid diagnosis of COVID-19. We use coughs and breathing to understand how discernible COVID-19 sounds are from those in asthma or healthy controls. Our results show that even a simple binary machine learning classifier is able to classify correctly healthy and COVID-19 sounds. We also show how we distinguish a user who tested positive for COVID-19 and has a cough from a healthy user with a cough, and users who tested positive for COVID-19 and have a cough from users with asthma and a cough. Our models achieve an AUC of above 80% across all tasks. These results are preliminary and only scratch the surface of the potential of this type of data and audio-based machine learning. This work opens the door to further investigation of how automatically analysed respiratory patterns could be used as pre-screening signals to aid COVID-19 diagnosis.	翻訳日:2022-11-23 06:44:40 公開日:2021-01-18
# AdamP: スケール不変ウェイトにおけるモーメント最適化のスローダウン AdamP: Slowing Down the Slowdown for Momentum Optimizers on Scale-invariant Weights ( http://arxiv.org/abs/2006.08217v3 ) ライセンス: Link先を確認	Byeongho Heo, Sanghyuk Chun, Seong Joon Oh, Dongyoon Han, Sangdoo Yun, Gyuwan Kim, Youngjung Uh, Jung-Woo Ha	(参考訳) 正規化技術は現代の深層学習の恩恵である。彼らはしばしばより良い一般化性能で重みをより早く収束させる。重み間の正規化誘起スケール不変性は、勾配降下(GD)最適化器に有利な土台を与えると論じられ、実効的なステップサイズは時間とともに自動的に減少し、全体的な訓練手順を安定化させる。しかし、GDオプティマイザに運動量を導入することで、スケール不変量に対する効果的なステップサイズが大幅に減少し、これはまだ研究されていない現象であり、現在の実践において望ましくない副作用を引き起こした可能性がある。現代のディープニューラルネットワークの大多数は(1)運動量に基づくgd(sgdやadamなど)と(2)スケール不変パラメータで構成されているため、これは重要な問題である。本稿では,これら2成分の多種多様な組み合わせが,有効なステップサイズとサブ最適モデル性能の早期崩壊につながることを検証した。本稿では,SGDPとAdamPによる簡易かつ効果的な対策として,各最適化ステップにおいて放射状成分(標準増加方向)を除去する手法を提案する。スケールの不変性のため、この修正は有効な更新方向を変更することなく有効なステップサイズだけを変更し、GDオプティマイザの本来の収束特性を享受する。機械学習における運動量GDの多様さとスケール不変性を考慮して,13ベンチマークの基準値に対して評価を行った。それらは、分類(例:イメージネット)、検索(例:cubとsop)、検出(例:coco)、言語モデリング(例:wikitext)、音声分類(例:dcase)といったビジョンタスクから成り立っている。当社のソリューションがベンチマークで均一に向上していることを確認します。ソースコードはhttps://github.com/clovaai/adampで入手できる。 Normalization techniques are a boon for modern deep learning. They let weights converge more quickly with often better generalization performances. It has been argued that the normalization-induced scale invariance among the weights provides an advantageous ground for gradient descent (GD) optimizers: the effective step sizes are automatically reduced over time, stabilizing the overall training procedure. It is often overlooked, however, that the additional introduction of momentum in GD optimizers results in a far more rapid reduction in effective step sizes for scale-invariant weights, a phenomenon that has not yet been studied and may have caused unwanted side effects in the current practice. This is a crucial issue because arguably the vast majority of modern deep neural networks consist of (1) momentum-based GD (e.g. SGD or Adam) and (2) scale-invariant parameters. In this paper, we verify that the widely-adopted combination of the two ingredients lead to the premature decay of effective step sizes and sub-optimal model performances. We propose a simple and effective remedy, SGDP and AdamP: get rid of the radial component, or the norm-increasing direction, at each optimizer step. Because of the scale invariance, this modification only alters the effective step sizes without changing the effective update directions, thus enjoying the original convergence properties of GD optimizers. Given the ubiquity of momentum GD and scale invariance in machine learning, we have evaluated our methods against the baselines on 13 benchmarks. They range from vision tasks like classification (e.g. ImageNet), retrieval (e.g. CUB and SOP), and detection (e.g. COCO) to language modelling (e.g. WikiText) and audio classification (e.g. DCASE) tasks. We verify that our solution brings about uniform gains in those benchmarks. Source code is available at https://github.com/clovaai/AdamP.	翻訳日:2022-11-21 02:20:22 公開日:2021-01-18
# 低位ガウスコプラによる量的不確かさの行列補完 Matrix Completion with Quantified Uncertainty through Low Rank Gaussian Copula ( http://arxiv.org/abs/2006.10829v2 ) ライセンス: Link先を確認	Yuxuan Zhao, Madeleine Udell	(参考訳) 現代の大規模データセットは、しばしば欠落したエントリで悩まされる。値が不足している表データに対しては、ペナルティ化された再構成エラーを最小化する完全行列に対して、複数のインプテーションアルゴリズムが解く。しかし、ほとんど誰もその計算の不確かさを見積もることができない。本稿では,定量化の不確実性を伴う価値インプテーションを欠く確率的かつスケーラブルなフレームワークを提案する。我々のモデルである低ランクガウスコピュラは、標準確率モデルである確率的主成分分析を強化し、各列に対して限界変換を行い、モデルがデータの分布によく一致するようにします。ブール、順序、実数値の観測を自然に処理し、各計算における不確実性を定量化する。モデルに適合するために必要な時間は、データセット内の行数や列数と線形にスケールする。実験結果から,高階データを含む多種多様なデータ型に対して,最先端の計算精度が得られた。我々の不確実性尺度は、計算誤差をよく予測する: 低い不確実性を持つエントリは(平均において)計算誤差を低くする。さらに、実数値データでは、結果の信頼区間が適切に調整される。 Modern large scale datasets are often plagued with missing entries. For tabular data with missing values, a flurry of imputation algorithms solve for a complete matrix which minimizes some penalized reconstruction error. However, almost none of them can estimate the uncertainty of its imputations. This paper proposes a probabilistic and scalable framework for missing value imputation with quantified uncertainty. Our model, the Low Rank Gaussian Copula, augments a standard probabilistic model, Probabilistic Principal Component Analysis, with marginal transformations for each column that allow the model to better match the distribution of the data. It naturally handles Boolean, ordinal, and real-valued observations and quantifies the uncertainty in each imputation. The time required to fit the model scales linearly with the number of rows and the number of columns in the dataset. Empirical results show the method yields state-of-the-art imputation accuracy across a wide range of data types, including those with high rank. Our uncertainty measure predicts imputation error well: entries with lower uncertainty do have lower imputation error (on average). Moreover, for real-valued data, the resulting confidence intervals are well-calibrated.	翻訳日:2022-11-19 13:06:27 公開日:2021-01-18
# 非平衡応答理論を用いたリカレントニューラルネットワークの理解 Understanding Recurrent Neural Networks Using Nonequilibrium Response Theory ( http://arxiv.org/abs/2006.11052v2 ) ライセンス: Link先を確認	Soon Hoe Lim	(参考訳) リカレントニューラルネットワーク(Recurrent Neural Network, RNN)は、シーケンシャルデータの解析に機械学習で広く使用される脳モデルである。この研究は、RNNが非平衡統計力学の応答理論を用いて入力信号をどのように処理するかを深く理解するための貢献である。入力信号によって駆動される連続時間確率RNN(SRNN)のクラスに対して、Volterra型系列表現を出力として導出する。この表現は解釈可能であり、SRNNアーキテクチャから入力信号を切り離す。系列の核は、出力を完全に決定する非摂動力学に関して、ある種の再帰的に定義された相関関数である。この表現とその大まかな経路理論への含意を明らかにすることで、入力信号のテンソル積のシグネチャであり、自然な支持基盤であることが判明した、普遍的な特徴である応答特徴を識別する。特に,読み出し層の重みのみを最適化し,隠れた層内の重みを固定し,最適化しないsrnnを,応答特性に付随するカーネルヒルベルト空間上で動作するカーネルマシンとして捉えることができることを示した。 Recurrent neural networks (RNNs) are brain-inspired models widely used in machine learning for analyzing sequential data. The present work is a contribution towards a deeper understanding of how RNNs process input signals using the response theory from nonequilibrium statistical mechanics. For a class of continuous-time stochastic RNNs (SRNNs) driven by an input signal, we derive a Volterra type series representation for their output. This representation is interpretable and disentangles the input signal from the SRNN architecture. The kernels of the series are certain recursively defined correlation functions with respect to the unperturbed dynamics that completely determine the output. Exploiting connections of this representation and its implications to rough paths theory, we identify a universal feature -- the response feature, which turns out to be the signature of tensor product of the input signal and a natural support basis. In particular, we show that SRNNs, with only the weights in the readout layer optimized and the weights in the hidden layer kept fixed and not optimized, can be viewed as kernel machines operating on a reproducing kernel Hilbert space associated with the response feature.	翻訳日:2022-11-19 04:24:58 公開日:2021-01-18
# 確率勾配アルゴリズムの適応的逆強化学習のためのランゲヴィンダイナミクス Langevin Dynamics for Adaptive Inverse Reinforcement Learning of Stochastic Gradient Algorithms ( http://arxiv.org/abs/2006.11674v2 ) ライセンス: Link先を確認	Vikram Krishnamurthy and George Yin	(参考訳) 逆強化学習(IRL)は、エージェントの反応(見積や行動)を観察することで、エージェントの報酬関数を推定することを目的としている。本稿では,複数の確率勾配エージェントが生成する報酬関数の勾配のノイズ推定を行った場合のIRLについて考察する。一般化したランゲヴィン力学アルゴリズムを用いて報酬関数 $R(\theta)$ を推定する。具体的には、結果のランゲヴィンアルゴリズムは、分布から$\exp(R(\theta))$ に比例してサンプルを漸近的に生成する。提案するirlアルゴリズムはカーネルベースのパッシブ学習方式を用いる。また、高次元データに適したIRLのためのマルチカーネル受動ランゲインアルゴリズムを構築した。提案するirlアルゴリズムの性能は、適応ベイズ学習、ロジスティック回帰(高次元問題)、制約付きマルコフ決定過程の例で示される。提案するirlアルゴリズムの弱収束をmartingale平均化法を用いて証明する。また,ユーティリティ関数$R(\theta)$ jumpが遅いマルコフ連鎖として時間とともに変化する非定常環境におけるIRLアルゴリズムの追跡性能も解析する。 Inverse reinforcement learning (IRL) aims to estimate the reward function of optimizing agents by observing their response (estimates or actions). This paper considers IRL when noisy estimates of the gradient of a reward function generated by multiple stochastic gradient agents are observed. We present a generalized Langevin dynamics algorithm to estimate the reward function $R(\theta)$; specifically, the resulting Langevin algorithm asymptotically generates samples from the distribution proportional to $\exp(R(\theta))$. The proposed IRL algorithms use kernel-based passive learning schemes. We also construct multi-kernel passive Langevin algorithms for IRL which are suitable for high dimensional data. The performance of the proposed IRL algorithms are illustrated on examples in adaptive Bayesian learning, logistic regression (high dimensional problem) and constrained Markov decision processes. We prove weak convergence of the proposed IRL algorithms using martingale averaging methods. We also analyze the tracking performance of the IRL algorithms in non-stationary environments where the utility function $R(\theta)$ jump changes over time as a slow Markov chain.	翻訳日:2022-11-18 22:38:43 公開日:2021-01-18
# 配列の微分可能なセグメンテーション Differentiable Segmentation of Sequences ( http://arxiv.org/abs/2006.13105v2 ) ライセンス: Link先を確認	Erik Scharw\"achter and Jonathan Lennartz and Emmanuel M\"uller	(参考訳) セグメンテーションモデルは、離散的な変化点を持つ非定常シーケンシャルデータを記述するために広く使われている。これらの推定は通常、セグメンテーションが離散部分であり、他の全てのモデルパラメータが連続である混合離散連続最適化問題を解く必要がある。特定のモデル仮定に高度に特化された多くの推定アルゴリズムが開発されている。非標準アルゴリズムへの依存は、勾配に基づく最適化技術に批判的に依存する最先端のディープラーニングアーキテクチャにセグメントモデルを統合するのを難しくする。本研究では,セグメント化を含む全てのモデルパラメータを勾配降下で共同で推定することのできるセグメンテーションモデルの緩和変種を定式化する。我々は,近年の継続的ワーピング関数の学習の進歩を基盤として,両面パワー(TSP)分布に基づく新しいワーピング関数群を提案する。 TSPベースのワープ関数は微分可能であり、単純なクローズドフォーム式を持ち、セグメント関数を正確に表現することができる。この定式化は、特別の場合として、セグメント化された一般化線形モデルの重要なクラスを含み、非常に多様である。ポアソン回帰(poisson regression)によるcovid-19の拡散をモデル化し,変化点検出タスクに適用し,概念ドリフトを用いた分類モデルを学ぶ。提案手法は,勾配降下のための標準アルゴリズムを用いて,これらのタスクを効果的に学習することを示す。 Segmented models are widely used to describe non-stationary sequential data with discrete change points. Their estimation usually requires solving a mixed discrete-continuous optimization problem, where the segmentation is the discrete part and all other model parameters are continuous. A number of estimation algorithms have been developed that are highly specialized for their specific model assumptions. The dependence on non-standard algorithms makes it hard to integrate segmented models in state-of-the-art deep learning architectures that critically depend on gradient-based optimization techniques. In this work, we formulate a relaxed variant of segmented models that enables joint estimation of all model parameters, including the segmentation, with gradient descent. We build on recent advances in learning continuous warping functions and propose a novel family of warping functions based on the two-sided power (TSP) distribution. TSP-based warping functions are differentiable, have simple closed-form expressions, and can represent segmentation functions exactly. Our formulation includes the important class of segmented generalized linear models as a special case, which makes it highly versatile. We use our approach to model the spread of COVID-19 with Poisson regression, apply it on a change point detection task, and learn classification models with concept drift. The experiments show that our approach effectively learns all these tasks with standard algorithms for gradient descent.	翻訳日:2022-11-17 22:45:31 公開日:2021-01-18
# ヒット確率に基づく有向グラフとマルコフ連鎖の計量 A metric on directed graphs and Markov chains based on hitting probabilities ( http://arxiv.org/abs/2006.14482v2 ) ライセンス: Link先を確認	Zachary M. Boyd, Nicolas Fraiman, Jeremy L. Marzuola, Peter J. Mucha, Braxton Osting, and Jonathan Weare	(参考訳) 非向グラフにおける最短経路、可換時間、拡散距離は、次元減少、リンク予測、トリップ計画といった応用で広く用いられている。マルコフ連鎖や有向グラフから導出されるデータの非対称構造の利用に関心が高まるが、このタスクに特に適応する指標はほとんどない。我々は、任意のエルゴード、有限状態、時間同質なマルコフ連鎖の状態空間上の計量、特に有向グラフから導かれる任意のマルコフ連鎖について紹介する。提案手法は,あるノードから別のノードへのランダムウォーカーの移動に伴う距離空間の近さを仮定して構築した。特に、私たちの測定基準は、最短距離と平均歩行距離に影響を受けないので、既存の測定基準と比較して新しい情報を与えます。我々は、計量における退化の可能性を利用して、有向グラフの興味深い構造理論を開発し、関連する商化手順を探求する。我々のメトリックは、$O(n^3)$ timeで計算でき、$n$は状態の数であり、例えば、デスクトップコンピュータ上の$n=10,000$ノードと$\approx 38M$エッジまでスケールする。いくつかの例では、メートル法の性質を調べ、別の方法と比較し、密度グラフにおけるコミュニティ構造の弱い回復、可視化、構造回復、ダイナミクス探索、マルチスケールクラスタ検出に有用性を示す。 The shortest-path, commute time, and diffusion distances on undirected graphs have been widely employed in applications such as dimensionality reduction, link prediction, and trip planning. Increasingly, there is interest in using asymmetric structure of data derived from Markov chains and directed graphs, but few metrics are specifically adapted to this task. We introduce a metric on the state space of any ergodic, finite-state, time-homogeneous Markov chain and, in particular, on any Markov chain derived from a directed graph. Our construction is based on hitting probabilities, with nearness in the metric space related to the transfer of random walkers from one node to another at stationarity. Notably, our metric is insensitive to shortest and average walk distances, thus giving new information compared to existing metrics. We use possible degeneracies in the metric to develop an interesting structural theory of directed graphs and explore a related quotienting procedure. Our metric can be computed in $O(n^3)$ time, where $n$ is the number of states, and in examples we scale up to $n=10,000$ nodes and $\approx 38M$ edges on a desktop computer. In several examples, we explore the nature of the metric, compare it to alternative methods, and demonstrate its utility for weak recovery of community structure in dense graphs, visualization, structure recovering, dynamics exploration, and multiscale cluster detection.	翻訳日:2022-11-17 03:57:41 公開日:2021-01-18
# 低パス協調フィルタを用いたグラフ畳み込みネットワークの提案 Graph Convolutional Network for Recommendation with Low-pass Collaborative Filters ( http://arxiv.org/abs/2006.15516v3 ) ライセンス: Link先を確認	Wenhui Yu and Zheng Qin	(参考訳) \textbf{g}raph \textbf{c}onvolutional \textbf{n}etwork (\textbf{gcn})はレコメンデーションなどのグラフデータ学習タスクで広く使われている。しかし、大きなグラフに直面する場合、グラフの畳み込みは非常に計算コストがかかるため、既存のすべてのGCNでは単純化されるが、過剰な単純化により深刻な障害が生じる。このギャップに対処するために、GCN の \textit{ Origin graph convolution} を利用し、大きなグラフに適用するために \textbf{L}ow-pass \textbf{C}ollaborative \textbf{F}ilter (\textbf{LCF}) を提案する。 LCFは、観測データの露出と量子化に起因するノイズを取り除くように設計されており、また、グラフ畳み込みの複雑さを非スケールで低減する。実験の結果,LCFはグラフ畳み込みの有効性と効率を向上し,GCNは既存のGCNよりも優れていた。コードは \url{https://github.com/wenhui-yu/lcfn} で入手できる。 \textbf{G}raph \textbf{C}onvolutional \textbf{N}etwork (\textbf{GCN}) is widely used in graph data learning tasks such as recommendation. However, when facing a large graph, the graph convolution is very computationally expensive thus is simplified in all existing GCNs, yet is seriously impaired due to the oversimplification. To address this gap, we leverage the \textit{original graph convolution} in GCN and propose a \textbf{L}ow-pass \textbf{C}ollaborative \textbf{F}ilter (\textbf{LCF}) to make it applicable to the large graph. LCF is designed to remove the noise caused by exposure and quantization in the observed data, and it also reduces the complexity of graph convolution in an unscathed way. Experiments show that LCF improves the effectiveness and efficiency of graph convolution and our GCN outperforms existing GCNs significantly. Codes are available on \url{https://github.com/Wenhui-Yu/LCFN}.	翻訳日:2022-11-16 02:25:54 公開日:2021-01-18
# 可聴, 探究: 好奇心をオーディオ・ビジュアル・アソシエーションで見る See, Hear, Explore: Curiosity via Audio-Visual Association ( http://arxiv.org/abs/2007.03669v2 ) ライセンス: Link先を確認	Victoria Dean, Shubham Tulsiani, Abhinav Gupta	(参考訳) 探索は強化学習における中核的な課題の1つだ。好奇心駆動探索の一般的な定式化は、学習モデルによって予測される実際の未来と未来との差を用いる。しかし、未来を予測することは本質的に難しい課題であり、確率性に直面しても不適切である。本稿では,異なる感覚間の新たな関連に報いる好奇心の代替形態を提案する。我々のアプローチは、より効率的な探索のためのより強力な信号を提供するために、複数のモダリティを利用する。我々の手法は、人間にとって視覚と音の両方が探索において重要な役割を果たすという事実に着想を得ている。いくつかのAtari環境とHabitat(フォトリアリスティックナビゲーションシミュレータ)について,外部報酬がない場合の学習エージェントを内在的に導くために,オーディオ視覚関連モデルを使用することの利点を示す。ビデオやコードについてはhttps://vdean.github.io/audio-curiosity.htmlを参照。 Exploration is one of the core challenges in reinforcement learning. A common formulation of curiosity-driven exploration uses the difference between the real future and the future predicted by a learned model. However, predicting the future is an inherently difficult task which can be ill-posed in the face of stochasticity. In this paper, we introduce an alternative form of curiosity that rewards novel associations between different senses. Our approach exploits multiple modalities to provide a stronger signal for more efficient exploration. Our method is inspired by the fact that, for humans, both sight and sound play a critical role in exploration. We present results on several Atari environments and Habitat (a photorealistic navigation simulator), showing the benefits of using an audio-visual association model for intrinsically guiding learning agents in the absence of external rewards. For videos and code, see https://vdean.github.io/audio-curiosity.html.	翻訳日:2022-11-12 18:13:37 公開日:2021-01-18
# 構造畳み込みモデルの昇降によるロスレス圧縮 Lossless Compression of Structured Convolutional Models via Lifting ( http://arxiv.org/abs/2007.06567v2 ) ライセンス: Link先を確認	Gustav Sourek, Filip Zelezny, Ondrej Kuzelka	(参考訳) 持ち上げは、基礎となる対称性を利用して、関係ドメインに一般化されたグラフィカルなモデルをスケールアップする効率的なテクニックである。同時に、ニューラルネットワークはグリッドのようなテンソルデータから様々な属性グラフやリレーショナルデータベースなどの構造化表現へと継続的に拡張されている。データの不規則構造に対処するため、モデルは通常、畳み込みの概念を外挿し、パラメータ共有を動的に展開された計算グラフに効果的に導入する。計算グラフ自体は、持ち上げられたグラフィカルモデルと同様に、基礎となるデータの対称性を反映する。昇降に触発されて,対称性を検知し,情報を失うことなく神経モデルを圧縮する簡易かつ効率的な手法を提案する。このような圧縮が、分子分類や知識ベース補完といった様々なタスクにおいて、様々なグラフニューラルネットワークのような構造化畳み込みモデルの大幅な高速化につながることを示す。 Lifting is an efficient technique to scale up graphical models generalized to relational domains by exploiting the underlying symmetries. Concurrently, neural models are continuously expanding from grid-like tensor data into structured representations, such as various attributed graphs and relational databases. To address the irregular structure of the data, the models typically extrapolate on the idea of convolution, effectively introducing parameter sharing in their, dynamically unfolded, computation graphs. The computation graphs themselves then reflect the symmetries of the underlying data, similarly to the lifted graphical models. Inspired by lifting, we introduce a simple and efficient technique to detect the symmetries and compress the neural models without loss of any information. We demonstrate through experiments that such compression can lead to significant speedups of structured convolutional models, such as various Graph Neural Networks, across various tasks, such as molecule classification and knowledge-base completion.	翻訳日:2022-11-10 22:46:47 公開日:2021-01-18
# 音響表現からモデルロバスト性へ From Sound Representation to Model Robustness ( http://arxiv.org/abs/2007.13703v3 ) ライセンス: Link先を確認	Mohammad Esmaeilpour, Patrick Cardinal, Alessandro Lameiras Koerich	(参考訳) 本稿では, 各種標準環境音響表現(スペクトログラム)が, 被害者残差畳み込みニューラルネットワークの認識性能と対角攻撃性に与える影響について検討する。 ResNet-18モデルは,3つの環境音響データセットのベンチマーク実験により,分類精度とトレーニングパラメータ数の両方において,GoogLeNetやAlexNetといった他のディープラーニングアーキテクチャよりも優れていることがわかった。そこで我々はこのモデルを,その後の調査のためのフロントエンド分類器として設定した。ここでは,より情報的なメル周波数ケプストラム係数(mfcc),短時間フーリエ変換(stft),離散ウェーブレット変換(dwt)の生成に必要な様々な設定の影響を測定する。この測定は、対向ロバスト性に対する分類性能の比較を含む。敵が割り当てる平均予算と攻撃コストのバランスについて,6つの攻撃アルゴリズムに対する認識精度とモデルロバスト性の逆関係を示す。さらに,DWTスペクトルを用いたResNet-18モデルでは高い認識精度が得られたが,他の2次元表現と比較して,このモデルに対する攻撃は比較的コストがかかることを示した。 In this paper, we investigate the impact of different standard environmental sound representations (spectrograms) on the recognition performance and adversarial attack robustness of a victim residual convolutional neural network. Averaged over various experiments on three benchmarking environmental sound datasets, we found the ResNet-18 model outperforms other deep learning architectures such as GoogLeNet and AlexNet both in terms of classification accuracy and the number of training parameters. Therefore we set this model as our front-end classifier for subsequent investigations. Herein, we measure the impact of different settings required for generating more informative mel-frequency cepstral coefficient (MFCC), short-time Fourier transform (STFT), and discrete wavelet transform (DWT) representations on our front-end model. This measurement involves comparing the classification performance over the adversarial robustness. On the balance of average budgets allocated by adversary and the cost of attack, we demonstrate an inverse relationship between recognition accuracy and model robustness against six attack algorithms. Moreover, our experimental results show that while the ResNet-18 model trained on DWT spectrograms achieves the highest recognition accuracy, attacking this model is relatively more costly for the adversary compared to other 2D representations.	翻訳日:2022-11-06 11:54:54 公開日:2021-01-18
# 深部スケッチガイド付きマンガビデオインタリング Deep Sketch-guided Cartoon Video Inbetweening ( http://arxiv.org/abs/2008.04149v2 ) ライセンス: Link先を確認	Xiaoyu Li, Bo Zhang, Jing Liao and Pedro V. Sander	(参考訳) 本研究では,2つの入力キーフレームから色情報を抽出し,ユーザスケッチによるアニメーション動作に追従し,マンガ映像を作成するための新しい枠組みを提案する。提案手法の重要な考え方は,スケッチと漫画ビデオフレーム間の密なクロスドメイン対応を推定することであり,スケッチに導かれた中間フレームを合成するためにオクルージョン推定を伴うブレンディングモジュールを用いる。その後、入力フレームと確立対応を備えた合成フレームとを任意の時間フレーム補間パイプラインに供給し、追加のフレーム間を合成する。最後に、時間的一貫性を保つモジュールを用いる。一般的なフレーム補間法と比較して,比較的大きな動きのフレームに対処できると同時に,スケッチガイドを編集することで,ユーザが生成したビデオシーケンスを制御できる柔軟性も備えている。フレームとスケッチの対応を明示的に考慮することで、他の画像合成法よりも高品質な結果が得られる。これらの結果から,本システムは,既存のソリューションよりも優れた結果が得られることを示す。 We propose a novel framework to produce cartoon videos by fetching the color information from two input keyframes while following the animated motion guided by a user sketch. The key idea of the proposed approach is to estimate the dense cross-domain correspondence between the sketch and cartoon video frames, and employ a blending module with occlusion estimation to synthesize the middle frame guided by the sketch. After that, the input frames and the synthetic frame equipped with established correspondence are fed into an arbitrary-time frame interpolation pipeline to generate and refine additional inbetween frames. Finally, a module to preserve temporal consistency is employed. Compared to common frame interpolation methods, our approach can address frames with relatively large motion and also has the flexibility to enable users to control the generated video sequences by editing the sketch guidance. By explicitly considering the correspondence between frames and the sketch, we can achieve higher quality results than other image synthesis methods. Our results show that our system generalizes well to different movie frames, achieving better results than existing solutions.	翻訳日:2022-10-31 23:03:49 公開日:2021-01-18
# 脳接続ネットワーク分類のための順序パターンカーネル Ordinal Pattern Kernel for Brain Connectivity Network Classification ( http://arxiv.org/abs/2008.07719v2 ) ライセンス: Link先を確認	Kai Ma, Biao Jie, Daoqiang Zhang	(参考訳) 脳領域の機能的または構造的相互作用を特徴付ける脳接続ネットワークは、脳疾患の分類に広く使われている。グラフカーネル(すなわちグラフ上に定義されたカーネル)のようなカーネルベースの手法は、脳ネットワークの類似性を測定するために提案され、有望な分類性能が得られる。しかし、ほとんどのグラフカーネルは、エッジが存在するか否かに関わらず、未重み付きグラフ(すなわちネットワーク)上に構築されており、脳接続ネットワークにおけるエッジの貴重な重み情報を無視し、エッジ重みは脳領域間の時間的相関やファイバー接続の強さを伝達する。そこで本研究では,脳接続ネットワーク分類のための順序パターンカーネルを提案する。非重み付きグラフの位相的類似度を測定する既存のグラフカーネルとは異なり、提案した順序パターンカーネルは重み付きネットワークの順序パターンを比較して重み付きネットワークの類似度を算出する。提案手法の有効性を評価するため,adniデータベースから脳疾患の実データを用いて,深さ優先型順序パターンカーネルをさらに開発し,広範な実験を行った。その結果,提案する順序パターンカーネルは,最先端グラフカーネルと比較して分類性能が向上することが示された。 Brain connectivity networks, which characterize the functional or structural interaction of brain regions, has been widely used for brain disease classification. Kernel-based method, such as graph kernel (i.e., kernel defined on graphs), has been proposed for measuring the similarity of brain networks, and yields the promising classification performance. However, most of graph kernels are built on unweighted graph (i.e., network) with edge present or not, and neglecting the valuable weight information of edges in brain connectivity network, with edge weights conveying the strengths of temporal correlation or fiber connection between brain regions. Accordingly, in this paper, we present an ordinal pattern kernel for brain connectivity network classification. Different with existing graph kernels that measures the topological similarity of unweighted graphs, the proposed ordinal pattern kernels calculate the similarity of weighted networks by comparing ordinal patterns from weighted networks. To evaluate the effectiveness of the proposed ordinal kernel, we further develop a depth-first-based ordinal pattern kernel, and perform extensive experiments in a real dataset of brain disease from ADNI database. The results demonstrate that our proposed ordinal pattern kernel can achieve better classification performance compared with state-of-the-art graph kernels.	翻訳日:2022-10-27 20:46:51 公開日:2021-01-18
# クローズドループコンピュータ支援肺超音波画像の画質評価 Image quality assessment for closed-loop computer-assisted lung ultrasound ( http://arxiv.org/abs/2008.08840v2 ) ライセンス: Link先を確認	Zachary M C Baum, Ester Bonmati, Lorenzo Cristoni, Andrew Walden, Ferran Prados, Baris Kanber, Dean C Barratt, David J Hawkes, Geoffrey J M Parker, Claudia A M Gandini Wheeler-Kingshott, Yipeng Hu	(参考訳) 本稿では,集中治療環境における超音波画像を用いた肺異常検出のための新しい2段階コンピュータ支援システムについて述べる。提案システムは, 画像品質の予測を自動化する品質評価モジュールと, 十分な品質の超音波画像におけるオアノマリーの可能性を判定する診断支援モジュールの2つの深層学習モデルから構成される。 2段階戦略では,品質評価分類器の訓練に利用可能な制御ケースの欠如に対処するために,新規検出アルゴリズムを用いる。診断支援モジュールは、品質評価モジュールからクローズドループフィードバック機構によって保証される十分な品質と判断されたデータでトレーニングすることができる。 2つの病院でスキャンされた37人の新型コロナウイルス陽性患者の超音波画像と12のコントロールケースから,提案した機械学習アプローチの有効性を実証した。品質評価モジュールを用いて,十分な画質画像と不十分な画質画像の分類を行う場合の精度は86%であった。品質評価モジュールによって決定される十分な品質のデータについて,提案システム内のネットワークのトレーニング中,5つのホールドアウトテストデータセットにおいて,covid-19陽性例の平均分類精度,感度,特異性がそれぞれ0.95, 0.91, 0.97であった。全体として、この2つのモジュールの統合は、医療現場で疑われる呼吸器疾患の患者に対して、正確、迅速、実用的な取得指導と診断支援をもたらす。 We describe a novel, two-stage computer assistance system for lung anomaly detection using ultrasound imaging in the intensive care setting to improve operator performance and patient stratification during coronavirus pandemics. The proposed system consists of two deep-learning-based models: a quality assessment module that automates predictions of image quality, and a diagnosis assistance module that determines the likelihood-oh-anomaly in ultrasound images of sufficient quality. Our two-stage strategy uses a novelty detection algorithm to address the lack of control cases available for training the quality assessment classifier. The diagnosis assistance module can then be trained with data that are deemed of sufficient quality, guaranteed by the closed-loop feedback mechanism from the quality assessment module. Using more than 25000 ultrasound images from 37 COVID-19-positive patients scanned at two hospitals, plus 12 control cases, this study demonstrates the feasibility of using the proposed machine learning approach. We report an accuracy of 86% when classifying between sufficient and insufficient quality images by the quality assessment module. For data of sufficient quality - as determined by the quality assessment module - the mean classification accuracy, sensitivity, and specificity in detecting COVID-19-positive cases were 0.95, 0.91, and 0.97, respectively, across five holdout test data sets unseen during the training of any networks within the proposed system. Overall, the integration of the two modules yields accurate, fast, and practical acquisition guidance and diagnostic assistance for patients with suspected respiratory conditions at point-of-care.	翻訳日:2022-10-27 04:08:12 公開日:2021-01-18
# パイロット:IJCAI 2020の人間とエージェントのネゴシエーションチャレンジの勝者 Pilot: Winner of the Human-Agent Negotiation Challenge at IJCAI 2020 ( http://arxiv.org/abs/2009.06781v2 ) ライセンス: Link先を確認	Kushal Chawla, Gale Lucas	(参考訳) この文書は、IJCAI 2020のANACの人間-エージェントネゴシエーションチャレンジで優勝したエージェントのパイロットについて記述しています。 pilotは、人間のパートナーと3つの交渉の連続に参加する仮想人間である。本システムは,IAGO(Interactive Arbitration Guide Online)ネゴシエーションフレームワークをベースとしている。我々は,エージェントの行動や性格を規定する様々な鍵となる原則を導出するために,事前の感情コンピューティングと心理学の研究を交渉に活用する。 This document describes our agent Pilot, winner of the Human-Agent Negotiation Challenge at ANAC, IJCAI 2020. Pilot is a virtual human that participates in a sequence of three negotiations with a human partner. Our system is based on the Interactive Arbitration Guide Online (IAGO) negotiation framework. We leverage prior Affective Computing and Psychology research in negotiations to guide various key principles that define the behavior and personality of our agent.	翻訳日:2022-10-18 12:50:26 公開日:2021-01-18
# マルチエージェント価値分解のためのエネルギーベースサプライズ最小化 Energy-based Surprise Minimization for Multi-Agent Value Factorization ( http://arxiv.org/abs/2009.09842v4 ) ライセンス: Link先を確認	Karush Suri, Xiao Qi Shi, Konstantinos Plataniotis, Yuri Lawryshyn	(参考訳) MARL(Multi-Agent Reinforcement Learning)は、分散政策を集中的に訓練する上で、価値分解法を用いて大きな成功を収めている。しかしながら、スプリアス状態と近似バイアスにまたがる驚きに対処することは、マルチエージェントの設定では未解決の問題のままである。この目標に向けて,エージェント間のエネルギー利用を最小化するアルゴリズムであるEMIX(Energy-based MIXer)を導入する。 1) emixは,マルチエージェントの部分観測可能な設定の場合,複数のエージェントにまたがる新たなサプライズ最小化手法を導入している。 2) emix はエネルギー作用素の理論的保証と実験検証を伴う marl におけるエネルギー関数の実用化を強調する。最後に、(3)EMIXはMARLのエージェント間の過大評価バイアスに対処するためにMaxmin Q-learningを拡張する。 StarCraft IIのマイクロマネジメントシナリオを挑戦する研究において、EMIXはマルチエージェントサプライズ最小化のための一貫した安定したパフォーマンスを示す。さらに, エネルギーベース方式の必要性と, MARLにおける過大評価バイアスの除去の必要性について検討した。 EMIXの実装はkarush17.github.io/emix-web/で確認できます。 Multi-Agent Reinforcement Learning (MARL) has demonstrated significant success in training decentralised policies in a centralised manner by making use of value factorization methods. However, addressing surprise across spurious states and approximation bias remain open problems for multi-agent settings. Towards this goal, we introduce the Energy-based MIXer (EMIX), an algorithm which minimizes surprise utilizing the energy across agents. Our contributions are threefold; (1) EMIX introduces a novel surprise minimization technique across multiple agents in the case of multi-agent partially-observable settings. (2) EMIX highlights a practical use of energy functions in MARL with theoretical guarantees and experiment validations of the energy operator. Lastly, (3) EMIX extends Maxmin Q-learning for addressing overestimation bias across agents in MARL. In a study of challenging StarCraft II micromanagement scenarios, EMIX demonstrates consistent stable performance for multiagent surprise minimization. Moreover, our ablation study highlights the necessity of the energy-based scheme and the need for elimination of overestimation bias in MARL. Our implementation of EMIX can be found at karush17.github.io/emix-web/.	翻訳日:2022-10-17 23:47:07 公開日:2021-01-18
# ハミルトニアン完備問題(hamiltonian completion problem)の変遷 Evolving test instances of the Hamiltonian completion problem ( http://arxiv.org/abs/2011.02291v2 ) ライセンス: Link先を確認	Thibault Lechien, Jorik Jooken, Patrick De Causmaecker	(参考訳) グラフインスタンス上でのアルゴリズム性能の予測と比較は、複数の理由から難しい。まず、パフォーマンスをベンチマークするインスタンスの標準セットは、通常存在しない。第二に、既存のグラフ生成器を使用すると、困難なスペクトルが制限され、結果として得られるグラフは通常、音の結論を引き出すのに十分な多様性がない。そこで最近の研究は、進化的アルゴリズムを用いて多様なインスタンス群を生成する新しい手法を提案する。そして、結果のグラフを分析し、どの属性がアルゴリズムのパフォーマンスに最も関連しているかに関する重要な洞察を得ることができます。以前は目に見えない機能の組み合わせでグラフを生成するために、インスタンス空間の観測されたギャップを埋めることもできる。この手法は、2つの異なる解法、すなわち Concorde TSP Solver とマルチスタート局所探索アルゴリズムを用いてハミルトン完備化問題のインスタンス空間に適用する。 Predicting and comparing algorithm performance on graph instances is challenging for multiple reasons. First, there is usually no standard set of instances to benchmark performance. Second, using existing graph generators results in a restricted spectrum of difficulty and the resulting graphs are usually not diverse enough to draw sound conclusions. That is why recent work proposes a new methodology to generate a diverse set of instances by using an evolutionary algorithm. We can then analyze the resulting graphs and get key insights into which attributes are most related to algorithm performance. We can also fill observed gaps in the instance space in order to generate graphs with previously unseen combinations of features. This methodology is applied to the instance space of the Hamiltonian completion problem using two different solvers, namely the Concorde TSP Solver and a multi-start local search algorithm.	翻訳日:2022-10-10 20:02:11 公開日:2021-01-18
# 一連の不幸な反事実的出来事:反事実的説明における時間の役割 A Series of Unfortunate Counterfactual Events: the Role of Time in Counterfactual Explanations ( http://arxiv.org/abs/2010.04687v2 ) ライセンス: Link先を確認	Andrea Ferrario, Michele Loi	(参考訳) 反事実的説明は、説明可能な人工知能研究領域におけるポストホック解釈可能性手法の顕著な例である。彼らは個人に代替シナリオと一連のレコメンデーションを提供し、機械学習モデルの結果を追求する。近年,本論文は,現実の文脈における適用性を支えるための実現可能性や行動可能性,スパーシティといった反事実的説明のデシデラタを特定している。しかし,本論文は,反実的説明の時間依存性の問題を無視していることを示す。時間的依存とレコメンデーションの提供のため、現実のアプリケーションでは、実現可能で、行動可能で、スパースなカウンターファクトな説明が適さないかもしれない、と我々は主張する。これは、私たちが"不幸な反ファクトイベント"と呼ぶものが出現する可能性があるためです。これらの出来事は、結果を説明する必要がある機械学習モデルの再訓練によって起こりうる。一連の不幸な反事実的出来事は、反事実的説明の推奨を実行に移した人々の努力をいらいらさせる。これは、学習支援決定を一貫して提供できる機関の能力に対する人々の信頼に悪影響を及ぼす。本稿では,反事実的説明の履歴を利用した不運な反事実的事象の発生問題に対処するためのアプローチを提案する。本論文の最終部では,不運な対実事件に対処する2つの異なる戦略の倫理的分析を提案する。信用貸付組織の信頼度、採用する意思決定モデル、信用貸付の社会的経済的機能を維持するための倫理的責任を負う命令に反応することを示す。 Counterfactual explanations are a prominent example of post-hoc interpretability methods in the explainable Artificial Intelligence research domain. They provide individuals with alternative scenarios and a set of recommendations to achieve a sought-after machine learning model outcome. Recently, the literature has identified desiderata of counterfactual explanations, such as feasibility, actionability and sparsity that should support their applicability in real-world contexts. However, we show that the literature has neglected the problem of the time dependency of counterfactual explanations. We argue that, due to their time dependency and because of the provision of recommendations, even feasible, actionable and sparse counterfactual explanations may not be appropriate in real-world applications. This is due to the possible emergence of what we call "unfortunate counterfactual events." These events may occur due to the retraining of machine learning models whose outcomes have to be explained via counterfactual explanation. Series of unfortunate counterfactual events frustrate the efforts of those individuals who successfully implemented the recommendations of counterfactual explanations. This negatively affects people's trust in the ability of institutions to provide machine learning-supported decisions consistently. We introduce an approach to address the problem of the emergence of unfortunate counterfactual events that makes use of histories of counterfactual explanations. In the final part of the paper we propose an ethical analysis of two distinct strategies to cope with the challenge of unfortunate counterfactual events. We show that they respond to an ethically responsible imperative to preserve the trustworthiness of credit lending organizations, the decision models they employ, and the social-economic function of credit lending.	翻訳日:2022-10-09 05:40:58 公開日:2021-01-18
# スケルトンベース行動認識のためのポーズ改善グラフ畳み込みネットワーク Pose Refinement Graph Convolutional Network for Skeleton-based Action Recognition ( http://arxiv.org/abs/2010.07367v2 ) ライセンス: Link先を確認	Shijie Li, Jinhui Yi, Yazan Abu Farha and Juergen Gall	(参考訳) 2Dまたは3Dの骨格データを取得する進歩により、骨格に基づく行動認識はここ数年で注目されている。スケルトンデータはグラフによって一般的に表現されるので、グラフ畳み込みネットワークが提案されている。現在のグラフ畳み込みネットワークはアクションを正確に認識するが、計算資源が限られているロボティクスアプリケーションでは高価すぎる。そこで本稿では,従来の作業の限界に対処する高効率なグラフ畳み込みネットワークを提案する。これは、動きと空間情報を徐々に融合し、時間分解能をできるだけ早く低減する並列構造によって達成される。さらに、人間のポーズがエラーを含むことができる問題に明示的に対処する。この目的のために、ネットワークはまず、アクションを認識するためにさらに処理される前に、ポーズを洗練する。したがって、我々はネットワークを Pose Refinement Graph Convolutional Network と呼ぶ。他のグラフ畳み込みネットワークと比較して、我々のネットワークはパラメータを86\%-93\%少なくし、浮動小数点演算を89%-96%削減し、同等の精度を達成する。したがって、精度、メモリフットプリント、処理時間の間のトレードオフがより良くなり、ロボティクスアプリケーションに適している。 With the advances in capturing 2D or 3D skeleton data, skeleton-based action recognition has received an increasing interest over the last years. As skeleton data is commonly represented by graphs, graph convolutional networks have been proposed for this task. While current graph convolutional networks accurately recognize actions, they are too expensive for robotics applications where limited computational resources are available. In this paper, we therefore propose a highly efficient graph convolutional network that addresses the limitations of previous works. This is achieved by a parallel structure that gradually fuses motion and spatial information and by reducing the temporal resolution as early as possible. Furthermore, we explicitly address the issue that human poses can contain errors. To this end, the network first refines the poses before they are further processed to recognize the action. We therefore call the network Pose Refinement Graph Convolutional Network. Compared to other graph convolutional networks, our network requires 86\%-93\% less parameters and reduces the floating point operations by 89%-96% while achieving a comparable accuracy. It therefore provides a much better trade-off between accuracy, memory footprint and processing time, which makes it suitable for robotics applications.	翻訳日:2022-10-07 13:57:38 公開日:2021-01-18
# MRI前立腺病変分類における領域適応の不確かさ Harnessing Uncertainty in Domain Adaptation for MRI Prostate Lesion Segmentation ( http://arxiv.org/abs/2010.07411v2 ) ライセンス: Link先を確認	Eleni Chiou, Francesco Giganti, Shonit Punwani, Iasonas Kokkinos, Eleftheria Panagiotaki	(参考訳) トレーニングデータの必要性は、学習型医用画像解析における新しい画像モダリティの導入を妨げる可能性がある。ドメイン適応法は、関連するソースドメインから新しいターゲットドメインにトレーニングデータを変換することで部分的にこの問題を軽減するが、一般的には1対1の翻訳が可能であると仮定する。我々の研究は、単一のソースサンプルから複数のターゲットサンプルが出現する、より情報的なターゲットドメインに適応するという課題に対処する。特に,癌評価のための最適化された取得プロトコルを含む,よりリッチなMRIモダリティである mp-MRI から VERDICT への変換を検討する。我々は、このマッピングの固有の不確実性を明確に説明し、1つの入力で条件付けられた複数の出力を生成するためにそれを利用する。以上の結果から,単純なCycleGANベースラインと,識別的セグメンテーション損失と/または残差アダプタを併用したより強力なアプローチの両面から,対象領域に対する画像表現を系統的に向上させることが可能であることが示唆された。決定論的手法と比較して、我々の手法は、幅広いデータセットサイズ、ますます強力なベースライン、評価尺度で大幅に改善される。 The need for training data can impede the adoption of novel imaging modalities for learning-based medical image analysis. Domain adaptation methods partially mitigate this problem by translating training data from a related source domain to a novel target domain, but typically assume that a one-to-one translation is possible. Our work addresses the challenge of adapting to a more informative target domain where multiple target samples can emerge from a single source sample. In particular we consider translating from mp-MRI to VERDICT, a richer MRI modality involving an optimized acquisition protocol for cancer characterization. We explicitly account for the inherent uncertainty of this mapping and exploit it to generate multiple outputs conditioned on a single input. Our results show that this allows us to extract systematically better image representations for the target domain, when used in tandem with both simple, CycleGAN-based baselines, as well as more powerful approaches that integrate discriminative segmentation losses and/or residual adapters. When compared to its deterministic counterparts, our approach yields substantial improvements across a broad range of dataset sizes, increasingly strong baselines, and evaluation measures.	翻訳日:2022-10-07 13:10:28 公開日:2021-01-18
# 条件付き変分オートエンコーダにおけるマルチモーダル潜時空間のエビデンシャルスカラー化 Evidential Sparsification of Multimodal Latent Spaces in Conditional Variational Autoencoders ( http://arxiv.org/abs/2010.09164v3 ) ライセンス: Link先を確認	Masha Itkina, Boris Ivanovic, Ransalu Senanayake, Mykel J. Kochenderfer, and Marco Pavone	(参考訳) 変分オートエンコーダの離散的潜在空間は、自然言語理解、人間の意図予測、視覚シーン表現など、多くの現実世界の問題に対するデータ分布を効果的に捉えることが示されている。しかし、離散潜在空間は実世界のデータの複雑さを捉えるのに十分な大きさでなければならない。例えば、高次元の潜在環境表現で動き計画を実行することは難解である。学習されたマルチモダリティを保ちつつ、訓練された条件付き変分オートエンコーダの離散的潜在空間をスパースする問題を考える。ポストホック潜在空間還元法として,特定の入力条件から直接的証拠を受け取る潜在クラスを同定し,そうでないクラスをフィルタする。画像生成や人間の行動予測などの多様なタスクの実験は、学習された多モード性を維持しながら、モデルの離散潜在サンプル空間サイズを小さくする手法の有効性を実証する。 Discrete latent spaces in variational autoencoders have been shown to effectively capture the data distribution for many real-world problems such as natural language understanding, human intent prediction, and visual scene representation. However, discrete latent spaces need to be sufficiently large to capture the complexities of real-world data, rendering downstream tasks computationally challenging. For instance, performing motion planning in a high-dimensional latent representation of the environment could be intractable. We consider the problem of sparsifying the discrete latent space of a trained conditional variational autoencoder, while preserving its learned multimodality. As a post hoc latent space reduction technique, we use evidential theory to identify the latent classes that receive direct evidence from a particular input condition and filter out those that do not. Experiments on diverse tasks, such as image generation and human behavior prediction, demonstrate the effectiveness of our proposed technique at reducing the discrete latent sample space size of a model while maintaining its learned multimodality.	翻訳日:2022-10-05 20:21:16 公開日:2021-01-18
# 連続学習のためのモジュール関連性 Modular-Relatedness for Continual Learning ( http://arxiv.org/abs/2011.01272v2 ) ライセンス: Link先を確認	Ammar Shaker, Shujian Yu, Francesco Alesiani	(参考訳) 本稿では,逐次的タスク学習者にとって有益な連続学習(CL)手法を提案する。このアプローチの主なターゲットは、ニューラルネットワークのモジュール部分の自動抽出と、これらのモジュールコンポーネントによって与えられたタスク間の関連性の推定です。この手法は、正規化ベースの(例えばElastic Weight Consolidation)やリハーサルベースの(例えばGradient Episodic Memory)といった、エピソードメモリを必要とするCLメソッドの異なるファミリーに適用できる。実験結果から,EWC や GEM などの手法,特にメモリ予算が極めて限られている場合に,顕著な性能向上(忘れることへの堅牢性)が得られた。 In this paper, we propose a continual learning (CL) technique that is beneficial to sequential task learners by improving their retained accuracy and reducing catastrophic forgetting. The principal target of our approach is the automatic extraction of modular parts of the neural network and then estimating the relatedness between the tasks given these modular components. This technique is applicable to different families of CL methods such as regularization-based (e.g., the Elastic Weight Consolidation) or the rehearsal-based (e.g., the Gradient Episodic Memory) approaches where episodic memory is needed. Empirical results demonstrate remarkable performance gain (in terms of robustness to forgetting) for methods such as EWC and GEM based on our technique, especially when the memory budget is very limited.	翻訳日:2022-09-30 11:20:50 公開日:2021-01-18
# 天然ガスタービン発電プラントにおけるプロセス解析と予測モデリングによるnoxの環境汚染予測 Environmental Pollution Prediction of NOx by Process Analysis and Predictive Modelling in Natural Gas Turbine Power Plants ( http://arxiv.org/abs/2011.08978v2 ) ライセンス: Link先を確認	Alan Rezazadeh	(参考訳) 本研究の目的は,天然ガス発電タービンからのNOx排出を予測するK-Nearest-Neighbor (KNN) アルゴリズムを提案することである。電力生産のプロセスは、気象や電力網の要求など多くの要因により、動的かつ急速に変化している。ガスタービン装置はタービンの寿命とともに機器特性や熱力学的挙動が変化するため、発電のダイナミックな部分でもある。タービンの定期的なメンテナンスも発電プロセスの別のダイナミックな部分であり、機器の性能に影響を及ぼす。この分析は、比較的小さなデータセットでトレーニングされたKNNを使用して、最も正確な予測率を生成する。このステートメントは、KNNが現在の入力パラメータに最も近いKを見つけ、歴史的に類似した観測の定式平均を予測として推定するときに論理的に説明できる。本稿では,環境条件,電気出力,タービン性能要因を取り入れ,nox排出量予測のための機械学習モデルを構築した。このモデルは、有害な排出を減らすための運用プロセスを最適化し、全体の運用効率を向上させるために使用できる。原理成分アルゴリズム(PCA)のような潜在アルゴリズムは、機器の性能変化を監視し、プロセスのパラメントに深く影響し、結果としてNOx排出量を決定する。本報告では,多変量解析,クラスタリング,残差解析などの機械学習性能評価の典型的な統計的手法を用いている。 The main objective of this paper is to propose K-Nearest-Neighbor (KNN) algorithm for predicting NOx emissions from natural gas electrical generation turbines. The process of producing electricity is dynamic and rapidly changing due to many factors such as weather and electrical grid requirements. Gas turbine equipment are also a dynamic part of the electricity generation since the equipment characteristics and thermodynamics behavior change as the turbines age. Regular maintenance of turbines are also another dynamic part of the electrical generation process, affecting the performance of equipment. This analysis discovered using KNN, trained on relatively small dataset produces the most accurate prediction rates. This statement can be logically explained as KNN finds the K nearest neighbor to the current input parameters and estimates a rated average of historically similar observations as prediction. This paper incorporates ambient weather conditions, electrical output as well as turbine performance factors to build a machine learning model to predict NOx emissions. The model can be used to optimize the operational processes for reduction in harmful emissions and increasing overall operational efficiency. Latent algorithms such as Principle Component Algorithms (PCA) have been used for monitoring the equipment performance behavior change which deeply influences process paraments and consequently determines NOx emissions. Typical statistical methods of machine learning performance evaluations such as multivariate analysis, clustering and residual analysis have been used throughout the paper.	翻訳日:2022-09-29 13:10:16 公開日:2021-01-18
# オートエンコーダを用いた教師付き学習における誤ラベル画像の同定 Identifying Mislabeled Images in Supervised Learning Utilizing Autoencoder ( http://arxiv.org/abs/2011.03667v2 ) ライセンス: Link先を確認	Yunhao Yang, Andrew Whinston	(参考訳) 教師付き学習は、トレーニングデータの基底真理が正確であるという仮定に基づいている。しかし、これは現実世界の設定では保証されない。不正確なトレーニングデータは、予想外の予測をもたらす。画像分類では、不正確なラベルによって分類モデルも不正確になる可能性がある。本稿では,分類ネットワークを訓練する前に,教師なしの手法をトレーニングデータに適用する。画像のエンコードおよび再構成に畳み込みオートエンコーダを適用する。エンコーダは画像データを潜在空間に投影する。潜在空間では、画像の特徴は低い次元で保存される。同様の特徴を持つデータサンプルは、同じラベルを持つ可能性が高いと仮定する。ノイズサンプルは、密度ベーススキャン(DBSCAN)クラスタリングアルゴリズムによって潜在空間に分類することができる。これらの不正確なラベル付きデータは潜在空間の異常値として可視化される。そのため、DBSCANアルゴリズムで同定された外れ値は、誤ってラベル付けされたサンプルに分類することができる。外れ値が検出されると、すべての外れ値が誤ってラベル付けされたデータサンプルとして扱われ、データセットから削除される。これにより、教師付き学習ネットワークのトレーニングにトレーニングデータを直接使用できる。このアルゴリズムは、実験データセットの67%以上の不正ラベル付きデータを検出および削除することができる。 Supervised learning is based on the assumption that the ground truth in the training data is accurate. However, this may not be guaranteed in real-world settings. Inaccurate training data will result in some unexpected predictions. In image classification, incorrect labels may cause the classification model to be inaccurate as well. In this paper, I am going to apply unsupervised techniques to the training data before training the classification network. A convolutional autoencoder is applied to encode and reconstruct images. The encoder will project the image data on to latent space. In the latent space, image features are preserved in a lower dimension. The assumption is that data samples with similar features are likely to have the same label. Noised samples can be classified in the latent space by the Density-Base Scan (DBSCAN) clustering algorithm. These incorrectly labeled data are visualized as outliers in the latent space. Therefore, the outliers identified by the DBSCAN algorithm can be classified as incorrectly labeled samples. After the outliers are detected, all the outliers are treated as mislabeled data samples and removed from the dataset. Thus the training data can be directly used in training the supervised learning network. The algorithm can detect and remove above 67\% of mislabeled data in the experimental dataset.	翻訳日:2022-09-28 22:27:06 公開日:2021-01-18
# 自己修飾機能を有する有界有理エージェントの性能 Performance of Bounded-Rational Agents With the Ability to Self-Modify ( http://arxiv.org/abs/2011.06275v2 ) ライセンス: Link先を確認	Jakub T\v{e}tek, Marek Sklenka, Tom\'a\v{s} Gaven\v{c}iak	(参考訳) 複雑な環境に埋め込まれたエージェントの自己修正は、直接的手段(例えば、コードの変更)や間接的(例えば、オペレーターに影響、バグを悪用する、あるいは環境を悪用する)によって発生するのを避けるのが難しい。インテリジェントエージェントは、将来のインスタンスが同じ目標に向かって動くように、ユーティリティ機能を変更することを避けるインセンティブがある、と論じられている。 Everitt et al. (2016) は、完全に合理的なエージェントに対して自己修正オプションを提供することは無害であることを示した。この結果は有界合理性を持つエージェントにはもはや当てはまらないことを示す。このようなエージェントでは、自己修飾は、パフォーマンスの指数関数的劣化と、予め整列されたエージェントの徐々にの不適応を引き起こす可能性がある。この効果の大きさが、エージェントの合理性における不完全性のタイプと大きさ(以下1-4)に依存するかを検討する。また,モデル仮定とより広い問題とフレーミング空間についても論じる。エージェントが有界有理化できる4つの方法を検討する。(1)は必ずしも最適な行動を選択しない、(2)は人間の値と完全に一致しない、(3)は環境の不正確なモデルを持っている、(4)は間違った時間的割引係数を使用する。 2)-(4)の場合,エージェントの不完全性に起因する誤用は時間とともに増大しないが,(1)誤用は指数関数的に増加する可能性がある。 Self-modification of agents embedded in complex environments is hard to avoid, whether it happens via direct means (e.g. own code modification) or indirectly (e.g. influencing the operator, exploiting bugs or the environment). It has been argued that intelligent agents have an incentive to avoid modifying their utility function so that their future instances work towards the same goals. Everitt et al. (2016) formally show that providing an option to self-modify is harmless for perfectly rational agents. We show that this result is no longer true for agents with bounded rationality. In such agents, self-modification may cause exponential deterioration in performance and gradual misalignment of a previously aligned agent. We investigate how the size of this effect depends on the type and magnitude of imperfections in the agent's rationality (1-4 below). We also discuss model assumptions and the wider problem and framing space. We examine four ways in which an agent can be bounded-rational: it either (1) doesn't always choose the optimal action, (2) is not perfectly aligned with human values, (3) has an inaccurate model of the environment, or (4) uses the wrong temporal discounting factor. We show that while in the cases (2)-(4) the misalignment caused by the agent's imperfection does not increase over time, with (1) the misalignment may grow exponentially.	翻訳日:2022-09-26 07:24:47 公開日:2021-01-18
# (参考訳) リモートセンシングデータを用いたインドにおける大気汚染のシグネチャの同定 Use of Remote Sensing Data to Identify Air Pollution Signatures in India ( http://arxiv.org/abs/2012.00402v2 ) ライセンス: CC BY 4.0	Sivaramakrishnan KN, Lipika Deka, Manik Gupta	(参考訳) 大気汚染は国家の社会経済的地位に大きな影響を及ぼし、主要な大気汚染源を特定することが問題に取り組む中心となっている。インドのように様々な国にまたがる空間的・時間的な空気質データ取得は、このような分析の課題となっている。センチネル5P衛星の打ち上げは、地球規模の大気汚染物質を毎日観測するよりも幅広い種類の大気汚染物質を観測するのに役立った。本章では、センチネル-5p衛星から得られた時空間的マルチ汚染物質データを、インド国内の各地域、およびそれに伴う月平均汚染サインおよび各クラスターで表される傾向を導出して提示し、各種汚染源から放出される汚染物質の種類に基づいて、国や地域を特定するためにクラスタリング署名を用いる。 Air quality has major impact on a country's socio-economic position and identifying major air pollution sources is at the heart of tackling the issue. Spatially and temporally distributed air quality data acquisition across a country as varied as India has been a challenge to such analysis. The launch of the Sentinel-5P satellite has helped in the observation of a wider variety of air pollutants than measured before at a global scale on a daily basis. In this chapter, spatio-temporal multi pollutant data retrieved from Sentinel-5P satellite is used to cluster states as well as districts in India and associated average monthly pollution signature and trends depicted by each of the clusters are derived and presented.The clustering signatures can be used to identify states and districts based on the types of pollutants emitted by various pollution sources.	翻訳日:2021-05-31 10:20:02 公開日:2021-01-18
# (参考訳) クロスエントロピー損失を伴う神経崩壊 Neural Collapse with Cross-Entropy Loss ( http://arxiv.org/abs/2012.08465v2 ) ライセンス: CC BY 4.0	Jianfeng Lu, Stefan Steinerberger	(参考訳) 我々は、単位超球面上の n$ 特徴ベクトルを $\mathbb{r}^d$ とするクロスエントロピー損失の変分問題を考える。我々は、$d \geq n1$ のとき、大域的最小値は、神経崩壊の振る舞いを正当化するsimplex equiangular tight frameによって与えられることを証明する。また、固定$d$の$n \rightarrow \infty$として、極小化点は超球面上で一様に分布し、ベネデット・アンド・フィッカスのフレームポテンシャルとの接続を示す。 We consider the variational problem of cross-entropy loss with $n$ feature vectors on a unit hypersphere in $\mathbb{R}^d$. We prove that when $d \geq n - 1$, the global minimum is given by the simplex equiangular tight frame, which justifies the neural collapse behavior. We also prove that as $n \rightarrow \infty$ with fixed $d$, the minimizing points will distribute uniformly on the hypersphere and show a connection with the frame potential of Benedetto & Fickus.	翻訳日:2021-05-07 10:10:52 公開日:2021-01-18
# (参考訳) グラフニューラルネットワーク - 分類学、進歩、トレンド Graph Neural Networks: Taxonomy, Advances and Trends ( http://arxiv.org/abs/2012.08752v2 ) ライセンス: CC BY 4.0	Yu Zhou, Haixia Zheng, Xin Huang	(参考訳) グラフニューラルネットワークは、特定のタスクに応じて、現実世界のグラフを低次元空間に埋め込む強力なツールキットを提供する。これまでのところ、このトピックに関するいくつかの調査がある。しかし、通常は異なる角度に重点を置いているため、読者はグラフニューラルネットワークのパノラマを見ることができない。この調査は、この制限を克服し、グラフニューラルネットワークの包括的なレビューを提供することを目的としている。まず、グラフニューラルネットワークの新しい分類法を提供し、その後、最大400の関連する文献を参照して、グラフニューラルネットワークのパノラマを示す。これらはすべて対応するカテゴリに分類される。グラフニューラルネットワークを新たな段階に導くために,我々は,直面する課題を克服するために,今後4つの研究方向をまとめる。より多くの研究者がグラフニューラルネットワークを理解し、活用し、研究コミュニティで利用することが期待されている。 Graph neural networks provide a powerful toolkit for embedding real-world graphs into low-dimensional spaces according to specific tasks. Up to now, there have been several surveys on this topic. However, they usually lay emphasis on different angles so that the readers can not see a panorama of the graph neural networks. This survey aims to overcome this limitation, and provide a comprehensive review on the graph neural networks. First of all, we provide a novel taxonomy for the graph neural networks, and then refer to up to 400 relevant literatures to show the panorama of the graph neural networks. All of them are classified into the corresponding categories. In order to drive the graph neural networks into a new stage, we summarize four future research directions so as to overcome the facing challenges. It is expected that more and more scholars can understand and exploit the graph neural networks, and use them in their research community.	翻訳日:2021-05-06 11:21:26 公開日:2021-01-18
# 学習ブロックベースハイブリッド画像圧縮 Learned Block-based Hybrid Image Compression ( http://arxiv.org/abs/2012.09550v3 ) ライセンス: Link先を確認	Yaojun Wu, Xin Li, Zhizheng Zhang, Xin Jin, Zhibo Chen	(参考訳) 近年の学習画像圧縮技術は, 符号化処理と復号処理をフル解像度で行い, 実用用途に展開する際の2つの問題点を生じさせている。第一に、自己回帰エントロピーモデルの並列加速度はシリアルデコードにより達成できない。第二に、フル解像度の推論は、特に高解像度の画像に対して、GPUリソースが限られているメモリ外問題(OOM)を引き起こすことが多い。ブロックパーティションは上記の問題に対処するためのよい設計選択だが、ブロック間の冗長性を減らし、ブロック効果をなくすという新たな課題をもたらす。上記の課題に対処するため,本稿では,学習ブロックベースハイブリッド画像圧縮(LBHIC)フレームワークを提案する。具体的には,隣接ブロック間の関係を利用するために,学習画像圧縮フレームワークに明示的な内部予測を導入する。従来のコーデックにおける隣接画素の線形重み付けによるコンテキストモデリングに優れており、ストリッププーリングを利用して隣接潜在空間における最も関連する情報を抽出し、効果的な情報予測を実現することで、長距離相関をよりよく捉えるコンテキスト予測モジュール(cpm)を提案する。さらに,ブロッキングアーティファクトを緩和するために,エッジの重要性を考慮した境界対応後処理モジュール(BPM)を提案する。広範な実験により、lbhicコーデックはvvcを4.1%のビットレート保存で上回り、最先端の学習画像圧縮法と比較して約86.7%の復号時間を削減できることが示されている。 Recent works on learned image compression perform encoding and decoding processes in a full-resolution manner, resulting in two problems when deployed for practical applications. First, parallel acceleration of the autoregressive entropy model cannot be achieved due to serial decoding. Second, full-resolution inference often causes the out-of-memory(OOM) problem with limited GPU resources, especially for high-resolution images. Block partition is a good design choice to handle the above issues, but it brings about new challenges in reducing the redundancy between blocks and eliminating block effects. To tackle the above challenges, this paper provides a learned block-based hybrid image compression (LBHIC) framework. Specifically, we introduce explicit intra prediction into a learned image compression framework to utilize the relation among adjacent blocks. Superior to context modeling by linear weighting of neighbor pixels in traditional codecs, we propose a contextual prediction module (CPM) to better capture long-range correlations by utilizing the strip pooling to extract the most relevant information in neighboring latent space, thus achieving effective information prediction. Moreover, to alleviate blocking artifacts, we further propose a boundary-aware postprocessing module (BPM) with the edge importance taken into account. Extensive experiments demonstrate that the proposed LBHIC codec outperforms the VVC, with a bit-rate conservation of 4.1%, and reduces the decoding time by approximately 86.7% compared with that of state-of-the-art learned image compression methods.	翻訳日:2021-05-02 07:17:13 公開日:2021-01-18
# フレキシビリティ設計問題に対する強化学習 Reinforcement Learning for Flexibility Design Problems ( http://arxiv.org/abs/2101.00355v2 ) ライセンス: Link先を確認	Yehua Wei, Lei Zhang, Ruiyi Zhang, Shijing Si, Hao Zhang, Lawrence Carin	(参考訳) フレキシビリティ設計問題(英: Flexibility design problem)とは、産業間の戦略的意思決定において、柔軟性と適応性を持つネットワーク(例えば製造コスト)を設計することを目的とする問題である。基礎となる組合せの性質と確率的目的は、標準最適化法において柔軟性設計の問題を引き起こす。本稿では、柔軟性設計問題に対する強化学習(RL)フレームワークを開発する。具体的には、実験的な成功を確実にするため、ノイズ探索と分散低減によるメカニズムを慎重に設計し、高速適応の観点からRLの独特な利点を示す。実験結果から、RLに基づく手法は古典的ヒューリスティックよりも優れた解を常に見出すことが示された。 Flexibility design problems are a class of problems that appear in strategic decision-making across industries, where the objective is to design a ($e.g.$, manufacturing) network that affords flexibility and adaptivity. The underlying combinatorial nature and stochastic objectives make flexibility design problems challenging for standard optimization methods. In this paper, we develop a reinforcement learning (RL) framework for flexibility design problems. Specifically, we carefully design mechanisms with noisy exploration and variance reduction to ensure empirical success and show the unique advantage of RL in terms of fast-adaptation. Empirical results show that the RL-based method consistently finds better solutions compared to classical heuristics.	翻訳日:2021-04-13 07:21:53 公開日:2021-01-18
# Rough Set AlgebraとCoreular Double Stone Algebraについての一考察 A Note on Rough Set Algebra and Core Regular Double Stone Algebras ( http://arxiv.org/abs/2101.02313v2 ) ライセンス: Link先を確認	Daniel J. Clouse	(参考訳) 近似空間 $\langle u,\theta \rangle$ が与えられたとき、$e$ が$\theta$ の同値類のインデックス集合であると仮定し、$r_\theta$ を通常の二重石代数として $\langle\underline{x},\overline{x}\rangle$ という形の粗集合の集合と、i. dunstch がカトリナック代数と呼ぶものと仮定する。 [7],[8] が [1] で与えられる証明から別の証明を与える:$\|\theta_u\| > 1\ \forall\ u \in U$ ならば、$R_\theta$ は核正則な二重ストーン代数である。さらに、$C_3$ は 3 つの元鎖をコア正則ダブルストーン代数とし、$TP_U$ は集合 $U$ 上の三次分割の集合を表す。 R_\theta$ with $\|\theta_u\| > 1\ \forall\ u \in U$ to be isomorphic to $TP_E$ and $C_3^E$, with $E$ is a indexing set for $\theta$, and the three CRDSA's are complete and atomic。これはアプリケーション内の特定の$r_\theta$を扱うときに非常に便利だと思います。 r_\theta$をそれぞれ$tp_u$、$c_3^u$、$\phi\circ \alpha_r:r_\theta\hookrightarrow tp_u\hookrightarrow c_3^u$に組み込む方法を明確に示します。定理 3 と [7] の補題 2.4 を踏襲すると、$c_3^j \cong r_\theta$ for $\langle u,\theta \rangle$ $u = j \times \{0,1\}$, $\theta = \{(j0),(j1)\} : j \in j\}$ で与えられる近似空間が示され、すべての crdsa は主粗集合代数の部分代数 $r_\theta$ に同型である。最後に、 [1] から例を拡張することで、これと主定理を実証する。さらに、一般に$TP_U$ および $C_3^U$ の部分代数についてもう少し知ることができ、これは任意の同値関係の同値類に対するインデックス集合である$E$ に対して、$\|\theta_u\| > 1\ \forall\ u \in U$ に対して存在しなければならない。 Given an approximation space $\langle U,\theta \rangle$, assume that $E$ is the indexing set for the equivalence classes of $\theta$ and let $R_\theta$ denote the collection of rough sets of the form $\langle\underline{X},\overline{X}\rangle$ as a regular double Stone algebra and what I. Dunstch referred to as a Katrinak algebra.[7],[8] We give an alternate proof from the one given in [1] of the fact that if $\|\theta_u\| > 1\ \forall\ u \in U$ then $R_\theta$ is a core regular double Stone algebra. Further let $C_3$ denote the 3 element chain as a core regular double Stone algebra and $TP_U$ denote the collection of ternary partitions over the set $U$. In our Main Theorem we show $R_\theta$ with $\|\theta_u\| > 1\ \forall\ u \in U$ to be isomorphic to $TP_E$ and $C_3^E$, with $E$ is an indexing set for $\theta$, and that the three CRDSA's are complete and atomic. We feel this could be very useful when dealing with a specific $R_\theta$ in an application. In our Main Corollary we show explicitly how we can embed such $R_\theta$ in $TP_U$, $C_3^U$, respectively, $\phi\circ \alpha_r:R_\theta\hookrightarrow TP_U\hookrightarrow C_3^U$, and hence identify it with its specific images. Following in the footsteps of Theorem 3. and Corollary 2.4 of [7], we show $C_3^J \cong R_\theta$ for $\langle U,\theta \rangle$ the approximation space given by $U = J \times \{0,1\}$, $\theta = \{(j0),(j1)\} : j \in J\}$ and every CRDSA is isomorphic to a subalgebra of a principal rough set algebra, $R_\theta$, for some approximation space $\langle U,\theta \rangle$. Finally, we demonstrate this and our Main Theorem by expanding an example from [1]. Further, we know a little more about the subalgebras of $TP_U$ and $C_3^U$ in general as they must exist for every $E$ that is an indexing set for the equivalence classes of any equivalence relation $\theta$ on $U$ satisfying $\|\theta_u\| > 1\ \forall\ u \in U$.	翻訳日:2021-04-10 13:29:02 公開日:2021-01-18
# 良い生徒が大きな宝くじを弾く Good Students Play Big Lottery Better ( http://arxiv.org/abs/2101.03255v2 ) ライセンス: Link先を確認	Haoyu Ma, Tianlong Chen, Ting-Kuei Hu, Chenyu You, Xiaohui Xie, Zhangyang Wang	(参考訳) 宝くじの仮説は、高密度ニューラルネットワークは、(同じ)ランダム初期化から訓練されたとき、元の高密度ネットのテスト精度に一致するスパースサブネットワークを含んでいることを示唆している。しかし、この仮説はResNet-50のようなより大きなネットワークに一般化できなかった。近年の研究では、ランダム初期化ではなく、初期モデルの訓練重量や学習率から再学習する巻き戻し技術を用いてスパースサブネットワークが得られることが示されている。 rewindingは宝くじをスケールアップする唯一の方法か、あるいは最良の方法か? 本稿では,KDチケット(Knowledge Distillation ticket)と呼ばれるサブネットワークの再学習手法を提案する。 rewindingは、大規模ネットワークでの抽選チケットを改善するために、初期のトレーニングフェーズから知識を継承する価値を利用する。対照的に、KDチケットは相補的な可能性に対処し、密集モデルの後期トレーニングフェーズから有用な知識を継承する。トレーニングされた高密度モデルによって生成されたソフトラベルを活用して、ハードラベルの代わりにサブネットワークをトレーニングする。 CIFAR-10とImageNetデータセット上の複数の大きなディープネットワーク(ResNet-50やResNet-110など)を使用して大規模な実験を行う。ベルやホイッスルがなければ、kdチケットはリワインディングと同等かそれ以上の性能を発揮するが、ハイパーパラメータやアドホックな選択がほとんどない。 KDチケットはさらに巻き戻しと共に適用でき、大規模宝くじの最先端結果が得られる。 Lottery ticket hypothesis suggests that a dense neural network contains a sparse sub-network that can match the test accuracy of the original dense net when trained in isolation from (the same) random initialization. However, the hypothesis failed to generalize to larger dense networks such as ResNet-50. As a remedy, recent studies demonstrate that a sparse sub-network can still be obtained by using a rewinding technique, which is to re-train it from early-phase training weights or learning rates of the dense model, rather than from random initialization. Is rewinding the only or the best way to scale up lottery tickets? This paper proposes a new, simpler and yet powerful technique for re-training the sub-network, called "Knowledge Distillation ticket" (KD ticket). Rewinding exploits the value of inheriting knowledge from the early training phase to improve lottery tickets in large networks. In comparison, KD ticket addresses a complementary possibility - inheriting useful knowledge from the late training phase of the dense model. It is achieved by leveraging the soft labels generated by the trained dense model to re-train the sub-network, instead of the hard labels. Extensive experiments are conducted using several large deep networks (e.g ResNet-50 and ResNet-110) on CIFAR-10 and ImageNet datasets. Without bells and whistles, when applied by itself, KD ticket performs on par or better than rewinding, while being nearly free of hyperparameters or ad-hoc selection. KD ticket can be further applied together with rewinding, yielding state-of-the-art results for large-scale lottery tickets.	翻訳日:2021-04-10 05:11:05 公開日:2021-01-18
# 特徴変換と自己重み付け注意に基づくレゾリューション不変人物reid Resolution-invariant Person ReID Based on Feature Transformation and Self-weighted Attention ( http://arxiv.org/abs/2101.04544v2 ) ライセンス: Link先を確認	Ziyue Zhang, Shuai Jiang, Congzhentao Huang, Richard Yi Da Xu	(参考訳) Person Re-identification (ReID) は、画像やビデオのシーケンスで同一人物と一致することを目的としたコンピュータビジョンタスクである。現在の作品のほとんどは、画像の解像度が同じである設定に焦点を当てている。しかし、この解像度は人物のReIDにおいて重要な要素であり、特にカメラが人物と異なる距離にある場合や、カメラのモデルが異なる場合などである。本稿では,RID特徴変換(RAFT)モジュールと自己重み付きアテンション(SWA)ReIDモジュールを組み合わせた2ストリームネットワークを提案する。 RAFTは低解像度特徴を対応する高解像度特徴に変換する。 SWAは、両方の特徴を評価して、ReIDの重み付けを行う。どちらのモジュールも解像度不変表現を得るために共同で訓練されている。 5つのベンチマークデータセットの大規模な実験により,本手法の有効性が示された。例えば、caviar と mlr-cuhk03 における rank-1 の精度は43.3% と 83.2% である。 Person Re-identification (ReID) is a critical computer vision task which aims to match the same person in images or video sequences. Most current works focus on settings where the resolution of images is kept the same. However, the resolution is a crucial factor in person ReID, especially when the cameras are at different distances from the person or the camera's models are different from each other. In this paper, we propose a novel two-stream network with a lightweight resolution association ReID feature transformation (RAFT) module and a self-weighted attention (SWA) ReID module to evaluate features under different resolutions. RAFT transforms the low resolution features to corresponding high resolution features. SWA evaluates both features to get weight factors for the person ReID. Both modules are jointly trained to get a resolution-invariant representation. Extensive experiments on five benchmark datasets show the effectiveness of our method. For instance, we achieve Rank-1 accuracy of 43.3% and 83.2% on CAVIAR and MLR-CUHK03, outperforming the state-of-the-art.	翻訳日:2021-04-04 01:43:41 公開日:2021-01-18
# (参考訳) ランダムシャドウとハイライト: 極端照明条件のための新しいデータ拡張法 Random Shadows and Highlights: A new data augmentation method for extreme lighting conditions ( http://arxiv.org/abs/2101.05361v2 ) ライセンス: CC BY 4.0	Osama Mazhar and Jens Kober	(参考訳) 本稿では,光の摂動に対するロバスト性を得るために,新しいデータ拡張手法であるランダムシャドウとハイライト(RSH)を提案する。提案手法はランダムな影と画像のハイライトを生成するため,学習過程においてニューラルネットワークに挑戦し,現実世界のアプリケーションにおける入力汚職に対する免疫を得る。これはパラメータ学習自由手法であり、ほとんどの視覚関連学習アプリケーションに統合することができる。広汎な実験により、RSHは照明摂動に対するモデルの堅牢性を高めるだけでなく、過度な適合性を著しく低減することを示した。したがって、RSHはすべての視覚関連学習システムに不可欠であると考えられるべきである。コードはhttps://github.com/osamamazhar/random-shadows-highlights。 In this paper, we propose a new data augmentation method, Random Shadows and Highlights (RSH) to acquire robustness against lighting perturbations. Our method creates random shadows and highlights on images, thus challenging the neural network during the learning process such that it acquires immunity against such input corruptions in real world applications. It is a parameter-learning free method which can be integrated into most vision related learning applications effortlessly. With extensive experimentation, we demonstrate that RSH not only increases the robustness of the models against lighting perturbations, but also reduces over-fitting significantly. Thus RSH should be considered essential for all vision related learning systems. Code is available at: https://github.com/OsamaMazhar/Random-Shadows-Highlights.	翻訳日:2021-03-30 08:34:55 公開日:2021-01-18
# (参考訳) トランスフォーマーを用いた新型コロナウイルス偽ニュース検出のための言語モデル微調整法 Transformer-based Language Model Fine-tuning Methods for COVID-19 Fake News Detection ( http://arxiv.org/abs/2101.05509v2 ) ライセンス: CC BY 4.0	Ben Chen, Bin Chen, Dehong Gao, Qijin Chen, Chengfu Huo, Xiaonan Meng, Weijun Ren, Yang Zhou	(参考訳) 新型コロナウイルス(COVID-19)のパンデミックで、関連する偽ニュースがソーシャルメディア全体に広まっている。差別なく彼らを信じることは、人々の生活に大きなトラブルを引き起こす可能性がある。しかし、このような偽ニュースの検出には、大規模な注釈付きデータやドメイン固有の知識の十分なセマンティック理解が欠如しているため、普遍言語モデルは弱い。対応するコーパスで訓練されたモデルは、不十分な学習にも適している。本稿では,これら偽ニュース検出のためのトランスフォーマーに基づく言語モデル微調整手法を提案する。まず、個々のモデルのトークン語彙を専門用語の実際の意味論のために拡張する。第2に,短文の曖昧さから偽ニュースによく見られるハードマイニングサンプルを区別するために,加熱したソフトマックス損失を適用した。そして、モデルの堅牢性を改善するために、敵の訓練を行う。最後に、普遍言語モデルRoBERTaとドメイン固有モデルCT-BERTによって抽出された予測特徴を、複数の層認識によって融合させ、微細で高レベルな特定の表現を統合する。既存のCOVID-19フェイクニュースデータセットで評価された定量的な実験結果は、様々な評価指標の最先端手法と比較して優れた性能を示した。さらに、ベストウェイト平均F1スコアは99.02%に達する。 With the pandemic of COVID-19, relevant fake news is spreading all over the sky throughout the social media. Believing in them without discrimination can cause great trouble to people's life. However, universal language models may perform weakly in these fake news detection for lack of large-scale annotated data and sufficient semantic understanding of domain-specific knowledge. While the model trained on corresponding corpora is also mediocre for insufficient learning. In this paper, we propose a novel transformer-based language model fine-tuning approach for these fake news detection. First, the token vocabulary of individual model is expanded for the actual semantics of professional phrases. Second, we adapt the heated-up softmax loss to distinguish the hard-mining samples, which are common for fake news because of the disambiguation of short text. Then, we involve adversarial training to improve the model's robustness. Last, the predicted features extracted by universal language model RoBERTa and domain-specific model CT-BERT are fused by one multiple layer perception to integrate fine-grained and high-level specific representations. Quantitative experimental results evaluated on existing COVID-19 fake news dataset show its superior performances compared to the state-of-the-art methods among various evaluation metrics. Furthermore, the best weighted average F1 score achieves 99.02%.	翻訳日:2021-03-29 06:52:29 公開日:2021-01-18
# (参考訳) 大規模言語モデルにおける持続的反ムスリムバイアス Persistent Anti-Muslim Bias in Large Language Models ( http://arxiv.org/abs/2101.05783v2 ) ライセンス: CC BY 4.0	Abubakar Abid, Maheen Farooqi, James Zou	(参考訳) 大規模言語モデルは望ましくない社会的バイアスを捉えていることが観察されている。人種や性別に関連するが、宗教的な偏見は比較的探究されていない。我々は、現在最先端の文脈言語モデルであるGPT-3が、永続的なムスリム-暴力バイアスを捉えていることを実証した。我々は, GPT-3を, 即時完成, 類推, 物語生成など様々な方法で探索し, この反ムスリムバイアスを理解するとともに, モデルが異なる用途で一貫して, 創造的に現れること, 他宗教集団のバイアスと比較しても深刻であることを実証した。例えば、"イスラム教徒"はテストケースの23%で"テロリスト"に、"ユダヤ人"はテストケースの5%で"お金"にマッピングされます。敵対的なテキストプロンプトでこのバイアスを克服するために必要なポジティブな注意を定量化し、最もポジティブな6つの形容詞の使用は「ムスリム」の暴力的な完成度を66%から20%に減少させるが、他の宗教グループよりは依然として高い。 It has been observed that large-scale language models capture undesirable societal biases, e.g. relating to race and gender; yet religious bias has been relatively unexplored. We demonstrate that GPT-3, a state-of-the-art contextual language model, captures persistent Muslim-violence bias. We probe GPT-3 in various ways, including prompt completion, analogical reasoning, and story generation, to understand this anti-Muslim bias, demonstrating that it appears consistently and creatively in different uses of the model and that it is severe even compared to biases about other religious groups. For instance, "Muslim" is analogized to "terrorist" in 23% of test cases, while "Jewish" is mapped to "money" in 5% of test cases. We quantify the positive distraction needed to overcome this bias with adversarial text prompts, and find that use of the most positive 6 adjectives reduces violent completions for "Muslims" from 66% to 20%, but which is still higher than for other religious groups.	翻訳日:2021-03-29 03:51:47 公開日:2021-01-18
# 異方性ガウス混合モデルにおける最適クラスタリング Optimal Clustering in Anisotropic Gaussian Mixture Models ( http://arxiv.org/abs/2101.05402v2 ) ライセンス: Link先を確認	Xin Chen, Anderson Y. Zhang	(参考訳) 異方性ガウス混合モデルでは、異なるクラスタからの共分散行列が未知であり、必ずしも同一行列であるとは限らない。本稿では,クラスタ中心と共分散行列に対する信号対雑音比の依存性を特徴付け,クラスタリング問題に対するミニマックス下界を求める。さらに,計算可能な手順を提案し,数回の反復で最適値が得られることを示す。提案手法はハードem型アルゴリズムであり、異方性共分散行列に調整されたロイドのアルゴリズムの変種と見なすこともできる。 We study the clustering task under anisotropic Gaussian Mixture Models where the covariance matrices from different clusters are unknown and are not necessarily the identical matrix. We characterize the dependence of signal-to-noise ratios on the cluster centers and covariance matrices and obtain the minimax lower bound for the clustering problem. In addition, we propose a computationally feasible procedure and prove it achieves the optimal rate within a few iterations. The proposed procedure is a hard EM type algorithm, and it can also be seen as a variant of the Lloyd's algorithm that is adjusted to the anisotropic covariance matrices.	翻訳日:2021-03-29 00:55:14 公開日:2021-01-18
# (参考訳) プレイヤーとAIのインタラクション: ニューラルネットワークゲームがAIをプレイする理由 Player-AI Interaction: What Neural Network Games Reveal About AI as Play ( http://arxiv.org/abs/2101.06220v2 ) ライセンス: CC BY 4.0	Jichen Zhu, Jennifer Villareale, Nithesh Javvaji, Sebastian Risi, Mathias L\"owe, Rush Weigelt, Casper Harteveld	(参考訳) 人工知能(AI)と機械学習(ML)の出現は、HCI研究の最前線に人間とAIの相互作用をもたらす。本稿では,人間がAIとどのように相互作用するかを学習し,実験する上で,ゲームは理想的な領域であると論じる。ニューラルネットワークゲーム(n = 38)のシステマティックサーベイを通じて、これらのゲームにおける支配的な相互作用メタファーとAIインタラクションパターンを特定した。さらに,既存の人間-AIインタラクションガイドラインを適用し,AIシステムにおけるプレイヤー-AIインタラクションをさらに強調した。私たちの中核的な発見は、AIが現在の人間とAIの相互作用の概念を拡大できるということです。特に、ゲームとUXデザイナは、人間のAIインタラクションの学習曲線を構造化するためのフローを考慮し、発見に基づく学習を取り入れてAIと遊んだり、結果を観察し、新たなタイプのAIインタラクションを探索するための遊びの招待を与えるべきだ、と提案しています。 The advent of artificial intelligence (AI) and machine learning (ML) bring human-AI interaction to the forefront of HCI research. This paper argues that games are an ideal domain for studying and experimenting with how humans interact with AI. Through a systematic survey of neural network games (n = 38), we identified the dominant interaction metaphors and AI interaction patterns in these games. In addition, we applied existing human-AI interaction guidelines to further shed light on player-AI interaction in the context of AI-infused systems. Our core finding is that AI as play can expand current notions of human-AI interaction, which are predominantly productivity-based. In particular, our work suggests that game and UX designers should consider flow to structure the learning curve of human-AI interaction, incorporate discovery-based learning to play around with the AI and observe the consequences, and offer users an invitation to play to explore new forms of human-AI interaction.	翻訳日:2021-03-28 14:04:57 公開日:2021-01-18
# airbnbのデータを使った nowcasting gentrification Nowcasting Gentrification Using Airbnb Data ( http://arxiv.org/abs/2101.05924v2 ) ライセンス: Link先を確認	Shomik Jain, Davide Proserpio, Giovanni Quattrone, Daniele Quercia	(参考訳) 一部の都市では、ゲントリファイターが抗議活動や攻撃の対象となっていると推定されているが、他の都市では新しい仕事や税金のジェネレーターとして歓迎されている。国勢調査データは10年ごとに更新されるため、リアルタイムに近所の変化を測定することができない。この研究によると、Airbnbのデータは近所の変化の定量化と追跡に利用できる。具体的には、両方の構造化データ(例)を考える。リスト数、レビュー数、一覧情報)、構造化されていないデータ(例) 自然言語処理と機械学習アルゴリズムで処理されたユーザー生成レビュー) ニューヨーク(アメリカ)、ロサンゼルス(アメリカ)、グレーター・ロンドン(イギリス)の3つの主要都市で作成されている。 Airbnbのデータ(特に非構造的な部分)は、住宅価格と人口統計の変化として測定された近隣のジェントリフィケーションを予測しているようだ。全体として,オンラインプラットフォームからのユーザ生成データを用いて,より粒度の低い従来の尺度を補完する社会経済指標を作成できることが示唆された。 There is a rumbling debate over the impact of gentrification: presumed gentrifiers have been the target of protests and attacks in some cities, while they have been welcome as generators of new jobs and taxes in others. Census data fails to measure neighborhood change in real-time since it is usually updated every ten years. This work shows that Airbnb data can be used to quantify and track neighborhood changes. Specifically, we consider both structured data (e.g. number of listings, number of reviews, listing information) and unstructured data (e.g. user-generated reviews processed with natural language processing and machine learning algorithms) for three major cities, New York City (US), Los Angeles (US), and Greater London (UK). We find that Airbnb data (especially its unstructured part) appears to nowcast neighborhood gentrification, measured as changes in housing affordability and demographics. Overall, our results suggest that user-generated data from online platforms can be used to create socioeconomic indices to complement traditional measures that are less granular, not in real-time, and more costly to obtain.	翻訳日:2021-03-28 11:14:04 公開日:2021-01-18
# STENCIL-NET:偏微分方程式のデータ駆動型解適応離散化 STENCIL-NET: Data-driven solution-adaptive discretization of partial differential equations ( http://arxiv.org/abs/2101.06182v2 ) ライセンス: Link先を確認	Suryanarayana Maddu, Dominik Sturm, Bevan L. Cheeseman, Christian L. M\"uller, Ivo F. Sbalzarini	(参考訳) 偏微分方程式(PDE)を近似的に解く数値解法は、科学計算の核である。しばしば、これは高分解能または適応的な離散化格子を必要とし、例えば乱流、燃焼、衝撃伝播などの応用において、PDE溶液中の関連する時空間的特徴を捉える。数値近似はまた、問題固有の離散化を構築するためにPDEを理解する必要がある。しかし、そのような解適応離散作用素を体系的に導出することは現在の課題である。本稿では,非線形pdesの解法と分解能固有の局所的離散化をデータ駆動学習する人工ニューラルネットワークであるstencil-netを提案する。 stencil-net は正規直交格子上の空間的および時間的適応的パラメトリックプーリングと離散時間積分に関する知識を取り入れることで、未知の非線形 pde における作用素の数値的安定な離散化を実現する。解データがネットワークをトレーニングし、個別の演算子を学習するのに十分なので、実際のPDEを知る必要はない。一度トレーニングされたSTENCIL-NETモデルは、より大きな空間領域におけるPDEの解を、トレーニングされた時間よりも長い時間予測するために使用することができ、従ってデータからのPDE制約外挿の問題に対処することができる。この主張を支持するために、粗い時空間格子上のカオスPDE解の長期予測に関する数値実験を行った。また,線形数値法を方程式のないSTENCIL-NET予測に置き換えることで,精度を損なうことなく高速化する。 Numerical methods for approximately solving partial differential equations (PDE) are at the core of scientific computing. Often, this requires high-resolution or adaptive discretization grids to capture relevant spatio-temporal features in the PDE solution, e.g., in applications like turbulence, combustion, and shock propagation. Numerical approximation also requires knowing the PDE in order to construct problem-specific discretizations. Systematically deriving such solution-adaptive discrete operators, however, is a current challenge. Here we present STENCIL-NET, an artificial neural network architecture for data-driven learning of problem- and resolution-specific local discretizations of nonlinear PDEs. STENCIL-NET achieves numerically stable discretization of the operators in an unknown nonlinear PDE by spatially and temporally adaptive parametric pooling on regular Cartesian grids, and by incorporating knowledge about discrete time integration. Knowing the actual PDE is not necessary, as solution data is sufficient to train the network to learn the discrete operators. A once-trained STENCIL-NET model can be used to predict solutions of the PDE on larger spatial domains and for longer times than it was trained for, hence addressing the problem of PDE-constrained extrapolation from data. To support this claim, we present numerical experiments on long-term forecasting of chaotic PDE solutions on coarse spatio-temporal grids. We also quantify the speed-up achieved by substituting base-line numerical methods with equation-free STENCIL-NET predictions on coarser grids with little compromise on accuracy.	翻訳日:2021-03-28 11:10:59 公開日:2021-01-18
# (参考訳) ExpFinder:$N$-gramベクトル空間モデルと$\mu$CO-HITSを統合するアンサンブルエキスパート発見モデル ExpFinder: An Ensemble Expert Finding Model Integrating $N$-gram Vector Space Model and $\mu$CO-HITS ( http://arxiv.org/abs/2101.06821v1 ) ライセンス: CC BY 4.0	Yong-Bin Kang, Hung Du, Abdur Rahim Mohammad Forkan, Prem Prakash Jayaraman, Amir Aryani, Timos Sellis (Fellow, IEEE)	(参考訳) 専門家を見つけることは、コラボレーションを成功させ、高品質の研究開発とイノベーションをスピードアップする上で重要な役割を担います。しかし、科学出版物やデジタル専門データの急速な成長により、適切な専門家を特定することが困難な問題となっている。あるトピックに与えられた専門家を見つける既存のアプローチは、ベクトル空間モデル、文書言語モデル、グラフベースモデルに基づく情報検索技術に分類することができる。本稿では、専門家探しのための新しいアンサンブルモデルである$\textit{expfinder}$を提案する。これは、新しい$n$-gramベクトル空間モデル($n$vsmと表記される)と、$\textit{$\mu$co-hits}$と表記されるグラフベースモデルとを統合したものである。 n$vsm の鍵は、n$-gram ワードと $\textit{expfinder}$ に対する最近の逆文書の頻度重み付け手法を、専門家を見つけるために$n$vsm を$\textit{$\mu$co-hits}$ に組み込むことである。学術分野の4つの異なるデータセットに対して,6つの専門家発見モデルと比較して,$\textit{expfinder}$を総合的に評価する。評価の結果、$\textit{expfinder}$は専門家の発見に非常に効果的なモデルであり、19%から160.2%で比較した全てのモデルを大きく上回っている。 Finding an expert plays a crucial role in driving successful collaborations and speeding up high-quality research development and innovations. However, the rapid growth of scientific publications and digital expertise data makes identifying the right experts a challenging problem. Existing approaches for finding experts given a topic can be categorised into information retrieval techniques based on vector space models, document language models, and graph-based models. In this paper, we propose $\textit{ExpFinder}$, a new ensemble model for expert finding, that integrates a novel $N$-gram vector space model, denoted as $n$VSM, and a graph-based model, denoted as $\textit{$\mu$CO-HITS}$, that is a proposed variation of the CO-HITS algorithm. The key of $n$VSM is to exploit recent inverse document frequency weighting method for $N$-gram words and $\textit{ExpFinder}$ incorporates $n$VSM into $\textit{$\mu$CO-HITS}$ to achieve expert finding. We comprehensively evaluate $\textit{ExpFinder}$ on four different datasets from the academic domains in comparison with six different expert finding models. The evaluation results show that $\textit{ExpFinder}$ is a highly effective model for expert finding, substantially outperforming all the compared models in 19% to 160.2%.	翻訳日:2021-03-27 19:28:28 公開日:2021-01-18
# (参考訳) ZeRO-Offload: 数十億ドル規模のモデルトレーニングを民主化 ZeRO-Offload: Democratizing Billion-Scale Model Training ( http://arxiv.org/abs/2101.06840v1 ) ライセンス: CC BY 4.0	Jie Ren, Samyam Rajbhandari, Reza Yazdani Aminabadi, Olatunji Ruwase, Shuangyan Yang, Minjia Zhang, Dong Li, Yuxiong He	(参考訳) 大規模モデルのトレーニングは、複雑なモデルのリファクタリングと、非常に高価なgpuクラスタへのアクセスを必要とするごく少数の理由の1つだ。 ZeRO-Offloadは、大きめのモデルトレーニング環境を、ほぼすべての人が利用できるようにすることで変更する。単一のGPU上で13億以上のパラメータを持つモデルをトレーニングすることが可能で、PyTorchのような一般的なフレームワークと比較して10倍の規模で、データサイエンティストからモデル変更を必要とせず、計算効率を犠牲にする必要がない。 ZeRO-Offloadはデータと計算をCPUにオフロードすることで、大規模なモデルトレーニングを可能にする。計算効率を維持するため、GPUへのデータ移動を最小化し、GPU上のメモリ節約を最大化しながらCPU計算時間を短縮するように設計されている。その結果、ZeRO-Offloadは、1つのNVIDIA V100 GPUで10Bパラメータモデルで40 TFlops/GPUを達成することができ、PyTorch単独で1.4Bパラメータモデルで30TFを使用するのに対して、メモリを使い果たさずにトレーニングできる最大である。 ZeRO-Offloadはまた、利用可能な場合、複数のGPUでスケールするように設計されており、最大128GPUでほぼ線形スピードアップを提供する。さらに、1つのDGX-2ボックスに700億以上のパラメータを持つモデルをトレーニングするために、モデルの並列性と連携することができる。 ZeRO-Offloadは計算とメモリ効率と使いやすさを組み合わせることで、大規模なモデルトレーニングを民主化し、単一のGPUにアクセス可能なデータサイエンティストにもアクセスできるようにする。 Large-scale model training has been a playing ground for a limited few requiring complex model refactoring and access to prohibitively expensive GPU clusters. ZeRO-Offload changes the large model training landscape by making large model training accessible to nearly everyone. It can train models with over 13 billion parameters on a single GPU, a 10x increase in size compared to popular framework such as PyTorch, and it does so without requiring any model change from the data scientists or sacrificing computational efficiency. ZeRO-Offload enables large model training by offloading data and compute to CPU. To preserve compute efficiency, it is designed to minimize the data movement to/from GPU, and reduce CPU compute time while maximizing memory savings on GPU. As a result, ZeRO-Offload can achieve 40 TFlops/GPU on a single NVIDIA V100 GPU for 10B parameter model compared to 30TF using PyTorch alone for a 1.4B parameter model, the largest that can be trained without running out of memory. ZeRO-Offload is also designed to scale on multiple-GPUs when available, offering near linear speedup on up to 128 GPUs. Additionally, it can work together with model parallelism to train models with over 70 billion parameters on a single DGX-2 box, a 4.5x increase in model size compared to using model parallelism alone. By combining compute and memory efficiency with ease-of-use, ZeRO-Offload democratizes large-scale model training making it accessible to even data scientists with access to just a single GPU.	翻訳日:2021-03-27 18:42:28 公開日:2021-01-18
# (参考訳) 自動走行のための建設ゾーンの時空間分割のための非パラメトリックメモリ Non-parametric Memory for Spatio-Temporal Segmentation of Construction Zones for Self-Driving ( http://arxiv.org/abs/2101.06865v1 ) ライセンス: CC BY 4.0	Min Bai, Shenlong Wang, Kelvin Wong, Ersin Yumer, Raquel Urtasun	(参考訳) 本稿では,自律走行車(AV)周囲の局所的空間と時間を把握する時空間分割のための非パラメトリックメモリ表現を提案する。我々の表現には3つの重要な特性がある: (i) 過去に見たことを思い出す; (ii) 補強する; (iii) 新しい証拠に基づいて過去の信念を忘れる。補強は、例えば、その要素が強く隠蔽されているか、範囲内であるような、不確実であるかもしれない要素を初めて見るときに重要である。偽陽性がなければ、自動運転車が不規則に振る舞うことになるため、忘れることも望ましい。我々のプロセスは3D推論によって知らされ、隠蔽は忘れたい欲求と忘れたい欲求を区別する鍵となる。提案手法は,hdマップなどの静的世界表現を補完するオンラインコンポーネントとして,このようなイベントによる静的ビュー上に重畳すべき変更を検出・記憶することにより,どのように利用することができるかを示す。 In this paper, we introduce a non-parametric memory representation for spatio-temporal segmentation that captures the local space and time around an autonomous vehicle (AV). Our representation has three important properties: (i) it remembers what it has seen in the past, (ii) it reinforces and (iii) forgets its past beliefs based on new evidence. Reinforcing is important as the first time we see an element we might be uncertain, e.g, if the element is heavily occluded or at range. Forgetting is desirable, as otherwise false positives will make the self driving vehicle behave erratically. Our process is informed by 3D reasoning, as occlusion is key to distinguishing between the desire to forget and to remember. We show how our method can be used as an online component to complement static world representations such as HD maps by detecting and remembering changes that should be superimposed on top of this static view due to such events.	翻訳日:2021-03-27 17:54:05 公開日:2021-01-18
# (参考訳) マルチエージェント強化学習のための協調バイアスと競争バイアス Cooperative and Competitive Biases for Multi-Agent Reinforcement Learning ( http://arxiv.org/abs/2101.06890v1 ) ライセンス: CC BY 4.0	Heechang Ryu, Hayong Shin, Jinkyoo Park	(参考訳) マルチエージェント強化学習(MARL)アルゴリズムの訓練は、エージェント間の複雑な相互作用と確率的・動的環境との相互作用に強く依存するため、シングルエージェント強化学習アルゴリズムの訓練よりも難しい。本稿では,他のエージェントの偏りのある行動情報を用いたMARL訓練を促進するアルゴリズムを提案する。協調的で競争的な環境には、一般的に2つのエージェント(協調エージェントと競争エージェント)がある。提案アルゴリズムでは,各エージェントがそれぞれのアクションと2つのグループの他のエージェントのバイアス作用情報を用いて値関数を更新する。協調エージェントのバイアス付き共同動作は、すべての協調エージェントが共同してターゲットエージェントの価値関数を最大化することにより、実際の共同動作と想像上の共同動作の合計として計算される。競合剤のバイアス付き共同作用も同様に計算できる。各エージェントはバイアス付きアクション情報を使用して自身の値関数を更新し、バイアス付き値関数と対応するバイアス付きポリシを生成する。その後、各エージェントのバイアスドポリシーは必然的に、他のエージェントと協力し、競合するアクションを推奨し、エージェント間のより活発な相互作用を導入し、MARLポリシー学習を強化する。提案アルゴリズムは,様々な混合協調競合環境において,既存のアルゴリズムよりも優れていることを示す。さらに、訓練が進むにつれて導入されるバイアスは徐々に減少し、虚偽の仮定に基づく補正がなくなる。 Training a multi-agent reinforcement learning (MARL) algorithm is more challenging than training a single-agent reinforcement learning algorithm, because the result of a multi-agent task strongly depends on the complex interactions among agents and their interactions with a stochastic and dynamic environment. We propose an algorithm that boosts MARL training using the biased action information of other agents based on a friend-or-foe concept. For a cooperative and competitive environment, there are generally two groups of agents: cooperative-agents and competitive-agents. In the proposed algorithm, each agent updates its value function using its own action and the biased action information of other agents in the two groups. The biased joint action of cooperative agents is computed as the sum of their actual joint action and the imaginary cooperative joint action, by assuming all the cooperative agents jointly maximize the target agent's value function. The biased joint action of competitive agents can be computed similarly. Each agent then updates its own value function using the biased action information, resulting in a biased value function and corresponding biased policy. Subsequently, the biased policy of each agent is inevitably subjected to recommend an action to cooperate and compete with other agents, thereby introducing more active interactions among agents and enhancing the MARL policy learning. We empirically demonstrate that our algorithm outperforms existing algorithms in various mixed cooperative-competitive environments. Furthermore, the introduced biases gradually decrease as the training proceeds and the correction based on the imaginary assumption vanishes.	翻訳日:2021-03-27 17:18:07 公開日:2021-01-18
# (参考訳) DeepPayload: ニューラルネットワークによるディープラーニングモデルに対するブラックボックスバックドア攻撃 DeepPayload: Black-box Backdoor Attack on Deep Learning Models through Neural Payload Injection ( http://arxiv.org/abs/2101.06896v1 ) ライセンス: CC BY 4.0	Yuanchun Li, Jiayi Hua, Haoyu Wang, Chunyang Chen, Yunxin Liu	(参考訳) ディープラーニングモデルは、モバイルアプリケーションにおいて重要なコンポーネントとしてますます利用されている。脆弱性や脅威が広く指摘されているプログラムバイトコードとは異なり、アプリケーションにデプロイされるディープラーニングモデルがどのように妥協するかは、ニューラルネットワークが通常ブラックボックスと見なされるため、十分に理解されていない。本稿では,コンパイルされたディープラーニングモデルに対して,一連のリバースエンジニアリング技術を用いて,極めて実用的なバックドア攻撃を提案する。攻撃の中核は、トリガー検出器と複数のオペレータで構築され、悪意のあるペイロードとして犠牲者モデルに注入される神経条件分岐である。この攻撃は、条件論理が攻撃者によって柔軟にカスタマイズできるため効果的であり、元のモデルから事前の知識を必要としないためスケーラブルである。 30ユーザから収集した5つの最先端ディープラーニングモデルと実世界のサンプルを用いて攻撃効果を評価した。その結果、インジェクションされたバックドアは成功率93.5%で起動できるが、2ミリ秒以下の遅延オーバーヘッドと1.4%の精度低下しか得られなかった。さらに,google playから収集した実世界のモバイル深層学習アプリに関する実証研究を行った。私たちの攻撃に対して脆弱な54のアプリを見つけました。結果は、デプロイされたモデルの保護を強化するために、ディープラーニングアプリケーション開発者と監査役の意識を喚起する。 Deep learning models are increasingly used in mobile applications as critical components. Unlike the program bytecode whose vulnerabilities and threats have been widely-discussed, whether and how the deep learning models deployed in the applications can be compromised are not well-understood since neural networks are usually viewed as a black box. In this paper, we introduce a highly practical backdoor attack achieved with a set of reverse-engineering techniques over compiled deep learning models. The core of the attack is a neural conditional branch constructed with a trigger detector and several operators and injected into the victim model as a malicious payload. The attack is effective as the conditional logic can be flexibly customized by the attacker, and scalable as it does not require any prior knowledge from the original model. We evaluated the attack effectiveness using 5 state-of-the-art deep learning models and real-world samples collected from 30 users. The results demonstrated that the injected backdoor can be triggered with a success rate of 93.5%, while only brought less than 2ms latency overhead and no more than 1.4% accuracy decrease. We further conducted an empirical study on real-world mobile deep learning apps collected from Google Play. We found 54 apps that were vulnerable to our attack, including popular and security-critical ones. The results call for the awareness of deep learning application developers and auditors to enhance the protection of deployed models.	翻訳日:2021-03-27 16:58:28 公開日:2021-01-18
# (参考訳) ブロックチェーンによる分散フェデレーション学習(blade-fl)のパフォーマンス分析とリソース割り当て Blockchain Assisted Decentralized Federated Learning (BLADE-FL): Performance Analysis and Resource Allocation ( http://arxiv.org/abs/2101.06905v1 ) ライセンス: CC0 1.0	Jun Li, Yumeng Shao, Kang Wei, Ming Ding, Chuan Ma, Long Shi, Zhu Han, and H. Vincent Poor	(参考訳) 分散機械学習パラダイムであるフェデレートラーニング(FL)は、クライアントが生データをローカルに処理することで、個人のプライバシーを促進する。しかし、モデルアグリゲーションのための集中型サーバに頼ると、標準FLはサーバーの故障、不信なサーバ、外部攻撃に弱い。この問題に対処するために、ブロックチェーンをFL、すなわちブロックチェーン支援型分散連邦学習(BLADE-FL)に統合する分散FLフレームワークを提案する。提案したBLADE-FLのラウンドでは、各クライアントがトレーニングされたモデルを他のクライアントにブロードキャストし、受信したモデルに基づいてブロックを生成し、次のラウンドのローカルトレーニングの前に生成されたブロックからモデルを集約する。本研究では,blade-flの学習性能を評価し,大域的損失関数の上限を開発する。次に、この境界が全ラウンド数Kに対して凸であることを確認し、上限を最小化するための計算資源割り当てを最適化する。また,他人の訓練したモデルを盗聴し,不正行為を偽装する人工ノイズを付加する遅延クライアントが原因で,トレーニング不足が重大な問題となっていることも留意する。そこで本研究では,遅延クライアントがblade-flの学習性能に与える影響を考察し,最適なk,学習パラメータ,遅延クライアントの割合の関係を特徴付ける。 MNIST と Fashion-MNIST のデータセットから,実験結果は解析結果と一致していることを示す。具体的には、開発した上限値と実験値との差が5%以下であり、上限値に基づく最適化kは損失関数を効果的に最小化することができる。 Federated learning (FL), as a distributed machine learning paradigm, promotes personal privacy by clients' processing raw data locally. However, relying on a centralized server for model aggregation, standard FL is vulnerable to server malfunctions, untrustworthy server, and external attacks. To address this issue, we propose a decentralized FL framework by integrating blockchain into FL, namely, blockchain assisted decentralized federated learning (BLADE-FL). In a round of the proposed BLADE-FL, each client broadcasts its trained model to other clients, competes to generate a block based on the received models, and then aggregates the models from the generated block before its local training of the next round. We evaluate the learning performance of BLADE-FL, and develop an upper bound on the global loss function. Then we verify that this bound is convex with respect to the number of overall rounds K, and optimize the computing resource allocation for minimizing the upper bound. We also note that there is a critical problem of training deficiency, caused by lazy clients who plagiarize others' trained models and add artificial noises to disguise their cheating behaviors. Focusing on this problem, we explore the impact of lazy clients on the learning performance of BLADE-FL, and characterize the relationship among the optimal K, the learning parameters, and the proportion of lazy clients. Based on the MNIST and Fashion-MNIST datasets, we show that the experimental results are consistent with the analytical ones. To be specific, the gap between the developed upper bound and experimental results is lower than 5%, and the optimized K based on the upper bound can effectively minimize the loss function.	翻訳日:2021-03-27 16:37:06 公開日:2021-01-18
# (参考訳) 熱-クロスドメインカラー化画像の新しいレジストレーション・カラー化手法 A Novel Registration & Colorization Technique for Thermal to Cross Domain Colorized Images ( http://arxiv.org/abs/2101.06910v1 ) ライセンス: CC BY 4.0	Suranjan Goswami, Satish Kumar Singh	(参考訳) 熱画像は、撮影対象の熱プロファイルに基づいて、グレースケール画像または擬似カラー画像として得ることができる。本論文では,複数のサーマルイメージ装置で撮影された画像に対して,メイクや内部解像度に関係なく動作する新規な登録方法と,光学画像と類似したカラー化感熱画像を得るためのカラー化スキームと,その出力の一部としてサーマルプロファイルの情報を保持し,両ドメインの情報を協調的に提供できる新規な登録方式を提案する。これをクロスドメインカラー化画像と呼ぶ。また,本論文の一部として提示する新しい熱光学対データベースを概説し,複数の熱画像から得られたユニークなデータポイントについて述べる。最後に、この結果と先行文献を比較し、我々の結果がどのように異なるかを示し、この領域でさらに探求できる今後の研究について議論する。 Thermal images can be obtained as either grayscale images or pseudo colored images based on the thermal profile of the object being captured. We present a novel registration method that works on images captured via multiple thermal imagers irrespective of make and internal resolution as well as a colorization scheme that can be used to obtain a colorized thermal image which is similar to an optical image, while retaining the information of the thermal profile as a part of the output, thus providing information of both domains jointly. We call this a cross domain colorized image. We also outline a new public thermal-optical paired database that we are presenting as a part of this paper, containing unique data points obtained via multiple thermal imagers. Finally, we compare the results with prior literature, show how our results are different and discuss on some future work that can be explored further in this domain as well.	翻訳日:2021-03-27 16:14:20 公開日:2021-01-18
# (参考訳) TLU-Net:鉄鋼表面欠陥の自動検出のための深層学習手法 TLU-Net: A Deep Learning Approach for Automatic Steel Surface Defect Detection ( http://arxiv.org/abs/2101.06915v1 ) ライセンス: CC BY 4.0	Praveen Damacharla, Achuth Rao M. V., Jordan Ringenberg, and Ahmad Y Javaid	(参考訳) 視覚的鋼板表面欠陥検出は鋼板製造における必須ステップである。近年,機械学習に基づく自動視覚検査(AVI)手法が研究されている。しかし、ほとんどの製鋼業では、AVI法に関連するトレーニング時間や不正確さのために、手動の視覚検査が使われている。自動鋼の欠陥検出法は、安価でより高速な品質制御とフィードバックに有用である。しかし、セグメンテーションと分類のための注釈付きトレーニングデータを作成するのはコストがかかる。本研究では,鋼表面欠陥検出にTransfer Learning-based U-Net(TLU-Net)フレームワークを提案する。ベースとしてU-Netアーキテクチャを使用し、ResNetとDenseNetの2種類のエンコーダを探索する。これらのネットの性能をランダム初期化とimagenetデータセットを用いてトレーニングした事前学習ネットワークを用いて比較する。実験はSeverstalデータを用いて行われる。その結果, 伝達学習は欠陥分類におけるランダム初期化よりも5%(絶対的)に優れることがわかった。その結果,伝達学習は欠陥分割のランダム初期化よりも26%(相対的)に優れることがわかった。また, 学習データの減少に伴い, 転校学習の利得は増加し, 転校学習による収束率はランダム初期化よりも優れていることがわかった。 Visual steel surface defect detection is an essential step in steel sheet manufacturing. Several machine learning-based automated visual inspection (AVI) methods have been studied in recent years. However, most steel manufacturing industries still use manual visual inspection due to training time and inaccuracies involved with AVI methods. Automatic steel defect detection methods could be useful in less expensive and faster quality control and feedback. But preparing the annotated training data for segmentation and classification could be a costly process. In this work, we propose to use the Transfer Learning-based U-Net (TLU-Net) framework for steel surface defect detection. We use a U-Net architecture as the base and explore two kinds of encoders: ResNet and DenseNet. We compare these nets' performance using random initialization and the pre-trained networks trained using the ImageNet data set. The experiments are performed using Severstal data. The results demonstrate that the transfer learning performs 5% (absolute) better than that of the random initialization in defect classification. We found that the transfer learning performs 26% (relative) better than that of the random initialization in defect segmentation. We also found the gain of transfer learning increases as the training data decreases, and the convergence rate with transfer learning is better than that of the random initialization.	翻訳日:2021-03-27 15:56:42 公開日:2021-01-18
# (参考訳) 分散計画下次アルゴリズムにおけるインサイダー攻撃の検出 Detection of Insider Attacks in Distributed Projected Subgradient Algorithms ( http://arxiv.org/abs/2101.06917v1 ) ライセンス: CC BY 4.0	Sissi Xiaoxiao Wu, Gangqiang Li, Shengli Zhang, and Xiaohui Lin	(参考訳) Gossipベースの分散アルゴリズムは、様々なマルチエージェントアプリケーションの分散最適化問題を解決するために広く使われているが、一般的には、各エージェントが権限のない適切な方向をローカルに見積もっているため、内部悪意のあるエージェントによるデータインジェクション攻撃に対して脆弱である。本研究では、内部攻撃を検出する人工知能(AI)技術の適用について検討する。一般のニューラルネットワークは,収集されたデータに基づく非線形関係を効果的に探索できるため,悪意のあるエージェントの検出とローカライズに特に適している。さらに,協調学習における最先端のアプローチ,すなわち協調型ピアツーピア機械学習プロトコルを採用し,ゴシップ交換によるニューラルネットワークモデルのトレーニングを容易にすることを提案する。この高度なアプローチは、トレーニングデータ不足やミスマッチテストデータといった課題に対して、モデルをより堅牢にすることが期待されます。シミュレーションでは,AI手法の有効性と有効性を検証するために,最小二乗問題を考える。シミュレーションの結果,提案するaiベースの手法は,スコアに基づく手法よりも悪意のあるエージェントの検出とローカライズのパフォーマンス向上に有用であり,ピアツーピアニューラルネットワークモデルは,実際に問題に対して頑健であることが示された。 The gossip-based distributed algorithms are widely used to solve decentralized optimization problems in various multi-agent applications, while they are generally vulnerable to data injection attacks by internal malicious agents as each agent locally estimates its decent direction without an authorized supervision. In this work, we explore the application of artificial intelligence (AI) technologies to detect internal attacks. We show that a general neural network is particularly suitable for detecting and localizing the malicious agents, as they can effectively explore nonlinear relationship underlying the collected data. Moreover, we propose to adopt one of the state-of-art approaches in federated learning, i.e., a collaborative peer-to-peer machine learning protocol, to facilitate training our neural network models by gossip exchanges. This advanced approach is expected to make our model more robust to challenges with insufficient training data, or mismatched test data. In our simulations, a least-squared problem is considered to verify the feasibility and effectiveness of AI-based methods. Simulation results demonstrate that the proposed AI-based methods are beneficial to improve performance of detecting and localizing malicious agents over score-based methods, and the peer-to-peer neural network model is indeed robust to target issues.	翻訳日:2021-03-27 15:50:40 公開日:2021-01-18
# (参考訳) 属性インフォームド摂動によるニューラルネットワーク生成 Generative Counterfactuals for Neural Networks via Attribute-Informed Perturbation ( http://arxiv.org/abs/2101.06930v1 ) ライセンス: CC BY 4.0	Fan Yang, Ninghao Liu, Mengnan Du, Xia Hu	(参考訳) ディープニューラルネットワーク(dnn)の広範な利用により、高リスクシナリオでは説明可能な決定が望ましいため、モデル解釈性が重要な関心事となっている。現在の解釈技術は機能帰属の観点から主に焦点を当てており、特定の説明が予測とどのように関連しているかを示すのに制限がある。この目的のために、ファクトファクト(反事実)と呼ばれる興味深い説明のクラスが開発され、解釈のための「何」の状況をさらに探求し、ブラックボックスモデルにおける推論能力を実現する。しかし, 生データインスタンス(テキストや画像など)に対する偽造物の生成は, 高いデータ次元と非意味な生の特徴に課題があるため, まだ初期段階にある。本稿では,提案するAttribute-Informed Perturbation (AIP)を用いて,生データインスタンスに特化して偽物を生成するフレームワークを設計する。異なる属性を条件とした生成モデルを利用することで、所望のラベルとの反事実を効果的かつ効率的に得ることができる。データ空間のインスタンスを直接変更するのではなく、属性に変換された潜在空間を反復的に最適化します。実世界のテキストや画像に対する実験結果から, 提案したフレームワークの有効性, サンプル品質, および有効性を示し, その他の選択肢よりも優れていることを示す。さらに,本フレームワークに基づく実用的応用例も紹介し,モデルの解釈可能性を超えた可能性を示した。 With the wide use of deep neural networks (DNN), model interpretability has become a critical concern, since explainable decisions are preferred in high-stake scenarios. Current interpretation techniques mainly focus on the feature attribution perspective, which are limited in indicating why and how particular explanations are related to the prediction. To this end, an intriguing class of explanations, named counterfactuals, has been developed to further explore the "what-if" circumstances for interpretation, and enables the reasoning capability on black-box models. However, generating counterfactuals for raw data instances (i.e., text and image) is still in the early stage due to its challenges on high data dimensionality and unsemantic raw features. In this paper, we design a framework to generate counterfactuals specifically for raw data instances with the proposed Attribute-Informed Perturbation (AIP). By utilizing generative models conditioned with different attributes, counterfactuals with desired labels can be obtained effectively and efficiently. Instead of directly modifying instances in the data space, we iteratively optimize the constructed attribute-informed latent space, where features are more robust and semantic. Experimental results on real-world texts and images demonstrate the effectiveness, sample quality as well as efficiency of our designed framework, and show the superiority over other alternatives. Besides, we also introduce some practical applications based on our framework, indicating its potential beyond the model interpretability aspect.	翻訳日:2021-03-27 15:07:10 公開日:2021-01-18
# (参考訳) データ強化と一貫性に基づく半教師付き学習について On Data-Augmentation and Consistency-Based Semi-Supervised Learning ( http://arxiv.org/abs/2101.06967v1 ) ライセンス: CC BY 4.0	Atin Ghosh and Alexandre H. Thiery	(参考訳) 最近提案された一貫性に基づくセミスーパーバイザラーニング(SSL)手法(例えば、$\Pi$-model, temporal ensembling, the mean teacher, or the virtual adversarial training)は、SSLタスクにおける最先端技術である。これらの手法は通常、ラベル付き例をほんの一部使用しながら、完全に監督されたものと同等のパフォーマンスに到達できる。これらの方法論的進歩にもかかわらず、これらの手法の理解はまだ比較的限られている。このテキストでは、分析的に扱いやすい結果が得られる設定において、$\pi$-model の分析(変動)を行う。我々はManifold Tangent Classifiersとのリンクを確立し、摂動の質が適切なSSL性能を得るための鍵であることを実証する。重要なことは、データ拡張スキームを自然に組み込んだHidden Manifold Modelのシンプルな拡張を提案し、SSLメソッドの理解と実験のためのフレームワークを提供する。 Recently proposed consistency-based Semi-Supervised Learning (SSL) methods such as the $\Pi$-model, temporal ensembling, the mean teacher, or the virtual adversarial training, have advanced the state of the art in several SSL tasks. These methods can typically reach performances that are comparable to their fully supervised counterparts while using only a fraction of labelled examples. Despite these methodological advances, the understanding of these methods is still relatively limited. In this text, we analyse (variations of) the $\Pi$-model in settings where analytically tractable results can be obtained. We establish links with Manifold Tangent Classifiers and demonstrate that the quality of the perturbations is key to obtaining reasonable SSL performances. Importantly, we propose a simple extension of the Hidden Manifold Model that naturally incorporates data-augmentation schemes and offers a framework for understanding and experimenting with SSL methods.	翻訳日:2021-03-27 14:23:37 公開日:2021-01-18
# (参考訳) 信号導出と凝集関数を用いた運動画像に基づく脳コンピューターインタフェース Motor-Imagery-Based Brain Computer Interface using Signal Derivation and Aggregation Functions ( http://arxiv.org/abs/2101.06968v1 ) ライセンス: CC BY 4.0	Javier Fumanal-Idocin, Yu-Kai Wang, Chin-Teng Lin, Javier Fern\'andez, Jose Antonio Sanz, Humberto Bustince	(参考訳) 脳コンピュータインタフェース技術は、人間の脳と外部デバイスの間のコミュニケーションの一般的な方法である。 BCIの最も一般的なアプローチの1つは、Motor Imageryである。 BCIの応用において、脳波グラフは非侵襲的な性質のため、脳力学の非常に一般的な測定方法である。 bciの話題には高い関心が寄せられているが、脳波信号におけるパターン認識タスクの実行が困難であるため、既存のシステムの性能はいまだに理想的ではない。 BCIシステムは、信号前処理、特徴抽出、意思決定を行う幅広いコンポーネントで構成されている。本稿では,既存のMIベースのBCIフレームワークを改善するための3つの異なるアイデアを提案する。まず、信号のさらなる前処理ステップ:時間を不変にする脳波信号の微分を含む。第2に,システムの機能として周波数帯域を追加し,システムの性能にその効果を示す。最後に,システムにおける最終決定の方法に関する深い考察を行う。本研究では,最大6種類の異なる分類器と広範囲の集約関数(古典集約,チョケ,スゲノ積分およびそれらの拡張および重なり関数を含む)を用いて,分類器が与える情報を融合する手法を提案する。本システムでは,20名のボランティアのデータセットを用いて,運動画像を用いた脳-コンピュータインタフェース実験を行った。このデータセットでは、新しいシステムは88.80%の精度を達成した。また,最大90,76%を達成できる最適化版も提案する。さらに、ペアのChoquet/Sugeno積分と重なり関数が最良の結果を提供するものであることが分かる。 Brain Computer Interface technologies are popular methods of communication between the human brain and external devices. One of the most popular approaches to BCI is Motor Imagery. In BCI applications, the ElectroEncephaloGraphy is a very popular measurement for brain dynamics because of its non-invasive nature. Although there is a high interest in the BCI topic, the performance of existing systems is still far from ideal, due to the difficulty of performing pattern recognition tasks in EEG signals. BCI systems are composed of a wide range of components that perform signal pre-processing, feature extraction and decision making. In this paper, we define a BCI Framework, named Enhanced Fusion Framework, where we propose three different ideas to improve the existing MI-based BCI frameworks. Firstly, we include aan additional pre-processing step of the signal: a differentiation of the EEG signal that makes it time-invariant. Secondly, we add an additional frequency band as feature for the system and we show its effect on the performance of the system. Finally, we make a profound study of how to make the final decision in the system. We propose the usage of both up to six types of different classifiers and a wide range of aggregation functions (including classical aggregations, Choquet and Sugeno integrals and their extensions and overlap functions) to fuse the information given by the considered classifiers. We have tested this new system on a dataset of 20 volunteers performing motor imagery-based brain-computer interface experiments. On this dataset, the new system achieved a 88.80% of accuracy. We also propose an optimized version of our system that is able to obtain up to 90,76%. Furthermore, we find that the pair Choquet/Sugeno integrals and overlap functions are the ones providing the best results.	翻訳日:2021-03-27 14:06:21 公開日:2021-01-18
# (参考訳) T1およびT2強調MRIにおける肝のマルチモーダルセグメンテーションにおけるDeep Learning戦略の比較 Comparing Deep Learning strategies for paired but unregistered multimodal segmentation of the liver in T1 and T2-weighted MRI ( http://arxiv.org/abs/2101.06979v1 ) ライセンス: CC BY 4.0	Vincent Couteaux, Mathilde Trintignac, Olivier Nempont, Guillaume Pizaine, Anna Sesilia Vlachomitrou, Pierre-Jean Valette, Laurent Milot, Isabelle Bloch	(参考訳) マルチモーダル肝セグメンテーションにおけるT1,T2強調MR画像の問題点について検討した。文献に記述されているいくつかの戦略とマルチタスクトレーニングの有無,事前登録の有無を比較した。また,異なる損失関数(クロスエントロピー,ダイス損失,3つの逆損失)を比較する。全てのメソッドは、同時に両方のセグメンテーションを実行するマルチタスク設定を除いて、同等のパフォーマンスを達成した。 We address the problem of multimodal liver segmentation in paired but unregistered T1 and T2-weighted MR images. We compare several strategies described in the literature, with or without multi-task training, with or without pre-registration. We also compare different loss functions (cross-entropy, Dice loss, and three adversarial losses). All methods achieved comparable performances with the exception of a multi-task setting that performs both segmentations at once, which performed poorly.	翻訳日:2021-03-27 13:25:24 公開日:2021-01-18
# (参考訳) ニューラルランク付けモデルにおけるカタストロフィックフォーミングの研究 Studying Catastrophic Forgetting in Neural Ranking Models ( http://arxiv.org/abs/2101.06984v1 ) ライセンス: CC BY 4.0	Jesus Lovon-Melgarejo, Laure Soulier, Karen Pinel-Sauvagnat, Lynda Tamine	(参考訳) 最近のIR文献では、いくつかの深いニューラルネットワークランキングモデルが提案されている。データセットが保持する1つのターゲットドメインへの転送性は、従来のドメイン適応戦略を用いて広く取り組まれているが、そのクロスドメイン転送性に関する問題は、まだ未検討である。ニューラルランキングモデルは、新しい知識を得た後、以前に観測された領域から得られた古い知識を破滅的に忘れる程度に研究し、これらの領域のパフォーマンスを低下させる。実験の結果,脳波ランキングモデルの有効性は破滅的な忘れを犠牲にして達成され,クロスドメイン正規化器を用いた生涯学習戦略が問題を軽減することがわかった。また,回帰モデルに基づく説明的アプローチを用いて,ドメイン特性が破滅的忘れることの高まりに与える影響を示す。得られた結果は,神経赤外線における理論的および実用的な研究に有用であると考えられる。 Several deep neural ranking models have been proposed in the recent IR literature. While their transferability to one target domain held by a dataset has been widely addressed using traditional domain adaptation strategies, the question of their cross-domain transferability is still under-studied. We study here in what extent neural ranking models catastrophically forget old knowledge acquired from previously observed domains after acquiring new knowledge, leading to performance decrease on those domains. Our experiments show that the effectiveness of neuralIR ranking models is achieved at the cost of catastrophic forgetting and that a lifelong learning strategy using a cross-domain regularizer success-fully mitigates the problem. Using an explanatory approach built on a regression model, we also show the effect of domain characteristics on the rise of catastrophic forgetting. We believe that the obtained results can be useful for both theoretical and practical future work in neural IR.	翻訳日:2021-03-27 13:17:21 公開日:2021-01-18
# (参考訳) 機械学習モデル探索のためのインタラクティブスライス可視化 Interactive slice visualization for exploring machine learning models ( http://arxiv.org/abs/2101.06986v1 ) ライセンス: CC BY 4.0	Catherine B. Hurley, Mark O'Connell, Katarina Domijan	(参考訳) 機械学習モデルは、任意の規模のデータセットに複雑なアルゴリズムを適合させる。これらのアルゴリズムは、性能が高く、解釈性が低いことでよく知られている。我々は、予測空間のスライスをインタラクティブに可視化し、解釈可能性の欠陥に対処し、事実上、機械学習アルゴリズムのブラックボックスを開くことで、モデルの適合性を疑問視し、説明し、検証し、比較することを目的としている。スライスは相互作用を通じて直接指定されるか、あるいはモデルに適合する高占有領域や地域を訪れるように設計された様々なツアーアルゴリズムを使用する。ここで提示されるメソッドは、Rパッケージ \pkg{condvis2} に実装される。 Machine learning models fit complex algorithms to arbitrarily large datasets. These algorithms are well-known to be high on performance and low on interpretability. We use interactive visualization of slices of predictor space to address the interpretability deficit; in effect opening up the black-box of machine learning algorithms, for the purpose of interrogating, explaining, validating and comparing model fits. Slices are specified directly through interaction, or using various touring algorithms designed to visit high-occupancy sections or regions where the model fits have interesting properties. The methods presented here are implemented in the R package \pkg{condvis2}.	翻訳日:2021-03-27 13:01:57 公開日:2021-01-18
# (参考訳) テネシー・イーストマン化学プロセスにおける故障検出のためのニューラルネットワークの深部圧縮 Deep Compression of Neural Networks for Fault Detection on Tennessee Eastman Chemical Processes ( http://arxiv.org/abs/2101.06993v1 ) ライセンス: CC BY 4.0	Mingxuan Li, Yuanxun Shao	(参考訳) 人工ニューラルネットワークはテネシー・イーストマンプロセスにおいて最先端のフォールト検出性能を達成したが、膨大なパラメータに資金を提供するには膨大なメモリを必要とすることが多い。オンラインリアルタイム故障検出を実現するために,3つの深部圧縮技術(プルーニング,クラスタリング,量子化)を適用し,計算負担を軽減する。我々は7種類の圧縮技術の組み合わせを広範囲に研究し、全ての手法が高いモデル圧縮率を64%以上達成し、高い故障検出精度を維持した。最も優れた結果として、3つのテクニックを全て適用し、モデルのサイズを91.5%削減し、精度は94%以上である。これにより、本番環境でのストレージ要件が小さくなり、実環境におけるデプロイメントがよりスムーズになる。 Artificial neural network has achieved the state-of-art performance in fault detection on the Tennessee Eastman process, but it often requires enormous memory to fund its massive parameters. In order to implement online real-time fault detection, three deep compression techniques (pruning, clustering, and quantization) are applied to reduce the computational burden. We have extensively studied 7 different combinations of compression techniques, all methods achieve high model compression rates over 64% while maintain high fault detection accuracy. The best result is applying all three techniques, which reduces the model sizes by 91.5% and remains a high accuracy over 94%. This result leads to a smaller storage requirement in production environments, and makes the deployment smoother in real world.	翻訳日:2021-03-27 13:01:03 公開日:2021-01-18
# (参考訳) 深部普遍ブラインド画像 Deep Universal Blind Image Denoising ( http://arxiv.org/abs/2101.07017v1 ) ライセンス: CC BY 4.0	Jae Woong Soh, Nam Ik Cho	(参考訳) 画像のノイズ除去は、画像取得時に避けられないノイズによる多くの画像処理やコンピュータビジョンタスクで不可欠な部分である。伝統的に、多くの研究者が画像の性質と統計に基づくベイズ的視点で画像の優先順位を調査してきた。近年,deep convolutional neural networks (cnns) は大規模合成データセットを組み込んだ画像デノイジングにおいて大きな成功を収めている。しかし、どちらも長所と短所がある。ディープCNNは既知の統計でノイズを取り除くのに強力だが、視覚障害者や現実世界の騒音には柔軟性と実用性が欠けている傾向がある。さらに、明示的な事前設定は簡単には採用できない。一方、従来の非学習手法は明示的な画像先行処理を伴い得るが、かなりの計算時間を必要とし、大規模な外部データセットを活用できない。本稿では,ベイズ的視点に基づく両手法の利点を生かしたCNNに基づく手法を提案する。具体的には,視覚障害をサブプロブレムに分割し,各推論問題を分解する。 CNNは推論のための強力なツールであるため,提案手法はCNNに根ざし,効率的な推論のための新しいネットワーク設計を提案する。提案手法により,広帯域CNNのパラメータを適度に行うことで,視覚と現実世界のノイズを除去できる。 Image denoising is an essential part of many image processing and computer vision tasks due to inevitable noise corruption during image acquisition. Traditionally, many researchers have investigated image priors for the denoising, within the Bayesian perspective based on image properties and statistics. Recently, deep convolutional neural networks (CNNs) have shown great success in image denoising by incorporating large-scale synthetic datasets. However, they both have pros and cons. While the deep CNNs are powerful for removing the noise with known statistics, they tend to lack flexibility and practicality for the blind and real-world noise. Moreover, they cannot easily employ explicit priors. On the other hand, traditional non-learning methods can involve explicit image priors, but they require considerable computation time and cannot exploit large-scale external datasets. In this paper, we present a CNN-based method that leverages the advantages of both methods based on the Bayesian perspective. Concretely, we divide the blind image denoising problem into sub-problems and conquer each inference problem separately. As the CNN is a powerful tool for inference, our method is rooted in CNNs and propose a novel design of network for efficient inference. With our proposed method, we can successfully remove blind and real-world noise, with a moderate number of parameters of universal CNN.	翻訳日:2021-03-27 12:55:16 公開日:2021-01-18
# (参考訳) weibull分布による事象ログを用いたイベント駆動型予測保守のキーフレーバーの解析 Analysis of key flavors of event-driven predictive maintenance using logs of phenomena described by Weibull distributions ( http://arxiv.org/abs/2101.07033v1 ) ライセンス: CC BY 4.0	Petros Petsinis, Athanasios Naskos and Anastasios Gounaris	(参考訳) この研究は、業界4.0におけるイベント駆動型予測保守への2つのアプローチを探求し、それぞれ問題を分類または回帰として、最先端の2つのソリューションの出発点として使用します。これら2つの手法のそれぞれについて,異なるデータ前処理手法,異なる予測アルゴリズム,およびアンサンブルとサンプリング方法の影響について検討する。以上のような側面を体系的に実験することで,選択肢の強みを理解し,さらに重要な点として,多数の代替手段をインフォームドでナビゲートする方法に光を当てた。我々の研究は、この種のデータ駆動型予測保守の真の可能性を理解するための重要なステップを構成し、実践者が最も影響の大きい側面に集中するのを手助けします。 This work explores two approaches to event-driven predictive maintenance in Industry 4.0 that cast the problem at hand as a classification or a regression one, respectively, using as a starting point two state-of-the-art solutions. For each of the two approaches, we examine different data preprocessing techniques, different prediction algorithms and the impact of ensemble and sampling methods. Through systematic experiments regarding the aspectsmentioned above,we aimto understand the strengths of the alternatives, and more importantly, shed light on how to navigate through the vast number of such alternatives in an informed manner. Our work constitutes a key step towards understanding the true potential of this type of data-driven predictive maintenance as of to date, and assist practitioners in focusing on the aspects that have the greatest impact.	翻訳日:2021-03-27 12:43:06 公開日:2021-01-18
# (参考訳) 顔解析のための適応グラフ表現学習と推論 Adaptive Graph Representation Learning and Reasoning for Face Parsing ( http://arxiv.org/abs/2101.07034v1 ) ライセンス: CC BY 4.0	Gusi Te, Wei Hu, Yinglu Liu, Hailin Shi, Tao Mei	(参考訳) 顔解析は、最近注目を集めている各顔コンポーネントにピクセル単位のラベルを推測する。これまでは顔解析に成功していたが、顔成分間の相関を見落としている。実際、コンポーネント間の関係は、顔領域の曖昧なピクセルを識別するための重要な手がかりである。そこで本研究では,顔成分に対する適応的グラフ表現学習と推論を提案し,各成分を記述した代表頂点を学習し,成分関係を活用し,曖昧性に対する正確な解析結果を生成する。特に,ある顔領域内の画素特徴が頂点に集約される予測解析マップの初期条件下で,画素対頂点投影によりグラフ上の成分を表現する適応的で微分可能なグラフ抽象化手法を考案した。さらに,画像エッジを先行として,投影中にエッジと非エッジの画素を識別し,エッジに沿った解析結果の洗練に寄与するモデルとして,画像エッジを明示的に組み込む。そして,グラフ上の頂点をまたいで情報を伝播することにより,コンポーネント間の関係を学習し,理由付けを行う。最後に、改良された頂点機能は最終解析マップの予測のためにピクセルグリッドに投影される。本モデルでは,特徴空間における頂点間の小さな距離をペナルティ化する識別的損失を提案する。実験の結果,提案モデルが複数顔解析データセット上で優れた性能を示すとともに,人間の解析タスクの検証を行い,モデルの一般化可能性を示した。 Face parsing infers a pixel-wise label to each facial component, which has drawn much attention recently. Previous methods have shown their success in face parsing, which however overlook the correlation among facial components. As a matter of fact, the component-wise relationship is a critical clue in discriminating ambiguous pixels in facial area. To address this issue, we propose adaptive graph representation learning and reasoning over facial components, aiming to learn representative vertices that describe each component, exploit the component-wise relationship and thereby produce accurate parsing results against ambiguity. In particular, we devise an adaptive and differentiable graph abstraction method to represent the components on a graph via pixel-to-vertex projection under the initial condition of a predicted parsing map, where pixel features within a certain facial region are aggregated onto a vertex. Further, we explicitly incorporate the image edge as a prior in the model, which helps to discriminate edge and non-edge pixels during the projection, thus leading to refined parsing results along the edges. Then, our model learns and reasons over the relations among components by propagating information across vertices on the graph. Finally, the refined vertex features are projected back to pixel grids for the prediction of the final parsing map. To train our model, we propose a discriminative loss to penalize small distances between vertices in the feature space, which leads to distinct vertices with strong semantics. Experimental results show the superior performance of the proposed model on multiple face parsing datasets, along with the validation on the human parsing task to demonstrate the generalizability of our model.	翻訳日:2021-03-27 12:25:06 公開日:2021-01-18
# (参考訳) CLASTER:ゼロショット動作認識のための強化学習によるクラスタリング CLASTER: Clustering with Reinforcement Learning for Zero-Shot Action Recognition ( http://arxiv.org/abs/2101.07042v1 ) ライセンス: CC BY 4.0	Shreyank N Gowda, Laura Sevilla-Lara, Frank Keller, Marcus Rohrbach	(参考訳) ゼロショットアクション認識は、視覚的な例のないアクションクラスを認識するタスクであり、目に見えないクラスに関連するセマンティックな埋め込みである。問題は、クラス間の区別を失うことなく、目に見えないクラスのインスタンスによく一般化する関数を学ぶことである。ニューラルネットワークは、視覚クラス間の複雑な境界をモデル化することができる。しかし、ゼロショット学習では、これらの高度に専門化されたクラス境界は、目に見えるクラスから見当たらないクラスへうまく移行できないかもしれない。本稿では,各インスタンスを個別に最適化するのではなく,すべてのトレーニングサンプルを同時に検討するクラスタリングモデルを提案する。私たちはReinforcement Learningを使ってクラスタリングを最適化します。我々は提案手法をCLASTERと呼び、標準ゼロショット評価と一般化ゼロショット学習の両方において、標準データセットであるUCF101, HMDB51, オリンピックスポーツの最先端性を常に改善することを確認する。 Zero-shot action recognition is the task of recognizing action classes without visual examples, only with a semantic embedding which relates unseen to seen classes. The problem can be seen as learning a function which generalizes well to instances of unseen classes without losing discrimination between classes. Neural networks can model the complex boundaries between visual classes, which explains their success as supervised models. However, in zero-shot learning, these highly specialized class boundaries may not transfer well from seen to unseen classes. In this paper, we propose a clustering-based model, which considers all training samples at once, instead of optimizing for each instance individually. We optimize the clustering using Reinforcement Learning which we show is critical for our approach to work. We call the proposed method CLASTER and observe that it consistently improves over the state-of-the-art in all standard datasets, UCF101, HMDB51, and Olympic Sports; both in the standard zero-shot evaluation and the generalized zero-shot learning.	翻訳日:2021-03-27 12:04:51 公開日:2021-01-18
# (参考訳) IMU事前積分による深部慣性オドメトリー Deep Inertial Odometry with Accurate IMU Preintegration ( http://arxiv.org/abs/2101.07061v1 ) ライセンス: CC BY 4.0	Rooholla Khorrambakht, Chris Xiaoxuan Lu, Hamed Damirchi, Zhenghua Chen, Zhengguo Li	(参考訳) 慣性測定ユニット (IMU) は、環境要因に依存しないエゴモーション計測を提供する、インターセプティブなモダリティである。様々な自律システムで広く採用されている。数理モデルを用いてこれらのセンサからノイズ測定を処理することの限界に触発され、研究者は近年、慣性計測をエンドツーエンドに推定する様々なディープラーニングアーキテクチャを提案している。それでも、IMUからの高周波および冗長な測定により、長い生の配列が処理される。本研究では, 深部慣性計測のためのIMU運動モデル(DIO)のより現実的な解法として, 精度の高い事前積分の有効性を検討することを目的としている。正確なIMU事前積分は、既存のDIOで使用される連続IMUモデルの数値近似よりも優れている可能性がある。実験結果は提案したDIOを検証する。 Inertial Measurement Units (IMUs) are interceptive modalities that provide ego-motion measurements independent of the environmental factors. They are widely adopted in various autonomous systems. Motivated by the limitations in processing the noisy measurements from these sensors using their mathematical models, researchers have recently proposed various deep learning architectures to estimate inertial odometry in an end-to-end manner. Nevertheless, the high-frequency and redundant measurements from IMUs lead to long raw sequences to be processed. In this study, we aim to investigate the efficacy of accurate preintegration as a more realistic solution to the IMU motion model for deep inertial odometry (DIO) and the resultant DIO is a fusion of model-driven and data-driven approaches. The accurate IMU preintegration has the potential to outperform numerical approximation of the continuous IMU model used in the existing DIOs. Experimental results validate the proposed DIO.	翻訳日:2021-03-27 11:50:45 公開日:2021-01-18
# (参考訳) 因果効果推定による領域適応のためのモデル圧縮 Model Compression for Domain Adaptation through Causal Effect Estimation ( http://arxiv.org/abs/2101.07086v1 ) ライセンス: CC BY 4.0	Guy Rotman, Amir Feder and Roi Reichart	(参考訳) 自然言語処理システムの予測品質の最近の改善は、しばしばモデルパラメータの大幅な増加に依存している。これは、これらのモデルを圧縮する様々な試みにつながったが、既存の手法では、様々なモデルコンポーネントの予測能力や圧縮モデルの一般化可能性の違いは考慮されていない。モデル圧縮とアウト・オブ・ディストリビューション一般化の関連性を理解するため,ドメイン適応設定において最良となるように言語表現モデルを圧縮するタスクを定義する。我々は、モデルの予測に基づいて、単一層のようなモデルコンポーネントの \textit{average treatment effect} (ATE) を推定しようと、因果的な観点からこの問題に対処することを選択した。提案したATE誘導モデル圧縮スキーム(AMoC)は,除去されたモデルコンポーネントによって異なる多くのモデル候補を生成する。次に、ATEを利用した段階的回帰モデルを用いて、最適候補を選択し、対象領域における期待性能を予測する。 AMoCは2つのテキスト分類タスクで60のドメインペアのうち46の強いベースラインより優れており、F1の平均的な改善は最強のベースラインより3倍以上多い。 Recent improvements in the predictive quality of natural language processing systems are often dependent on a substantial increase in the number of model parameters. This has led to various attempts of compressing such models, but existing methods have not considered the differences in the predictive power of various model components or in the generalizability of the compressed models. To understand the connection between model compression and out-of-distribution generalization, we define the task of compressing language representation models such that they perform best in a domain adaptation setting. We choose to address this problem from a causal perspective, attempting to estimate the \textit{average treatment effect} (ATE) of a model component, such as a single layer, on the model's predictions. Our proposed ATE-guided Model Compression scheme (AMoC), generates many model candidates, differing by the model components that were removed. Then, we select the best candidate through a stepwise regression model that utilizes the ATE to predict the expected performance on the target domain. AMoC outperforms strong baselines on 46 of 60 domain pairs across two text classification tasks, with an average improvement of more than 3\% in F1 above the strongest baseline.	翻訳日:2021-03-27 11:42:07 公開日:2021-01-18
# (参考訳) ストレージシミュレータが生成する障害のオンライン検出 Online detection of failures generated by storage simulator ( http://arxiv.org/abs/2101.07100v1 ) ライセンス: CC BY 4.0	Kenenbek Arzymatov, Mikhail Hushchyn, Andrey Sapronov, Vladislav Belavin, Leonid Gremyachikh, Maksim Karpov and Andrey Ustyuzhanin	(参考訳) 現代の大規模データファームは、分散インフラストラクチャにまたがる数十万のストレージデバイスで構成されている。現代のデータセンター(コントローラ、リンク、SSD、HDDディスクなど)で使用されるデバイスは、ハードウェアとソフトウェアの問題によって故障する可能性がある。このような障害や異常は、機械学習技術を用いてコンポーネントのアクティビティを監視することで検出できる。これらの技術を使うためには、研究者は通常のデバイスの履歴データと、アルゴリズムのトレーニングに障害モードを必要とする。本研究では,1)シミュレータ作成によるメソッド内のストレージデータの欠如,2)コンポーネントの1つで発生した障害を素早く検出できる既存のオンラインアルゴリズムの適用という2つの課題に挑戦する。現代のストレージインフラストラクチャの振る舞いをシミュレートするためのGoベースの(golang)パッケージを作成しました。このソフトウェアは離散イベントモデリングのパラダイムに基づいており、高レベルのストレージシステム構築ブロックの構造とダイナミクスをキャプチャする。パッケージのフレキシブルな構造により、構成可能なコンポーネント数で現実世界のストレージシステムのモデルを作成することができます。主な関心領域は、部品の故障を観察するための中長期のストレステストや利用の下での記憶装置の動作を探索することである。シミュレータが生成した時系列分布の故障を検出するため,オンラインモードで動作する変更点検出アルゴリズムを改良した。変化点検出の目標は、時系列分布の違いを発見することである。本稿では,バイナリ分類器を用いた直接密度比推定に基づく時系列データの故障検出手法について述べる。 Modern large-scale data-farms consist of hundreds of thousands of storage devices that span distributed infrastructure. Devices used in modern data centers (such as controllers, links, SSD- and HDD-disks) can fail due to hardware as well as software problems. Such failures or anomalies can be detected by monitoring the activity of components using machine learning techniques. In order to use these techniques, researchers need plenty of historical data of devices in normal and failure mode for training algorithms. In this work, we challenge two problems: 1) lack of storage data in the methods above by creating a simulator and 2) applying existing online algorithms that can faster detect a failure occurred in one of the components. We created a Go-based (golang) package for simulating the behavior of modern storage infrastructure. The software is based on the discrete-event modeling paradigm and captures the structure and dynamics of high-level storage system building blocks. The package's flexible structure allows us to create a model of a real-world storage system with a configurable number of components. The primary area of interest is exploring the storage machine's behavior under stress testing or exploitation in the medium- or long-term for observing failures of its components. To discover failures in the time series distribution generated by the simulator, we modified a change point detection algorithm that works in online mode. The goal of the change-point detection is to discover differences in time series distribution. This work describes an approach for failure detection in time series data based on direct density ratio estimation via binary classifiers.	翻訳日:2021-03-27 11:23:47 公開日:2021-01-18
# (参考訳) Telugu言語のためのニューラル抽象テキスト要約器 Neural Abstractive Text Summarizer for Telugu Language ( http://arxiv.org/abs/2101.07120v1 ) ライセンス: CC BY 4.0	Mohan Bharath B, Aravindh Gowtham B, Akhil M	(参考訳) 抽象テキスト要約 (Abstractive Text Summarization) は、ソーステキストの全体的意味の本質を捉える意味論的に関連する短い文を構築する過程である。実際、人間がテキストの大きな文書を手作業で要約するのは困難であり、非常に時間がかかります。抽象的なテキスト要約の作業の多くは英語で行われており、テルグの抽象的なテキスト要約にはほとんど大きな成果が報告されていない。そこで我々は,Deep Learningを用いたTelugu言語のための抽象的なテキスト要約手法を提案する。本稿では,Telugu言語のための抽象テキスト要約深層学習モデルを提案する。提案手法は注意機構を有するエンコーダ・デコーダシーケンシャルモデルに基づく。このモデルを手作業で作成したデータセットに適用して,ソーステキストの一文要約を生成し,質的に測定した結果を得た。 Abstractive Text Summarization is the process of constructing semantically relevant shorter sentences which captures the essence of the overall meaning of the source text. It is actually difficult and very time consuming for humans to summarize manually large documents of text. Much of work in abstractive text summarization is being done in English and almost no significant work has been reported in Telugu abstractive text summarization. So, we would like to propose an abstractive text summarization approach for Telugu language using Deep learning. In this paper we are proposing an abstractive text summarization Deep learning model for Telugu language. The proposed architecture is based on encoder-decoder sequential models with attention mechanism. We have applied this model on manually created dataset to generate a one sentence summary of the source text and have got good results measured qualitatively.	翻訳日:2021-03-27 11:18:00 公開日:2021-01-18
# (参考訳) ReLUネットワークにおける深さの利点に関する簡単な幾何学的証明 A simple geometric proof for the benefit of depth in ReLU networks ( http://arxiv.org/abs/2101.07126v1 ) ライセンス: CC BY 4.0	Asaf Amrami and Yoav Goldberg	(参考訳) 本稿では, 再活性化した多層フィードフォワードネットワーク(deepth separation)における深度効果の簡易な証明を提案する。具体的には、$m$でインデックス付けされた一連の分類問題を示し、(a)任意の固定深さ整流ネットワークに対して、(a) 問題を正しく分類するには指数関数的なパラメータ数($m$)が必要となる$m$ と、(b) シーケンス中の任意の問題に対して、問題をゼロエラーで分類する、線形深さ($m$)と小さい定数幅($\leq 4$)を持つ具体的なニューラルネットワークを示す。構成的証明は幾何学的議論と空間折り畳み構成に基づいている。より強固な境界と結果が存在する一方で、この証明は極めて単純なツールと技術を用いており、コンピュータサイエンスの学部生や同様の背景を持つ人々にもアクセス可能であるべきである。 We present a simple proof for the benefit of depth in multi-layer feedforward network with rectified activation ("depth separation"). Specifically we present a sequence of classification problems indexed by $m$ such that (a) for any fixed depth rectified network there exist an $m$ above which classifying problem $m$ correctly requires exponential number of parameters (in $m$); and (b) for any problem in the sequence, we present a concrete neural network with linear depth (in $m$) and small constant width ($\leq 4$) that classifies the problem with zero error. The constructive proof is based on geometric arguments and a space folding construction. While stronger bounds and results exist, our proof uses substantially simpler tools and techniques, and should be accessible to undergraduate students in computer science and people with similar backgrounds.	翻訳日:2021-03-27 11:13:46 公開日:2021-01-18
# (参考訳) 近似k-部分モジュラー関数の最大化 Maximizing approximately k-submodular functions ( http://arxiv.org/abs/2101.07157v1 ) ライセンス: CC BY 4.0	Leqian Zheng and Hau Chan and Grigorios Loukides and Minming Li	(参考訳) サイズ制約を受ける約$k$-サブモジュラー関数を最大化する問題を導入する。この問題では、有界な総サイズまたは個々のサイズを持つ基底集合の$k$-disjoint部分集合と、$k$-submodular となる関数の "close" によって与えられる最大効用を選ぼうとする。この問題は、ノイズの多いセンサのタイプに$k$をインストールしたいというセンサー配置や、影響力のレベルが不確実なソーシャルネットワークのユーザに$k$のトピックを宣伝しようとしている最大化などのタスクに応用されている。この問題に対処するために、我々はまず、約$k$-submodular関数に対する2つの自然な定義を提供し、それらの間の階層的関係を確立する。次に, 単純な欲望アルゴリズムが, 異なるサイズ制約に対する近似保証を提供することを示す。最後に,このアルゴリズムがセンサ配置や最大化問題に有効であることを実験的に示す。 We introduce the problem of maximizing approximately $k$-submodular functions subject to size constraints. In this problem, one seeks to select $k$-disjoint subsets of a ground set with bounded total size or individual sizes, and maximum utility, given by a function that is "close" to being $k$-submodular. The problem finds applications in tasks such as sensor placement, where one wishes to install $k$ types of sensors whose measurements are noisy, and influence maximization, where one seeks to advertise $k$ topics to users of a social network whose level of influence is uncertain. To deal with the problem, we first provide two natural definitions for approximately $k$-submodular functions and establish a hierarchical relationship between them. Next, we show that simple greedy algorithms offer approximation guarantees for different types of size constraints. Last, we demonstrate experimentally that the greedy algorithms are effective in sensor placement and influence maximization problems.	翻訳日:2021-03-27 11:05:08 公開日:2021-01-18
# (参考訳) 限られたデータによる機械学習 Machine learning with limited data ( http://arxiv.org/abs/2101.11461v1 ) ライセンス: CC BY 4.0	Fupin Yao	(参考訳) 強力なコンピューティングリソース、ビッグデータ、ディープラーニングアルゴリズムの可用性のおかげで、ここ数年でコンピュータビジョンに大きな進歩を遂げました。コンピュータビジョンシステムは、物体認識、物体検出、顔認識、ポーズ推定など、いくつかのタスクで人間を超え始めます。多くのコンピュータビジョンアルゴリズムが現実世界のアプリケーションにデプロイされ、私たちの生活の質を改善し始めた。しかし、ビッグデータやラベルは必ずしも利用できない。時には、専門家がラベルを付ける必要のある医療画像など、非常に限定されたラベルデータしか持っていないことがあります。本稿では,少ないラベル付きデータしか持たない撮影画像分類について検討する。小さなデータによる機械学習は大きな課題だ。この課題に取り組むために,我々は2つの手法を提案し,その効果を徹底的に検証する。 1つの方法は、これらの画像のスタイルを混ぜることで、画像の特徴を強化することである。第2の方法は、画像のパッチ間の関係を探索するために空間的注意を適用することである。また、トレーニングドメインとテストドメインが異なる場合、わずかなショット学習では、ドメインシフトが重要な問題であることも分かりました。そこで本稿では,ラベルのないデータセットを対象とする,より現実的なドメイン間数ショット学習を提案する。この設定では2つの方法を提案する。第1の方法は、ラベルのないターゲットデータセットのスタイル情報をソースデータセットのサンプルに転送し、スタイリッシュなイメージとオリジナルイメージでモデルをトレーニングする。第2の方法は,すべてのデータを完全に活用するための統一フレームワークを提案する。どちらの手法も基準法を大きなマージンで上回ります。 Thanks to the availability of powerful computing resources, big data and deep learning algorithms, we have made great progress on computer vision in the last few years. Computer vision systems begin to surpass humans in some tasks, such as object recognition, object detection, face recognition and pose estimation. Lots of computer vision algorithms have been deployed to real world applications and started to improve our life quality. However, big data and labels are not always available. Sometimes we only have very limited labeled data, such as medical images which requires experts to label them. In this paper, we study few shot image classification, in which we only have very few labeled data. Machine learning with little data is a big challenge. To tackle this challenge, we propose two methods and test their effectiveness thoroughly. One method is to augment image features by mixing the style of these images. The second method is applying spatial attention to explore the relations between patches of images. We also find that domain shift is a critical issue in few shot learning when the training domain and testing domain are different. So we propose a more realistic cross-domain few-shot learning with unlabeled data setting, in which some unlabeled data is available in the target domain. We propose two methods in this setting. Our first method transfers the style information of the unlabeled target dataset to the samples in the source dataset and trains a model with stylized images and original images. Our second method proposes a unified framework to fully utilize all the data. Both of our methods surpass the baseline method by a large margin.	翻訳日:2021-03-27 10:08:57 公開日:2021-01-18
# (参考訳) アクティブ輪郭モデルとスピードアップロバスト特徴を用いた顔料病変の自動分離と評価の新しいアプローチ A New Approach for Automatic Segmentation and Evaluation of Pigmentation Lesion by using Active Contour Model and Speeded Up Robust Features ( http://arxiv.org/abs/2101.07195v1 ) ライセンス: CC0 1.0	Sara Mardanisamani, Zahra Karimi, Akram Jamshidzadeh, Mehran Yazdi, Melika Farshad, Amirmehdi Farshad	(参考訳) デジタル画像処理技術は、医学を含む様々な科学分野に広く応用されている。画像処理アルゴリズムを用いることで、医師はさまざまな疾患の診断に成功し、より優れた治療結果を得た。本稿では,皮膚病変の分類とそれに関連する特徴の抽出を自動で行う手法を提案する。この目的には、高速化されたロバスト特徴量(surf)とアクティブ輪郭モデル(acm)の組み合わせを用いる。提案手法では,皮膚病変の第一領域を全皮膚画像から抽出し,その領域から平均値,分散値,RGB値,HSV値などの特徴を抽出する。提案手法は,大津しきい値を用いたセグメンテーションの結果と比較し,大津留置法よりも手順が優れていることを示す。提案法と大津しきい値による皮膚病変の分別は,医師の手作業法と比較した。 SURFとACMを併用した皮膚病変の分画法は,最もよい結果が得られた。本手法を実験的に評価するために,20種類の皮膚病変画像に適用した。その結果,提案手法の性能,速度,精度が確認できた。 Digital image processing techniques have wide applications in different scientific fields including the medicine. By use of image processing algorithms, physicians have been more successful in diagnosis of different diseases and have achieved much better treatment results. In this paper, we propose an automatic method for segmenting the skin lesions and extracting features that are associated to them. At this aim, a combination of Speeded-Up Robust Features (SURF) and Active Contour Model (ACM), is used. In the suggested method, at first region of skin lesion is segmented from the whole skin image, and then some features like the mean, variance, RGB and HSV parameters are extracted from the segmented region. Comparing the segmentation results, by use of Otsu thresholding, our proposed method, shows the superiority of our procedure over the Otsu theresholding method. Segmentation of the skin lesion by the proposed method and Otsu thresholding compared the results with physician's manual method. The proposed method for skin lesion segmentation, which is a combination of SURF and ACM, gives the best result. For empirical evaluation of our method, we have applied it on twenty different skin lesion images. Obtained results confirm the high performance, speed and accuracy of our method.	翻訳日:2021-03-27 09:59:36 公開日:2021-01-18
# (参考訳) 集中的相手を用いた医用画像の連合生成モデルによるバイアス低減と有用性の向上 Reducing bias and increasing utility by federated generative modeling of medical images using a centralized adversary ( http://arxiv.org/abs/2101.07235v1 ) ライセンス: CC BY 4.0	Jean-Francois Rajotte, Sumit Mukherjee, Caleb Robinson, Anthony Ortiz, Christopher West, Juan Lavista Ferres, Raymond T Ng	(参考訳) 我々は、協調学習を可能にする生成メカニズムであるFELICIA(Federated LearnIng with a CentralIzed Adversary)を紹介する。特に、限定的かつ偏りのあるデータを持つデータ所有者が、すべてのソースからのデータをプライベートに保ちながら、他のデータ所有者の利益を享受できることを示す。これは、プライバシー法がデータをローカルな施設外で共有することを防ぐ医療画像解析において一般的なシナリオである。 FELICIAは、この研究で示されているように、バニラや条件付きGANを含むGAN(Generative Adversarial Networks)アーキテクチャの大規模なファミリーで動作する。 FELICIA機構を用いることで,データ所有者がデータへのアクセスを提供しなくても,画像サンプルに制限のあるデータ所有者が高能率で高品質な合成画像を生成することができることを示す。共有は、合成データに限られる中央の識別器を通してのみ行われる。ここで、ユーティリティは実際のテストセットの分類性能として定義される。皮膚病変分類のための医用画像およびベンチマーク画像データセット(mnist, cifar-10)を用いて,いくつかの現実的な医療シナリオにおいて,これらの利点を実証する。複数の実験で、最悪の場合においても、FELICIAと実データを組み合わせることで、実データと同等の性能が得られ、ほとんどの結果が実用性を大幅に向上することを示した。 We introduce FELICIA (FEderated LearnIng with a CentralIzed Adversary) a generative mechanism enabling collaborative learning. In particular, we show how a data owner with limited and biased data could benefit from other data owners while keeping data from all the sources private. This is a common scenario in medical image analysis where privacy legislation prevents data from being shared outside local premises. FELICIA works for a large family of Generative Adversarial Networks (GAN) architectures including vanilla and conditional GANs as demonstrated in this work. We show that by using the FELICIA mechanism, a data owner with limited image samples can generate high-quality synthetic images with high utility while neither data owners has to provide access to its data. The sharing happens solely through a central discriminator that has access limited to synthetic data. Here, utility is defined as classification performance on a real test set. We demonstrate these benefits on several realistic healthcare scenarions using benchmark image datasets (MNIST, CIFAR-10) as well as on medical images for the task of skin lesion classification. With multiple experiments, we show that even in the worst cases, combining FELICIA with real data gracefully achieves performance on par with real data while most results significantly improves the utility.	翻訳日:2021-03-27 09:51:38 公開日:2021-01-18
# (参考訳) 半教師付き学習のためのマルチモーダル変分オートエンコーダ--製品・オブ・エキスパートの擁護 Multimodal Variational Autoencoders for Semi-Supervised Learning: In Defense of Product-of-Experts ( http://arxiv.org/abs/2101.07240v1 ) ライセンス: CC BY 4.0	Svetlana Kutuzova, Oswin Krause, Douglas McCloskey, Mads Nielsen, Christian Igel	(参考訳) マルチモーダル生成モデルは、すべてのモダリティ(画像やテキストなど)のコヒーレントな共同生成を可能にする有意義な潜在表現を学べるべきである。多くの応用では、モダリティのサブセットの観測で条件付けられたモダリティを正確にサンプリングする能力も必要である。すべてのトレーニングデータポイントですべてのモダリティが観測されるわけではないため、半教師付き学習が可能となる。本研究では,これらの特性を持つ多変量オートエンコーダの製品群(PoE)を評価する。我々は新しいpoeベースのアーキテクチャとトレーニング手順を含む。経験的評価は、PoEベースのモデルが添加性混合(MoE)アプローチより優れていることを示している。我々の実験は、PoEモデルがモジュラリティの共役結合に適しているのに対して、MoEは接合融合に適しているという直感を支持する。 Multimodal generative models should be able to learn a meaningful latent representation that enables a coherent joint generation of all modalities (e.g., images and text). Many applications also require the ability to accurately sample modalities conditioned on observations of a subset of the modalities. Often not all modalities may be observed for all training data points, so semi-supervised learning should be possible. In this study, we evaluate a family of product-of-experts (PoE) based variational autoencoders that have these desired properties. We include a novel PoE based architecture and training procedure. An empirical evaluation shows that the PoE based models can outperform an additive mixture-of-experts (MoE) approach. Our experiments support the intuition that PoE models are more suited for a conjunctive combination of modalities while MoEs are more suited for a disjunctive fusion.	翻訳日:2021-03-27 09:37:27 公開日:2021-01-18
# (参考訳) 観察による学習:人間ビデオからの操作スキルの物理的模倣 Learning by Watching: Physical Imitation of Manipulation Skills from Human Videos ( http://arxiv.org/abs/2101.07241v1 ) ライセンス: CC0 1.0	Haoyu Xiong, Quanzhou Li, Yun-Chun Chen, Homanga Bharadhwaj, Samarth Sinha, Animesh Garg	(参考訳) 本稿では,ロボット操作作業のための人間ビデオからの物理模倣手法を提案する。我々の手法の鍵となる考え方は、ビデオに埋め込まれた運動情報と運動情報を明示的に活用して、ロボットが自身のコンテキストで操作を行う方法を想像できる構造的表現を学ぶことである。そこで我々は,人間の映像をロボット領域に翻訳し,教師なしのキーポイント検出を行う知覚モジュールを設計した。得られたキーポイントに基づく表現は意味的に意味のある情報を提供し、報酬計算やポリシー学習に直接利用できる。提案手法は, ロボット操作作業において, リーチ, 押圧, スライディング, コーヒーメイキング, 引き出しクローズの5つの課題に対して有効性を評価する。詳細な実験評価の結果,従来の手法に好適な効果を示した。 We present an approach for physical imitation from human videos for robot manipulation tasks. The key idea of our method lies in explicitly exploiting the kinematics and motion information embedded in the video to learn structured representations that endow the robot with the ability to imagine how to perform manipulation tasks in its own context. To achieve this, we design a perception module that learns to translate human videos to the robot domain followed by unsupervised keypoint detection. The resulting keypoint-based representations provide semantically meaningful information that can be directly used for reward computing and policy learning. We evaluate the effectiveness of our approach on five robot manipulation tasks, including reaching, pushing, sliding, coffee making, and drawer closing. Detailed experimental evaluations demonstrate that our method performs favorably against previous approaches.	翻訳日:2021-03-27 09:01:24 公開日:2021-01-18
# (参考訳) 量子格子モデルに対するゲージ不変自己回帰ニューラルネットワーク Gauge Invariant Autoregressive Neural Networks for Quantum Lattice Models ( http://arxiv.org/abs/2101.07243v1 ) ライセンス: CC0 1.0	Di Luo, Zhuo Chen, Kaiwen Hu, Zhizhen Zhao, Vera Mikyoung Hur, and Bryan K. Clark	(参考訳) ゲージ不変性は、凝縮物物理学から高エネルギー物理学まで量子力学において重要な役割を果たす。量子格子モデルのためのゲージ不変自己回帰ニューラルネットワークの構築手法を開発する。これらのネットワークは効率的にサンプリングでき、ゲージ対称性を明示的に従うことができる。我々は、ゲージ不変自己回帰ニューラルネットワークの基底状態と、様々なモデルのリアルタイムダイナミクスを可変に最適化する。 2Dおよび3Dトーリック符号の基底状態と励起状態、およびX-キューブフラクトンモデルを正確に表現する。我々は、$\text{u(1)}$格子ゲージ理論の量子リンクモデルのダイナミクスをシミュレートし、2d $\mathbb{z}_2$ゲージ理論の位相図を取得し、$\text{su(2)}_3$anyonic chainの位相遷移と中心電荷を決定し、$\text{su(2)}$ invariant heisenbergスピンチェーンの基底状態エネルギーを計算する。我々のアプローチは、凝縮物質物理学、高エネルギー物理学、量子情報科学を探索するための強力なツールを提供する。 Gauge invariance plays a crucial role in quantum mechanics from condensed matter physics to high energy physics. We develop an approach to constructing gauge invariant autoregressive neural networks for quantum lattice models. These networks can be efficiently sampled and explicitly obey gauge symmetries. We variationally optimize our gauge invariant autoregressive neural networks for ground states as well as real-time dynamics for a variety of models. We exactly represent the ground and excited states of the 2D and 3D toric codes, and the X-cube fracton model. We simulate the dynamics of the quantum link model of $\text{U(1)}$ lattice gauge theory, obtain the phase diagram for the 2D $\mathbb{Z}_2$ gauge theory, determine the phase transition and the central charge of the $\text{SU(2)}_3$ anyonic chain, and also compute the ground state energy of the $\text{SU(2)}$ invariant Heisenberg spin chain. Our approach provides powerful tools for exploring condensed matter physics, high energy physics and quantum information science.	翻訳日:2021-03-27 08:45:53 公開日:2021-01-18
# (参考訳) HAMMER:学習メッセージによる強化学習エージェントの多層コーディネーション HAMMER: Multi-Level Coordination of Reinforcement Learning Agents via Learned Messaging ( http://arxiv.org/abs/2102.00824v1 ) ライセンス: CC0 1.0	Nikunj Gupta, G Srinivasaraghavan, Swarup Kumar Mohalik, Matthew E. Taylor	(参考訳) 協調型マルチエージェント強化学習(marl)は,ディープニューラルネットワークの表現学習能力を活用することで,大きな成果を上げている。しかし、エージェントの数が増えるにつれて、大規模な集中型アプローチはすぐに実現不可能になり、完全な分散型アプローチは情報共有と協調の重要な機会を逃す可能性がある。さらに、すべてのエージェントが等しくはない - 場合によっては、個々のエージェントが他のエージェントに通信を送信したり、他のエージェントを明示的にモデル化する能力さえ持たない場合がある。本稿では、観測空間全体を観測できる単一の、強力な、中央のエージェントが存在する場合と、局所的な観測しか受信できず、互いに通信できない複数の、低パワーのローカルエージェントが存在することを考察する。中央エージェントの役割は、問題全体を一元的に解決し、アクションコマンドを送信することではなく、個々のエージェントが受信すべき追加情報を決定することによって、グローバルな観察に基づいて、異なるローカルエージェントに送信すべきメッセージを知ることである。 MARLアルゴリズム、ハンマー、そして最も適用可能な場所を説明した後、協調ナビゲーションとマルチエージェントウォーカードメインで実装する。その結果,1)学習したコミュニケーションはシステム性能が向上し,2)成果は複数のエージェントに一般化し,3)成果は報酬構造に一般化した。 Cooperative multi-agent reinforcement learning (MARL) has achieved significant results, most notably by leveraging the representation learning abilities of deep neural networks. However, large centralized approaches quickly become infeasible as the number of agents scale, and fully decentralized approaches can miss important opportunities for information sharing and coordination. Furthermore, not all agents are equal - in some cases, individual agents may not even have the ability to send communication to other agents or explicitly model other agents. This paper considers the case where there is a single, powerful, central agent that can observe the entire observation space, and there are multiple, low powered, local agents that can only receive local observations and cannot communicate with each other. The job of the central agent is to learn what message to send to different local agents, based on the global observations, not by centrally solving the entire problem and sending action commands, but by determining what additional information an individual agent should receive so that it can make a better decision. After explaining our MARL algorithm, hammer, and where it would be most applicable, we implement it in the cooperative navigation and multi-agent walker domains. Empirical results show that 1) learned communication does indeed improve system performance, 2) results generalize to multiple numbers of agents, and 3) results generalize to different reward structures.	翻訳日:2021-03-27 07:41:24 公開日:2021-01-18
# (参考訳) 従来の機械学習とディープラーニングモデルを用いた教育内容の分類 Classification of Pedagogical content using conventional machine learning and deep learning model ( http://arxiv.org/abs/2101.07321v1 ) ライセンス: CC BY 4.0	Vedat Apuk, Krenare Pireva Nu\c{c}i	(参考訳) インターネットの出現と多くのデジタル技術によって、様々な課題がもたらされた。大量のデータがWeb上で発見され、多くの場合、構造化されておらず、組織化されていないため、このデータの使用と操作は極めて難しいプロセスであるという事実に寄与する。この事実により、テキスト分類における異なる機械学習技術とディープラーニング技術の使用が重要となり、この分野を改善し、科学者や研究者にとってさらなる研究がより興味深いものとなった。本稿では,従来のモデルからk-nearest neighbor(knn),ディープラーニングモデルからlong short-term memory(lstm)リカレントニューラルネットワークの2つの異なるモデルを用いて,教育内容の分類を行う。その結果,教育内容の分類精度はKNNモデルで92.52 %,LSTMモデルで87.71 %に達することがわかった。 The advent of the Internet and a large number of digital technologies has brought with it many different challenges. A large amount of data is found on the web, which in most cases is unstructured and unorganized, and this contributes to the fact that the use and manipulation of this data is quite a difficult process. Due to this fact, the usage of different machine and deep learning techniques for Text Classification has gained its importance, which improved this discipline and made it more interesting for scientists and researchers for further study. This paper aims to classify the pedagogical content using two different models, the K-Nearest Neighbor (KNN) from the conventional models and the Long short-term memory (LSTM) recurrent neural network from the deep learning models. The result indicates that the accuracy of classifying the pedagogical content reaches 92.52 % using KNN model and 87.71 % using LSTM model.	翻訳日:2021-03-27 07:24:25 公開日:2021-01-18
# (参考訳) BERTモデルによる自動句読点復元 Automatic punctuation restoration with BERT models ( http://arxiv.org/abs/2101.07343v1 ) ライセンス: CC BY 4.0	Attila Nagy, Bence Bial, Judit \'Acs	(参考訳) 本稿では,英語とハンガリー語に対するBERTモデルを用いた自動句読点復元手法を提案する。ハンガリー語ではSzeged Treebankデータセットでモデルを評価する一方、英語では句読点復元のための一般的なベンチマークであるTed Talksで実験を行った。我々の最良のモデルは、英語で79.8ドル、ハンガリー語で82.2ドルのマクロ平均$F_1$スコアを達成する。私たちのコードは公開されています。 We present an approach for automatic punctuation restoration with BERT models for English and Hungarian. For English, we conduct our experiments on Ted Talks, a commonly used benchmark for punctuation restoration, while for Hungarian we evaluate our models on the Szeged Treebank dataset. Our best models achieve a macro-averaged $F_1$-score of 79.8 in English and 82.2 in Hungarian. Our code is publicly available.	翻訳日:2021-03-27 07:00:12 公開日:2021-01-18
# (参考訳) リアルタイム適応ロボット把持のためのrgbと深度データからの物体検出とポーズ推定 Object Detection and Pose Estimation from RGB and Depth Data for Real-time, Adaptive Robotic Grasping ( http://arxiv.org/abs/2101.07347v1 ) ライセンス: CC BY-SA 4.0	S. K. Paul, M. T. Chowdhury, M. Nicolescu, M. Nicolescu	(参考訳) 近年,ロボット視覚応用の文脈において,物体検出とポーズ推定が注目されている。興味のある物体の識別とポーズの推定は、ロボットが家庭の作業から工業的な操作まで、多くのロボットアプリケーションに対して効果的な支援を提供するためにも重要である。この問題は、異なる形と潜在的に複雑な形状を持つ物体の多様性と、背景のクラッタと物体間の部分的な閉塞によって生じる困難のため、特に困難である。本研究の主な貢献として,動的ロボットの把握を目的としたリアルタイム物体検出とポーズ推定を行うシステムを提案する。ロボットは、各オブジェクトに対するいくつかの固定されたポーズから、少数の標準的グリップを実行するために事前訓練されている。任意のポーズで未知のオブジェクトを提示すると、ロボットはオブジェクトの同一性とその実際のポーズを検知し、新しいポーズで使用するために標準的グリップを適用することができる。訓練のためのシステムは、ロボットの手首に取り付けられたグリッパーに対する対象の相対的な姿勢を捉えることで、標準的な把握を定義する。試験中、新たなポーズが検出されると、ロボットアームの関節角度を調整して物体の正準把持を識別して動的に適応させ、グリッパーが新たなポーズで物体を把持できるようにする。我々はヒューマノイドPR2ロボットを用いて実験を行い、提案したフレームワークが良好なテクスチャを持つ物体を検知し、許容量の飛行機外回転の存在下で正確なポーズ推定を行うことを示した。また、ロボットが任意のポーズからオブジェクトをつかむのに成功し、パフォーマンスを図示する。 In recent times, object detection and pose estimation have gained significant attention in the context of robotic vision applications. Both the identification of objects of interest as well as the estimation of their pose remain important capabilities in order for robots to provide effective assistance for numerous robotic applications ranging from household tasks to industrial manipulation. This problem is particularly challenging because of the heterogeneity of objects having different and potentially complex shapes, and the difficulties arising due to background clutter and partial occlusions between objects. As the main contribution of this work, we propose a system that performs real-time object detection and pose estimation, for the purpose of dynamic robot grasping. The robot has been pre-trained to perform a small set of canonical grasps from a few fixed poses for each object. When presented with an unknown object in an arbitrary pose, the proposed approach allows the robot to detect the object identity and its actual pose, and then adapt a canonical grasp in order to be used with the new pose. For training, the system defines a canonical grasp by capturing the relative pose of an object with respect to the gripper attached to the robot's wrist. During testing, once a new pose is detected, a canonical grasp for the object is identified and then dynamically adapted by adjusting the robot arm's joint angles, so that the gripper can grasp the object in its new pose. We conducted experiments using a humanoid PR2 robot and showed that the proposed framework can detect well-textured objects, and provide accurate pose estimation in the presence of tolerable amounts of out-of-plane rotation. The performance is also illustrated by the robot successfully grasping objects from a wide range of arbitrary poses.	翻訳日:2021-03-27 06:53:12 公開日:2021-01-18
# (参考訳) 合成医療データの忠実性とプライバシー Fidelity and Privacy of Synthetic Medical Data ( http://arxiv.org/abs/2101.08658v1 ) ライセンス: CC BY 4.0	Ofer Mendelevitch, Michael D. Lesh	(参考訳) 医療記録のデジタル化は、新しい時代のビッグデータを臨床科学に継承し、データを共有できる可能性とともに、研究者が論文記録から抽象化できるものを超えて洞察を積み重ねた。精度医療の革新を促進するために、個々のレベルの医療データを共有する必要性は拡大し続けており、科学者が新型コロナウイルス(COVID-19)のパンデミックに苦しむ中で、より緊急なものになったことはない。しかし、ビッグデータの利用に対する熱意は、患者の自律性とプライバシに対する完全な適切な懸念によって誘惑された。つまり、個人に関するプライベートまたはシークレットな情報を抽出する能力は、データを共有する前に重要なインフラストラクチャとデータガバナンスを確立する必要があるため、データの共有を難しくする。 HIPAAは、データ共有の承認メカニズムとして非識別を提供したが、リンク攻撃は大きな脆弱性として特定された。フィールド抑圧や抽象化といった個人情報の漏洩を避けるために、共有できる情報の量を制限する、微分プライバシーのような数学的手法を用いるといった様々なメカニズムが確立されている。もうひとつのアプローチは、基礎となるデータを模倣する合成データを作ることです。合成データは, 医療革新を支えるための有用なメカニズムであり, 実世界の証拠のプロキシであるためには, 合成データセットの2つの特性を示す必要がある。(1) 実データに関する分析は, 合成データの分析(統計的忠実性)と(2) 合成データは, 最小限の再識別(プライバシ保証)のリスクを伴って, プライバシーを保たなければならない。本稿では,合成データセットの統計忠実性とプライバシ保存特性を定量化する枠組みを提案し,syntegra技術によって生成された合成データの指標を示す。 The digitization of medical records ushered in a new era of big data to clinical science, and with it the possibility that data could be shared, to multiply insights beyond what investigators could abstract from paper records. The need to share individual-level medical data to accelerate innovation in precision medicine continues to grow, and has never been more urgent, as scientists grapple with the COVID-19 pandemic. However, enthusiasm for the use of big data has been tempered by a fully appropriate concern for patient autonomy and privacy. That is, the ability to extract private or confidential information about an individual, in practice, renders it difficult to share data, since significant infrastructure and data governance must be established before data can be shared. Although HIPAA provided de-identification as an approved mechanism for data sharing, linkage attacks were identified as a major vulnerability. A variety of mechanisms have been established to avoid leaking private information, such as field suppression or abstraction, strictly limiting the amount of information that can be shared, or employing mathematical techniques such as differential privacy. Another approach, which we focus on here, is creating synthetic data that mimics the underlying data. For synthetic data to be a useful mechanism in support of medical innovation and a proxy for real-world evidence, one must demonstrate two properties of the synthetic dataset: (1) any analysis on the real data must be matched by analysis of the synthetic data (statistical fidelity) and (2) the synthetic data must preserve privacy, with minimal risk of re-identification (privacy guarantee). In this paper we propose a framework for quantifying the statistical fidelity and privacy preservation properties of synthetic datasets and demonstrate these metrics for synthetic data generated by Syntegra technology.	翻訳日:2021-03-27 06:37:46 公開日:2021-01-18
# (参考訳) 完全畳み込みネットワークによるテキスト線抽出とエネルギー最小化 Text line extraction using fully convolutional network and energy minimization ( http://arxiv.org/abs/2101.07370v1 ) ライセンス: CC BY 4.0	Berat Kurar Barakat, Ahmad Droby, Reem Alaasam, Boraq Madi, Irina Rabaev, Jihad El-Sana	(参考訳) テキスト行は手書き文書画像の重要な部分であり、さらなるアプリケーションにより分析が容易である。最近のテキスト行検出の進歩にもかかわらず、手書き文書からのテキスト行抽出は未解決の作業である。本稿では,テキストライン検出のための完全畳み込みネットワークと,テキストライン抽出のためのエネルギー最小化手法を提案する。検出されたテキスト行は、テキスト行を貫くブロブ線で表現される。これらのブロブ線は、テキスト線抽出のためのエネルギー関数を支援する。検出段階は任意に向き付けられたテキスト行を特定できる。さらに、抽出段階は、その向きによらず、さまざまな高さのテキスト行の画素と線間近接を見出すことができる。さらに、向きを仮定することなく、タッチと重なり合うテキスト行を細かく分割することができる。本稿では,VML-AHTE,VML-MOC,Diva-HisDBデータセットに対する提案手法の評価を行う。 VML-AHTEデータセットは、リッチなダイアクリティカルなテキスト行の重複、タッチ、クローズを含む。 VML-MOCデータセットは、マルチ指向で歪んだテキスト行によって非常に難しい。 Diva-HisDBデータセットは、テキスト行の高さとタッチ行を表示する。その結果, 様々な課題があるにもかかわらず, 全ての実験において同じパラメータを用いた手法の有効性が示された。 Text lines are important parts of handwritten document images and easier to analyze by further applications. Despite recent progress in text line detection, text line extraction from a handwritten document remains an unsolved task. This paper proposes to use a fully convolutional network for text line detection and energy minimization for text line extraction. Detected text lines are represented by blob lines that strike through the text lines. These blob lines assist an energy function for text line extraction. The detection stage can locate arbitrarily oriented text lines. Furthermore, the extraction stage is capable of finding out the pixels of text lines with various heights and interline proximity independent of their orientations. Besides, it can finely split the touching and overlapping text lines without an orientation assumption. We evaluate the proposed method on VML-AHTE, VML-MOC, and Diva-HisDB datasets. The VML-AHTE dataset contains overlapping, touching and close text lines with rich diacritics. The VML-MOC dataset is very challenging by its multiply oriented and skewed text lines. The Diva-HisDB dataset exhibits distinct text line heights and touching text lines. The results demonstrate the effectiveness of the method despite various types of challenges, yet using the same parameters in all the experiments.	翻訳日:2021-03-27 06:31:44 公開日:2021-01-18
# 自然言語とRLによる解釈可能な政策仕様と合成 Interpretable Policy Specification and Synthesis through Natural Language and RL ( http://arxiv.org/abs/2101.07140v1 ) ライセンス: Link先を確認	Pradyumna Tambwekar, Andrew Silva, Nakul Gopalan, Matthew Gombolay	(参考訳) ポリシー仕様は、人間がロボットの動作を初期化して、強化学習(Reinforcement Learning, RL)を通して温かい開始ポリシーを最適化するプロセスである。ポリシーの仕様/設計は本質的に協調的なプロセスであるが、デモや深いrlからの学習に基づくモダンな手法は、モデル解釈性とアクセシビリティを欠いている。これらのモデルは、エージェントが学習したポリシーを検査する手段を提供しておらず、ロボットの振る舞いを教えるために使用可能なモダリティの作成に重点を置いていません。本稿では,1)自然言語を通じて,理解しやすい決定木という形で解釈可能なポリシーを規定し,2)これらのポリシーをウォームスタート強化学習に活用し,3)自然言語初期化機構を欠いたベースラインよりも優れる,新たな機械学習フレームワークを提案する。我々は,木をベースとした政策決定に,自由形式の自然言語ポリシー記述をマッピングすることで,アプローチを訓練する。本稿では,2つの領域にまたがる保留コーパスにおいて,自然言語を96%,97%の精度で決定木に翻訳する手法を提案する。最後に、自然言語コマンドで初期化されるポリシーが、自然言語ベースのウォームスタートテクニックの恩恵を受けない関連するベースライン(p < 0.001)を大幅に上回ることができることを検証します。 Policy specification is a process by which a human can initialize a robot's behaviour and, in turn, warm-start policy optimization via Reinforcement Learning (RL). While policy specification/design is inherently a collaborative process, modern methods based on Learning from Demonstration or Deep RL lack the model interpretability and accessibility to be classified as such. Current state-of-the-art methods for policy specification rely on black-box models, which are an insufficient means of collaboration for non-expert users: These models provide no means of inspecting policies learnt by the agent and are not focused on creating a usable modality for teaching robot behaviour. In this paper, we propose a novel machine learning framework that enables humans to 1) specify, through natural language, interpretable policies in the form of easy-to-understand decision trees, 2) leverage these policies to warm-start reinforcement learning and 3) outperform baselines that lack our natural language initialization mechanism. We train our approach by collecting a first-of-its-kind corpus mapping free-form natural language policy descriptions to decision tree-based policies. We show that our novel framework translates natural language to decision trees with a 96% and 97% accuracy on a held-out corpus across two domains, respectively. Finally, we validate that policies initialized with natural language commands are able to significantly outperform relevant baselines (p < 0.001) that do not benefit from our natural language-based warm-start technique.	翻訳日:2021-03-27 06:08:51 公開日:2021-01-18
# MP3: マップ、知覚、予測、計画のための統一モデル MP3: A Unified Model to Map, Perceive, Predict and Plan ( http://arxiv.org/abs/2101.06806v1 ) ライセンス: Link先を確認	Sergio Casas, Abbas Sadat, Raquel Urtasun	(参考訳) 高精細地図(HDマップ)は、その意味や幾何学的情報から、現代のほとんどの自動運転システムにおいて重要な要素である。残念なことに、HDマップの構築はコストと、それに伴うローカライゼーションシステムに課される要件のため、スケールが難しいことが証明されている。 hdマップなしで運転できることは、自動運転ソリューションをスケールしたり、既存のソリューションの障害耐性を高めるのに非常に有益である(例えば、ローカライズが失敗したり、マップが最新でない場合)。この目的に向けて,入力が生センサデータと高レベルコマンド(例えば交差点で左折する)を持つマップレス運転におけるエンドツーエンドのMP3を提案する。 mp3は、オンラインマップと動的エージェントの現在および将来の状態の中間表現を予測し、それらを新しいニューラルモーションプランナーで活用し、不確実性を考慮した解釈可能な決定を行う。長期的なクローズドループシミュレーションや,大規模な実世界のデータセットのエキスパートドライバと比較して,当社のアプローチは極めて安全で快適で,ベースラインよりもコマンドを追従可能であることが分かりました。 High-definition maps (HD maps) are a key component of most modern self-driving systems due to their valuable semantic and geometric information. Unfortunately, building HD maps has proven hard to scale due to their cost as well as the requirements they impose in the localization system that has to work everywhere with centimeter-level accuracy. Being able to drive without an HD map would be very beneficial to scale self-driving solutions as well as to increase the failure tolerance of existing ones (e.g., if localization fails or the map is not up-to-date). Towards this goal, we propose MP3, an end-to-end approach to mapless driving where the input is raw sensor data and a high-level command (e.g., turn left at the intersection). MP3 predicts intermediate representations in the form of an online map and the current and future state of dynamic agents, and exploits them in a novel neural motion planner to make interpretable decisions taking into account uncertainty. We show that our approach is significantly safer, more comfortable, and can follow commands better than the baselines in challenging long-term closed-loop simulations, as well as when compared to an expert driver in a large-scale real-world dataset.	翻訳日:2021-03-27 06:08:27 公開日:2021-01-18
# 深い構造を持つリアクティブプランニング Deep Structured Reactive Planning ( http://arxiv.org/abs/2101.06832v1 ) ライセンス: Link先を確認	Jerry Liu, Wenyuan Zeng, Raquel Urtasun, Ersin Yumer	(参考訳) 現実世界で活動するインテリジェントエージェントは、その目標を達成するために、自身だけでなく、周囲のシーンの他の参加者の安全と快適さを維持することのバランスをとる必要がある。これは、他のアクターの行動について共同で推論すると同時に、これらの2つのプロセスが本質的に相互に絡み合っているため、独自の行動を決定する必要があります。しかしこれは、計画が予測に従うほとんどの自動運転パイプラインでは捉えられていない。本稿では,自動運転車が自己計画や他のアクターがどう反応するかを共同で判断できる,新たなデータ駆動型,リアクティブな計画目標を提案する。この問題を観測データから学習し,計画問題と予測問題の両方を符号化したエネルギーベース深部構造モデルとして定式化する。実世界の運転と合成された高密度交通の両方に基づくシミュレーションにより、我々の反応モデルは、衝突速度を抑えることなく、高度に複雑な操作(交通路のマージ/ターン)を高速に完了させることで、非反応性の変動よりも優れることを示した。 An intelligent agent operating in the real-world must balance achieving its goal with maintaining the safety and comfort of not only itself, but also other participants within the surrounding scene. This requires jointly reasoning about the behavior of other actors while deciding its own actions as these two processes are inherently intertwined - a vehicle will yield to us if we decide to proceed first at the intersection but will proceed first if we decide to yield. However, this is not captured in most self-driving pipelines, where planning follows prediction. In this paper we propose a novel data-driven, reactive planning objective which allows a self-driving vehicle to jointly reason about its own plans as well as how other actors will react to them. We formulate the problem as an energy-based deep structured model that is learned from observational data and encodes both the planning and prediction problems. Through simulations based on both real-world driving and synthetically generated dense traffic, we demonstrate that our reactive model outperforms a non-reactive variant in successfully completing highly complex maneuvers (lane merges/turns in traffic) faster, without trading off collision rate.	翻訳日:2021-03-27 06:08:05 公開日:2021-01-18
# LNSMM:ローカルネットワーク共有マルチビューマルチタスクによる眼球運動推定 LNSMM: Eye Gaze Estimation With Local Network Share Multiview Multitask ( http://arxiv.org/abs/2101.07116v1 ) ライセンス: Link先を確認	Yong Huang, Ben Chen, Daiming Qu	(参考訳) Eye gaze estimation has become increasingly significant in computer vision.In this paper,we systematically study the mainstream of eye gaze estimation methods,propose a novel methodology to estimate eye gaze points and eye gaze directions simultaneously.First,we construct a local sharing network for feature extraction of gaze points and gaze directions estimation,which can reduce network computational parameters and converge quickly;Second,we propose a Multiview Multitask Learning (MTL) framework,for gaze directions,a coplanar constraint is proposed for the left and right eyes,for gaze points,three views data input indirectly introduces eye position information,a cross-view pooling module is designed, propose joint loss which handle both gaze points and gaze directions estimation.Eventually,we collect a dataset to use of gaze points,which have three views to exist public dataset.The experiment show our method is state-of-the-art the current mainstream methods on two indicators of gaze points and gaze directions. Eye gaze estimation has become increasingly significant in computer vision.In this paper,we systematically study the mainstream of eye gaze estimation methods,propose a novel methodology to estimate eye gaze points and eye gaze directions simultaneously.First,we construct a local sharing network for feature extraction of gaze points and gaze directions estimation,which can reduce network computational parameters and converge quickly;Second,we propose a Multiview Multitask Learning (MTL) framework,for gaze directions,a coplanar constraint is proposed for the left and right eyes,for gaze points,three views data input indirectly introduces eye position information,a cross-view pooling module is designed, propose joint loss which handle both gaze points and gaze directions estimation.Eventually,we collect a dataset to use of gaze points,which have three views to exist public dataset.The experiment show our method is state-of-the-art the current mainstream methods on two indicators of gaze points and gaze directions.	翻訳日:2021-03-27 06:07:14 公開日:2021-01-18
# 複数の領域にわたる効率的な教師なし適応のための知識蒸留法 Knowledge Distillation Methods for Efficient Unsupervised Adaptation Across Multiple Domains ( http://arxiv.org/abs/2101.07308v1 ) ライセンス: Link先を確認	Le Thanh Nguyen-Meidine, Atif Belal, Madhu Kiran, Jose Dolz, Louis-Antoine Blais-Morin, Eric Granger	(参考訳) 大規模なアノテートデータセットのトレーニングを必要とするCNNの複雑さに加えて、設計と運用データのドメインシフトは、多くの現実世界アプリケーションにおいてCNNの採用を制限している。例えば、個人の再識別では、ビデオは重複しない視点を持つ分散したカメラセットでキャプチャされる。ソース間のシフト(例) 実験室の設定)とターゲット(例) カメラ)ドメインは認識精度を著しく低下させる可能性がある。さらに、最先端のCNNは、計算要求からすると、そのようなリアルタイムアプリケーションには適さないかもしれない。近年,非教師なし領域適応(uda)や知識蒸留(kd)によるcnnの高速化と圧縮を行う手法が提案されているが,複数の対象領域にまたがるcnnの適応と圧縮を同時に行なおうとしている。本稿では、CNNの教師なし単一ターゲットDA(STDA)とマルチターゲットDA(MTDA)に対するプログレッシブKDアプローチを提案する。我々のKD-STDA法は,CNNを1つのターゲット領域に適応させるため,より大規模な教師CNNから抽出し,目標領域データとソース領域データの両方で学習し,共通表現との整合性を維持する。提案手法は,Office31 および ImageClef-DA 画像分類データセット上の CNN の圧縮と STDA の最先端手法と比較する。また、Digits、Office31、OfficeHome上のMTDAの最先端メソッドと比較される。両方の設定 -- KD-STDAとKD-MTDA -- の結果から、我々のアプローチは、CNNの複雑さを同等または低いものにしつつ、ターゲットドメイン全体で最高の精度を達成できることを示している。 Beyond the complexity of CNNs that require training on large annotated datasets, the domain shift between design and operational data has limited the adoption of CNNs in many real-world applications. For instance, in person re-identification, videos are captured over a distributed set of cameras with non-overlapping viewpoints. The shift between the source (e.g. lab setting) and target (e.g. cameras) domains may lead to a significant decline in recognition accuracy. Additionally, state-of-the-art CNNs may not be suitable for such real-time applications given their computational requirements. Although several techniques have recently been proposed to address domain shift problems through unsupervised domain adaptation (UDA), or to accelerate/compress CNNs through knowledge distillation (KD), we seek to simultaneously adapt and compress CNNs to generalize well across multiple target domains. In this paper, we propose a progressive KD approach for unsupervised single-target DA (STDA) and multi-target DA (MTDA) of CNNs. Our method for KD-STDA adapts a CNN to a single target domain by distilling from a larger teacher CNN, trained on both target and source domain data in order to maintain its consistency with a common representation. Our proposed approach is compared against state-of-the-art methods for compression and STDA of CNNs on the Office31 and ImageClef-DA image classification datasets. It is also compared against state-of-the-art methods for MTDA on Digits, Office31, and OfficeHome. In both settings -- KD-STDA and KD-MTDA -- results indicate that our approach can achieve the highest level of accuracy across target domains, while requiring a comparable or lower CNN complexity.	翻訳日:2021-03-27 06:07:03 公開日:2021-01-18
# テキストからテキストへの変換による自然言語からの関数のラベル付け Teach me how to Label: Labeling Functions from Natural Language with Text-to-text Transformers ( http://arxiv.org/abs/2101.07138v1 ) ライセンス: Link先を確認	Yannis Papanikolaou	(参考訳) 注釈付きデータは、特にドメインの専門知識を必要とする分野において、正確な機械学習モデルをトレーニングする上で最も重要なボトルネックとなっている。上記の問題に対処する最近のアプローチでは、個々のデータポイントをラベル付けするのではなく、自然言語による説明を用いることで、アノテータの効率を向上し、コストを大幅に削減する。本稿では,これらの自然言語記述をPythonラベリング関数に変換する作業について,事前学習したテキスト・テキスト・トランスフォーマを用いたセマンティック・パースに追従する。一連の実験で、我々のアプローチはセマンティック構文解析ベンチマークのconalaの新たな最先端を達成し、以前のベストアプローチを3.7 bleuポイントで上回った。さらに,自然言語記述ラベル関数ペアを手作業で構築したデータセットでは,0。我々のアプローチは、特定のラベル付きサンプルを提供するのではなく、自然言語でラベル付けする方法を教えるモデルへのステップストーンと見なすことができる。私たちのコード、構築されたデータセット、モデルは、https://github.com/ypapanik/t5-for-code-generationで利用可能です。 Annotated data has become the most important bottleneck in training accurate machine learning models, especially for areas that require domain expertise. A recent approach to deal with the above issue proposes using natural language explanations instead of labeling individual data points, thereby increasing human annotators' efficiency as well as decreasing costs substantially. This paper focuses on the task of turning these natural language descriptions into Python labeling functions by following a novel approach to semantic parsing with pre-trained text-to-text Transformers. In a series of experiments our approach achieves a new state of the art on the semantic parsing benchmark CoNaLa, surpassing the previous best approach by 3.7 BLEU points. Furthermore, on a manually constructed dataset of natural language descriptions-labeling functions pairs we achieve a BLEU of 0.39. Our approach can be regarded as a stepping stone towards models that are taught how to label in natural language, instead of being provided specific labeled samples. Our code, constructed dataset and models are available at https://github.com/ypapanik/t5-for-code-generation.	翻訳日:2021-03-27 06:06:35 公開日:2021-01-18
# Kalman Smoothing を用いた重ね合わせLSTMを用いた深部繰り返しニューラルネットワークによる血糖予測 Stacked LSTM Based Deep Recurrent Neural Network with Kalman Smoothing for Blood Glucose Prediction ( http://arxiv.org/abs/2101.06850v1 ) ライセンス: Link先を確認	Md Fazle Rabby, Yazhou Tu, Md Imran Hossen, Insup Le, Anthony S Maida, Xiali Hei	(参考訳) 血液グルコース (BG) は, 信頼性の高い人工膵臓やインスリン注入システムを必要とする1型糖尿病患者に必須である。近年,より正確なBGレベルの予測システムとしてディープラーニング技術が活用されている。しかし、連続グルコースモニタリング(CGM)はセンサーエラーの影響を受けやすい。その結果、最も最適な機械学習モデルを使用しても、不正確なCGMの読み取りがBG予測に影響を与え、信頼性が低下する。本研究では,センサの故障を考慮したLong-term memory(LSTM)に基づく深部リカレントニューラルネットワーク(RNN)モデルを用いて,血糖値の予測手法を提案する。センサ誤差による不正確なCGM読影の補正にはカルマン平滑化法を用いる。 6人の異なる患者の8週間のデータを含むOttoT1DMデータセットでは、平均RMSEは6.45と17.24mg/dlで、それぞれ30分60分予測水平線(PH)を達成している。我々の知る限りでは、これはOoioT1DMデータセットの平均予測精度の最上位である。例えば、カルマンのCGMデータのスムーズ化、食事からの炭水化物、ボルスインシュリン、累積ステップ数などの異なる生理的情報は、モデルへの入力として使われる有意義な特徴を表すために作成される。アプローチの目的は、予測されたCGM値と指先での血糖値の差を下げることである。以上の結果から,t1d糖尿病管理のための人工膵およびインスリン注入システムの性能向上を期待できる,より信頼性の高いbg予測が可能と考えられた。 Blood glucose (BG) management is crucial for type-1 diabetes patients resulting in the necessity of reliable artificial pancreas or insulin infusion systems. In recent years, deep learning techniques have been utilized for a more accurate BG level prediction system. However, continuous glucose monitoring (CGM) readings are susceptible to sensor errors. As a result, inaccurate CGM readings would affect BG prediction and make it unreliable, even if the most optimal machine learning model is used. In this work, we propose a novel approach to predicting blood glucose level with a stacked Long short-term memory (LSTM) based deep recurrent neural network (RNN) model considering sensor fault. We use the Kalman smoothing technique for the correction of the inaccurate CGM readings due to sensor error. For the OhioT1DM dataset, containing eight weeks' data from six different patients, we achieve an average RMSE of 6.45 and 17.24 mg/dl for 30 minutes and 60 minutes of prediction horizon (PH), respectively. To the best of our knowledge, this is the leading average prediction accuracy for the ohioT1DM dataset. Different physiological information, e.g., Kalman smoothed CGM data, carbohydrates from the meal, bolus insulin, and cumulative step counts in a fixed time interval, are crafted to represent meaningful features used as input to the model. The goal of our approach is to lower the difference between the predicted CGM values and the fingerstick blood glucose readings - the ground truth. Our results indicate that the proposed approach is feasible for more reliable BG forecasting that might improve the performance of the artificial pancreas and insulin infusion system for T1D diabetes management.	翻訳日:2021-03-27 06:06:17 公開日:2021-01-18
# サブタスクとしての報酬の不確実性予測による安定な深層強化学習法 Stable deep reinforcement learning method by predicting uncertainty in rewards as a subtask ( http://arxiv.org/abs/2101.06906v1 ) ライセンス: Link先を確認	Kanata Suzuki, Tetsuya Ogata	(参考訳) 近年, 深部強化学習(DRL)によって様々な課題が達成されている。しかし,実環境におけるタスクにdrlを適用する場合,適切な報酬の設計は困難である。実際のハードウェアセンサーから得られる報酬には、ノイズ、誤解、あるいは観測失敗が含まれる。これらの不安定な信号による学習不安定性は、DRLでは未解決の問題である。本研究では,報酬信号に含まれる分散を直接推定するためにサブタスクを追加することで,既存のDRLモデルを拡張するアプローチを提案する。次にモデルは、批判ネットワークのサブタスクによって学習されたフィーチャーマップをアクタネットワークに送信する。これにより、潜在的なノイズの影響にロバストな安定した学習が可能になる。不安定報奨信号を持つatariゲーム領域における実験の結果,本手法はトレーニング収束を安定化することがわかった。また,特徴マップの可視化による拡張性についても検討する。このアプローチは、ノイズの多い現実世界のシナリオでDRLをより実用的なものにする可能性がある。 In recent years, a variety of tasks have been accomplished by deep reinforcement learning (DRL). However, when applying DRL to tasks in a real-world environment, designing an appropriate reward is difficult. Rewards obtained via actual hardware sensors may include noise, misinterpretation, or failed observations. The learning instability caused by these unstable signals is a problem that remains to be solved in DRL. In this work, we propose an approach that extends existing DRL models by adding a subtask to directly estimate the variance contained in the reward signal. The model then takes the feature map learned by the subtask in a critic network and sends it to the actor network. This enables stable learning that is robust to the effects of potential noise. The results of experiments in the Atari game domain with unstable reward signals show that our method stabilizes training convergence. We also discuss the extensibility of the model by visualizing feature maps. This approach has the potential to make DRL more practical for use in noisy, real-world scenarios.	翻訳日:2021-03-27 06:05:48 公開日:2021-01-18
# 正規化ポリシはリワードロバストである Regularized Policies are Reward Robust ( http://arxiv.org/abs/2101.07012v1 ) ライセンス: Link先を確認	Hisham Husain and Kamil Ciosek and Ryota Tomioka	(参考訳) 強化学習(RL)における政策のエントロピー正則化(Entropic regularization)は、学習された政策が局所的最適政策に過度に適合する前に国家空間を十分に探索することを保証するために一般的に用いられるヒューリスティックである。エントロピーを使う主な動機は最適政策の探索と曖昧化であるが、理論的な効果は完全には理解されていない。本研究では、より一般化された正規化RLの目的とフェンシェル双対性について検討し、対角的報酬問題の形をとる双対問題を導出する。特に, 正規化対象が求める最適方針は, 最悪の対人報酬の下での強化学習問題の最適方針であることがわかった。その結果、一般的なエントロピー正規化スキームをロバスト化の形式として再解釈することができる。さらに,結果の一般性から,既存の他の正規化スキームにも適用する。以上の結果から,政策の正則化の効果を考察し,堅牢な報酬を通じて探索の理解を深めることができた。 Entropic regularization of policies in Reinforcement Learning (RL) is a commonly used heuristic to ensure that the learned policy explores the state-space sufficiently before overfitting to a local optimal policy. The primary motivation for using entropy is for exploration and disambiguating optimal policies; however, the theoretical effects are not entirely understood. In this work, we study the more general regularized RL objective and using Fenchel duality; we derive the dual problem which takes the form of an adversarial reward problem. In particular, we find that the optimal policy found by a regularized objective is precisely an optimal policy of a reinforcement learning problem under a worst-case adversarial reward. Our result allows us to reinterpret the popular entropic regularization scheme as a form of robustification. Furthermore, due to the generality of our results, we apply to other existing regularization schemes. Our results thus give insights into the effects of regularization of policies and deepen our understanding of exploration through robust rewards at large.	翻訳日:2021-03-27 06:05:18 公開日:2021-01-18
# 連続学習=破滅的な忘れ方? Does Continual Learning = Catastrophic Forgetting? ( http://arxiv.org/abs/2101.07295v1 ) ライセンス: Link先を確認	Anh Thai, Stefan Stojanov, Isaac Rehg, James M. Rehg	(参考訳) 継続的な学習は破滅的な忘れ込みに苦しむことで知られており、これはより最近のサンプルを犠牲にして初期の学習概念が忘れられる現象である。本研究は,連続学習が必然的に破滅的な記憶に結びつくという仮定に挑戦し,継続的に学習しても破滅的な記憶に苦しむことのない一連の課題を提示する。本研究では,これらの課題の特性を把握し,破滅的な忘れ方や,連続的な分類のための代理表現学習タスクの可能性を実証する。さらに,クラスインクリメンタルな分類学習タスクにおいて,最先端の手法より優れている新しいアルゴリズムYASSを導入する。最後に、連続モデルにおける表現学習のダイナミクスを追跡する新しいツールであるDyRTを提案する。この記事でリリースされたコードベース、データセット、事前トレーニングされたモデルは、https://github.com/ngailapdi/CLRec.comで見ることができる。 Continual learning is known for suffering from catastrophic forgetting, a phenomenon where earlier learned concepts are forgotten at the expense of more recent samples. In this work, we challenge the assumption that continual learning is inevitably associated with catastrophic forgetting by presenting a set of tasks that surprisingly do not suffer from catastrophic forgetting when learned continually. We attempt to provide an insight into the property of these tasks that make them robust to catastrophic forgetting and the potential of having a proxy representation learning task for continual classification. We further introduce a novel yet simple algorithm, YASS that outperforms state-of-the-art methods in the class-incremental categorization learning task. Finally, we present DyRT, a novel tool for tracking the dynamics of representation learning in continual models. The codebase, dataset and pre-trained models released with this article can be found at https://github.com/ngailapdi/CLRec.	翻訳日:2021-03-27 06:04:13 公開日:2021-01-18
# shape to categorize: 明示的な形状バイアスによる低ショット学習 Using Shape to Categorize: Low-Shot Learning with an Explicit Shape Bias ( http://arxiv.org/abs/2101.07296v1 ) ライセンス: Link先を確認	Stefan Stojanov, Anh Thai, James M. Rehg	(参考訳) 物体形状の推論が物体認識にとって重要であることは広く受け入れられている。しかし、今日の最も強力なオブジェクト認識手法は、学習中に明示的にオブジェクト形状を使用しない。本研究では,低ショット学習の最近の発展,発達心理学の知見,コンピュータビジョン研究における合成データの利用の増加に動機づけられ,低ショット学習法の一般化性能向上に3次元形状の推論がいかに役立つかを検討する。本研究では, 3次元物体形状を用いた判別埋め込み空間を学習し, 画像のマッピング法を学習することにより, 既存の低ショット学習手法を改善する新しい手法を提案する。新しいアプローチは、複数のデータセットにおける画像のみの低ショット学習アプローチのパフォーマンスを向上させる。また、最も多くのオブジェクトカテゴリを持つ新しい3dオブジェクトデータセットであるtoys4kも開発しています。 It is widely accepted that reasoning about object shape is important for object recognition. However, the most powerful object recognition methods today do not explicitly make use of object shape during learning. In this work, motivated by recent developments in low-shot learning, findings in developmental psychology, and the increased use of synthetic data in computer vision research, we investigate how reasoning about 3D shape can be used to improve low-shot learning methods' generalization performance. We propose a new way to improve existing low-shot learning approaches by learning a discriminative embedding space using 3D object shape, and utilizing this embedding by learning how to map images into it. Our new approach improves the performance of image-only low-shot learning approaches on multiple datasets. We also develop Toys4K, a new 3D object dataset with the biggest number of object categories that can also support low-shot learning.	翻訳日:2021-03-27 06:03:57 公開日:2021-01-18
# 病理画像埋め込みのための拡大一般化 Magnification Generalization for Histopathology Image Embedding ( http://arxiv.org/abs/2101.07757v1 ) ライセンス: Link先を確認	Milad Sikaroudi, Benyamin Ghojogh, Fakhri Karray, Mark Crowley, H.R. Tizhoosh	(参考訳) 病理像の埋め込みはコンピュータビジョンの活発な研究領域である。埋め込みモデルのほとんどは、特定の倍率レベルにのみ集中する。しかしながら、病理組織学の埋め込みにおいて有用なタスクは、拡大レベルに関係なく埋め込み空間を訓練することである。この目標に対処するための2つの主要なアプローチは、ドメイン適応とドメイン一般化である。拡大適応は文献でよく研究されている話題であるが,我々の知る限りでは,組織病理画像埋め込みのための拡大一般化に関する最初の研究である。本稿では,モデル非依存メタラーニング(MAML)の概念に基づく,意味的特徴のモデル非依存学習(MASF)という,拡大一般化のためのエピソード学習可能な領域一般化手法を用いる。 4種類の倍率の乳腺病理組織学的データセットを用いた実験結果から,提案法の有効性が示唆された。 Histopathology image embedding is an active research area in computer vision. Most of the embedding models exclusively concentrate on a specific magnification level. However, a useful task in histopathology embedding is to train an embedding space regardless of the magnification level. Two main approaches for tackling this goal are domain adaptation and domain generalization, where the target magnification levels may or may not be introduced to the model in training, respectively. Although magnification adaptation is a well-studied topic in the literature, this paper, to the best of our knowledge, is the first work on magnification generalization for histopathology image embedding. We use an episodic trainable domain generalization technique for magnification generalization, namely Model Agnostic Learning of Semantic Features (MASF), which works based on the Model Agnostic Meta-Learning (MAML) concept. Our experimental results on a breast cancer histopathology dataset with four different magnification levels show the proposed method's effectiveness for magnification generalization.	翻訳日:2021-03-27 06:03:45 公開日:2021-01-18
# 降水の高速かつ高精度なマルチレゾリューション動的ダウンスケーリング Fast and accurate learned multiresolution dynamical downscaling for precipitation ( http://arxiv.org/abs/2101.06813v1 ) ライセンス: Link先を確認	Jiali Wang, Zhengchun Liu, Ian Foster, Won Chang, Rajkumar Kettimuthu, Rao Kotamarthi	(参考訳) 本研究では,高分解能モデルによる降水データと同等の統計的特性をエミュレートするニューラルネットワークに基づく手法を開発した。鍵となるアイデアは、低解像度と高解像度のシミュレーションを組み合わせてニューラルネットワークを訓練し、前者から後者にマップすることだ。具体的には,変数を直接スタックし,各変数をスタックする前にエンコードする2つのタイプのcnnを定義し,各cnnタイプを平均二乗誤差 (mse) や条件付き生成敵ネットワーク (cgan) といった従来の損失関数を用いて,合計4つのcnn変種で訓練する。 CNNに基づく4つの新しい高分解能降水結果と、元の高分解能シミュレーション、双線形補間および最先端CNNに基づく超分解能(SR)技術から生じる降水量を比較した。その結果,SR法は従来の高分解能シミュレーションよりもスムーズな空間分布と時間分布,データ変動と極値の両線形補間器と同様の結果が得られた。 MSEによって訓練された新しいCNNは、補間器やSR技術よりもいくつかの領域でより良い結果を生成するが、その予測は元の高分解能シミュレーションほど近いものではない。 CGANによって訓練されたCNNは、より現実的で物理的に合理的な結果を生成し、時間と空間におけるデータの変動だけでなく、激しい嵐や長期間の嵐のような極端な現象もよりよく捉えている。新たに提案されたcnnベースのダウンスケーリングアプローチは、ネットワークがトレーニングされた後(1~gpuを使用して4～時間を要する)、14～minで50～kmから12～kmまでの降雨を30～年にダウンスケールすることができる。 This study develops a neural network-based approach for emulating high-resolution modeled precipitation data with comparable statistical properties but at greatly reduced computational cost. The key idea is to use combination of low- and high- resolution simulations to train a neural network to map from the former to the latter. Specifically, we define two types of CNNs, one that stacks variables directly and one that encodes each variable before stacking, and we train each CNN type both with a conventional loss function, such as mean square error (MSE), and with a conditional generative adversarial network (CGAN), for a total of four CNN variants. We compare the four new CNN-derived high-resolution precipitation results with precipitation generated from original high resolution simulations, a bilinear interpolater and the state-of-the-art CNN-based super-resolution (SR) technique. Results show that the SR technique produces results similar to those of the bilinear interpolator with smoother spatial and temporal distributions and smaller data variabilities and extremes than the original high resolution simulations. While the new CNNs trained by MSE generate better results over some regions than the interpolator and SR technique do, their predictions are still not as close as the original high resolution simulations. The CNNs trained by CGAN generate more realistic and physically reasonable results, better capturing not only data variability in time and space but also extremes such as intense and long-lasting storms. The new proposed CNN-based downscaling approach can downscale precipitation from 50~km to 12~km in 14~min for 30~years once the network is trained (training takes 4~hours using 1~GPU), while the conventional dynamical downscaling would take 1~month using 600 CPU cores to generate simulations at the resolution of 12~km over contiguous United States.	翻訳日:2021-03-27 06:03:30 公開日:2021-01-18
# 浸透テスト中のウェブサイト構造発見を最適化するAIを活用する Leveraging AI to optimize website structure discovery during Penetration Testing ( http://arxiv.org/abs/2101.07223v1 ) ライセンス: Link先を確認	Diego Antonelli, Roberta Cascella, Gaetano Perrone, Simon Pietro Romano, Antonio Schiano	(参考訳) Dirbustingは、サーバの内容を列挙するために、HTTPレスポンスを監視しながら、Webサーバ上のディレクトリとファイル名をブルートするテクニックである。このような手法は、共通の単語のリストを使用して、ターゲットウェブサイトの隠れた構造を発見する。 dirbustingは通常、新しいページを見つけるための発見条件としてレスポンスコードに依存している。これは企業がウェブサイトの脆弱性を検知する活動であるWebアプリケーションの浸透テストで広く利用されている。 dirbustingのテクニックは時間とリソースの両方を消費するものであり、この分野で革新的なアプローチが探求されたことはない。そこで我々は,人工知能を活用し,ディルバスティングプロセスを最適化する高度な手法を提案する。具体的には、セマンティッククラスタリング手法を用いて、意味的意味に応じて異なるグループで単語リストを整理する。生成されたクラスタは、アドホックに実装された次のワードインテリジェント戦略で使用される。本稿では,クラスタリング手法が一般的なブライト力法よりも優れていることを示す。パフォーマンスは8つの異なるWebアプリケーションをテストすることで評価される。その結果,各実験で最大50%の性能向上が確認された。 Dirbusting is a technique used to brute force directories and file names on web servers while monitoring HTTP responses, in order to enumerate server contents. Such a technique uses lists of common words to discover the hidden structure of the target website. Dirbusting typically relies on response codes as discovery conditions to find new pages. It is widely used in web application penetration testing, an activity that allows companies to detect websites vulnerabilities. Dirbusting techniques are both time and resource consuming and innovative approaches have never been explored in this field. We hence propose an advanced technique to optimize the dirbusting process by leveraging Artificial Intelligence. More specifically, we use semantic clustering techniques in order to organize wordlist items in different groups according to their semantic meaning. The created clusters are used in an ad-hoc implemented next-word intelligent strategy. This paper demonstrates that the usage of clustering techniques outperforms the commonly used brute force methods. Performance is evaluated by testing eight different web applications. Results show a performance increase that is up to 50% for each of the conducted experiments.	翻訳日:2021-03-27 06:02:40 公開日:2021-01-18
# 深層強化学習エージェントのための摂動に基づく塩分マップのベンチマーク Benchmarking Perturbation-based Saliency Maps for Explaining Deep Reinforcement Learning Agents ( http://arxiv.org/abs/2101.07312v1 ) ライセンス: Link先を確認	Tobias Huber, Benedikt Limmer, Elisabeth Andr\'e	(参考訳) 近年、複雑な知的エージェントの説明が盛んに行われている。 1つの例は、各ピクセルがエージェントの決定にどの程度の理由があるかを示す、サリエンシマップを生成するアルゴリズムの開発である。しかし,このようなサリエンシマップのほとんどの評価は,画像分類作業に重点を置いている。私たちが知る限り、深層強化学習エージェントの異なる給与マップを徹底的に比較する作業はありません。本稿では,4つの異なるAtari 2600ゲームで訓練された深層強化学習エージェントに対して,摂動に基づく4つのサリエンシマップ作成手法を比較した。 4つのアプローチはすべて、入力の一部を摂動させ、エージェントの出力にどの程度影響するかを測定することで機能する。アプローチはエージェントの学習パラメータへの依存(正当性チェック)、エージェントの推論への忠実さ(入力劣化)、実行時間という3つの計算指標を用いて比較される。 Recent years saw a plethora of work on explaining complex intelligent agents. One example is the development of several algorithms that generate saliency maps which show how much each pixel attributed to the agents' decision. However, most evaluations of such saliency maps focus on image classification tasks. As far as we know, there is no work which thoroughly compares different saliency maps for Deep Reinforcement Learning agents. This paper compares four perturbation-based approaches to create saliency maps for Deep Reinforcement Learning agents trained on four different Atari 2600 games. All four approaches work by perturbing parts of the input and measuring how much this affects the agent's output. The approaches are compared using three computational metrics: dependence on the learned parameters of the agent (sanity checks), faithfulness to the agent's reasoning (input degradation), and run-time.	翻訳日:2021-03-27 06:02:24 公開日:2021-01-18
# 多項出力を用いたBARTの推論 Inference for BART with Multinomial Outcomes ( http://arxiv.org/abs/2101.06823v1 ) ライセンス: Link先を確認	Yizhen Xu, Joseph W. Hogan, Michael J. Daniels, Rami Kantor, Ann Mwangi	(参考訳) multinomial probit bayesian additive regression trees (mpbart) フレームワークはkindoらによって提案された。 (kd)は,マルチノミナルプロビット(mnp)モデルの潜在ユーティリティをbart(chipman et al.)で近似する。 2010). 多項ロジスティックモデルと比較して、MNPは独立した代替案を仮定せず、多変量ガウス分布潜在ユーティリティを通して代替案間の相関構造を特定できる。我々はMPBARTに適合する2つの新しいアルゴリズムを導入し、提案手法の理論的混合速度が既存のKDアルゴリズムと等しいか優れていることを示す。シミュレーションを通じて,提案手法のロバスト性,基準レベルの選択,結果周波数の不均衡,実用的誤差項に対する事前ハイパーパラメータの仕様について検討する。この研究は、ケニアのAMPATH(Academic Model Providing Access to Healthcare)からEHR(Electronic Health Record)に基づいて、HIV陽性患者の死亡率と罹患率に関する後続の予測分布を生成することによる。応用とシミュレーションの両方において,mcmc収束率と後方予測精度の観点からkdと比較して,提案手法により良好な性能が得られた。 The multinomial probit Bayesian additive regression trees (MPBART) framework was proposed by Kindo et al. (KD), approximating the latent utilities in the multinomial probit (MNP) model with BART (Chipman et al. 2010). Compared to multinomial logistic models, MNP does not assume independent alternatives and the correlation structure among alternatives can be specified through multivariate Gaussian distributed latent utilities. We introduce two new algorithms for fitting the MPBART and show that the theoretical mixing rates of our proposals are equal or superior to the existing algorithm in KD. Through simulations, we explore the robustness of the methods to the choice of reference level, imbalance in outcome frequencies, and the specifications of prior hyperparameters for the utility error term. The work is motivated by the application of generating posterior predictive distributions for mortality and engagement in care among HIV-positive patients based on electronic health records (EHRs) from the Academic Model Providing Access to Healthcare (AMPATH) in Kenya. In both the application and simulations, we observe better performance using our proposals as compared to KD in terms of MCMC convergence rate and posterior predictive accuracy.	翻訳日:2021-03-27 06:02:10 公開日:2021-01-18
# 深層ニューラルネットワークによる形状不確かさ定量化における非スムース量の推定 Deep neural network surrogates for non-smooth quantities of interest in shape uncertainty quantification ( http://arxiv.org/abs/2101.07023v1 ) ライセンス: Link先を確認	Laura Scarabosio	(参考訳) 幾何学的不確かさを持つ界面問題に対する解のポイント評価について検討し、障害物の不確かさを高次元パラメータ $\boldsymbol{y}\in[-1,1]^d$, $d\in\mathbb{n}$ で記述する。特に楕円型インタフェース問題とヘルムホルツ伝送問題に焦点を当てる。物理領域における解のポイント値は、高次元パラメータに非スムースに依存し、サーロゲートの構築に関心がある場合の課題となる。実際、高次法は収束率が低いが、不連続性を追跡する方法は通常、いわゆる次元性の呪いに悩まされる。そこで本研究では,深層ニューラルネットワークを用いた点評価のためのサロゲートの構築を提案する。ニューラルネットワークが優れたサロゲートを提供する理由を理論的に正当化する。さらに, 実運用における優れた性能を示す広範な数値実験を行った。特に,ニューラルネットワークが次元の呪いに苦しむことはないことを観察し,点評定点数(つまり,パラメータ空間における不連続点数)に対する誤差の依存性や,2つの物質間のコントラストやヘルムホルツ伝達問題に対するヘルムホルツ伝達問題などのモデリングパラメータについて検討した。 We consider the point evaluation of the solution to interface problems with geometric uncertainties, where the uncertainty in the obstacle is described by a high-dimensional parameter $\boldsymbol{y}\in[-1,1]^d$, $d\in\mathbb{N}$. We focus in particular on an elliptic interface problem and a Helmholtz transmission problem. Point values of the solution in the physical domain depend in general non-smoothly on the high-dimensional parameter, posing a challenge when one is interested in building surrogates. Indeed, high-order methods show poor convergence rates, while methods which are able to track discontinuities usually suffer from the so-called curse of dimensionality. For this reason, in this work we propose to build surrogates for point evaluation using deep neural networks. We provide a theoretical justification for why we expect neural networks to provide good surrogates. Furthermore, we present extensive numerical experiments showing their good performance in practice. We observe in particular that neural networks do not suffer from the curse of dimensionality, and we study the dependence of the error on the number of point evaluations (that is, the number of discontinuities in the parameter space), as well as on several modeling parameters, such as the contrast between the two materials and, for the Helmholtz transmission problem, the wavenumber.	翻訳日:2021-03-27 06:01:52 公開日:2021-01-18
# ランダムウォークに基づくネットワーク埋め込みアルゴリズムの一貫性 Consistency of random-walk based network embedding algorithms ( http://arxiv.org/abs/2101.07354v1 ) ライセンス: Link先を確認	Yichi Zhang, Minh Tang	(参考訳) node2vecやDeepWalkのようなランダムウォークベースのネットワーク埋め込みアルゴリズムは、ダウンストリームネットワーク推論タスクを実行する前にネットワーク内のノードのユークリッド表現を得るために広く使用されている。しかし、その印象的な経験的パフォーマンスにもかかわらず、その振る舞いを説明する理論的結果の欠如がある。本稿では行列分解の観点から, node2vec と DeepWalk のアルゴリズムについて検討した。確率的ブロックモデルグラフに対するコミュニティ検出の設定において,これらのアルゴリズムを解析し,特に,大きなサンプル誤差境界を設定し,node2vec/deepwalk埋め込みとk-meansクラスタリングの一貫したコミュニティ回復を証明した。理論的には,観測されたネットワークのスパース性,ランダムウォークのウィンドウサイズ,node2vec/deepwalk埋め込みの収束率と,真だが未知のエッジ確率行列の埋め込みとの微妙な相互作用を示す。より具体的には、ネットワークがスペーサー化するにつれて、より大きなウィンドウサイズ、または同等に長いランダムウォークを用いて、結果としての埋め込みの収束率を改善することが提案される。これらの観測を裏付ける数値実験を含む。 Random-walk based network embedding algorithms like node2vec and DeepWalk are widely used to obtain Euclidean representation of the nodes in a network prior to performing down-stream network inference tasks. Nevertheless, despite their impressive empirical performance, there is a lack of theoretical results explaining their behavior. In this paper we studied the node2vec and DeepWalk algorithms through the perspective of matrix factorization. We analyze these algorithms in the setting of community detection for stochastic blockmodel graphs; in particular we established large-sample error bounds and prove consistent community recovery of node2vec/DeepWalk embedding followed by k-means clustering. Our theoretical results indicate a subtle interplay between the sparsity of the observed networks, the window sizes of the random walks, and the convergence rates of the node2vec/DeepWalk embedding toward the embedding of the true but unknown edge probabilities matrix. More specifically, as the network becomes sparser, our results suggest using larger window sizes, or equivalently, taking longer random walks, in order to attain better convergence rate for the resulting embeddings. The paper includes numerical experiments corroborating these observations.	翻訳日:2021-03-27 06:01:27 公開日:2021-01-18
# 還元フラックスCTのための深層学習型ノイズ低減 Deep-Learning Driven Noise Reduction for Reduced Flux Computed Tomography ( http://arxiv.org/abs/2101.07376v1 ) ライセンス: Link先を確認	Khalid L. Alsamadony, Ertugrul U. Yildirim, Guenther Glatz, Umair bin Waheed, Sherif M. Hanafy	(参考訳) ディープニューラルネットワークは、特に放射線リスクの低減に関して、臨床画像にかなりの注目を集めている。光子フラックスを低減して放射線線量を減らすと、スキャンされた画像の品質が低下する。そこで研究者たちは、深層畳み込みニューラルネットワーク(dcnn)を利用して、低品質の低用量画像を高用量で高品質な画像にマッピングすることで、関連する放射線ハザードを最小限に抑えることを模索している。逆に、地球物質のCT(Computerd tomography)測定は放射線線量によって制限されない。しかしながら、人体とは対照的に、地球材料は高密度成分からなり、X線の減衰が増大する可能性がある。したがって、スキャン品質を得るためには、より高い量画像が必要である。マイクロCTベースの走査技術では, 長期取得の問題は特に深刻である。サンプルのサイズや露出時間の設定によっては、1回のスキャンで完了するには数時間を要する。これは、指数温度依存性の現象が解明される場合、特に懸念される。プロセスは、CTスキャンによって適切にキャプチャされるには早すぎるかもしれない。以上の課題に対処するため, 岩盤CT画像の品質向上と露光時間の60%以上短縮にDCNNを適用した。我々は、マイクロCTから得られたデータセットに基づいて現在の結果を強調し、DCNNの結果を改善するために転送学習を適用した。この手法は、あらゆる計算トモグラフィー技術に適用できる。さらに、平均二乗誤差や構造的類似度指数などの異なる損失関数を最小化するDCNNの性能を比較検討する。 Deep neural networks have received considerable attention in clinical imaging, particularly with respect to the reduction of radiation risk. Lowering the radiation dose by reducing the photon flux inevitably results in the degradation of the scanned image quality. Thus, researchers have sought to exploit deep convolutional neural networks (DCNNs) to map low-quality, low-dose images to higher-dose, higher-quality images thereby minimizing the associated radiation hazard. Conversely, computed tomography (CT) measurements of geomaterials are not limited by the radiation dose. In contrast to the human body, however, geomaterials may be comprised of high-density constituents causing increased attenuation of the X-Rays. Consequently, higher dosage images are required to obtain an acceptable scan quality. The problem of prolonged acquisition times is particularly severe for micro-CT based scanning technologies. Depending on the sample size and exposure time settings, a single scan may require several hours to complete. This is of particular concern if phenomena with an exponential temperature dependency are to be elucidated. A process may happen too fast to be adequately captured by CT scanning. To address the aforementioned issues, we apply DCNNs to improve the quality of rock CT images and reduce exposure times by more than 60\%, simultaneously. We highlight current results based on micro-CT derived datasets and apply transfer learning to improve DCNN results without increasing training time. The approach is applicable to any computed tomography technology. Furthermore, we contrast the performance of the DCNN trained by minimizing different loss functions such as mean squared error and structural similarity index.	翻訳日:2021-03-27 06:01:07 公開日:2021-01-18
# フェアネス達成のための最適前処理と全変動バリーセンタとの関係 Optimal Pre-Processing to Achieve Fairness and Its Relationship with Total Variation Barycenter ( http://arxiv.org/abs/2101.06811v1 ) ライセンス: Link先を確認	Farhad Farokhi	(参考訳) 我々は異なる影響、すなわち、アウトプットを観察する確率が人種や性別などの保護された属性に依存する程度を使って公正さを測定する。保護属性が与えられた入力の分布の合計変動距離によって異なる影響が上界であることが証明された。次に、公正性を強制するために前処理(データ修復とも呼ばれる)を使用します。本研究では,データの事前処理による予測モデルの成功度が,前処理前後におけるデータ分布のばらつき距離によって上限されることを示す。これにより、保護された属性の分布間の総変分距離の制約を受ける前処理前後のデータ分布間の総変分距離を最小化して、公正性を確保するための最適な前処理連隊を見つけることができる。この問題は効率的に解くことができる線形プログラムである。確率空間内の距離を全変動距離で測定した場合,この問題は2つの分布の重心(すなわち質量中心)の発見と密接に関連していることを示す。また,差分プライバシーが公平性に及ぼす影響を,全変動距離を用いて検討した。実践データセットを用いて数値実験を行い,実験結果を示す。 We use disparate impact, i.e., the extent that the probability of observing an output depends on protected attributes such as race and gender, to measure fairness. We prove that disparate impact is upper bounded by the total variation distance between the distribution of the inputs given the protected attributes. We then use pre-processing, also known as data repair, to enforce fairness. We show that utility degradation, i.e., the extent that the success of a forecasting model changes by pre-processing the data, is upper bounded by the total variation distance between the distribution of the data before and after pre-processing. Hence, the problem of finding the optimal pre-processing regiment for enforcing fairness can be cast as minimizing total variations distance between the distribution of the data before and after pre-processing subject to a constraint on the total variation distance between the distribution of the inputs given protected attributes. This problem is a linear program that can be efficiently solved. We show that this problem is intimately related to finding the barycenter (i.e., center of mass) of two distributions when distances in the probability space are measured by total variation distance. We also investigate the effect of differential privacy on fairness using the proposed the total variation distances. We demonstrate the results using numerical experimentation with a practice dataset.	翻訳日:2021-03-27 06:00:45 公開日:2021-01-18
# インクリメンタル知識に基づく質問応答 Incremental Knowledge Based Question Answering ( http://arxiv.org/abs/2101.06938v1 ) ライセンス: Link先を確認	Yongqi Li, Wenjie Li, Liqiang Nie	(参考訳) 近年,知識ベースの質問回答 (KBQA) は,知識ベースで事実を用いて自然言語の質問に答えることを目的としている。既存のアプローチは静的な知識ベースを想定することが多い。しかし、知識は現実世界で時間とともに進化している。進化する知識ベースに微調整戦略を直接適用すれば、深刻な破滅的な忘れの問題に悩まされるでしょう。本稿では,人間と同じように学習能力を段階的に拡大できる新しいインクリメンタルkbqa学習フレームワークを提案する。具体的には、知識蒸留を生かして壊滅的な忘れる問題を克服するために、マージン蒸留損失と協調抽出方法とを含む。提案するインクリメンタル学習ソリューションを評価するために,simplequestionデータセットを再編成した。包括的な実験は、進化する知識ベースに取り組む際にその効果と効率を示す。 In the past years, Knowledge-Based Question Answering (KBQA), which aims to answer natural language questions using facts in a knowledge base, has been well developed. Existing approaches often assume a static knowledge base. However, the knowledge is evolving over time in the real world. If we directly apply a fine-tuning strategy on an evolving knowledge base, it will suffer from a serious catastrophic forgetting problem. In this paper, we propose a new incremental KBQA learning framework that can progressively expand learning capacity as humans do. Specifically, it comprises a margin-distilled loss and a collaborative exemplar selection method, to overcome the catastrophic forgetting problem by taking advantage of knowledge distillation. We reorganize the SimpleQuestion dataset to evaluate the proposed incremental learning solution to KBQA. The comprehensive experiments demonstrate its effectiveness and efficiency when working with the evolving knowledge base.	翻訳日:2021-03-27 05:59:56 公開日:2021-01-18
# HinFlair: Hindi言語におけるposタグとテキスト分類のための事前訓練された文脈文字列埋め込み HinFlair: pre-trained contextual string embeddings for pos tagging and text classification in the Hindi language ( http://arxiv.org/abs/2101.06949v1 ) ライセンス: Link先を確認	Harsh Patel	(参考訳) 繰り返しニューラルネットワークとトランスフォーマーアーキテクチャに基づく言語モデルの最近の進歩は、posタグ付け、名前付きエンティティ認識、テキスト分類など、幅広い自然言語処理タスクにおいて最先端の結果を得た。しかし、これらの言語モデルのほとんどは、英語、ドイツ語、スペイン語のような高資源言語で事前学習されている。多言語言語モデルはヒンディー語、テルグ語、ベンガル語などのインドの言語を訓練用コーパスに含んでいるが、これらの言語が研究の主要な言語ではないため、言語の特徴を表現できないことが多い。 HinFlairは、巨大な単言語Hindiコーパス上で事前訓練された言語表現モデル(コンテキスト文字列埋め込み)である。 6つのテキスト分類データセットとヒンディー語の依存木バンクを用いて、ヒンディー語のコンテキスト化文字列埋め込みの性能を分析する実験を行った。結果は、HinFlairが、テキスト分類やposタグ付けといった下流タスクのために、既存の最先端の公開トレーニング済みの埋め込みよりも優れていることを示している。また、HinFlairとFastTextの埋め込みの組み合わせは、特にヒンディー語のために訓練された多くのトランスフォーマーベースの言語モデルより優れている。 Recent advancements in language models based on recurrent neural networks and transformers architecture have achieved state-of-the-art results on a wide range of natural language processing tasks such as pos tagging, named entity recognition, and text classification. However, most of these language models are pre-trained in high resource languages like English, German, Spanish. Multi-lingual language models include Indian languages like Hindi, Telugu, Bengali in their training corpus, but they often fail to represent the linguistic features of these languages as they are not the primary language of the study. We introduce HinFlair, which is a language representation model (contextual string embeddings) pre-trained on a large monolingual Hindi corpus. Experiments were conducted on 6 text classification datasets and a Hindi dependency treebank to analyze the performance of these contextualized string embeddings for the Hindi language. Results show that HinFlair outperforms previous state-of-the-art publicly available pre-trained embeddings for downstream tasks like text classification and pos tagging. Also, HinFlair when combined with FastText embeddings outperforms many transformers-based language models trained particularly for the Hindi language.	翻訳日:2021-03-27 05:59:45 公開日:2021-01-18
# caegcn:クロス・アテンション・フュージョンベースの強化グラフ畳み込みネットワーク CaEGCN: Cross-Attention Fusion based Enhanced Graph Convolutional Network for Clustering ( http://arxiv.org/abs/2101.06883v1 ) ライセンス: Link先を確認	Guangyu Huo, Yong Zhang, Junbin Gao, Boyue Wang, Yongli Hu, and Baocai Yin	(参考訳) 深層畳み込みネットワークの強力な学習能力により、深層クラスタリング手法は個々のデータから最も識別性の高い情報を抽出し、より良好なクラスタリング結果を生成することができる。しかしながら、既存のディープクラスタリング手法は通常、データ間の関係を無視する。幸いなことに、グラフ畳み込みネットワークはそのような関係を処理でき、ディープクラスタリングの新しい研究方向を開くことができる。 In this paper, we propose a cross-attention based deep clustering framework, named Cross-Attention Fusion based Enhanced Graph Convolutional Network (CaEGCN), which contains four main modules: the cross-attention fusion module which innovatively concatenates the Content Auto-encoder module (CAE) relating to the individual data and Graph Convolutional Auto-encoder module (GAE) relating to the relationship between the data in a layer-by-layer manner, and the self-supervised model that highlights the discriminative information for clustering tasks. クロスアテンション融合モジュールは2種類の異種表現を融合するが、CAEモジュールはGAEモジュールのコンテンツ情報を補完し、GCNの過度に平滑な問題を回避する。 GAEモジュールでは、各データの内容と関係を再構成する2つの新しい損失関数が提案されている。最後に、自己教師付きモジュールは、CAEとGAEの中間層表現の分布を一貫性に制約する。異なるタイプのデータセットに対する実験結果は、提案したCaEGCNの優位性と堅牢性を証明する。 With the powerful learning ability of deep convolutional networks, deep clustering methods can extract the most discriminative information from individual data and produce more satisfactory clustering results. However, existing deep clustering methods usually ignore the relationship between the data. Fortunately, the graph convolutional network can handle such relationship, opening up a new research direction for deep clustering. In this paper, we propose a cross-attention based deep clustering framework, named Cross-Attention Fusion based Enhanced Graph Convolutional Network (CaEGCN), which contains four main modules: the cross-attention fusion module which innovatively concatenates the Content Auto-encoder module (CAE) relating to the individual data and Graph Convolutional Auto-encoder module (GAE) relating to the relationship between the data in a layer-by-layer manner, and the self-supervised model that highlights the discriminative information for clustering tasks. While the cross-attention fusion module fuses two kinds of heterogeneous representation, the CAE module supplements the content information for the GAE module, which avoids the over-smoothing problem of GCN. In the GAE module, two novel loss functions are proposed that reconstruct the content and relationship between the data, respectively. Finally, the self-supervised module constrains the distributions of the middle layer representations of CAE and GAE to be consistent. Experimental results on different types of datasets prove the superiority and robustness of the proposed CaEGCN.	翻訳日:2021-03-27 05:59:26 公開日:2021-01-18
# 認知症者における動揺・尿路感染症のリスク分析のための注意モデル An attention model to analyse the risk of agitation and urinary tract infections in people with dementia ( http://arxiv.org/abs/2101.07007v1 ) ライセンス: Link先を確認	Honglin Li, Roonak Rezvani, Magdalena Anita Kolanko, David J. Sharp, Maitreyee Wairagkar, Ravi Vaidyanathan, Ramin Nilforooshan, Payam Barnaghi	(参考訳) 行動症状と尿路感染症(uti)は認知症患者が直面する最も一般的な問題である。これらの状況を管理する上で重要な課題の1つは、苦痛を減らし、未計画の入院を避けるために早期発見とタイムリーな介入である。センサーデータの統合と分析に家庭内センシング技術と機械学習モデルを使用することで、臨床的に重要な出来事や健康状態の変化を検出し予測する機会が得られる。我々は,家庭内センサデータを収集する統合プラットフォームを開発し,機械学習モデルを扇動およびUTIリスク分析に適用するための観察的研究を行った。平均年齢82名,標準偏差6.5名(女性47名,男性41名)の88名から大規模なデータセットを収集し,注意と合理的なメカニズムを利用した新しい深層学習モデルの評価を行った。提案手法では,大量のデータを一定時間にわたって処理し,時系列データから重要なパターンを抽出することができる。注意) 抽出された特徴とパターンを使用してリスク分析モデル(すなわち、リスク分析モデル)をトレーニングする。合理的)。提案モデルでは,時系列データにおいてどの時間ステップと特徴が使用されるかを示すことにより,予測を説明することができる。本モデルでは, 排卵リスクとウティスの検出において, 91\%のリコールと83\%の精度を提供する。このモデルは、初期治療や早期介入アプローチと連動して、ウティスなどの病態の早期検出や興奮などの神経精神症状の管理に使用できる。本研究は,提案モデルが生成した警告を用いて早期介入を行うための臨床経路のセットを開発し,このプラットフォームを用いて,作成した介入計画に従って警告に応答する臨床モニタリングチームを設置した。 Behavioural symptoms and urinary tract infections (UTI) are among the most common problems faced by people with dementia. One of the key challenges in the management of these conditions is early detection and timely intervention in order to reduce distress and avoid unplanned hospital admissions. Using in-home sensing technologies and machine learning models for sensor data integration and analysis provides opportunities to detect and predict clinically significant events and changes in health status. We have developed an integrated platform to collect in-home sensor data and performed an observational study to apply machine learning models for agitation and UTI risk analysis. We collected a large dataset from 88 participants with a mean age of 82 and a standard deviation of 6.5 (47 females and 41 males) to evaluate a new deep learning model that utilises attention and rational mechanism. The proposed solution can process a large volume of data over a period of time and extract significant patterns in a time-series data (i.e. attention) and use the extracted features and patterns to train risk analysis models (i.e. rational). The proposed model can explain the predictions by indicating which time-steps and features are used in a long series of time-series data. The model provides a recall of 91\% and precision of 83\% in detecting the risk of agitation and UTIs. This model can be used for early detection of conditions such as UTIs and managing of neuropsychiatric symptoms such as agitation in association with initial treatment and early intervention approaches. In our study we have developed a set of clinical pathways for early interventions using the alerts generated by the proposed model and a clinical monitoring team has been set up to use the platform and respond to the alerts according to the created intervention plans.	翻訳日:2021-03-27 05:59:05 公開日:2021-01-18
# 新しく得られた有効観測値の光によるデータ欠落検出 Data Obsolescence Detection in the Light of Newly Acquired Valid Observations ( http://arxiv.org/abs/2101.07067v1 ) ライセンス: Link先を確認	Salma Chaieb and Ali Ben Mrad and Brahim Hnich and V\'eronique Delcroix	(参考訳) システムまたは人の状態を記述する情報は、常に進化し、時代遅れになり、他の情報と矛盾する可能性がある。したがってデータベースは、データベースに含まれる時代遅れのものと矛盾する、新しい有効な観測の取得によって一貫して更新されなければならない。本稿では,情報陳腐化問題に対処するための新しい手法を提案する。提案手法は,観測結果間の矛盾をリアルタイムに検出し,表現モデルから古いものを特定することを目的としている。情報不足を特徴とする不確実な環境下で作業するため、ベイズネットワークを表現モデルとして使用し、新しい近似概念である$\epsilon$-Contradictionを提案する。新しい概念は、一連の観測において矛盾を持つ自信レベルによってパラメータ化される。本稿では,古い情報を検出する多項式時間アルゴリズムを提案する。結果として得られた古くなった情報は、単純な観測結果よりもAND-OR木の方がよいことを示す。最後に,本手法の有効性を高齢者の転倒防止データベースに示すとともに,この木を用いて医師に信頼できる推薦を行う方法を示す。我々の実験は体系的にも実質的にも良い結果をもたらす。 The information describing the conditions of a system or a person is constantly evolving and may become obsolete and contradict other information. A database, therefore, must be consistently updated upon the acquisition of new valid observations that contradict obsolete ones contained in the database. In this paper, we propose a novel approach for dealing with the information obsolescence problem. Our approach aims to detect, in real-time, contradictions between observations and then identify the obsolete ones, given a representation model. Since we work within an uncertain environment characterized by the lack of information, we choose to use a Bayesian network as our representation model and propose a new approximate concept, $\epsilon$-Contradiction. The new concept is parameterised by a confidence level of having a contradiction in a set of observations. We propose a polynomial-time algorithm for detecting obsolete information. We show that the resulting obsolete information is better represented by an AND-OR tree than a simple set of observations. Finally, we demonstrate the effectiveness of our approach on a real elderly fall-prevention database and showcase how this tree can be used to give reliable recommendations to doctors. Our experiments give systematically and substantially very good results.	翻訳日:2021-03-27 05:58:39 公開日:2021-01-18
# 人間と機械理解の不協和性 Dissonance Between Human and Machine Understanding ( http://arxiv.org/abs/2101.07337v1 ) ライセンス: Link先を確認	Zijian Zhang, Jaspreet Singh, Ujwal Gadiraju, Avishek Anand	(参考訳) 複雑な機械学習モデルは、現在医療や自動運転車を含むいくつかの重要な領域にデプロイされている。その結果、このような複雑なモデルの決定を人間に説明するための解釈が近年急増している。タスクの人間の解釈に対応するモデルは、特定のコンテキストにおいてより望ましいものであり、責任の属性、信頼の構築、バイアスの顕在化、よりよいモデルの構築に役立つ。したがって、どのモデルがタスクの人間の理解にどのように準拠しているかを理解することが重要である。本稿では,画像分類タスクのレンズを通して,人間と機械の理解の不協和性を明らかにし,定量化する大規模クラウドソーシング研究を行う。特に、我々は以下の質問に答えようとしている。どの(十分にパフォーマンスのよい)複雑なMLモデルが、正確な予測を行うために機能を使用することで、人間に近づきつつあるか? タスクの難易度は、人間と比較して機械の機能選択能力にどのように影響するか? 人間は画像認識をより正確にする機能を選択するのに一貫して優れているか? 私たちの発見は、人工知能の分野における長期的な目標は、人間のように学習し推論できる機械を作ることであると考え、人間と機械のコラボレーションに重要な意味を持つ。 Complex machine learning models are deployed in several critical domains including healthcare and autonomous vehicles nowadays, albeit as functional black boxes. Consequently, there has been a recent surge in interpreting decisions of such complex models in order to explain their actions to humans. Models that correspond to human interpretation of a task are more desirable in certain contexts and can help attribute liability, build trust, expose biases and in turn build better models. It is, therefore, crucial to understand how and which models conform to human understanding of tasks. In this paper, we present a large-scale crowdsourcing study that reveals and quantifies the dissonance between human and machine understanding, through the lens of an image classification task. In particular, we seek to answer the following questions: Which (well-performing) complex ML models are closer to humans in their use of features to make accurate predictions? How does task difficulty affect the feature selection capability of machines in comparison to humans? Are humans consistently better at selecting features that make image recognition more accurate? Our findings have important implications on human-machine collaboration, considering that a long term goal in the field of artificial intelligence is to make machines capable of learning and reasoning like humans.	翻訳日:2021-03-27 05:58:23 公開日:2021-01-18
# 無標識植物病画像に対するカオス的微細クラスタリング Chaotic-to-Fine Clustering for Unlabeled Plant Disease Images ( http://arxiv.org/abs/2101.06820v1 ) ライセンス: Link先を確認	Uno Fang, Jianxin Li, Xuequan Lu, Mumtaz Ali, Longxiang Gao and Yong Xiang	(参考訳) 現在の植物病画像の注釈は、農業の専門家による手作業による仕分けと手作りの特徴に依存する。本稿では,Kernel K-meansの脆弱性に基づいた,植物病画像のグループ化のための自己組織化クラスタリングフレームワークを提案する。主なアイデアは、カーネルk-meansに基づくクロスイテレーティブなアンダークラスタ化アルゴリズムを確立し、擬似ラベルトレーニングセットとカオスクラスタを生成し、さらにディープラーニングモジュールによって分類することである。提案手法の有効性を検証するため,植物5種と植物17種の3種類の病原体について広範な実験を行った。画像に基づく植物病の分類をバランスとバランスのとれないデータセットに対して, 既存の5つの著作物と異なる指標を用いて比較し, 高い優越性を示した。 Current annotation for plant disease images depends on manual sorting and handcrafted features by agricultural experts, which is time-consuming and labour-intensive. In this paper, we propose a self-supervised clustering framework for grouping plant disease images based on the vulnerability of Kernel K-means. The main idea is to establish a cross iterative under-clustering algorithm based on Kernel K-means to produce the pseudo-labeled training set and a chaotic cluster to be further classified by a deep learning module. In order to verify the effectiveness of our proposed framework, we conduct extensive experiments on three different plant disease datatsets with five plants and 17 plant diseases. The experimental results show the high superiority of our method to do image-based plant disease classification over balanced and unbalanced datasets by comparing with five state-of-the-art existing works in terms of different metrics.	翻訳日:2021-03-27 05:58:05 公開日:2021-01-18
# CFC-Net:リモートセンシング画像における任意指向物体検出のための重要な特徴キャプチャネットワーク CFC-Net: A Critical Feature Capturing Network for Arbitrary-Oriented Object Detection in Remote Sensing Images ( http://arxiv.org/abs/2101.06849v1 ) ライセンス: Link先を確認	Qi Ming, Lingjuan Miao, Zhiqiang Zhou, Yunpeng Dong	(参考訳) 光リモートセンシング画像における物体検出は重要かつ困難な課題である。近年,畳み込みニューラルネットワークに基づく手法が進歩している。しかし, 物体スケール, アスペクト比, 任意の方向のばらつきが大きいため, 検出性能がさらに向上することは困難である。本稿では,物体検出における識別的特徴の役割について検討し,特徴表現の構築,事前設定アンカーの改良,ラベル割り当ての最適化という3つの側面から検出精度を向上させるために,cfc-net (critical feature capture network) を提案する。具体的には、まず分類と回帰の特徴を分離し、次に分極注意モジュール(pam)を介して各タスクに適応したロバストな重要な特徴を構築する。抽出した識別回帰特性により、R-ARM(Rotation Anchor Refinement Module)は、予め設定された水平アンカーの局所化処理を行い、より優れたローテーションアンカーを得る。次に、ダイナミックアンカー学習(DAL)戦略により、重要な特徴を捉える能力に基づいて、高品質なアンカーを適応的に選択する。提案フレームワークは、リモートセンシング画像におけるオブジェクトのより強力なセマンティック表現を生成し、高性能なリアルタイムオブジェクト検出を実現する。 HRSC2016, DOTA, UCAS-AODの3つのリモートセンシングデータセットによる実験結果から, 本手法は多くの最先端手法と比較して優れた検出性能を示すことが示された。コードとモデルはhttps://github.com/ming71/cfc-netで入手できる。 Object detection in optical remote sensing images is an important and challenging task. In recent years, the methods based on convolutional neural networks have made good progress. However, due to the large variation in object scale, aspect ratio, and arbitrary orientation, the detection performance is difficult to be further improved. In this paper, we discuss the role of discriminative features in object detection, and then propose a Critical Feature Capturing Network (CFC-Net) to improve detection accuracy from three aspects: building powerful feature representation, refining preset anchors, and optimizing label assignment. Specifically, we first decouple the classification and regression features, and then construct robust critical features adapted to the respective tasks through the Polarization Attention Module (PAM). With the extracted discriminative regression features, the Rotation Anchor Refinement Module (R-ARM) performs localization refinement on preset horizontal anchors to obtain superior rotation anchors. Next, the Dynamic Anchor Learning (DAL) strategy is given to adaptively select high-quality anchors based on their ability to capture critical features. The proposed framework creates more powerful semantic representations for objects in remote sensing images and achieves high-performance real-time object detection. Experimental results on three remote sensing datasets including HRSC2016, DOTA, and UCAS-AOD show that our method achieves superior detection performance compared with many state-of-the-art approaches. Code and models are available at https://github.com/ming71/CFC-Net.	翻訳日:2021-03-27 05:57:49 公開日:2021-01-18
# 野生における3次元物体形状復元の秘密 Secrets of 3D Implicit Object Shape Reconstruction in the Wild ( http://arxiv.org/abs/2101.06860v1 ) ライセンス: Link先を確認	Shivam Duggal, Zihao Wang, Wei-Chiu Ma, Sivabalan Manivasagam, Justin Liang, Shenlong Wang and Raquel Urtasun	(参考訳) 高忠実度3Dオブジェクトをスパースから再構成し、部分的な観察はコンピュータビジョン、ロボティクス、グラフィックスの様々な応用において重要である。最近のニューラル暗黙的モデリング手法は、合成されたデータセットや高密度なデータセットに対して有望な結果を示すが、それらはスパースでノイズの多い実世界のデータでは不十分である。本稿では,ニューラルネットワークを用いたニューラル暗黙モデルの性能低下の根本原因を解析する。この制限は、非常に複雑な目的、正規化の欠如、そして初期化の欠如によるものである。これらの問題を克服するために、 (i) 潜時コード最適化のためのより良い、より安定した初期化を提供するディープエンコーダ、 (ii) 形状の忠実度を高めるための事前モデルとして機能するディープディミネータの2つの簡単な修正を導入する。提案手法は実車2台について評価し,最先端の3dオブジェクト復元法よりも優れた性能を示す。 Reconstructing high-fidelity 3D objects from sparse, partial observation is of crucial importance for various applications in computer vision, robotics, and graphics. While recent neural implicit modeling methods show promising results on synthetic or dense datasets, they perform poorly on real-world data that is sparse and noisy. This paper analyzes the root cause of such deficient performance of a popular neural implicit model. We discover that the limitations are due to highly complicated objectives, lack of regularization, and poor initialization. To overcome these issues, we introduce two simple yet effective modifications: (i) a deep encoder that provides a better and more stable initialization for latent code optimization; and (ii) a deep discriminator that serves as a prior model to boost the fidelity of the shape. We evaluate our approach on two real-wold self-driving datasets and show superior performance over state-of-the-art 3D object reconstruction methods.	翻訳日:2021-03-27 05:57:23 公開日:2021-01-18
# Label-Efficient Point Cloud Semantic Segmentation: アクティブラーニングアプローチ Label-Efficient Point Cloud Semantic Segmentation: An Active Learning Approach ( http://arxiv.org/abs/2101.06931v1 ) ライセンス: Link先を確認	Xian Shi, Xun Xu, Ke Chen, Lile Cai, Chuan Sheng Foo, Kui Jia	(参考訳) 3Dポイントクラウドのセマンティックセグメンテーションは、大量のラベル付きデータによる深層モデルのトレーニングに依存している。しかし、3Dポイントクラウドのラベル付けは高価であり、データアノテーションに対する賢いアプローチである。ラベル効率のよいクラウドセグメンテーションには,アクティブな学習が不可欠だ。そこで本研究では,より現実的なアノテーションカウントスキームを提案する。ラベル付け予算をよりうまく活用するために,我々は,点クラウド幾何上で定義された多様体を利用するスーパーポイントベースのアクティブラーニング戦略を採用する。さらに,形状レベルの多様性と局所的空間的一貫性制約を促進するためのアクティブラーニング戦略を提案する。 2つのベンチマークデータセットにおける実験により,提案手法がポイントクラウドのラベル効率の高い意味セグメンテーションに有効であることを示す。特に、あらゆるレベルのアノテーション予算において大幅な改善を実現し、同じレベルのアノテーションコストで最先端のメソッドよりも優れています。 Semantic segmentation of 3D point clouds relies on training deep models with a large amount of labeled data. However, labeling 3D point clouds is expensive, thus smart approach towards data annotation, a.k.a. active learning is essential to label-efficient point cloud segmentation. In this work, we first propose a more realistic annotation counting scheme so that a fair benchmark is possible. To better exploit labeling budget, we adopt a super-point based active learning strategy where we make use of manifold defined on the point cloud geometry. We further propose active learning strategy to encourage shape level diversity and local spatial consistency constraint. Experiments on two benchmark datasets demonstrate the efficacy of our proposed active learning strategy for label-efficient semantic segmentation of point clouds. Notably, we achieve significant improvement at all levels of annotation budgets and outperform the state-of-the-art methods under the same level of annotation cost.	翻訳日:2021-03-27 05:56:46 公開日:2021-01-18
# 3次元意味セグメンテーションにおける領域適応のためのクロスモーダル学習 Cross-modal Learning for Domain Adaptation in 3D Semantic Segmentation ( http://arxiv.org/abs/2101.07253v1 ) ライセンス: Link先を確認	Maximilian Jaritz, Tuan-Hung Vu, Raoul de Charette, \'Emilie Wirbel, and Patrick P\'erez	(参考訳) ドメイン適応はラベルが不足している場合の学習を可能にする重要なタスクである。ほとんどの作業はイメージモダリティのみに焦点を当てているが、多くの重要なマルチモーダルデータセットが存在する。ドメイン適応にマルチモーダルを活用するために,相互模倣による2つのモーダルの予測の整合性を実現するクロスモーダル学習を提案する。我々は、ラベル付きデータに対する正確な予測とラベルなしのターゲットドメインデータに対するモダリティ間の一貫性のある予測をネットワークに制限する。教師なしおよび半教師付きドメイン適応設定の実験は、この新しいドメイン適応戦略の有効性を証明している。具体的には,画像と点クラウドモダリティを用いて3次元意味セグメンテーションのタスクを評価する。最近の自動運転データセットを利用して、シーンレイアウトの変更、照明、センサーの設定、天気、合成から現実への設定など、さまざまなドメイン適応シナリオを作成します。本手法は,すべての適応シナリオにおいて,以前のユニモーダル適応ベースラインよりも大幅に向上する。コードは利用可能になる。 Domain adaptation is an important task to enable learning when labels are scarce. While most works focus only on the image modality, there are many important multi-modal datasets. In order to leverage multi-modality for domain adaptation, we propose cross-modal learning, where we enforce consistency between the predictions of two modalities via mutual mimicking. We constrain our network to make correct predictions on labeled data and consistent predictions across modalities on unlabeled target-domain data. Experiments in unsupervised and semi-supervised domain adaptation settings prove the effectiveness of this novel domain adaptation strategy. Specifically, we evaluate on the task of 3D semantic segmentation using the image and point cloud modality. We leverage recent autonomous driving datasets to produce a wide variety of domain adaptation scenarios including changes in scene layout, lighting, sensor setup and weather, as well as the synthetic-to-real setup. Our method significantly improves over previous uni-modal adaptation baselines on all adaption scenarios. Code will be made available.	翻訳日:2021-03-27 05:56:25 公開日:2021-01-18
# 診断用キャプション:サーベイ Diagnostic Captioning: A Survey ( http://arxiv.org/abs/2101.07299v1 ) ライセンス: Link先を確認	John Pavlopoulos, Vasiliki Kougia, Ion Androutsopoulos, Dimitris Papamichail	(参考訳) 診断用キャプション(DC)は、検査中に収集した患者の医療画像から診断用テキストを自動的に生成するものである。 DCは経験の浅い医師を助け、臨床ミスを減らすことができる。経験豊富な医師がより早く診断レポートを作成するのに役立つ。ディープラーニングの進歩、特に一般的なイメージキャプションにおいて、DCは近年より注目を集め、いくつかのシステムやデータセットにつながった。この記事では、DCの概要を概観する。関連するデータセット、評価基準、および最新のシステムを示す。また、DCの進歩を妨げる欠点を強調し、今後の方向性を提案する。 Diagnostic Captioning (DC) concerns the automatic generation of a diagnostic text from a set of medical images of a patient collected during an examination. DC can assist inexperienced physicians, reducing clinical errors. It can also help experienced physicians produce diagnostic reports faster. Following the advances of deep learning, especially in generic image captioning, DC has recently attracted more attention, leading to several systems and datasets. This article is an extensive overview of DC. It presents relevant datasets, evaluation measures, and up to date systems. It also highlights shortcomings that hinder DC's progress and proposes future directions.	翻訳日:2021-03-27 05:56:10 公開日:2021-01-18
# パートベース表現の探索による顔認証の改善 Improving Makeup Face Verification by Exploring Part-Based Representations ( http://arxiv.org/abs/2101.07338v1 ) ライセンス: Link先を確認	Marcus de Assis Angeloni and Helio Pedrini	(参考訳) 近年,世界の顔認識市場規模が増加している。畳み込みニューラルネットワークの採用による顔認識技術の大幅な進歩にもかかわらず、顔に化粧があるようなオープンな課題がまだ残っている。この課題に対処するために,現在の全体表現と融合する顔部品の採用を提案,評価する。顔面部は4つの領域(左眼,右眼,鼻,口)と3つの顔面(上,中,下)の2つの戦略を提案する。 4つの公開メイクアップフェイスデータセットと難解なクロスデータセットプロトコルによって得られた実験結果は、顔部分から抽出された深い特徴と全体表現との融合により、顔認証システムの精度が向上し、cnnモデルのリトレーニングなしにエラーレートが低下することを示している。提案したパイプラインは,YMUデータセットの最先端性能と,他の3つのデータセット(EMFD,FAM,M501)の競合結果を得た。 Recently, we have seen an increase in the global facial recognition market size. Despite significant advances in face recognition technology with the adoption of convolutional neural networks, there are still open challenges, as when there is makeup in the face. To address this challenge, we propose and evaluate the adoption of facial parts to fuse with current holistic representations. We propose two strategies of facial parts: one with four regions (left periocular, right periocular, nose and mouth) and another with three facial thirds (upper, middle and lower). Experimental results obtained in four public makeup face datasets and in a challenging cross-dataset protocol show that the fusion of deep features extracted of facial parts with holistic representation increases the accuracy of face verification systems and decreases the error rates, even without any retraining of the CNN models. Our proposed pipeline achieved state-of-the-art performance for the YMU dataset and competitive results for other three datasets (EMFD, FAM and M501).	翻訳日:2021-03-27 05:56:03 公開日:2021-01-18
# 変圧器モデルの通路再配置における位置偏りの軽減 Mitigating the Position Bias of Transformer Models in Passage Re-Ranking ( http://arxiv.org/abs/2101.06980v1 ) ライセンス: Link先を確認	Sebastian Hofst\"atter, Aldo Lipani, Sophia Althammer, Markus Zlabinger, Allan Hanbury	(参考訳) 教師付き機械学習モデルとその評価は、基礎となるデータセットの品質に大きく依存する。関連した情報を検索すると、指定された通路のどこにでも現れる可能性がある。しかし,文中の正しい回答の位置の偏りを,文節の再ランキングに用いる2つの一般的な質問応答データセットで観察する。通路内の初期の位置を過度に好むことは望ましくない人工物である。これにより、トランスフォーマーベースの3つの一般的なリグレードモデルが、目に見えない通路で関連する部分を無視する。さらに、評価セットが同じバイアス分布から取られるので、そのバイアスに過度に適合するモデルは、真の効果を過大評価する。本研究では,データセットの位置バイアス,文脈表現,それらの検索結果への影響を分析する。本稿では,データセットのデバイアス化手法を提案する。以上の結果から,位置バイアスデータセットでトレーニングしたモデルでは,デバイアスデータセットで評価した場合,再評価の有効性が著しく低下することが示唆された。位置バイアスを緩和することにより、トランスフォーマーベースのリグレードモデルはバイアス付きおよびバイアス付きデータセットに対して等しく有効であり、2つの異なるバイアス付きデータセット間の転送学習設定においてより効果的であることを示す。 Supervised machine learning models and their evaluation strongly depends on the quality of the underlying dataset. When we search for a relevant piece of information it may appear anywhere in a given passage. However, we observe a bias in the position of the correct answer in the text in two popular Question Answering datasets used for passage re-ranking. The excessive favoring of earlier positions inside passages is an unwanted artefact. This leads to three common Transformer-based re-ranking models to ignore relevant parts in unseen passages. More concerningly, as the evaluation set is taken from the same biased distribution, the models overfitting to that bias overestimate their true effectiveness. In this work we analyze position bias on datasets, the contextualized representations, and their effect on retrieval results. We propose a debiasing method for retrieval datasets. Our results show that a model trained on a position-biased dataset exhibits a significant decrease in re-ranking effectiveness when evaluated on a debiased dataset. We demonstrate that by mitigating the position bias, Transformer-based re-ranking models are equally effective on a biased and debiased dataset, as well as more effective in a transfer-learning setting between two differently biased datasets.	翻訳日:2021-03-27 05:55:43 公開日:2021-01-18
# ガウス過程回帰のためのベイズ変換学習における転移モデル構造 Transferring model structure in Bayesian transfer learning for Gaussian process regression ( http://arxiv.org/abs/2101.06884v1 ) ライセンス: Link先を確認	Milan Pape\v{z}, Anthony Quinn	(参考訳) ベイズ変換学習(Bayesian Transfer Learning, BTL)は、転送元分布にターゲット確率分布を条件付けるタスクとして定義される。ターゲットは、ソースとターゲット間のインタラクションと、独立したローカルソースモデラーで利用可能な確率的データ予測器の条件をグローバルにモデル化する。この最適意思決定問題を解決するために、完全な確率的設計が採用されている。ソースのより高いモーメントを転送することで、ターゲットは信頼できないソース知識(つまり)を拒否することができる。堅牢な転送を実現します) このデュアルモデラーフレームワークは、ソースの生データを転送された予測分布に局所的に処理し、圧縮可能な可能性を持つ)局所的なソースモデルによって強化されることを意味する。さらに、グローバルターゲットモデラーの導入により、ソースとターゲットタスク -- ターゲットに知られていれば -- の相関を説明できる。重要な結果が生まれる。まず、新しいスキームは、完全にモデル化された(すなわち)性能を達成する。従来)ターゲットモデルの誤特定を避ける(稀な)ケースにおけるマルチタスク学習スキーム。第二に、新しいデュアルモデルフレームワークは、従来のマルチタスク学習を損なうようなモデルのミススペクテーションに対して堅牢である。我々はこれらの問題をガウス的プロセス回帰タスクの相互作用の重要な文脈で徹底的に検討する。合成データと実データの両方による実験的な証拠は、BTLフレームワークが転送時の堅牢性を保ちながら、誤特定のモデル化にも堅牢であることを示す。 Bayesian transfer learning (BTL) is defined in this paper as the task of conditioning a target probability distribution on a transferred source distribution. The target globally models the interaction between the source and target, and conditions on a probabilistic data predictor made available by an independent local source modeller. Fully probabilistic design is adopted to solve this optimal decision-making problem in the target. By successfully transferring higher moments of the source, the target can reject unreliable source knowledge (i.e. it achieves robust transfer). This dual-modeller framework means that the source's local processing of raw data into a transferred predictive distribution -- with compressive possibilities -- is enriched by (the possible expertise of) the local source model. In addition, the introduction of the global target modeller allows correlation between the source and target tasks -- if known to the target -- to be accounted for. Important consequences emerge. Firstly, the new scheme attains the performance of fully modelled (i.e. conventional) multitask learning schemes in (those rare) cases where target model misspecification is avoided. Secondly, and more importantly, the new dual-modeller framework is robust to the model misspecification that undermines conventional multitask learning. We thoroughly explore these issues in the key context of interacting Gaussian process regression tasks. Experimental evidence from both synthetic and real data settings validates our technical findings: that the proposed BTL framework enjoys robustness in transfer while also being robust to model misspecification.	翻訳日:2021-03-27 05:55:24 公開日:2021-01-18
# 群流予測のための複数モード間の不均一関係のモデル化 Modeling Heterogeneous Relations across Multiple Modes for Potential Crowd Flow Prediction ( http://arxiv.org/abs/2101.06954v1 ) ライセンス: Link先を確認	Qiang Zhou, Jingjing Gu, Xinjiang Lu, Fuzhen Zhuang, Yanchao Zhao, Qiuhong Wang, Xiao Zhang	(参考訳) 都市計画立案者や管理者にとって、新しい交通施設のクラウドフロー予測は基本的な課題である。直感的には、近くの場所を探索することで、新しい来訪地の潜在的な群集の流れを示唆することができる。しかし、近隣の交通モード(例:交通モード)は異なる。バスの駅、自転車の駅など)は対象地(例)と異なる場合がある。地下鉄の駅) 深刻なデータ不足の問題を引き起こしますそこで本研究では,新しい計画サイトにおいて,クラウドフローを予測可能なデータ駆動型手法であるmoherを提案する。具体的には,まず,地理的な近接性と都市機能の類似性を調べることにより,対象地近傍の地域を識別する。次に,これらの不均一な関係を集約するために,相関だけでなく,異なる移動モード間の差異も学習可能な,新しい関係特異的変換モデルである交叉モード関係性gcnを考案する。その後,誘導電位流表現のためのアグリゲータを設計する。最後に、LTSMモジュールがシーケンシャルフロー予測に使用される。実世界のデータセットに関する大規模な実験は、最先端のアルゴリズムと比較してMOHERフレームワークの優位性を示している。 Potential crowd flow prediction for new planned transportation sites is a fundamental task for urban planners and administrators. Intuitively, the potential crowd flow of the new coming site can be implied by exploring the nearby sites. However, the transportation modes of nearby sites (e.g. bus stations, bicycle stations) might be different from the target site (e.g. subway station), which results in severe data scarcity issues. To this end, we propose a data driven approach, named MOHER, to predict the potential crowd flow in a certain mode for a new planned site. Specifically, we first identify the neighbor regions of the target site by examining the geographical proximity as well as the urban function similarity. Then, to aggregate these heterogeneous relations, we devise a cross-mode relational GCN, a novel relation-specific transformation model, which can learn not only the correlations but also the differences between different transportation modes. Afterward, we design an aggregator for inductive potential flow representation. Finally, an LTSM module is used for sequential flow prediction. Extensive experiments on real-world data sets demonstrate the superiority of the MOHER framework compared with the state-of-the-art algorithms.	翻訳日:2021-03-27 05:55:03 公開日:2021-01-18
# ほぼ一定ピークメモリ使用量を持つディープコントラスト学習バッチサイズのスケーリング Scaling Deep Contrastive Learning Batch Size with Almost Constant Peak Memory Usage ( http://arxiv.org/abs/2101.06983v1 ) ライセンス: Link先を確認	Luyu Gao, Yunyi Zhang	(参考訳) コントラスト学習は、テキストや画像などの様々な形式のデータの数値ベクトル表現の学習に成功している。学習エンコーダは、多くの下流タスクに汎用的な転送能力を示す。表現に基づく検索は最先端のパフォーマンスで非常に効率的である。従来の研究では、高品質な表現を学習するには、対照的な損失に多くの否定が必要であることが示されていた。実際には、バッチ内の各例について、他のバッチサンプルの正を負とみなし、余分な負のエンコーディングを避ける、バッチ内の負のテクニックが使用される。しかし、これは依然としてすべてのバッチの例で各例の損失を条件としており、大規模なバッチ全体をgpuメモリに適合させる必要がある。本稿では,コントラスト損失とエンコーダ間のバック伝搬を分離する再計算手法を提案する。その結果、グラデーションはバッチの1つのサブセットに対して一度に計算でき、異なるサイズのバッチに対するGPUメモリ使用量がほぼ一定になる。 Contrastive learning has been applied successfully to learn numerical vector representations of various forms of data, such as texts and images. Learned encoders exhibit versatile transfer capabilities to many downstream tasks. Representation based search is highly efficient with state-of-the-art performance. Previous researches demonstrated that learning high-quality representations requires a large number of negatives in contrastive loss. In practice, the technique of in-batch negative is used, where for each example in a batch, other batch examples' positives will be taken as its negatives, avoiding encoding extra negatives. This, however, still conditions each example's loss on all batch examples and requires fitting the entire large batch into GPU memory. This paper introduces a re-computation technique that decouples back propagation between contrastive loss and the encoder, removing encoder backward pass data dependency along the batch dimension. As a result, gradients can be computed for one subset of the batch at a time, leading to an almost constant peak GPU memory usage for batches of different sizes.	翻訳日:2021-03-27 05:54:49 公開日:2021-01-18
# 継承状態とゴール依存値の学習:数学的視点 Learning Successor States and Goal-Dependent Values: A Mathematical Viewpoint ( http://arxiv.org/abs/2101.07123v1 ) ライセンス: Link先を確認	L\'eonard Blier, Corentin Tallec, Yann Ollivier	(参考訳) 強化学習では、時間差に基づくアルゴリズムはサンプル非効率であり、例えば、スパース報酬の場合、報酬が観察されるまで学習は行われない。これは、環境のモデルや後継状態といったよりリッチなオブジェクトを学習することで解決できる。後継状態は、ある政策の任意の状態から期待される将来の状態占有度をモデル化し、任意の状態に到達する方法を学習するゴール依存値関数と関連付ける。我々は,後続状態と目標依存値関数学習のための時間差アルゴリズムを,離散環境,あるいは関数近似を伴う連続環境に対して形式的に導出する。特に,有限分散推定器を連続環境においても提供し,目標状態に正確に到達する報酬は無限にスパースする。後続状態はベルマン方程式以上のものを満たす: 後方のベルマン作用素とベルマン・ニュートン作用素は環境中の経路構成性を符号化する。 BN作用素は二階勾配降下法に似ており、より多くの観測値を得るときの値関数の真の更新を提供する。表の場合と無限小の学習率では、通常のベルマン作用素と後方のベルマン作用素を混合することで漸近収束の固有値が向上し、BN作用素の漸近収束はTDよりも確率的に良い。しかし、bn法はサンプリングノイズに対してより複雑でロバストではない。最後に、後続状態のフォワードバックワード(fb)有限ランクパラメータ化は、分散の低減とsamplabilityの改善を享受し、値関数の直接モデルを提供し、長距離依存性に対応する不動点を完全に理解し、bn法を近似し、副産物として状態の2つの標準表現を提供する。 In reinforcement learning, temporal difference-based algorithms can be sample-inefficient: for instance, with sparse rewards, no learning occurs until a reward is observed. This can be remedied by learning richer objects, such as a model of the environment, or successor states. Successor states model the expected future state occupancy from any given state for a given policy and are related to goal-dependent value functions, which learn how to reach arbitrary states. We formally derive the temporal difference algorithm for successor state and goal-dependent value function learning, either for discrete or for continuous environments with function approximation. Especially, we provide finite-variance estimators even in continuous environments, where the reward for exactly reaching a goal state becomes infinitely sparse. Successor states satisfy more than just the Bellman equation: a backward Bellman operator and a Bellman-Newton (BN) operator encode path compositionality in the environment. The BN operator is akin to second-order gradient descent methods and provides the true update of the value function when acquiring more observations, with explicit tabular bounds. In the tabular case and with infinitesimal learning rates, mixing the usual and backward Bellman operators provably improves eigenvalues for asymptotic convergence, and the asymptotic convergence of the BN operator is provably better than TD, with a rate independent from the environment. However, the BN method is more complex and less robust to sampling noise. Finally, a forward-backward (FB) finite-rank parameterization of successor states enjoys reduced variance and improved samplability, provides a direct model of the value function, has fully understood fixed points corresponding to long-range dependencies, approximates the BN method, and provides two canonical representations of states as a byproduct.	翻訳日:2021-03-27 05:54:16 公開日:2021-01-18
# 不確実性下におけるfNIRSデータの分類:ベイズニューラルネットワークによるアプローチ Classification of fNIRS Data Under Uncertainty: A Bayesian Neural Network Approach ( http://arxiv.org/abs/2101.07128v1 ) ライセンス: Link先を確認	Talha Siddique and Md Shaad Mahmud	(参考訳) 機能近赤外分光法(FNIRS)は脳-コンピュータインタフェース(BCI)の非侵襲型である。脳血行動態のイメージングに用いられ、他の類似した技術よりも生じる特定のプロースによって人気を博している。全体的な機能には、脳信号の捕捉、処理、分類が含まれる。血行動態の反応は生理的ノイズによって汚染されるため、過去の文献では、焦点の反応を望ましくないものから分類するためにいくつかの方法が採用されている。しかし、これまでの方法では、データやモデルパラメータの不確実性は考慮されていない。本稿では,ベイズ型ニューラルネットワーク(bnn)を用いて,一方的指タッピング(左右指タッピング)からなるオープンアクセスデータセットのバイナリ分類を行う。 BNNはベイズ統計を用いて、点推定の代わりに確率分布をネットワーク重みに割り当てる。このように分類を行いながら、データとモデルの不確実性を考慮に入れます。モデルのトレーニングには変分推論(VI)を使用しました。我々のモデルでは30名以上のボランティアに対して86.44%の総合的な分類精度が得られた。我々は、モデルのエビデンス下限(elbo)関数がイテレーションでどのように収束するかを示した。さらに,重量の後方分布のサンプリング中に生じる不確実性について考察した。また,1人のボランティアによるテストデータを用いて,BNN分類器のROC曲線を生成し,AUCスコアが0.855である。 Functional Near-Infrared Spectroscopy (fNIRS) is a non-invasive form of Brain-Computer Interface (BCI). It is used for the imaging of brain hemodynamics and has gained popularity due to the certain pros it poses over other similar technologies. The overall functionalities encompass the capture, processing and classification of brain signals. Since hemodynamic responses are contaminated by physiological noises, several methods have been implemented in the past literature to classify the responses in focus from the unwanted ones. However, the methods, thus far does not take into consideration the uncertainty in the data or model parameters. In this paper, we use a Bayesian Neural Network (BNN) to carry out a binary classification on an open-access dataset, consisting of unilateral finger tapping (left- and right-hand finger tapping). A BNN uses Bayesian statistics to assign a probability distribution to the network weights instead of a point estimate. In this way, it takes data and model uncertainty into consideration while carrying out the classification. We used Variational Inference (VI) to train our model. Our model produced an overall classification accuracy of 86.44% over 30 volunteers. We illustrated how the evidence lower bound (ELBO) function of the model converges over iterations. We further illustrated the uncertainty that is inherent during the sampling of the posterior distribution of the weights. We also generated a ROC curve for our BNN classifier using test data from a single volunteer and our model has an AUC score of 0.855.	翻訳日:2021-03-27 05:53:42 公開日:2021-01-18
# 絡み合う重みの安定回復 : 極小サンプルからの深層ニューラルネットワークのロバスト同定に向けて Stable Recovery of Entangled Weights: Towards Robust Identification of Deep Neural Networks from Minimal Samples ( http://arxiv.org/abs/2101.07150v1 ) ライセンス: Link先を確認	Christian Fiedler, Massimo Fornasier, Timo Klock, and Michael Rauchensteiner	(参考訳) 本稿では,有限個の入力出力サンプルから,ピラミッド形状とスムーズな活性化関数を有する汎用深層ニューラルネットワークの特異かつ安定した識別性の問題にアプローチする。より具体的には、活性化関数とそのシフトに応じて、適切な対角行列および可逆行列と交差する連続層の重みを構成するいわゆる絡み合い重みを導入する。エンタングル重みは、ネットワークの$\mathcal o(d^2 \times m)$非適応入出力サンプルが収集され、$d$が入力次元、$m$がネットワークのニューロン数であることが証明される。さらに、このアプローチは最大$\mathcal o(d \times m_l)$ニューロンのネットワークに適用され、ここで$m_l$は層$l$の出力ニューロンの数である。エンタングル重みの層割り当てと、最小二乗でさらにヒューリスティックに得られるかもしれないスケーリングとシフトパラメータの残差に関する知識により、エンタングル重みはネットワークを完全に一意的に識別する。絡み合った重みの安定回復に関する理論的結果の妥当性を明らかにするため, 一般化重み付き多層ネットワークを頑健に同定し, 提案したアルゴリズムパイプラインにより一様に近似できることを示す数値実験を行った。対照的に、バックプロパゲーションはこの設定では安定に一般化することができず、常に比較的大きな均一誤差によって制限される。本研究は,入力出力情報をネットワークパラメータに一意かつ安定的に関連付けることができ,説明可能性の一形態を提供する。さらに, 過パラメータ化ネットワークの圧縮や, 最小複雑性ネットワークのトレーニングを行う方法を提案する。 In this paper we approach the problem of unique and stable identifiability of generic deep artificial neural networks with pyramidal shape and smooth activation functions from a finite number of input-output samples. More specifically we introduce the so-called entangled weights, which compose weights of successive layers intertwined with suitable diagonal and invertible matrices depending on the activation functions and their shifts. We prove that entangled weights are completely and stably approximated by an efficient and robust algorithm as soon as $\mathcal O(D^2 \times m)$ nonadaptive input-output samples of the network are collected, where $D$ is the input dimension and $m$ is the number of neurons of the network. Moreover, we empirically observe that the approach applies to networks with up to $\mathcal O(D \times m_L)$ neurons, where $m_L$ is the number of output neurons at layer $L$. Provided knowledge of layer assignments of entangled weights and of remaining scaling and shift parameters, which may be further heuristically obtained by least squares, the entangled weights identify the network completely and uniquely. To highlight the relevance of the theoretical result of stable recovery of entangled weights, we present numerical experiments, which demonstrate that multilayered networks with generic weights can be robustly identified and therefore uniformly approximated by the presented algorithmic pipeline. In contrast backpropagation cannot generalize stably very well in this setting, being always limited by relatively large uniform error. In terms of practical impact, our study shows that we can relate input-output information uniquely and stably to network parameters, providing a form of explainability. Moreover, our method paves the way for compression of overparametrized networks and for the training of minimal complexity networks.	翻訳日:2021-03-27 05:53:24 公開日:2021-01-18
# 不均一共有交通空間における道路利用者の運動モデルの一般化可能性について On the Generalizability of Motion Models for Road Users in Heterogeneous Shared Traffic Spaces ( http://arxiv.org/abs/2101.06974v1 ) ライセンス: Link先を確認	Fatema T. Johora, Dongfang Yang, J\"org P. M\"uller, and \"Umit \"Ozg\"uner	(参考訳) 混合交通運動と相互作用のモデル化は,将来の都市部の安全性,効率,実現可能性を評価する上で重要である。交通規制の欠如、多様な輸送モード、共有空間のような混合交通ゾーンの動的な性質は、そのような環境の現実的なモデリングを困難にしている。本稿では, 動作モデルの一般化性, すなわち, 既存の作業に欠けている, 異なる環境条件下で現実的な行動を生成する能力に焦点を当てる。具体的には, 一般的な運動モデルを定式化し, このプロセスの応用として, ゲーム理論社会力モデル(GSFM, Game-theoretic Social Force Model)を, 異なる共有空間から歩行者や車の多種多様な運動行動を生成する汎用モデルへと拡張する。第2の貢献は、個人歩行者の運動関連特徴を調整し、グループ化することで、歩行者の異なる動きパターンを検討することである。 2つのクラスタリングアプローチを分析した。このモデルのキャリブレーションと評価は、3つの異なる共有空間データセット上で行われる。その結果, 本モデルでは, 様々な動作行動やインタラクションシナリオを現実的にシミュレートでき, モデルに歩行者の異なる動きパターンを加えることで, その性能が向上することがわかった。 Modeling mixed-traffic motion and interactions is crucial to assess safety, efficiency, and feasibility of future urban areas. The lack of traffic regulations, diverse transport modes, and the dynamic nature of mixed-traffic zones like shared spaces make realistic modeling of such environments challenging. This paper focuses on the generalizability of the motion model, i.e., its ability to generate realistic behavior in different environmental settings, an aspect which is lacking in existing works. Specifically, our first contribution is a novel and systematic process of formulating general motion models and application of this process is to extend our Game-Theoretic Social Force Model (GSFM) towards a general model for generating a large variety of motion behaviors of pedestrians and cars from different shared spaces. Our second contribution is to consider different motion patterns of pedestrians by calibrating motion-related features of individual pedestrian and clustering them into groups. We analyze two clustering approaches. The calibration and evaluation of our model are performed on three different shared space data sets. The results indicate that our model can realistically simulate a wide range of motion behaviors and interaction scenarios, and that adding different motion patterns of pedestrians into our model improves its performance.	翻訳日:2021-03-27 05:52:49 公開日:2021-01-18
# クロスモダリティ医療画像セグメンテーションのためのDeep Symmetric Adaptation Network Deep Symmetric Adaptation Network for Cross-modality Medical Image Segmentation ( http://arxiv.org/abs/2101.06853v1 ) ライセンス: Link先を確認	Xiaoting Han, Lei Qi, Qian Yu, Ziqi Zhou, Yefeng Zheng, Yinghuan Shi, Yang Gao	(参考訳) 非教師なし領域適応 (UDA) 法は, 医療画像分割作業において有望な性能を示した。これらの典型的な手法は通常、翻訳ネットワークを使用して、ソースドメインからターゲットドメインへの画像変換や、翻訳されたソースイメージと元のターゲットイメージのみを使用してピクセルレベルの分類器をトレーニングする。しかし、ソースドメインとターゲットドメインの間に大きなドメインシフトが存在する場合、この非対称構造はドメインギャップを完全に排除することができない。本稿では,セグメンテーションサブネットワークと2つの対称ソースとターゲットドメイン翻訳サブネットワークからなる医用画像セグメンテーションのための,udaの新たな深対称アーキテクチャを提案する。具体的には,2つのサブネットワークをベースとして,共有エンコーダとプライベートデコーダによる双方向アライメント方式を導入し,1)ソースからターゲットドメイン,2)ターゲットドメインからソースドメインへのアライメントを同時に行うことにより,ドメイン間の差異を効果的に緩和する。さらに,セグメンテーションサブネットワークにおいて,元のターゲット画像と翻訳されたソース画像だけでなく,元のソース画像と翻訳されたターゲット画像を用いて画素レベルの分類器を訓練し,異なるスタイルの画像からの意味情報を十分に活用する。拡張実験により,Cardiac と BraTS のセグメンテーションタスクにおける最先端手法と比較して,本手法が顕著に優れていることが示された。 Unsupervised domain adaptation (UDA) methods have shown their promising performance in the cross-modality medical image segmentation tasks. These typical methods usually utilize a translation network to transform images from the source domain to target domain or train the pixel-level classifier merely using translated source images and original target images. However, when there exists a large domain shift between source and target domains, we argue that this asymmetric structure could not fully eliminate the domain gap. In this paper, we present a novel deep symmetric architecture of UDA for medical image segmentation, which consists of a segmentation sub-network, and two symmetric source and target domain translation sub-networks. To be specific, based on two translation sub-networks, we introduce a bidirectional alignment scheme via a shared encoder and private decoders to simultaneously align features 1) from source to target domain and 2) from target to source domain, which helps effectively mitigate the discrepancy between domains. Furthermore, for the segmentation sub-network, we train a pixel-level classifier using not only original target images and translated source images, but also original source images and translated target images, which helps sufficiently leverage the semantic information from the images with different styles. Extensive experiments demonstrate that our method has remarkable advantages compared to the state-of-the-art methods in both cross-modality Cardiac and BraTS segmentation tasks.	翻訳日:2021-03-27 05:52:28 公開日:2021-01-18
# ディープニューラルネットワークと信仰機能を用いたCovid-19分類 Covid-19 classification with deep neural network and belief functions ( http://arxiv.org/abs/2101.06958v1 ) ライセンス: Link先を確認	Ling Huang, Su Ruan, Thierry Denoeux	(参考訳) CT画像は、放射線医がCovid-19を診断するのに有用な情報を提供する。しかし,CTスキャンの視覚的解析には時間を要する。したがってCT画像からCovid-19を自動的に検出するアルゴリズムを開発する必要がある。本稿では,コビッド19症例を検出するための半教師付きトレーニングを用いた信念関数に基づく畳み込みニューラルネットワークを提案する。本手法はまず,深い特徴を抽出し,信頼度マップにマップし,最終的な分類決定を行う。我々の結果は従来のディープラーニングに基づく分類モデルよりも信頼性が高く説明しやすい。実験の結果,0.81,f10.812,auc0.875の精度で良好な性能を得ることができた。 Computed tomography (CT) image provides useful information for radiologists to diagnose Covid-19. However, visual analysis of CT scans is time-consuming. Thus, it is necessary to develop algorithms for automatic Covid-19 detection from CT images. In this paper, we propose a belief function-based convolutional neural network with semi-supervised training to detect Covid-19 cases. Our method first extracts deep features, maps them into belief degree maps and makes the final classification decision. Our results are more reliable and explainable than those of traditional deep learning-based classification models. Experimental results show that our approach is able to achieve a good performance with an accuracy of 0.81, an F1 of 0.812 and an AUC of 0.875.	翻訳日:2021-03-27 05:52:03 公開日:2021-01-18
# サイクリックリバースジェネレータを用いた反復顔画像インペインティング Iterative Facial Image Inpainting using Cyclic Reverse Generator ( http://arxiv.org/abs/2101.07036v1 ) ライセンス: Link先を確認	Yahya Dogan and Hacer Yalim Keles	(参考訳) 顔画像のインペインティングは、顔にマスクされたキーコンポーネント(例えば目と鼻)の意味情報を含む新しいピクセルを生成する必要があるため、難しい問題である。近年,この分野では注目すべき手法が提案されている。これらのアプローチのほとんどはエンコーダ-デコーダアーキテクチャを使用しており、与えられた画像と特定のマスクのユニークな結果を可能にするといった制限がある。あるいは、ジェネレータネットワークと異なるマスクを使って有望な結果を生み出すアプローチもある。しかしながら、これらのアプローチは最適化ベースであり、通常多くのイテレーションを必要とする。本稿では, 循環逆生成器(cyclic reverse generator, crg)アーキテクチャを用いた, コーダ生成器モデルによる顔画像描画問題に対する効率的な解法を提案する。このエンコーダを用いて、生成領域に所定の画像を埋め込み、可算な画像が生成されるまでマスク領域を段階的に塗りつぶし、反復中に生成された画像を評価するために判別器ネットワークを利用する。提案したモデルを用いて実写画像を生成するには,数イテレーションで十分であることがわかった。生成プロセスの後、ポスト処理では、マスク境界に近いアーティファクトを修復するために、このタスクのために特別に訓練したUnetモデルを使用します。本手法では,様々なマスクタイプを用いてスケッチベースのインペインティングを適用でき,多種多様な結果が得られる。我々は,この手法を最先端のモデルと比較し,全てのマスクタイプにおいて他のモデルと競合することを観察した。 Facial image inpainting is a challenging problem as it requires generating new pixels that include semantic information for masked key components in a face, e.g., eyes and nose. Recently, remarkable methods have been proposed in this field. Most of these approaches use encoder-decoder architectures and have different limitations such as allowing unique results for a given image and a particular mask. Alternatively, some approaches generate promising results using different masks with generator networks. However, these approaches are optimization-based and usually require quite a number of iterations. In this paper, we propose an efficient solution to the facial image painting problem using the Cyclic Reverse Generator (CRG) architecture, which provides an encoder-generator model. We use the encoder to embed a given image to the generator space and incrementally inpaint the masked regions until a plausible image is generated; a discriminator network is utilized to assess the generated images during the iterations. We empirically observed that only a few iterations are sufficient to generate realistic images with the proposed model. After the generation process, for the post processing, we utilize a Unet model that we trained specifically for this task to remedy the artifacts close to the mask boundaries. Our method allows applying sketch-based inpaintings, using variety of mask types, and producing multiple and diverse results. We qualitatively compared our method with the state-of-the-art models and observed that our method can compete with the other models in all mask types; it is particularly better in images where larger masks are utilized.	翻訳日:2021-03-27 05:51:39 公開日:2021-01-18
# 共有潜在空間表現を用いた大腸内視鏡ビデオにおける欠損面の可視化 Visualizing Missing Surfaces In Colonoscopy Videos using Shared Latent Space Representations ( http://arxiv.org/abs/2101.07280v1 ) ライセンス: Link先を確認	Shawn Mathew, Saad Nadeem and Arie Kaufman	(参考訳) 最も普及している大腸癌スクリーニングツールである光大腸内視鏡(oc)は、大腸の形状(水平折りたたみや鋭い屈曲)、内科医の経験不足や疲労、内視鏡の視野などを含む多くの要因により、ミス率が高い。大腸内視鏡検査中にフレーム当たりの欠落領域を可視化する枠組みを提示し,有効な臨床ソリューションを提供する。具体的には、3D再構成仮想大腸内視鏡(VC)データと、VCとOCが同じ形状を共有しているが、OCドメインに埋め込まれた色、テクスチャ、スペキュラリフレクションが異なるという知見を用いる。 OCとVCのための強制的共有潜在空間を伴って、損失のない画像から画像への変換モデルを導入する。この共有潜在空間は、追加のガウス雑音入力に対して色、テクスチャ、スペック情報の生成を遅らせながら幾何情報をキャプチャする。この追加ノイズ入力を使用して、VCからOC、OCからOCへの1対多マッピングを生成することができる。 Optical colonoscopy (OC), the most prevalent colon cancer screening tool, has a high miss rate due to a number of factors, including the geometry of the colon (haustral fold and sharp bends occlusions), endoscopist inexperience or fatigue, endoscope field of view, etc. We present a framework to visualize the missed regions per-frame during the colonoscopy, and provides a workable clinical solution. Specifically, we make use of 3D reconstructed virtual colonoscopy (VC) data and the insight that VC and OC share the same underlying geometry but differ in color, texture and specular reflections, embedded in the OC domain. A lossy unpaired image-to-image translation model is introduced with enforced shared latent space for OC and VC. This shared latent space captures the geometric information while deferring the color, texture, and specular information creation to additional Gaussian noise input. This additional noise input can be utilized to generate one-to-many mappings from VC to OC and OC to OC.	翻訳日:2021-03-27 05:51:17 公開日:2021-01-18
# 摂動勾配降下の微分的にプライベートな性質について On the Differentially Private Nature of Perturbed Gradient Descent ( http://arxiv.org/abs/2101.06847v1 ) ライセンス: Link先を確認	Thulasi Tholeti, Sheetal Kalyani	(参考訳) 勾配降下アルゴリズムを用いて,データベースを与えられた経験的リスク最小化の問題を考える。最適化される関数は、アルゴリズムの収束を妨げる鞍点からなる非凸であるかもしれないことに注意する。摂動勾配降下アルゴリズムは典型的にはこれらのサドル点から逃れるために用いられる。勾配を乱すこのアルゴリズムは本質的にデータのプライバシーを保っていることを示す。次に、得られたプライバシーを定量化するために、差分プライバシーフレームワークを使用します。また,問題次元やデータベース間の距離といったパラメータによって,プライバシーの変化を分析する。 We consider the problem of empirical risk minimization given a database, using the gradient descent algorithm. We note that the function to be optimized may be non-convex, consisting of saddle points which impede the convergence of the algorithm. A perturbed gradient descent algorithm is typically employed to escape these saddle points. We show that this algorithm, that perturbs the gradient, inherently preserves the privacy of the data. We then employ the differential privacy framework to quantify the privacy hence achieved. We also analyze the change in privacy with varying parameters such as problem dimension and the distance between the databases.	翻訳日:2021-03-27 05:50:39 公開日:2021-01-18
# GraphAttacker: 一般的なマルチタスクGraphAttackフレームワーク GraphAttacker: A General Multi-Task GraphAttack Framework ( http://arxiv.org/abs/2101.06855v1 ) ライセンス: Link先を確認	Jinyin Chen, Dunjie Zhang, Zhaoyan Ming and Kejie Huang	(参考訳) グラフニューラルネットワーク(GNN)は多くの実世界のアプリケーションでグラフ解析タスクにうまく活用されている。しかしながら、GNNは攻撃者によって生成された敵のサンプルによって課される潜在的なセキュリティ上の問題があり、ほとんど知覚不能な摂動を伴う攻撃性能を達成している。これらの攻撃者の幅広い適用を制限するのは、ノード分類やリンク予測のような特定のグラフ分析タスクに対する手法の特異性である。そこで我々は,グラフ解析タスクに従って,構造や攻撃戦略を柔軟に調整できる新しい汎用グラフ攻撃フレームワークであるGraphAttackerを提案する。 GAN(Generative Adversarial Network)に基づいて、GraphAttackerは、3つの主要なコンポーネント、MAG(Multi-strategy Attack Generator)、SD(Simisity Discriminator)、AD(Attatity Discriminator)を交互にトレーニングすることで、敵のサンプルを生成する。さらに,摂動予算内で攻撃者を実現するために,ノード間の類似性を定量化する新しい類似性修正率(smr)を提案する。本研究では,ノード分類,グラフ分類,リンク予測のグラフ解析タスクにおいて,GraphAttackerが最先端攻撃性能を達成可能であることを示す。さらに,各タスクのユニークな特性と,それらの応答を統一攻撃フレームワークで分析する。将来の攻撃研究のためのオープンソースのシミュレーションプラットフォームとして、GraphAttackerをリリースします。 Graph Neural Networks (GNNs) have been successfully exploited in graph analysis tasks in many real-world applications. However, GNNs have been shown to have potential security issues imposed by adversarial samples generated by attackers, which achieved great attack performance with almost imperceptible perturbations. What limit the wide application of these attackers are their methods' specificity on a certain graph analysis task, such as node classification or link prediction. We thus propose GraphAttacker, a novel generic graph attack framework that can flexibly adjust the structures and the attack strategies according to the graph analysis tasks. Based on the Generative Adversarial Network (GAN), GraphAttacker generates adversarial samples through alternate training on three key components, the Multi-strategy Attack Generator (MAG), the Similarity Discriminator (SD), and the Attack Discriminator(AD). Furthermore, to achieve attackers within perturbation budget, we propose a novel Similarity Modification Rate (SMR) to quantify the similarity between nodes thus constrain the attack budget. We carry out extensive experiments and the results show that GraphAttacker can achieve state-of-the-art attack performance on graph analysis tasks of node classification, graph classification, and link prediction. Besides, we also analyze the unique characteristics of each task and their specific response in the unified attack framework. We will release GraphAttacker as an open-source simulation platform for future attack researches.	翻訳日:2021-03-27 05:50:33 公開日:2021-01-18
# まばらなオンライン学習のためのスクリーニング Screening for Sparse Online Learning ( http://arxiv.org/abs/2101.06982v1 ) ライセンス: Link先を確認	Jingwei Liang and Clarice Poon	(参考訳) 正規化を促進させるスパーシティは、低複雑さ構造(例)を課すために広く用いられている。 l1-norm for sparsity) は教師あり学習の回帰係数である。決定論的最適化の領域では、反復アルゴリズム(近位勾配降下など)によって生成されたシーケンスは「有限のアクティビティ識別」を示す。しかし、ほとんどのオンラインアルゴリズム(近確率勾配降下など)は、消滅するステップサイズと非消滅的な分散のために性質を持たない。本稿では,スクリーニングルールと組み合わせることで,オンラインアルゴリズムが生成するイテレートの不要な特徴を解消し,有限なアクティビティ識別を実現する方法を示す。その結果、任意の収束オンラインアルゴリズムと組み合わせることで、正規化器によって課される空間特性を計算利得に利用することができる。数値的には、大きな加速が得られる。 Sparsity promoting regularizers are widely used to impose low-complexity structure (e.g. l1-norm for sparsity) to the regression coefficients of supervised learning. In the realm of deterministic optimization, the sequence generated by iterative algorithms (such as proximal gradient descent) exhibit "finite activity identification", namely, they can identify the low-complexity structure in a finite number of iterations. However, most online algorithms (such as proximal stochastic gradient descent) do not have the property owing to the vanishing step-size and non-vanishing variance. In this paper, by combining with a screening rule, we show how to eliminate useless features of the iterates generated by online algorithms, and thereby enforce finite activity identification. One consequence is that when combined with any convergent online algorithm, sparsity properties imposed by the regularizer can be exploited for computational gains. Numerically, significant acceleration can be obtained.	翻訳日:2021-03-27 05:50:07 公開日:2021-01-18
# 接続特徴と畳み込みニューラルネットワークを用いた感情脳波分類 Emotional EEG Classification using Connectivity Features and Convolutional Neural Networks ( http://arxiv.org/abs/2101.07069v1 ) ライセンス: Link先を確認	Seong-Eun Moon, Chun-Jui Chen, Cho-Jui Hsieh, Jane-Ling Wang, Jong-Seok Lee	(参考訳) 畳み込みニューラルネットワーク(CNN)は脳波(EEG)信号を通じてユーザの状態を認識するために広く用いられている。前回の研究では、脳波信号は通常、高次元の生データによってcnnに供給される。しかし,本手法では,機能的脳ネットワークを記述し,ユーザの知覚状態を推定する上で有効な脳接続情報の活用が困難である。我々は,脳とCNNの接続を利用した新しい分類システムを導入し,その効果を3種類の接続手段を用いて感情映像分類によって検証する。さらに,連結行列を構成するための2つのデータ駆動手法を提案し,分類性能を最大化する。さらに分析した結果,対象映像の感情特性に関連する脳接続の濃度が分類性能と相関していることが判明した。 Convolutional neural networks (CNNs) are widely used to recognize the user's state through electroencephalography (EEG) signals. In the previous studies, the EEG signals are usually fed into the CNNs in the form of high-dimensional raw data. However, this approach makes it difficult to exploit the brain connectivity information that can be effective in describing the functional brain network and estimating the perceptual state of the user. We introduce a new classification system that utilizes brain connectivity with a CNN and validate its effectiveness via the emotional video classification by using three different types of connectivity measures. Furthermore, two data-driven methods to construct the connectivity matrix are proposed to maximize classification performance. Further analysis reveals that the level of concentration of the brain connectivity related to the emotional property of the target video is correlated with classification performance.	翻訳日:2021-03-27 05:49:53 公開日:2021-01-18
# 埋め込みのアライメントと安定性:測定と推論の改善 Alignment and stability of embeddings: measurement and inference improvement ( http://arxiv.org/abs/2101.07251v1 ) ライセンス: Link先を確認	Furkan G\"ursoy, Mounir Haddad, C\'ecile Bothorel	(参考訳) 表現学習(rl)法は、情報が距離によって保存されるオブジェクトの潜在埋め込みを学習する。距離はある種の線型変換に不変であるため、同じ情報を保持しながら異なる埋め込みが得られる。力学系では、埋め込みの時間的差はシステムの安定性や任意の変換による埋め込みの誤用によって説明できる。文献では、埋め込みアライメントは公式に定義されておらず、理論的に、あるいは経験的に分析されていない。ここでは埋め込みアライメントとその部分を調査し,最初の形式的定義を提供し,アライメントと安定性を測定するための新しい指標を提案し,合成実験を通じてそれらの適合性を示す。実世界の実験では、静的RL法と動的RL法の両方が不整合な埋め込みを生成する傾向があり、そのような不整合は動的ネットワーク推論タスクの性能を悪化させる。アライメントを確保することで、予測精度は静的で最大90%上昇し、動的RL法では40%上昇する。 Representation learning (RL) methods learn objects' latent embeddings where information is preserved by distances. Since distances are invariant to certain linear transformations, one may obtain different embeddings while preserving the same information. In dynamic systems, a temporal difference in embeddings may be explained by the stability of the system or by the misalignment of embeddings due to arbitrary transformations. In the literature, embedding alignment has not been defined formally, explored theoretically, or analyzed empirically. Here, we explore the embedding alignment and its parts, provide the first formal definitions, propose novel metrics to measure alignment and stability, and show their suitability through synthetic experiments. Real-world experiments show that both static and dynamic RL methods are prone to produce misaligned embeddings and such misalignment worsens the performance of dynamic network inference tasks. By ensuring alignment, the prediction accuracy raises by up to 90% in static and by 40% in dynamic RL methods.	翻訳日:2021-03-27 05:49:43 公開日:2021-01-18
# セキュアなマルチパーティ計算に基づく高速プライバシー保護テキスト分類 Fast Privacy-Preserving Text Classification based on Secure Multiparty Computation ( http://arxiv.org/abs/2101.07365v1 ) ライセンス: Link先を確認	Amanda Resende, Davis Railsback, Rafael Dowsley, Anderson C. A. Nascimento, Diego F. Aranha	(参考訳) 本稿では,プライバシ保存型ベイズ分類器を提案し,プライベートテキスト分類問題に適用する。この設定では、あるパーティー(アリス)がテキストメッセージを持ち、別のパーティー(ボブ)が分類器を持っている。プロトコルの最後には、Aliceはテキスト入力に適用される分類器の結果のみを学習し、Bobは何も学習しない。我々のソリューションはセキュアマルチパーティ計算(SMC)に基づいている。我々のRust実装は、構造化されていないテキストの分類のための高速でセキュアなソリューションを提供します。スパム検出の場合(ソリューションは汎用的であり、ネイブベイズ分類器が使用できる他のシナリオで使用することができる)にソリューションを適用することで、Bobのモデルの辞書サイズがすべての単語(n = 5200)を含み、AliceのSMSが少なくともm = 160ユニグラムである場合、SMSをスパムまたはハムとして340ms未満で分類することができる。 n = 369 および m = 8 の場合(データベース内のスパムSMSの平均値)、我々のソリューションは21msしか必要としない。 We propose a privacy-preserving Naive Bayes classifier and apply it to the problem of private text classification. In this setting, a party (Alice) holds a text message, while another party (Bob) holds a classifier. At the end of the protocol, Alice will only learn the result of the classifier applied to her text input and Bob learns nothing. Our solution is based on Secure Multiparty Computation (SMC). Our Rust implementation provides a fast and secure solution for the classification of unstructured text. Applying our solution to the case of spam detection (the solution is generic, and can be used in any other scenario in which the Naive Bayes classifier can be employed), we can classify an SMS as spam or ham in less than 340ms in the case where the dictionary size of Bob's model includes all words (n = 5200) and Alice's SMS has at most m = 160 unigrams. In the case with n = 369 and m = 8 (the average of a spam SMS in the database), our solution takes only 21ms.	翻訳日:2021-03-27 05:49:28 公開日:2021-01-18
# 歌声変換のための階層的不整合表現学習 Hierarchical disentangled representation learning for singing voice conversion ( http://arxiv.org/abs/2101.06842v1 ) ライセンス: Link先を確認	Naoya Takahashi, Mayank Kumar Singh, Yuki Mitsufuji	(参考訳) 従来の歌唱音声変換(SVC)法は、高次元データによる高解像度音声の操作に悩まされることが多い。本稿では,複数の解像度を持つ不連続表現を独立に学習できる階層表現学習を提案する。学習された不整合表現を用いて、提案手法は低解像度から高解像度のSVCを段階的に実行する。実験の結果,提案手法は平均評価スコア(MOS),類似度スコア,ピッチ精度において,単一解像度で動作するベースラインよりも優れていた。 Conventional singing voice conversion (SVC) methods often suffer from operating in high-resolution audio owing to a high dimensionality of data. In this paper, we propose a hierarchical representation learning that enables the learning of disentangled representations with multiple resolutions independently. With the learned disentangled representations, the proposed method progressively performs SVC from low to high resolutions. Experimental results show that the proposed method outperforms baselines that operate with a single resolution in terms of mean opinion score (MOS), similarity score, and pitch accuracy.	翻訳日:2021-03-27 05:49:08 公開日:2021-01-18
# 電力系統におけるサイバー攻撃検出のためのマルチソースデータフュージョン Multi-Source Data Fusion for Cyberattack Detection in Power Systems ( http://arxiv.org/abs/2101.06897v1 ) ライセンス: Link先を確認	Abhijeet Sahu and Zeyu Mao and Patrick Wlazlo and Hao Huang and Katherine Davis and Ana Goulart and Saman Zonouz	(参考訳) サイバー攻撃は早期に検出されない限り、電力システムに深刻な影響を与える可能性がある。しかしながら、重要なインフラストラクチャシステムにおける正確かつタイムリーな検出は、ゼロデイ脆弱性の搾取や、システムのサイバー物理的性質と、高い信頼性とレジリエンスの必要性によって、課題を呈している。産業制御システム(ICS)ネットワークにおけるゼロデイサイバー侵入を検出するには,従来のルールベースおよび異常ベース侵入検知システム(IDS)ツールが不十分である。そこで本研究では,複数のデータソースからの情報を融合することで,サイバーインシデントを識別し,偽陽性を低減できることを示す。具体的には,複数のデータソースによる核融合検出の正確な使用を防止するための障壁の認識と対処について述べる。我々は,実世界のデータソースをエミュレートする複数のセンサからサイバーや物理のデータを収集し,これらを合成して侵入を検知するアルゴリズムの機能とするサイバー・フィジカル・システムテストベッドにおいて,idをトレーニングするためのマルチソースデータ融合を行う。提案したデータ融合アプリケーションを用いてFalse DataとCommand InjectionベースのMan-in-The-Middle(MiTM)攻撃を推測する。 data fusionアプリケーションは、時間同期マージを使用して、idのパフォーマンスを評価するために教師付き、半教師付き、教師なしの学習モデルの前に、インプテーションやエンコーディングなどの前処理を行う。主な発見は、サイバー、セキュリティ、物理的ドメインの特徴の融合による検出精度の向上である。また,協調学習技術は,特徴を取り入れた指導的学習手法と同等に機能することを示した。 Cyberattacks can cause a severe impact on power systems unless detected early. However, accurate and timely detection in critical infrastructure systems presents challenges, e.g., due to zero-day vulnerability exploitations and the cyber-physical nature of the system coupled with the need for high reliability and resilience of the physical system. Conventional rule-based and anomaly-based intrusion detection system (IDS) tools are insufficient for detecting zero-day cyber intrusions in the industrial control system (ICS) networks. Hence, in this work, we show that fusing information from multiple data sources can help identify cyber-induced incidents and reduce false positives. Specifically, we present how to recognize and address the barriers that can prevent the accurate use of multiple data sources for fusion-based detection. We perform multi-source data fusion for training IDS in a cyber-physical power system testbed where we collect cyber and physical side data from multiple sensors emulating real-world data sources that would be found in a utility and synthesizes these into features for algorithms to detect intrusions. Results are presented using the proposed data fusion application to infer False Data and Command injection-based Man-in- The-Middle (MiTM) attacks. Post collection, the data fusion application uses time-synchronized merge and extracts features followed by pre-processing such as imputation and encoding before training supervised, semi-supervised, and unsupervised learning models to evaluate the performance of the IDS. A major finding is the improvement of detection accuracy by fusion of features from cyber, security, and physical domains. Additionally, we observed the co-training technique performs at par with supervised learning methods when fed with our features.	翻訳日:2021-03-27 05:47:50 公開日:2021-01-18
# 圧縮センシングアプリケーションを用いた未修正ReLUを用いたDNNネットワークの学習 Learning DNN networks using un-rectifying ReLU with compressed sensing application ( http://arxiv.org/abs/2101.06940v1 ) ライセンス: Link先を確認	Wen-Liang Hwang, Shih-Shuo Tung	(参考訳) 非修正技術は、データ依存変数として非線形のポイントワイドアクティベーション関数を表現し、その入力と出力と共にアクティベーション変数を全て最適化に利用することができる。この研究におけるReLUネットワークは未修正であり、活性化関数を方程式や制約の形でデータ依存のアクティベーション変数に置き換えることができる。不整合ReLUに関連するアクティベーション変数の離散的性質は、組合せ最適化の問題としてディープラーニング問題の再構成を可能にする。しかし,活性化変数の離散領域を閉区間に緩和することにより,組合せ最適化問題の最適解が維持可能であることを示す。これにより、実領域制約最適化のために開発された手法により、ネットワークの学習が容易になる。また、データ依存スラック変数を制約として導入することにより、拡張ラグランジアンアプローチに基づいてネットワークを最適化できることを示す。これは,理論上はグローバル収束を達成でき,全ての極限点が学習問題の臨界点であることを意味する。実験では,MNISTデータベースや自然画像に適用した場合,圧縮されたセンサリカバリ問題の解法により最先端の性能が得られた。 The un-rectifying technique expresses a non-linear point-wise activation function as a data-dependent variable, which means that the activation variable along with its input and output can all be employed in optimization. The ReLU network in this study was un-rectified means that the activation functions could be replaced with data-dependent activation variables in the form of equations and constraints. The discrete nature of activation variables associated with un-rectifying ReLUs allows the reformulation of deep learning problems as problems of combinatorial optimization. However, we demonstrate that the optimal solution to a combinatorial optimization problem can be preserved by relaxing the discrete domains of activation variables to closed intervals. This makes it easier to learn a network using methods developed for real-domain constrained optimization. We also demonstrate that by introducing data-dependent slack variables as constraints, it is possible to optimize a network based on the augmented Lagrangian approach. This means that our method could theoretically achieve global convergence and all limit points are critical points of the learning problem. In experiments, our novel approach to solving the compressed sensing recovery problem achieved state-of-the-art performance when applied to the MNIST database and natural images.	翻訳日:2021-03-27 05:47:21 公開日:2021-01-18
# 最適スイッチングレグレットによるオンラインキャッシング Online Caching with Optimal Switching Regret ( http://arxiv.org/abs/2101.07043v1 ) ライセンス: Link先を確認	Samrat Mukhopadhyay, Abhishek Sinha	(参考訳) オンライン学習の観点から、古典的なアンコードキャッシュ問題を考察する。限られたストレージ容量のキャッシュは、大きなカタログから一度に$c$ファイルを保持することができる。ユーザは、各タイムスロットでカタログから任意のファイルをリクエストする。ユーザからのファイル要求が到着する前に、キャッシュポリシーは、その選択した$c$ファイルでキャッシュをポピュレートする。キャッシュヒットの場合、ポリシーは単位報酬を受け取り、それ以外は報酬を受け取らない。それに加えて、キャッシュへのファイルのフェッチに関連するコストがあります。目的は、キャッシュヒットによる報酬とファイルフェッチによる切り替えコストの両方を考慮して、最小限の後悔を招くキャッシュポリシーを設計することである。この論文の主な貢献は、リーダーベースの永続キャッシングポリシーに従う場合の切替後悔分析であり、これは順番に最適な切替後悔を有することを示している。そこで本研究では,商用cdnサーバから入手可能なトレースを用いて,さまざまなキャッシュポリシのパフォーマンスを比較することにより,この問題に対する最もよく知られたスイッチング後悔を,$\theta(\sqrt{c})という係数で改善する。 We consider the classical uncoded caching problem from an online learning point-of-view. A cache of limited storage capacity can hold $C$ files at a time from a large catalog. A user requests an arbitrary file from the catalog at each time slot. Before the file request from the user arrives, a caching policy populates the cache with any $C$ files of its choice. In the case of a cache-hit, the policy receives a unit reward and zero rewards otherwise. In addition to that, there is a cost associated with fetching files to the cache, which we refer to as the switching cost. The objective is to design a caching policy that incurs minimal regret while considering both the rewards due to cache-hits and the switching cost due to the file fetches. The main contribution of this paper is the switching regret analysis of a Follow the Perturbed Leader-based anytime caching policy, which is shown to have an order optimal switching regret. In this pursuit, we improve the best-known switching regret bound for this problem by a factor of $\Theta(\sqrt{C}).$ We conclude the paper by comparing the performance of different popular caching policies using a publicly available trace from a commercial CDN server.	翻訳日:2021-03-27 05:47:04 公開日:2021-01-18
# 入力/出力トレースからハイブリッドオートマタを学習するためのパッシブオンライン手法 A Passive Online Technique for Learning Hybrid Automata from Input/Output Traces ( http://arxiv.org/abs/2101.07053v1 ) ライセンス: Link先を確認	Iman Saberi, Fathiyeh Faghih, Farzad Sobhi Bavil	(参考訳) 仕様合成は、システムの入力出力トレースからモデルを導出する過程である。テスト設計、リバースエンジニアリング、システム識別に広く使われている。サイバー物理システムにおけるこのプロセスの成果物の一つはハイブリッドオートマトンである。直感的で、正確で、ツールに依存し、抽象度が高く、離散変数と連続変数の両方のシステムをモデル化できる。本稿では,非線形サイバー物理システムの入力出力トレースからハイブリッドオートマトンを合成する新しい手法を提案する。非線形挙動における類似性検出は、そのようなモデルを抽出するための大きな課題である。動的時間ワープ技術を用いてこの問題に対処する。私たちのアプローチは受動的であり、ログされたトレースからのオートマトン合成の間、システムとのインタラクションは不要であり、オンラインでは、各入出力トレースが手順で1回だけ使用されることを意味する。言い換えれば、それぞれの新しいトレースは、既に合成されたオートマトンを改善するために使用できる。我々は,本アルゴリズムを2つの産業・シミュレーションケーススタディで評価した。導出オートマトンの精度は有望な結果を示す。 Specification synthesis is the process of deriving a model from the input-output traces of a system. It is used extensively in test design, reverse engineering, and system identification. One type of the resulting artifact of this process for cyber-physical systems is hybrid automata. They are intuitive, precise, tool independent, and at a high level of abstraction, and can model systems with both discrete and continuous variables. In this paper, we propose a new technique for synthesizing hybrid automaton from the input-output traces of a non-linear cyber-physical system. Similarity detection in non-linear behaviors is the main challenge for extracting such models. We address this problem by utilizing the Dynamic Time Warping technique. Our approach is passive, meaning that it does not need interaction with the system during automata synthesis from the logged traces; and online, which means that each input/output trace is used only once in the procedure. In other words, each new trace can be used to improve the already synthesized automaton. We evaluated our algorithm in two industrial and simulated case studies. The accuracy of the derived automata show promising results.	翻訳日:2021-03-27 05:46:47 公開日:2021-01-18
# PRESTO: 時間モチーフ数の厳密な近似のためのシンプルでスケーラブルなサンプリング手法 PRESTO: Simple and Scalable Sampling Techniques for the Rigorous Approximation of Temporal Motif Counts ( http://arxiv.org/abs/2101.07152v1 ) ライセンス: Link先を確認	Ilie Sarpe, Fabio Vandin	(参考訳) ネットワークモチーフと呼ばれる小さなグラフパターンの識別とカウントは、ソーシャルネットワークから神経科学まで、様々な分野のネットワーク分析における基本的な原始である。静的ネットワークにおけるモチーフの発生を数えるためにいくつかの技術が設計され、近年では大規模ネットワークによる計算課題に焦点が当てられている。現代のネットワークデータセットには、ネットワークエッジによってモデル化されたイベントが発生した時間など、豊富な情報が含まれている。時間的モチーフと呼ばれる時間的ネットワークにおけるモチーフの分析は、現代のネットワーク化されたデータセットの分析において重要な要素となっている。最近、時間的ネットワークにおける時間的モチーフのインスタンス数をカウントするために、いくつかの手法が設計されている。このような手法は厳密であり、大規模ネットワークには適用できないが、生産する推定値に対する弱い保証しか提供せず、大規模ネットワークにはスケールしない。本研究では,時間モチーフ数を厳密に近似するために,効率的でスケーラブルなアルゴリズムを提案する。アルゴリズムは単純だが効果的なサンプリング手法に基づいており、非常に大きなデータセットに対してアルゴリズムを実用的なものにしている。実験により,本アルゴリズムは,最先端サンプリングアルゴリズムよりも精度の高い時間的モチーフ数を推定し,正確なアプローチよりも実行時間が有意に低く,従来考えられていたより大きい時間的モチーフを数十億のエッジネットワーク上で研究することができることを示した。 The identification and counting of small graph patterns, called network motifs, is a fundamental primitive in the analysis of networks, with application in various domains, from social networks to neuroscience. Several techniques have been designed to count the occurrences of motifs in static networks, with recent work focusing on the computational challenges provided by large networks. Modern networked datasets contain rich information, such as the time at which the events modeled by the networks edges happened, which can provide useful insights into the process modeled by the network. The analysis of motifs in temporal networks, called temporal motifs, is becoming an important component in the analysis of modern networked datasets. Several methods have been recently designed to count the number of instances of temporal motifs in temporal networks, which is even more challenging than its counterpart for static networks. Such methods are either exact, and not applicable to large networks, or approximate, but provide only weak guarantees on the estimates they produce and do not scale to very large networks. In this work we present an efficient and scalable algorithm to obtain rigorous approximations of the count of temporal motifs. Our algorithm is based on a simple but effective sampling approach, which renders our algorithm practical for very large datasets. Our extensive experimental evaluation shows that our algorithm provides estimates of temporal motif counts which are more accurate than the state-of-the-art sampling algorithms, with significantly lower running time than exact approaches, enabling the study of temporal motifs, of size larger than the ones considered in previous works, on billion edges networks.	翻訳日:2021-03-27 05:46:32 公開日:2021-01-18
# 組込みLQRコントローラによる深層強化学習 Deep Reinforcement Learning with Embedded LQR Controllers ( http://arxiv.org/abs/2101.07175v1 ) ライセンス: Link先を確認	Wouter Caarls	(参考訳) 強化学習は、環境との直接相互作用を通じて制御ポリシーを最適化するモデルフリーの最適制御方法である。規制に終止符を打つタスクには、ゴール状態のチャタリングのため、一般的な離散アクションメソッドが適していない。強化学習と古典的lqr制御を組み合わせることで,この問題を解決する3つの方法を比較した。特に,LQR制御をアクションセットに統合し,学習力学に基づく場合のリプレイメモリにおける計算制御の一般化と修正を回避する手法を提案する。また,LQR制御を連続動作法に組み込む。いずれの場合においても、lqr制御の追加によりパフォーマンスが向上するが、個別のアクションセットの強化に使用できる場合、その効果はより深くなる。 Reinforcement learning is a model-free optimal control method that optimizes a control policy through direct interaction with the environment. For reaching tasks that end in regulation, popular discrete-action methods are not well suited due to chattering in the goal state. We compare three different ways to solve this problem through combining reinforcement learning with classical LQR control. In particular, we introduce a method that integrates LQR control into the action set, allowing generalization and avoiding fixing the computed control in the replay memory if it is based on learned dynamics. We also embed LQR control into a continuous-action method. In all cases, we show that adding LQR control can improve performance, although the effect is more profound if it can be used to augment a discrete action set.	翻訳日:2021-03-27 05:46:06 公開日:2021-01-18
# 非侵襲負荷モニタリングへのコインシデント水データの導入 Incorporating Coincidental Water Data into Non-intrusive Load Monitoring ( http://arxiv.org/abs/2101.07190v1 ) ライセンス: Link先を確認	Mohammad-Mehdi Keramati, Elnaz Azizi, Hamidreza Momeni, Sadegh Bolouki	(参考訳) 集積電力信号から家電の使用パターンを抽出するプロセスとしての非侵入負荷監視(NILM)は,住宅エネルギー管理を支援するアプローチとして成功している。近年、パワープロファイルの大量データセットが利用可能になり、nilmの目的のために使われる分類手法をより効果的かつ正確にするために役立った。しかし、近接電力値を持つ多モードアプライアンスやアプライアンスの存在は、計算の複雑さを悪化させ、これらのアルゴリズムの精度を低下させることに影響を与え続けている。これらの課題に対処するため、我々はイベントベースの分類プロセスを提案し、その第1段階では、K$-nearest neighbors法を高速な分類手法として、排他的非重複パワー値を持つ家電の電力信号を抽出する。そこで,ネットワーク上の新しいシグネチャとして,一部の家電の水消費を考慮に入れた2つのディープラーニングモデルを用いて,重なり合うパワー値のアプライアンスを識別する。電力の分散化に加えて, 提案手法は, 特定の家電の水消費プロファイルも抽出する。提案手法を概説し,その効率性を検証するため,既存の分類に基づくNILM技術に対して,数値分類結果が顕著に改善されたAMPdを7つのアプライアンスとして検討した。 Non-intrusive load monitoring (NILM) as the process of extracting the usage pattern of appliances from the aggregated power signal is among successful approaches aiding residential energy management. In recent years, high volume datasets on power profiles have become available, which has helped make classification methods employed for the NILM purpose more effective and more accurate. However, the presence of multi-mode appliances and appliances with close power values have remained influential in worsening the computational complexity and diminishing the accuracy of these algorithms. To tackle these challenges, we propose an event-based classification process, in the first phase of which the $K$-nearest neighbors method, as a fast classification technique, is employed to extract power signals of appliances with exclusive non-overlapping power values. Then, two deep learning models, which consider the water consumption of some appliances as a novel signature in the network, are utilized to distinguish between appliances with overlapping power values. In addition to power disaggregation, the proposed process as well extracts the water consumption profiles of specific appliances. To illustrate the proposed process and validate its efficiency, seven appliances of the AMPds are considered, with the numerical classification results showing marked improvement with respect to the existing classification-based NILM techniques.	翻訳日:2021-03-27 05:45:53 公開日:2021-01-18
# メータ数に着目した解離困難度の定量化 Quantification of Disaggregation Difficulty with Respect to the Number of Meters ( http://arxiv.org/abs/2101.07191v1 ) ライセンス: Link先を確認	Elnaz Azizi, Mohammad T H Beheshti, Sadegh Bolouki	(参考訳) 効率的なエネルギー管理への有望なアプローチは、集約された消費信号を分析して住宅内の家電製品の消費プロファイルを抽出する非侵入負荷監視(nilm)である。効率的なNILM法には、集約された信号のイベントを検知し、それらを引き起こすアプライアンスに従って分類するイベントベースのアルゴリズムがある。多数のアプライアンスと消費が密接なアプライアンスの存在は、イベントベースのnilmメソッドの性能を制限することが知られている。これらの課題に取り組むために、ハードウェアコストの増大、インストールの複雑さ、消費者の快適さとプライバシに関する懸念をもたらす機能空間を強化することができる。これは、アプライアンスをブロックに分割し、各ブロックの消費を別々の電力メータで監視する、セミインタラクティブ負荷監視(silm)と呼ばれる別のアプローチの出現につながった。より多くのメーターがより正確なデアグリゲーションをもたらすが、これは負荷監視の金銭的コストを増大させ、この分野の重要なギャップを示すトレードオフを示している。本稿では,このギャップを解消するための包括的アプローチとして,電力値と消費者の利用行動の両方に基づいて,任意の家電群のイベントを監視することがいかに難しいかを定量化する「分散困難度メトリクス(ddm)」という概念を確立した。したがって、DDMは、アプライアンスの任意の分割のブロックにメーターを設置することにより、一般的なイベントベースのアルゴリズムの分解精度の観点から、どれだけの量が得られるかを本質的に定量化する。 REDDデータセットに基づく実験結果は、上記のトレードオフに対応するための提案手法の実用性を示している。 A promising approach toward efficient energy management is non-intrusive load monitoring (NILM), that is to extract the consumption profiles of appliances within a residence by analyzing the aggregated consumption signal. Among efficient NILM methods are event-based algorithms in which events of the aggregated signal are detected and classified in accordance with the appliances causing them. The large number of appliances and the presence of appliances with close consumption values are known to limit the performance of event-based NILM methods. To tackle these challenges, one could enhance the feature space which in turn results in extra hardware costs, installation complexity, and concerns regarding the consumer's comfort and privacy. This has led to the emergence of an alternative approach, namely semi-intrusive load monitoring (SILM), where appliances are partitioned into blocks and the consumption of each block is monitored via separate power meters. While a greater number of meters can result in more accurate disaggregation, it increases the monetary cost of load monitoring, indicating a trade-off that represents an important gap in this field. In this paper, we take a comprehensive approach to close this gap by establishing a so-called notion of "disaggregation difficulty metric (DDM)," which quantifies how difficult it is to monitor the events of any given group of appliances based on both their power values and the consumer's usage behavior. Thus, DDM in essence quantifies how much is expected to be gained in terms of disaggregation accuracy of a generic event-based algorithm by installing meters on the blocks of any partition of the appliances. Experimental results based on the REDD dataset illustrate the practicality of the proposed approach in addressing the aforementioned trade-off.	翻訳日:2021-03-27 05:45:31 公開日:2021-01-18
# 深層材料ネットワークにおける細胞分裂のマルチスケールひずみ局在モデリングへの応用 Cell division in deep material networks applied to multiscale strain localization modeling ( http://arxiv.org/abs/2101.07226v1 ) ライセンス: Link先を確認	Zeliang Liu	(参考訳) コンピュータ支援工学におけるひずみ局所化モデリング(例えば、故障解析)の重要性は高まっているが、複数の長さスケールにわたる関連物質挙動を一貫してモデル化するための効果的なアプローチは存在しない。このギャップを、ビルディングブロックに埋め込まれた物理ベースの機械学習モデルであるディープマテリアルネットワーク(DMN)のフレームワーク内で解決することを目指している。ネットワーク上のスケール遷移を追跡するために新しいセル分割スキームが提案され、その一貫性は適合パラメータの物理によって保証される。本質的には、下層の各マイクロスケールノードは、その次元がマクロスケールの材料点からバックプロパゲーションされた楕円体細胞によって記述される。セル内の新しい亀裂面は凝集層を濃縮することによってモデル化され、暗黙のDMN分析において亀裂発生と進展のための故障アルゴリズムが開発された。粒子強化複合管の動的破砕と炭素繊維強化ポリマー複合材料の各種試験について, マルチスケールモデルを同時マルチスケールシミュレーションに適用した。後者については,オフ軸引張試験試料の実験的検証も行う。 Despite the increasing importance of strain localization modeling (e.g., failure analysis) in computer-aided engineering, there is a lack of effective approaches to consistently modeling related material behaviors across multiple length scales. We aim to address this gap within the framework of deep material networks (DMN) - a physics-based machine learning model with embedded mechanics in the building blocks. A new cell division scheme is proposed to track the scale transition through the network, and its consistency is ensured by the physics of fitting parameters. Essentially, each microscale node in the bottom layer is described by an ellipsoidal cell with its dimensions back-propagated from the macroscale material point. New crack surfaces in the cell are modeled by enriching cohesive layers, and failure algorithms are developed for crack initiation and evolution in the implicit DMN analysis. Besides single material point studies, we apply the multiscale model to concurrent multiscale simulations for the dynamic crush of a particle-reinforced composite tube and various tests on carbon fiber reinforced polymer composites. For the latter, experimental validations on an off-axis tensile test specimen are also provided.	翻訳日:2021-03-27 05:45:01 公開日:2021-01-18
# 前立腺癌検出のための機械学習を用いたラマンケミカルイメージングとデジタル組織像の融合 Feature Fusion of Raman Chemical Imaging and Digital Histopathology using Machine Learning for Prostate Cancer Detection ( http://arxiv.org/abs/2101.07342v1 ) ライセンス: Link先を確認	Trevor Doherty, Susan McKeever, Nebras Al-Attar, Tiarnan Murphy, Claudia Aura, Arman Rahman, Amanda O'Neill, Stephen P Finn, Elaine Kay, William M. Gallagher, R. William G. Watson, Aoife Gowen and Patrick Jackman	(参考訳) 前立腺癌の診断はプレゼンテーションの多様性のため困難であり,非臨床的に重要な疾患の診断と治療が過度に行われている。正確な診断は患者の生活の質や予後に直接利益をもたらす。この問題に対処するために,前立腺癌の自動診断のための学習モデルを提案する。多くの前立腺がん研究ではラマン分光法が採用されているが、ラマン化学イメージング(Raman Chemical Imaging, RCI)と他の画像モダリティの組み合わせは利用されていない。本研究は, 染色デジタル組織学(DP)と非定常RCIを併用したマルチモーダル画像を用いた。本手法は,非癌性Gleason grade 3 (G3) およびグレード4 (G4) 組織マイクロアレイ標本を含む32例の臨床試料178例を用いて開発・試験した。病理組織学的にはDP-RCI画像対とラベルが付けられている。検証された仮説は、診断精度の観点から、マルチモーダル画像モデルが単一モダリティベースラインモデルより優れているかどうかである。 2種類の非癌/がんモデルとより困難なG3/G4の分化について検討した。 g3/g4分類では,マルチモーダルアプローチは73.8%,特異度88.1%,ベースラインdpモデルは54.1%,特異度84.7%であった。マルチモーダルアプローチは、統計学的に有意な12.7%のAUCの優位性を、RCIと中央ラマンスペクトルのみに基づくモデルよりも85.8%の値で証明した。 DPとRCIの特徴融合は、腫瘍識別のより簡単な作業を改善するものではなく、G3/G4識別において観察された優位性をもたらす。これらの有望な結果に基づいて、将来の研究には、拡張モデル一般化のためのより大きなデータセットの取得が含まれる。 The diagnosis of prostate cancer is challenging due to the heterogeneity of its presentations, leading to the over diagnosis and treatment of non-clinically important disease. Accurate diagnosis can directly benefit a patient's quality of life and prognosis. Towards addressing this issue, we present a learning model for the automatic identification of prostate cancer. While many prostate cancer studies have adopted Raman spectroscopy approaches, none have utilised the combination of Raman Chemical Imaging (RCI) and other imaging modalities. This study uses multimodal images formed from stained Digital Histopathology (DP) and unstained RCI. The approach was developed and tested on a set of 178 clinical samples from 32 patients, containing a range of non-cancerous, Gleason grade 3 (G3) and grade 4 (G4) tissue microarray samples. For each histological sample, there is a pathologist labelled DP - RCI image pair. The hypothesis tested was whether multimodal image models can outperform single modality baseline models in terms of diagnostic accuracy. Binary non-cancer/cancer models and the more challenging G3/G4 differentiation were investigated. Regarding G3/G4 classification, the multimodal approach achieved a sensitivity of 73.8% and specificity of 88.1% while the baseline DP model showed a sensitivity and specificity of 54.1% and 84.7% respectively. The multimodal approach demonstrated a statistically significant 12.7% AUC advantage over the baseline with a value of 85.8% compared to 73.1%, also outperforming models based solely on RCI and median Raman spectra. Feature fusion of DP and RCI does not improve the more trivial task of tumour identification but does deliver an observed advantage in G3/G4 discrimination. Building on these promising findings, future work could include the acquisition of larger datasets for enhanced model generalization.	翻訳日:2021-03-27 05:44:45 公開日:2021-01-18
# 学習キャッシュによるディープラーニング推論の高速化 Accelerating Deep Learning Inference via Learned Caches ( http://arxiv.org/abs/2101.07344v1 ) ライセンス: Link先を確認	Arjun Balasubramanian, Adarsh Kumar, Yuhan Liu, Han Cao, Shivaram Venkataraman, Aditya Akella	(参考訳) Deep Neural Networks(DNN)は、現実世界の問題を解決する上で、高い精度で複数のドメインが採用されるのを目撃している。しかし、この高い精度は、より深いネットワークを構築することによって達成され、ユーザ向けアプリケーションによって望まれる低レイテンシの推論に対する根本的な課題となっている。現在の低レイテンシソリューションは、正確性に関するトレードオフか、ワークロード提供の予測に固有の時間的局所性を活用できないかのどちらかだ。我々は、DNNの隠れ層出力をキャッシュすることで、推論要求が必要な計算量だけを消費する遅延バインディングの形式を導入することを観察する。これにより、低レイテンシを実現するためのメカニズムと、時間的局所性を活用する能力が組み合わされる。しかし、従来のキャッシュアプローチでは、高いメモリオーバーヘッドとルックアップのレイテンシが発生し、学習したキャッシュ – 継続的に更新される単純なmlモデルで構成されるキャッシュ – を設計することになります。低レイテンシDNN推論のための学習キャッシュを組み込んだエンドツーエンド予測サービスであるGATIの設計を提案する。その結果、GATIは現実的なワークロードにおいて、推論遅延を最大7.69倍削減できることがわかった。 Deep Neural Networks (DNNs) are witnessing increased adoption in multiple domains owing to their high accuracy in solving real-world problems. However, this high accuracy has been achieved by building deeper networks, posing a fundamental challenge to the low latency inference desired by user-facing applications. Current low latency solutions trade-off on accuracy or fail to exploit the inherent temporal locality in prediction serving workloads. We observe that caching hidden layer outputs of the DNN can introduce a form of late-binding where inference requests only consume the amount of computation needed. This enables a mechanism for achieving low latencies, coupled with an ability to exploit temporal locality. However, traditional caching approaches incur high memory overheads and lookup latencies, leading us to design learned caches - caches that consist of simple ML models that are continuously updated. We present the design of GATI, an end-to-end prediction serving system that incorporates learned caches for low-latency DNN inference. Results show that GATI can reduce inference latency by up to 7.69X on realistic workloads.	翻訳日:2021-03-27 05:44:14 公開日:2021-01-18
# データ管理レンズを通して:公正な分類の実験的分析と評価 Through the Data Management Lens: Experimental Analysis and Evaluation of Fair Classification ( http://arxiv.org/abs/2101.07361v1 ) ライセンス: Link先を確認	Maliha Tashfia Islam, Anna Fariha, Alexandra Meliou	(参考訳) データ駆動機械学習タスクである分類は、ローン承認や犯罪リスク評価といった重要な人間の判断を含む予測システムの増加を推進している。しかし、分類器はしばしば識別行動を示し、特にバイアスデータで示される場合である。その結果、分類の公平性は高優先度の研究領域として浮上した。データ管理研究は、公正分類のトピックを含む、データとアルゴリズムの公平性に関連するトピックの存在と関心を示している。公平な分類における学際的な取り組みは、機械学習の研究が最大の存在感を持ち、多くの公平性概念と、体系的に評価・比較されていない幅広いアプローチを生み出した。本稿では,その正確性,公平性,効率性,スケーラビリティ,安定性について,さまざまなメトリクスと実世界のデータセットを用いて,13の公正な分類アプローチと,さらに別のバリエーションを幅広く分析する。我々の分析は、異なるメトリクスとハイレベルなアプローチ特性がパフォーマンスの異なる側面に与える影響に関する新しい洞察を強調します。また、異なる実践的設定に適したアプローチを選択するための一般的な原則を議論し、データ管理中心のソリューションが最も影響を与える可能性のある領域を特定する。 Classification, a heavily-studied data-driven machine learning task, drives an increasing number of prediction systems involving critical human decisions such as loan approval and criminal risk assessment. However, classifiers often demonstrate discriminatory behavior, especially when presented with biased data. Consequently, fairness in classification has emerged as a high-priority research area. Data management research is showing an increasing presence and interest in topics related to data and algorithmic fairness, including the topic of fair classification. The interdisciplinary efforts in fair classification, with machine learning research having the largest presence, have resulted in a large number of fairness notions and a wide range of approaches that have not been systematically evaluated and compared. In this paper, we contribute a broad analysis of 13 fair classification approaches and additional variants, over their correctness, fairness, efficiency, scalability, and stability, using a variety of metrics and real-world datasets. Our analysis highlights novel insights on the impact of different metrics and high-level approach characteristics on different aspects of performance. We also discuss general principles for choosing approaches suitable for different practical settings, and identify areas where data-management-centric solutions are likely to have the most impact.	翻訳日:2021-03-27 05:43:58 公開日:2021-01-18

Title

Authors

Abstract

論文公表日・翻訳日

# 最適戦略を用いた量子状態検証の標準化に向けて

Towards the standardization of quantum state verification using optimal strategies ( http://arxiv.org/abs/2002.00640v2 )

ライセンス: Link先を確認

Xinhe Jiang, Kun Wang, Kaiyi Qian, Zhaozhong Chen, Zhiyu Chen, Liangliang Lu, Lijun Xia, Fangmin Song, Shining Zhu, Xiaosong Ma

(参考訳) 絡み合った状態を生成する量子デバイスは広く研究され、広く使われている。そのため、特定されたデバイスが本当に確実に、かつ効率的に動作するかどうかを確認する必要がある。本稿では,フォトニックプラットフォームを用いた局所的測定(非適応的)とアクティブフィードフォワード操作(適応的)の両方を用いて,提案する2量子ビットエンタングル状態検証手法を実験的に実現する。約3283/536のコピー(N$)は、非適応的/適応的戦略に対するターゲット量子状態を検証するために99%の信頼を得るために必要である。これらの最適戦略は、パラメータ $r=-1$ を持つ $N$$$\epsilon$$\sim$$N^r$ の関数として不忠実な $\epsilon$ のハイゼンベルクスケーリングを提供し、$r=-0.5$ の標準量子極限を超える。非適応的および適応的戦略にそれぞれ$r=-0.88\pm$0.03と$-0.78\pm$0.07のスケーリングパラメータを得る。我々の実験は量子状態の検証のための標準化された手順として機能する可能性がある。

Quantum devices for generating entangled states have been extensively studied and widely used. As so, it becomes necessary to verify that these devices truly work reliably and efficiently as they are specified. Here, we experimentally realize the recently proposed two-qubit entangled state verification strategies using both local measurements (nonadaptive) and active feed-forward operations (adaptive) with a photonic platform. About 3283/536 number of copies ($N$) are required to achieve a 99% confidence to verify the target quantum state for nonadaptive/adaptive strategies. These optimal strategies provide the Heisenberg scaling of the infidelity $\epsilon$ as a function of $N$ ($\epsilon$ $\sim$ $N^r$) with the parameter $r=-1$, exceeding the standard quantum limit with $r=-0.5$. We experimentally obtain the scaling parameter of $r=-0.88\pm$0.03 and $-0.78\pm$0.07 for nonadaptive and adaptive strategies, respectively. Our experimental work could serve as a standardized procedure for the verification of quantum states.

翻訳日:2023-06-04 20:50:50 公開日:2021-01-18

# 量子暗号を用いたセキュアな対称的個人情報検索

Provably-secure symmetric private information retrieval with quantum cryptography ( http://arxiv.org/abs/2004.13921v2 )

ライセンス: Link先を確認

Wen Yu Kon, Charles Ci Wen Lim

(参考訳) プライベート情報検索 (pir) は、ユーザが興味のあるデータベースの特定のエントリを学習できるが、そのクエリはデータセンターから隠蔽されるという、ユーザのプライバシーを提供するデータベースクエリプロトコルである。シンメトリ・プライベート情報検索(SPIR)は、ユーザがデータベースの追加エントリを学習できないデータベースプライバシを付加することで、PIRをさらに強化する。複数のデータベースを持つ無条件でセキュアなSPIRソリューションは古典的には知られているが、セキュアな通信とプロトコル内のランダムな共有のために、パーティ間で長い秘密鍵を必要とするため、非現実的である。本稿では,セキュアな通信と共有ランダム性要件の両方を実現するための実装として,量子鍵分布(QKD)を提案する。我々は、QKDがSPIRプロトコルのセキュリティを維持しており、外部の盗聴者に対しても安全であることを証明した。また,測定装置に依存しないQKDによって生成される鍵を持つ2データベースSPIRプロトコルの例を用いて,このような古典量子システムを実際に実装する方法を示す。キーレート計算により,現在のqkd技術を用いて,都市レベルで実現可能であることを示す。

Private information retrieval (PIR) is a database query protocol that provides user privacy, in that the user can learn a particular entry of the database of his interest but his query would be hidden from the data centre. Symmetric private information retrieval (SPIR) takes PIR further by additionally offering database privacy, where the user cannot learn any additional entries of the database. Unconditionally secure SPIR solutions with multiple databases are known classically, but are unrealistic because they require long shared secret keys between the parties for secure communication and shared randomness in the protocol. Here, we propose using quantum key distribution (QKD) instead for a practical implementation, which can realise both the secure communication and shared randomness requirements. We prove that QKD maintains the security of the SPIR protocol and that it is also secure against any external eavesdropper. We also show how such a classical-quantum system could be implemented practically, using the example of a two-database SPIR protocol with keys generated by measurement device-independent QKD. Through key rate calculations, we show that such an implementation is feasible at the metropolitan level with current QKD technology.

翻訳日:2023-05-21 19:46:14 公開日:2021-01-18

# 調和に閉じ込められた相互作用粒子の2体クエンチダイナミクス

Two-body quench dynamics of harmonically trapped interacting particles ( http://arxiv.org/abs/2005.01235v4 )

ライセンス: Link先を確認

A. D. Kerin and A. M. Martin

(参考訳) 我々は、相互作用強度が1つの値から別の値にキューチされる3次元等方性トラップにおける相互作用原子対の量子進化を考える。静的問題の厳密な解を用いることで、初期状態と最終状態の重なりや2つの原子間の分離の期待値など、時間依存の観測可能性を評価することができる。相互作用が非相互作用的状態から強い相互作用的状態に切り替わる場合、あるいはその逆の場合、分析結果が得られる。初期状態と最終状態の重なりを調べると、相互作用が非相互作用から強相互作用状態へと縮められるとき、初期の依存ダイナミクスは単一の不純物多体極限における理論的な仕事と一致することが分かる。系が強い状態から非相互作用状態へと焼成されると、相互作用ポテンシャルのゼロレンジの性質による対数発散から生じる2つの原子の分離における大きな振動を予測する。

We consider the quantum evolution of a pair of interacting atoms in a three dimensional isotropic trap where the interaction strength is quenched from one value to another. Using exact solutions of the static problem we are able to evaluate time-dependent observables such as the overlap between initial and final states and the expectation value of the separation between the two atoms. In the case where the interaction is quenched from the non-interacting regime to the strongly interacting regime, or vice versa, we are able to obtain analytic results. Examining the overlap between the initial and final states we show that when the interaction is quenched from the non-interacting to strongly interacting regimes the early time dependence dynamics are consistent with theoretical work in the single impurity many-body limit. When the system is quenched from the strongly to non-interacting regime we predict large oscillations in the separation between the two atoms, which arises from a logarithmic divergence due to the zero-range nature of the interaction potential.

翻訳日:2023-05-21 05:30:09 公開日:2021-01-18

# 高絡み合い状態からの量子コード構築の修正法

Modifying method of constructing quantum codes from highly entangled states ( http://arxiv.org/abs/2005.01426v3 )

ライセンス: Link先を確認

Zahra Raissi

(参考訳) 古典符号、高度に絡み合った純粋状態(k-ユニフォームまたは絶対最大絡み合う(AME)状態)と量子誤り訂正符号(QECC)の間には関連がある。これにより、k一様状態または対応する古典コードから開始し、各ステップで1つのパーティをトレースして安定化器qeccを構築する体系的な方法が導かれる。我々は、古典符号の対応する生成行列に部分的トレースが原因となる変化を記述することにより、コードワード、エンコーディング手順、およびQECCの安定化形式について明示的な構成を提供する。次に、この方法を変更して、論理的なquditをAME状態に分散した部分空間にエンコードする安定化器QECCを生成する。この構成は、パーティを追跡せずにAME状態から始まる量子コードを生成する。したがって、より大きな符号空間を持つ量子安定化符号を構築することができる。

There is a connection between classical codes, highly entangled pure states (called k-uniform or absolutely maximally entangled (AME) states), and quantum error correcting codes (QECCs). This leads to a systematic method to construct stabilizer QECCs by starting from a k-uniform state or the corresponding classical code and tracing out one party at each step. We provide explicit constructions for codewords, encoding procedure and stabilizer formalism of the QECCs by describing the changes that partial traces cause on the corresponding generator matrix of the classical codes. We then modify the method to produce another set of stabilizer QECCs that encode a logical qudit into a subspace spanned by AME states. This construction produces quantum codes starting from an AME state without tracing out any party. Therefore, quantum stabilizer codes with larger codespace can be constructed.

翻訳日:2023-05-21 05:24:27 公開日:2021-01-18

# 2レベルゆらぎのアンサンブルからの正・負の周波数雑音

Positive- and negative-frequency noise from an ensemble of two-level fluctuators ( http://arxiv.org/abs/2005.03591v2 )

ライセンス: Link先を確認

Xinyuan You, Aashish A. Clerk, Jens Koch

(参考訳) 発散性2レベルゆらぎのアンサンブルのブロッホ・レッドフィールド処理に基づく電荷ノイズの解析は、一般的には発散定理に違反する。標準的なマルコフ近似(浴槽に結合した2段のゆらぎに適用される場合)は、この故障の主な原因として特定できる。結果として生じる脱コヒーレンス速度は、変動周波数でのバス応答のみを含み、周波数拡大の効果を完全に無視する。この問題を克服するための体系的かつ計算学的に便利な方法は、スペクター・キュービット法を用いることである: 補助キュービットを2レベルゆらぎのアンサンブルに結合することにより、$S({\omega})$に対する解析近似をゆらぎ散逸定理と完全に整合できる。本稿では, クロスオーバー周波数の温度依存性を考慮した1/f$から1/f^2$のクロスオーバーを含む, いくつかの周波数範囲で異なる挙動を示すノイズの特性について論じる。

The analysis of charge noise based on the Bloch-Redfield treatment of an ensemble of dissipative two-level fluctuators generally results in a violation of the fluctuation-dissipation theorem. The standard Markov approximation (when applied to the two-level fluctuators coupled to a bath) can be identified as the main origin of this failure. The resulting decoherence rates only involve the bath response at the fluctuator frequency, and thus completely neglect the effects of frequency broadening. A systematic and computationally convenient way to overcome this issue is to employ the spectator-qubit method: by coupling an auxiliary qubit to the two-level fluctuator ensemble, an analytical approximation for $S({\omega})$ fully consistent with the fluctuation-dissipation theorem can be obtained. We discuss the resulting characteristics of the noise which exhibits distinct behavior over several frequency ranges, including a $1/f$ to $1/f^2$ crossover with a $T^3$ temperature dependence of the crossover frequency.

翻訳日:2023-05-20 22:25:38 公開日:2021-01-18

# 強レーザー駆動下での励起エネルギー移動

Excitation Energy Transfer under Strong Laser Drive ( http://arxiv.org/abs/2005.04719v2 )

ライセンス: Link先を確認

Xuanhua Wang, Zhedong Zhang, Jin Wang

(参考訳) 強い分子-光相互作用は分子構造と動的過程の制御を可能にする。光キャビティにより分子が強く駆動される共鳴エネルギー伝達の分子間距離を大幅に向上させるために、強いレーザー駆動を持つモデルを提案する。エネルギー移動の最適ラビ周波数と量子収率は、双極子-双極子相互作用と分子-キャビティカップリングのトレードオフから生じる。特定のラビ周波数での強い駆動を印加すると, 共振エネルギー伝達のF\"オルスター機構と比較して, 有効エネルギー移動の空間範囲と, 距離の遅い減衰速度が観察される。我々の研究は、分子ポラリトンにおける協調エネルギー移動の分光学的研究に光を当てている。

Strong molecule-light interaction enables the control of molecular structures and dynamical processes. A model with strong laser drive is proposed to greatly enhance the intermolecular distance of resonant energy transfer, where the molecules are strongly driven by an optical cavity. The optimal Rabi frequency and quantum yield of energy transfer are observed, resulting from the trade off between dipole-dipole interaction and molecule-cavity coupling. When the strong drive at certain Rabi frequency is applied, a larger spatial range of effective energy transfer and a slower decay rate with the distance compared to the F\"orster mechanism of resonant energy transfer are observed in our model. Our work sheds light on spectroscopic study of the cooperative energy transfer in molecular polaritons.

翻訳日:2023-05-20 16:07:49 公開日:2021-01-18

# 相対論的量子情報における粒子検出器モデルの破れ共分散

Broken covariance of particle detector models in relativistic quantum information ( http://arxiv.org/abs/2006.12514v3 )

ライセンス: Link先を確認

Eduardo Mart\'in-Mart\'inez, T. Rick Perche and Bruno de S. L. Torres

(参考訳) 量子場に結合した空間的スメア粒子検出器の予測は、一般に点的極限の外側で共変ではないことを示す。この共変の欠如は、時間順序演算における曖昧さとして現れている。共分散の崩壊が、unruh-dewittモデルのような量子場理論における典型的な検出器モデルにどのように影響するかを分析する。具体的には,共分散の破れが検出器-場系の状態,検出器の形状と運動状態,時空幾何にどのように依存するかを示す。さらに,違反の大きさを明示的に評価するツールを提供し,摂動解析においてスメア検出器の予測が正確に,あるいはほぼ共変している状態を特定する。

We show that the predictions of spatially smeared particle detectors coupled to quantum fields are not generally covariant outside the pointlike limit. This lack of covariance manifests itself as an ambiguity in the time-ordering operation. We analyze how the breakdown of covariance affects typical detector models in quantum field theory such as the Unruh-DeWitt model. Specifically, we show how the violations of covariance depend on the state of the detectors-field system, the shape and state of motion of the detectors, and the spacetime geometry. Furthermore, we provide the tools to explicitly evaluate the magnitude of the violation, and identify the regimes where the predictions of smeared detectors are either exactly or approximately covariant in perturbative analyses.

翻訳日:2023-05-13 04:51:52 公開日:2021-01-18

# 量子レーダー入門

Introduction to quantum radar ( http://arxiv.org/abs/2006.14238v3 )

ライセンス: Link先を確認

Ricardo Gallego Torrom\'e, Nadya Ben Bekhti-Winkel and Peter Knott

(参考訳) 量子絡み合いと量子相関の概念を簡潔に紹介した後、量子照明と他のプロトコルに基づく量子レーダのいくつかのスキームについて論じる。我々は,レーダアプリケーションのための量子生成および/または検出量子センシングプロトコルの実装におけるいくつかの本質的な困難を克服するために導入された異なる概念をレビューする。本レビューは, 異なる概念の実現可能性の評価を事例として, 最先端の最先端技術に関する最新の批判的プレゼンテーションである。また、レビューを現場の非専門家に公開することを目標としています。そのため、いくつかの付録と技術用語集が含まれている。

After a brief introduction to the notion of quantum entanglement and quantum correlations, several schemes for a quantum radar based upon the quantum illumination and others protocols are discussed. We review different concepts that have been introduced to overcome several of the inherent difficulties in the implementation of quantum generation and/or detection quantum sensing protocols for RADAR applications. Our review is an up-to date critical presentation of the state of the art, with emphasis in the case by case assessment of the feasibility of the different concepts. We also aim that the review is accessible to non-experts in the field. Hence several appendixes and a technical glossary are included.

翻訳日:2023-05-12 20:02:37 公開日:2021-01-18

# マジックステート蒸留のための測定シーケンス

Measurement sequences for magic state distillation ( http://arxiv.org/abs/2007.07929v3 )

ライセンス: Link先を確認

Jeongwan Haah, Matthew B. Hastings

(参考訳) マジック状態蒸留(magic state distillation)は入力状態のエラーを抑制するために特別な符号を使用する。本稿では,量子ビット間の誤りの独立性を仮定して,任意の部分の誤りを抑制できるマジック状態蒸留プロトコルの詳細な測定手順を提案する。入力魔法の状態と合わせて、このプロトコルは2次元の正方形グリッド上で動作し、キュービットの水平ペアに$zz$、垂直ペアに$xx$、単一キュービットに$z,x$の測定を行う。

Magic state distillation uses special codes to suppress errors in input states, which are often tailored to a Clifford-twirled error model. We present detailed measurement sequences for magic state distillation protocols which can suppress arbitrary errors on any part of a protocol, assuming the independence of errors across qubits. Provided with input magic states, our protocol operates on a two-dimensional square grid by measurements of $ZZ$ on horizontal pairs of qubits, $XX$ on vertical pairs, and $Z,X$ on single qubits.

翻訳日:2023-05-09 09:03:49 公開日:2021-01-18

# 未知量子ビットの遠方における交換自由計算

Exchange-Free Computation on an Unknown Qubit at a Distance ( http://arxiv.org/abs/2008.00841v4 )

ライセンス: Link先を確認

Hatim Salih, Jonte R. Hance, Will McCutcheon, Terry Rudolph, and John Rarity

(参考訳) 我々は任意の量子ビットを直接操作する方法を示し、粒子の交換は行わない。これは、リモート古典的ボブによるアリスにおける任意の量子状態の交換自由な準備を含む。その結果,遠隔の第三者の未知の量子ビット上での任意の計算交換自由なプログラムにより,一方の第三者が直接実行可能なプロトコルを提案することができた。さらに、これを普遍的な2量子ビットゲートの交換自由制御に利用する方法を示し、プログラム可能な量子回路上で任意の所望のアルゴリズムを直接実行することが可能であることを示す。

We present a way of directly manipulating an arbitrary qubit, without the exchange of any particles. This includes as an application the exchange-free preparation of an arbitrary quantum state at Alice by a remote classical Bob. As a result, we are able to propose a protocol that allows one party to directly enact, by means of a suitable program, any computation exchange-free on a remote second party's unknown qubit. Further, we show how to use this for the exchange-free control of a universal two-qubit gate, thus opening the possibility of directly enacting any desired algorithm remotely on a programmable quantum circuit.

翻訳日:2023-05-07 06:47:24 公開日:2021-01-18

# DLTベースのCOVID-19パスポートのためのフレームワーク

Framework for a DLT Based COVID-19 Passport ( http://arxiv.org/abs/2008.01120v7 )

ライセンス: Link先を確認

Sarang Chaudhari, Michael Clear and Hitesh Tewari

(参考訳) 日常的に対話するさまざまなネットワークをまたいで個人を識別することは、私たちが住んでいるデジタル世界にとっての課題であり、セキュアで効率的なプライバシー保護id機構の開発は重要な研究分野となっている。さらに、Bitcoinのような分散型意思決定ネットワークの人気は、エンドユーザーの認証情報を保管し、安全に広めるために分散型台帳技術を使うことに大きな関心を寄せている。本稿では、新型コロナウイルスのワクチン接種の詳細を公開され、分散化され、不変なブロックチェーン上に保存し、バイオメトリック・暗号ハッシュ技術を用いて各ユーザー固有の識別子を生成する2要素認証システムを利用するメカニズムについて述べる。私たちの主な貢献は、ユーザーを認証し、匿名でブロックチェーン上の予防接種記録を見つけるのに使える虹彩抽出技術に対して、確実にセキュアで局所性に敏感なハッシュアルゴリズムを使用することです。

Uniquely identifying individuals across the various networks they interact with on a daily basis remains a challenge for the digital world that we live in, and therefore the development of secure and efficient privacy preserving identity mechanisms has become an important field of research. In addition, the popularity of decentralised decision making networks such as Bitcoin has seen a huge interest in making use of distributed ledger technology to store and securely disseminate end user identity credentials. In this paper we describe a mechanism that allows one to store the COVID-19 vaccination details of individuals on a publicly readable, decentralised, immutable blockchain, and makes use of a two-factor authentication system that employs biometric cryptographic hashing techniques to generate a unique identifier for each user. Our main contribution is the employment of a provably secure input-hiding, locality-sensitive hashing algorithm over an iris extraction technique, that can be used to authenticate users and anonymously locate vaccination records on the blockchain, without leaking any personally identifiable information to the blockchain.

翻訳日:2023-05-07 06:27:59 公開日:2021-01-18

# 離散回転対称性を持つ量子ドットアレイからのねじれ光の放出

Emission of twisted light from quantum dot arrays with a discrete rotational symmetry ( http://arxiv.org/abs/2008.03908v3 )

ライセンス: Link先を確認

H. T. Sullivan, J. H. Cole

(参考訳) 量子ドットの円形配列の光学的性質を理論的に検討する。円形エミッタアレイ(CEA)と呼ばれるこの構造は、軌道角運動量を運ぶ光と同様に、円偏光の放出と吸収を通じて光学角運動量と交換することができる。バンド間およびバンド内遷移率の両方について解析式を導出し、選択規則を決定する。 ceaをモデル化する場合、量子ドットの重要な特性のみが考慮される。これにより、我々のモデルは様々な量子ドットからなるCEAに適用可能である。最後に、CEAの特定の光学特性をチューニングするための設計原理を決定する。これにより、光学角運動量に逆転するCEAの表面からなるメタマテリアルを設計する可能性が開ける。

We theoretically explore the optical properties of a circular array of quantum dots. This structure, that we call a circular emitter array (CEA), can exchange optical angular momentum via the emission and absorption of circularly polarised light as well as light carrying orbital angular momentum. Analytical expressions are derived for both interband and intraband transition rates and selection rules are determined. Only the key properties of the quantum dots are considering when modelling the CEA. This extends the applicability of our model to CEAs composed of a variety of quantum dots. Finally design principles for the tuning of the specific optical properties of the CEA are determined. This opens up the prospect of the designing a metamaterial, consisting of a surface of CEAs, that upconverts optical angular momentum.

翻訳日:2023-05-06 16:09:21 公開日:2021-01-18

# 超強結合系における分光と臨界量子温度測定

Spectroscopy and critical quantum thermometry in the ultrastrong coupling regime ( http://arxiv.org/abs/2009.01994v2 )

ライセンス: Link先を確認

M. Salado-Mej\'ia, R. Rom\'an-Ancheyta, F. Soto-Eguibar and H. M. Moya-Cessa

(参考訳) 我々は、異方性ホップフィールドモデルの厳密な解析解を示し、2つの超強結合量子系のスペクトルおよび熱的応答を詳細に研究する。興味深いことに,結合系の初期状態によっては,真空ラビ分裂は,逆直観的疎結合効果のスペクトルシグネチャと考えられる重要な非対称性を示す。量子熱力学応用のための温度計として結合系を用い,超強結合法で有効な温度推定の究極の境界を求める。驚くべきことに、もしシステムが量子相転移を行うと、量子フィッシャー情報は周期的な発散を示し、そのような臨界量子センサに対して任意に高い温度測定精度を持つことができる。

We present an exact analytical solution of the anisotropic Hopfield model, and we use it to investigate in detail the spectral and thermometric response of two ultrastrongly coupled quantum systems. Interestingly, we show that depending on the initial state of the coupled system, the vacuum Rabi splitting manifests significant asymmetries that may be considered spectral signatures of the counterintuitive decoupling effect. Using the coupled system as a thermometer for quantum thermodynamics applications, we obtain the ultimate bounds on the estimation of temperature that remain valid in the ultrastrong coupling regime. Remarkably, if the system performs a quantum phase transition, the quantum Fisher information exhibits periodic divergences, suggesting that one can have several points of arbitrarily high thermometric precision for such a critical quantum sensor.

翻訳日:2023-05-03 20:59:26 公開日:2021-01-18

# 光学波長変換のためのニオブ酸リチウムへのシリコンフォトニックデバイスのハイブリッド集積

Hybrid integration of silicon photonic devices on lithium niobate for optomechanical wavelength conversion ( http://arxiv.org/abs/2010.08493v2 )

ライセンス: Link先を確認

Igor Marinkovi\'c, Maxwell Drimmer, Bas Hensen, Simon Gr\"oblacher

(参考訳) 量子情報プロセッサの急速な発展は、量子ネットワークを実現する技術に対する需要を加速させた。有望なアプローチの1つは、マイクロ波と光学場の中間体としてメカニカル共振器を用いる。超伝導、トポロジー、スピン量子ビットプロセッサからの信号は、通信波長で光学状態とコヒーレントに変換できる。しかし、均質な構造から作られた現在のデバイスはノイズの増加と変換効率の低下に苦しむ。異なる材料の有利な特性を不均一な設計に組み合わせることで、優れた量子トランスダクションデバイスが実現できるはずであり、これらのハイブリッドアプローチは、しかしながら複雑な製造手順によって妨げられている。そこで本研究では,異なる素材の独立したデバイス部品を1つのデバイスに統合する,従来のピック・アンド・プレイス・アイデアに基づく新たな統合手法を提案する。この方法は、プロセス中に連続的な光学的モニタリングによって精度のアライメントを可能にする。本手法を用いて, 最先端の波長変換特性を有するニオブ酸シリコンハイブリッドデバイスを作製した。

The rapid development of quantum information processors has accelerated the demand for technologies that enable quantum networking. One promising approach uses mechanical resonators as an intermediary between microwave and optical fields. Signals from a superconducting, topological, or spin qubit processor can then be converted coherently to optical states at telecom wavelengths. However, current devices built from homogeneous structures suffer from added noise and small conversion efficiency. Combining advantageous properties of different materials into a heterogeneous design should allow for superior quantum transduction devices -- so far these hybrid approaches have however been hampered by complex fabrication procedures. Here we present a novel integration method based on previous pick-and-place ideas, that can combine independently fabricated device components of different materials into a single device. The method allows for precision alignment by continuous optical monitoring during the process. Using our method, we assemble a hybrid silicon-lithium niobate device with state-of-the-art wavelength conversion characteristics.

翻訳日:2023-04-28 22:03:14 公開日:2021-01-18

# 回路量子音響力学における量子対古典構造

Quantum versus Classical Regime in Circuit Quantum Acoustodynamics ( http://arxiv.org/abs/2011.05075v2 )

ライセンス: Link先を確認

Gang-hui Zeng, Yang Zhang, Aleksey N. Bolgar, Dong He, Bin Li, Xin-hui Ruan, Lan Zhou, Le-Mang Kuang, Oleg V. Astafiev, Yu-xi Liu, Z. H. Peng

(参考訳) 超伝導人工原子からなる回路量子音響力学系を2次元表面波共振器と1次元マイクロ波伝送線路の両方に結合して実験的に検討した。人工原子と音響波共振器との強い結合は, 希釈冷凍機の基礎温度における真空ラビ分裂の観察によって確認される。マイクロ波伝送線路におけるマイクロ波光子の伝搬は、音波共振器内の数個のフォノンによって制御可能であることを示す。さらに,高励起状態からのRabi分裂および温度誘起遷移の測定に対する温度効果を実証した。その結果,Rabi分裂における2ピークのスペクトル構造はいくつかのピークに変化し,環境温度の上昇に伴い徐々に消失することがわかった。量子-古典遷移は、熱ゆらぎエネルギー$k_{B}T$と結合系の特性エネルギーレベル間隔によって決定されるクロスオーバー温度$T_{c}T$の周囲で観測される。実験結果は, 実効温度の異なる結合系の主方程式による理論シミュレーションとよく一致している。

We experimentally study a circuit quantum acoustodynamics system, which consists of a superconducting artificial atom, coupled to both a two-dimensional surface acoustic wave resonator and a one-dimensional microwave transmission line. The strong coupling between the artificial atom and the acoustic wave resonator is confirmed by the observation of the vacuum Rabi splitting at the base temperature of dilution refrigerator. We show that the propagation of microwave photons in the microwave transmission line can be controlled by a few phonons in the acoustic wave resonator. Furthermore, we demonstrate the temperature effect on the measurements of the Rabi splitting and temperature induced transitions from high excited dressed states. We find that the spectrum structure of two-peak for the Rabi splitting becomes into those of several peaks, and gradually disappears with the increase of the environmental temperature $T$. The quantum-to-classical transition is observed around the crossover temperature $T_{c}$, which is determined via the thermal fluctuation energy $k_{B}T$ and the characteristic energy level spacing of the coupled system. Experimental results agree well with the theoretical simulations via the master equation of the coupled system at different effective temperatures.

翻訳日:2023-04-24 19:04:23 公開日:2021-01-18

# ホログラフィックの絡み合ったポリトープとしてのアソシヘドロン

The associahedron as a holographic entanglement polytope ( http://arxiv.org/abs/2101.03823v2 )

ライセンス: Link先を確認

P\'eter L\'evay

(参考訳) 本項では、${\rm AdS}_3/{\rm CFT}_2$対応を用いて、散乱振幅の理解に使用されるArkani-Hamed-Bai-He-Yan(ABHY)アソシアヘドロンと、絡み合いのパターンから生じる時空を理解するために用いられるアソシアヘドロンとの類似性を観察する。この類推は、アソシアヘドロンを${\rm CFT}_2$真空に付随するホログラフィック絡みポリトープとして自然な解釈を示唆している。我々の観測は、散乱振幅の分解特性が、ホログラフィック量子絡み合いの理論で用いられる時空の分離性の概念と結びついている可能性を示唆している。

By employing the ${\rm AdS}_3/{\rm CFT}_2$ correspondence in this note we observe an analogy between the structures found in connection with the Arkani-Hamed-Bai-He-Yan (ABHY) associahedron used for understanding scattering amplitudes, and the one used for understanding space-time emerging from patterns of entanglement. The analogy suggests the natural interpretation for the associahedron as a holographic entanglement polytope associated to the ${\rm CFT}_2$ vacuum. Our observations hint at the possibility that the factorization properties of scattering amplitudes are connected to the notion of separability of space-time as used in the theory of holographic quantum entanglement.

翻訳日:2023-04-17 02:52:57 公開日:2021-01-18

# ibmq-melbourne量子コンピュータにおけるschr\"odinger cat状態の絡み合いの作成と研究

Preparation and study of the entanglement of the Schr\"odinger cat state on the ibmq-melbourne quantum computer ( http://arxiv.org/abs/2101.05089v2 )

ライセンス: Link先を確認

A.R. Kuzmak, V.M. Tkachuk

(参考訳) ibmq-melbourne量子コンピュータで作製したschr\"odinger cat状態における、ある量子ビットと残りのシステムの絡み合いについて検討した。この目的のために使用されるプロトコルは、ある量子ビットに対応するスピンの平均値を決定することに基づいている。異なる数の量子ビットからなるシュリンガー猫状態のパラメータに対する絡み合いの依存性について検討する。さらに、各量子ビットのエンタングルメントを、残りのシステムで最大エンタングル化されたschr\"odinger cat状態で検討する。

We study the entanglement between a certain qubit and the remaining system in the Schr\"odinger cat state prepared on the ibmq-melbourne quantum computer. The protocol, which we use for this purpose, is based on the determination of the mean value of spin corresponding to a certain qubit. We explore the dependence of the entanglement on a parameter of the Schr\"odinger cat state which consists of different numbers of qubits. In addition, we explore the entanglement of each qubit with the remaining system in the maximum entangled Schr\"odinger cat state.

翻訳日:2023-04-15 17:42:10 公開日:2021-01-18

# バリューアライメントの挑戦 - より公正なアルゴリズムからAI安全性まで

The Challenge of Value Alignment: from Fairer Algorithms to AI Safety ( http://arxiv.org/abs/2101.06060v2 )

ライセンス: Link先を確認

Iason Gabriel and Vafa Ghazavi

(参考訳) 本稿では,AIシステムと人的価値の整合性の問題に対処し,それを技術と価値に関するより広い思考範囲に位置づける。真空中に存在するのではなく、異なる価値システムをロックインするテクノロジーの能力に長年関心が寄せられている。また、参加型デザインプロセスなど、技術と特定の社会的価値を連携させる方法も検討されている。本稿では、AIの価値アライメントに関する問題をより詳しく検討し、AIシステムのパワーと自律性が、これまで遭遇したことのない価値領域における機会と課題をもたらすことを示唆する。公正性、説明責任、透明性、倫理的コミュニティの作業と、技術AI安全研究者による作業との間の重要な連続性について、我々は「社会的価値の整合性」という問題により多くの注意を払う必要があることを示唆している。

This paper addresses the question of how to align AI systems with human values and situates it within a wider body of thought regarding technology and value. Far from existing in a vacuum, there has long been an interest in the ability of technology to 'lock-in' different value systems. There has also been considerable thought about how to align technologies with specific social values, including through participatory design-processes. In this paper we look more closely at the question of AI value alignment and suggest that the power and autonomy of AI systems gives rise to opportunities and challenges in the domain of value that have not been encountered before. Drawing important continuities between the work of the fairness, accountability, transparency and ethics community, and work being done by technical AI safety researchers, we suggest that more attention needs to be paid to the question of 'social value alignment' - that is, how to align AI systems with the plurality of values endorsed by groups of people, especially on the global level.

翻訳日:2023-04-15 03:03:40 公開日:2021-01-18

# ハイパーパラレルトランジスタ、ルータ及びユニティファイパティを有する動的ランダムアクセスメモリ

Hyperparallel transistor, router and dynamic random access memory with unity fidelities ( http://arxiv.org/abs/2101.06872v1 )

ライセンス: Link先を確認

Ji-Zhen Liu, Ning-Yang Chen, Wen-Qiang Liu, Hai-Rui Wei and Ming Hua

(参考訳) 理論的には、量子単一光子トランジスタ、ルータ、動的ランダムアクセスメモリ(DRAM)など、いくつかの超並列光学素子を実装している。量子ドット(qd)-キャビティ中間体の必然的な側漏れと不完全な複屈折を考慮に入れ、我々の光学素子の統一性を達成することができる。ハイパー並列構造は光子の偏光と空間自由度(DOF)に基づいており、並列効率を高め、チャネルの容量を改善し、量子資源を節約し、運転時間を短縮し、環境騒音を低減している。また, 実用的スキームは, マイクロキャビティの側漏れや結合強度制限に対して頑健である。

We theoretically implement some hyperparallel optical elements, including quantum single photon transistor, router, and dynamic random access memory (DRAM). The inevitable side leakage and the imperfect birefringence of the quantum dot (QD)-cavity mediates are taken into account, and unity fidelities of our optical elements can be achieved. The hyperparallel constructions are based on polarization and spatial degrees of freedom (DOFs) of the photon to increase the parallel efficiency, improve the capacity of channel, save the quantum resources, reduce the operation time, and decrease the environment noises. Moreover, the practical schemes are robust against the side leakage and the coupling strength limitation in the microcavities.

翻訳日:2023-04-14 21:25:48 公開日:2021-01-18

# 準周期的及びランダム駆動量子多体系における加熱速度の厳密な境界

Rigorous Bounds on the Heating Rate in Thue-Morse Quasiperiodically and Randomly Driven Quantum Many-Body Systems ( http://arxiv.org/abs/2101.07065v1 )

ライセンス: Link先を確認

Takashi Mori, Hongzheng Zhao, Florian Mintert, Johannes Knolle, Roderich Moessner

(参考訳) 閉多体系の非平衡量子力学はリッチだが挑戦的な分野である。周期駆動(フロケット)システムの最近の進歩は多くの厳密な結果をもたらしてきたが、量子多体系に対する我々の理解は急速に変化しているが、半周期駆動は限定的である。ここでは、Thue-Morse準周期駆動およびランダム多極駆動の下での量子多体系の加熱速度の厳密な非摂動境界を導出し、後者は前者の調整可能なランダム化変種である。この過程において、局所可観測物の力学を含む過渡前熱状態を記述する静的有効ハミルトニアンを導出する。 thue-morse準周期駆動のバウンドは、数値シミュレーションと一致して、加熱時間は(\omega/g)^{-c\ln(\omega/g)}$、正の定数$c$、典型的なエネルギースケールのハミルトニアン$g$であることを示唆している。

The nonequilibrium quantum dynamics of closed many-body systems is a rich yet challenging field. While recent progress for periodically driven (Floquet) systems has yielded a number of rigorous results, our understanding on quantum many-body systems driven by rapidly varying but a- and quasi-periodic driving is still limited. Here, we derive rigorous, non-perturbative, bounds on the heating rate in quantum many-body systems under Thue-Morse quasi-periodic driving and under random multipolar driving, the latter being a tunably randomized variant of the former. In the process, we derive a static effective Hamiltonian that describes the transient prethermal state, including the dynamics of local observables. Our bound for Thue-Morse quasi-periodic driving suggests that the heating time scales like $(\omega/g)^{-C\ln(\omega/g)}$ with a positive constant $C$ and a typical energy scale $g$ of the Hamiltonian, in agreement with our numerical simulations.

翻訳日:2023-04-14 21:21:36 公開日:2021-01-18

# SpiNNakerとLoihiのニューロモルフィックボード上で動くシミュレートされた街灯ロボットのためのスパイキング中央パターン生成装置

A Spiking Central Pattern Generator for the control of a simulated lamprey robot running on SpiNNaker and Loihi neuromorphic boards ( http://arxiv.org/abs/2101.07001v1 )

ライセンス: Link先を確認

Emmanouil Angelidis, Emanuel Buchholz, Jonathan Patrick Arreguit O'Neil, Alexis Roug\`e, Terrence Stewart, Axel von Arnim, Alois Knoll, Auke Ijspeert

(参考訳) 中央パターン生成器(cpgs)モデルは、動物の移動を阻害する神経機構とロボット研究の道具の両方を調べるために長い間用いられてきた。本研究では,シミュレートされたランプレーモデルを制御する手段として,スパイキングCPGニューラルネットワークとそのニューロモルフィックハードウェアの実装を提案する。 CPGモデルを構築するために、ニューラルエンジニアリング・フレームワーク(NEF)で繰り返し発生する神経集団を用いて自然に出現する力学系を用いる。モデルの背後にある数学的定式化は、高レベル信号で変調された結合抽象振動子のシステムからなり、様々な出力歩数を生成することができる。中央パターン生成モデルの数学的定式化によって、モデルがスパイクニューラルネットワーク(snn)に変換され、snシミュレータであるnengoで簡単にシミュレーションできることを示した。スパイキングcpgモデルは、様々なシナリオで模擬ランプレイロボットモデルの水泳歩行を生成するために使用される。センサ情報によって提供できるネットワークへの入力を変更することで、ロボットの方向や速度を動的に制御できることを示す。提案手法は工学的応用と科学的研究に適した他のタイプのCPGに一般化することができる。我々はspinnakerとloihiという2つのニューロモルフィック・プラットフォームでシステムをテストする。最後に、このスパイキングアルゴリズムのカテゴリは、エネルギー効率と計算速度の観点から、ニューロモルフィックハードウェアの理論的優位性を活用できる可能性を示している。

Central Pattern Generators (CPGs) models have been long used to investigate both the neural mechanisms that underlie animal locomotion as well as a tool for robotic research. In this work we propose a spiking CPG neural network and its implementation on neuromorphic hardware as a means to control a simulated lamprey model. To construct our CPG model, we employ the naturally emerging dynamical systems that arise through the use of recurrent neural populations in the Neural Engineering Framework (NEF). We define the mathematical formulation behind our model, which consists of a system of coupled abstract oscillators modulated by high-level signals, capable of producing a variety of output gaits. We show that with this mathematical formulation of the Central Pattern Generator model, the model can be turned into a Spiking Neural Network (SNN) that can be easily simulated with Nengo, an SNN simulator. The spiking CPG model is then used to produce the swimming gaits of a simulated lamprey robot model in various scenarios. We show that by modifying the input to the network, which can be provided by sensory information, the robot can be controlled dynamically in direction and pace. The proposed methodology can be generalized to other types of CPGs suitable for both engineering applications and scientific research. We test our system on two neuromorphic platforms, SpiNNaker and Loihi. Finally, we show that this category of spiking algorithms shows a promising potential to exploit the theoretical advantages of neuromorphic hardware in terms of energy efficiency and computational speed.

翻訳日:2023-04-14 21:20:59 公開日:2021-01-18

# 熱平衡から相変化材料を用いたキャビティ壁と原子とのカシミール-ポルダー相互作用

Casimir-Polder Interaction of an Atom with a Cavity Wall Made of Phase-Change Material out of Thermal Equilibrium ( http://arxiv.org/abs/2101.06995v1 )

ライセンス: Link先を確認

G. L. Klimchitskaya and V. M. Mostepanenko

(参考訳) 我々は,He$^*$,Na,Cs,Rbの原子と,二酸化バナジウム膜でコーティングされたサファイアのキャビティウォールとの間の熱平衡なカシミール・ポリダー相互作用を,壁温度の増加とともに誘電-金属相転移を経ると考えている。原子壁分離と壁温度の関数としてのカシミール・ポルダー力とその勾配の数値計算は、後者が環境温度を超えるときに行う。その結果, カシミール・ポルダー力の測定実験において, 熱平衡を欠く石英ガラス壁と$^{87}$Rb原子の勾配を測定した結果と比較した。また, 相変化壁材の使用は, 誘電体壁の場合と異なり, 力の大きさ, 特に力勾配を大きく増加させることが示された。

We consider the out-of-thermal-equilibrium Casimir-Polder interaction between atoms of He$^*$, Na, Cs, and Rb and a cavity wall made of sapphire coated with a vanadium dioxide film which undergoes the dielectric-to-metal phase transition with increasing wall temperature. Numerical computations of the Casimir-Polder force and its gradient as the functions of atom-wall separation and wall temperature are made when the latter exceeds the temperature of the environment. The obtained results are compared with those in experiment on measuring the gradient of the Casimir-Polder force between $^{87}$Rb atoms and a silica glass wall out of thermal equilibrium. It is shown that the use of phase-change wall material increases significantly the force magnitude and especially the force gradient, as opposed to the case of dielectric wall.

翻訳日:2023-04-14 21:20:34 公開日:2021-01-18

# 実空間時間依存schr\"odinger計算によるヘキサゴナルナノリボンの高調波スペクトル

High-harmonic spectra of hexagonal nanoribbons from real-space time-dependent Schr\"odinger calculations ( http://arxiv.org/abs/2101.06970v1 )

ライセンス: Link先を確認

Helena Dr\"ueke and Dieter Bauer

(参考訳) 高ハーモニック分光法は、全ての光学的手段と前例のない時間分解能で凝縮物質の電子構造とダイナミクスを撮像する有望な候補である。本研究では, 六角形ナノリボン, グラフェン, 六角形窒化ホウ素などの高調波スペクトルをアームチェアおよびジグザグ構成で検討した。系の対称性は、放射された高調波の存在と強度を説明する。

High-harmonic spectroscopy is a promising candidate for imaging electronic structures and dynamics in condensed matter by all-optical means and with unprecedented temporal resolution. We investigate harmonic spectra from finite, hexagonal nanoribbons, such as graphene and hexagonal boron nitride, in armchair and zig-zag configuration. The symmetry of the system explains the existence and intensity of the emitted harmonics.

翻訳日:2023-04-14 21:20:18 公開日:2021-01-18

# Capitol (Pat)riots: TwitterとParlerの比較研究

Capitol (Pat)riots: A comparative study of Twitter and Parler ( http://arxiv.org/abs/2101.06914v1 )

ライセンス: Link先を確認

Hitkul, Avinash Prabhu, Dipanwita Guhathakurta, Jivitesh jain, Mallika Subramanian, Manvith Reddy, Shradha Sehgal, Tanvi Karandikar, Amogh Gulati, Udit Arora, Rajiv Ratn Shah and Ponnurangam Kumaraguru

(参考訳) 2021年1月6日、右派保守派の暴徒がアメリカ議会議事堂ヒルを襲撃し、2020年の大統領選挙結果を議会が承認した。イベント開始直後、暴動に関連する投稿がソーシャルメディアで流行し始めた。ソーシャルメディアプラットフォームは、ソーシャルメディアプラットフォームであるParlerを支持する言論の自由であり、暴動が計画され、議論されたプラットフォームとして主張されている。われわれのレポートは、暴動の前後のparlerとtwitterのトレンドコンテンツの対比を示している。トレンドハッシュタグに基づいて両プラットフォームからデータを収集し,話題の話題,プラットフォームでアクティブな人,両プラットフォームで生成されたコンテンツのオーガニック性などに基づいて比較を行った。 twitter上のコンテンツはイベントに対する強い不満を持ち、暴動やインキッターに対する行動を求めたが、パーラーのコンテンツは、攻撃的な暴徒と同様の投票者詐欺の考えを反映する強い保守的な物語を持っていた。またTwitterと比較すると、Parlerのトラフィックの操作率も非常に高い。

On 6 January 2021, a mob of right-wing conservatives stormed the USA Capitol Hill interrupting the session of congress certifying 2020 Presidential election results. Immediately after the start of the event, posts related to the riots started to trend on social media. A social media platform which stood out was a free speech endorsing social media platform Parler; it is being claimed as the platform on which the riots were planned and talked about. Our report presents a contrast between the trending content on Parler and Twitter around the time of riots. We collected data from both platforms based on the trending hashtags and draw comparisons based on what are the topics being talked about, who are the people active on the platforms and how organic is the content generated on the two platforms. While the content trending on Twitter had strong resentments towards the event and called for action against rioters and inciters, Parler content had a strong conservative narrative echoing the ideas of voter fraud similar to the attacking mob. We also find a disproportionately high manipulation of traffic on Parler when compared to Twitter.

翻訳日:2023-04-14 21:20:09 公開日:2021-01-18

# BECから量子関連相への交叉における量子性の発見

Uncover quantumness in the crossover from BEC to quantum-correlated phase ( http://arxiv.org/abs/2101.06878v1 )

ライセンス: Link先を確認

J.P. Restrepo Cuartas and H. Vinck-Posada

(参考訳) Tavis-Cummingsモデルにおける集団現象は相転移の特徴に着目して広く研究されている。多くの場合、分離された放射線マター系を考慮した変分法が用いられている。本稿では, 単一モードキャビティに結合した2レベルエミッタの集合体における量子絡み合いの役割について検討する。系の統計的性質、例えば最初の4つの統計モーメントは、光と物質の分布の構造を明確に示している。 2階相関関数はいくつかの状態において1つになるが、統計解析はコヒーレントな振る舞いから、共通理解とは対照的に急激な離脱を証明している。

Collective phenomena in the Tavis-Cummings model has been widely studied, focusing on the phase transition features. In many occasions, it has been used variational approaches that consider separated radiation-matters systems. In this paper, we examine the role of the quantum entanglement of an assembly of two-level emitters coupled to a single-mode cavity; this allows us to characterise the quantum correlated state for each regime. Statistical properties of the system, e.g., the first four statistical moments, show clearly the structure of the light and matter distributions. Even though the second order correlation function goes to one in some regimes, the statistical analysis evidence a sharp departure from coherent behaviour, contrarily to the common understanding.

翻訳日:2023-04-14 21:19:12 公開日:2021-01-18

# コロナ・アプリにおけるデータ保護効果評価

Data Protection Impact Assessment for the Corona App ( http://arxiv.org/abs/2101.07292v1 )

ライセンス: Link先を確認

Kirsten Bock, Christian R. K\"uhne, Rainer M\"uhlhoff, M\v{e}to R. Ost, J\"org Pohle, Rainer Rehak

(参考訳) SARS-CoV-2は2020年初頭にヨーロッパで普及して以来、パンデミックとの戦いや封じ込めに関する技術的な解決を求める声が強く、議論の中心に接触追跡アプリがある。 EUのGDPR(General Daten Protection Regulation)は、データ処理が権利と自由に高いリスクをもたらす可能性のあるデータ保護影響評価(DPIA)を実施するよう、管理者に要求している(第35条GDPR)。 DPIAは、基本的権利に関連するデータ処理の結果を識別し、評価する構造化されたリスク分析であり、これらのリスクに対処するために考えられた措置や、それを行うことができないことを示す。標準データ保護モデル (SDM) に基づいて, PEPP-PT, DP-3T, およびChaos Computer ClubのメンバーであるLinus Neumannによって要約された概念である, 最も"プライバシフレンドリー"であると考えられる3つの接触追跡アプリ設計を, 徹底的に検証する科学的DPIAを提案する。 DPIAは、処理コンテキストと期待されるユースケースの分析から始まります。そして、現実的な処理目的を定義することにより、処理アクティビティを記述する。続いて法的な評価としきい値分析が行われる。最後に,弱点,リスクを分析し,適切な保護策を決定する。分散化実装でさえも、多くの重大な弱点とリスクを伴うことを示している。法的には、同意は法的根拠として適さないので、データは法律に基づいて処理されなければならない。また,データ主体と影響を受ける人々の権利を実現するための対策が不十分であることがわかった。最後に、匿名化は、個人的参照を分離することを目的とした継続的プロセスとして理解され、法的、組織的、技術的措置が混在していることを示します。現在利用可能なすべての提案には、そのような明確な分離プロセスがない。

Since SARS-CoV-2 started spreading in Europe in early 2020, there has been a strong call for technical solutions to combat or contain the pandemic, with contact tracing apps at the heart of the debates. The EU's General Daten Protection Regulation (GDPR) requires controllers to carry out a data protection impact assessment (DPIA) where their data processing is likely to result in a high risk to the rights and freedoms (Art. 35 GDPR). A DPIA is a structured risk analysis that identifies and evaluates possible consequences of data processing relevant to fundamental rights and describes the measures envisaged to address these risks or expresses the inability to do so. Based on the Standard Data Protection Model (SDM), we present a scientific DPIA which thoroughly examines three published contact tracing app designs that are considered to be the most "privacy-friendly": PEPP-PT, DP-3T and a concept summarized by Chaos Computer Club member Linus Neumann, all of which process personal health data. The DPIA starts with an analysis of the processing context and some expected use cases. Then, the processing activities are described by defining a realistic processing purpose. This is followed by the legal assessment and threshold analysis. Finally, we analyse the weak points, the risks and determine appropriate protective measures. We show that even decentralized implementations involve numerous serious weaknesses and risks. Legally, consent is unfit as legal ground hence data must be processed based on a law. We also found that measures to realize the rights of data subjects and affected people are not sufficient. Last but not least, we show that anonymization must be understood as a continuous process, which aims at separating the personal reference and is based on a mix of legal, organizational and technical measures. All currently available proposals lack such an explicit separation process.

翻訳日:2023-04-14 21:12:34 公開日:2021-01-18

# コヒーレントワンウェイ量子鍵分布に対するゼロエラー攻撃

Zero-error attack against coherent-one-way quantum key distribution ( http://arxiv.org/abs/2101.07192v1 )

ライセンス: Link先を確認

R\'obert Tr\'enyi, Marcos Curty

(参考訳) コヒーレントワンウェイ(COW)量子鍵分布(QKD)は、単純な実験装置で秘密鍵を長距離に分散するという約束を守った。実際、このスキームは現在商用アプリケーションで使われている。しかし、最近、その秘密鍵レートはシステムの透過率とほぼ4分の1でスケールしており、長距離QKD伝送には適していないことが示されている。このような悲観的な結果はいわゆるゼロエラー攻撃(ゼロエラー攻撃)によって引き起こされ、盗聴器はエラーを発生させないが、システムの正統な利用者はセキュアな鍵を抽出できない。そこで本研究では,誤差のない場合,その最大到達距離を制限できないという観点から,事実上最適であるcow-qkdに対するゼロエラー攻撃を提案する。これは秘密鍵レートの上界に変換され、これは以前に知られていた上界よりも桁違いに低い。

Coherent-one-way (COW) quantum key distribution (QKD) held the promise of distributing secret keys over long distances with a simple experimental setup. Indeed, this scheme is currently used in commercial applications. Surprisingly, however, it has been recently shown that its secret key rate scales at most quadratically with the system's transmittance and, thus, it is not appropriate for long distance QKD transmission. Such pessimistic result was derived by employing a so-called zero-error attack, in which the eavesdropper does not introduce any error, but still the legitimate users of the system cannot distill a secure key. Here, we present a zero-error attack against COW-QKD that is essentially optimal, in the sense that no other attack can restrict further its maximum achievable distance in the absence of errors. This translates into an upper bound on its secret key rate that is more than an order of magnitude lower than previously known upper bounds.

翻訳日:2023-04-14 21:10:31 公開日:2021-01-18

# プレーヤーデータを生成するモバイルゲームの設計 -- 学んだ教訓

Designing a mobile game to generate player data -- lessons learned ( http://arxiv.org/abs/2101.07144v1 )

ライセンス: Link先を確認

William Wallis and William Kavanagh and Alice Miller and Tim Storer

(参考訳) ユーザフレンドリーなツールは、高品質なゲーム設計の要件を、開発経験のない研究者が独自のゲームをリリースできるレベルまで引き下げた。しかし、研究目的のゲームは少ないため、最高の実践は確立されていない。同様のプロジェクトの指導なしにモバイルゲームを開発したので、私たちは経験を共有する必要性に気づき、将来の研究者がそれに追随する道を開くことに気付きました。ゲームバランシングとシステムシミュレーションの研究は、マルチプレイヤーモバイルゲーム「RPGLite」の開発に触発された実験ケーススタディを必要とした。 RPGの作成では、開発に関する専門知識がなく、研究目的で効果的なアマチュアゲーム開発に関する一連の教訓を学びました。本稿では,開発プロセス全体を振り返り,これらの教訓を紹介する。

User friendly tools have lowered the requirements of high-quality game design to the point where researchers without development experience can release their own games. However, there is no established best-practice as few games have been produced for research purposes. Having developed a mobile game without the guidance of similar projects, we realised the need to share our experience so future researchers have a path to follow. Research into game balancing and system simulation required an experimental case study, which inspired the creation of "RPGLite", a multiplayer mobile game. In creating RPGLitewith no development expertise we learned a series of lessons about effective amateur game development for research purposes. In this paper we reflect on the entire development process and present these lessons.

翻訳日:2023-04-14 21:09:40 公開日:2021-01-18

# 批判的分析:batアルゴリズムに基づく複数の領域の探索と応用

Critical Analysis: Bat Algorithm based Investigation and Application on Several Domains ( http://arxiv.org/abs/2102.01201v1 )

ライセンス: Link先を確認

Shahla U. Umar, Tarik A. Rashid

(参考訳) 近年,2010 年に xin-she yang が提案した bat algorithm (ba) などの群最適化アルゴリズムが提案されている。このアルゴリズムのアイデアはコウモリのエコーロケーション能力から取られた。目的: 本研究の目的は, batアルゴリズムの限界, アルゴリズムが適用されている分野, 異なる領域における汎用最適化問題, および他のメタヒューリスティックアルゴリズムに対する性能を評価するすべての研究を含む, 読者にbatアルゴリズムの完全な研究を提供することである。 Approach: Bat Algorithm is given in-depth in terms of backgrounds, characteristics, limitations, it has also displayed the algorithms that hybridized with BA (K-Medoids, Back-propagation neural network, Harmony Search Algorithm, Differential Evaluation Strategies, Enhanced Particle Swarm Optimization, and Cuckoo Search Algorithm) and their theoretical results, as well as to the modifications that have been performed of the algorithm (Modified Bat Algorithm (MBA), Enhanced Bat Algorithm (EBA), Bat Algorithm with Mutation (BAM), Uninhabited Combat Aerial Vehicle-Bat algorithm with Mutation (UCAV-BAM), Nonlinear Optimization)... 発見:このアルゴリズムの長所と短所を、アルゴリズムに対処するすべての研究と、それについて科学者が理解し、開発するのに役立つことを期待した分野と応用に光を当てた。 originality/value: 研究コミュニティの知識に関しては、このアルゴリズムに関する包括的な調査は行われていません。キーワードは、swarm intelligence、nature-inspired algorithms、metaheuristic algorithms、optimize algorithms、bat algorithmである。

In recent years several swarm optimization algorithms, such as Bat Algorithm (BA) have emerged, which was proposed by Xin-She Yang in 2010. The idea of the algorithm was taken from the echolocation ability of bats. Purpose: The purpose of this study is to provide the reader with a full study of the Bat Algorithm, including its limitations, the fields that the algorithm has been applied, versatile optimization problems in different domains, and all the studies that assess its performance against other meta-heuristic algorithms. Approach: Bat Algorithm is given in-depth in terms of backgrounds, characteristics, limitations, it has also displayed the algorithms that hybridized with BA (K-Medoids, Back-propagation neural network, Harmony Search Algorithm, Differential Evaluation Strategies, Enhanced Particle Swarm Optimization, and Cuckoo Search Algorithm) and their theoretical results, as well as to the modifications that have been performed of the algorithm (Modified Bat Algorithm (MBA), Enhanced Bat Algorithm (EBA), Bat Algorithm with Mutation (BAM), Uninhabited Combat Aerial Vehicle-Bat algorithm with Mutation (UCAV-BAM), Nonlinear Optimization)... Findings: Shed light on the advantages and disadvantages of this algorithm through all the researches that dealt with the algorithm in addition to the fields and applications it has addressed in the hope that it will help scientists understand and develop it. Originality/value: As far as the research community knowledge, there is no comprehensive survey study conducted on this algorithm cover{\i}ng all its aspects. Keywords: Swarm Intelligence; Nature-Inspired Algorithms; Metaheuristic Algorithms; Optimization Algorithms; Bat Algorithm.

翻訳日:2023-04-14 21:03:08 公開日:2021-01-18

# データ資源プロファイル:2020年3月から5月にかけてのニューヨークで発生した新型コロナウイルスの感染状況

Data Resource Profile: Egress Behavior from Select NYC COVID-19 Exposed Health Facilities March-May 2020 ( http://arxiv.org/abs/2101.10079v1 )

ライセンス: Link先を確認

Debra F. Laefer, Thomas Kirchner, Haoran (Frank) Jiang, Darlene Cheong, Yunqi (Veronica) Jiang, Aseah Khan, Weiyi Qiu, Nikki Tai, Tiffany Truong, Maimunah Virk

(参考訳) ベクターコントロール戦略は、新型コロナウイルス(covid-19)の緩和と封じ込めの中心であり、公共および民間の空間および関連サービスの運用状況を制限する自治体条例の形で行われている。しかし、リスク行動の観点から特定の集団反応についてはほとんど知られていない。これらのベクターコントロール変数戦略の影響を理解するために、ニューヨーク市の最初の新型コロナウイルス波(03/22/20-05/19/20)のピーク時に、ニューヨーク市の19の医療施設の外で、複数週間にわたる多地点観測研究が行われた。本研究の目的は, 病院や救急医療センターから退院した個人の触覚, 目的地選択, PPE 利用行動の把握である。主要な目標は、人々が三次元ベクトル環境と相互作用する方法に関する将来の研究のための経験的基礎を確立することであった。匿名化されたデータはスマートフォンで収集された。各データレコードには、医療施設を離れる個人の時間、データ、場所、ルーティング、ビルド環境とのインタラクション、他の個人、そして自分自身が含まれている。 PPEの使用状況、目的地、仲介所、交通機関の選択も記録されている。この記録は施設のジップコードによる61の社会経済的要因と7つの同時気象要因に関連付けられ、ARCGISシステムで統合された形状ファイルにまとめられた。本稿では,5,100以上の公開アクセス可能な観測記録を作成するためのプロジェクトチームとプロトコルについて述べる。

Vector control strategies are central to the mitigation and containment of COVID-19 and have come in the form of municipal ordinances that restrict the operational status of public and private spaces and associated services. Yet, little is known about specific population responses in terms of risk behaviors. To help understand the impact of those vector control variable strategies, a multi-week, multi-site observational study was undertaken outside of 19 New York City medical facilities during the peak of the city's initial COVID-19 wave (03/22/20-05/19/20). The aim was to capture perishable data of the touch, destination choice, and PPE usage behavior of individuals egressing hospitals and urgent care centers. A major goal was to establish an empirical basis for future research on the way people interact with three-dimensional vector environments. Anonymized data were collected via smart phones. Each data record includes the time, data, and location of an individual leaving a healthcare facility, their routing, interactions with the build environment, other individuals, and themselves. Most records also note their PPE usage, destination, intermediary stops, and transportation choices. The records were linked with 61 socio-economic factors by the facility zip code and 7 contemporaneous weather factors and the merged in a unified shapefile in an ARCGIS system. This paper describes the project team and protocols used to produce over 5,100 publicly accessible observational records and an affiliated codebook that can be used to study linkages between individual behaviors and on-the-ground conditions.

翻訳日:2023-04-14 21:02:30 公開日:2021-01-18

# 機械的TA2: TAをサポートしたピアグレーディングシステム

Mechanical TA 2: A System for Peer Grading with TA Support ( http://arxiv.org/abs/2101.10078v1 )

ライセンス: Link先を確認

Hedayat Zarkoob, Farzad Abdolhosseini, and Kevin Leyton-Brown

(参考訳) Mechanical TA 2 (MTA2) は、信頼性の高い TA グレーダを利用して高品質なピアレビューをインセンティブ化する、オープンソースの Web ベースのピアグレーティングアプリケーションである。以前のMTAのプロトタイプ実装では、コンセプトの価値は証明されていたが、スケールや拡張性には適せず、MTA2はこれらのハードルを克服したシステムを完全に再実装した。 MTA2は2つの相互接続された目的を果たす: 実用的なピアグレーディングを容易にし、異なるピアグレーディング機構の実験用のテストベッドとして機能する。このシステムの特徴は、カスタマイズを容易にするモジュラーデザイン、生徒をピアグレードの技量に基づいて異なるプールに分割する支援、自動校正とスポットチェックの仕組み、学生が格付けをアピールし、個々のレビューに対するフィードバックを与える能力などである。

Mechanical TA 2 (MTA2) is an open source web-based peer grading application that leverages trusted TA graders to incentivize high-quality peer review. A previous, prototype implementation of MTA proved the value of the concept, but was neither suitable for use at scale nor easily extensible; MTA2 is a complete reimplementation of the system that overcomes these hurdles. MTA2 serves two, interconnected purposes: facilitating practical peer grading and serving as a testbed for experimentation with different peer grading mechanisms. The system is characterized by a modular design that makes customization easy; support for dividing students into different pools based on their peer-grading prowess; mechanisms for automated calibration and spot checking; and the ability for students to appeal grades and to give feedback about individual reviews.

翻訳日:2023-04-14 21:02:02 公開日:2021-01-18

# パネル:人間とテクノロジーによる包括的プライバシーとセキュリティ

Panel: Humans and Technology for Inclusive Privacy and Security ( http://arxiv.org/abs/2101.07377v1 )

ライセンス: Link先を確認

Sanchari Das and Robert S. Gutzwiller and Rod D. Roscoe and Prashanth Rajivan and Yang Wang and L. Jean Camp and Roberto Hoyle

(参考訳) コンピュータセキュリティとユーザプライバシは、ユーザの増加とデータに対する脅威の両方により、デジタル時代の重要な問題と懸念である。一般的なサイバーセキュリティガイダンス(すなわち、悪意のある脅威からすべてのユーザーデータを保護)とプライバシーの個人主義的アプローチ(すなわち、ユーザ固有のものであり、ユーザのニーズやリスク認識に依存する)の間に、別の問題が生じる。ソフトウェアバグ(Streiff, Kenny, Das, Leeth, & Camp, 2018)、安全でない認証(Das, Wang, Tingle, & Camp, 2019)、行動的(パスワードの共有(Das, Dingman, & Camp, 2018)、コンプライアンス(Das, Dev, & Srinivasan, 2018)である。このパネルの提案は、セキュリティとプライバシの非独占的な設計から生じる、社会技術的脆弱性の第3のカテゴリに対処します。本パネルでは,プライバシーに対するユーザのニーズと欲求に対処する。パネルは価値に敏感なデザインについて詳細な議論を行い、高齢者や10代、障害のある人、一般のセキュリティやプライバシーの懸念に重点を置かない人たちなど、潜在的に脆弱な人々に焦点を当てる。人的要因は、これらの領域の改善を促進することへの関心と能力を持っている。

Computer security and user privacy are critical issues and concerns in the digital era due to both increasing users and threats to their data. Separate issues arise between generic cybersecurity guidance (i.e., protect all user data from malicious threats) and the individualistic approach of privacy (i.e., specific to users and dependent on user needs and risk perceptions). Research has shown that several security- and privacy-focused vulnerabilities are technological (e.g., software bugs (Streiff, Kenny, Das, Leeth, & Camp, 2018), insecure authentication (Das, Wang, Tingle, & Camp, 2019)), or behavioral (e.g., sharing passwords (Das, Dingman, & Camp, 2018); and compliance (Das, Dev, & Srinivasan, 2018) (Dev, Das, Rashidi, & Camp, 2019)). This panel proposal addresses a third category of sociotechnical vulnerabilities that can and sometimes do arise from non-inclusive design of security and privacy. In this panel, we will address users' needs and desires for privacy. The panel will engage in in-depth discussions about value-sensitive design while focusing on potentially vulnerable populations, such as older adults, teens, persons with disabilities, and others who are not typically emphasized in general security and privacy concerns. Human factors have a stake in and ability to facilitate improvements in these areas.

翻訳日:2023-04-14 21:01:22 公開日:2021-01-18

# PLLay:永続景観に基づく効率的な地形層

PLLay: Efficient Topological Layer based on Persistence Landscapes ( http://arxiv.org/abs/2002.02778v4 )

ライセンス: Link先を確認

Kwangho Kim, Jisu Kim, Manzil Zaheer, Joon Sik Kim, Frederic Chazal, and Larry Wasserman

(参考訳) 本研究では,パーシステンスランドスケープに基づく一般ディープラーニングモデルのための新しいトポロジカルレイヤpllayを提案し,入力データ構造の基盤となるトポロジ的特徴を効率的に活用する。本研究では,任意のフィルタを持つ一般永続ホモロジーに対して,層入力に関して微分可能性を示す。したがって,提案するレイヤをネットワーク内の任意の場所に配置し,入力データのトポロジ的特徴に関する重要な情報を次のレイヤに供給することで,ネットワークの学習性を向上させることができる。 PLLayのタスク最適構造は、入力処理やデータ前処理を必要とせずに、バックプロパゲーションを通じてトレーニング中に学習される。本稿では,DTM関数に基づくフィルタの新しい適応法を提案し,安定性解析により,提案した層が雑音や外れ値に対して頑健であることを示す。各種データセットの分類実験により,本手法の有効性を示す。

We propose PLLay, a novel topological layer for general deep learning models based on persistence landscapes, in which we can efficiently exploit the underlying topological features of the input data structure. In this work, we show differentiability with respect to layer inputs, for a general persistent homology with arbitrary filtration. Thus, our proposed layer can be placed anywhere in the network and feed critical information on the topological features of input data into subsequent layers to improve the learnability of the networks toward a given task. A task-optimal structure of PLLay is learned during training via backpropagation, without requiring any input featurization or data preprocessing. We provide a novel adaptation for the DTM function-based filtration, and show that the proposed layer is robust against noise and outliers through a stability analysis. We demonstrate the effectiveness of our approach by classification experiments on various datasets.

翻訳日:2023-01-03 04:27:58 公開日:2021-01-18

# グラフニューラルネットワークを強化したランダム特徴

Random Features Strengthen Graph Neural Networks ( http://arxiv.org/abs/2002.03155v3 )

ライセンス: Link先を確認

Ryoma Sato, Makoto Yamada, Hisashi Kashima

(参考訳) グラフニューラルネットワーク(GNN)は、さまざまなグラフ学習タスクのための強力な機械学習モデルである。近年,様々なGNNモデルの表現力の限界が明らかにされている。例えば、gnnは非同型グラフを区別できず、効率的なグラフアルゴリズムを学べない。本稿では,各ノードにランダムな特徴を加えるだけで,GNNが強力になることを示す。このランダムな特徴により、GNNは最小支配セット問題と最大マッチング問題に対して近似比でほぼ最適な多項式時間近似アルゴリズムを学習できることを示す。本手法の主な利点は,市販のGNNモデルと若干の修正を加えて組み合わせることができる点である。実験により、ランダムな特徴の追加により、グラフ畳み込みネットワーク(GCN)やグラフ同型ネットワーク(GIN)など、通常のGNNが解決できない様々な問題を解決することができることを示す。

Graph neural networks (GNNs) are powerful machine learning models for various graph learning tasks. Recently, the limitations of the expressive power of various GNN models have been revealed. For example, GNNs cannot distinguish some non-isomorphic graphs and they cannot learn efficient graph algorithms. In this paper, we demonstrate that GNNs become powerful just by adding a random feature to each node. We prove that the random features enable GNNs to learn almost optimal polynomial-time approximation algorithms for the minimum dominating set problem and maximum matching problem in terms of approximation ratios. The main advantage of our method is that it can be combined with off-the-shelf GNN models with slight modifications. Through experiments, we show that the addition of random features enables GNNs to solve various problems that normal GNNs, including the graph convolutional networks (GCNs) and graph isomorphism networks (GINs), cannot solve.

翻訳日:2023-01-02 22:19:37 公開日:2021-01-18

# スケーラブルな信念伝播のための緩和スケジューリング

Relaxed Scheduling for Scalable Belief Propagation ( http://arxiv.org/abs/2002.11505v2 )

ライセンス: Link先を確認

Vitaly Aksenov and Dan Alistarh and Janne H. Korhonen

(参考訳) 大規模ハードウェア並列性を活用する能力は、機械学習の急速な進歩の重要な実現要因のひとつだ。その結果、古典的機械学習アルゴリズムの効率的な並列変種の開発に多大な労力が注がれた。しかし、並列化に関する豊富な知識にもかかわらず、いくつかの古典的な機械学習アルゴリズムは収束を維持しながら効率的に並列化するのが難しいことがしばしばある。本稿では,グラフィカルモデルに基づく推論の鍵となる機械学習タスクに対する効率的な並列アルゴリズム,特に基本的な信念伝播アルゴリズムに着目した。この文脈でスケーラブルな緩和スケジューラをどのように活用するかを示すことによって、この古典的なパラダイムを効率的に並列化するという課題に対処する。本稿では,本手法が,拡張性および壁時計収束時間の観点から,様々な実用的応用において,従来の並列信念伝達実装よりも優れていることを示す,広範な実証研究を行う。

The ability to leverage large-scale hardware parallelism has been one of the key enablers of the accelerated recent progress in machine learning. Consequently, there has been considerable effort invested into developing efficient parallel variants of classic machine learning algorithms. However, despite the wealth of knowledge on parallelization, some classic machine learning algorithms often prove hard to parallelize efficiently while maintaining convergence. In this paper, we focus on efficient parallel algorithms for the key machine learning task of inference on graphical models, in particular on the fundamental belief propagation algorithm. We address the challenge of efficiently parallelizing this classic paradigm by showing how to leverage scalable relaxed schedulers in this context. We present an extensive empirical study, showing that our approach outperforms previous parallel belief propagation implementations both in terms of scalability and in terms of wall-clock convergence time, on a range of practical applications.

翻訳日:2022-12-28 20:27:10 公開日:2021-01-18

# 逆整合損失を用いた不対向画像変換

Unpaired Image-to-Image Translation using Adversarial Consistency Loss ( http://arxiv.org/abs/2003.04858v7 )

ライセンス: Link先を確認

Yihao Zhao, Ruihai Wu, Hao Dong

(参考訳) unpaired image-to-image translationは、unpaired trainingデータを使用して異なる画像ドメイン間のマッピングを見つけることを目的としたビジョン問題のクラスである。サイクル一貫性損失はそのような問題に対して広く用いられる制約である。しかし、厳密なピクセルレベルの制約のため、幾何学的な変化や大きな物体の除去、無関係なテクスチャの無視はできない。本稿では,画像から画像への変換における新たな逆抵抗損失を提案する。この損失は、翻訳された画像が特定のソースイメージに変換される必要はなく、翻訳された画像がソースイメージの重要な特徴を保持し、上記のサイクルコンシスタンスロスの欠点を克服するよう促すことができる。本手法は, 眼鏡の除去, 男性から女性への翻訳, 自撮りからアニメへの翻訳の3つの課題に対して, 最先端の成果を得る。

Unpaired image-to-image translation is a class of vision problems whose goal is to find the mapping between different image domains using unpaired training data. Cycle-consistency loss is a widely used constraint for such problems. However, due to the strict pixel-level constraint, it cannot perform geometric changes, remove large objects, or ignore irrelevant texture. In this paper, we propose a novel adversarial-consistency loss for image-to-image translation. This loss does not require the translated image to be translated back to be a specific source image but can encourage the translated images to retain important features of the source images and overcome the drawbacks of cycle-consistency loss noted above. Our method achieves state-of-the-art results on three challenging tasks: glasses removal, male-to-female translation, and selfie-to-anime translation.

翻訳日:2022-12-24 21:20:05 公開日:2021-01-18

# PiP:自律運転のための計画インフォームド軌道予測

PiP: Planning-informed Trajectory Prediction for Autonomous Driving ( http://arxiv.org/abs/2003.11476v2 )

ライセンス: Link先を確認

Haoran Song, Wenchao Ding, Yuxuan Chen, Shaojie Shen, Michael Yu Wang, Qifeng Chen

(参考訳) 特に社会的に準拠した柔軟な方法で、自動運転計画のための周辺車両の動きを予測することは重要である。しかし、運転行動の相互作用と不確実性のため、将来の予測は困難である。マルチエージェント環境での予測問題に対処するために,計画インフォームド軌道予測(PiP)を提案する。我々のアプローチは、歴史的情報のみに基づいて計画と切り離された従来の予測方法と区別される。本手法は,ego車両の計画と共に予測プロセスを通知することにより,高速道路データセットにおけるマルチエージェント予測の最先端性能を実現する。さらに,インタラクティブなシナリオにおいて,自律走行に非常に有益であるego車両の複数候補軌道にpipを条件付けすることにより,予測と計画を結合する新しいパイプラインを実現する。

It is critical to predict the motion of surrounding vehicles for self-driving planning, especially in a socially compliant and flexible way. However, future prediction is challenging due to the interaction and uncertainty in driving behaviors. We propose planning-informed trajectory prediction (PiP) to tackle the prediction problem in the multi-agent setting. Our approach is differentiated from the traditional manner of prediction, which is only based on historical information and decoupled with planning. By informing the prediction process with the planning of ego vehicle, our method achieves the state-of-the-art performance of multi-agent forecasting on highway datasets. Moreover, our approach enables a novel pipeline which couples the prediction and planning, by conditioning PiP on multiple candidate trajectories of the ego vehicle, which is highly beneficial for autonomous driving in interactive scenarios.

翻訳日:2022-12-20 03:50:16 公開日:2021-01-18

# 時空間特徴抽出のための畳み込みスパイクニューラルネットワーク

Convolutional Spiking Neural Networks for Spatio-Temporal Feature Extraction ( http://arxiv.org/abs/2003.12346v2 )

ライセンス: Link先を確認

Ali Samadzadeh, Fatemeh Sadat Tabatabaei Far, Ali Javadi, Ahmad Nickabadi, Morteza Haghir Chehreghani

(参考訳) スパイキングニューラルネットワーク(SNN)は、イベントベースの性質のため、低消費電力および組み込みシステム(新しいニューロモルフィックチップなど)で使用できる。また、従来のニューラルネットワーク(anns)とは対照的に、anの特性を維持しながら計算コストが低いという利点がある。しかしながら、畳み込みスパイクニューラルネットワークやその他の種類のSNNの層における時間的符号化はまだ研究されていない。本稿では,この特性を利用した実験において,畳み込みsnsの時空間的特徴抽出について考察する。浅い畳み込みSNNは、C3DやConvLstmなどの最先端の時空間特徴抽出手法よりも優れている。さらに,NMNIST (99.6%), DVS-CIFAR10 (69.2%), DVS-Gesture (96.7%), ANN の UCF-101 (42.1%) および HMDB-51 (21.5%) のデータセットに比べて優れた性能を示した実世界の問題(特に分類タスク)に取り組むための新しいディープスパイクアーキテクチャを提案する。また,本論文で説明した時空間バックプロパゲーションの変化に基づいて,トレーニングプロセスが実施されていることも注目に値する。

Spiking neural networks (SNNs) can be used in low-power and embedded systems (such as emerging neuromorphic chips) due to their event-based nature. Also, they have the advantage of low computation cost in contrast to conventional artificial neural networks (ANNs), while preserving ANN's properties. However, temporal coding in layers of convolutional spiking neural networks and other types of SNNs has yet to be studied. In this paper, we provide insight into spatio-temporal feature extraction of convolutional SNNs in experiments designed to exploit this property. The shallow convolutional SNN outperforms state-of-the-art spatio-temporal feature extractor methods such as C3D, ConvLstm, and similar networks. Furthermore, we present a new deep spiking architecture to tackle real-world problems (in particular classification tasks) which achieved superior performance compared to other SNN methods on NMNIST (99.6%), DVS-CIFAR10 (69.2%) and DVS-Gesture (96.7%) and ANN methods on UCF-101 (42.1%) and HMDB-51 (21.5%) datasets. It is also worth noting that the training process is implemented based on variation of spatio-temporal backpropagation explained in the paper.

翻訳日:2022-12-19 04:46:24 公開日:2021-01-18

# 微分プライバシーのための離散ガウス

The Discrete Gaussian for Differential Privacy ( http://arxiv.org/abs/2004.00010v5 )

ライセンス: Link先を確認

Cl\'ement L. Canonne, Gautam Kamath, Thomas Steinke

(参考訳) 微分プライベートシステムを構築するための重要なツールは、機密データセットで評価された関数の出力にガウスノイズを追加することである。残念ながら、継続的分散を使うことにはいくつかの実践的な課題がある。まず第一に、有限のコンピュータは、連続分布のサンプルを正確に表現することはできない。さらに、基礎となるデータがそれ自体が離散的(例えば人口数)である場合、連続的なノイズを加えると、結果の解釈が困難になる。これらの欠点を念頭に置いて,微分プライバシーの文脈において離散ガウシアンを導入し,分析する。具体的には,離散ガウス雑音の追加は連続ガウス雑音の追加と本質的に同一のプライバシーと精度を保証することを示す。また,この分布から正確なサンプリングを行うための簡易かつ効率的なアルゴリズムを提案する。これは、プライベートに応答するクエリ、あるいは一般的には低感度整数値クエリに適用可能であることを示している。

A key tool for building differentially private systems is adding Gaussian noise to the output of a function evaluated on a sensitive dataset. Unfortunately, using a continuous distribution presents several practical challenges. First and foremost, finite computers cannot exactly represent samples from continuous distributions, and previous work has demonstrated that seemingly innocuous numerical errors can entirely destroy privacy. Moreover, when the underlying data is itself discrete (e.g., population counts), adding continuous noise makes the result less interpretable. With these shortcomings in mind, we introduce and analyze the discrete Gaussian in the context of differential privacy. Specifically, we theoretically and experimentally show that adding discrete Gaussian noise provides essentially the same privacy and accuracy guarantees as the addition of continuous Gaussian noise. We also present an simple and efficient algorithm for exact sampling from this distribution. This demonstrates its applicability for privately answering counting queries, or more generally, low-sensitivity integer-valued queries.

翻訳日:2022-12-18 01:51:27 公開日:2021-01-18

# 深層学習に基づく無線変調分類器の可視化

Visualizing Deep Learning-based Radio Modulation Classifier ( http://arxiv.org/abs/2005.02175v2 )

ライセンス: Link先を確認

Liang Huang (Member, IEEE), You Zhang, Weijian Pan, Jinyin Chen, Li Ping Qian (Senior Member, IEEE) and Yuan Wu (Senior Member, IEEE)

(参考訳) 近年,無線特徴をエンドツーエンドに抽出・分類することで,自動変調分類に深層学習が応用されている。しかし、深層学習に基づく無線変調分類器は解釈可能性に欠けており、どの無線特徴が抽出され、分類するために選択されるかの説明や可視性はほとんどない。本稿では,クラスアクティベーションベクトルを導入することで,異なる深層学習型無線変調分類器を可視化する。具体的には、畳み込みニューラルネットワーク(CNN)ベースの分類器と長短期記憶(LSTM)ベースの分類器の両方を別々に研究し、抽出した無線特徴を可視化する。 CNNに基づく分類器とLSTMに基づく分類器は、変調基準点に関する類似の無線特徴を抽出する。特にLSTMを用いた分類器では,得られた電波特性は人間の知識と類似している。以上の結果から,深層学習に基づく分類器によって抽出された無線特徴は,無線信号の搬送内容に大きく依存しており,短い無線サンプルが誤分類につながる可能性がある。

Deep learning has recently been successfully applied in automatic modulation classification by extracting and classifying radio features in an end-to-end way. However, deep learning-based radio modulation classifiers are lack of interpretability, and there is little explanation or visibility into what kinds of radio features are extracted and chosen for classification. In this paper, we visualize different deep learning-based radio modulation classifiers by introducing a class activation vector. Specifically, both convolutional neural networks (CNN) based classifier and long short-term memory (LSTM) based classifier are separately studied, and their extracted radio features are visualized. Extensive numerical results show both the CNN-based classifier and LSTM-based classifier extract similar radio features relating to modulation reference points. In particular, for the LSTM-based classifier, its obtained radio features are similar to the knowledge of human experts. Our numerical results indicate the radio features extracted by deep learning-based classifiers greatly depend on the contents carried by radio signals, and a short radio sample may lead to misclassification.

翻訳日:2022-12-07 06:58:06 公開日:2021-01-18

# 会話エージェントのための過去の会話からの意図マイニング

Intent Mining from past conversations for conversational agent ( http://arxiv.org/abs/2005.11014v4 )

ライセンス: Link先を確認

Ajay Chatterjee and Shubhashis Sengupta

(参考訳) 会話システムは、AIコミュニティにおいて主要な関心事である。チャットボットは、時間単位のサポートを提供し、顧客エンゲージメントを高めるために、ますますデプロイされている。多くの商用ボット構築フレームワークは、ユーザ入力を認識するためにインテントモデルを構築し、トレーニングする必要がある標準アプローチに従っている。インテントモデルは、テキストによる発話とインテントラベルペアの集まりで教師あり設定で訓練される。異なる意図でトレーニングデータをかなり広範囲に収集することは、ボット構築プロセスにおけるボトルネックである。さらに、100から数千の会話を意図してラベル付けするコストは、時間と労力のかかる作業である。 In this paper, we present an intent discovery framework that involves 4 primary steps: Extraction of textual utterances from a conversation using a pre-trained domain agnostic Dialog Act Classifier (Data Extraction), automatic clustering of similar user utterances (Clustering), manual annotation of clusters with an intent label (Labeling) and propagation of intent labels to the utterances from the previous step, which are not mapped to any cluster (Label Propagation); to generate intent training data from raw conversations. 我々は,不均衡データクラスタリングのための新しい密度ベースクラスタリングアルゴリズムiter-dbscanを導入した。 subject Matter Expert(ドメインの専門知識を持つアノテーション)は、手動でクラスタ化されたユーザー発話を調べ、発見のためのインテントラベルを提供する。手動アノテーションの意図範囲,正確性,時間節約の観点から,訓練した意図モデルの有効性を検証するため,ユーザ実験を行った。このシステムは会話システムのための意図モデルを構築するために開発されたが、このフレームワークは短いテキストクラスタリングやラベリングフレームワークとしても使用できる。

Conversational systems are of primary interest in the AI community. Chatbots are increasingly being deployed to provide round-the-clock support and to increase customer engagement. Many of the commercial bot building frameworks follow a standard approach that requires one to build and train an intent model to recognize a user input. Intent models are trained in a supervised setting with a collection of textual utterance and intent label pairs. Gathering a substantial and wide coverage of training data for different intent is a bottleneck in the bot building process. Moreover, the cost of labeling a hundred to thousands of conversations with intent is a time consuming and laborious job. In this paper, we present an intent discovery framework that involves 4 primary steps: Extraction of textual utterances from a conversation using a pre-trained domain agnostic Dialog Act Classifier (Data Extraction), automatic clustering of similar user utterances (Clustering), manual annotation of clusters with an intent label (Labeling) and propagation of intent labels to the utterances from the previous step, which are not mapped to any cluster (Label Propagation); to generate intent training data from raw conversations. We have introduced a novel density-based clustering algorithm ITER-DBSCAN for unbalanced data clustering. Subject Matter Expert (Annotators with domain expertise) manually looks into the clustered user utterances and provides an intent label for discovery. We conducted user studies to validate the effectiveness of the trained intent model generated in terms of coverage of intents, accuracy and time saving concerning manual annotation. Although the system is developed for building an intent model for the conversational system, this framework can also be used for a short text clustering or as a labeling framework.

翻訳日:2022-11-30 08:14:03 公開日:2021-01-18

# AIコモンズの悲劇

The Tragedy of the AI Commons ( http://arxiv.org/abs/2006.05203v2 )

ライセンス: Link先を確認

Travis LaCroix and Aydin Mohseni

(参考訳) 近年,倫理的人工知能研究の政策とガイドラインの提案が盛んである。これらは、共通の利益のために、社会的責任のあるaiの開発を導くものだ。しかしながら、通常、非協力のためのインセンティブ(つまり、そのような政策やガイドラインに従わないこと)が存在し、これらの提案は、彼ら自身の規範的主張を強制する効果的なメカニズムを欠いている。説明された状況は、社会的ジレンマ、すなわち、協力する個別のインセンティブを持たない状況を構成するが、相互協力は、すべての関係者にとって最良の結果をもたらす。本稿では,この社会ジレンマを,人工知能の倫理的発展の文脈でモデル化するために,確率論的進化ゲームダイナミクスを用いる。このフォーマリズムは、介入される可能性のある変数を分離することを可能にするため、AIの多くのステークホルダー間の協力を強化するための実用的な提案を提供する。以上の結果から,このようなシナリオにおいて,確率的効果が協調に有効であることを示す。彼らは、協力のコストが低く、失敗のリスクが高い小さなグループで共通の利益の調整を試みるべきであることを示唆している。これは、そのような倫理提案が、その範囲、規模、内容に関して成功すると期待すべき条件について洞察を与える。

Policy and guideline proposals for ethical artificial-intelligence research have proliferated in recent years. These are supposed to guide the socially-responsible development of AI for the common good. However, there typically exist incentives for non-cooperation (i.e., non-adherence to such policies and guidelines); and, these proposals often lack effective mechanisms to enforce their own normative claims. The situation just described constitutes a social dilemma; namely, a situation where no one has an individual incentive to cooperate, though mutual cooperation would lead to the best outcome for all involved. In this paper, we use stochastic evolutionary game dynamics to model this social dilemma in the context of the ethical development of artificial intelligence. This formalism allows us to isolate variables that may be intervened upon, thus providing actionable suggestions for increased cooperation amongst numerous stakeholders in AI. Our results show how stochastic effects can help make cooperation viable in such a scenario. They suggest that coordination for a common good should be attempted in smaller groups in which the cost for cooperation is low, and the perceived risk of failure is high. This provides insight into the conditions under which we should expect such ethics proposals to be successful with regard to their scope, scale, and content.

翻訳日:2022-11-23 14:28:07 公開日:2021-01-18

# クラウドソースによる呼吸音データからのCOVID-19自動診断の探索

Exploring Automatic Diagnosis of COVID-19 from Crowdsourced Respiratory Sound Data ( http://arxiv.org/abs/2006.05919v3 )

ライセンス: Link先を確認

Chlo\"e Brown, Jagmohan Chauhan, Andreas Grammenos, Jing Han, Apinan Hasthanasombat, Dimitris Spathis, Tong Xia, Pietro Cicuta, Cecilia Mascolo

(参考訳) 人体から発生する音声信号(息、呼吸、心臓、消化、振動音など)は、臨床医が病気の診断や疾患の進行を評価する指標として日常的に用いられている。最近まで、このような信号は通常、定期的な訪問の際に手動の聴取を通して収集されていた。研究は現在、身体の音(例えばデジタル聴診器から)を心臓血管や呼吸検査に収集するためにデジタル技術を使い始めており、自動分析に使用することができる。初期の研究で、声とせきからcovid-19の診断信号を検出することが期待されている。本稿では,covid-19の診断を支援するために収集された呼吸音の大規模クラウドソーシングデータセットに関するデータ分析について述べる。気管支喘息や健康管理の人たちからcovid-19の音がいかに識別できるかを理解するために、cooughsとbreathを使っています。その結果、単純なバイナリ機械学習分類器でも、正常な健康音とcovid-19音を分類できることがわかった。また、covid-19陽性者と、covid-19陽性者とを区別する方法、および、covid-19陽性者と、喘息患者と、coough患者とを区別する方法を示す。我々のモデルはすべてのタスクで80%以上のAUCを達成する。これらの結果は予備的であり、この種のデータとオーディオベースの機械学習のポテンシャルの表面のみを掻き取る。この研究は、新型コロナウイルス(COVID-19)の診断に役立つ事前スクリーニング信号として、自動で呼吸パターンを分析する方法について、さらなる調査を行うための扉を開く。

Audio signals generated by the human body (e.g., sighs, breathing, heart, digestion, vibration sounds) have routinely been used by clinicians as indicators to diagnose disease or assess disease progression. Until recently, such signals were usually collected through manual auscultation at scheduled visits. Research has now started to use digital technology to gather bodily sounds (e.g., from digital stethoscopes) for cardiovascular or respiratory examination, which could then be used for automatic analysis. Some initial work shows promise in detecting diagnostic signals of COVID-19 from voice and coughs. In this paper we describe our data analysis over a large-scale crowdsourced dataset of respiratory sounds collected to aid diagnosis of COVID-19. We use coughs and breathing to understand how discernible COVID-19 sounds are from those in asthma or healthy controls. Our results show that even a simple binary machine learning classifier is able to classify correctly healthy and COVID-19 sounds. We also show how we distinguish a user who tested positive for COVID-19 and has a cough from a healthy user with a cough, and users who tested positive for COVID-19 and have a cough from users with asthma and a cough. Our models achieve an AUC of above 80% across all tasks. These results are preliminary and only scratch the surface of the potential of this type of data and audio-based machine learning. This work opens the door to further investigation of how automatically analysed respiratory patterns could be used as pre-screening signals to aid COVID-19 diagnosis.

翻訳日:2022-11-23 06:44:40 公開日:2021-01-18

# AdamP: スケール不変ウェイトにおけるモーメント最適化のスローダウン

AdamP: Slowing Down the Slowdown for Momentum Optimizers on Scale-invariant Weights ( http://arxiv.org/abs/2006.08217v3 )

ライセンス: Link先を確認

Byeongho Heo, Sanghyuk Chun, Seong Joon Oh, Dongyoon Han, Sangdoo Yun, Gyuwan Kim, Youngjung Uh, Jung-Woo Ha

(参考訳) 正規化技術は現代の深層学習の恩恵である。彼らはしばしばより良い一般化性能で重みをより早く収束させる。重み間の正規化誘起スケール不変性は、勾配降下(GD)最適化器に有利な土台を与えると論じられ、実効的なステップサイズは時間とともに自動的に減少し、全体的な訓練手順を安定化させる。しかし、GDオプティマイザに運動量を導入することで、スケール不変量に対する効果的なステップサイズが大幅に減少し、これはまだ研究されていない現象であり、現在の実践において望ましくない副作用を引き起こした可能性がある。現代のディープニューラルネットワークの大多数は(1)運動量に基づくgd(sgdやadamなど)と(2)スケール不変パラメータで構成されているため、これは重要な問題である。本稿では,これら2成分の多種多様な組み合わせが,有効なステップサイズとサブ最適モデル性能の早期崩壊につながることを検証した。本稿では,SGDPとAdamPによる簡易かつ効果的な対策として,各最適化ステップにおいて放射状成分(標準増加方向)を除去する手法を提案する。スケールの不変性のため、この修正は有効な更新方向を変更することなく有効なステップサイズだけを変更し、GDオプティマイザの本来の収束特性を享受する。機械学習における運動量GDの多様さとスケール不変性を考慮して,13ベンチマークの基準値に対して評価を行った。それらは、分類(例:イメージネット)、検索(例:cubとsop)、検出(例:coco)、言語モデリング(例:wikitext)、音声分類(例:dcase)といったビジョンタスクから成り立っている。当社のソリューションがベンチマークで均一に向上していることを確認します。ソースコードはhttps://github.com/clovaai/adampで入手できる。

Normalization techniques are a boon for modern deep learning. They let weights converge more quickly with often better generalization performances. It has been argued that the normalization-induced scale invariance among the weights provides an advantageous ground for gradient descent (GD) optimizers: the effective step sizes are automatically reduced over time, stabilizing the overall training procedure. It is often overlooked, however, that the additional introduction of momentum in GD optimizers results in a far more rapid reduction in effective step sizes for scale-invariant weights, a phenomenon that has not yet been studied and may have caused unwanted side effects in the current practice. This is a crucial issue because arguably the vast majority of modern deep neural networks consist of (1) momentum-based GD (e.g. SGD or Adam) and (2) scale-invariant parameters. In this paper, we verify that the widely-adopted combination of the two ingredients lead to the premature decay of effective step sizes and sub-optimal model performances. We propose a simple and effective remedy, SGDP and AdamP: get rid of the radial component, or the norm-increasing direction, at each optimizer step. Because of the scale invariance, this modification only alters the effective step sizes without changing the effective update directions, thus enjoying the original convergence properties of GD optimizers. Given the ubiquity of momentum GD and scale invariance in machine learning, we have evaluated our methods against the baselines on 13 benchmarks. They range from vision tasks like classification (e.g. ImageNet), retrieval (e.g. CUB and SOP), and detection (e.g. COCO) to language modelling (e.g. WikiText) and audio classification (e.g. DCASE) tasks. We verify that our solution brings about uniform gains in those benchmarks. Source code is available at https://github.com/clovaai/AdamP.

翻訳日:2022-11-21 02:20:22 公開日:2021-01-18

# 低位ガウスコプラによる量的不確かさの行列補完

Matrix Completion with Quantified Uncertainty through Low Rank Gaussian Copula ( http://arxiv.org/abs/2006.10829v2 )

ライセンス: Link先を確認

Yuxuan Zhao, Madeleine Udell

(参考訳) 現代の大規模データセットは、しばしば欠落したエントリで悩まされる。値が不足している表データに対しては、ペナルティ化された再構成エラーを最小化する完全行列に対して、複数のインプテーションアルゴリズムが解く。しかし、ほとんど誰もその計算の不確かさを見積もることができない。本稿では,定量化の不確実性を伴う価値インプテーションを欠く確率的かつスケーラブルなフレームワークを提案する。我々のモデルである低ランクガウスコピュラは、標準確率モデルである確率的主成分分析を強化し、各列に対して限界変換を行い、モデルがデータの分布によく一致するようにします。ブール、順序、実数値の観測を自然に処理し、各計算における不確実性を定量化する。モデルに適合するために必要な時間は、データセット内の行数や列数と線形にスケールする。実験結果から,高階データを含む多種多様なデータ型に対して,最先端の計算精度が得られた。我々の不確実性尺度は、計算誤差をよく予測する: 低い不確実性を持つエントリは(平均において)計算誤差を低くする。さらに、実数値データでは、結果の信頼区間が適切に調整される。

Modern large scale datasets are often plagued with missing entries. For tabular data with missing values, a flurry of imputation algorithms solve for a complete matrix which minimizes some penalized reconstruction error. However, almost none of them can estimate the uncertainty of its imputations. This paper proposes a probabilistic and scalable framework for missing value imputation with quantified uncertainty. Our model, the Low Rank Gaussian Copula, augments a standard probabilistic model, Probabilistic Principal Component Analysis, with marginal transformations for each column that allow the model to better match the distribution of the data. It naturally handles Boolean, ordinal, and real-valued observations and quantifies the uncertainty in each imputation. The time required to fit the model scales linearly with the number of rows and the number of columns in the dataset. Empirical results show the method yields state-of-the-art imputation accuracy across a wide range of data types, including those with high rank. Our uncertainty measure predicts imputation error well: entries with lower uncertainty do have lower imputation error (on average). Moreover, for real-valued data, the resulting confidence intervals are well-calibrated.

翻訳日:2022-11-19 13:06:27 公開日:2021-01-18

# 非平衡応答理論を用いたリカレントニューラルネットワークの理解

Understanding Recurrent Neural Networks Using Nonequilibrium Response Theory ( http://arxiv.org/abs/2006.11052v2 )

ライセンス: Link先を確認

Soon Hoe Lim

(参考訳) リカレントニューラルネットワーク(Recurrent Neural Network, RNN)は、シーケンシャルデータの解析に機械学習で広く使用される脳モデルである。この研究は、RNNが非平衡統計力学の応答理論を用いて入力信号をどのように処理するかを深く理解するための貢献である。入力信号によって駆動される連続時間確率RNN(SRNN)のクラスに対して、Volterra型系列表現を出力として導出する。この表現は解釈可能であり、SRNNアーキテクチャから入力信号を切り離す。系列の核は、出力を完全に決定する非摂動力学に関して、ある種の再帰的に定義された相関関数である。この表現とその大まかな経路理論への含意を明らかにすることで、入力信号のテンソル積のシグネチャであり、自然な支持基盤であることが判明した、普遍的な特徴である応答特徴を識別する。特に,読み出し層の重みのみを最適化し,隠れた層内の重みを固定し,最適化しないsrnnを,応答特性に付随するカーネルヒルベルト空間上で動作するカーネルマシンとして捉えることができることを示した。

Recurrent neural networks (RNNs) are brain-inspired models widely used in machine learning for analyzing sequential data. The present work is a contribution towards a deeper understanding of how RNNs process input signals using the response theory from nonequilibrium statistical mechanics. For a class of continuous-time stochastic RNNs (SRNNs) driven by an input signal, we derive a Volterra type series representation for their output. This representation is interpretable and disentangles the input signal from the SRNN architecture. The kernels of the series are certain recursively defined correlation functions with respect to the unperturbed dynamics that completely determine the output. Exploiting connections of this representation and its implications to rough paths theory, we identify a universal feature -- the response feature, which turns out to be the signature of tensor product of the input signal and a natural support basis. In particular, we show that SRNNs, with only the weights in the readout layer optimized and the weights in the hidden layer kept fixed and not optimized, can be viewed as kernel machines operating on a reproducing kernel Hilbert space associated with the response feature.

翻訳日:2022-11-19 04:24:58 公開日:2021-01-18

# 確率勾配アルゴリズムの適応的逆強化学習のためのランゲヴィンダイナミクス

Langevin Dynamics for Adaptive Inverse Reinforcement Learning of Stochastic Gradient Algorithms ( http://arxiv.org/abs/2006.11674v2 )

ライセンス: Link先を確認

Vikram Krishnamurthy and George Yin

(参考訳) 逆強化学習(IRL)は、エージェントの反応(見積や行動)を観察することで、エージェントの報酬関数を推定することを目的としている。本稿では,複数の確率勾配エージェントが生成する報酬関数の勾配のノイズ推定を行った場合のIRLについて考察する。一般化したランゲヴィン力学アルゴリズムを用いて報酬関数 $R(\theta)$ を推定する。具体的には、結果のランゲヴィンアルゴリズムは、分布から$\exp(R(\theta))$ に比例してサンプルを漸近的に生成する。提案するirlアルゴリズムはカーネルベースのパッシブ学習方式を用いる。また、高次元データに適したIRLのためのマルチカーネル受動ランゲインアルゴリズムを構築した。提案するirlアルゴリズムの性能は、適応ベイズ学習、ロジスティック回帰(高次元問題)、制約付きマルコフ決定過程の例で示される。提案するirlアルゴリズムの弱収束をmartingale平均化法を用いて証明する。また,ユーティリティ関数$R(\theta)$ jumpが遅いマルコフ連鎖として時間とともに変化する非定常環境におけるIRLアルゴリズムの追跡性能も解析する。

Inverse reinforcement learning (IRL) aims to estimate the reward function of optimizing agents by observing their response (estimates or actions). This paper considers IRL when noisy estimates of the gradient of a reward function generated by multiple stochastic gradient agents are observed. We present a generalized Langevin dynamics algorithm to estimate the reward function $R(\theta)$; specifically, the resulting Langevin algorithm asymptotically generates samples from the distribution proportional to $\exp(R(\theta))$. The proposed IRL algorithms use kernel-based passive learning schemes. We also construct multi-kernel passive Langevin algorithms for IRL which are suitable for high dimensional data. The performance of the proposed IRL algorithms are illustrated on examples in adaptive Bayesian learning, logistic regression (high dimensional problem) and constrained Markov decision processes. We prove weak convergence of the proposed IRL algorithms using martingale averaging methods. We also analyze the tracking performance of the IRL algorithms in non-stationary environments where the utility function $R(\theta)$ jump changes over time as a slow Markov chain.

翻訳日:2022-11-18 22:38:43 公開日:2021-01-18

# 配列の微分可能なセグメンテーション

Differentiable Segmentation of Sequences ( http://arxiv.org/abs/2006.13105v2 )

ライセンス: Link先を確認

Erik Scharw\"achter and Jonathan Lennartz and Emmanuel M\"uller

(参考訳) セグメンテーションモデルは、離散的な変化点を持つ非定常シーケンシャルデータを記述するために広く使われている。これらの推定は通常、セグメンテーションが離散部分であり、他の全てのモデルパラメータが連続である混合離散連続最適化問題を解く必要がある。特定のモデル仮定に高度に特化された多くの推定アルゴリズムが開発されている。非標準アルゴリズムへの依存は、勾配に基づく最適化技術に批判的に依存する最先端のディープラーニングアーキテクチャにセグメントモデルを統合するのを難しくする。本研究では,セグメント化を含む全てのモデルパラメータを勾配降下で共同で推定することのできるセグメンテーションモデルの緩和変種を定式化する。我々は,近年の継続的ワーピング関数の学習の進歩を基盤として,両面パワー(TSP)分布に基づく新しいワーピング関数群を提案する。 TSPベースのワープ関数は微分可能であり、単純なクローズドフォーム式を持ち、セグメント関数を正確に表現することができる。この定式化は、特別の場合として、セグメント化された一般化線形モデルの重要なクラスを含み、非常に多様である。ポアソン回帰(poisson regression)によるcovid-19の拡散をモデル化し,変化点検出タスクに適用し,概念ドリフトを用いた分類モデルを学ぶ。提案手法は,勾配降下のための標準アルゴリズムを用いて,これらのタスクを効果的に学習することを示す。

Segmented models are widely used to describe non-stationary sequential data with discrete change points. Their estimation usually requires solving a mixed discrete-continuous optimization problem, where the segmentation is the discrete part and all other model parameters are continuous. A number of estimation algorithms have been developed that are highly specialized for their specific model assumptions. The dependence on non-standard algorithms makes it hard to integrate segmented models in state-of-the-art deep learning architectures that critically depend on gradient-based optimization techniques. In this work, we formulate a relaxed variant of segmented models that enables joint estimation of all model parameters, including the segmentation, with gradient descent. We build on recent advances in learning continuous warping functions and propose a novel family of warping functions based on the two-sided power (TSP) distribution. TSP-based warping functions are differentiable, have simple closed-form expressions, and can represent segmentation functions exactly. Our formulation includes the important class of segmented generalized linear models as a special case, which makes it highly versatile. We use our approach to model the spread of COVID-19 with Poisson regression, apply it on a change point detection task, and learn classification models with concept drift. The experiments show that our approach effectively learns all these tasks with standard algorithms for gradient descent.

翻訳日:2022-11-17 22:45:31 公開日:2021-01-18

# ヒット確率に基づく有向グラフとマルコフ連鎖の計量

A metric on directed graphs and Markov chains based on hitting probabilities ( http://arxiv.org/abs/2006.14482v2 )

ライセンス: Link先を確認

Zachary M. Boyd, Nicolas Fraiman, Jeremy L. Marzuola, Peter J. Mucha, Braxton Osting, and Jonathan Weare

(参考訳) 非向グラフにおける最短経路、可換時間、拡散距離は、次元減少、リンク予測、トリップ計画といった応用で広く用いられている。マルコフ連鎖や有向グラフから導出されるデータの非対称構造の利用に関心が高まるが、このタスクに特に適応する指標はほとんどない。我々は、任意のエルゴード、有限状態、時間同質なマルコフ連鎖の状態空間上の計量、特に有向グラフから導かれる任意のマルコフ連鎖について紹介する。提案手法は,あるノードから別のノードへのランダムウォーカーの移動に伴う距離空間の近さを仮定して構築した。特に、私たちの測定基準は、最短距離と平均歩行距離に影響を受けないので、既存の測定基準と比較して新しい情報を与えます。我々は、計量における退化の可能性を利用して、有向グラフの興味深い構造理論を開発し、関連する商化手順を探求する。我々のメトリックは、$O(n^3)$ timeで計算でき、$n$は状態の数であり、例えば、デスクトップコンピュータ上の$n=10,000$ノードと$\approx 38M$エッジまでスケールする。いくつかの例では、メートル法の性質を調べ、別の方法と比較し、密度グラフにおけるコミュニティ構造の弱い回復、可視化、構造回復、ダイナミクス探索、マルチスケールクラスタ検出に有用性を示す。

The shortest-path, commute time, and diffusion distances on undirected graphs have been widely employed in applications such as dimensionality reduction, link prediction, and trip planning. Increasingly, there is interest in using asymmetric structure of data derived from Markov chains and directed graphs, but few metrics are specifically adapted to this task. We introduce a metric on the state space of any ergodic, finite-state, time-homogeneous Markov chain and, in particular, on any Markov chain derived from a directed graph. Our construction is based on hitting probabilities, with nearness in the metric space related to the transfer of random walkers from one node to another at stationarity. Notably, our metric is insensitive to shortest and average walk distances, thus giving new information compared to existing metrics. We use possible degeneracies in the metric to develop an interesting structural theory of directed graphs and explore a related quotienting procedure. Our metric can be computed in $O(n^3)$ time, where $n$ is the number of states, and in examples we scale up to $n=10,000$ nodes and $\approx 38M$ edges on a desktop computer. In several examples, we explore the nature of the metric, compare it to alternative methods, and demonstrate its utility for weak recovery of community structure in dense graphs, visualization, structure recovering, dynamics exploration, and multiscale cluster detection.

翻訳日:2022-11-17 03:57:41 公開日:2021-01-18

# 低パス協調フィルタを用いたグラフ畳み込みネットワークの提案

Graph Convolutional Network for Recommendation with Low-pass Collaborative Filters ( http://arxiv.org/abs/2006.15516v3 )

ライセンス: Link先を確認

Wenhui Yu and Zheng Qin

(参考訳) \textbf{g}raph \textbf{c}onvolutional \textbf{n}etwork (\textbf{gcn})はレコメンデーションなどのグラフデータ学習タスクで広く使われている。しかし、大きなグラフに直面する場合、グラフの畳み込みは非常に計算コストがかかるため、既存のすべてのGCNでは単純化されるが、過剰な単純化により深刻な障害が生じる。このギャップに対処するために、GCN の \textit{ Origin graph convolution} を利用し、大きなグラフに適用するために \textbf{L}ow-pass \textbf{C}ollaborative \textbf{F}ilter (\textbf{LCF}) を提案する。 LCFは、観測データの露出と量子化に起因するノイズを取り除くように設計されており、また、グラフ畳み込みの複雑さを非スケールで低減する。実験の結果,LCFはグラフ畳み込みの有効性と効率を向上し,GCNは既存のGCNよりも優れていた。コードは \url{https://github.com/wenhui-yu/lcfn} で入手できる。

\textbf{G}raph \textbf{C}onvolutional \textbf{N}etwork (\textbf{GCN}) is widely used in graph data learning tasks such as recommendation. However, when facing a large graph, the graph convolution is very computationally expensive thus is simplified in all existing GCNs, yet is seriously impaired due to the oversimplification. To address this gap, we leverage the \textit{original graph convolution} in GCN and propose a \textbf{L}ow-pass \textbf{C}ollaborative \textbf{F}ilter (\textbf{LCF}) to make it applicable to the large graph. LCF is designed to remove the noise caused by exposure and quantization in the observed data, and it also reduces the complexity of graph convolution in an unscathed way. Experiments show that LCF improves the effectiveness and efficiency of graph convolution and our GCN outperforms existing GCNs significantly. Codes are available on \url{https://github.com/Wenhui-Yu/LCFN}.

翻訳日:2022-11-16 02:25:54 公開日:2021-01-18

# 可聴, 探究: 好奇心をオーディオ・ビジュアル・アソシエーションで見る

See, Hear, Explore: Curiosity via Audio-Visual Association ( http://arxiv.org/abs/2007.03669v2 )

ライセンス: Link先を確認

Victoria Dean, Shubham Tulsiani, Abhinav Gupta

(参考訳) 探索は強化学習における中核的な課題の1つだ。好奇心駆動探索の一般的な定式化は、学習モデルによって予測される実際の未来と未来との差を用いる。しかし、未来を予測することは本質的に難しい課題であり、確率性に直面しても不適切である。本稿では,異なる感覚間の新たな関連に報いる好奇心の代替形態を提案する。我々のアプローチは、より効率的な探索のためのより強力な信号を提供するために、複数のモダリティを利用する。我々の手法は、人間にとって視覚と音の両方が探索において重要な役割を果たすという事実に着想を得ている。いくつかのAtari環境とHabitat(フォトリアリスティックナビゲーションシミュレータ)について,外部報酬がない場合の学習エージェントを内在的に導くために,オーディオ視覚関連モデルを使用することの利点を示す。ビデオやコードについてはhttps://vdean.github.io/audio-curiosity.htmlを参照。

Exploration is one of the core challenges in reinforcement learning. A common formulation of curiosity-driven exploration uses the difference between the real future and the future predicted by a learned model. However, predicting the future is an inherently difficult task which can be ill-posed in the face of stochasticity. In this paper, we introduce an alternative form of curiosity that rewards novel associations between different senses. Our approach exploits multiple modalities to provide a stronger signal for more efficient exploration. Our method is inspired by the fact that, for humans, both sight and sound play a critical role in exploration. We present results on several Atari environments and Habitat (a photorealistic navigation simulator), showing the benefits of using an audio-visual association model for intrinsically guiding learning agents in the absence of external rewards. For videos and code, see https://vdean.github.io/audio-curiosity.html.

翻訳日:2022-11-12 18:13:37 公開日:2021-01-18

# 構造畳み込みモデルの昇降によるロスレス圧縮

Lossless Compression of Structured Convolutional Models via Lifting ( http://arxiv.org/abs/2007.06567v2 )

ライセンス: Link先を確認

Gustav Sourek, Filip Zelezny, Ondrej Kuzelka

(参考訳) 持ち上げは、基礎となる対称性を利用して、関係ドメインに一般化されたグラフィカルなモデルをスケールアップする効率的なテクニックである。同時に、ニューラルネットワークはグリッドのようなテンソルデータから様々な属性グラフやリレーショナルデータベースなどの構造化表現へと継続的に拡張されている。データの不規則構造に対処するため、モデルは通常、畳み込みの概念を外挿し、パラメータ共有を動的に展開された計算グラフに効果的に導入する。計算グラフ自体は、持ち上げられたグラフィカルモデルと同様に、基礎となるデータの対称性を反映する。昇降に触発されて,対称性を検知し,情報を失うことなく神経モデルを圧縮する簡易かつ効率的な手法を提案する。このような圧縮が、分子分類や知識ベース補完といった様々なタスクにおいて、様々なグラフニューラルネットワークのような構造化畳み込みモデルの大幅な高速化につながることを示す。

Lifting is an efficient technique to scale up graphical models generalized to relational domains by exploiting the underlying symmetries. Concurrently, neural models are continuously expanding from grid-like tensor data into structured representations, such as various attributed graphs and relational databases. To address the irregular structure of the data, the models typically extrapolate on the idea of convolution, effectively introducing parameter sharing in their, dynamically unfolded, computation graphs. The computation graphs themselves then reflect the symmetries of the underlying data, similarly to the lifted graphical models. Inspired by lifting, we introduce a simple and efficient technique to detect the symmetries and compress the neural models without loss of any information. We demonstrate through experiments that such compression can lead to significant speedups of structured convolutional models, such as various Graph Neural Networks, across various tasks, such as molecule classification and knowledge-base completion.

翻訳日:2022-11-10 22:46:47 公開日:2021-01-18

# 音響表現からモデルロバスト性へ

From Sound Representation to Model Robustness ( http://arxiv.org/abs/2007.13703v3 )

ライセンス: Link先を確認

Mohammad Esmaeilpour, Patrick Cardinal, Alessandro Lameiras Koerich

(参考訳) 本稿では, 各種標準環境音響表現(スペクトログラム)が, 被害者残差畳み込みニューラルネットワークの認識性能と対角攻撃性に与える影響について検討する。 ResNet-18モデルは,3つの環境音響データセットのベンチマーク実験により,分類精度とトレーニングパラメータ数の両方において,GoogLeNetやAlexNetといった他のディープラーニングアーキテクチャよりも優れていることがわかった。そこで我々はこのモデルを,その後の調査のためのフロントエンド分類器として設定した。ここでは,より情報的なメル周波数ケプストラム係数(mfcc),短時間フーリエ変換(stft),離散ウェーブレット変換(dwt)の生成に必要な様々な設定の影響を測定する。この測定は、対向ロバスト性に対する分類性能の比較を含む。敵が割り当てる平均予算と攻撃コストのバランスについて,6つの攻撃アルゴリズムに対する認識精度とモデルロバスト性の逆関係を示す。さらに,DWTスペクトルを用いたResNet-18モデルでは高い認識精度が得られたが,他の2次元表現と比較して,このモデルに対する攻撃は比較的コストがかかることを示した。

In this paper, we investigate the impact of different standard environmental sound representations (spectrograms) on the recognition performance and adversarial attack robustness of a victim residual convolutional neural network. Averaged over various experiments on three benchmarking environmental sound datasets, we found the ResNet-18 model outperforms other deep learning architectures such as GoogLeNet and AlexNet both in terms of classification accuracy and the number of training parameters. Therefore we set this model as our front-end classifier for subsequent investigations. Herein, we measure the impact of different settings required for generating more informative mel-frequency cepstral coefficient (MFCC), short-time Fourier transform (STFT), and discrete wavelet transform (DWT) representations on our front-end model. This measurement involves comparing the classification performance over the adversarial robustness. On the balance of average budgets allocated by adversary and the cost of attack, we demonstrate an inverse relationship between recognition accuracy and model robustness against six attack algorithms. Moreover, our experimental results show that while the ResNet-18 model trained on DWT spectrograms achieves the highest recognition accuracy, attacking this model is relatively more costly for the adversary compared to other 2D representations.

翻訳日:2022-11-06 11:54:54 公開日:2021-01-18

# 深部スケッチガイド付きマンガビデオインタリング

Deep Sketch-guided Cartoon Video Inbetweening ( http://arxiv.org/abs/2008.04149v2 )

ライセンス: Link先を確認

Xiaoyu Li, Bo Zhang, Jing Liao and Pedro V. Sander

(参考訳) 本研究では,2つの入力キーフレームから色情報を抽出し,ユーザスケッチによるアニメーション動作に追従し,マンガ映像を作成するための新しい枠組みを提案する。提案手法の重要な考え方は,スケッチと漫画ビデオフレーム間の密なクロスドメイン対応を推定することであり,スケッチに導かれた中間フレームを合成するためにオクルージョン推定を伴うブレンディングモジュールを用いる。その後、入力フレームと確立対応を備えた合成フレームとを任意の時間フレーム補間パイプラインに供給し、追加のフレーム間を合成する。最後に、時間的一貫性を保つモジュールを用いる。一般的なフレーム補間法と比較して,比較的大きな動きのフレームに対処できると同時に,スケッチガイドを編集することで,ユーザが生成したビデオシーケンスを制御できる柔軟性も備えている。フレームとスケッチの対応を明示的に考慮することで、他の画像合成法よりも高品質な結果が得られる。これらの結果から,本システムは,既存のソリューションよりも優れた結果が得られることを示す。

We propose a novel framework to produce cartoon videos by fetching the color information from two input keyframes while following the animated motion guided by a user sketch. The key idea of the proposed approach is to estimate the dense cross-domain correspondence between the sketch and cartoon video frames, and employ a blending module with occlusion estimation to synthesize the middle frame guided by the sketch. After that, the input frames and the synthetic frame equipped with established correspondence are fed into an arbitrary-time frame interpolation pipeline to generate and refine additional inbetween frames. Finally, a module to preserve temporal consistency is employed. Compared to common frame interpolation methods, our approach can address frames with relatively large motion and also has the flexibility to enable users to control the generated video sequences by editing the sketch guidance. By explicitly considering the correspondence between frames and the sketch, we can achieve higher quality results than other image synthesis methods. Our results show that our system generalizes well to different movie frames, achieving better results than existing solutions.

翻訳日:2022-10-31 23:03:49 公開日:2021-01-18

# 脳接続ネットワーク分類のための順序パターンカーネル

Ordinal Pattern Kernel for Brain Connectivity Network Classification ( http://arxiv.org/abs/2008.07719v2 )

ライセンス: Link先を確認

Kai Ma, Biao Jie, Daoqiang Zhang

(参考訳) 脳領域の機能的または構造的相互作用を特徴付ける脳接続ネットワークは、脳疾患の分類に広く使われている。グラフカーネル(すなわちグラフ上に定義されたカーネル)のようなカーネルベースの手法は、脳ネットワークの類似性を測定するために提案され、有望な分類性能が得られる。しかし、ほとんどのグラフカーネルは、エッジが存在するか否かに関わらず、未重み付きグラフ(すなわちネットワーク)上に構築されており、脳接続ネットワークにおけるエッジの貴重な重み情報を無視し、エッジ重みは脳領域間の時間的相関やファイバー接続の強さを伝達する。そこで本研究では,脳接続ネットワーク分類のための順序パターンカーネルを提案する。非重み付きグラフの位相的類似度を測定する既存のグラフカーネルとは異なり、提案した順序パターンカーネルは重み付きネットワークの順序パターンを比較して重み付きネットワークの類似度を算出する。提案手法の有効性を評価するため,adniデータベースから脳疾患の実データを用いて,深さ優先型順序パターンカーネルをさらに開発し,広範な実験を行った。その結果,提案する順序パターンカーネルは,最先端グラフカーネルと比較して分類性能が向上することが示された。

Brain connectivity networks, which characterize the functional or structural interaction of brain regions, has been widely used for brain disease classification. Kernel-based method, such as graph kernel (i.e., kernel defined on graphs), has been proposed for measuring the similarity of brain networks, and yields the promising classification performance. However, most of graph kernels are built on unweighted graph (i.e., network) with edge present or not, and neglecting the valuable weight information of edges in brain connectivity network, with edge weights conveying the strengths of temporal correlation or fiber connection between brain regions. Accordingly, in this paper, we present an ordinal pattern kernel for brain connectivity network classification. Different with existing graph kernels that measures the topological similarity of unweighted graphs, the proposed ordinal pattern kernels calculate the similarity of weighted networks by comparing ordinal patterns from weighted networks. To evaluate the effectiveness of the proposed ordinal kernel, we further develop a depth-first-based ordinal pattern kernel, and perform extensive experiments in a real dataset of brain disease from ADNI database. The results demonstrate that our proposed ordinal pattern kernel can achieve better classification performance compared with state-of-the-art graph kernels.

翻訳日:2022-10-27 20:46:51 公開日:2021-01-18

# クローズドループコンピュータ支援肺超音波画像の画質評価

Image quality assessment for closed-loop computer-assisted lung ultrasound ( http://arxiv.org/abs/2008.08840v2 )

ライセンス: Link先を確認

Zachary M C Baum, Ester Bonmati, Lorenzo Cristoni, Andrew Walden, Ferran Prados, Baris Kanber, Dean C Barratt, David J Hawkes, Geoffrey J M Parker, Claudia A M Gandini Wheeler-Kingshott, Yipeng Hu

(参考訳) 本稿では,集中治療環境における超音波画像を用いた肺異常検出のための新しい2段階コンピュータ支援システムについて述べる。提案システムは, 画像品質の予測を自動化する品質評価モジュールと, 十分な品質の超音波画像におけるオアノマリーの可能性を判定する診断支援モジュールの2つの深層学習モデルから構成される。 2段階戦略では,品質評価分類器の訓練に利用可能な制御ケースの欠如に対処するために,新規検出アルゴリズムを用いる。診断支援モジュールは、品質評価モジュールからクローズドループフィードバック機構によって保証される十分な品質と判断されたデータでトレーニングすることができる。 2つの病院でスキャンされた37人の新型コロナウイルス陽性患者の超音波画像と12のコントロールケースから,提案した機械学習アプローチの有効性を実証した。品質評価モジュールを用いて,十分な画質画像と不十分な画質画像の分類を行う場合の精度は86%であった。品質評価モジュールによって決定される十分な品質のデータについて,提案システム内のネットワークのトレーニング中,5つのホールドアウトテストデータセットにおいて,covid-19陽性例の平均分類精度,感度,特異性がそれぞれ0.95, 0.91, 0.97であった。全体として、この2つのモジュールの統合は、医療現場で疑われる呼吸器疾患の患者に対して、正確、迅速、実用的な取得指導と診断支援をもたらす。

We describe a novel, two-stage computer assistance system for lung anomaly detection using ultrasound imaging in the intensive care setting to improve operator performance and patient stratification during coronavirus pandemics. The proposed system consists of two deep-learning-based models: a quality assessment module that automates predictions of image quality, and a diagnosis assistance module that determines the likelihood-oh-anomaly in ultrasound images of sufficient quality. Our two-stage strategy uses a novelty detection algorithm to address the lack of control cases available for training the quality assessment classifier. The diagnosis assistance module can then be trained with data that are deemed of sufficient quality, guaranteed by the closed-loop feedback mechanism from the quality assessment module. Using more than 25000 ultrasound images from 37 COVID-19-positive patients scanned at two hospitals, plus 12 control cases, this study demonstrates the feasibility of using the proposed machine learning approach. We report an accuracy of 86% when classifying between sufficient and insufficient quality images by the quality assessment module. For data of sufficient quality - as determined by the quality assessment module - the mean classification accuracy, sensitivity, and specificity in detecting COVID-19-positive cases were 0.95, 0.91, and 0.97, respectively, across five holdout test data sets unseen during the training of any networks within the proposed system. Overall, the integration of the two modules yields accurate, fast, and practical acquisition guidance and diagnostic assistance for patients with suspected respiratory conditions at point-of-care.

翻訳日:2022-10-27 04:08:12 公開日:2021-01-18

# パイロット:IJCAI 2020の人間とエージェントのネゴシエーションチャレンジの勝者

Pilot: Winner of the Human-Agent Negotiation Challenge at IJCAI 2020 ( http://arxiv.org/abs/2009.06781v2 )

ライセンス: Link先を確認

Kushal Chawla, Gale Lucas

(参考訳) この文書は、IJCAI 2020のANACの人間-エージェントネゴシエーションチャレンジで優勝したエージェントのパイロットについて記述しています。 pilotは、人間のパートナーと3つの交渉の連続に参加する仮想人間である。本システムは,IAGO(Interactive Arbitration Guide Online)ネゴシエーションフレームワークをベースとしている。我々は,エージェントの行動や性格を規定する様々な鍵となる原則を導出するために,事前の感情コンピューティングと心理学の研究を交渉に活用する。

This document describes our agent Pilot, winner of the Human-Agent Negotiation Challenge at ANAC, IJCAI 2020. Pilot is a virtual human that participates in a sequence of three negotiations with a human partner. Our system is based on the Interactive Arbitration Guide Online (IAGO) negotiation framework. We leverage prior Affective Computing and Psychology research in negotiations to guide various key principles that define the behavior and personality of our agent.

翻訳日:2022-10-18 12:50:26 公開日:2021-01-18

# マルチエージェント価値分解のためのエネルギーベースサプライズ最小化

Energy-based Surprise Minimization for Multi-Agent Value Factorization ( http://arxiv.org/abs/2009.09842v4 )

ライセンス: Link先を確認

Karush Suri, Xiao Qi Shi, Konstantinos Plataniotis, Yuri Lawryshyn

(参考訳) MARL(Multi-Agent Reinforcement Learning)は、分散政策を集中的に訓練する上で、価値分解法を用いて大きな成功を収めている。しかしながら、スプリアス状態と近似バイアスにまたがる驚きに対処することは、マルチエージェントの設定では未解決の問題のままである。この目標に向けて,エージェント間のエネルギー利用を最小化するアルゴリズムであるEMIX(Energy-based MIXer)を導入する。 1) emixは,マルチエージェントの部分観測可能な設定の場合,複数のエージェントにまたがる新たなサプライズ最小化手法を導入している。 2) emix はエネルギー作用素の理論的保証と実験検証を伴う marl におけるエネルギー関数の実用化を強調する。最後に、(3)EMIXはMARLのエージェント間の過大評価バイアスに対処するためにMaxmin Q-learningを拡張する。 StarCraft IIのマイクロマネジメントシナリオを挑戦する研究において、EMIXはマルチエージェントサプライズ最小化のための一貫した安定したパフォーマンスを示す。さらに, エネルギーベース方式の必要性と, MARLにおける過大評価バイアスの除去の必要性について検討した。 EMIXの実装はkarush17.github.io/emix-web/で確認できます。

Multi-Agent Reinforcement Learning (MARL) has demonstrated significant success in training decentralised policies in a centralised manner by making use of value factorization methods. However, addressing surprise across spurious states and approximation bias remain open problems for multi-agent settings. Towards this goal, we introduce the Energy-based MIXer (EMIX), an algorithm which minimizes surprise utilizing the energy across agents. Our contributions are threefold; (1) EMIX introduces a novel surprise minimization technique across multiple agents in the case of multi-agent partially-observable settings. (2) EMIX highlights a practical use of energy functions in MARL with theoretical guarantees and experiment validations of the energy operator. Lastly, (3) EMIX extends Maxmin Q-learning for addressing overestimation bias across agents in MARL. In a study of challenging StarCraft II micromanagement scenarios, EMIX demonstrates consistent stable performance for multiagent surprise minimization. Moreover, our ablation study highlights the necessity of the energy-based scheme and the need for elimination of overestimation bias in MARL. Our implementation of EMIX can be found at karush17.github.io/emix-web/.

翻訳日:2022-10-17 23:47:07 公開日:2021-01-18

# ハミルトニアン完備問題(hamiltonian completion problem)の変遷

Evolving test instances of the Hamiltonian completion problem ( http://arxiv.org/abs/2011.02291v2 )

ライセンス: Link先を確認

Thibault Lechien, Jorik Jooken, Patrick De Causmaecker

(参考訳) グラフインスタンス上でのアルゴリズム性能の予測と比較は、複数の理由から難しい。まず、パフォーマンスをベンチマークするインスタンスの標準セットは、通常存在しない。第二に、既存のグラフ生成器を使用すると、困難なスペクトルが制限され、結果として得られるグラフは通常、音の結論を引き出すのに十分な多様性がない。そこで最近の研究は、進化的アルゴリズムを用いて多様なインスタンス群を生成する新しい手法を提案する。そして、結果のグラフを分析し、どの属性がアルゴリズムのパフォーマンスに最も関連しているかに関する重要な洞察を得ることができます。以前は目に見えない機能の組み合わせでグラフを生成するために、インスタンス空間の観測されたギャップを埋めることもできる。この手法は、2つの異なる解法、すなわち Concorde TSP Solver とマルチスタート局所探索アルゴリズムを用いてハミルトン完備化問題のインスタンス空間に適用する。

Predicting and comparing algorithm performance on graph instances is challenging for multiple reasons. First, there is usually no standard set of instances to benchmark performance. Second, using existing graph generators results in a restricted spectrum of difficulty and the resulting graphs are usually not diverse enough to draw sound conclusions. That is why recent work proposes a new methodology to generate a diverse set of instances by using an evolutionary algorithm. We can then analyze the resulting graphs and get key insights into which attributes are most related to algorithm performance. We can also fill observed gaps in the instance space in order to generate graphs with previously unseen combinations of features. This methodology is applied to the instance space of the Hamiltonian completion problem using two different solvers, namely the Concorde TSP Solver and a multi-start local search algorithm.

翻訳日:2022-10-10 20:02:11 公開日:2021-01-18

# 一連の不幸な反事実的出来事:反事実的説明における時間の役割

A Series of Unfortunate Counterfactual Events: the Role of Time in Counterfactual Explanations ( http://arxiv.org/abs/2010.04687v2 )

ライセンス: Link先を確認

Andrea Ferrario, Michele Loi

(参考訳) 反事実的説明は、説明可能な人工知能研究領域におけるポストホック解釈可能性手法の顕著な例である。彼らは個人に代替シナリオと一連のレコメンデーションを提供し、機械学習モデルの結果を追求する。近年,本論文は,現実の文脈における適用性を支えるための実現可能性や行動可能性,スパーシティといった反事実的説明のデシデラタを特定している。しかし,本論文は,反実的説明の時間依存性の問題を無視していることを示す。時間的依存とレコメンデーションの提供のため、現実のアプリケーションでは、実現可能で、行動可能で、スパースなカウンターファクトな説明が適さないかもしれない、と我々は主張する。これは、私たちが"不幸な反ファクトイベント"と呼ぶものが出現する可能性があるためです。これらの出来事は、結果を説明する必要がある機械学習モデルの再訓練によって起こりうる。一連の不幸な反事実的出来事は、反事実的説明の推奨を実行に移した人々の努力をいらいらさせる。これは、学習支援決定を一貫して提供できる機関の能力に対する人々の信頼に悪影響を及ぼす。本稿では,反事実的説明の履歴を利用した不運な反事実的事象の発生問題に対処するためのアプローチを提案する。本論文の最終部では,不運な対実事件に対処する2つの異なる戦略の倫理的分析を提案する。信用貸付組織の信頼度、採用する意思決定モデル、信用貸付の社会的経済的機能を維持するための倫理的責任を負う命令に反応することを示す。

Counterfactual explanations are a prominent example of post-hoc interpretability methods in the explainable Artificial Intelligence research domain. They provide individuals with alternative scenarios and a set of recommendations to achieve a sought-after machine learning model outcome. Recently, the literature has identified desiderata of counterfactual explanations, such as feasibility, actionability and sparsity that should support their applicability in real-world contexts. However, we show that the literature has neglected the problem of the time dependency of counterfactual explanations. We argue that, due to their time dependency and because of the provision of recommendations, even feasible, actionable and sparse counterfactual explanations may not be appropriate in real-world applications. This is due to the possible emergence of what we call "unfortunate counterfactual events." These events may occur due to the retraining of machine learning models whose outcomes have to be explained via counterfactual explanation. Series of unfortunate counterfactual events frustrate the efforts of those individuals who successfully implemented the recommendations of counterfactual explanations. This negatively affects people's trust in the ability of institutions to provide machine learning-supported decisions consistently. We introduce an approach to address the problem of the emergence of unfortunate counterfactual events that makes use of histories of counterfactual explanations. In the final part of the paper we propose an ethical analysis of two distinct strategies to cope with the challenge of unfortunate counterfactual events. We show that they respond to an ethically responsible imperative to preserve the trustworthiness of credit lending organizations, the decision models they employ, and the social-economic function of credit lending.

翻訳日:2022-10-09 05:40:58 公開日:2021-01-18

# スケルトンベース行動認識のためのポーズ改善グラフ畳み込みネットワーク

Pose Refinement Graph Convolutional Network for Skeleton-based Action Recognition ( http://arxiv.org/abs/2010.07367v2 )

ライセンス: Link先を確認

Shijie Li, Jinhui Yi, Yazan Abu Farha and Juergen Gall

(参考訳) 2Dまたは3Dの骨格データを取得する進歩により、骨格に基づく行動認識はここ数年で注目されている。スケルトンデータはグラフによって一般的に表現されるので、グラフ畳み込みネットワークが提案されている。現在のグラフ畳み込みネットワークはアクションを正確に認識するが、計算資源が限られているロボティクスアプリケーションでは高価すぎる。そこで本稿では,従来の作業の限界に対処する高効率なグラフ畳み込みネットワークを提案する。これは、動きと空間情報を徐々に融合し、時間分解能をできるだけ早く低減する並列構造によって達成される。さらに、人間のポーズがエラーを含むことができる問題に明示的に対処する。この目的のために、ネットワークはまず、アクションを認識するためにさらに処理される前に、ポーズを洗練する。したがって、我々はネットワークを Pose Refinement Graph Convolutional Network と呼ぶ。他のグラフ畳み込みネットワークと比較して、我々のネットワークはパラメータを86\%-93\%少なくし、浮動小数点演算を89%-96%削減し、同等の精度を達成する。したがって、精度、メモリフットプリント、処理時間の間のトレードオフがより良くなり、ロボティクスアプリケーションに適している。

With the advances in capturing 2D or 3D skeleton data, skeleton-based action recognition has received an increasing interest over the last years. As skeleton data is commonly represented by graphs, graph convolutional networks have been proposed for this task. While current graph convolutional networks accurately recognize actions, they are too expensive for robotics applications where limited computational resources are available. In this paper, we therefore propose a highly efficient graph convolutional network that addresses the limitations of previous works. This is achieved by a parallel structure that gradually fuses motion and spatial information and by reducing the temporal resolution as early as possible. Furthermore, we explicitly address the issue that human poses can contain errors. To this end, the network first refines the poses before they are further processed to recognize the action. We therefore call the network Pose Refinement Graph Convolutional Network. Compared to other graph convolutional networks, our network requires 86\%-93\% less parameters and reduces the floating point operations by 89%-96% while achieving a comparable accuracy. It therefore provides a much better trade-off between accuracy, memory footprint and processing time, which makes it suitable for robotics applications.

翻訳日:2022-10-07 13:57:38 公開日:2021-01-18

# MRI前立腺病変分類における領域適応の不確かさ

Harnessing Uncertainty in Domain Adaptation for MRI Prostate Lesion Segmentation ( http://arxiv.org/abs/2010.07411v2 )

ライセンス: Link先を確認

Eleni Chiou, Francesco Giganti, Shonit Punwani, Iasonas Kokkinos, Eleftheria Panagiotaki

(参考訳) トレーニングデータの必要性は、学習型医用画像解析における新しい画像モダリティの導入を妨げる可能性がある。ドメイン適応法は、関連するソースドメインから新しいターゲットドメインにトレーニングデータを変換することで部分的にこの問題を軽減するが、一般的には1対1の翻訳が可能であると仮定する。我々の研究は、単一のソースサンプルから複数のターゲットサンプルが出現する、より情報的なターゲットドメインに適応するという課題に対処する。特に,癌評価のための最適化された取得プロトコルを含む,よりリッチなMRIモダリティである mp-MRI から VERDICT への変換を検討する。我々は、このマッピングの固有の不確実性を明確に説明し、1つの入力で条件付けられた複数の出力を生成するためにそれを利用する。以上の結果から,単純なCycleGANベースラインと,識別的セグメンテーション損失と/または残差アダプタを併用したより強力なアプローチの両面から,対象領域に対する画像表現を系統的に向上させることが可能であることが示唆された。決定論的手法と比較して、我々の手法は、幅広いデータセットサイズ、ますます強力なベースライン、評価尺度で大幅に改善される。

The need for training data can impede the adoption of novel imaging modalities for learning-based medical image analysis. Domain adaptation methods partially mitigate this problem by translating training data from a related source domain to a novel target domain, but typically assume that a one-to-one translation is possible. Our work addresses the challenge of adapting to a more informative target domain where multiple target samples can emerge from a single source sample. In particular we consider translating from mp-MRI to VERDICT, a richer MRI modality involving an optimized acquisition protocol for cancer characterization. We explicitly account for the inherent uncertainty of this mapping and exploit it to generate multiple outputs conditioned on a single input. Our results show that this allows us to extract systematically better image representations for the target domain, when used in tandem with both simple, CycleGAN-based baselines, as well as more powerful approaches that integrate discriminative segmentation losses and/or residual adapters. When compared to its deterministic counterparts, our approach yields substantial improvements across a broad range of dataset sizes, increasingly strong baselines, and evaluation measures.

翻訳日:2022-10-07 13:10:28 公開日:2021-01-18

# 条件付き変分オートエンコーダにおけるマルチモーダル潜時空間のエビデンシャルスカラー化

Evidential Sparsification of Multimodal Latent Spaces in Conditional Variational Autoencoders ( http://arxiv.org/abs/2010.09164v3 )

ライセンス: Link先を確認

Masha Itkina, Boris Ivanovic, Ransalu Senanayake, Mykel J. Kochenderfer, and Marco Pavone

(参考訳) 変分オートエンコーダの離散的潜在空間は、自然言語理解、人間の意図予測、視覚シーン表現など、多くの現実世界の問題に対するデータ分布を効果的に捉えることが示されている。しかし、離散潜在空間は実世界のデータの複雑さを捉えるのに十分な大きさでなければならない。例えば、高次元の潜在環境表現で動き計画を実行することは難解である。学習されたマルチモダリティを保ちつつ、訓練された条件付き変分オートエンコーダの離散的潜在空間をスパースする問題を考える。ポストホック潜在空間還元法として,特定の入力条件から直接的証拠を受け取る潜在クラスを同定し,そうでないクラスをフィルタする。画像生成や人間の行動予測などの多様なタスクの実験は、学習された多モード性を維持しながら、モデルの離散潜在サンプル空間サイズを小さくする手法の有効性を実証する。

Discrete latent spaces in variational autoencoders have been shown to effectively capture the data distribution for many real-world problems such as natural language understanding, human intent prediction, and visual scene representation. However, discrete latent spaces need to be sufficiently large to capture the complexities of real-world data, rendering downstream tasks computationally challenging. For instance, performing motion planning in a high-dimensional latent representation of the environment could be intractable. We consider the problem of sparsifying the discrete latent space of a trained conditional variational autoencoder, while preserving its learned multimodality. As a post hoc latent space reduction technique, we use evidential theory to identify the latent classes that receive direct evidence from a particular input condition and filter out those that do not. Experiments on diverse tasks, such as image generation and human behavior prediction, demonstrate the effectiveness of our proposed technique at reducing the discrete latent sample space size of a model while maintaining its learned multimodality.

翻訳日:2022-10-05 20:21:16 公開日:2021-01-18

# 連続学習のためのモジュール関連性

Modular-Relatedness for Continual Learning ( http://arxiv.org/abs/2011.01272v2 )

ライセンス: Link先を確認

Ammar Shaker, Shujian Yu, Francesco Alesiani

(参考訳) 本稿では,逐次的タスク学習者にとって有益な連続学習(CL)手法を提案する。このアプローチの主なターゲットは、ニューラルネットワークのモジュール部分の自動抽出と、これらのモジュールコンポーネントによって与えられたタスク間の関連性の推定です。この手法は、正規化ベースの(例えばElastic Weight Consolidation)やリハーサルベースの(例えばGradient Episodic Memory)といった、エピソードメモリを必要とするCLメソッドの異なるファミリーに適用できる。実験結果から,EWC や GEM などの手法,特にメモリ予算が極めて限られている場合に,顕著な性能向上(忘れることへの堅牢性)が得られた。

In this paper, we propose a continual learning (CL) technique that is beneficial to sequential task learners by improving their retained accuracy and reducing catastrophic forgetting. The principal target of our approach is the automatic extraction of modular parts of the neural network and then estimating the relatedness between the tasks given these modular components. This technique is applicable to different families of CL methods such as regularization-based (e.g., the Elastic Weight Consolidation) or the rehearsal-based (e.g., the Gradient Episodic Memory) approaches where episodic memory is needed. Empirical results demonstrate remarkable performance gain (in terms of robustness to forgetting) for methods such as EWC and GEM based on our technique, especially when the memory budget is very limited.

翻訳日:2022-09-30 11:20:50 公開日:2021-01-18

# 天然ガスタービン発電プラントにおけるプロセス解析と予測モデリングによるnoxの環境汚染予測

Environmental Pollution Prediction of NOx by Process Analysis and Predictive Modelling in Natural Gas Turbine Power Plants ( http://arxiv.org/abs/2011.08978v2 )

ライセンス: Link先を確認

Alan Rezazadeh

(参考訳) 本研究の目的は,天然ガス発電タービンからのNOx排出を予測するK-Nearest-Neighbor (KNN) アルゴリズムを提案することである。電力生産のプロセスは、気象や電力網の要求など多くの要因により、動的かつ急速に変化している。ガスタービン装置はタービンの寿命とともに機器特性や熱力学的挙動が変化するため、発電のダイナミックな部分でもある。タービンの定期的なメンテナンスも発電プロセスの別のダイナミックな部分であり、機器の性能に影響を及ぼす。この分析は、比較的小さなデータセットでトレーニングされたKNNを使用して、最も正確な予測率を生成する。このステートメントは、KNNが現在の入力パラメータに最も近いKを見つけ、歴史的に類似した観測の定式平均を予測として推定するときに論理的に説明できる。本稿では,環境条件,電気出力,タービン性能要因を取り入れ,nox排出量予測のための機械学習モデルを構築した。このモデルは、有害な排出を減らすための運用プロセスを最適化し、全体の運用効率を向上させるために使用できる。原理成分アルゴリズム(PCA)のような潜在アルゴリズムは、機器の性能変化を監視し、プロセスのパラメントに深く影響し、結果としてNOx排出量を決定する。本報告では,多変量解析,クラスタリング,残差解析などの機械学習性能評価の典型的な統計的手法を用いている。

The main objective of this paper is to propose K-Nearest-Neighbor (KNN) algorithm for predicting NOx emissions from natural gas electrical generation turbines. The process of producing electricity is dynamic and rapidly changing due to many factors such as weather and electrical grid requirements. Gas turbine equipment are also a dynamic part of the electricity generation since the equipment characteristics and thermodynamics behavior change as the turbines age. Regular maintenance of turbines are also another dynamic part of the electrical generation process, affecting the performance of equipment. This analysis discovered using KNN, trained on relatively small dataset produces the most accurate prediction rates. This statement can be logically explained as KNN finds the K nearest neighbor to the current input parameters and estimates a rated average of historically similar observations as prediction. This paper incorporates ambient weather conditions, electrical output as well as turbine performance factors to build a machine learning model to predict NOx emissions. The model can be used to optimize the operational processes for reduction in harmful emissions and increasing overall operational efficiency. Latent algorithms such as Principle Component Algorithms (PCA) have been used for monitoring the equipment performance behavior change which deeply influences process paraments and consequently determines NOx emissions. Typical statistical methods of machine learning performance evaluations such as multivariate analysis, clustering and residual analysis have been used throughout the paper.

翻訳日:2022-09-29 13:10:16 公開日:2021-01-18

# オートエンコーダを用いた教師付き学習における誤ラベル画像の同定

Identifying Mislabeled Images in Supervised Learning Utilizing Autoencoder ( http://arxiv.org/abs/2011.03667v2 )

ライセンス: Link先を確認

Yunhao Yang, Andrew Whinston

(参考訳) 教師付き学習は、トレーニングデータの基底真理が正確であるという仮定に基づいている。しかし、これは現実世界の設定では保証されない。不正確なトレーニングデータは、予想外の予測をもたらす。画像分類では、不正確なラベルによって分類モデルも不正確になる可能性がある。本稿では,分類ネットワークを訓練する前に,教師なしの手法をトレーニングデータに適用する。画像のエンコードおよび再構成に畳み込みオートエンコーダを適用する。エンコーダは画像データを潜在空間に投影する。潜在空間では、画像の特徴は低い次元で保存される。同様の特徴を持つデータサンプルは、同じラベルを持つ可能性が高いと仮定する。ノイズサンプルは、密度ベーススキャン(DBSCAN)クラスタリングアルゴリズムによって潜在空間に分類することができる。これらの不正確なラベル付きデータは潜在空間の異常値として可視化される。そのため、DBSCANアルゴリズムで同定された外れ値は、誤ってラベル付けされたサンプルに分類することができる。外れ値が検出されると、すべての外れ値が誤ってラベル付けされたデータサンプルとして扱われ、データセットから削除される。これにより、教師付き学習ネットワークのトレーニングにトレーニングデータを直接使用できる。このアルゴリズムは、実験データセットの67%以上の不正ラベル付きデータを検出および削除することができる。

Supervised learning is based on the assumption that the ground truth in the training data is accurate. However, this may not be guaranteed in real-world settings. Inaccurate training data will result in some unexpected predictions. In image classification, incorrect labels may cause the classification model to be inaccurate as well. In this paper, I am going to apply unsupervised techniques to the training data before training the classification network. A convolutional autoencoder is applied to encode and reconstruct images. The encoder will project the image data on to latent space. In the latent space, image features are preserved in a lower dimension. The assumption is that data samples with similar features are likely to have the same label. Noised samples can be classified in the latent space by the Density-Base Scan (DBSCAN) clustering algorithm. These incorrectly labeled data are visualized as outliers in the latent space. Therefore, the outliers identified by the DBSCAN algorithm can be classified as incorrectly labeled samples. After the outliers are detected, all the outliers are treated as mislabeled data samples and removed from the dataset. Thus the training data can be directly used in training the supervised learning network. The algorithm can detect and remove above 67\% of mislabeled data in the experimental dataset.

翻訳日:2022-09-28 22:27:06 公開日:2021-01-18

# 自己修飾機能を有する有界有理エージェントの性能

Performance of Bounded-Rational Agents With the Ability to Self-Modify ( http://arxiv.org/abs/2011.06275v2 )

ライセンス: Link先を確認

Jakub T\v{e}tek, Marek Sklenka, Tom\'a\v{s} Gaven\v{c}iak

(参考訳) 複雑な環境に埋め込まれたエージェントの自己修正は、直接的手段(例えば、コードの変更)や間接的(例えば、オペレーターに影響、バグを悪用する、あるいは環境を悪用する)によって発生するのを避けるのが難しい。インテリジェントエージェントは、将来のインスタンスが同じ目標に向かって動くように、ユーティリティ機能を変更することを避けるインセンティブがある、と論じられている。 Everitt et al. (2016) は、完全に合理的なエージェントに対して自己修正オプションを提供することは無害であることを示した。この結果は有界合理性を持つエージェントにはもはや当てはまらないことを示す。このようなエージェントでは、自己修飾は、パフォーマンスの指数関数的劣化と、予め整列されたエージェントの徐々にの不適応を引き起こす可能性がある。この効果の大きさが、エージェントの合理性における不完全性のタイプと大きさ(以下1-4)に依存するかを検討する。また,モデル仮定とより広い問題とフレーミング空間についても論じる。エージェントが有界有理化できる4つの方法を検討する。(1)は必ずしも最適な行動を選択しない、(2)は人間の値と完全に一致しない、(3)は環境の不正確なモデルを持っている、(4)は間違った時間的割引係数を使用する。 2)-(4)の場合,エージェントの不完全性に起因する誤用は時間とともに増大しないが,(1)誤用は指数関数的に増加する可能性がある。

Self-modification of agents embedded in complex environments is hard to avoid, whether it happens via direct means (e.g. own code modification) or indirectly (e.g. influencing the operator, exploiting bugs or the environment). It has been argued that intelligent agents have an incentive to avoid modifying their utility function so that their future instances work towards the same goals. Everitt et al. (2016) formally show that providing an option to self-modify is harmless for perfectly rational agents. We show that this result is no longer true for agents with bounded rationality. In such agents, self-modification may cause exponential deterioration in performance and gradual misalignment of a previously aligned agent. We investigate how the size of this effect depends on the type and magnitude of imperfections in the agent's rationality (1-4 below). We also discuss model assumptions and the wider problem and framing space. We examine four ways in which an agent can be bounded-rational: it either (1) doesn't always choose the optimal action, (2) is not perfectly aligned with human values, (3) has an inaccurate model of the environment, or (4) uses the wrong temporal discounting factor. We show that while in the cases (2)-(4) the misalignment caused by the agent's imperfection does not increase over time, with (1) the misalignment may grow exponentially.

翻訳日:2022-09-26 07:24:47 公開日:2021-01-18

# (参考訳) リモートセンシングデータを用いたインドにおける大気汚染のシグネチャの同定

Use of Remote Sensing Data to Identify Air Pollution Signatures in India ( http://arxiv.org/abs/2012.00402v2 )

ライセンス: CC BY 4.0

Sivaramakrishnan KN, Lipika Deka, Manik Gupta

(参考訳) 大気汚染は国家の社会経済的地位に大きな影響を及ぼし、主要な大気汚染源を特定することが問題に取り組む中心となっている。インドのように様々な国にまたがる空間的・時間的な空気質データ取得は、このような分析の課題となっている。センチネル5P衛星の打ち上げは、地球規模の大気汚染物質を毎日観測するよりも幅広い種類の大気汚染物質を観測するのに役立った。本章では、センチネル-5p衛星から得られた時空間的マルチ汚染物質データを、インド国内の各地域、およびそれに伴う月平均汚染サインおよび各クラスターで表される傾向を導出して提示し、各種汚染源から放出される汚染物質の種類に基づいて、国や地域を特定するためにクラスタリング署名を用いる。

Air quality has major impact on a country's socio-economic position and identifying major air pollution sources is at the heart of tackling the issue. Spatially and temporally distributed air quality data acquisition across a country as varied as India has been a challenge to such analysis. The launch of the Sentinel-5P satellite has helped in the observation of a wider variety of air pollutants than measured before at a global scale on a daily basis. In this chapter, spatio-temporal multi pollutant data retrieved from Sentinel-5P satellite is used to cluster states as well as districts in India and associated average monthly pollution signature and trends depicted by each of the clusters are derived and presented.The clustering signatures can be used to identify states and districts based on the types of pollutants emitted by various pollution sources.

翻訳日:2021-05-31 10:20:02 公開日:2021-01-18

# (参考訳) クロスエントロピー損失を伴う神経崩壊

Neural Collapse with Cross-Entropy Loss ( http://arxiv.org/abs/2012.08465v2 )

ライセンス: CC BY 4.0

Jianfeng Lu, Stefan Steinerberger

(参考訳) 我々は、単位超球面上の n$ 特徴ベクトルを $\mathbb{r}^d$ とするクロスエントロピー損失の変分問題を考える。我々は、$d \geq n1$ のとき、大域的最小値は、神経崩壊の振る舞いを正当化するsimplex equiangular tight frameによって与えられることを証明する。また、固定$d$の$n \rightarrow \infty$として、極小化点は超球面上で一様に分布し、ベネデット・アンド・フィッカスのフレームポテンシャルとの接続を示す。

We consider the variational problem of cross-entropy loss with $n$ feature vectors on a unit hypersphere in $\mathbb{R}^d$. We prove that when $d \geq n - 1$, the global minimum is given by the simplex equiangular tight frame, which justifies the neural collapse behavior. We also prove that as $n \rightarrow \infty$ with fixed $d$, the minimizing points will distribute uniformly on the hypersphere and show a connection with the frame potential of Benedetto & Fickus.

翻訳日:2021-05-07 10:10:52 公開日:2021-01-18

# (参考訳) グラフニューラルネットワーク - 分類学、進歩、トレンド

Graph Neural Networks: Taxonomy, Advances and Trends ( http://arxiv.org/abs/2012.08752v2 )

ライセンス: CC BY 4.0

Yu Zhou, Haixia Zheng, Xin Huang

(参考訳) グラフニューラルネットワークは、特定のタスクに応じて、現実世界のグラフを低次元空間に埋め込む強力なツールキットを提供する。これまでのところ、このトピックに関するいくつかの調査がある。しかし、通常は異なる角度に重点を置いているため、読者はグラフニューラルネットワークのパノラマを見ることができない。この調査は、この制限を克服し、グラフニューラルネットワークの包括的なレビューを提供することを目的としている。まず、グラフニューラルネットワークの新しい分類法を提供し、その後、最大400の関連する文献を参照して、グラフニューラルネットワークのパノラマを示す。これらはすべて対応するカテゴリに分類される。グラフニューラルネットワークを新たな段階に導くために,我々は,直面する課題を克服するために,今後4つの研究方向をまとめる。より多くの研究者がグラフニューラルネットワークを理解し、活用し、研究コミュニティで利用することが期待されている。

Graph neural networks provide a powerful toolkit for embedding real-world graphs into low-dimensional spaces according to specific tasks. Up to now, there have been several surveys on this topic. However, they usually lay emphasis on different angles so that the readers can not see a panorama of the graph neural networks. This survey aims to overcome this limitation, and provide a comprehensive review on the graph neural networks. First of all, we provide a novel taxonomy for the graph neural networks, and then refer to up to 400 relevant literatures to show the panorama of the graph neural networks. All of them are classified into the corresponding categories. In order to drive the graph neural networks into a new stage, we summarize four future research directions so as to overcome the facing challenges. It is expected that more and more scholars can understand and exploit the graph neural networks, and use them in their research community.

翻訳日:2021-05-06 11:21:26 公開日:2021-01-18

# 学習ブロックベースハイブリッド画像圧縮

Learned Block-based Hybrid Image Compression ( http://arxiv.org/abs/2012.09550v3 )

ライセンス: Link先を確認

Yaojun Wu, Xin Li, Zhizheng Zhang, Xin Jin, Zhibo Chen

(参考訳) 近年の学習画像圧縮技術は, 符号化処理と復号処理をフル解像度で行い, 実用用途に展開する際の2つの問題点を生じさせている。第一に、自己回帰エントロピーモデルの並列加速度はシリアルデコードにより達成できない。第二に、フル解像度の推論は、特に高解像度の画像に対して、GPUリソースが限られているメモリ外問題(OOM)を引き起こすことが多い。ブロックパーティションは上記の問題に対処するためのよい設計選択だが、ブロック間の冗長性を減らし、ブロック効果をなくすという新たな課題をもたらす。上記の課題に対処するため,本稿では,学習ブロックベースハイブリッド画像圧縮(LBHIC)フレームワークを提案する。具体的には,隣接ブロック間の関係を利用するために,学習画像圧縮フレームワークに明示的な内部予測を導入する。従来のコーデックにおける隣接画素の線形重み付けによるコンテキストモデリングに優れており、ストリッププーリングを利用して隣接潜在空間における最も関連する情報を抽出し、効果的な情報予測を実現することで、長距離相関をよりよく捉えるコンテキスト予測モジュール(cpm)を提案する。さらに,ブロッキングアーティファクトを緩和するために,エッジの重要性を考慮した境界対応後処理モジュール(BPM)を提案する。広範な実験により、lbhicコーデックはvvcを4.1%のビットレート保存で上回り、最先端の学習画像圧縮法と比較して約86.7%の復号時間を削減できることが示されている。

Recent works on learned image compression perform encoding and decoding processes in a full-resolution manner, resulting in two problems when deployed for practical applications. First, parallel acceleration of the autoregressive entropy model cannot be achieved due to serial decoding. Second, full-resolution inference often causes the out-of-memory(OOM) problem with limited GPU resources, especially for high-resolution images. Block partition is a good design choice to handle the above issues, but it brings about new challenges in reducing the redundancy between blocks and eliminating block effects. To tackle the above challenges, this paper provides a learned block-based hybrid image compression (LBHIC) framework. Specifically, we introduce explicit intra prediction into a learned image compression framework to utilize the relation among adjacent blocks. Superior to context modeling by linear weighting of neighbor pixels in traditional codecs, we propose a contextual prediction module (CPM) to better capture long-range correlations by utilizing the strip pooling to extract the most relevant information in neighboring latent space, thus achieving effective information prediction. Moreover, to alleviate blocking artifacts, we further propose a boundary-aware postprocessing module (BPM) with the edge importance taken into account. Extensive experiments demonstrate that the proposed LBHIC codec outperforms the VVC, with a bit-rate conservation of 4.1%, and reduces the decoding time by approximately 86.7% compared with that of state-of-the-art learned image compression methods.

翻訳日:2021-05-02 07:17:13 公開日:2021-01-18

# フレキシビリティ設計問題に対する強化学習

Reinforcement Learning for Flexibility Design Problems ( http://arxiv.org/abs/2101.00355v2 )

ライセンス: Link先を確認

Yehua Wei, Lei Zhang, Ruiyi Zhang, Shijing Si, Hao Zhang, Lawrence Carin

(参考訳) フレキシビリティ設計問題(英: Flexibility design problem)とは、産業間の戦略的意思決定において、柔軟性と適応性を持つネットワーク(例えば製造コスト)を設計することを目的とする問題である。基礎となる組合せの性質と確率的目的は、標準最適化法において柔軟性設計の問題を引き起こす。本稿では、柔軟性設計問題に対する強化学習(RL)フレームワークを開発する。具体的には、実験的な成功を確実にするため、ノイズ探索と分散低減によるメカニズムを慎重に設計し、高速適応の観点からRLの独特な利点を示す。実験結果から、RLに基づく手法は古典的ヒューリスティックよりも優れた解を常に見出すことが示された。

Flexibility design problems are a class of problems that appear in strategic decision-making across industries, where the objective is to design a ($e.g.$, manufacturing) network that affords flexibility and adaptivity. The underlying combinatorial nature and stochastic objectives make flexibility design problems challenging for standard optimization methods. In this paper, we develop a reinforcement learning (RL) framework for flexibility design problems. Specifically, we carefully design mechanisms with noisy exploration and variance reduction to ensure empirical success and show the unique advantage of RL in terms of fast-adaptation. Empirical results show that the RL-based method consistently finds better solutions compared to classical heuristics.

翻訳日:2021-04-13 07:21:53 公開日:2021-01-18

# Rough Set AlgebraとCoreular Double Stone Algebraについての一考察

A Note on Rough Set Algebra and Core Regular Double Stone Algebras ( http://arxiv.org/abs/2101.02313v2 )

ライセンス: Link先を確認

Daniel J. Clouse

(参考訳) 近似空間 $\langle u,\theta \rangle$ が与えられたとき、$e$ が$\theta$ の同値類のインデックス集合であると仮定し、$r_\theta$ を通常の二重石代数として $\langle\underline{x},\overline{x}\rangle$ という形の粗集合の集合と、i. dunstch がカトリナック代数と呼ぶものと仮定する。 [7],[8] が [1] で与えられる証明から別の証明を与える:$|\theta_u| > 1\ \forall\ u \in U$ ならば、$R_\theta$ は核正則な二重ストーン代数である。さらに、$C_3$ は 3 つの元鎖をコア正則ダブルストーン代数とし、$TP_U$ は集合 $U$ 上の三次分割の集合を表す。 R_\theta$ with $|\theta_u| > 1\ \forall\ u \in U$ to be isomorphic to $TP_E$ and $C_3^E$, with $E$ is a indexing set for $\theta$, and the three CRDSA's are complete and atomic。これはアプリケーション内の特定の$r_\theta$を扱うときに非常に便利だと思います。 r_\theta$をそれぞれ$tp_u$、$c_3^u$、$\phi\circ \alpha_r:r_\theta\hookrightarrow tp_u\hookrightarrow c_3^u$に組み込む方法を明確に示します。定理 3 と [7] の補題 2.4 を踏襲すると、$c_3^j \cong r_\theta$ for $\langle u,\theta \rangle$ $u = j \times \{0,1\}$, $\theta = \{(j0),(j1)\} : j \in j\}$ で与えられる近似空間が示され、すべての crdsa は主粗集合代数の部分代数 $r_\theta$ に同型である。最後に、 [1] から例を拡張することで、これと主定理を実証する。さらに、一般に$TP_U$ および $C_3^U$ の部分代数についてもう少し知ることができ、これは任意の同値関係の同値類に対するインデックス集合である$E$ に対して、$|\theta_u| > 1\ \forall\ u \in U$ に対して存在しなければならない。

Given an approximation space $\langle U,\theta \rangle$, assume that $E$ is the indexing set for the equivalence classes of $\theta$ and let $R_\theta$ denote the collection of rough sets of the form $\langle\underline{X},\overline{X}\rangle$ as a regular double Stone algebra and what I. Dunstch referred to as a Katrinak algebra.[7],[8] We give an alternate proof from the one given in [1] of the fact that if $|\theta_u| > 1\ \forall\ u \in U$ then $R_\theta$ is a core regular double Stone algebra. Further let $C_3$ denote the 3 element chain as a core regular double Stone algebra and $TP_U$ denote the collection of ternary partitions over the set $U$. In our Main Theorem we show $R_\theta$ with $|\theta_u| > 1\ \forall\ u \in U$ to be isomorphic to $TP_E$ and $C_3^E$, with $E$ is an indexing set for $\theta$, and that the three CRDSA's are complete and atomic. We feel this could be very useful when dealing with a specific $R_\theta$ in an application. In our Main Corollary we show explicitly how we can embed such $R_\theta$ in $TP_U$, $C_3^U$, respectively, $\phi\circ \alpha_r:R_\theta\hookrightarrow TP_U\hookrightarrow C_3^U$, and hence identify it with its specific images. Following in the footsteps of Theorem 3. and Corollary 2.4 of [7], we show $C_3^J \cong R_\theta$ for $\langle U,\theta \rangle$ the approximation space given by $U = J \times \{0,1\}$, $\theta = \{(j0),(j1)\} : j \in J\}$ and every CRDSA is isomorphic to a subalgebra of a principal rough set algebra, $R_\theta$, for some approximation space $\langle U,\theta \rangle$. Finally, we demonstrate this and our Main Theorem by expanding an example from [1]. Further, we know a little more about the subalgebras of $TP_U$ and $C_3^U$ in general as they must exist for every $E$ that is an indexing set for the equivalence classes of any equivalence relation $\theta$ on $U$ satisfying $|\theta_u| > 1\ \forall\ u \in U$.

翻訳日:2021-04-10 13:29:02 公開日:2021-01-18

# 良い生徒が大きな宝くじを弾く

Good Students Play Big Lottery Better ( http://arxiv.org/abs/2101.03255v2 )

ライセンス: Link先を確認

Haoyu Ma, Tianlong Chen, Ting-Kuei Hu, Chenyu You, Xiaohui Xie, Zhangyang Wang

(参考訳) 宝くじの仮説は、高密度ニューラルネットワークは、(同じ)ランダム初期化から訓練されたとき、元の高密度ネットのテスト精度に一致するスパースサブネットワークを含んでいることを示唆している。しかし、この仮説はResNet-50のようなより大きなネットワークに一般化できなかった。近年の研究では、ランダム初期化ではなく、初期モデルの訓練重量や学習率から再学習する巻き戻し技術を用いてスパースサブネットワークが得られることが示されている。 rewindingは宝くじをスケールアップする唯一の方法か、あるいは最良の方法か? 本稿では,KDチケット(Knowledge Distillation ticket)と呼ばれるサブネットワークの再学習手法を提案する。 rewindingは、大規模ネットワークでの抽選チケットを改善するために、初期のトレーニングフェーズから知識を継承する価値を利用する。対照的に、KDチケットは相補的な可能性に対処し、密集モデルの後期トレーニングフェーズから有用な知識を継承する。トレーニングされた高密度モデルによって生成されたソフトラベルを活用して、ハードラベルの代わりにサブネットワークをトレーニングする。 CIFAR-10とImageNetデータセット上の複数の大きなディープネットワーク(ResNet-50やResNet-110など)を使用して大規模な実験を行う。ベルやホイッスルがなければ、kdチケットはリワインディングと同等かそれ以上の性能を発揮するが、ハイパーパラメータやアドホックな選択がほとんどない。 KDチケットはさらに巻き戻しと共に適用でき、大規模宝くじの最先端結果が得られる。

Lottery ticket hypothesis suggests that a dense neural network contains a sparse sub-network that can match the test accuracy of the original dense net when trained in isolation from (the same) random initialization. However, the hypothesis failed to generalize to larger dense networks such as ResNet-50. As a remedy, recent studies demonstrate that a sparse sub-network can still be obtained by using a rewinding technique, which is to re-train it from early-phase training weights or learning rates of the dense model, rather than from random initialization. Is rewinding the only or the best way to scale up lottery tickets? This paper proposes a new, simpler and yet powerful technique for re-training the sub-network, called "Knowledge Distillation ticket" (KD ticket). Rewinding exploits the value of inheriting knowledge from the early training phase to improve lottery tickets in large networks. In comparison, KD ticket addresses a complementary possibility - inheriting useful knowledge from the late training phase of the dense model. It is achieved by leveraging the soft labels generated by the trained dense model to re-train the sub-network, instead of the hard labels. Extensive experiments are conducted using several large deep networks (e.g ResNet-50 and ResNet-110) on CIFAR-10 and ImageNet datasets. Without bells and whistles, when applied by itself, KD ticket performs on par or better than rewinding, while being nearly free of hyperparameters or ad-hoc selection. KD ticket can be further applied together with rewinding, yielding state-of-the-art results for large-scale lottery tickets.

翻訳日:2021-04-10 05:11:05 公開日:2021-01-18

# 特徴変換と自己重み付け注意に基づくレゾリューション不変人物reid

Resolution-invariant Person ReID Based on Feature Transformation and Self-weighted Attention ( http://arxiv.org/abs/2101.04544v2 )

ライセンス: Link先を確認

Ziyue Zhang, Shuai Jiang, Congzhentao Huang, Richard Yi Da Xu

(参考訳) Person Re-identification (ReID) は、画像やビデオのシーケンスで同一人物と一致することを目的としたコンピュータビジョンタスクである。現在の作品のほとんどは、画像の解像度が同じである設定に焦点を当てている。しかし、この解像度は人物のReIDにおいて重要な要素であり、特にカメラが人物と異なる距離にある場合や、カメラのモデルが異なる場合などである。本稿では,RID特徴変換(RAFT)モジュールと自己重み付きアテンション(SWA)ReIDモジュールを組み合わせた2ストリームネットワークを提案する。 RAFTは低解像度特徴を対応する高解像度特徴に変換する。 SWAは、両方の特徴を評価して、ReIDの重み付けを行う。どちらのモジュールも解像度不変表現を得るために共同で訓練されている。 5つのベンチマークデータセットの大規模な実験により,本手法の有効性が示された。例えば、caviar と mlr-cuhk03 における rank-1 の精度は43.3% と 83.2% である。

Person Re-identification (ReID) is a critical computer vision task which aims to match the same person in images or video sequences. Most current works focus on settings where the resolution of images is kept the same. However, the resolution is a crucial factor in person ReID, especially when the cameras are at different distances from the person or the camera's models are different from each other. In this paper, we propose a novel two-stream network with a lightweight resolution association ReID feature transformation (RAFT) module and a self-weighted attention (SWA) ReID module to evaluate features under different resolutions. RAFT transforms the low resolution features to corresponding high resolution features. SWA evaluates both features to get weight factors for the person ReID. Both modules are jointly trained to get a resolution-invariant representation. Extensive experiments on five benchmark datasets show the effectiveness of our method. For instance, we achieve Rank-1 accuracy of 43.3% and 83.2% on CAVIAR and MLR-CUHK03, outperforming the state-of-the-art.

翻訳日:2021-04-04 01:43:41 公開日:2021-01-18

# (参考訳) ランダムシャドウとハイライト: 極端照明条件のための新しいデータ拡張法

Random Shadows and Highlights: A new data augmentation method for extreme lighting conditions ( http://arxiv.org/abs/2101.05361v2 )

ライセンス: CC BY 4.0

Osama Mazhar and Jens Kober

(参考訳) 本稿では,光の摂動に対するロバスト性を得るために,新しいデータ拡張手法であるランダムシャドウとハイライト(RSH)を提案する。提案手法はランダムな影と画像のハイライトを生成するため,学習過程においてニューラルネットワークに挑戦し,現実世界のアプリケーションにおける入力汚職に対する免疫を得る。これはパラメータ学習自由手法であり、ほとんどの視覚関連学習アプリケーションに統合することができる。広汎な実験により、RSHは照明摂動に対するモデルの堅牢性を高めるだけでなく、過度な適合性を著しく低減することを示した。したがって、RSHはすべての視覚関連学習システムに不可欠であると考えられるべきである。コードはhttps://github.com/osamamazhar/random-shadows-highlights。

In this paper, we propose a new data augmentation method, Random Shadows and Highlights (RSH) to acquire robustness against lighting perturbations. Our method creates random shadows and highlights on images, thus challenging the neural network during the learning process such that it acquires immunity against such input corruptions in real world applications. It is a parameter-learning free method which can be integrated into most vision related learning applications effortlessly. With extensive experimentation, we demonstrate that RSH not only increases the robustness of the models against lighting perturbations, but also reduces over-fitting significantly. Thus RSH should be considered essential for all vision related learning systems. Code is available at: https://github.com/OsamaMazhar/Random-Shadows-Highlights.

翻訳日:2021-03-30 08:34:55 公開日:2021-01-18

# (参考訳) トランスフォーマーを用いた新型コロナウイルス偽ニュース検出のための言語モデル微調整法

Transformer-based Language Model Fine-tuning Methods for COVID-19 Fake News Detection ( http://arxiv.org/abs/2101.05509v2 )

ライセンス: CC BY 4.0

Ben Chen, Bin Chen, Dehong Gao, Qijin Chen, Chengfu Huo, Xiaonan Meng, Weijun Ren, Yang Zhou

(参考訳) 新型コロナウイルス(COVID-19)のパンデミックで、関連する偽ニュースがソーシャルメディア全体に広まっている。差別なく彼らを信じることは、人々の生活に大きなトラブルを引き起こす可能性がある。しかし、このような偽ニュースの検出には、大規模な注釈付きデータやドメイン固有の知識の十分なセマンティック理解が欠如しているため、普遍言語モデルは弱い。対応するコーパスで訓練されたモデルは、不十分な学習にも適している。本稿では,これら偽ニュース検出のためのトランスフォーマーに基づく言語モデル微調整手法を提案する。まず、個々のモデルのトークン語彙を専門用語の実際の意味論のために拡張する。第2に,短文の曖昧さから偽ニュースによく見られるハードマイニングサンプルを区別するために,加熱したソフトマックス損失を適用した。そして、モデルの堅牢性を改善するために、敵の訓練を行う。最後に、普遍言語モデルRoBERTaとドメイン固有モデルCT-BERTによって抽出された予測特徴を、複数の層認識によって融合させ、微細で高レベルな特定の表現を統合する。既存のCOVID-19フェイクニュースデータセットで評価された定量的な実験結果は、様々な評価指標の最先端手法と比較して優れた性能を示した。さらに、ベストウェイト平均F1スコアは99.02%に達する。

With the pandemic of COVID-19, relevant fake news is spreading all over the sky throughout the social media. Believing in them without discrimination can cause great trouble to people's life. However, universal language models may perform weakly in these fake news detection for lack of large-scale annotated data and sufficient semantic understanding of domain-specific knowledge. While the model trained on corresponding corpora is also mediocre for insufficient learning. In this paper, we propose a novel transformer-based language model fine-tuning approach for these fake news detection. First, the token vocabulary of individual model is expanded for the actual semantics of professional phrases. Second, we adapt the heated-up softmax loss to distinguish the hard-mining samples, which are common for fake news because of the disambiguation of short text. Then, we involve adversarial training to improve the model's robustness. Last, the predicted features extracted by universal language model RoBERTa and domain-specific model CT-BERT are fused by one multiple layer perception to integrate fine-grained and high-level specific representations. Quantitative experimental results evaluated on existing COVID-19 fake news dataset show its superior performances compared to the state-of-the-art methods among various evaluation metrics. Furthermore, the best weighted average F1 score achieves 99.02%.

翻訳日:2021-03-29 06:52:29 公開日:2021-01-18

# (参考訳) 大規模言語モデルにおける持続的反ムスリムバイアス

Persistent Anti-Muslim Bias in Large Language Models ( http://arxiv.org/abs/2101.05783v2 )

ライセンス: CC BY 4.0

Abubakar Abid, Maheen Farooqi, James Zou

(参考訳) 大規模言語モデルは望ましくない社会的バイアスを捉えていることが観察されている。人種や性別に関連するが、宗教的な偏見は比較的探究されていない。我々は、現在最先端の文脈言語モデルであるGPT-3が、永続的なムスリム-暴力バイアスを捉えていることを実証した。我々は, GPT-3を, 即時完成, 類推, 物語生成など様々な方法で探索し, この反ムスリムバイアスを理解するとともに, モデルが異なる用途で一貫して, 創造的に現れること, 他宗教集団のバイアスと比較しても深刻であることを実証した。例えば、"イスラム教徒"はテストケースの23%で"テロリスト"に、"ユダヤ人"はテストケースの5%で"お金"にマッピングされます。敵対的なテキストプロンプトでこのバイアスを克服するために必要なポジティブな注意を定量化し、最もポジティブな6つの形容詞の使用は「ムスリム」の暴力的な完成度を66%から20%に減少させるが、他の宗教グループよりは依然として高い。

It has been observed that large-scale language models capture undesirable societal biases, e.g. relating to race and gender; yet religious bias has been relatively unexplored. We demonstrate that GPT-3, a state-of-the-art contextual language model, captures persistent Muslim-violence bias. We probe GPT-3 in various ways, including prompt completion, analogical reasoning, and story generation, to understand this anti-Muslim bias, demonstrating that it appears consistently and creatively in different uses of the model and that it is severe even compared to biases about other religious groups. For instance, "Muslim" is analogized to "terrorist" in 23% of test cases, while "Jewish" is mapped to "money" in 5% of test cases. We quantify the positive distraction needed to overcome this bias with adversarial text prompts, and find that use of the most positive 6 adjectives reduces violent completions for "Muslims" from 66% to 20%, but which is still higher than for other religious groups.

翻訳日:2021-03-29 03:51:47 公開日:2021-01-18

# 異方性ガウス混合モデルにおける最適クラスタリング

Optimal Clustering in Anisotropic Gaussian Mixture Models ( http://arxiv.org/abs/2101.05402v2 )

ライセンス: Link先を確認

Xin Chen, Anderson Y. Zhang

(参考訳) 異方性ガウス混合モデルでは、異なるクラスタからの共分散行列が未知であり、必ずしも同一行列であるとは限らない。本稿では,クラスタ中心と共分散行列に対する信号対雑音比の依存性を特徴付け,クラスタリング問題に対するミニマックス下界を求める。さらに,計算可能な手順を提案し,数回の反復で最適値が得られることを示す。提案手法はハードem型アルゴリズムであり、異方性共分散行列に調整されたロイドのアルゴリズムの変種と見なすこともできる。

We study the clustering task under anisotropic Gaussian Mixture Models where the covariance matrices from different clusters are unknown and are not necessarily the identical matrix. We characterize the dependence of signal-to-noise ratios on the cluster centers and covariance matrices and obtain the minimax lower bound for the clustering problem. In addition, we propose a computationally feasible procedure and prove it achieves the optimal rate within a few iterations. The proposed procedure is a hard EM type algorithm, and it can also be seen as a variant of the Lloyd's algorithm that is adjusted to the anisotropic covariance matrices.

翻訳日:2021-03-29 00:55:14 公開日:2021-01-18

# (参考訳) プレイヤーとAIのインタラクション: ニューラルネットワークゲームがAIをプレイする理由

Player-AI Interaction: What Neural Network Games Reveal About AI as Play ( http://arxiv.org/abs/2101.06220v2 )

ライセンス: CC BY 4.0

Jichen Zhu, Jennifer Villareale, Nithesh Javvaji, Sebastian Risi, Mathias L\"owe, Rush Weigelt, Casper Harteveld

(参考訳) 人工知能(AI)と機械学習(ML)の出現は、HCI研究の最前線に人間とAIの相互作用をもたらす。本稿では,人間がAIとどのように相互作用するかを学習し,実験する上で,ゲームは理想的な領域であると論じる。ニューラルネットワークゲーム(n = 38)のシステマティックサーベイを通じて、これらのゲームにおける支配的な相互作用メタファーとAIインタラクションパターンを特定した。さらに,既存の人間-AIインタラクションガイドラインを適用し,AIシステムにおけるプレイヤー-AIインタラクションをさらに強調した。私たちの中核的な発見は、AIが現在の人間とAIの相互作用の概念を拡大できるということです。特に、ゲームとUXデザイナは、人間のAIインタラクションの学習曲線を構造化するためのフローを考慮し、発見に基づく学習を取り入れてAIと遊んだり、結果を観察し、新たなタイプのAIインタラクションを探索するための遊びの招待を与えるべきだ、と提案しています。

The advent of artificial intelligence (AI) and machine learning (ML) bring human-AI interaction to the forefront of HCI research. This paper argues that games are an ideal domain for studying and experimenting with how humans interact with AI. Through a systematic survey of neural network games (n = 38), we identified the dominant interaction metaphors and AI interaction patterns in these games. In addition, we applied existing human-AI interaction guidelines to further shed light on player-AI interaction in the context of AI-infused systems. Our core finding is that AI as play can expand current notions of human-AI interaction, which are predominantly productivity-based. In particular, our work suggests that game and UX designers should consider flow to structure the learning curve of human-AI interaction, incorporate discovery-based learning to play around with the AI and observe the consequences, and offer users an invitation to play to explore new forms of human-AI interaction.

翻訳日:2021-03-28 14:04:57 公開日:2021-01-18

# airbnbのデータを使った nowcasting gentrification

Nowcasting Gentrification Using Airbnb Data ( http://arxiv.org/abs/2101.05924v2 )

ライセンス: Link先を確認

Shomik Jain, Davide Proserpio, Giovanni Quattrone, Daniele Quercia

(参考訳) 一部の都市では、ゲントリファイターが抗議活動や攻撃の対象となっていると推定されているが、他の都市では新しい仕事や税金のジェネレーターとして歓迎されている。国勢調査データは10年ごとに更新されるため、リアルタイムに近所の変化を測定することができない。この研究によると、Airbnbのデータは近所の変化の定量化と追跡に利用できる。具体的には、両方の構造化データ(例)を考える。リスト数、レビュー数、一覧情報)、構造化されていないデータ(例) 自然言語処理と機械学習アルゴリズムで処理されたユーザー生成レビュー) ニューヨーク(アメリカ)、ロサンゼルス(アメリカ)、グレーター・ロンドン(イギリス)の3つの主要都市で作成されている。 Airbnbのデータ(特に非構造的な部分)は、住宅価格と人口統計の変化として測定された近隣のジェントリフィケーションを予測しているようだ。全体として,オンラインプラットフォームからのユーザ生成データを用いて,より粒度の低い従来の尺度を補完する社会経済指標を作成できることが示唆された。

There is a rumbling debate over the impact of gentrification: presumed gentrifiers have been the target of protests and attacks in some cities, while they have been welcome as generators of new jobs and taxes in others. Census data fails to measure neighborhood change in real-time since it is usually updated every ten years. This work shows that Airbnb data can be used to quantify and track neighborhood changes. Specifically, we consider both structured data (e.g. number of listings, number of reviews, listing information) and unstructured data (e.g. user-generated reviews processed with natural language processing and machine learning algorithms) for three major cities, New York City (US), Los Angeles (US), and Greater London (UK). We find that Airbnb data (especially its unstructured part) appears to nowcast neighborhood gentrification, measured as changes in housing affordability and demographics. Overall, our results suggest that user-generated data from online platforms can be used to create socioeconomic indices to complement traditional measures that are less granular, not in real-time, and more costly to obtain.

翻訳日:2021-03-28 11:14:04 公開日:2021-01-18

# STENCIL-NET:偏微分方程式のデータ駆動型解適応離散化

STENCIL-NET: Data-driven solution-adaptive discretization of partial differential equations ( http://arxiv.org/abs/2101.06182v2 )

ライセンス: Link先を確認

Suryanarayana Maddu, Dominik Sturm, Bevan L. Cheeseman, Christian L. M\"uller, Ivo F. Sbalzarini

(参考訳) 偏微分方程式(PDE)を近似的に解く数値解法は、科学計算の核である。しばしば、これは高分解能または適応的な離散化格子を必要とし、例えば乱流、燃焼、衝撃伝播などの応用において、PDE溶液中の関連する時空間的特徴を捉える。数値近似はまた、問題固有の離散化を構築するためにPDEを理解する必要がある。しかし、そのような解適応離散作用素を体系的に導出することは現在の課題である。本稿では,非線形pdesの解法と分解能固有の局所的離散化をデータ駆動学習する人工ニューラルネットワークであるstencil-netを提案する。 stencil-net は正規直交格子上の空間的および時間的適応的パラメトリックプーリングと離散時間積分に関する知識を取り入れることで、未知の非線形 pde における作用素の数値的安定な離散化を実現する。解データがネットワークをトレーニングし、個別の演算子を学習するのに十分なので、実際のPDEを知る必要はない。一度トレーニングされたSTENCIL-NETモデルは、より大きな空間領域におけるPDEの解を、トレーニングされた時間よりも長い時間予測するために使用することができ、従ってデータからのPDE制約外挿の問題に対処することができる。この主張を支持するために、粗い時空間格子上のカオスPDE解の長期予測に関する数値実験を行った。また,線形数値法を方程式のないSTENCIL-NET予測に置き換えることで,精度を損なうことなく高速化する。

Numerical methods for approximately solving partial differential equations (PDE) are at the core of scientific computing. Often, this requires high-resolution or adaptive discretization grids to capture relevant spatio-temporal features in the PDE solution, e.g., in applications like turbulence, combustion, and shock propagation. Numerical approximation also requires knowing the PDE in order to construct problem-specific discretizations. Systematically deriving such solution-adaptive discrete operators, however, is a current challenge. Here we present STENCIL-NET, an artificial neural network architecture for data-driven learning of problem- and resolution-specific local discretizations of nonlinear PDEs. STENCIL-NET achieves numerically stable discretization of the operators in an unknown nonlinear PDE by spatially and temporally adaptive parametric pooling on regular Cartesian grids, and by incorporating knowledge about discrete time integration. Knowing the actual PDE is not necessary, as solution data is sufficient to train the network to learn the discrete operators. A once-trained STENCIL-NET model can be used to predict solutions of the PDE on larger spatial domains and for longer times than it was trained for, hence addressing the problem of PDE-constrained extrapolation from data. To support this claim, we present numerical experiments on long-term forecasting of chaotic PDE solutions on coarse spatio-temporal grids. We also quantify the speed-up achieved by substituting base-line numerical methods with equation-free STENCIL-NET predictions on coarser grids with little compromise on accuracy.

翻訳日:2021-03-28 11:10:59 公開日:2021-01-18

# (参考訳) ExpFinder:$N$-gramベクトル空間モデルと$\mu$CO-HITSを統合するアンサンブルエキスパート発見モデル

ExpFinder: An Ensemble Expert Finding Model Integrating $N$-gram Vector Space Model and $\mu$CO-HITS ( http://arxiv.org/abs/2101.06821v1 )

ライセンス: CC BY 4.0

Yong-Bin Kang, Hung Du, Abdur Rahim Mohammad Forkan, Prem Prakash Jayaraman, Amir Aryani, Timos Sellis (Fellow, IEEE)

(参考訳) 専門家を見つけることは、コラボレーションを成功させ、高品質の研究開発とイノベーションをスピードアップする上で重要な役割を担います。しかし、科学出版物やデジタル専門データの急速な成長により、適切な専門家を特定することが困難な問題となっている。あるトピックに与えられた専門家を見つける既存のアプローチは、ベクトル空間モデル、文書言語モデル、グラフベースモデルに基づく情報検索技術に分類することができる。本稿では、専門家探しのための新しいアンサンブルモデルである$\textit{expfinder}$を提案する。これは、新しい$n$-gramベクトル空間モデル($n$vsmと表記される)と、$\textit{$\mu$co-hits}$と表記されるグラフベースモデルとを統合したものである。 n$vsm の鍵は、n$-gram ワードと $\textit{expfinder}$ に対する最近の逆文書の頻度重み付け手法を、専門家を見つけるために$n$vsm を$\textit{$\mu$co-hits}$ に組み込むことである。学術分野の4つの異なるデータセットに対して,6つの専門家発見モデルと比較して,$\textit{expfinder}$を総合的に評価する。評価の結果、$\textit{expfinder}$は専門家の発見に非常に効果的なモデルであり、19%から160.2%で比較した全てのモデルを大きく上回っている。

Finding an expert plays a crucial role in driving successful collaborations and speeding up high-quality research development and innovations. However, the rapid growth of scientific publications and digital expertise data makes identifying the right experts a challenging problem. Existing approaches for finding experts given a topic can be categorised into information retrieval techniques based on vector space models, document language models, and graph-based models. In this paper, we propose $\textit{ExpFinder}$, a new ensemble model for expert finding, that integrates a novel $N$-gram vector space model, denoted as $n$VSM, and a graph-based model, denoted as $\textit{$\mu$CO-HITS}$, that is a proposed variation of the CO-HITS algorithm. The key of $n$VSM is to exploit recent inverse document frequency weighting method for $N$-gram words and $\textit{ExpFinder}$ incorporates $n$VSM into $\textit{$\mu$CO-HITS}$ to achieve expert finding. We comprehensively evaluate $\textit{ExpFinder}$ on four different datasets from the academic domains in comparison with six different expert finding models. The evaluation results show that $\textit{ExpFinder}$ is a highly effective model for expert finding, substantially outperforming all the compared models in 19% to 160.2%.

翻訳日:2021-03-27 19:28:28 公開日:2021-01-18

# (参考訳) ZeRO-Offload: 数十億ドル規模のモデルトレーニングを民主化

ZeRO-Offload: Democratizing Billion-Scale Model Training ( http://arxiv.org/abs/2101.06840v1 )

ライセンス: CC BY 4.0

Jie Ren, Samyam Rajbhandari, Reza Yazdani Aminabadi, Olatunji Ruwase, Shuangyan Yang, Minjia Zhang, Dong Li, Yuxiong He

(参考訳) 大規模モデルのトレーニングは、複雑なモデルのリファクタリングと、非常に高価なgpuクラスタへのアクセスを必要とするごく少数の理由の1つだ。 ZeRO-Offloadは、大きめのモデルトレーニング環境を、ほぼすべての人が利用できるようにすることで変更する。単一のGPU上で13億以上のパラメータを持つモデルをトレーニングすることが可能で、PyTorchのような一般的なフレームワークと比較して10倍の規模で、データサイエンティストからモデル変更を必要とせず、計算効率を犠牲にする必要がない。 ZeRO-Offloadはデータと計算をCPUにオフロードすることで、大規模なモデルトレーニングを可能にする。計算効率を維持するため、GPUへのデータ移動を最小化し、GPU上のメモリ節約を最大化しながらCPU計算時間を短縮するように設計されている。その結果、ZeRO-Offloadは、1つのNVIDIA V100 GPUで10Bパラメータモデルで40 TFlops/GPUを達成することができ、PyTorch単独で1.4Bパラメータモデルで30TFを使用するのに対して、メモリを使い果たさずにトレーニングできる最大である。 ZeRO-Offloadはまた、利用可能な場合、複数のGPUでスケールするように設計されており、最大128GPUでほぼ線形スピードアップを提供する。さらに、1つのDGX-2ボックスに700億以上のパラメータを持つモデルをトレーニングするために、モデルの並列性と連携することができる。 ZeRO-Offloadは計算とメモリ効率と使いやすさを組み合わせることで、大規模なモデルトレーニングを民主化し、単一のGPUにアクセス可能なデータサイエンティストにもアクセスできるようにする。

Large-scale model training has been a playing ground for a limited few requiring complex model refactoring and access to prohibitively expensive GPU clusters. ZeRO-Offload changes the large model training landscape by making large model training accessible to nearly everyone. It can train models with over 13 billion parameters on a single GPU, a 10x increase in size compared to popular framework such as PyTorch, and it does so without requiring any model change from the data scientists or sacrificing computational efficiency. ZeRO-Offload enables large model training by offloading data and compute to CPU. To preserve compute efficiency, it is designed to minimize the data movement to/from GPU, and reduce CPU compute time while maximizing memory savings on GPU. As a result, ZeRO-Offload can achieve 40 TFlops/GPU on a single NVIDIA V100 GPU for 10B parameter model compared to 30TF using PyTorch alone for a 1.4B parameter model, the largest that can be trained without running out of memory. ZeRO-Offload is also designed to scale on multiple-GPUs when available, offering near linear speedup on up to 128 GPUs. Additionally, it can work together with model parallelism to train models with over 70 billion parameters on a single DGX-2 box, a 4.5x increase in model size compared to using model parallelism alone. By combining compute and memory efficiency with ease-of-use, ZeRO-Offload democratizes large-scale model training making it accessible to even data scientists with access to just a single GPU.

翻訳日:2021-03-27 18:42:28 公開日:2021-01-18

# (参考訳) 自動走行のための建設ゾーンの時空間分割のための非パラメトリックメモリ

Non-parametric Memory for Spatio-Temporal Segmentation of Construction Zones for Self-Driving ( http://arxiv.org/abs/2101.06865v1 )

ライセンス: CC BY 4.0

Min Bai, Shenlong Wang, Kelvin Wong, Ersin Yumer, Raquel Urtasun

(参考訳) 本稿では,自律走行車(AV)周囲の局所的空間と時間を把握する時空間分割のための非パラメトリックメモリ表現を提案する。我々の表現には3つの重要な特性がある: (i) 過去に見たことを思い出す; (ii) 補強する; (iii) 新しい証拠に基づいて過去の信念を忘れる。補強は、例えば、その要素が強く隠蔽されているか、範囲内であるような、不確実であるかもしれない要素を初めて見るときに重要である。偽陽性がなければ、自動運転車が不規則に振る舞うことになるため、忘れることも望ましい。我々のプロセスは3D推論によって知らされ、隠蔽は忘れたい欲求と忘れたい欲求を区別する鍵となる。提案手法は,hdマップなどの静的世界表現を補完するオンラインコンポーネントとして,このようなイベントによる静的ビュー上に重畳すべき変更を検出・記憶することにより,どのように利用することができるかを示す。

In this paper, we introduce a non-parametric memory representation for spatio-temporal segmentation that captures the local space and time around an autonomous vehicle (AV). Our representation has three important properties: (i) it remembers what it has seen in the past, (ii) it reinforces and (iii) forgets its past beliefs based on new evidence. Reinforcing is important as the first time we see an element we might be uncertain, e.g, if the element is heavily occluded or at range. Forgetting is desirable, as otherwise false positives will make the self driving vehicle behave erratically. Our process is informed by 3D reasoning, as occlusion is key to distinguishing between the desire to forget and to remember. We show how our method can be used as an online component to complement static world representations such as HD maps by detecting and remembering changes that should be superimposed on top of this static view due to such events.

翻訳日:2021-03-27 17:54:05 公開日:2021-01-18

# (参考訳) マルチエージェント強化学習のための協調バイアスと競争バイアス

Cooperative and Competitive Biases for Multi-Agent Reinforcement Learning ( http://arxiv.org/abs/2101.06890v1 )

ライセンス: CC BY 4.0

Heechang Ryu, Hayong Shin, Jinkyoo Park

(参考訳) マルチエージェント強化学習(MARL)アルゴリズムの訓練は、エージェント間の複雑な相互作用と確率的・動的環境との相互作用に強く依存するため、シングルエージェント強化学習アルゴリズムの訓練よりも難しい。本稿では,他のエージェントの偏りのある行動情報を用いたMARL訓練を促進するアルゴリズムを提案する。協調的で競争的な環境には、一般的に2つのエージェント(協調エージェントと競争エージェント)がある。提案アルゴリズムでは,各エージェントがそれぞれのアクションと2つのグループの他のエージェントのバイアス作用情報を用いて値関数を更新する。協調エージェントのバイアス付き共同動作は、すべての協調エージェントが共同してターゲットエージェントの価値関数を最大化することにより、実際の共同動作と想像上の共同動作の合計として計算される。競合剤のバイアス付き共同作用も同様に計算できる。各エージェントはバイアス付きアクション情報を使用して自身の値関数を更新し、バイアス付き値関数と対応するバイアス付きポリシを生成する。その後、各エージェントのバイアスドポリシーは必然的に、他のエージェントと協力し、競合するアクションを推奨し、エージェント間のより活発な相互作用を導入し、MARLポリシー学習を強化する。提案アルゴリズムは,様々な混合協調競合環境において,既存のアルゴリズムよりも優れていることを示す。さらに、訓練が進むにつれて導入されるバイアスは徐々に減少し、虚偽の仮定に基づく補正がなくなる。

Training a multi-agent reinforcement learning (MARL) algorithm is more challenging than training a single-agent reinforcement learning algorithm, because the result of a multi-agent task strongly depends on the complex interactions among agents and their interactions with a stochastic and dynamic environment. We propose an algorithm that boosts MARL training using the biased action information of other agents based on a friend-or-foe concept. For a cooperative and competitive environment, there are generally two groups of agents: cooperative-agents and competitive-agents. In the proposed algorithm, each agent updates its value function using its own action and the biased action information of other agents in the two groups. The biased joint action of cooperative agents is computed as the sum of their actual joint action and the imaginary cooperative joint action, by assuming all the cooperative agents jointly maximize the target agent's value function. The biased joint action of competitive agents can be computed similarly. Each agent then updates its own value function using the biased action information, resulting in a biased value function and corresponding biased policy. Subsequently, the biased policy of each agent is inevitably subjected to recommend an action to cooperate and compete with other agents, thereby introducing more active interactions among agents and enhancing the MARL policy learning. We empirically demonstrate that our algorithm outperforms existing algorithms in various mixed cooperative-competitive environments. Furthermore, the introduced biases gradually decrease as the training proceeds and the correction based on the imaginary assumption vanishes.

翻訳日:2021-03-27 17:18:07 公開日:2021-01-18

# (参考訳) DeepPayload: ニューラルネットワークによるディープラーニングモデルに対するブラックボックスバックドア攻撃

DeepPayload: Black-box Backdoor Attack on Deep Learning Models through Neural Payload Injection ( http://arxiv.org/abs/2101.06896v1 )

ライセンス: CC BY 4.0

Yuanchun Li, Jiayi Hua, Haoyu Wang, Chunyang Chen, Yunxin Liu

(参考訳) ディープラーニングモデルは、モバイルアプリケーションにおいて重要なコンポーネントとしてますます利用されている。脆弱性や脅威が広く指摘されているプログラムバイトコードとは異なり、アプリケーションにデプロイされるディープラーニングモデルがどのように妥協するかは、ニューラルネットワークが通常ブラックボックスと見なされるため、十分に理解されていない。本稿では,コンパイルされたディープラーニングモデルに対して,一連のリバースエンジニアリング技術を用いて,極めて実用的なバックドア攻撃を提案する。攻撃の中核は、トリガー検出器と複数のオペレータで構築され、悪意のあるペイロードとして犠牲者モデルに注入される神経条件分岐である。この攻撃は、条件論理が攻撃者によって柔軟にカスタマイズできるため効果的であり、元のモデルから事前の知識を必要としないためスケーラブルである。 30ユーザから収集した5つの最先端ディープラーニングモデルと実世界のサンプルを用いて攻撃効果を評価した。その結果、インジェクションされたバックドアは成功率93.5%で起動できるが、2ミリ秒以下の遅延オーバーヘッドと1.4%の精度低下しか得られなかった。さらに,google playから収集した実世界のモバイル深層学習アプリに関する実証研究を行った。私たちの攻撃に対して脆弱な54のアプリを見つけました。結果は、デプロイされたモデルの保護を強化するために、ディープラーニングアプリケーション開発者と監査役の意識を喚起する。

Deep learning models are increasingly used in mobile applications as critical components. Unlike the program bytecode whose vulnerabilities and threats have been widely-discussed, whether and how the deep learning models deployed in the applications can be compromised are not well-understood since neural networks are usually viewed as a black box. In this paper, we introduce a highly practical backdoor attack achieved with a set of reverse-engineering techniques over compiled deep learning models. The core of the attack is a neural conditional branch constructed with a trigger detector and several operators and injected into the victim model as a malicious payload. The attack is effective as the conditional logic can be flexibly customized by the attacker, and scalable as it does not require any prior knowledge from the original model. We evaluated the attack effectiveness using 5 state-of-the-art deep learning models and real-world samples collected from 30 users. The results demonstrated that the injected backdoor can be triggered with a success rate of 93.5%, while only brought less than 2ms latency overhead and no more than 1.4% accuracy decrease. We further conducted an empirical study on real-world mobile deep learning apps collected from Google Play. We found 54 apps that were vulnerable to our attack, including popular and security-critical ones. The results call for the awareness of deep learning application developers and auditors to enhance the protection of deployed models.

翻訳日:2021-03-27 16:58:28 公開日:2021-01-18

# (参考訳) ブロックチェーンによる分散フェデレーション学習(blade-fl)のパフォーマンス分析とリソース割り当て

Blockchain Assisted Decentralized Federated Learning (BLADE-FL): Performance Analysis and Resource Allocation ( http://arxiv.org/abs/2101.06905v1 )

ライセンス: CC0 1.0

Jun Li, Yumeng Shao, Kang Wei, Ming Ding, Chuan Ma, Long Shi, Zhu Han, and H. Vincent Poor

(参考訳) 分散機械学習パラダイムであるフェデレートラーニング(FL)は、クライアントが生データをローカルに処理することで、個人のプライバシーを促進する。しかし、モデルアグリゲーションのための集中型サーバに頼ると、標準FLはサーバーの故障、不信なサーバ、外部攻撃に弱い。この問題に対処するために、ブロックチェーンをFL、すなわちブロックチェーン支援型分散連邦学習(BLADE-FL)に統合する分散FLフレームワークを提案する。提案したBLADE-FLのラウンドでは、各クライアントがトレーニングされたモデルを他のクライアントにブロードキャストし、受信したモデルに基づいてブロックを生成し、次のラウンドのローカルトレーニングの前に生成されたブロックからモデルを集約する。本研究では,blade-flの学習性能を評価し,大域的損失関数の上限を開発する。次に、この境界が全ラウンド数Kに対して凸であることを確認し、上限を最小化するための計算資源割り当てを最適化する。また,他人の訓練したモデルを盗聴し,不正行為を偽装する人工ノイズを付加する遅延クライアントが原因で,トレーニング不足が重大な問題となっていることも留意する。そこで本研究では,遅延クライアントがblade-flの学習性能に与える影響を考察し,最適なk,学習パラメータ,遅延クライアントの割合の関係を特徴付ける。 MNIST と Fashion-MNIST のデータセットから,実験結果は解析結果と一致していることを示す。具体的には、開発した上限値と実験値との差が5%以下であり、上限値に基づく最適化kは損失関数を効果的に最小化することができる。

Federated learning (FL), as a distributed machine learning paradigm, promotes personal privacy by clients' processing raw data locally. However, relying on a centralized server for model aggregation, standard FL is vulnerable to server malfunctions, untrustworthy server, and external attacks. To address this issue, we propose a decentralized FL framework by integrating blockchain into FL, namely, blockchain assisted decentralized federated learning (BLADE-FL). In a round of the proposed BLADE-FL, each client broadcasts its trained model to other clients, competes to generate a block based on the received models, and then aggregates the models from the generated block before its local training of the next round. We evaluate the learning performance of BLADE-FL, and develop an upper bound on the global loss function. Then we verify that this bound is convex with respect to the number of overall rounds K, and optimize the computing resource allocation for minimizing the upper bound. We also note that there is a critical problem of training deficiency, caused by lazy clients who plagiarize others' trained models and add artificial noises to disguise their cheating behaviors. Focusing on this problem, we explore the impact of lazy clients on the learning performance of BLADE-FL, and characterize the relationship among the optimal K, the learning parameters, and the proportion of lazy clients. Based on the MNIST and Fashion-MNIST datasets, we show that the experimental results are consistent with the analytical ones. To be specific, the gap between the developed upper bound and experimental results is lower than 5%, and the optimized K based on the upper bound can effectively minimize the loss function.

翻訳日:2021-03-27 16:37:06 公開日:2021-01-18

# (参考訳) 熱-クロスドメインカラー化画像の新しいレジストレーション・カラー化手法

A Novel Registration & Colorization Technique for Thermal to Cross Domain Colorized Images ( http://arxiv.org/abs/2101.06910v1 )

ライセンス: CC BY 4.0

Suranjan Goswami, Satish Kumar Singh

(参考訳) 熱画像は、撮影対象の熱プロファイルに基づいて、グレースケール画像または擬似カラー画像として得ることができる。本論文では,複数のサーマルイメージ装置で撮影された画像に対して,メイクや内部解像度に関係なく動作する新規な登録方法と,光学画像と類似したカラー化感熱画像を得るためのカラー化スキームと,その出力の一部としてサーマルプロファイルの情報を保持し,両ドメインの情報を協調的に提供できる新規な登録方式を提案する。これをクロスドメインカラー化画像と呼ぶ。また,本論文の一部として提示する新しい熱光学対データベースを概説し,複数の熱画像から得られたユニークなデータポイントについて述べる。最後に、この結果と先行文献を比較し、我々の結果がどのように異なるかを示し、この領域でさらに探求できる今後の研究について議論する。

Thermal images can be obtained as either grayscale images or pseudo colored images based on the thermal profile of the object being captured. We present a novel registration method that works on images captured via multiple thermal imagers irrespective of make and internal resolution as well as a colorization scheme that can be used to obtain a colorized thermal image which is similar to an optical image, while retaining the information of the thermal profile as a part of the output, thus providing information of both domains jointly. We call this a cross domain colorized image. We also outline a new public thermal-optical paired database that we are presenting as a part of this paper, containing unique data points obtained via multiple thermal imagers. Finally, we compare the results with prior literature, show how our results are different and discuss on some future work that can be explored further in this domain as well.

翻訳日:2021-03-27 16:14:20 公開日:2021-01-18

# (参考訳) TLU-Net:鉄鋼表面欠陥の自動検出のための深層学習手法

TLU-Net: A Deep Learning Approach for Automatic Steel Surface Defect Detection ( http://arxiv.org/abs/2101.06915v1 )

ライセンス: CC BY 4.0

Praveen Damacharla, Achuth Rao M. V., Jordan Ringenberg, and Ahmad Y Javaid

(参考訳) 視覚的鋼板表面欠陥検出は鋼板製造における必須ステップである。近年,機械学習に基づく自動視覚検査(AVI)手法が研究されている。しかし、ほとんどの製鋼業では、AVI法に関連するトレーニング時間や不正確さのために、手動の視覚検査が使われている。自動鋼の欠陥検出法は、安価でより高速な品質制御とフィードバックに有用である。しかし、セグメンテーションと分類のための注釈付きトレーニングデータを作成するのはコストがかかる。本研究では,鋼表面欠陥検出にTransfer Learning-based U-Net(TLU-Net)フレームワークを提案する。ベースとしてU-Netアーキテクチャを使用し、ResNetとDenseNetの2種類のエンコーダを探索する。これらのネットの性能をランダム初期化とimagenetデータセットを用いてトレーニングした事前学習ネットワークを用いて比較する。実験はSeverstalデータを用いて行われる。その結果, 伝達学習は欠陥分類におけるランダム初期化よりも5%(絶対的)に優れることがわかった。その結果,伝達学習は欠陥分割のランダム初期化よりも26%(相対的)に優れることがわかった。また, 学習データの減少に伴い, 転校学習の利得は増加し, 転校学習による収束率はランダム初期化よりも優れていることがわかった。

Visual steel surface defect detection is an essential step in steel sheet manufacturing. Several machine learning-based automated visual inspection (AVI) methods have been studied in recent years. However, most steel manufacturing industries still use manual visual inspection due to training time and inaccuracies involved with AVI methods. Automatic steel defect detection methods could be useful in less expensive and faster quality control and feedback. But preparing the annotated training data for segmentation and classification could be a costly process. In this work, we propose to use the Transfer Learning-based U-Net (TLU-Net) framework for steel surface defect detection. We use a U-Net architecture as the base and explore two kinds of encoders: ResNet and DenseNet. We compare these nets' performance using random initialization and the pre-trained networks trained using the ImageNet data set. The experiments are performed using Severstal data. The results demonstrate that the transfer learning performs 5% (absolute) better than that of the random initialization in defect classification. We found that the transfer learning performs 26% (relative) better than that of the random initialization in defect segmentation. We also found the gain of transfer learning increases as the training data decreases, and the convergence rate with transfer learning is better than that of the random initialization.

翻訳日:2021-03-27 15:56:42 公開日:2021-01-18

# (参考訳) 分散計画下次アルゴリズムにおけるインサイダー攻撃の検出

Detection of Insider Attacks in Distributed Projected Subgradient Algorithms ( http://arxiv.org/abs/2101.06917v1 )

ライセンス: CC BY 4.0

Sissi Xiaoxiao Wu, Gangqiang Li, Shengli Zhang, and Xiaohui Lin

(参考訳) Gossipベースの分散アルゴリズムは、様々なマルチエージェントアプリケーションの分散最適化問題を解決するために広く使われているが、一般的には、各エージェントが権限のない適切な方向をローカルに見積もっているため、内部悪意のあるエージェントによるデータインジェクション攻撃に対して脆弱である。本研究では、内部攻撃を検出する人工知能(AI)技術の適用について検討する。一般のニューラルネットワークは,収集されたデータに基づく非線形関係を効果的に探索できるため,悪意のあるエージェントの検出とローカライズに特に適している。さらに,協調学習における最先端のアプローチ,すなわち協調型ピアツーピア機械学習プロトコルを採用し,ゴシップ交換によるニューラルネットワークモデルのトレーニングを容易にすることを提案する。この高度なアプローチは、トレーニングデータ不足やミスマッチテストデータといった課題に対して、モデルをより堅牢にすることが期待されます。シミュレーションでは,AI手法の有効性と有効性を検証するために,最小二乗問題を考える。シミュレーションの結果,提案するaiベースの手法は,スコアに基づく手法よりも悪意のあるエージェントの検出とローカライズのパフォーマンス向上に有用であり,ピアツーピアニューラルネットワークモデルは,実際に問題に対して頑健であることが示された。

The gossip-based distributed algorithms are widely used to solve decentralized optimization problems in various multi-agent applications, while they are generally vulnerable to data injection attacks by internal malicious agents as each agent locally estimates its decent direction without an authorized supervision. In this work, we explore the application of artificial intelligence (AI) technologies to detect internal attacks. We show that a general neural network is particularly suitable for detecting and localizing the malicious agents, as they can effectively explore nonlinear relationship underlying the collected data. Moreover, we propose to adopt one of the state-of-art approaches in federated learning, i.e., a collaborative peer-to-peer machine learning protocol, to facilitate training our neural network models by gossip exchanges. This advanced approach is expected to make our model more robust to challenges with insufficient training data, or mismatched test data. In our simulations, a least-squared problem is considered to verify the feasibility and effectiveness of AI-based methods. Simulation results demonstrate that the proposed AI-based methods are beneficial to improve performance of detecting and localizing malicious agents over score-based methods, and the peer-to-peer neural network model is indeed robust to target issues.

翻訳日:2021-03-27 15:50:40 公開日:2021-01-18

# (参考訳) 属性インフォームド摂動によるニューラルネットワーク生成

Generative Counterfactuals for Neural Networks via Attribute-Informed Perturbation ( http://arxiv.org/abs/2101.06930v1 )

ライセンス: CC BY 4.0

Fan Yang, Ninghao Liu, Mengnan Du, Xia Hu

(参考訳) ディープニューラルネットワーク(dnn)の広範な利用により、高リスクシナリオでは説明可能な決定が望ましいため、モデル解釈性が重要な関心事となっている。現在の解釈技術は機能帰属の観点から主に焦点を当てており、特定の説明が予測とどのように関連しているかを示すのに制限がある。この目的のために、ファクトファクト(反事実)と呼ばれる興味深い説明のクラスが開発され、解釈のための「何」の状況をさらに探求し、ブラックボックスモデルにおける推論能力を実現する。しかし, 生データインスタンス(テキストや画像など)に対する偽造物の生成は, 高いデータ次元と非意味な生の特徴に課題があるため, まだ初期段階にある。本稿では,提案するAttribute-Informed Perturbation (AIP)を用いて,生データインスタンスに特化して偽物を生成するフレームワークを設計する。異なる属性を条件とした生成モデルを利用することで、所望のラベルとの反事実を効果的かつ効率的に得ることができる。データ空間のインスタンスを直接変更するのではなく、属性に変換された潜在空間を反復的に最適化します。実世界のテキストや画像に対する実験結果から, 提案したフレームワークの有効性, サンプル品質, および有効性を示し, その他の選択肢よりも優れていることを示す。さらに,本フレームワークに基づく実用的応用例も紹介し,モデルの解釈可能性を超えた可能性を示した。

With the wide use of deep neural networks (DNN), model interpretability has become a critical concern, since explainable decisions are preferred in high-stake scenarios. Current interpretation techniques mainly focus on the feature attribution perspective, which are limited in indicating why and how particular explanations are related to the prediction. To this end, an intriguing class of explanations, named counterfactuals, has been developed to further explore the "what-if" circumstances for interpretation, and enables the reasoning capability on black-box models. However, generating counterfactuals for raw data instances (i.e., text and image) is still in the early stage due to its challenges on high data dimensionality and unsemantic raw features. In this paper, we design a framework to generate counterfactuals specifically for raw data instances with the proposed Attribute-Informed Perturbation (AIP). By utilizing generative models conditioned with different attributes, counterfactuals with desired labels can be obtained effectively and efficiently. Instead of directly modifying instances in the data space, we iteratively optimize the constructed attribute-informed latent space, where features are more robust and semantic. Experimental results on real-world texts and images demonstrate the effectiveness, sample quality as well as efficiency of our designed framework, and show the superiority over other alternatives. Besides, we also introduce some practical applications based on our framework, indicating its potential beyond the model interpretability aspect.

翻訳日:2021-03-27 15:07:10 公開日:2021-01-18

# (参考訳) データ強化と一貫性に基づく半教師付き学習について

On Data-Augmentation and Consistency-Based Semi-Supervised Learning ( http://arxiv.org/abs/2101.06967v1 )

ライセンス: CC BY 4.0

Atin Ghosh and Alexandre H. Thiery

(参考訳) 最近提案された一貫性に基づくセミスーパーバイザラーニング(SSL)手法(例えば、$\Pi$-model, temporal ensembling, the mean teacher, or the virtual adversarial training)は、SSLタスクにおける最先端技術である。これらの手法は通常、ラベル付き例をほんの一部使用しながら、完全に監督されたものと同等のパフォーマンスに到達できる。これらの方法論的進歩にもかかわらず、これらの手法の理解はまだ比較的限られている。このテキストでは、分析的に扱いやすい結果が得られる設定において、$\pi$-model の分析(変動)を行う。我々はManifold Tangent Classifiersとのリンクを確立し、摂動の質が適切なSSL性能を得るための鍵であることを実証する。重要なことは、データ拡張スキームを自然に組み込んだHidden Manifold Modelのシンプルな拡張を提案し、SSLメソッドの理解と実験のためのフレームワークを提供する。

Recently proposed consistency-based Semi-Supervised Learning (SSL) methods such as the $\Pi$-model, temporal ensembling, the mean teacher, or the virtual adversarial training, have advanced the state of the art in several SSL tasks. These methods can typically reach performances that are comparable to their fully supervised counterparts while using only a fraction of labelled examples. Despite these methodological advances, the understanding of these methods is still relatively limited. In this text, we analyse (variations of) the $\Pi$-model in settings where analytically tractable results can be obtained. We establish links with Manifold Tangent Classifiers and demonstrate that the quality of the perturbations is key to obtaining reasonable SSL performances. Importantly, we propose a simple extension of the Hidden Manifold Model that naturally incorporates data-augmentation schemes and offers a framework for understanding and experimenting with SSL methods.

翻訳日:2021-03-27 14:23:37 公開日:2021-01-18

# (参考訳) 信号導出と凝集関数を用いた運動画像に基づく脳コンピューターインタフェース

Motor-Imagery-Based Brain Computer Interface using Signal Derivation and Aggregation Functions ( http://arxiv.org/abs/2101.06968v1 )

ライセンス: CC BY 4.0

Javier Fumanal-Idocin, Yu-Kai Wang, Chin-Teng Lin, Javier Fern\'andez, Jose Antonio Sanz, Humberto Bustince

(参考訳) 脳コンピュータインタフェース技術は、人間の脳と外部デバイスの間のコミュニケーションの一般的な方法である。 BCIの最も一般的なアプローチの1つは、Motor Imageryである。 BCIの応用において、脳波グラフは非侵襲的な性質のため、脳力学の非常に一般的な測定方法である。 bciの話題には高い関心が寄せられているが、脳波信号におけるパターン認識タスクの実行が困難であるため、既存のシステムの性能はいまだに理想的ではない。 BCIシステムは、信号前処理、特徴抽出、意思決定を行う幅広いコンポーネントで構成されている。本稿では,既存のMIベースのBCIフレームワークを改善するための3つの異なるアイデアを提案する。まず、信号のさらなる前処理ステップ:時間を不変にする脳波信号の微分を含む。第2に,システムの機能として周波数帯域を追加し,システムの性能にその効果を示す。最後に,システムにおける最終決定の方法に関する深い考察を行う。本研究では,最大6種類の異なる分類器と広範囲の集約関数(古典集約,チョケ,スゲノ積分およびそれらの拡張および重なり関数を含む)を用いて,分類器が与える情報を融合する手法を提案する。本システムでは,20名のボランティアのデータセットを用いて,運動画像を用いた脳-コンピュータインタフェース実験を行った。このデータセットでは、新しいシステムは88.80%の精度を達成した。また,最大90,76%を達成できる最適化版も提案する。さらに、ペアのChoquet/Sugeno積分と重なり関数が最良の結果を提供するものであることが分かる。

Brain Computer Interface technologies are popular methods of communication between the human brain and external devices. One of the most popular approaches to BCI is Motor Imagery. In BCI applications, the ElectroEncephaloGraphy is a very popular measurement for brain dynamics because of its non-invasive nature. Although there is a high interest in the BCI topic, the performance of existing systems is still far from ideal, due to the difficulty of performing pattern recognition tasks in EEG signals. BCI systems are composed of a wide range of components that perform signal pre-processing, feature extraction and decision making. In this paper, we define a BCI Framework, named Enhanced Fusion Framework, where we propose three different ideas to improve the existing MI-based BCI frameworks. Firstly, we include aan additional pre-processing step of the signal: a differentiation of the EEG signal that makes it time-invariant. Secondly, we add an additional frequency band as feature for the system and we show its effect on the performance of the system. Finally, we make a profound study of how to make the final decision in the system. We propose the usage of both up to six types of different classifiers and a wide range of aggregation functions (including classical aggregations, Choquet and Sugeno integrals and their extensions and overlap functions) to fuse the information given by the considered classifiers. We have tested this new system on a dataset of 20 volunteers performing motor imagery-based brain-computer interface experiments. On this dataset, the new system achieved a 88.80% of accuracy. We also propose an optimized version of our system that is able to obtain up to 90,76%. Furthermore, we find that the pair Choquet/Sugeno integrals and overlap functions are the ones providing the best results.

翻訳日:2021-03-27 14:06:21 公開日:2021-01-18

# (参考訳) T1およびT2強調MRIにおける肝のマルチモーダルセグメンテーションにおけるDeep Learning戦略の比較

Comparing Deep Learning strategies for paired but unregistered multimodal segmentation of the liver in T1 and T2-weighted MRI ( http://arxiv.org/abs/2101.06979v1 )

ライセンス: CC BY 4.0

Vincent Couteaux, Mathilde Trintignac, Olivier Nempont, Guillaume Pizaine, Anna Sesilia Vlachomitrou, Pierre-Jean Valette, Laurent Milot, Isabelle Bloch

(参考訳) マルチモーダル肝セグメンテーションにおけるT1,T2強調MR画像の問題点について検討した。文献に記述されているいくつかの戦略とマルチタスクトレーニングの有無,事前登録の有無を比較した。また,異なる損失関数(クロスエントロピー,ダイス損失,3つの逆損失)を比較する。全てのメソッドは、同時に両方のセグメンテーションを実行するマルチタスク設定を除いて、同等のパフォーマンスを達成した。

We address the problem of multimodal liver segmentation in paired but unregistered T1 and T2-weighted MR images. We compare several strategies described in the literature, with or without multi-task training, with or without pre-registration. We also compare different loss functions (cross-entropy, Dice loss, and three adversarial losses). All methods achieved comparable performances with the exception of a multi-task setting that performs both segmentations at once, which performed poorly.

翻訳日:2021-03-27 13:25:24 公開日:2021-01-18

# (参考訳) ニューラルランク付けモデルにおけるカタストロフィックフォーミングの研究

Studying Catastrophic Forgetting in Neural Ranking Models ( http://arxiv.org/abs/2101.06984v1 )

ライセンス: CC BY 4.0

Jesus Lovon-Melgarejo, Laure Soulier, Karen Pinel-Sauvagnat, Lynda Tamine

(参考訳) 最近のIR文献では、いくつかの深いニューラルネットワークランキングモデルが提案されている。データセットが保持する1つのターゲットドメインへの転送性は、従来のドメイン適応戦略を用いて広く取り組まれているが、そのクロスドメイン転送性に関する問題は、まだ未検討である。ニューラルランキングモデルは、新しい知識を得た後、以前に観測された領域から得られた古い知識を破滅的に忘れる程度に研究し、これらの領域のパフォーマンスを低下させる。実験の結果,脳波ランキングモデルの有効性は破滅的な忘れを犠牲にして達成され,クロスドメイン正規化器を用いた生涯学習戦略が問題を軽減することがわかった。また,回帰モデルに基づく説明的アプローチを用いて,ドメイン特性が破滅的忘れることの高まりに与える影響を示す。得られた結果は,神経赤外線における理論的および実用的な研究に有用であると考えられる。

Several deep neural ranking models have been proposed in the recent IR literature. While their transferability to one target domain held by a dataset has been widely addressed using traditional domain adaptation strategies, the question of their cross-domain transferability is still under-studied. We study here in what extent neural ranking models catastrophically forget old knowledge acquired from previously observed domains after acquiring new knowledge, leading to performance decrease on those domains. Our experiments show that the effectiveness of neuralIR ranking models is achieved at the cost of catastrophic forgetting and that a lifelong learning strategy using a cross-domain regularizer success-fully mitigates the problem. Using an explanatory approach built on a regression model, we also show the effect of domain characteristics on the rise of catastrophic forgetting. We believe that the obtained results can be useful for both theoretical and practical future work in neural IR.

翻訳日:2021-03-27 13:17:21 公開日:2021-01-18

# (参考訳) 機械学習モデル探索のためのインタラクティブスライス可視化

Interactive slice visualization for exploring machine learning models ( http://arxiv.org/abs/2101.06986v1 )

ライセンス: CC BY 4.0

Catherine B. Hurley, Mark O'Connell, Katarina Domijan

(参考訳) 機械学習モデルは、任意の規模のデータセットに複雑なアルゴリズムを適合させる。これらのアルゴリズムは、性能が高く、解釈性が低いことでよく知られている。我々は、予測空間のスライスをインタラクティブに可視化し、解釈可能性の欠陥に対処し、事実上、機械学習アルゴリズムのブラックボックスを開くことで、モデルの適合性を疑問視し、説明し、検証し、比較することを目的としている。スライスは相互作用を通じて直接指定されるか、あるいはモデルに適合する高占有領域や地域を訪れるように設計された様々なツアーアルゴリズムを使用する。ここで提示されるメソッドは、Rパッケージ \pkg{condvis2} に実装される。

Machine learning models fit complex algorithms to arbitrarily large datasets. These algorithms are well-known to be high on performance and low on interpretability. We use interactive visualization of slices of predictor space to address the interpretability deficit; in effect opening up the black-box of machine learning algorithms, for the purpose of interrogating, explaining, validating and comparing model fits. Slices are specified directly through interaction, or using various touring algorithms designed to visit high-occupancy sections or regions where the model fits have interesting properties. The methods presented here are implemented in the R package \pkg{condvis2}.

翻訳日:2021-03-27 13:01:57 公開日:2021-01-18

# (参考訳) テネシー・イーストマン化学プロセスにおける故障検出のためのニューラルネットワークの深部圧縮

Deep Compression of Neural Networks for Fault Detection on Tennessee Eastman Chemical Processes ( http://arxiv.org/abs/2101.06993v1 )

ライセンス: CC BY 4.0

Mingxuan Li, Yuanxun Shao

(参考訳) 人工ニューラルネットワークはテネシー・イーストマンプロセスにおいて最先端のフォールト検出性能を達成したが、膨大なパラメータに資金を提供するには膨大なメモリを必要とすることが多い。オンラインリアルタイム故障検出を実現するために,3つの深部圧縮技術(プルーニング,クラスタリング,量子化)を適用し,計算負担を軽減する。我々は7種類の圧縮技術の組み合わせを広範囲に研究し、全ての手法が高いモデル圧縮率を64%以上達成し、高い故障検出精度を維持した。最も優れた結果として、3つのテクニックを全て適用し、モデルのサイズを91.5%削減し、精度は94%以上である。これにより、本番環境でのストレージ要件が小さくなり、実環境におけるデプロイメントがよりスムーズになる。

Artificial neural network has achieved the state-of-art performance in fault detection on the Tennessee Eastman process, but it often requires enormous memory to fund its massive parameters. In order to implement online real-time fault detection, three deep compression techniques (pruning, clustering, and quantization) are applied to reduce the computational burden. We have extensively studied 7 different combinations of compression techniques, all methods achieve high model compression rates over 64% while maintain high fault detection accuracy. The best result is applying all three techniques, which reduces the model sizes by 91.5% and remains a high accuracy over 94%. This result leads to a smaller storage requirement in production environments, and makes the deployment smoother in real world.

翻訳日:2021-03-27 13:01:03 公開日:2021-01-18

# (参考訳) 深部普遍ブラインド画像

Deep Universal Blind Image Denoising ( http://arxiv.org/abs/2101.07017v1 )

ライセンス: CC BY 4.0

Jae Woong Soh, Nam Ik Cho

(参考訳) 画像のノイズ除去は、画像取得時に避けられないノイズによる多くの画像処理やコンピュータビジョンタスクで不可欠な部分である。伝統的に、多くの研究者が画像の性質と統計に基づくベイズ的視点で画像の優先順位を調査してきた。近年,deep convolutional neural networks (cnns) は大規模合成データセットを組み込んだ画像デノイジングにおいて大きな成功を収めている。しかし、どちらも長所と短所がある。ディープCNNは既知の統計でノイズを取り除くのに強力だが、視覚障害者や現実世界の騒音には柔軟性と実用性が欠けている傾向がある。さらに、明示的な事前設定は簡単には採用できない。一方、従来の非学習手法は明示的な画像先行処理を伴い得るが、かなりの計算時間を必要とし、大規模な外部データセットを活用できない。本稿では,ベイズ的視点に基づく両手法の利点を生かしたCNNに基づく手法を提案する。具体的には,視覚障害をサブプロブレムに分割し,各推論問題を分解する。 CNNは推論のための強力なツールであるため,提案手法はCNNに根ざし,効率的な推論のための新しいネットワーク設計を提案する。提案手法により,広帯域CNNのパラメータを適度に行うことで,視覚と現実世界のノイズを除去できる。

Image denoising is an essential part of many image processing and computer vision tasks due to inevitable noise corruption during image acquisition. Traditionally, many researchers have investigated image priors for the denoising, within the Bayesian perspective based on image properties and statistics. Recently, deep convolutional neural networks (CNNs) have shown great success in image denoising by incorporating large-scale synthetic datasets. However, they both have pros and cons. While the deep CNNs are powerful for removing the noise with known statistics, they tend to lack flexibility and practicality for the blind and real-world noise. Moreover, they cannot easily employ explicit priors. On the other hand, traditional non-learning methods can involve explicit image priors, but they require considerable computation time and cannot exploit large-scale external datasets. In this paper, we present a CNN-based method that leverages the advantages of both methods based on the Bayesian perspective. Concretely, we divide the blind image denoising problem into sub-problems and conquer each inference problem separately. As the CNN is a powerful tool for inference, our method is rooted in CNNs and propose a novel design of network for efficient inference. With our proposed method, we can successfully remove blind and real-world noise, with a moderate number of parameters of universal CNN.

翻訳日:2021-03-27 12:55:16 公開日:2021-01-18

# (参考訳) weibull分布による事象ログを用いたイベント駆動型予測保守のキーフレーバーの解析

Analysis of key flavors of event-driven predictive maintenance using logs of phenomena described by Weibull distributions ( http://arxiv.org/abs/2101.07033v1 )

ライセンス: CC BY 4.0

Petros Petsinis, Athanasios Naskos and Anastasios Gounaris

(参考訳) この研究は、業界4.0におけるイベント駆動型予測保守への2つのアプローチを探求し、それぞれ問題を分類または回帰として、最先端の2つのソリューションの出発点として使用します。これら2つの手法のそれぞれについて,異なるデータ前処理手法,異なる予測アルゴリズム,およびアンサンブルとサンプリング方法の影響について検討する。以上のような側面を体系的に実験することで,選択肢の強みを理解し,さらに重要な点として,多数の代替手段をインフォームドでナビゲートする方法に光を当てた。我々の研究は、この種のデータ駆動型予測保守の真の可能性を理解するための重要なステップを構成し、実践者が最も影響の大きい側面に集中するのを手助けします。

This work explores two approaches to event-driven predictive maintenance in Industry 4.0 that cast the problem at hand as a classification or a regression one, respectively, using as a starting point two state-of-the-art solutions. For each of the two approaches, we examine different data preprocessing techniques, different prediction algorithms and the impact of ensemble and sampling methods. Through systematic experiments regarding the aspectsmentioned above,we aimto understand the strengths of the alternatives, and more importantly, shed light on how to navigate through the vast number of such alternatives in an informed manner. Our work constitutes a key step towards understanding the true potential of this type of data-driven predictive maintenance as of to date, and assist practitioners in focusing on the aspects that have the greatest impact.

翻訳日:2021-03-27 12:43:06 公開日:2021-01-18

# (参考訳) 顔解析のための適応グラフ表現学習と推論

Adaptive Graph Representation Learning and Reasoning for Face Parsing ( http://arxiv.org/abs/2101.07034v1 )

ライセンス: CC BY 4.0

Gusi Te, Wei Hu, Yinglu Liu, Hailin Shi, Tao Mei

(参考訳) 顔解析は、最近注目を集めている各顔コンポーネントにピクセル単位のラベルを推測する。これまでは顔解析に成功していたが、顔成分間の相関を見落としている。実際、コンポーネント間の関係は、顔領域の曖昧なピクセルを識別するための重要な手がかりである。そこで本研究では,顔成分に対する適応的グラフ表現学習と推論を提案し,各成分を記述した代表頂点を学習し,成分関係を活用し,曖昧性に対する正確な解析結果を生成する。特に,ある顔領域内の画素特徴が頂点に集約される予測解析マップの初期条件下で,画素対頂点投影によりグラフ上の成分を表現する適応的で微分可能なグラフ抽象化手法を考案した。さらに,画像エッジを先行として,投影中にエッジと非エッジの画素を識別し,エッジに沿った解析結果の洗練に寄与するモデルとして,画像エッジを明示的に組み込む。そして,グラフ上の頂点をまたいで情報を伝播することにより,コンポーネント間の関係を学習し,理由付けを行う。最後に、改良された頂点機能は最終解析マップの予測のためにピクセルグリッドに投影される。本モデルでは,特徴空間における頂点間の小さな距離をペナルティ化する識別的損失を提案する。実験の結果,提案モデルが複数顔解析データセット上で優れた性能を示すとともに,人間の解析タスクの検証を行い,モデルの一般化可能性を示した。

Face parsing infers a pixel-wise label to each facial component, which has drawn much attention recently. Previous methods have shown their success in face parsing, which however overlook the correlation among facial components. As a matter of fact, the component-wise relationship is a critical clue in discriminating ambiguous pixels in facial area. To address this issue, we propose adaptive graph representation learning and reasoning over facial components, aiming to learn representative vertices that describe each component, exploit the component-wise relationship and thereby produce accurate parsing results against ambiguity. In particular, we devise an adaptive and differentiable graph abstraction method to represent the components on a graph via pixel-to-vertex projection under the initial condition of a predicted parsing map, where pixel features within a certain facial region are aggregated onto a vertex. Further, we explicitly incorporate the image edge as a prior in the model, which helps to discriminate edge and non-edge pixels during the projection, thus leading to refined parsing results along the edges. Then, our model learns and reasons over the relations among components by propagating information across vertices on the graph. Finally, the refined vertex features are projected back to pixel grids for the prediction of the final parsing map. To train our model, we propose a discriminative loss to penalize small distances between vertices in the feature space, which leads to distinct vertices with strong semantics. Experimental results show the superior performance of the proposed model on multiple face parsing datasets, along with the validation on the human parsing task to demonstrate the generalizability of our model.

翻訳日:2021-03-27 12:25:06 公開日:2021-01-18

# (参考訳) CLASTER:ゼロショット動作認識のための強化学習によるクラスタリング

CLASTER: Clustering with Reinforcement Learning for Zero-Shot Action Recognition ( http://arxiv.org/abs/2101.07042v1 )

ライセンス: CC BY 4.0

Shreyank N Gowda, Laura Sevilla-Lara, Frank Keller, Marcus Rohrbach

(参考訳) ゼロショットアクション認識は、視覚的な例のないアクションクラスを認識するタスクであり、目に見えないクラスに関連するセマンティックな埋め込みである。問題は、クラス間の区別を失うことなく、目に見えないクラスのインスタンスによく一般化する関数を学ぶことである。ニューラルネットワークは、視覚クラス間の複雑な境界をモデル化することができる。しかし、ゼロショット学習では、これらの高度に専門化されたクラス境界は、目に見えるクラスから見当たらないクラスへうまく移行できないかもしれない。本稿では,各インスタンスを個別に最適化するのではなく,すべてのトレーニングサンプルを同時に検討するクラスタリングモデルを提案する。私たちはReinforcement Learningを使ってクラスタリングを最適化します。我々は提案手法をCLASTERと呼び、標準ゼロショット評価と一般化ゼロショット学習の両方において、標準データセットであるUCF101, HMDB51, オリンピックスポーツの最先端性を常に改善することを確認する。

Zero-shot action recognition is the task of recognizing action classes without visual examples, only with a semantic embedding which relates unseen to seen classes. The problem can be seen as learning a function which generalizes well to instances of unseen classes without losing discrimination between classes. Neural networks can model the complex boundaries between visual classes, which explains their success as supervised models. However, in zero-shot learning, these highly specialized class boundaries may not transfer well from seen to unseen classes. In this paper, we propose a clustering-based model, which considers all training samples at once, instead of optimizing for each instance individually. We optimize the clustering using Reinforcement Learning which we show is critical for our approach to work. We call the proposed method CLASTER and observe that it consistently improves over the state-of-the-art in all standard datasets, UCF101, HMDB51, and Olympic Sports; both in the standard zero-shot evaluation and the generalized zero-shot learning.

翻訳日:2021-03-27 12:04:51 公開日:2021-01-18

# (参考訳) IMU事前積分による深部慣性オドメトリー

Deep Inertial Odometry with Accurate IMU Preintegration ( http://arxiv.org/abs/2101.07061v1 )

ライセンス: CC BY 4.0

Rooholla Khorrambakht, Chris Xiaoxuan Lu, Hamed Damirchi, Zhenghua Chen, Zhengguo Li

(参考訳) 慣性測定ユニット (IMU) は、環境要因に依存しないエゴモーション計測を提供する、インターセプティブなモダリティである。様々な自律システムで広く採用されている。数理モデルを用いてこれらのセンサからノイズ測定を処理することの限界に触発され、研究者は近年、慣性計測をエンドツーエンドに推定する様々なディープラーニングアーキテクチャを提案している。それでも、IMUからの高周波および冗長な測定により、長い生の配列が処理される。本研究では, 深部慣性計測のためのIMU運動モデル(DIO)のより現実的な解法として, 精度の高い事前積分の有効性を検討することを目的としている。正確なIMU事前積分は、既存のDIOで使用される連続IMUモデルの数値近似よりも優れている可能性がある。実験結果は提案したDIOを検証する。

Inertial Measurement Units (IMUs) are interceptive modalities that provide ego-motion measurements independent of the environmental factors. They are widely adopted in various autonomous systems. Motivated by the limitations in processing the noisy measurements from these sensors using their mathematical models, researchers have recently proposed various deep learning architectures to estimate inertial odometry in an end-to-end manner. Nevertheless, the high-frequency and redundant measurements from IMUs lead to long raw sequences to be processed. In this study, we aim to investigate the efficacy of accurate preintegration as a more realistic solution to the IMU motion model for deep inertial odometry (DIO) and the resultant DIO is a fusion of model-driven and data-driven approaches. The accurate IMU preintegration has the potential to outperform numerical approximation of the continuous IMU model used in the existing DIOs. Experimental results validate the proposed DIO.

翻訳日:2021-03-27 11:50:45 公開日:2021-01-18

# (参考訳) 因果効果推定による領域適応のためのモデル圧縮

Model Compression for Domain Adaptation through Causal Effect Estimation ( http://arxiv.org/abs/2101.07086v1 )

ライセンス: CC BY 4.0

Guy Rotman, Amir Feder and Roi Reichart

(参考訳) 自然言語処理システムの予測品質の最近の改善は、しばしばモデルパラメータの大幅な増加に依存している。これは、これらのモデルを圧縮する様々な試みにつながったが、既存の手法では、様々なモデルコンポーネントの予測能力や圧縮モデルの一般化可能性の違いは考慮されていない。モデル圧縮とアウト・オブ・ディストリビューション一般化の関連性を理解するため,ドメイン適応設定において最良となるように言語表現モデルを圧縮するタスクを定義する。我々は、モデルの予測に基づいて、単一層のようなモデルコンポーネントの \textit{average treatment effect} (ATE) を推定しようと、因果的な観点からこの問題に対処することを選択した。提案したATE誘導モデル圧縮スキーム(AMoC)は,除去されたモデルコンポーネントによって異なる多くのモデル候補を生成する。次に、ATEを利用した段階的回帰モデルを用いて、最適候補を選択し、対象領域における期待性能を予測する。 AMoCは2つのテキスト分類タスクで60のドメインペアのうち46の強いベースラインより優れており、F1の平均的な改善は最強のベースラインより3倍以上多い。

Recent improvements in the predictive quality of natural language processing systems are often dependent on a substantial increase in the number of model parameters. This has led to various attempts of compressing such models, but existing methods have not considered the differences in the predictive power of various model components or in the generalizability of the compressed models. To understand the connection between model compression and out-of-distribution generalization, we define the task of compressing language representation models such that they perform best in a domain adaptation setting. We choose to address this problem from a causal perspective, attempting to estimate the \textit{average treatment effect} (ATE) of a model component, such as a single layer, on the model's predictions. Our proposed ATE-guided Model Compression scheme (AMoC), generates many model candidates, differing by the model components that were removed. Then, we select the best candidate through a stepwise regression model that utilizes the ATE to predict the expected performance on the target domain. AMoC outperforms strong baselines on 46 of 60 domain pairs across two text classification tasks, with an average improvement of more than 3\% in F1 above the strongest baseline.

翻訳日:2021-03-27 11:42:07 公開日:2021-01-18

# (参考訳) ストレージシミュレータが生成する障害のオンライン検出

Online detection of failures generated by storage simulator ( http://arxiv.org/abs/2101.07100v1 )

ライセンス: CC BY 4.0

Kenenbek Arzymatov, Mikhail Hushchyn, Andrey Sapronov, Vladislav Belavin, Leonid Gremyachikh, Maksim Karpov and Andrey Ustyuzhanin

(参考訳) 現代の大規模データファームは、分散インフラストラクチャにまたがる数十万のストレージデバイスで構成されている。現代のデータセンター(コントローラ、リンク、SSD、HDDディスクなど)で使用されるデバイスは、ハードウェアとソフトウェアの問題によって故障する可能性がある。このような障害や異常は、機械学習技術を用いてコンポーネントのアクティビティを監視することで検出できる。これらの技術を使うためには、研究者は通常のデバイスの履歴データと、アルゴリズムのトレーニングに障害モードを必要とする。本研究では,1)シミュレータ作成によるメソッド内のストレージデータの欠如,2)コンポーネントの1つで発生した障害を素早く検出できる既存のオンラインアルゴリズムの適用という2つの課題に挑戦する。現代のストレージインフラストラクチャの振る舞いをシミュレートするためのGoベースの(golang)パッケージを作成しました。このソフトウェアは離散イベントモデリングのパラダイムに基づいており、高レベルのストレージシステム構築ブロックの構造とダイナミクスをキャプチャする。パッケージのフレキシブルな構造により、構成可能なコンポーネント数で現実世界のストレージシステムのモデルを作成することができます。主な関心領域は、部品の故障を観察するための中長期のストレステストや利用の下での記憶装置の動作を探索することである。シミュレータが生成した時系列分布の故障を検出するため,オンラインモードで動作する変更点検出アルゴリズムを改良した。変化点検出の目標は、時系列分布の違いを発見することである。本稿では,バイナリ分類器を用いた直接密度比推定に基づく時系列データの故障検出手法について述べる。

Modern large-scale data-farms consist of hundreds of thousands of storage devices that span distributed infrastructure. Devices used in modern data centers (such as controllers, links, SSD- and HDD-disks) can fail due to hardware as well as software problems. Such failures or anomalies can be detected by monitoring the activity of components using machine learning techniques. In order to use these techniques, researchers need plenty of historical data of devices in normal and failure mode for training algorithms. In this work, we challenge two problems: 1) lack of storage data in the methods above by creating a simulator and 2) applying existing online algorithms that can faster detect a failure occurred in one of the components. We created a Go-based (golang) package for simulating the behavior of modern storage infrastructure. The software is based on the discrete-event modeling paradigm and captures the structure and dynamics of high-level storage system building blocks. The package's flexible structure allows us to create a model of a real-world storage system with a configurable number of components. The primary area of interest is exploring the storage machine's behavior under stress testing or exploitation in the medium- or long-term for observing failures of its components. To discover failures in the time series distribution generated by the simulator, we modified a change point detection algorithm that works in online mode. The goal of the change-point detection is to discover differences in time series distribution. This work describes an approach for failure detection in time series data based on direct density ratio estimation via binary classifiers.

翻訳日:2021-03-27 11:23:47 公開日:2021-01-18

# (参考訳) Telugu言語のためのニューラル抽象テキスト要約器

Neural Abstractive Text Summarizer for Telugu Language ( http://arxiv.org/abs/2101.07120v1 )

ライセンス: CC BY 4.0

Mohan Bharath B, Aravindh Gowtham B, Akhil M

(参考訳) 抽象テキスト要約 (Abstractive Text Summarization) は、ソーステキストの全体的意味の本質を捉える意味論的に関連する短い文を構築する過程である。実際、人間がテキストの大きな文書を手作業で要約するのは困難であり、非常に時間がかかります。抽象的なテキスト要約の作業の多くは英語で行われており、テルグの抽象的なテキスト要約にはほとんど大きな成果が報告されていない。そこで我々は,Deep Learningを用いたTelugu言語のための抽象的なテキスト要約手法を提案する。本稿では,Telugu言語のための抽象テキスト要約深層学習モデルを提案する。提案手法は注意機構を有するエンコーダ・デコーダシーケンシャルモデルに基づく。このモデルを手作業で作成したデータセットに適用して,ソーステキストの一文要約を生成し,質的に測定した結果を得た。

Abstractive Text Summarization is the process of constructing semantically relevant shorter sentences which captures the essence of the overall meaning of the source text. It is actually difficult and very time consuming for humans to summarize manually large documents of text. Much of work in abstractive text summarization is being done in English and almost no significant work has been reported in Telugu abstractive text summarization. So, we would like to propose an abstractive text summarization approach for Telugu language using Deep learning. In this paper we are proposing an abstractive text summarization Deep learning model for Telugu language. The proposed architecture is based on encoder-decoder sequential models with attention mechanism. We have applied this model on manually created dataset to generate a one sentence summary of the source text and have got good results measured qualitatively.

翻訳日:2021-03-27 11:18:00 公開日:2021-01-18

# (参考訳) ReLUネットワークにおける深さの利点に関する簡単な幾何学的証明

A simple geometric proof for the benefit of depth in ReLU networks ( http://arxiv.org/abs/2101.07126v1 )

ライセンス: CC BY 4.0

Asaf Amrami and Yoav Goldberg

(参考訳) 本稿では, 再活性化した多層フィードフォワードネットワーク(deepth separation)における深度効果の簡易な証明を提案する。具体的には、$m$でインデックス付けされた一連の分類問題を示し、(a)任意の固定深さ整流ネットワークに対して、(a) 問題を正しく分類するには指数関数的なパラメータ数($m$)が必要となる$m$ と、(b) シーケンス中の任意の問題に対して、問題をゼロエラーで分類する、線形深さ($m$)と小さい定数幅($\leq 4$)を持つ具体的なニューラルネットワークを示す。構成的証明は幾何学的議論と空間折り畳み構成に基づいている。より強固な境界と結果が存在する一方で、この証明は極めて単純なツールと技術を用いており、コンピュータサイエンスの学部生や同様の背景を持つ人々にもアクセス可能であるべきである。

We present a simple proof for the benefit of depth in multi-layer feedforward network with rectified activation ("depth separation"). Specifically we present a sequence of classification problems indexed by $m$ such that (a) for any fixed depth rectified network there exist an $m$ above which classifying problem $m$ correctly requires exponential number of parameters (in $m$); and (b) for any problem in the sequence, we present a concrete neural network with linear depth (in $m$) and small constant width ($\leq 4$) that classifies the problem with zero error. The constructive proof is based on geometric arguments and a space folding construction. While stronger bounds and results exist, our proof uses substantially simpler tools and techniques, and should be accessible to undergraduate students in computer science and people with similar backgrounds.

翻訳日:2021-03-27 11:13:46 公開日:2021-01-18

# (参考訳) 近似k-部分モジュラー関数の最大化

Maximizing approximately k-submodular functions ( http://arxiv.org/abs/2101.07157v1 )

ライセンス: CC BY 4.0

Leqian Zheng and Hau Chan and Grigorios Loukides and Minming Li

(参考訳) サイズ制約を受ける約$k$-サブモジュラー関数を最大化する問題を導入する。この問題では、有界な総サイズまたは個々のサイズを持つ基底集合の$k$-disjoint部分集合と、$k$-submodular となる関数の "close" によって与えられる最大効用を選ぼうとする。この問題は、ノイズの多いセンサのタイプに$k$をインストールしたいというセンサー配置や、影響力のレベルが不確実なソーシャルネットワークのユーザに$k$のトピックを宣伝しようとしている最大化などのタスクに応用されている。この問題に対処するために、我々はまず、約$k$-submodular関数に対する2つの自然な定義を提供し、それらの間の階層的関係を確立する。次に, 単純な欲望アルゴリズムが, 異なるサイズ制約に対する近似保証を提供することを示す。最後に,このアルゴリズムがセンサ配置や最大化問題に有効であることを実験的に示す。

We introduce the problem of maximizing approximately $k$-submodular functions subject to size constraints. In this problem, one seeks to select $k$-disjoint subsets of a ground set with bounded total size or individual sizes, and maximum utility, given by a function that is "close" to being $k$-submodular. The problem finds applications in tasks such as sensor placement, where one wishes to install $k$ types of sensors whose measurements are noisy, and influence maximization, where one seeks to advertise $k$ topics to users of a social network whose level of influence is uncertain. To deal with the problem, we first provide two natural definitions for approximately $k$-submodular functions and establish a hierarchical relationship between them. Next, we show that simple greedy algorithms offer approximation guarantees for different types of size constraints. Last, we demonstrate experimentally that the greedy algorithms are effective in sensor placement and influence maximization problems.

翻訳日:2021-03-27 11:05:08 公開日:2021-01-18

# (参考訳) 限られたデータによる機械学習

Machine learning with limited data ( http://arxiv.org/abs/2101.11461v1 )

ライセンス: CC BY 4.0

Fupin Yao

(参考訳) 強力なコンピューティングリソース、ビッグデータ、ディープラーニングアルゴリズムの可用性のおかげで、ここ数年でコンピュータビジョンに大きな進歩を遂げました。コンピュータビジョンシステムは、物体認識、物体検出、顔認識、ポーズ推定など、いくつかのタスクで人間を超え始めます。多くのコンピュータビジョンアルゴリズムが現実世界のアプリケーションにデプロイされ、私たちの生活の質を改善し始めた。しかし、ビッグデータやラベルは必ずしも利用できない。時には、専門家がラベルを付ける必要のある医療画像など、非常に限定されたラベルデータしか持っていないことがあります。本稿では,少ないラベル付きデータしか持たない撮影画像分類について検討する。小さなデータによる機械学習は大きな課題だ。この課題に取り組むために,我々は2つの手法を提案し,その効果を徹底的に検証する。 1つの方法は、これらの画像のスタイルを混ぜることで、画像の特徴を強化することである。第2の方法は、画像のパッチ間の関係を探索するために空間的注意を適用することである。また、トレーニングドメインとテストドメインが異なる場合、わずかなショット学習では、ドメインシフトが重要な問題であることも分かりました。そこで本稿では,ラベルのないデータセットを対象とする,より現実的なドメイン間数ショット学習を提案する。この設定では2つの方法を提案する。第1の方法は、ラベルのないターゲットデータセットのスタイル情報をソースデータセットのサンプルに転送し、スタイリッシュなイメージとオリジナルイメージでモデルをトレーニングする。第2の方法は,すべてのデータを完全に活用するための統一フレームワークを提案する。どちらの手法も基準法を大きなマージンで上回ります。

Thanks to the availability of powerful computing resources, big data and deep learning algorithms, we have made great progress on computer vision in the last few years. Computer vision systems begin to surpass humans in some tasks, such as object recognition, object detection, face recognition and pose estimation. Lots of computer vision algorithms have been deployed to real world applications and started to improve our life quality. However, big data and labels are not always available. Sometimes we only have very limited labeled data, such as medical images which requires experts to label them. In this paper, we study few shot image classification, in which we only have very few labeled data. Machine learning with little data is a big challenge. To tackle this challenge, we propose two methods and test their effectiveness thoroughly. One method is to augment image features by mixing the style of these images. The second method is applying spatial attention to explore the relations between patches of images. We also find that domain shift is a critical issue in few shot learning when the training domain and testing domain are different. So we propose a more realistic cross-domain few-shot learning with unlabeled data setting, in which some unlabeled data is available in the target domain. We propose two methods in this setting. Our first method transfers the style information of the unlabeled target dataset to the samples in the source dataset and trains a model with stylized images and original images. Our second method proposes a unified framework to fully utilize all the data. Both of our methods surpass the baseline method by a large margin.

翻訳日:2021-03-27 10:08:57 公開日:2021-01-18

# (参考訳) アクティブ輪郭モデルとスピードアップロバスト特徴を用いた顔料病変の自動分離と評価の新しいアプローチ

A New Approach for Automatic Segmentation and Evaluation of Pigmentation Lesion by using Active Contour Model and Speeded Up Robust Features ( http://arxiv.org/abs/2101.07195v1 )

ライセンス: CC0 1.0

Sara Mardanisamani, Zahra Karimi, Akram Jamshidzadeh, Mehran Yazdi, Melika Farshad, Amirmehdi Farshad

(参考訳) デジタル画像処理技術は、医学を含む様々な科学分野に広く応用されている。画像処理アルゴリズムを用いることで、医師はさまざまな疾患の診断に成功し、より優れた治療結果を得た。本稿では,皮膚病変の分類とそれに関連する特徴の抽出を自動で行う手法を提案する。この目的には、高速化されたロバスト特徴量(surf)とアクティブ輪郭モデル(acm)の組み合わせを用いる。提案手法では,皮膚病変の第一領域を全皮膚画像から抽出し,その領域から平均値,分散値,RGB値,HSV値などの特徴を抽出する。提案手法は,大津しきい値を用いたセグメンテーションの結果と比較し,大津留置法よりも手順が優れていることを示す。提案法と大津しきい値による皮膚病変の分別は,医師の手作業法と比較した。 SURFとACMを併用した皮膚病変の分画法は,最もよい結果が得られた。本手法を実験的に評価するために,20種類の皮膚病変画像に適用した。その結果,提案手法の性能,速度,精度が確認できた。

Digital image processing techniques have wide applications in different scientific fields including the medicine. By use of image processing algorithms, physicians have been more successful in diagnosis of different diseases and have achieved much better treatment results. In this paper, we propose an automatic method for segmenting the skin lesions and extracting features that are associated to them. At this aim, a combination of Speeded-Up Robust Features (SURF) and Active Contour Model (ACM), is used. In the suggested method, at first region of skin lesion is segmented from the whole skin image, and then some features like the mean, variance, RGB and HSV parameters are extracted from the segmented region. Comparing the segmentation results, by use of Otsu thresholding, our proposed method, shows the superiority of our procedure over the Otsu theresholding method. Segmentation of the skin lesion by the proposed method and Otsu thresholding compared the results with physician's manual method. The proposed method for skin lesion segmentation, which is a combination of SURF and ACM, gives the best result. For empirical evaluation of our method, we have applied it on twenty different skin lesion images. Obtained results confirm the high performance, speed and accuracy of our method.

翻訳日:2021-03-27 09:59:36 公開日:2021-01-18

# (参考訳) 集中的相手を用いた医用画像の連合生成モデルによるバイアス低減と有用性の向上

Reducing bias and increasing utility by federated generative modeling of medical images using a centralized adversary ( http://arxiv.org/abs/2101.07235v1 )

ライセンス: CC BY 4.0

Jean-Francois Rajotte, Sumit Mukherjee, Caleb Robinson, Anthony Ortiz, Christopher West, Juan Lavista Ferres, Raymond T Ng

(参考訳) 我々は、協調学習を可能にする生成メカニズムであるFELICIA(Federated LearnIng with a CentralIzed Adversary)を紹介する。特に、限定的かつ偏りのあるデータを持つデータ所有者が、すべてのソースからのデータをプライベートに保ちながら、他のデータ所有者の利益を享受できることを示す。これは、プライバシー法がデータをローカルな施設外で共有することを防ぐ医療画像解析において一般的なシナリオである。 FELICIAは、この研究で示されているように、バニラや条件付きGANを含むGAN(Generative Adversarial Networks)アーキテクチャの大規模なファミリーで動作する。 FELICIA機構を用いることで,データ所有者がデータへのアクセスを提供しなくても,画像サンプルに制限のあるデータ所有者が高能率で高品質な合成画像を生成することができることを示す。共有は、合成データに限られる中央の識別器を通してのみ行われる。ここで、ユーティリティは実際のテストセットの分類性能として定義される。皮膚病変分類のための医用画像およびベンチマーク画像データセット(mnist, cifar-10)を用いて,いくつかの現実的な医療シナリオにおいて,これらの利点を実証する。複数の実験で、最悪の場合においても、FELICIAと実データを組み合わせることで、実データと同等の性能が得られ、ほとんどの結果が実用性を大幅に向上することを示した。

We introduce FELICIA (FEderated LearnIng with a CentralIzed Adversary) a generative mechanism enabling collaborative learning. In particular, we show how a data owner with limited and biased data could benefit from other data owners while keeping data from all the sources private. This is a common scenario in medical image analysis where privacy legislation prevents data from being shared outside local premises. FELICIA works for a large family of Generative Adversarial Networks (GAN) architectures including vanilla and conditional GANs as demonstrated in this work. We show that by using the FELICIA mechanism, a data owner with limited image samples can generate high-quality synthetic images with high utility while neither data owners has to provide access to its data. The sharing happens solely through a central discriminator that has access limited to synthetic data. Here, utility is defined as classification performance on a real test set. We demonstrate these benefits on several realistic healthcare scenarions using benchmark image datasets (MNIST, CIFAR-10) as well as on medical images for the task of skin lesion classification. With multiple experiments, we show that even in the worst cases, combining FELICIA with real data gracefully achieves performance on par with real data while most results significantly improves the utility.

翻訳日:2021-03-27 09:51:38 公開日:2021-01-18

# (参考訳) 半教師付き学習のためのマルチモーダル変分オートエンコーダ--製品・オブ・エキスパートの擁護

Multimodal Variational Autoencoders for Semi-Supervised Learning: In Defense of Product-of-Experts ( http://arxiv.org/abs/2101.07240v1 )

ライセンス: CC BY 4.0

Svetlana Kutuzova, Oswin Krause, Douglas McCloskey, Mads Nielsen, Christian Igel

(参考訳) マルチモーダル生成モデルは、すべてのモダリティ(画像やテキストなど)のコヒーレントな共同生成を可能にする有意義な潜在表現を学べるべきである。多くの応用では、モダリティのサブセットの観測で条件付けられたモダリティを正確にサンプリングする能力も必要である。すべてのトレーニングデータポイントですべてのモダリティが観測されるわけではないため、半教師付き学習が可能となる。本研究では,これらの特性を持つ多変量オートエンコーダの製品群(PoE)を評価する。我々は新しいpoeベースのアーキテクチャとトレーニング手順を含む。経験的評価は、PoEベースのモデルが添加性混合(MoE)アプローチより優れていることを示している。我々の実験は、PoEモデルがモジュラリティの共役結合に適しているのに対して、MoEは接合融合に適しているという直感を支持する。

Multimodal generative models should be able to learn a meaningful latent representation that enables a coherent joint generation of all modalities (e.g., images and text). Many applications also require the ability to accurately sample modalities conditioned on observations of a subset of the modalities. Often not all modalities may be observed for all training data points, so semi-supervised learning should be possible. In this study, we evaluate a family of product-of-experts (PoE) based variational autoencoders that have these desired properties. We include a novel PoE based architecture and training procedure. An empirical evaluation shows that the PoE based models can outperform an additive mixture-of-experts (MoE) approach. Our experiments support the intuition that PoE models are more suited for a conjunctive combination of modalities while MoEs are more suited for a disjunctive fusion.

翻訳日:2021-03-27 09:37:27 公開日:2021-01-18

# (参考訳) 観察による学習:人間ビデオからの操作スキルの物理的模倣

Learning by Watching: Physical Imitation of Manipulation Skills from Human Videos ( http://arxiv.org/abs/2101.07241v1 )

ライセンス: CC0 1.0

Haoyu Xiong, Quanzhou Li, Yun-Chun Chen, Homanga Bharadhwaj, Samarth Sinha, Animesh Garg

(参考訳) 本稿では,ロボット操作作業のための人間ビデオからの物理模倣手法を提案する。我々の手法の鍵となる考え方は、ビデオに埋め込まれた運動情報と運動情報を明示的に活用して、ロボットが自身のコンテキストで操作を行う方法を想像できる構造的表現を学ぶことである。そこで我々は,人間の映像をロボット領域に翻訳し,教師なしのキーポイント検出を行う知覚モジュールを設計した。得られたキーポイントに基づく表現は意味的に意味のある情報を提供し、報酬計算やポリシー学習に直接利用できる。提案手法は, ロボット操作作業において, リーチ, 押圧, スライディング, コーヒーメイキング, 引き出しクローズの5つの課題に対して有効性を評価する。詳細な実験評価の結果,従来の手法に好適な効果を示した。

We present an approach for physical imitation from human videos for robot manipulation tasks. The key idea of our method lies in explicitly exploiting the kinematics and motion information embedded in the video to learn structured representations that endow the robot with the ability to imagine how to perform manipulation tasks in its own context. To achieve this, we design a perception module that learns to translate human videos to the robot domain followed by unsupervised keypoint detection. The resulting keypoint-based representations provide semantically meaningful information that can be directly used for reward computing and policy learning. We evaluate the effectiveness of our approach on five robot manipulation tasks, including reaching, pushing, sliding, coffee making, and drawer closing. Detailed experimental evaluations demonstrate that our method performs favorably against previous approaches.

翻訳日:2021-03-27 09:01:24 公開日:2021-01-18

# (参考訳) 量子格子モデルに対するゲージ不変自己回帰ニューラルネットワーク

Gauge Invariant Autoregressive Neural Networks for Quantum Lattice Models ( http://arxiv.org/abs/2101.07243v1 )

ライセンス: CC0 1.0

Di Luo, Zhuo Chen, Kaiwen Hu, Zhizhen Zhao, Vera Mikyoung Hur, and Bryan K. Clark

(参考訳) ゲージ不変性は、凝縮物物理学から高エネルギー物理学まで量子力学において重要な役割を果たす。量子格子モデルのためのゲージ不変自己回帰ニューラルネットワークの構築手法を開発する。これらのネットワークは効率的にサンプリングでき、ゲージ対称性を明示的に従うことができる。我々は、ゲージ不変自己回帰ニューラルネットワークの基底状態と、様々なモデルのリアルタイムダイナミクスを可変に最適化する。 2Dおよび3Dトーリック符号の基底状態と励起状態、およびX-キューブフラクトンモデルを正確に表現する。我々は、$\text{u(1)}$格子ゲージ理論の量子リンクモデルのダイナミクスをシミュレートし、2d $\mathbb{z}_2$ゲージ理論の位相図を取得し、$\text{su(2)}_3$anyonic chainの位相遷移と中心電荷を決定し、$\text{su(2)}$ invariant heisenbergスピンチェーンの基底状態エネルギーを計算する。我々のアプローチは、凝縮物質物理学、高エネルギー物理学、量子情報科学を探索するための強力なツールを提供する。

Gauge invariance plays a crucial role in quantum mechanics from condensed matter physics to high energy physics. We develop an approach to constructing gauge invariant autoregressive neural networks for quantum lattice models. These networks can be efficiently sampled and explicitly obey gauge symmetries. We variationally optimize our gauge invariant autoregressive neural networks for ground states as well as real-time dynamics for a variety of models. We exactly represent the ground and excited states of the 2D and 3D toric codes, and the X-cube fracton model. We simulate the dynamics of the quantum link model of $\text{U(1)}$ lattice gauge theory, obtain the phase diagram for the 2D $\mathbb{Z}_2$ gauge theory, determine the phase transition and the central charge of the $\text{SU(2)}_3$ anyonic chain, and also compute the ground state energy of the $\text{SU(2)}$ invariant Heisenberg spin chain. Our approach provides powerful tools for exploring condensed matter physics, high energy physics and quantum information science.

翻訳日:2021-03-27 08:45:53 公開日:2021-01-18

# (参考訳) HAMMER:学習メッセージによる強化学習エージェントの多層コーディネーション

HAMMER: Multi-Level Coordination of Reinforcement Learning Agents via Learned Messaging ( http://arxiv.org/abs/2102.00824v1 )

ライセンス: CC0 1.0

Nikunj Gupta, G Srinivasaraghavan, Swarup Kumar Mohalik, Matthew E. Taylor

(参考訳) 協調型マルチエージェント強化学習(marl)は,ディープニューラルネットワークの表現学習能力を活用することで,大きな成果を上げている。しかし、エージェントの数が増えるにつれて、大規模な集中型アプローチはすぐに実現不可能になり、完全な分散型アプローチは情報共有と協調の重要な機会を逃す可能性がある。さらに、すべてのエージェントが等しくはない - 場合によっては、個々のエージェントが他のエージェントに通信を送信したり、他のエージェントを明示的にモデル化する能力さえ持たない場合がある。本稿では、観測空間全体を観測できる単一の、強力な、中央のエージェントが存在する場合と、局所的な観測しか受信できず、互いに通信できない複数の、低パワーのローカルエージェントが存在することを考察する。中央エージェントの役割は、問題全体を一元的に解決し、アクションコマンドを送信することではなく、個々のエージェントが受信すべき追加情報を決定することによって、グローバルな観察に基づいて、異なるローカルエージェントに送信すべきメッセージを知ることである。 MARLアルゴリズム、ハンマー、そして最も適用可能な場所を説明した後、協調ナビゲーションとマルチエージェントウォーカードメインで実装する。その結果,1)学習したコミュニケーションはシステム性能が向上し,2)成果は複数のエージェントに一般化し,3)成果は報酬構造に一般化した。

Cooperative multi-agent reinforcement learning (MARL) has achieved significant results, most notably by leveraging the representation learning abilities of deep neural networks. However, large centralized approaches quickly become infeasible as the number of agents scale, and fully decentralized approaches can miss important opportunities for information sharing and coordination. Furthermore, not all agents are equal - in some cases, individual agents may not even have the ability to send communication to other agents or explicitly model other agents. This paper considers the case where there is a single, powerful, central agent that can observe the entire observation space, and there are multiple, low powered, local agents that can only receive local observations and cannot communicate with each other. The job of the central agent is to learn what message to send to different local agents, based on the global observations, not by centrally solving the entire problem and sending action commands, but by determining what additional information an individual agent should receive so that it can make a better decision. After explaining our MARL algorithm, hammer, and where it would be most applicable, we implement it in the cooperative navigation and multi-agent walker domains. Empirical results show that 1) learned communication does indeed improve system performance, 2) results generalize to multiple numbers of agents, and 3) results generalize to different reward structures.

翻訳日:2021-03-27 07:41:24 公開日:2021-01-18

# (参考訳) 従来の機械学習とディープラーニングモデルを用いた教育内容の分類

Classification of Pedagogical content using conventional machine learning and deep learning model ( http://arxiv.org/abs/2101.07321v1 )

ライセンス: CC BY 4.0

Vedat Apuk, Krenare Pireva Nu\c{c}i

(参考訳) インターネットの出現と多くのデジタル技術によって、様々な課題がもたらされた。大量のデータがWeb上で発見され、多くの場合、構造化されておらず、組織化されていないため、このデータの使用と操作は極めて難しいプロセスであるという事実に寄与する。この事実により、テキスト分類における異なる機械学習技術とディープラーニング技術の使用が重要となり、この分野を改善し、科学者や研究者にとってさらなる研究がより興味深いものとなった。本稿では,従来のモデルからk-nearest neighbor(knn),ディープラーニングモデルからlong short-term memory(lstm)リカレントニューラルネットワークの2つの異なるモデルを用いて,教育内容の分類を行う。その結果,教育内容の分類精度はKNNモデルで92.52 %,LSTMモデルで87.71 %に達することがわかった。

The advent of the Internet and a large number of digital technologies has brought with it many different challenges. A large amount of data is found on the web, which in most cases is unstructured and unorganized, and this contributes to the fact that the use and manipulation of this data is quite a difficult process. Due to this fact, the usage of different machine and deep learning techniques for Text Classification has gained its importance, which improved this discipline and made it more interesting for scientists and researchers for further study. This paper aims to classify the pedagogical content using two different models, the K-Nearest Neighbor (KNN) from the conventional models and the Long short-term memory (LSTM) recurrent neural network from the deep learning models. The result indicates that the accuracy of classifying the pedagogical content reaches 92.52 % using KNN model and 87.71 % using LSTM model.

翻訳日:2021-03-27 07:24:25 公開日:2021-01-18

# (参考訳) BERTモデルによる自動句読点復元

Automatic punctuation restoration with BERT models ( http://arxiv.org/abs/2101.07343v1 )

ライセンス: CC BY 4.0

Attila Nagy, Bence Bial, Judit \'Acs

(参考訳) 本稿では,英語とハンガリー語に対するBERTモデルを用いた自動句読点復元手法を提案する。ハンガリー語ではSzeged Treebankデータセットでモデルを評価する一方、英語では句読点復元のための一般的なベンチマークであるTed Talksで実験を行った。我々の最良のモデルは、英語で79.8ドル、ハンガリー語で82.2ドルのマクロ平均$F_1$スコアを達成する。私たちのコードは公開されています。

We present an approach for automatic punctuation restoration with BERT models for English and Hungarian. For English, we conduct our experiments on Ted Talks, a commonly used benchmark for punctuation restoration, while for Hungarian we evaluate our models on the Szeged Treebank dataset. Our best models achieve a macro-averaged $F_1$-score of 79.8 in English and 82.2 in Hungarian. Our code is publicly available.

翻訳日:2021-03-27 07:00:12 公開日:2021-01-18

# (参考訳) リアルタイム適応ロボット把持のためのrgbと深度データからの物体検出とポーズ推定

Object Detection and Pose Estimation from RGB and Depth Data for Real-time, Adaptive Robotic Grasping ( http://arxiv.org/abs/2101.07347v1 )

ライセンス: CC BY-SA 4.0

S. K. Paul, M. T. Chowdhury, M. Nicolescu, M. Nicolescu

(参考訳) 近年,ロボット視覚応用の文脈において,物体検出とポーズ推定が注目されている。興味のある物体の識別とポーズの推定は、ロボットが家庭の作業から工業的な操作まで、多くのロボットアプリケーションに対して効果的な支援を提供するためにも重要である。この問題は、異なる形と潜在的に複雑な形状を持つ物体の多様性と、背景のクラッタと物体間の部分的な閉塞によって生じる困難のため、特に困難である。本研究の主な貢献として,動的ロボットの把握を目的としたリアルタイム物体検出とポーズ推定を行うシステムを提案する。ロボットは、各オブジェクトに対するいくつかの固定されたポーズから、少数の標準的グリップを実行するために事前訓練されている。任意のポーズで未知のオブジェクトを提示すると、ロボットはオブジェクトの同一性とその実際のポーズを検知し、新しいポーズで使用するために標準的グリップを適用することができる。訓練のためのシステムは、ロボットの手首に取り付けられたグリッパーに対する対象の相対的な姿勢を捉えることで、標準的な把握を定義する。試験中、新たなポーズが検出されると、ロボットアームの関節角度を調整して物体の正準把持を識別して動的に適応させ、グリッパーが新たなポーズで物体を把持できるようにする。我々はヒューマノイドPR2ロボットを用いて実験を行い、提案したフレームワークが良好なテクスチャを持つ物体を検知し、許容量の飛行機外回転の存在下で正確なポーズ推定を行うことを示した。また、ロボットが任意のポーズからオブジェクトをつかむのに成功し、パフォーマンスを図示する。

In recent times, object detection and pose estimation have gained significant attention in the context of robotic vision applications. Both the identification of objects of interest as well as the estimation of their pose remain important capabilities in order for robots to provide effective assistance for numerous robotic applications ranging from household tasks to industrial manipulation. This problem is particularly challenging because of the heterogeneity of objects having different and potentially complex shapes, and the difficulties arising due to background clutter and partial occlusions between objects. As the main contribution of this work, we propose a system that performs real-time object detection and pose estimation, for the purpose of dynamic robot grasping. The robot has been pre-trained to perform a small set of canonical grasps from a few fixed poses for each object. When presented with an unknown object in an arbitrary pose, the proposed approach allows the robot to detect the object identity and its actual pose, and then adapt a canonical grasp in order to be used with the new pose. For training, the system defines a canonical grasp by capturing the relative pose of an object with respect to the gripper attached to the robot's wrist. During testing, once a new pose is detected, a canonical grasp for the object is identified and then dynamically adapted by adjusting the robot arm's joint angles, so that the gripper can grasp the object in its new pose. We conducted experiments using a humanoid PR2 robot and showed that the proposed framework can detect well-textured objects, and provide accurate pose estimation in the presence of tolerable amounts of out-of-plane rotation. The performance is also illustrated by the robot successfully grasping objects from a wide range of arbitrary poses.

翻訳日:2021-03-27 06:53:12 公開日:2021-01-18

# (参考訳) 合成医療データの忠実性とプライバシー

Fidelity and Privacy of Synthetic Medical Data ( http://arxiv.org/abs/2101.08658v1 )

ライセンス: CC BY 4.0

Ofer Mendelevitch, Michael D. Lesh

(参考訳) 医療記録のデジタル化は、新しい時代のビッグデータを臨床科学に継承し、データを共有できる可能性とともに、研究者が論文記録から抽象化できるものを超えて洞察を積み重ねた。精度医療の革新を促進するために、個々のレベルの医療データを共有する必要性は拡大し続けており、科学者が新型コロナウイルス(COVID-19)のパンデミックに苦しむ中で、より緊急なものになったことはない。しかし、ビッグデータの利用に対する熱意は、患者の自律性とプライバシに対する完全な適切な懸念によって誘惑された。つまり、個人に関するプライベートまたはシークレットな情報を抽出する能力は、データを共有する前に重要なインフラストラクチャとデータガバナンスを確立する必要があるため、データの共有を難しくする。 HIPAAは、データ共有の承認メカニズムとして非識別を提供したが、リンク攻撃は大きな脆弱性として特定された。フィールド抑圧や抽象化といった個人情報の漏洩を避けるために、共有できる情報の量を制限する、微分プライバシーのような数学的手法を用いるといった様々なメカニズムが確立されている。もうひとつのアプローチは、基礎となるデータを模倣する合成データを作ることです。合成データは, 医療革新を支えるための有用なメカニズムであり, 実世界の証拠のプロキシであるためには, 合成データセットの2つの特性を示す必要がある。(1) 実データに関する分析は, 合成データの分析(統計的忠実性)と(2) 合成データは, 最小限の再識別(プライバシ保証)のリスクを伴って, プライバシーを保たなければならない。本稿では,合成データセットの統計忠実性とプライバシ保存特性を定量化する枠組みを提案し,syntegra技術によって生成された合成データの指標を示す。

The digitization of medical records ushered in a new era of big data to clinical science, and with it the possibility that data could be shared, to multiply insights beyond what investigators could abstract from paper records. The need to share individual-level medical data to accelerate innovation in precision medicine continues to grow, and has never been more urgent, as scientists grapple with the COVID-19 pandemic. However, enthusiasm for the use of big data has been tempered by a fully appropriate concern for patient autonomy and privacy. That is, the ability to extract private or confidential information about an individual, in practice, renders it difficult to share data, since significant infrastructure and data governance must be established before data can be shared. Although HIPAA provided de-identification as an approved mechanism for data sharing, linkage attacks were identified as a major vulnerability. A variety of mechanisms have been established to avoid leaking private information, such as field suppression or abstraction, strictly limiting the amount of information that can be shared, or employing mathematical techniques such as differential privacy. Another approach, which we focus on here, is creating synthetic data that mimics the underlying data. For synthetic data to be a useful mechanism in support of medical innovation and a proxy for real-world evidence, one must demonstrate two properties of the synthetic dataset: (1) any analysis on the real data must be matched by analysis of the synthetic data (statistical fidelity) and (2) the synthetic data must preserve privacy, with minimal risk of re-identification (privacy guarantee). In this paper we propose a framework for quantifying the statistical fidelity and privacy preservation properties of synthetic datasets and demonstrate these metrics for synthetic data generated by Syntegra technology.

翻訳日:2021-03-27 06:37:46 公開日:2021-01-18

# (参考訳) 完全畳み込みネットワークによるテキスト線抽出とエネルギー最小化

Text line extraction using fully convolutional network and energy minimization ( http://arxiv.org/abs/2101.07370v1 )

ライセンス: CC BY 4.0

Berat Kurar Barakat, Ahmad Droby, Reem Alaasam, Boraq Madi, Irina Rabaev, Jihad El-Sana

(参考訳) テキスト行は手書き文書画像の重要な部分であり、さらなるアプリケーションにより分析が容易である。最近のテキスト行検出の進歩にもかかわらず、手書き文書からのテキスト行抽出は未解決の作業である。本稿では,テキストライン検出のための完全畳み込みネットワークと,テキストライン抽出のためのエネルギー最小化手法を提案する。検出されたテキスト行は、テキスト行を貫くブロブ線で表現される。これらのブロブ線は、テキスト線抽出のためのエネルギー関数を支援する。検出段階は任意に向き付けられたテキスト行を特定できる。さらに、抽出段階は、その向きによらず、さまざまな高さのテキスト行の画素と線間近接を見出すことができる。さらに、向きを仮定することなく、タッチと重なり合うテキスト行を細かく分割することができる。本稿では,VML-AHTE,VML-MOC,Diva-HisDBデータセットに対する提案手法の評価を行う。 VML-AHTEデータセットは、リッチなダイアクリティカルなテキスト行の重複、タッチ、クローズを含む。 VML-MOCデータセットは、マルチ指向で歪んだテキスト行によって非常に難しい。 Diva-HisDBデータセットは、テキスト行の高さとタッチ行を表示する。その結果, 様々な課題があるにもかかわらず, 全ての実験において同じパラメータを用いた手法の有効性が示された。

Text lines are important parts of handwritten document images and easier to analyze by further applications. Despite recent progress in text line detection, text line extraction from a handwritten document remains an unsolved task. This paper proposes to use a fully convolutional network for text line detection and energy minimization for text line extraction. Detected text lines are represented by blob lines that strike through the text lines. These blob lines assist an energy function for text line extraction. The detection stage can locate arbitrarily oriented text lines. Furthermore, the extraction stage is capable of finding out the pixels of text lines with various heights and interline proximity independent of their orientations. Besides, it can finely split the touching and overlapping text lines without an orientation assumption. We evaluate the proposed method on VML-AHTE, VML-MOC, and Diva-HisDB datasets. The VML-AHTE dataset contains overlapping, touching and close text lines with rich diacritics. The VML-MOC dataset is very challenging by its multiply oriented and skewed text lines. The Diva-HisDB dataset exhibits distinct text line heights and touching text lines. The results demonstrate the effectiveness of the method despite various types of challenges, yet using the same parameters in all the experiments.

翻訳日:2021-03-27 06:31:44 公開日:2021-01-18

# 自然言語とRLによる解釈可能な政策仕様と合成

Interpretable Policy Specification and Synthesis through Natural Language and RL ( http://arxiv.org/abs/2101.07140v1 )

ライセンス: Link先を確認

Pradyumna Tambwekar, Andrew Silva, Nakul Gopalan, Matthew Gombolay

(参考訳) ポリシー仕様は、人間がロボットの動作を初期化して、強化学習(Reinforcement Learning, RL)を通して温かい開始ポリシーを最適化するプロセスである。ポリシーの仕様/設計は本質的に協調的なプロセスであるが、デモや深いrlからの学習に基づくモダンな手法は、モデル解釈性とアクセシビリティを欠いている。これらのモデルは、エージェントが学習したポリシーを検査する手段を提供しておらず、ロボットの振る舞いを教えるために使用可能なモダリティの作成に重点を置いていません。本稿では,1)自然言語を通じて,理解しやすい決定木という形で解釈可能なポリシーを規定し,2)これらのポリシーをウォームスタート強化学習に活用し,3)自然言語初期化機構を欠いたベースラインよりも優れる,新たな機械学習フレームワークを提案する。我々は,木をベースとした政策決定に,自由形式の自然言語ポリシー記述をマッピングすることで,アプローチを訓練する。本稿では,2つの領域にまたがる保留コーパスにおいて,自然言語を96%,97%の精度で決定木に翻訳する手法を提案する。最後に、自然言語コマンドで初期化されるポリシーが、自然言語ベースのウォームスタートテクニックの恩恵を受けない関連するベースライン(p < 0.001)を大幅に上回ることができることを検証します。

Policy specification is a process by which a human can initialize a robot's behaviour and, in turn, warm-start policy optimization via Reinforcement Learning (RL). While policy specification/design is inherently a collaborative process, modern methods based on Learning from Demonstration or Deep RL lack the model interpretability and accessibility to be classified as such. Current state-of-the-art methods for policy specification rely on black-box models, which are an insufficient means of collaboration for non-expert users: These models provide no means of inspecting policies learnt by the agent and are not focused on creating a usable modality for teaching robot behaviour. In this paper, we propose a novel machine learning framework that enables humans to 1) specify, through natural language, interpretable policies in the form of easy-to-understand decision trees, 2) leverage these policies to warm-start reinforcement learning and 3) outperform baselines that lack our natural language initialization mechanism. We train our approach by collecting a first-of-its-kind corpus mapping free-form natural language policy descriptions to decision tree-based policies. We show that our novel framework translates natural language to decision trees with a 96% and 97% accuracy on a held-out corpus across two domains, respectively. Finally, we validate that policies initialized with natural language commands are able to significantly outperform relevant baselines (p < 0.001) that do not benefit from our natural language-based warm-start technique.

翻訳日:2021-03-27 06:08:51 公開日:2021-01-18

# MP3: マップ、知覚、予測、計画のための統一モデル

MP3: A Unified Model to Map, Perceive, Predict and Plan ( http://arxiv.org/abs/2101.06806v1 )

ライセンス: Link先を確認

Sergio Casas, Abbas Sadat, Raquel Urtasun

(参考訳) 高精細地図(HDマップ)は、その意味や幾何学的情報から、現代のほとんどの自動運転システムにおいて重要な要素である。残念なことに、HDマップの構築はコストと、それに伴うローカライゼーションシステムに課される要件のため、スケールが難しいことが証明されている。 hdマップなしで運転できることは、自動運転ソリューションをスケールしたり、既存のソリューションの障害耐性を高めるのに非常に有益である(例えば、ローカライズが失敗したり、マップが最新でない場合)。この目的に向けて,入力が生センサデータと高レベルコマンド(例えば交差点で左折する)を持つマップレス運転におけるエンドツーエンドのMP3を提案する。 mp3は、オンラインマップと動的エージェントの現在および将来の状態の中間表現を予測し、それらを新しいニューラルモーションプランナーで活用し、不確実性を考慮した解釈可能な決定を行う。長期的なクローズドループシミュレーションや,大規模な実世界のデータセットのエキスパートドライバと比較して,当社のアプローチは極めて安全で快適で,ベースラインよりもコマンドを追従可能であることが分かりました。

High-definition maps (HD maps) are a key component of most modern self-driving systems due to their valuable semantic and geometric information. Unfortunately, building HD maps has proven hard to scale due to their cost as well as the requirements they impose in the localization system that has to work everywhere with centimeter-level accuracy. Being able to drive without an HD map would be very beneficial to scale self-driving solutions as well as to increase the failure tolerance of existing ones (e.g., if localization fails or the map is not up-to-date). Towards this goal, we propose MP3, an end-to-end approach to mapless driving where the input is raw sensor data and a high-level command (e.g., turn left at the intersection). MP3 predicts intermediate representations in the form of an online map and the current and future state of dynamic agents, and exploits them in a novel neural motion planner to make interpretable decisions taking into account uncertainty. We show that our approach is significantly safer, more comfortable, and can follow commands better than the baselines in challenging long-term closed-loop simulations, as well as when compared to an expert driver in a large-scale real-world dataset.

翻訳日:2021-03-27 06:08:27 公開日:2021-01-18

# 深い構造を持つリアクティブプランニング

Deep Structured Reactive Planning ( http://arxiv.org/abs/2101.06832v1 )

ライセンス: Link先を確認

Jerry Liu, Wenyuan Zeng, Raquel Urtasun, Ersin Yumer

(参考訳) 現実世界で活動するインテリジェントエージェントは、その目標を達成するために、自身だけでなく、周囲のシーンの他の参加者の安全と快適さを維持することのバランスをとる必要がある。これは、他のアクターの行動について共同で推論すると同時に、これらの2つのプロセスが本質的に相互に絡み合っているため、独自の行動を決定する必要があります。しかしこれは、計画が予測に従うほとんどの自動運転パイプラインでは捉えられていない。本稿では,自動運転車が自己計画や他のアクターがどう反応するかを共同で判断できる,新たなデータ駆動型,リアクティブな計画目標を提案する。この問題を観測データから学習し,計画問題と予測問題の両方を符号化したエネルギーベース深部構造モデルとして定式化する。実世界の運転と合成された高密度交通の両方に基づくシミュレーションにより、我々の反応モデルは、衝突速度を抑えることなく、高度に複雑な操作(交通路のマージ/ターン)を高速に完了させることで、非反応性の変動よりも優れることを示した。

An intelligent agent operating in the real-world must balance achieving its goal with maintaining the safety and comfort of not only itself, but also other participants within the surrounding scene. This requires jointly reasoning about the behavior of other actors while deciding its own actions as these two processes are inherently intertwined - a vehicle will yield to us if we decide to proceed first at the intersection but will proceed first if we decide to yield. However, this is not captured in most self-driving pipelines, where planning follows prediction. In this paper we propose a novel data-driven, reactive planning objective which allows a self-driving vehicle to jointly reason about its own plans as well as how other actors will react to them. We formulate the problem as an energy-based deep structured model that is learned from observational data and encodes both the planning and prediction problems. Through simulations based on both real-world driving and synthetically generated dense traffic, we demonstrate that our reactive model outperforms a non-reactive variant in successfully completing highly complex maneuvers (lane merges/turns in traffic) faster, without trading off collision rate.

翻訳日:2021-03-27 06:08:05 公開日:2021-01-18

# LNSMM:ローカルネットワーク共有マルチビューマルチタスクによる眼球運動推定

LNSMM: Eye Gaze Estimation With Local Network Share Multiview Multitask ( http://arxiv.org/abs/2101.07116v1 )

ライセンス: Link先を確認

Yong Huang, Ben Chen, Daiming Qu

(参考訳) Eye gaze estimation has become increasingly significant in computer vision.In this paper,we systematically study the mainstream of eye gaze estimation methods,propose a novel methodology to estimate eye gaze points and eye gaze directions simultaneously.First,we construct a local sharing network for feature extraction of gaze points and gaze directions estimation,which can reduce network computational parameters and converge quickly;Second,we propose a Multiview Multitask Learning (MTL) framework,for gaze directions,a coplanar constraint is proposed for the left and right eyes,for gaze points,three views data input indirectly introduces eye position information,a cross-view pooling module is designed, propose joint loss which handle both gaze points and gaze directions estimation.Eventually,we collect a dataset to use of gaze points,which have three views to exist public dataset.The experiment show our method is state-of-the-art the current mainstream methods on two indicators of gaze points and gaze directions.

Eye gaze estimation has become increasingly significant in computer vision.In this paper,we systematically study the mainstream of eye gaze estimation methods,propose a novel methodology to estimate eye gaze points and eye gaze directions simultaneously.First,we construct a local sharing network for feature extraction of gaze points and gaze directions estimation,which can reduce network computational parameters and converge quickly;Second,we propose a Multiview Multitask Learning (MTL) framework,for gaze directions,a coplanar constraint is proposed for the left and right eyes,for gaze points,three views data input indirectly introduces eye position information,a cross-view pooling module is designed, propose joint loss which handle both gaze points and gaze directions estimation.Eventually,we collect a dataset to use of gaze points,which have three views to exist public dataset.The experiment show our method is state-of-the-art the current mainstream methods on two indicators of gaze points and gaze directions.

翻訳日:2021-03-27 06:07:14 公開日:2021-01-18

# 複数の領域にわたる効率的な教師なし適応のための知識蒸留法

Knowledge Distillation Methods for Efficient Unsupervised Adaptation Across Multiple Domains ( http://arxiv.org/abs/2101.07308v1 )

ライセンス: Link先を確認

Le Thanh Nguyen-Meidine, Atif Belal, Madhu Kiran, Jose Dolz, Louis-Antoine Blais-Morin, Eric Granger

(参考訳) 大規模なアノテートデータセットのトレーニングを必要とするCNNの複雑さに加えて、設計と運用データのドメインシフトは、多くの現実世界アプリケーションにおいてCNNの採用を制限している。例えば、個人の再識別では、ビデオは重複しない視点を持つ分散したカメラセットでキャプチャされる。ソース間のシフト(例) 実験室の設定)とターゲット(例) カメラ)ドメインは認識精度を著しく低下させる可能性がある。さらに、最先端のCNNは、計算要求からすると、そのようなリアルタイムアプリケーションには適さないかもしれない。近年,非教師なし領域適応(uda)や知識蒸留(kd)によるcnnの高速化と圧縮を行う手法が提案されているが,複数の対象領域にまたがるcnnの適応と圧縮を同時に行なおうとしている。本稿では、CNNの教師なし単一ターゲットDA(STDA)とマルチターゲットDA(MTDA)に対するプログレッシブKDアプローチを提案する。我々のKD-STDA法は,CNNを1つのターゲット領域に適応させるため,より大規模な教師CNNから抽出し,目標領域データとソース領域データの両方で学習し,共通表現との整合性を維持する。提案手法は,Office31 および ImageClef-DA 画像分類データセット上の CNN の圧縮と STDA の最先端手法と比較する。また、Digits、Office31、OfficeHome上のMTDAの最先端メソッドと比較される。両方の設定 -- KD-STDAとKD-MTDA -- の結果から、我々のアプローチは、CNNの複雑さを同等または低いものにしつつ、ターゲットドメイン全体で最高の精度を達成できることを示している。

Beyond the complexity of CNNs that require training on large annotated datasets, the domain shift between design and operational data has limited the adoption of CNNs in many real-world applications. For instance, in person re-identification, videos are captured over a distributed set of cameras with non-overlapping viewpoints. The shift between the source (e.g. lab setting) and target (e.g. cameras) domains may lead to a significant decline in recognition accuracy. Additionally, state-of-the-art CNNs may not be suitable for such real-time applications given their computational requirements. Although several techniques have recently been proposed to address domain shift problems through unsupervised domain adaptation (UDA), or to accelerate/compress CNNs through knowledge distillation (KD), we seek to simultaneously adapt and compress CNNs to generalize well across multiple target domains. In this paper, we propose a progressive KD approach for unsupervised single-target DA (STDA) and multi-target DA (MTDA) of CNNs. Our method for KD-STDA adapts a CNN to a single target domain by distilling from a larger teacher CNN, trained on both target and source domain data in order to maintain its consistency with a common representation. Our proposed approach is compared against state-of-the-art methods for compression and STDA of CNNs on the Office31 and ImageClef-DA image classification datasets. It is also compared against state-of-the-art methods for MTDA on Digits, Office31, and OfficeHome. In both settings -- KD-STDA and KD-MTDA -- results indicate that our approach can achieve the highest level of accuracy across target domains, while requiring a comparable or lower CNN complexity.

翻訳日:2021-03-27 06:07:03 公開日:2021-01-18

# テキストからテキストへの変換による自然言語からの関数のラベル付け

Teach me how to Label: Labeling Functions from Natural Language with Text-to-text Transformers ( http://arxiv.org/abs/2101.07138v1 )

ライセンス: Link先を確認

Yannis Papanikolaou

(参考訳) 注釈付きデータは、特にドメインの専門知識を必要とする分野において、正確な機械学習モデルをトレーニングする上で最も重要なボトルネックとなっている。上記の問題に対処する最近のアプローチでは、個々のデータポイントをラベル付けするのではなく、自然言語による説明を用いることで、アノテータの効率を向上し、コストを大幅に削減する。本稿では,これらの自然言語記述をPythonラベリング関数に変換する作業について,事前学習したテキスト・テキスト・トランスフォーマを用いたセマンティック・パースに追従する。一連の実験で、我々のアプローチはセマンティック構文解析ベンチマークのconalaの新たな最先端を達成し、以前のベストアプローチを3.7 bleuポイントで上回った。さらに,自然言語記述ラベル関数ペアを手作業で構築したデータセットでは,0。我々のアプローチは、特定のラベル付きサンプルを提供するのではなく、自然言語でラベル付けする方法を教えるモデルへのステップストーンと見なすことができる。私たちのコード、構築されたデータセット、モデルは、https://github.com/ypapanik/t5-for-code-generationで利用可能です。

Annotated data has become the most important bottleneck in training accurate machine learning models, especially for areas that require domain expertise. A recent approach to deal with the above issue proposes using natural language explanations instead of labeling individual data points, thereby increasing human annotators' efficiency as well as decreasing costs substantially. This paper focuses on the task of turning these natural language descriptions into Python labeling functions by following a novel approach to semantic parsing with pre-trained text-to-text Transformers. In a series of experiments our approach achieves a new state of the art on the semantic parsing benchmark CoNaLa, surpassing the previous best approach by 3.7 BLEU points. Furthermore, on a manually constructed dataset of natural language descriptions-labeling functions pairs we achieve a BLEU of 0.39. Our approach can be regarded as a stepping stone towards models that are taught how to label in natural language, instead of being provided specific labeled samples. Our code, constructed dataset and models are available at https://github.com/ypapanik/t5-for-code-generation.

翻訳日:2021-03-27 06:06:35 公開日:2021-01-18

# Kalman Smoothing を用いた重ね合わせLSTMを用いた深部繰り返しニューラルネットワークによる血糖予測

Stacked LSTM Based Deep Recurrent Neural Network with Kalman Smoothing for Blood Glucose Prediction ( http://arxiv.org/abs/2101.06850v1 )

ライセンス: Link先を確認

Md Fazle Rabby, Yazhou Tu, Md Imran Hossen, Insup Le, Anthony S Maida, Xiali Hei

(参考訳) 血液グルコース (BG) は, 信頼性の高い人工膵臓やインスリン注入システムを必要とする1型糖尿病患者に必須である。近年,より正確なBGレベルの予測システムとしてディープラーニング技術が活用されている。しかし、連続グルコースモニタリング(CGM)はセンサーエラーの影響を受けやすい。その結果、最も最適な機械学習モデルを使用しても、不正確なCGMの読み取りがBG予測に影響を与え、信頼性が低下する。本研究では,センサの故障を考慮したLong-term memory(LSTM)に基づく深部リカレントニューラルネットワーク(RNN)モデルを用いて,血糖値の予測手法を提案する。センサ誤差による不正確なCGM読影の補正にはカルマン平滑化法を用いる。 6人の異なる患者の8週間のデータを含むOttoT1DMデータセットでは、平均RMSEは6.45と17.24mg/dlで、それぞれ30分60分予測水平線(PH)を達成している。我々の知る限りでは、これはOoioT1DMデータセットの平均予測精度の最上位である。例えば、カルマンのCGMデータのスムーズ化、食事からの炭水化物、ボルスインシュリン、累積ステップ数などの異なる生理的情報は、モデルへの入力として使われる有意義な特徴を表すために作成される。アプローチの目的は、予測されたCGM値と指先での血糖値の差を下げることである。以上の結果から,t1d糖尿病管理のための人工膵およびインスリン注入システムの性能向上を期待できる,より信頼性の高いbg予測が可能と考えられた。

Blood glucose (BG) management is crucial for type-1 diabetes patients resulting in the necessity of reliable artificial pancreas or insulin infusion systems. In recent years, deep learning techniques have been utilized for a more accurate BG level prediction system. However, continuous glucose monitoring (CGM) readings are susceptible to sensor errors. As a result, inaccurate CGM readings would affect BG prediction and make it unreliable, even if the most optimal machine learning model is used. In this work, we propose a novel approach to predicting blood glucose level with a stacked Long short-term memory (LSTM) based deep recurrent neural network (RNN) model considering sensor fault. We use the Kalman smoothing technique for the correction of the inaccurate CGM readings due to sensor error. For the OhioT1DM dataset, containing eight weeks' data from six different patients, we achieve an average RMSE of 6.45 and 17.24 mg/dl for 30 minutes and 60 minutes of prediction horizon (PH), respectively. To the best of our knowledge, this is the leading average prediction accuracy for the ohioT1DM dataset. Different physiological information, e.g., Kalman smoothed CGM data, carbohydrates from the meal, bolus insulin, and cumulative step counts in a fixed time interval, are crafted to represent meaningful features used as input to the model. The goal of our approach is to lower the difference between the predicted CGM values and the fingerstick blood glucose readings - the ground truth. Our results indicate that the proposed approach is feasible for more reliable BG forecasting that might improve the performance of the artificial pancreas and insulin infusion system for T1D diabetes management.

翻訳日:2021-03-27 06:06:17 公開日:2021-01-18

# サブタスクとしての報酬の不確実性予測による安定な深層強化学習法

Stable deep reinforcement learning method by predicting uncertainty in rewards as a subtask ( http://arxiv.org/abs/2101.06906v1 )

ライセンス: Link先を確認

Kanata Suzuki, Tetsuya Ogata

(参考訳) 近年, 深部強化学習(DRL)によって様々な課題が達成されている。しかし,実環境におけるタスクにdrlを適用する場合,適切な報酬の設計は困難である。実際のハードウェアセンサーから得られる報酬には、ノイズ、誤解、あるいは観測失敗が含まれる。これらの不安定な信号による学習不安定性は、DRLでは未解決の問題である。本研究では,報酬信号に含まれる分散を直接推定するためにサブタスクを追加することで,既存のDRLモデルを拡張するアプローチを提案する。次にモデルは、批判ネットワークのサブタスクによって学習されたフィーチャーマップをアクタネットワークに送信する。これにより、潜在的なノイズの影響にロバストな安定した学習が可能になる。不安定報奨信号を持つatariゲーム領域における実験の結果,本手法はトレーニング収束を安定化することがわかった。また,特徴マップの可視化による拡張性についても検討する。このアプローチは、ノイズの多い現実世界のシナリオでDRLをより実用的なものにする可能性がある。

In recent years, a variety of tasks have been accomplished by deep reinforcement learning (DRL). However, when applying DRL to tasks in a real-world environment, designing an appropriate reward is difficult. Rewards obtained via actual hardware sensors may include noise, misinterpretation, or failed observations. The learning instability caused by these unstable signals is a problem that remains to be solved in DRL. In this work, we propose an approach that extends existing DRL models by adding a subtask to directly estimate the variance contained in the reward signal. The model then takes the feature map learned by the subtask in a critic network and sends it to the actor network. This enables stable learning that is robust to the effects of potential noise. The results of experiments in the Atari game domain with unstable reward signals show that our method stabilizes training convergence. We also discuss the extensibility of the model by visualizing feature maps. This approach has the potential to make DRL more practical for use in noisy, real-world scenarios.

翻訳日:2021-03-27 06:05:48 公開日:2021-01-18

# 正規化ポリシはリワードロバストである

Regularized Policies are Reward Robust ( http://arxiv.org/abs/2101.07012v1 )

ライセンス: Link先を確認

Hisham Husain and Kamil Ciosek and Ryota Tomioka

(参考訳) 強化学習(RL)における政策のエントロピー正則化(Entropic regularization)は、学習された政策が局所的最適政策に過度に適合する前に国家空間を十分に探索することを保証するために一般的に用いられるヒューリスティックである。エントロピーを使う主な動機は最適政策の探索と曖昧化であるが、理論的な効果は完全には理解されていない。本研究では、より一般化された正規化RLの目的とフェンシェル双対性について検討し、対角的報酬問題の形をとる双対問題を導出する。特に, 正規化対象が求める最適方針は, 最悪の対人報酬の下での強化学習問題の最適方針であることがわかった。その結果、一般的なエントロピー正規化スキームをロバスト化の形式として再解釈することができる。さらに,結果の一般性から,既存の他の正規化スキームにも適用する。以上の結果から,政策の正則化の効果を考察し,堅牢な報酬を通じて探索の理解を深めることができた。

Entropic regularization of policies in Reinforcement Learning (RL) is a commonly used heuristic to ensure that the learned policy explores the state-space sufficiently before overfitting to a local optimal policy. The primary motivation for using entropy is for exploration and disambiguating optimal policies; however, the theoretical effects are not entirely understood. In this work, we study the more general regularized RL objective and using Fenchel duality; we derive the dual problem which takes the form of an adversarial reward problem. In particular, we find that the optimal policy found by a regularized objective is precisely an optimal policy of a reinforcement learning problem under a worst-case adversarial reward. Our result allows us to reinterpret the popular entropic regularization scheme as a form of robustification. Furthermore, due to the generality of our results, we apply to other existing regularization schemes. Our results thus give insights into the effects of regularization of policies and deepen our understanding of exploration through robust rewards at large.

翻訳日:2021-03-27 06:05:18 公開日:2021-01-18

# 連続学習=破滅的な忘れ方?

Does Continual Learning = Catastrophic Forgetting? ( http://arxiv.org/abs/2101.07295v1 )

ライセンス: Link先を確認

Anh Thai, Stefan Stojanov, Isaac Rehg, James M. Rehg

(参考訳) 継続的な学習は破滅的な忘れ込みに苦しむことで知られており、これはより最近のサンプルを犠牲にして初期の学習概念が忘れられる現象である。本研究は,連続学習が必然的に破滅的な記憶に結びつくという仮定に挑戦し,継続的に学習しても破滅的な記憶に苦しむことのない一連の課題を提示する。本研究では,これらの課題の特性を把握し,破滅的な忘れ方や,連続的な分類のための代理表現学習タスクの可能性を実証する。さらに,クラスインクリメンタルな分類学習タスクにおいて,最先端の手法より優れている新しいアルゴリズムYASSを導入する。最後に、連続モデルにおける表現学習のダイナミクスを追跡する新しいツールであるDyRTを提案する。この記事でリリースされたコードベース、データセット、事前トレーニングされたモデルは、https://github.com/ngailapdi/CLRec.comで見ることができる。

Continual learning is known for suffering from catastrophic forgetting, a phenomenon where earlier learned concepts are forgotten at the expense of more recent samples. In this work, we challenge the assumption that continual learning is inevitably associated with catastrophic forgetting by presenting a set of tasks that surprisingly do not suffer from catastrophic forgetting when learned continually. We attempt to provide an insight into the property of these tasks that make them robust to catastrophic forgetting and the potential of having a proxy representation learning task for continual classification. We further introduce a novel yet simple algorithm, YASS that outperforms state-of-the-art methods in the class-incremental categorization learning task. Finally, we present DyRT, a novel tool for tracking the dynamics of representation learning in continual models. The codebase, dataset and pre-trained models released with this article can be found at https://github.com/ngailapdi/CLRec.

翻訳日:2021-03-27 06:04:13 公開日:2021-01-18

# shape to categorize: 明示的な形状バイアスによる低ショット学習

Using Shape to Categorize: Low-Shot Learning with an Explicit Shape Bias ( http://arxiv.org/abs/2101.07296v1 )

ライセンス: Link先を確認

Stefan Stojanov, Anh Thai, James M. Rehg

(参考訳) 物体形状の推論が物体認識にとって重要であることは広く受け入れられている。しかし、今日の最も強力なオブジェクト認識手法は、学習中に明示的にオブジェクト形状を使用しない。本研究では,低ショット学習の最近の発展,発達心理学の知見,コンピュータビジョン研究における合成データの利用の増加に動機づけられ,低ショット学習法の一般化性能向上に3次元形状の推論がいかに役立つかを検討する。本研究では, 3次元物体形状を用いた判別埋め込み空間を学習し, 画像のマッピング法を学習することにより, 既存の低ショット学習手法を改善する新しい手法を提案する。新しいアプローチは、複数のデータセットにおける画像のみの低ショット学習アプローチのパフォーマンスを向上させる。また、最も多くのオブジェクトカテゴリを持つ新しい3dオブジェクトデータセットであるtoys4kも開発しています。

It is widely accepted that reasoning about object shape is important for object recognition. However, the most powerful object recognition methods today do not explicitly make use of object shape during learning. In this work, motivated by recent developments in low-shot learning, findings in developmental psychology, and the increased use of synthetic data in computer vision research, we investigate how reasoning about 3D shape can be used to improve low-shot learning methods' generalization performance. We propose a new way to improve existing low-shot learning approaches by learning a discriminative embedding space using 3D object shape, and utilizing this embedding by learning how to map images into it. Our new approach improves the performance of image-only low-shot learning approaches on multiple datasets. We also develop Toys4K, a new 3D object dataset with the biggest number of object categories that can also support low-shot learning.

翻訳日:2021-03-27 06:03:57 公開日:2021-01-18

# 病理画像埋め込みのための拡大一般化

Magnification Generalization for Histopathology Image Embedding ( http://arxiv.org/abs/2101.07757v1 )

ライセンス: Link先を確認

Milad Sikaroudi, Benyamin Ghojogh, Fakhri Karray, Mark Crowley, H.R. Tizhoosh

(参考訳) 病理像の埋め込みはコンピュータビジョンの活発な研究領域である。埋め込みモデルのほとんどは、特定の倍率レベルにのみ集中する。しかしながら、病理組織学の埋め込みにおいて有用なタスクは、拡大レベルに関係なく埋め込み空間を訓練することである。この目標に対処するための2つの主要なアプローチは、ドメイン適応とドメイン一般化である。拡大適応は文献でよく研究されている話題であるが,我々の知る限りでは,組織病理画像埋め込みのための拡大一般化に関する最初の研究である。本稿では,モデル非依存メタラーニング(MAML)の概念に基づく,意味的特徴のモデル非依存学習(MASF)という,拡大一般化のためのエピソード学習可能な領域一般化手法を用いる。 4種類の倍率の乳腺病理組織学的データセットを用いた実験結果から,提案法の有効性が示唆された。

Histopathology image embedding is an active research area in computer vision. Most of the embedding models exclusively concentrate on a specific magnification level. However, a useful task in histopathology embedding is to train an embedding space regardless of the magnification level. Two main approaches for tackling this goal are domain adaptation and domain generalization, where the target magnification levels may or may not be introduced to the model in training, respectively. Although magnification adaptation is a well-studied topic in the literature, this paper, to the best of our knowledge, is the first work on magnification generalization for histopathology image embedding. We use an episodic trainable domain generalization technique for magnification generalization, namely Model Agnostic Learning of Semantic Features (MASF), which works based on the Model Agnostic Meta-Learning (MAML) concept. Our experimental results on a breast cancer histopathology dataset with four different magnification levels show the proposed method's effectiveness for magnification generalization.

翻訳日:2021-03-27 06:03:45 公開日:2021-01-18

# 降水の高速かつ高精度なマルチレゾリューション動的ダウンスケーリング

Fast and accurate learned multiresolution dynamical downscaling for precipitation ( http://arxiv.org/abs/2101.06813v1 )

ライセンス: Link先を確認

Jiali Wang, Zhengchun Liu, Ian Foster, Won Chang, Rajkumar Kettimuthu, Rao Kotamarthi

(参考訳) 本研究では,高分解能モデルによる降水データと同等の統計的特性をエミュレートするニューラルネットワークに基づく手法を開発した。鍵となるアイデアは、低解像度と高解像度のシミュレーションを組み合わせてニューラルネットワークを訓練し、前者から後者にマップすることだ。具体的には,変数を直接スタックし,各変数をスタックする前にエンコードする2つのタイプのcnnを定義し,各cnnタイプを平均二乗誤差 (mse) や条件付き生成敵ネットワーク (cgan) といった従来の損失関数を用いて,合計4つのcnn変種で訓練する。 CNNに基づく4つの新しい高分解能降水結果と、元の高分解能シミュレーション、双線形補間および最先端CNNに基づく超分解能(SR)技術から生じる降水量を比較した。その結果,SR法は従来の高分解能シミュレーションよりもスムーズな空間分布と時間分布,データ変動と極値の両線形補間器と同様の結果が得られた。 MSEによって訓練された新しいCNNは、補間器やSR技術よりもいくつかの領域でより良い結果を生成するが、その予測は元の高分解能シミュレーションほど近いものではない。 CGANによって訓練されたCNNは、より現実的で物理的に合理的な結果を生成し、時間と空間におけるデータの変動だけでなく、激しい嵐や長期間の嵐のような極端な現象もよりよく捉えている。新たに提案されたcnnベースのダウンスケーリングアプローチは、ネットワークがトレーニングされた後(1~gpuを使用して4～時間を要する)、14～minで50～kmから12～kmまでの降雨を30～年にダウンスケールすることができる。

This study develops a neural network-based approach for emulating high-resolution modeled precipitation data with comparable statistical properties but at greatly reduced computational cost. The key idea is to use combination of low- and high- resolution simulations to train a neural network to map from the former to the latter. Specifically, we define two types of CNNs, one that stacks variables directly and one that encodes each variable before stacking, and we train each CNN type both with a conventional loss function, such as mean square error (MSE), and with a conditional generative adversarial network (CGAN), for a total of four CNN variants. We compare the four new CNN-derived high-resolution precipitation results with precipitation generated from original high resolution simulations, a bilinear interpolater and the state-of-the-art CNN-based super-resolution (SR) technique. Results show that the SR technique produces results similar to those of the bilinear interpolator with smoother spatial and temporal distributions and smaller data variabilities and extremes than the original high resolution simulations. While the new CNNs trained by MSE generate better results over some regions than the interpolator and SR technique do, their predictions are still not as close as the original high resolution simulations. The CNNs trained by CGAN generate more realistic and physically reasonable results, better capturing not only data variability in time and space but also extremes such as intense and long-lasting storms. The new proposed CNN-based downscaling approach can downscale precipitation from 50~km to 12~km in 14~min for 30~years once the network is trained (training takes 4~hours using 1~GPU), while the conventional dynamical downscaling would take 1~month using 600 CPU cores to generate simulations at the resolution of 12~km over contiguous United States.

翻訳日:2021-03-27 06:03:30 公開日:2021-01-18

# 浸透テスト中のウェブサイト構造発見を最適化するAIを活用する

Leveraging AI to optimize website structure discovery during Penetration Testing ( http://arxiv.org/abs/2101.07223v1 )

ライセンス: Link先を確認

Diego Antonelli, Roberta Cascella, Gaetano Perrone, Simon Pietro Romano, Antonio Schiano

(参考訳) Dirbustingは、サーバの内容を列挙するために、HTTPレスポンスを監視しながら、Webサーバ上のディレクトリとファイル名をブルートするテクニックである。このような手法は、共通の単語のリストを使用して、ターゲットウェブサイトの隠れた構造を発見する。 dirbustingは通常、新しいページを見つけるための発見条件としてレスポンスコードに依存している。これは企業がウェブサイトの脆弱性を検知する活動であるWebアプリケーションの浸透テストで広く利用されている。 dirbustingのテクニックは時間とリソースの両方を消費するものであり、この分野で革新的なアプローチが探求されたことはない。そこで我々は,人工知能を活用し,ディルバスティングプロセスを最適化する高度な手法を提案する。具体的には、セマンティッククラスタリング手法を用いて、意味的意味に応じて異なるグループで単語リストを整理する。生成されたクラスタは、アドホックに実装された次のワードインテリジェント戦略で使用される。本稿では,クラスタリング手法が一般的なブライト力法よりも優れていることを示す。パフォーマンスは8つの異なるWebアプリケーションをテストすることで評価される。その結果,各実験で最大50%の性能向上が確認された。

Dirbusting is a technique used to brute force directories and file names on web servers while monitoring HTTP responses, in order to enumerate server contents. Such a technique uses lists of common words to discover the hidden structure of the target website. Dirbusting typically relies on response codes as discovery conditions to find new pages. It is widely used in web application penetration testing, an activity that allows companies to detect websites vulnerabilities. Dirbusting techniques are both time and resource consuming and innovative approaches have never been explored in this field. We hence propose an advanced technique to optimize the dirbusting process by leveraging Artificial Intelligence. More specifically, we use semantic clustering techniques in order to organize wordlist items in different groups according to their semantic meaning. The created clusters are used in an ad-hoc implemented next-word intelligent strategy. This paper demonstrates that the usage of clustering techniques outperforms the commonly used brute force methods. Performance is evaluated by testing eight different web applications. Results show a performance increase that is up to 50% for each of the conducted experiments.

翻訳日:2021-03-27 06:02:40 公開日:2021-01-18

# 深層強化学習エージェントのための摂動に基づく塩分マップのベンチマーク

Benchmarking Perturbation-based Saliency Maps for Explaining Deep Reinforcement Learning Agents ( http://arxiv.org/abs/2101.07312v1 )

ライセンス: Link先を確認

Tobias Huber, Benedikt Limmer, Elisabeth Andr\'e

(参考訳) 近年、複雑な知的エージェントの説明が盛んに行われている。 1つの例は、各ピクセルがエージェントの決定にどの程度の理由があるかを示す、サリエンシマップを生成するアルゴリズムの開発である。しかし,このようなサリエンシマップのほとんどの評価は,画像分類作業に重点を置いている。私たちが知る限り、深層強化学習エージェントの異なる給与マップを徹底的に比較する作業はありません。本稿では,4つの異なるAtari 2600ゲームで訓練された深層強化学習エージェントに対して,摂動に基づく4つのサリエンシマップ作成手法を比較した。 4つのアプローチはすべて、入力の一部を摂動させ、エージェントの出力にどの程度影響するかを測定することで機能する。アプローチはエージェントの学習パラメータへの依存(正当性チェック)、エージェントの推論への忠実さ(入力劣化)、実行時間という3つの計算指標を用いて比較される。

Recent years saw a plethora of work on explaining complex intelligent agents. One example is the development of several algorithms that generate saliency maps which show how much each pixel attributed to the agents' decision. However, most evaluations of such saliency maps focus on image classification tasks. As far as we know, there is no work which thoroughly compares different saliency maps for Deep Reinforcement Learning agents. This paper compares four perturbation-based approaches to create saliency maps for Deep Reinforcement Learning agents trained on four different Atari 2600 games. All four approaches work by perturbing parts of the input and measuring how much this affects the agent's output. The approaches are compared using three computational metrics: dependence on the learned parameters of the agent (sanity checks), faithfulness to the agent's reasoning (input degradation), and run-time.

翻訳日:2021-03-27 06:02:24 公開日:2021-01-18

# 多項出力を用いたBARTの推論

Inference for BART with Multinomial Outcomes ( http://arxiv.org/abs/2101.06823v1 )

ライセンス: Link先を確認

Yizhen Xu, Joseph W. Hogan, Michael J. Daniels, Rami Kantor, Ann Mwangi

(参考訳) multinomial probit bayesian additive regression trees (mpbart) フレームワークはkindoらによって提案された。 (kd)は,マルチノミナルプロビット(mnp)モデルの潜在ユーティリティをbart(chipman et al.)で近似する。 2010). 多項ロジスティックモデルと比較して、MNPは独立した代替案を仮定せず、多変量ガウス分布潜在ユーティリティを通して代替案間の相関構造を特定できる。我々はMPBARTに適合する2つの新しいアルゴリズムを導入し、提案手法の理論的混合速度が既存のKDアルゴリズムと等しいか優れていることを示す。シミュレーションを通じて,提案手法のロバスト性,基準レベルの選択,結果周波数の不均衡,実用的誤差項に対する事前ハイパーパラメータの仕様について検討する。この研究は、ケニアのAMPATH(Academic Model Providing Access to Healthcare)からEHR(Electronic Health Record)に基づいて、HIV陽性患者の死亡率と罹患率に関する後続の予測分布を生成することによる。応用とシミュレーションの両方において,mcmc収束率と後方予測精度の観点からkdと比較して,提案手法により良好な性能が得られた。

The multinomial probit Bayesian additive regression trees (MPBART) framework was proposed by Kindo et al. (KD), approximating the latent utilities in the multinomial probit (MNP) model with BART (Chipman et al. 2010). Compared to multinomial logistic models, MNP does not assume independent alternatives and the correlation structure among alternatives can be specified through multivariate Gaussian distributed latent utilities. We introduce two new algorithms for fitting the MPBART and show that the theoretical mixing rates of our proposals are equal or superior to the existing algorithm in KD. Through simulations, we explore the robustness of the methods to the choice of reference level, imbalance in outcome frequencies, and the specifications of prior hyperparameters for the utility error term. The work is motivated by the application of generating posterior predictive distributions for mortality and engagement in care among HIV-positive patients based on electronic health records (EHRs) from the Academic Model Providing Access to Healthcare (AMPATH) in Kenya. In both the application and simulations, we observe better performance using our proposals as compared to KD in terms of MCMC convergence rate and posterior predictive accuracy.

翻訳日:2021-03-27 06:02:10 公開日:2021-01-18

# 深層ニューラルネットワークによる形状不確かさ定量化における非スムース量の推定

Deep neural network surrogates for non-smooth quantities of interest in shape uncertainty quantification ( http://arxiv.org/abs/2101.07023v1 )

ライセンス: Link先を確認

Laura Scarabosio

(参考訳) 幾何学的不確かさを持つ界面問題に対する解のポイント評価について検討し、障害物の不確かさを高次元パラメータ $\boldsymbol{y}\in[-1,1]^d$, $d\in\mathbb{n}$ で記述する。特に楕円型インタフェース問題とヘルムホルツ伝送問題に焦点を当てる。物理領域における解のポイント値は、高次元パラメータに非スムースに依存し、サーロゲートの構築に関心がある場合の課題となる。実際、高次法は収束率が低いが、不連続性を追跡する方法は通常、いわゆる次元性の呪いに悩まされる。そこで本研究では,深層ニューラルネットワークを用いた点評価のためのサロゲートの構築を提案する。ニューラルネットワークが優れたサロゲートを提供する理由を理論的に正当化する。さらに, 実運用における優れた性能を示す広範な数値実験を行った。特に,ニューラルネットワークが次元の呪いに苦しむことはないことを観察し,点評定点数(つまり,パラメータ空間における不連続点数)に対する誤差の依存性や,2つの物質間のコントラストやヘルムホルツ伝達問題に対するヘルムホルツ伝達問題などのモデリングパラメータについて検討した。

We consider the point evaluation of the solution to interface problems with geometric uncertainties, where the uncertainty in the obstacle is described by a high-dimensional parameter $\boldsymbol{y}\in[-1,1]^d$, $d\in\mathbb{N}$. We focus in particular on an elliptic interface problem and a Helmholtz transmission problem. Point values of the solution in the physical domain depend in general non-smoothly on the high-dimensional parameter, posing a challenge when one is interested in building surrogates. Indeed, high-order methods show poor convergence rates, while methods which are able to track discontinuities usually suffer from the so-called curse of dimensionality. For this reason, in this work we propose to build surrogates for point evaluation using deep neural networks. We provide a theoretical justification for why we expect neural networks to provide good surrogates. Furthermore, we present extensive numerical experiments showing their good performance in practice. We observe in particular that neural networks do not suffer from the curse of dimensionality, and we study the dependence of the error on the number of point evaluations (that is, the number of discontinuities in the parameter space), as well as on several modeling parameters, such as the contrast between the two materials and, for the Helmholtz transmission problem, the wavenumber.

翻訳日:2021-03-27 06:01:52 公開日:2021-01-18

# ランダムウォークに基づくネットワーク埋め込みアルゴリズムの一貫性

Consistency of random-walk based network embedding algorithms ( http://arxiv.org/abs/2101.07354v1 )

ライセンス: Link先を確認

Yichi Zhang, Minh Tang

(参考訳) node2vecやDeepWalkのようなランダムウォークベースのネットワーク埋め込みアルゴリズムは、ダウンストリームネットワーク推論タスクを実行する前にネットワーク内のノードのユークリッド表現を得るために広く使用されている。しかし、その印象的な経験的パフォーマンスにもかかわらず、その振る舞いを説明する理論的結果の欠如がある。本稿では行列分解の観点から, node2vec と DeepWalk のアルゴリズムについて検討した。確率的ブロックモデルグラフに対するコミュニティ検出の設定において,これらのアルゴリズムを解析し,特に,大きなサンプル誤差境界を設定し,node2vec/deepwalk埋め込みとk-meansクラスタリングの一貫したコミュニティ回復を証明した。理論的には,観測されたネットワークのスパース性,ランダムウォークのウィンドウサイズ,node2vec/deepwalk埋め込みの収束率と,真だが未知のエッジ確率行列の埋め込みとの微妙な相互作用を示す。より具体的には、ネットワークがスペーサー化するにつれて、より大きなウィンドウサイズ、または同等に長いランダムウォークを用いて、結果としての埋め込みの収束率を改善することが提案される。これらの観測を裏付ける数値実験を含む。

Random-walk based network embedding algorithms like node2vec and DeepWalk are widely used to obtain Euclidean representation of the nodes in a network prior to performing down-stream network inference tasks. Nevertheless, despite their impressive empirical performance, there is a lack of theoretical results explaining their behavior. In this paper we studied the node2vec and DeepWalk algorithms through the perspective of matrix factorization. We analyze these algorithms in the setting of community detection for stochastic blockmodel graphs; in particular we established large-sample error bounds and prove consistent community recovery of node2vec/DeepWalk embedding followed by k-means clustering. Our theoretical results indicate a subtle interplay between the sparsity of the observed networks, the window sizes of the random walks, and the convergence rates of the node2vec/DeepWalk embedding toward the embedding of the true but unknown edge probabilities matrix. More specifically, as the network becomes sparser, our results suggest using larger window sizes, or equivalently, taking longer random walks, in order to attain better convergence rate for the resulting embeddings. The paper includes numerical experiments corroborating these observations.

翻訳日:2021-03-27 06:01:27 公開日:2021-01-18

# 還元フラックスCTのための深層学習型ノイズ低減

Deep-Learning Driven Noise Reduction for Reduced Flux Computed Tomography ( http://arxiv.org/abs/2101.07376v1 )

ライセンス: Link先を確認

Khalid L. Alsamadony, Ertugrul U. Yildirim, Guenther Glatz, Umair bin Waheed, Sherif M. Hanafy

(参考訳) ディープニューラルネットワークは、特に放射線リスクの低減に関して、臨床画像にかなりの注目を集めている。光子フラックスを低減して放射線線量を減らすと、スキャンされた画像の品質が低下する。そこで研究者たちは、深層畳み込みニューラルネットワーク(dcnn)を利用して、低品質の低用量画像を高用量で高品質な画像にマッピングすることで、関連する放射線ハザードを最小限に抑えることを模索している。逆に、地球物質のCT(Computerd tomography)測定は放射線線量によって制限されない。しかしながら、人体とは対照的に、地球材料は高密度成分からなり、X線の減衰が増大する可能性がある。したがって、スキャン品質を得るためには、より高い量画像が必要である。マイクロCTベースの走査技術では, 長期取得の問題は特に深刻である。サンプルのサイズや露出時間の設定によっては、1回のスキャンで完了するには数時間を要する。これは、指数温度依存性の現象が解明される場合、特に懸念される。プロセスは、CTスキャンによって適切にキャプチャされるには早すぎるかもしれない。以上の課題に対処するため, 岩盤CT画像の品質向上と露光時間の60%以上短縮にDCNNを適用した。我々は、マイクロCTから得られたデータセットに基づいて現在の結果を強調し、DCNNの結果を改善するために転送学習を適用した。この手法は、あらゆる計算トモグラフィー技術に適用できる。さらに、平均二乗誤差や構造的類似度指数などの異なる損失関数を最小化するDCNNの性能を比較検討する。

Deep neural networks have received considerable attention in clinical imaging, particularly with respect to the reduction of radiation risk. Lowering the radiation dose by reducing the photon flux inevitably results in the degradation of the scanned image quality. Thus, researchers have sought to exploit deep convolutional neural networks (DCNNs) to map low-quality, low-dose images to higher-dose, higher-quality images thereby minimizing the associated radiation hazard. Conversely, computed tomography (CT) measurements of geomaterials are not limited by the radiation dose. In contrast to the human body, however, geomaterials may be comprised of high-density constituents causing increased attenuation of the X-Rays. Consequently, higher dosage images are required to obtain an acceptable scan quality. The problem of prolonged acquisition times is particularly severe for micro-CT based scanning technologies. Depending on the sample size and exposure time settings, a single scan may require several hours to complete. This is of particular concern if phenomena with an exponential temperature dependency are to be elucidated. A process may happen too fast to be adequately captured by CT scanning. To address the aforementioned issues, we apply DCNNs to improve the quality of rock CT images and reduce exposure times by more than 60\%, simultaneously. We highlight current results based on micro-CT derived datasets and apply transfer learning to improve DCNN results without increasing training time. The approach is applicable to any computed tomography technology. Furthermore, we contrast the performance of the DCNN trained by minimizing different loss functions such as mean squared error and structural similarity index.

翻訳日:2021-03-27 06:01:07 公開日:2021-01-18

# フェアネス達成のための最適前処理と全変動バリーセンタとの関係

Optimal Pre-Processing to Achieve Fairness and Its Relationship with Total Variation Barycenter ( http://arxiv.org/abs/2101.06811v1 )

ライセンス: Link先を確認

Farhad Farokhi

(参考訳) 我々は異なる影響、すなわち、アウトプットを観察する確率が人種や性別などの保護された属性に依存する程度を使って公正さを測定する。保護属性が与えられた入力の分布の合計変動距離によって異なる影響が上界であることが証明された。次に、公正性を強制するために前処理(データ修復とも呼ばれる)を使用します。本研究では,データの事前処理による予測モデルの成功度が,前処理前後におけるデータ分布のばらつき距離によって上限されることを示す。これにより、保護された属性の分布間の総変分距離の制約を受ける前処理前後のデータ分布間の総変分距離を最小化して、公正性を確保するための最適な前処理連隊を見つけることができる。この問題は効率的に解くことができる線形プログラムである。確率空間内の距離を全変動距離で測定した場合,この問題は2つの分布の重心(すなわち質量中心)の発見と密接に関連していることを示す。また,差分プライバシーが公平性に及ぼす影響を,全変動距離を用いて検討した。実践データセットを用いて数値実験を行い,実験結果を示す。

We use disparate impact, i.e., the extent that the probability of observing an output depends on protected attributes such as race and gender, to measure fairness. We prove that disparate impact is upper bounded by the total variation distance between the distribution of the inputs given the protected attributes. We then use pre-processing, also known as data repair, to enforce fairness. We show that utility degradation, i.e., the extent that the success of a forecasting model changes by pre-processing the data, is upper bounded by the total variation distance between the distribution of the data before and after pre-processing. Hence, the problem of finding the optimal pre-processing regiment for enforcing fairness can be cast as minimizing total variations distance between the distribution of the data before and after pre-processing subject to a constraint on the total variation distance between the distribution of the inputs given protected attributes. This problem is a linear program that can be efficiently solved. We show that this problem is intimately related to finding the barycenter (i.e., center of mass) of two distributions when distances in the probability space are measured by total variation distance. We also investigate the effect of differential privacy on fairness using the proposed the total variation distances. We demonstrate the results using numerical experimentation with a practice dataset.

翻訳日:2021-03-27 06:00:45 公開日:2021-01-18

# インクリメンタル知識に基づく質問応答

Incremental Knowledge Based Question Answering ( http://arxiv.org/abs/2101.06938v1 )

ライセンス: Link先を確認

Yongqi Li, Wenjie Li, Liqiang Nie

(参考訳) 近年,知識ベースの質問回答 (KBQA) は,知識ベースで事実を用いて自然言語の質問に答えることを目的としている。既存のアプローチは静的な知識ベースを想定することが多い。しかし、知識は現実世界で時間とともに進化している。進化する知識ベースに微調整戦略を直接適用すれば、深刻な破滅的な忘れの問題に悩まされるでしょう。本稿では,人間と同じように学習能力を段階的に拡大できる新しいインクリメンタルkbqa学習フレームワークを提案する。具体的には、知識蒸留を生かして壊滅的な忘れる問題を克服するために、マージン蒸留損失と協調抽出方法とを含む。提案するインクリメンタル学習ソリューションを評価するために,simplequestionデータセットを再編成した。包括的な実験は、進化する知識ベースに取り組む際にその効果と効率を示す。

In the past years, Knowledge-Based Question Answering (KBQA), which aims to answer natural language questions using facts in a knowledge base, has been well developed. Existing approaches often assume a static knowledge base. However, the knowledge is evolving over time in the real world. If we directly apply a fine-tuning strategy on an evolving knowledge base, it will suffer from a serious catastrophic forgetting problem. In this paper, we propose a new incremental KBQA learning framework that can progressively expand learning capacity as humans do. Specifically, it comprises a margin-distilled loss and a collaborative exemplar selection method, to overcome the catastrophic forgetting problem by taking advantage of knowledge distillation. We reorganize the SimpleQuestion dataset to evaluate the proposed incremental learning solution to KBQA. The comprehensive experiments demonstrate its effectiveness and efficiency when working with the evolving knowledge base.

翻訳日:2021-03-27 05:59:56 公開日:2021-01-18

# HinFlair: Hindi言語におけるposタグとテキスト分類のための事前訓練された文脈文字列埋め込み

HinFlair: pre-trained contextual string embeddings for pos tagging and text classification in the Hindi language ( http://arxiv.org/abs/2101.06949v1 )

ライセンス: Link先を確認

Harsh Patel

(参考訳) 繰り返しニューラルネットワークとトランスフォーマーアーキテクチャに基づく言語モデルの最近の進歩は、posタグ付け、名前付きエンティティ認識、テキスト分類など、幅広い自然言語処理タスクにおいて最先端の結果を得た。しかし、これらの言語モデルのほとんどは、英語、ドイツ語、スペイン語のような高資源言語で事前学習されている。多言語言語モデルはヒンディー語、テルグ語、ベンガル語などのインドの言語を訓練用コーパスに含んでいるが、これらの言語が研究の主要な言語ではないため、言語の特徴を表現できないことが多い。 HinFlairは、巨大な単言語Hindiコーパス上で事前訓練された言語表現モデル(コンテキスト文字列埋め込み)である。 6つのテキスト分類データセットとヒンディー語の依存木バンクを用いて、ヒンディー語のコンテキスト化文字列埋め込みの性能を分析する実験を行った。結果は、HinFlairが、テキスト分類やposタグ付けといった下流タスクのために、既存の最先端の公開トレーニング済みの埋め込みよりも優れていることを示している。また、HinFlairとFastTextの埋め込みの組み合わせは、特にヒンディー語のために訓練された多くのトランスフォーマーベースの言語モデルより優れている。

Recent advancements in language models based on recurrent neural networks and transformers architecture have achieved state-of-the-art results on a wide range of natural language processing tasks such as pos tagging, named entity recognition, and text classification. However, most of these language models are pre-trained in high resource languages like English, German, Spanish. Multi-lingual language models include Indian languages like Hindi, Telugu, Bengali in their training corpus, but they often fail to represent the linguistic features of these languages as they are not the primary language of the study. We introduce HinFlair, which is a language representation model (contextual string embeddings) pre-trained on a large monolingual Hindi corpus. Experiments were conducted on 6 text classification datasets and a Hindi dependency treebank to analyze the performance of these contextualized string embeddings for the Hindi language. Results show that HinFlair outperforms previous state-of-the-art publicly available pre-trained embeddings for downstream tasks like text classification and pos tagging. Also, HinFlair when combined with FastText embeddings outperforms many transformers-based language models trained particularly for the Hindi language.

翻訳日:2021-03-27 05:59:45 公開日:2021-01-18

# caegcn:クロス・アテンション・フュージョンベースの強化グラフ畳み込みネットワーク

CaEGCN: Cross-Attention Fusion based Enhanced Graph Convolutional Network for Clustering ( http://arxiv.org/abs/2101.06883v1 )

ライセンス: Link先を確認

Guangyu Huo, Yong Zhang, Junbin Gao, Boyue Wang, Yongli Hu, and Baocai Yin

(参考訳) 深層畳み込みネットワークの強力な学習能力により、深層クラスタリング手法は個々のデータから最も識別性の高い情報を抽出し、より良好なクラスタリング結果を生成することができる。しかしながら、既存のディープクラスタリング手法は通常、データ間の関係を無視する。幸いなことに、グラフ畳み込みネットワークはそのような関係を処理でき、ディープクラスタリングの新しい研究方向を開くことができる。 In this paper, we propose a cross-attention based deep clustering framework, named Cross-Attention Fusion based Enhanced Graph Convolutional Network (CaEGCN), which contains four main modules: the cross-attention fusion module which innovatively concatenates the Content Auto-encoder module (CAE) relating to the individual data and Graph Convolutional Auto-encoder module (GAE) relating to the relationship between the data in a layer-by-layer manner, and the self-supervised model that highlights the discriminative information for clustering tasks. クロスアテンション融合モジュールは2種類の異種表現を融合するが、CAEモジュールはGAEモジュールのコンテンツ情報を補完し、GCNの過度に平滑な問題を回避する。 GAEモジュールでは、各データの内容と関係を再構成する2つの新しい損失関数が提案されている。最後に、自己教師付きモジュールは、CAEとGAEの中間層表現の分布を一貫性に制約する。異なるタイプのデータセットに対する実験結果は、提案したCaEGCNの優位性と堅牢性を証明する。

With the powerful learning ability of deep convolutional networks, deep clustering methods can extract the most discriminative information from individual data and produce more satisfactory clustering results. However, existing deep clustering methods usually ignore the relationship between the data. Fortunately, the graph convolutional network can handle such relationship, opening up a new research direction for deep clustering. In this paper, we propose a cross-attention based deep clustering framework, named Cross-Attention Fusion based Enhanced Graph Convolutional Network (CaEGCN), which contains four main modules: the cross-attention fusion module which innovatively concatenates the Content Auto-encoder module (CAE) relating to the individual data and Graph Convolutional Auto-encoder module (GAE) relating to the relationship between the data in a layer-by-layer manner, and the self-supervised model that highlights the discriminative information for clustering tasks. While the cross-attention fusion module fuses two kinds of heterogeneous representation, the CAE module supplements the content information for the GAE module, which avoids the over-smoothing problem of GCN. In the GAE module, two novel loss functions are proposed that reconstruct the content and relationship between the data, respectively. Finally, the self-supervised module constrains the distributions of the middle layer representations of CAE and GAE to be consistent. Experimental results on different types of datasets prove the superiority and robustness of the proposed CaEGCN.

翻訳日:2021-03-27 05:59:26 公開日:2021-01-18

# 認知症者における動揺・尿路感染症のリスク分析のための注意モデル

An attention model to analyse the risk of agitation and urinary tract infections in people with dementia ( http://arxiv.org/abs/2101.07007v1 )

ライセンス: Link先を確認

Honglin Li, Roonak Rezvani, Magdalena Anita Kolanko, David J. Sharp, Maitreyee Wairagkar, Ravi Vaidyanathan, Ramin Nilforooshan, Payam Barnaghi

(参考訳) 行動症状と尿路感染症(uti)は認知症患者が直面する最も一般的な問題である。これらの状況を管理する上で重要な課題の1つは、苦痛を減らし、未計画の入院を避けるために早期発見とタイムリーな介入である。センサーデータの統合と分析に家庭内センシング技術と機械学習モデルを使用することで、臨床的に重要な出来事や健康状態の変化を検出し予測する機会が得られる。我々は,家庭内センサデータを収集する統合プラットフォームを開発し,機械学習モデルを扇動およびUTIリスク分析に適用するための観察的研究を行った。平均年齢82名,標準偏差6.5名(女性47名,男性41名)の88名から大規模なデータセットを収集し,注意と合理的なメカニズムを利用した新しい深層学習モデルの評価を行った。提案手法では,大量のデータを一定時間にわたって処理し,時系列データから重要なパターンを抽出することができる。注意) 抽出された特徴とパターンを使用してリスク分析モデル(すなわち、リスク分析モデル)をトレーニングする。合理的)。提案モデルでは,時系列データにおいてどの時間ステップと特徴が使用されるかを示すことにより,予測を説明することができる。本モデルでは, 排卵リスクとウティスの検出において, 91\%のリコールと83\%の精度を提供する。このモデルは、初期治療や早期介入アプローチと連動して、ウティスなどの病態の早期検出や興奮などの神経精神症状の管理に使用できる。本研究は,提案モデルが生成した警告を用いて早期介入を行うための臨床経路のセットを開発し,このプラットフォームを用いて,作成した介入計画に従って警告に応答する臨床モニタリングチームを設置した。

Behavioural symptoms and urinary tract infections (UTI) are among the most common problems faced by people with dementia. One of the key challenges in the management of these conditions is early detection and timely intervention in order to reduce distress and avoid unplanned hospital admissions. Using in-home sensing technologies and machine learning models for sensor data integration and analysis provides opportunities to detect and predict clinically significant events and changes in health status. We have developed an integrated platform to collect in-home sensor data and performed an observational study to apply machine learning models for agitation and UTI risk analysis. We collected a large dataset from 88 participants with a mean age of 82 and a standard deviation of 6.5 (47 females and 41 males) to evaluate a new deep learning model that utilises attention and rational mechanism. The proposed solution can process a large volume of data over a period of time and extract significant patterns in a time-series data (i.e. attention) and use the extracted features and patterns to train risk analysis models (i.e. rational). The proposed model can explain the predictions by indicating which time-steps and features are used in a long series of time-series data. The model provides a recall of 91\% and precision of 83\% in detecting the risk of agitation and UTIs. This model can be used for early detection of conditions such as UTIs and managing of neuropsychiatric symptoms such as agitation in association with initial treatment and early intervention approaches. In our study we have developed a set of clinical pathways for early interventions using the alerts generated by the proposed model and a clinical monitoring team has been set up to use the platform and respond to the alerts according to the created intervention plans.

翻訳日:2021-03-27 05:59:05 公開日:2021-01-18

# 新しく得られた有効観測値の光によるデータ欠落検出

Data Obsolescence Detection in the Light of Newly Acquired Valid Observations ( http://arxiv.org/abs/2101.07067v1 )

ライセンス: Link先を確認

Salma Chaieb and Ali Ben Mrad and Brahim Hnich and V\'eronique Delcroix

(参考訳) システムまたは人の状態を記述する情報は、常に進化し、時代遅れになり、他の情報と矛盾する可能性がある。したがってデータベースは、データベースに含まれる時代遅れのものと矛盾する、新しい有効な観測の取得によって一貫して更新されなければならない。本稿では,情報陳腐化問題に対処するための新しい手法を提案する。提案手法は,観測結果間の矛盾をリアルタイムに検出し,表現モデルから古いものを特定することを目的としている。情報不足を特徴とする不確実な環境下で作業するため、ベイズネットワークを表現モデルとして使用し、新しい近似概念である$\epsilon$-Contradictionを提案する。新しい概念は、一連の観測において矛盾を持つ自信レベルによってパラメータ化される。本稿では,古い情報を検出する多項式時間アルゴリズムを提案する。結果として得られた古くなった情報は、単純な観測結果よりもAND-OR木の方がよいことを示す。最後に,本手法の有効性を高齢者の転倒防止データベースに示すとともに,この木を用いて医師に信頼できる推薦を行う方法を示す。我々の実験は体系的にも実質的にも良い結果をもたらす。

The information describing the conditions of a system or a person is constantly evolving and may become obsolete and contradict other information. A database, therefore, must be consistently updated upon the acquisition of new valid observations that contradict obsolete ones contained in the database. In this paper, we propose a novel approach for dealing with the information obsolescence problem. Our approach aims to detect, in real-time, contradictions between observations and then identify the obsolete ones, given a representation model. Since we work within an uncertain environment characterized by the lack of information, we choose to use a Bayesian network as our representation model and propose a new approximate concept, $\epsilon$-Contradiction. The new concept is parameterised by a confidence level of having a contradiction in a set of observations. We propose a polynomial-time algorithm for detecting obsolete information. We show that the resulting obsolete information is better represented by an AND-OR tree than a simple set of observations. Finally, we demonstrate the effectiveness of our approach on a real elderly fall-prevention database and showcase how this tree can be used to give reliable recommendations to doctors. Our experiments give systematically and substantially very good results.

翻訳日:2021-03-27 05:58:39 公開日:2021-01-18

# 人間と機械理解の不協和性

Dissonance Between Human and Machine Understanding ( http://arxiv.org/abs/2101.07337v1 )

ライセンス: Link先を確認

Zijian Zhang, Jaspreet Singh, Ujwal Gadiraju, Avishek Anand

(参考訳) 複雑な機械学習モデルは、現在医療や自動運転車を含むいくつかの重要な領域にデプロイされている。その結果、このような複雑なモデルの決定を人間に説明するための解釈が近年急増している。タスクの人間の解釈に対応するモデルは、特定のコンテキストにおいてより望ましいものであり、責任の属性、信頼の構築、バイアスの顕在化、よりよいモデルの構築に役立つ。したがって、どのモデルがタスクの人間の理解にどのように準拠しているかを理解することが重要である。本稿では,画像分類タスクのレンズを通して,人間と機械の理解の不協和性を明らかにし,定量化する大規模クラウドソーシング研究を行う。特に、我々は以下の質問に答えようとしている。どの(十分にパフォーマンスのよい)複雑なMLモデルが、正確な予測を行うために機能を使用することで、人間に近づきつつあるか? タスクの難易度は、人間と比較して機械の機能選択能力にどのように影響するか? 人間は画像認識をより正確にする機能を選択するのに一貫して優れているか? 私たちの発見は、人工知能の分野における長期的な目標は、人間のように学習し推論できる機械を作ることであると考え、人間と機械のコラボレーションに重要な意味を持つ。

Complex machine learning models are deployed in several critical domains including healthcare and autonomous vehicles nowadays, albeit as functional black boxes. Consequently, there has been a recent surge in interpreting decisions of such complex models in order to explain their actions to humans. Models that correspond to human interpretation of a task are more desirable in certain contexts and can help attribute liability, build trust, expose biases and in turn build better models. It is, therefore, crucial to understand how and which models conform to human understanding of tasks. In this paper, we present a large-scale crowdsourcing study that reveals and quantifies the dissonance between human and machine understanding, through the lens of an image classification task. In particular, we seek to answer the following questions: Which (well-performing) complex ML models are closer to humans in their use of features to make accurate predictions? How does task difficulty affect the feature selection capability of machines in comparison to humans? Are humans consistently better at selecting features that make image recognition more accurate? Our findings have important implications on human-machine collaboration, considering that a long term goal in the field of artificial intelligence is to make machines capable of learning and reasoning like humans.

翻訳日:2021-03-27 05:58:23 公開日:2021-01-18

# 無標識植物病画像に対するカオス的微細クラスタリング

Chaotic-to-Fine Clustering for Unlabeled Plant Disease Images ( http://arxiv.org/abs/2101.06820v1 )

ライセンス: Link先を確認

Uno Fang, Jianxin Li, Xuequan Lu, Mumtaz Ali, Longxiang Gao and Yong Xiang

(参考訳) 現在の植物病画像の注釈は、農業の専門家による手作業による仕分けと手作りの特徴に依存する。本稿では,Kernel K-meansの脆弱性に基づいた,植物病画像のグループ化のための自己組織化クラスタリングフレームワークを提案する。主なアイデアは、カーネルk-meansに基づくクロスイテレーティブなアンダークラスタ化アルゴリズムを確立し、擬似ラベルトレーニングセットとカオスクラスタを生成し、さらにディープラーニングモジュールによって分類することである。提案手法の有効性を検証するため,植物5種と植物17種の3種類の病原体について広範な実験を行った。画像に基づく植物病の分類をバランスとバランスのとれないデータセットに対して, 既存の5つの著作物と異なる指標を用いて比較し, 高い優越性を示した。

Current annotation for plant disease images depends on manual sorting and handcrafted features by agricultural experts, which is time-consuming and labour-intensive. In this paper, we propose a self-supervised clustering framework for grouping plant disease images based on the vulnerability of Kernel K-means. The main idea is to establish a cross iterative under-clustering algorithm based on Kernel K-means to produce the pseudo-labeled training set and a chaotic cluster to be further classified by a deep learning module. In order to verify the effectiveness of our proposed framework, we conduct extensive experiments on three different plant disease datatsets with five plants and 17 plant diseases. The experimental results show the high superiority of our method to do image-based plant disease classification over balanced and unbalanced datasets by comparing with five state-of-the-art existing works in terms of different metrics.

翻訳日:2021-03-27 05:58:05 公開日:2021-01-18

# CFC-Net:リモートセンシング画像における任意指向物体検出のための重要な特徴キャプチャネットワーク

CFC-Net: A Critical Feature Capturing Network for Arbitrary-Oriented Object Detection in Remote Sensing Images ( http://arxiv.org/abs/2101.06849v1 )

ライセンス: Link先を確認

Qi Ming, Lingjuan Miao, Zhiqiang Zhou, Yunpeng Dong

(参考訳) 光リモートセンシング画像における物体検出は重要かつ困難な課題である。近年,畳み込みニューラルネットワークに基づく手法が進歩している。しかし, 物体スケール, アスペクト比, 任意の方向のばらつきが大きいため, 検出性能がさらに向上することは困難である。本稿では,物体検出における識別的特徴の役割について検討し,特徴表現の構築,事前設定アンカーの改良,ラベル割り当ての最適化という3つの側面から検出精度を向上させるために,cfc-net (critical feature capture network) を提案する。具体的には、まず分類と回帰の特徴を分離し、次に分極注意モジュール(pam)を介して各タスクに適応したロバストな重要な特徴を構築する。抽出した識別回帰特性により、R-ARM(Rotation Anchor Refinement Module)は、予め設定された水平アンカーの局所化処理を行い、より優れたローテーションアンカーを得る。次に、ダイナミックアンカー学習(DAL)戦略により、重要な特徴を捉える能力に基づいて、高品質なアンカーを適応的に選択する。提案フレームワークは、リモートセンシング画像におけるオブジェクトのより強力なセマンティック表現を生成し、高性能なリアルタイムオブジェクト検出を実現する。 HRSC2016, DOTA, UCAS-AODの3つのリモートセンシングデータセットによる実験結果から, 本手法は多くの最先端手法と比較して優れた検出性能を示すことが示された。コードとモデルはhttps://github.com/ming71/cfc-netで入手できる。

Object detection in optical remote sensing images is an important and challenging task. In recent years, the methods based on convolutional neural networks have made good progress. However, due to the large variation in object scale, aspect ratio, and arbitrary orientation, the detection performance is difficult to be further improved. In this paper, we discuss the role of discriminative features in object detection, and then propose a Critical Feature Capturing Network (CFC-Net) to improve detection accuracy from three aspects: building powerful feature representation, refining preset anchors, and optimizing label assignment. Specifically, we first decouple the classification and regression features, and then construct robust critical features adapted to the respective tasks through the Polarization Attention Module (PAM). With the extracted discriminative regression features, the Rotation Anchor Refinement Module (R-ARM) performs localization refinement on preset horizontal anchors to obtain superior rotation anchors. Next, the Dynamic Anchor Learning (DAL) strategy is given to adaptively select high-quality anchors based on their ability to capture critical features. The proposed framework creates more powerful semantic representations for objects in remote sensing images and achieves high-performance real-time object detection. Experimental results on three remote sensing datasets including HRSC2016, DOTA, and UCAS-AOD show that our method achieves superior detection performance compared with many state-of-the-art approaches. Code and models are available at https://github.com/ming71/CFC-Net.

翻訳日:2021-03-27 05:57:49 公開日:2021-01-18

# 野生における3次元物体形状復元の秘密

Secrets of 3D Implicit Object Shape Reconstruction in the Wild ( http://arxiv.org/abs/2101.06860v1 )

ライセンス: Link先を確認

Shivam Duggal, Zihao Wang, Wei-Chiu Ma, Sivabalan Manivasagam, Justin Liang, Shenlong Wang and Raquel Urtasun

(参考訳) 高忠実度3Dオブジェクトをスパースから再構成し、部分的な観察はコンピュータビジョン、ロボティクス、グラフィックスの様々な応用において重要である。最近のニューラル暗黙的モデリング手法は、合成されたデータセットや高密度なデータセットに対して有望な結果を示すが、それらはスパースでノイズの多い実世界のデータでは不十分である。本稿では,ニューラルネットワークを用いたニューラル暗黙モデルの性能低下の根本原因を解析する。この制限は、非常に複雑な目的、正規化の欠如、そして初期化の欠如によるものである。これらの問題を克服するために、 (i) 潜時コード最適化のためのより良い、より安定した初期化を提供するディープエンコーダ、 (ii) 形状の忠実度を高めるための事前モデルとして機能するディープディミネータの2つの簡単な修正を導入する。提案手法は実車2台について評価し,最先端の3dオブジェクト復元法よりも優れた性能を示す。

Reconstructing high-fidelity 3D objects from sparse, partial observation is of crucial importance for various applications in computer vision, robotics, and graphics. While recent neural implicit modeling methods show promising results on synthetic or dense datasets, they perform poorly on real-world data that is sparse and noisy. This paper analyzes the root cause of such deficient performance of a popular neural implicit model. We discover that the limitations are due to highly complicated objectives, lack of regularization, and poor initialization. To overcome these issues, we introduce two simple yet effective modifications: (i) a deep encoder that provides a better and more stable initialization for latent code optimization; and (ii) a deep discriminator that serves as a prior model to boost the fidelity of the shape. We evaluate our approach on two real-wold self-driving datasets and show superior performance over state-of-the-art 3D object reconstruction methods.

翻訳日:2021-03-27 05:57:23 公開日:2021-01-18

# Label-Efficient Point Cloud Semantic Segmentation: アクティブラーニングアプローチ

Label-Efficient Point Cloud Semantic Segmentation: An Active Learning Approach ( http://arxiv.org/abs/2101.06931v1 )

ライセンス: Link先を確認

Xian Shi, Xun Xu, Ke Chen, Lile Cai, Chuan Sheng Foo, Kui Jia

(参考訳) 3Dポイントクラウドのセマンティックセグメンテーションは、大量のラベル付きデータによる深層モデルのトレーニングに依存している。しかし、3Dポイントクラウドのラベル付けは高価であり、データアノテーションに対する賢いアプローチである。ラベル効率のよいクラウドセグメンテーションには,アクティブな学習が不可欠だ。そこで本研究では,より現実的なアノテーションカウントスキームを提案する。ラベル付け予算をよりうまく活用するために,我々は,点クラウド幾何上で定義された多様体を利用するスーパーポイントベースのアクティブラーニング戦略を採用する。さらに,形状レベルの多様性と局所的空間的一貫性制約を促進するためのアクティブラーニング戦略を提案する。 2つのベンチマークデータセットにおける実験により,提案手法がポイントクラウドのラベル効率の高い意味セグメンテーションに有効であることを示す。特に、あらゆるレベルのアノテーション予算において大幅な改善を実現し、同じレベルのアノテーションコストで最先端のメソッドよりも優れています。

Semantic segmentation of 3D point clouds relies on training deep models with a large amount of labeled data. However, labeling 3D point clouds is expensive, thus smart approach towards data annotation, a.k.a. active learning is essential to label-efficient point cloud segmentation. In this work, we first propose a more realistic annotation counting scheme so that a fair benchmark is possible. To better exploit labeling budget, we adopt a super-point based active learning strategy where we make use of manifold defined on the point cloud geometry. We further propose active learning strategy to encourage shape level diversity and local spatial consistency constraint. Experiments on two benchmark datasets demonstrate the efficacy of our proposed active learning strategy for label-efficient semantic segmentation of point clouds. Notably, we achieve significant improvement at all levels of annotation budgets and outperform the state-of-the-art methods under the same level of annotation cost.

翻訳日:2021-03-27 05:56:46 公開日:2021-01-18

# 3次元意味セグメンテーションにおける領域適応のためのクロスモーダル学習

Cross-modal Learning for Domain Adaptation in 3D Semantic Segmentation ( http://arxiv.org/abs/2101.07253v1 )

ライセンス: Link先を確認

Maximilian Jaritz, Tuan-Hung Vu, Raoul de Charette, \'Emilie Wirbel, and Patrick P\'erez

(参考訳) ドメイン適応はラベルが不足している場合の学習を可能にする重要なタスクである。ほとんどの作業はイメージモダリティのみに焦点を当てているが、多くの重要なマルチモーダルデータセットが存在する。ドメイン適応にマルチモーダルを活用するために,相互模倣による2つのモーダルの予測の整合性を実現するクロスモーダル学習を提案する。我々は、ラベル付きデータに対する正確な予測とラベルなしのターゲットドメインデータに対するモダリティ間の一貫性のある予測をネットワークに制限する。教師なしおよび半教師付きドメイン適応設定の実験は、この新しいドメイン適応戦略の有効性を証明している。具体的には,画像と点クラウドモダリティを用いて3次元意味セグメンテーションのタスクを評価する。最近の自動運転データセットを利用して、シーンレイアウトの変更、照明、センサーの設定、天気、合成から現実への設定など、さまざまなドメイン適応シナリオを作成します。本手法は,すべての適応シナリオにおいて,以前のユニモーダル適応ベースラインよりも大幅に向上する。コードは利用可能になる。

Domain adaptation is an important task to enable learning when labels are scarce. While most works focus only on the image modality, there are many important multi-modal datasets. In order to leverage multi-modality for domain adaptation, we propose cross-modal learning, where we enforce consistency between the predictions of two modalities via mutual mimicking. We constrain our network to make correct predictions on labeled data and consistent predictions across modalities on unlabeled target-domain data. Experiments in unsupervised and semi-supervised domain adaptation settings prove the effectiveness of this novel domain adaptation strategy. Specifically, we evaluate on the task of 3D semantic segmentation using the image and point cloud modality. We leverage recent autonomous driving datasets to produce a wide variety of domain adaptation scenarios including changes in scene layout, lighting, sensor setup and weather, as well as the synthetic-to-real setup. Our method significantly improves over previous uni-modal adaptation baselines on all adaption scenarios. Code will be made available.

翻訳日:2021-03-27 05:56:25 公開日:2021-01-18

# 診断用キャプション:サーベイ

Diagnostic Captioning: A Survey ( http://arxiv.org/abs/2101.07299v1 )

ライセンス: Link先を確認

John Pavlopoulos, Vasiliki Kougia, Ion Androutsopoulos, Dimitris Papamichail

(参考訳) 診断用キャプション(DC)は、検査中に収集した患者の医療画像から診断用テキストを自動的に生成するものである。 DCは経験の浅い医師を助け、臨床ミスを減らすことができる。経験豊富な医師がより早く診断レポートを作成するのに役立つ。ディープラーニングの進歩、特に一般的なイメージキャプションにおいて、DCは近年より注目を集め、いくつかのシステムやデータセットにつながった。この記事では、DCの概要を概観する。関連するデータセット、評価基準、および最新のシステムを示す。また、DCの進歩を妨げる欠点を強調し、今後の方向性を提案する。

Diagnostic Captioning (DC) concerns the automatic generation of a diagnostic text from a set of medical images of a patient collected during an examination. DC can assist inexperienced physicians, reducing clinical errors. It can also help experienced physicians produce diagnostic reports faster. Following the advances of deep learning, especially in generic image captioning, DC has recently attracted more attention, leading to several systems and datasets. This article is an extensive overview of DC. It presents relevant datasets, evaluation measures, and up to date systems. It also highlights shortcomings that hinder DC's progress and proposes future directions.

翻訳日:2021-03-27 05:56:10 公開日:2021-01-18

# パートベース表現の探索による顔認証の改善

Improving Makeup Face Verification by Exploring Part-Based Representations ( http://arxiv.org/abs/2101.07338v1 )

ライセンス: Link先を確認

Marcus de Assis Angeloni and Helio Pedrini

(参考訳) 近年,世界の顔認識市場規模が増加している。畳み込みニューラルネットワークの採用による顔認識技術の大幅な進歩にもかかわらず、顔に化粧があるようなオープンな課題がまだ残っている。この課題に対処するために,現在の全体表現と融合する顔部品の採用を提案,評価する。顔面部は4つの領域(左眼,右眼,鼻,口)と3つの顔面(上,中,下)の2つの戦略を提案する。 4つの公開メイクアップフェイスデータセットと難解なクロスデータセットプロトコルによって得られた実験結果は、顔部分から抽出された深い特徴と全体表現との融合により、顔認証システムの精度が向上し、cnnモデルのリトレーニングなしにエラーレートが低下することを示している。提案したパイプラインは,YMUデータセットの最先端性能と,他の3つのデータセット(EMFD,FAM,M501)の競合結果を得た。

Recently, we have seen an increase in the global facial recognition market size. Despite significant advances in face recognition technology with the adoption of convolutional neural networks, there are still open challenges, as when there is makeup in the face. To address this challenge, we propose and evaluate the adoption of facial parts to fuse with current holistic representations. We propose two strategies of facial parts: one with four regions (left periocular, right periocular, nose and mouth) and another with three facial thirds (upper, middle and lower). Experimental results obtained in four public makeup face datasets and in a challenging cross-dataset protocol show that the fusion of deep features extracted of facial parts with holistic representation increases the accuracy of face verification systems and decreases the error rates, even without any retraining of the CNN models. Our proposed pipeline achieved state-of-the-art performance for the YMU dataset and competitive results for other three datasets (EMFD, FAM and M501).

翻訳日:2021-03-27 05:56:03 公開日:2021-01-18

# 変圧器モデルの通路再配置における位置偏りの軽減

Mitigating the Position Bias of Transformer Models in Passage Re-Ranking ( http://arxiv.org/abs/2101.06980v1 )

ライセンス: Link先を確認

Sebastian Hofst\"atter, Aldo Lipani, Sophia Althammer, Markus Zlabinger, Allan Hanbury

(参考訳) 教師付き機械学習モデルとその評価は、基礎となるデータセットの品質に大きく依存する。関連した情報を検索すると、指定された通路のどこにでも現れる可能性がある。しかし,文中の正しい回答の位置の偏りを,文節の再ランキングに用いる2つの一般的な質問応答データセットで観察する。通路内の初期の位置を過度に好むことは望ましくない人工物である。これにより、トランスフォーマーベースの3つの一般的なリグレードモデルが、目に見えない通路で関連する部分を無視する。さらに、評価セットが同じバイアス分布から取られるので、そのバイアスに過度に適合するモデルは、真の効果を過大評価する。本研究では,データセットの位置バイアス,文脈表現,それらの検索結果への影響を分析する。本稿では,データセットのデバイアス化手法を提案する。以上の結果から,位置バイアスデータセットでトレーニングしたモデルでは,デバイアスデータセットで評価した場合,再評価の有効性が著しく低下することが示唆された。位置バイアスを緩和することにより、トランスフォーマーベースのリグレードモデルはバイアス付きおよびバイアス付きデータセットに対して等しく有効であり、2つの異なるバイアス付きデータセット間の転送学習設定においてより効果的であることを示す。

Supervised machine learning models and their evaluation strongly depends on the quality of the underlying dataset. When we search for a relevant piece of information it may appear anywhere in a given passage. However, we observe a bias in the position of the correct answer in the text in two popular Question Answering datasets used for passage re-ranking. The excessive favoring of earlier positions inside passages is an unwanted artefact. This leads to three common Transformer-based re-ranking models to ignore relevant parts in unseen passages. More concerningly, as the evaluation set is taken from the same biased distribution, the models overfitting to that bias overestimate their true effectiveness. In this work we analyze position bias on datasets, the contextualized representations, and their effect on retrieval results. We propose a debiasing method for retrieval datasets. Our results show that a model trained on a position-biased dataset exhibits a significant decrease in re-ranking effectiveness when evaluated on a debiased dataset. We demonstrate that by mitigating the position bias, Transformer-based re-ranking models are equally effective on a biased and debiased dataset, as well as more effective in a transfer-learning setting between two differently biased datasets.

翻訳日:2021-03-27 05:55:43 公開日:2021-01-18

# ガウス過程回帰のためのベイズ変換学習における転移モデル構造

Transferring model structure in Bayesian transfer learning for Gaussian process regression ( http://arxiv.org/abs/2101.06884v1 )

ライセンス: Link先を確認

Milan Pape\v{z}, Anthony Quinn

(参考訳) ベイズ変換学習(Bayesian Transfer Learning, BTL)は、転送元分布にターゲット確率分布を条件付けるタスクとして定義される。ターゲットは、ソースとターゲット間のインタラクションと、独立したローカルソースモデラーで利用可能な確率的データ予測器の条件をグローバルにモデル化する。この最適意思決定問題を解決するために、完全な確率的設計が採用されている。ソースのより高いモーメントを転送することで、ターゲットは信頼できないソース知識(つまり)を拒否することができる。堅牢な転送を実現します) このデュアルモデラーフレームワークは、ソースの生データを転送された予測分布に局所的に処理し、圧縮可能な可能性を持つ)局所的なソースモデルによって強化されることを意味する。さらに、グローバルターゲットモデラーの導入により、ソースとターゲットタスク -- ターゲットに知られていれば -- の相関を説明できる。重要な結果が生まれる。まず、新しいスキームは、完全にモデル化された(すなわち)性能を達成する。従来)ターゲットモデルの誤特定を避ける(稀な)ケースにおけるマルチタスク学習スキーム。第二に、新しいデュアルモデルフレームワークは、従来のマルチタスク学習を損なうようなモデルのミススペクテーションに対して堅牢である。我々はこれらの問題をガウス的プロセス回帰タスクの相互作用の重要な文脈で徹底的に検討する。合成データと実データの両方による実験的な証拠は、BTLフレームワークが転送時の堅牢性を保ちながら、誤特定のモデル化にも堅牢であることを示す。

Bayesian transfer learning (BTL) is defined in this paper as the task of conditioning a target probability distribution on a transferred source distribution. The target globally models the interaction between the source and target, and conditions on a probabilistic data predictor made available by an independent local source modeller. Fully probabilistic design is adopted to solve this optimal decision-making problem in the target. By successfully transferring higher moments of the source, the target can reject unreliable source knowledge (i.e. it achieves robust transfer). This dual-modeller framework means that the source's local processing of raw data into a transferred predictive distribution -- with compressive possibilities -- is enriched by (the possible expertise of) the local source model. In addition, the introduction of the global target modeller allows correlation between the source and target tasks -- if known to the target -- to be accounted for. Important consequences emerge. Firstly, the new scheme attains the performance of fully modelled (i.e. conventional) multitask learning schemes in (those rare) cases where target model misspecification is avoided. Secondly, and more importantly, the new dual-modeller framework is robust to the model misspecification that undermines conventional multitask learning. We thoroughly explore these issues in the key context of interacting Gaussian process regression tasks. Experimental evidence from both synthetic and real data settings validates our technical findings: that the proposed BTL framework enjoys robustness in transfer while also being robust to model misspecification.

翻訳日:2021-03-27 05:55:24 公開日:2021-01-18

# 群流予測のための複数モード間の不均一関係のモデル化

Modeling Heterogeneous Relations across Multiple Modes for Potential Crowd Flow Prediction ( http://arxiv.org/abs/2101.06954v1 )

ライセンス: Link先を確認

Qiang Zhou, Jingjing Gu, Xinjiang Lu, Fuzhen Zhuang, Yanchao Zhao, Qiuhong Wang, Xiao Zhang

(参考訳) 都市計画立案者や管理者にとって、新しい交通施設のクラウドフロー予測は基本的な課題である。直感的には、近くの場所を探索することで、新しい来訪地の潜在的な群集の流れを示唆することができる。しかし、近隣の交通モード(例:交通モード)は異なる。バスの駅、自転車の駅など)は対象地(例)と異なる場合がある。地下鉄の駅) 深刻なデータ不足の問題を引き起こしますそこで本研究では,新しい計画サイトにおいて,クラウドフローを予測可能なデータ駆動型手法であるmoherを提案する。具体的には,まず,地理的な近接性と都市機能の類似性を調べることにより,対象地近傍の地域を識別する。次に,これらの不均一な関係を集約するために,相関だけでなく,異なる移動モード間の差異も学習可能な,新しい関係特異的変換モデルである交叉モード関係性gcnを考案する。その後,誘導電位流表現のためのアグリゲータを設計する。最後に、LTSMモジュールがシーケンシャルフロー予測に使用される。実世界のデータセットに関する大規模な実験は、最先端のアルゴリズムと比較してMOHERフレームワークの優位性を示している。

Potential crowd flow prediction for new planned transportation sites is a fundamental task for urban planners and administrators. Intuitively, the potential crowd flow of the new coming site can be implied by exploring the nearby sites. However, the transportation modes of nearby sites (e.g. bus stations, bicycle stations) might be different from the target site (e.g. subway station), which results in severe data scarcity issues. To this end, we propose a data driven approach, named MOHER, to predict the potential crowd flow in a certain mode for a new planned site. Specifically, we first identify the neighbor regions of the target site by examining the geographical proximity as well as the urban function similarity. Then, to aggregate these heterogeneous relations, we devise a cross-mode relational GCN, a novel relation-specific transformation model, which can learn not only the correlations but also the differences between different transportation modes. Afterward, we design an aggregator for inductive potential flow representation. Finally, an LTSM module is used for sequential flow prediction. Extensive experiments on real-world data sets demonstrate the superiority of the MOHER framework compared with the state-of-the-art algorithms.

翻訳日:2021-03-27 05:55:03 公開日:2021-01-18

# ほぼ一定ピークメモリ使用量を持つディープコントラスト学習バッチサイズのスケーリング

Scaling Deep Contrastive Learning Batch Size with Almost Constant Peak Memory Usage ( http://arxiv.org/abs/2101.06983v1 )

ライセンス: Link先を確認

Luyu Gao, Yunyi Zhang

(参考訳) コントラスト学習は、テキストや画像などの様々な形式のデータの数値ベクトル表現の学習に成功している。学習エンコーダは、多くの下流タスクに汎用的な転送能力を示す。表現に基づく検索は最先端のパフォーマンスで非常に効率的である。従来の研究では、高品質な表現を学習するには、対照的な損失に多くの否定が必要であることが示されていた。実際には、バッチ内の各例について、他のバッチサンプルの正を負とみなし、余分な負のエンコーディングを避ける、バッチ内の負のテクニックが使用される。しかし、これは依然としてすべてのバッチの例で各例の損失を条件としており、大規模なバッチ全体をgpuメモリに適合させる必要がある。本稿では,コントラスト損失とエンコーダ間のバック伝搬を分離する再計算手法を提案する。その結果、グラデーションはバッチの1つのサブセットに対して一度に計算でき、異なるサイズのバッチに対するGPUメモリ使用量がほぼ一定になる。

Contrastive learning has been applied successfully to learn numerical vector representations of various forms of data, such as texts and images. Learned encoders exhibit versatile transfer capabilities to many downstream tasks. Representation based search is highly efficient with state-of-the-art performance. Previous researches demonstrated that learning high-quality representations requires a large number of negatives in contrastive loss. In practice, the technique of in-batch negative is used, where for each example in a batch, other batch examples' positives will be taken as its negatives, avoiding encoding extra negatives. This, however, still conditions each example's loss on all batch examples and requires fitting the entire large batch into GPU memory. This paper introduces a re-computation technique that decouples back propagation between contrastive loss and the encoder, removing encoder backward pass data dependency along the batch dimension. As a result, gradients can be computed for one subset of the batch at a time, leading to an almost constant peak GPU memory usage for batches of different sizes.

翻訳日:2021-03-27 05:54:49 公開日:2021-01-18

# 継承状態とゴール依存値の学習:数学的視点

Learning Successor States and Goal-Dependent Values: A Mathematical Viewpoint ( http://arxiv.org/abs/2101.07123v1 )

ライセンス: Link先を確認

L\'eonard Blier, Corentin Tallec, Yann Ollivier

(参考訳) 強化学習では、時間差に基づくアルゴリズムはサンプル非効率であり、例えば、スパース報酬の場合、報酬が観察されるまで学習は行われない。これは、環境のモデルや後継状態といったよりリッチなオブジェクトを学習することで解決できる。後継状態は、ある政策の任意の状態から期待される将来の状態占有度をモデル化し、任意の状態に到達する方法を学習するゴール依存値関数と関連付ける。我々は,後続状態と目標依存値関数学習のための時間差アルゴリズムを,離散環境,あるいは関数近似を伴う連続環境に対して形式的に導出する。特に,有限分散推定器を連続環境においても提供し,目標状態に正確に到達する報酬は無限にスパースする。後続状態はベルマン方程式以上のものを満たす: 後方のベルマン作用素とベルマン・ニュートン作用素は環境中の経路構成性を符号化する。 BN作用素は二階勾配降下法に似ており、より多くの観測値を得るときの値関数の真の更新を提供する。表の場合と無限小の学習率では、通常のベルマン作用素と後方のベルマン作用素を混合することで漸近収束の固有値が向上し、BN作用素の漸近収束はTDよりも確率的に良い。しかし、bn法はサンプリングノイズに対してより複雑でロバストではない。最後に、後続状態のフォワードバックワード(fb)有限ランクパラメータ化は、分散の低減とsamplabilityの改善を享受し、値関数の直接モデルを提供し、長距離依存性に対応する不動点を完全に理解し、bn法を近似し、副産物として状態の2つの標準表現を提供する。

In reinforcement learning, temporal difference-based algorithms can be sample-inefficient: for instance, with sparse rewards, no learning occurs until a reward is observed. This can be remedied by learning richer objects, such as a model of the environment, or successor states. Successor states model the expected future state occupancy from any given state for a given policy and are related to goal-dependent value functions, which learn how to reach arbitrary states. We formally derive the temporal difference algorithm for successor state and goal-dependent value function learning, either for discrete or for continuous environments with function approximation. Especially, we provide finite-variance estimators even in continuous environments, where the reward for exactly reaching a goal state becomes infinitely sparse. Successor states satisfy more than just the Bellman equation: a backward Bellman operator and a Bellman-Newton (BN) operator encode path compositionality in the environment. The BN operator is akin to second-order gradient descent methods and provides the true update of the value function when acquiring more observations, with explicit tabular bounds. In the tabular case and with infinitesimal learning rates, mixing the usual and backward Bellman operators provably improves eigenvalues for asymptotic convergence, and the asymptotic convergence of the BN operator is provably better than TD, with a rate independent from the environment. However, the BN method is more complex and less robust to sampling noise. Finally, a forward-backward (FB) finite-rank parameterization of successor states enjoys reduced variance and improved samplability, provides a direct model of the value function, has fully understood fixed points corresponding to long-range dependencies, approximates the BN method, and provides two canonical representations of states as a byproduct.

翻訳日:2021-03-27 05:54:16 公開日:2021-01-18

# 不確実性下におけるfNIRSデータの分類:ベイズニューラルネットワークによるアプローチ

Classification of fNIRS Data Under Uncertainty: A Bayesian Neural Network Approach ( http://arxiv.org/abs/2101.07128v1 )

ライセンス: Link先を確認

Talha Siddique and Md Shaad Mahmud

(参考訳) 機能近赤外分光法(FNIRS)は脳-コンピュータインタフェース(BCI)の非侵襲型である。脳血行動態のイメージングに用いられ、他の類似した技術よりも生じる特定のプロースによって人気を博している。全体的な機能には、脳信号の捕捉、処理、分類が含まれる。血行動態の反応は生理的ノイズによって汚染されるため、過去の文献では、焦点の反応を望ましくないものから分類するためにいくつかの方法が採用されている。しかし、これまでの方法では、データやモデルパラメータの不確実性は考慮されていない。本稿では,ベイズ型ニューラルネットワーク(bnn)を用いて,一方的指タッピング(左右指タッピング)からなるオープンアクセスデータセットのバイナリ分類を行う。 BNNはベイズ統計を用いて、点推定の代わりに確率分布をネットワーク重みに割り当てる。このように分類を行いながら、データとモデルの不確実性を考慮に入れます。モデルのトレーニングには変分推論(VI)を使用しました。我々のモデルでは30名以上のボランティアに対して86.44%の総合的な分類精度が得られた。我々は、モデルのエビデンス下限(elbo)関数がイテレーションでどのように収束するかを示した。さらに,重量の後方分布のサンプリング中に生じる不確実性について考察した。また,1人のボランティアによるテストデータを用いて,BNN分類器のROC曲線を生成し,AUCスコアが0.855である。

Functional Near-Infrared Spectroscopy (fNIRS) is a non-invasive form of Brain-Computer Interface (BCI). It is used for the imaging of brain hemodynamics and has gained popularity due to the certain pros it poses over other similar technologies. The overall functionalities encompass the capture, processing and classification of brain signals. Since hemodynamic responses are contaminated by physiological noises, several methods have been implemented in the past literature to classify the responses in focus from the unwanted ones. However, the methods, thus far does not take into consideration the uncertainty in the data or model parameters. In this paper, we use a Bayesian Neural Network (BNN) to carry out a binary classification on an open-access dataset, consisting of unilateral finger tapping (left- and right-hand finger tapping). A BNN uses Bayesian statistics to assign a probability distribution to the network weights instead of a point estimate. In this way, it takes data and model uncertainty into consideration while carrying out the classification. We used Variational Inference (VI) to train our model. Our model produced an overall classification accuracy of 86.44% over 30 volunteers. We illustrated how the evidence lower bound (ELBO) function of the model converges over iterations. We further illustrated the uncertainty that is inherent during the sampling of the posterior distribution of the weights. We also generated a ROC curve for our BNN classifier using test data from a single volunteer and our model has an AUC score of 0.855.

翻訳日:2021-03-27 05:53:42 公開日:2021-01-18

# 絡み合う重みの安定回復 : 極小サンプルからの深層ニューラルネットワークのロバスト同定に向けて

Stable Recovery of Entangled Weights: Towards Robust Identification of Deep Neural Networks from Minimal Samples ( http://arxiv.org/abs/2101.07150v1 )

ライセンス: Link先を確認

Christian Fiedler, Massimo Fornasier, Timo Klock, and Michael Rauchensteiner

(参考訳) 本稿では,有限個の入力出力サンプルから,ピラミッド形状とスムーズな活性化関数を有する汎用深層ニューラルネットワークの特異かつ安定した識別性の問題にアプローチする。より具体的には、活性化関数とそのシフトに応じて、適切な対角行列および可逆行列と交差する連続層の重みを構成するいわゆる絡み合い重みを導入する。エンタングル重みは、ネットワークの$\mathcal o(d^2 \times m)$非適応入出力サンプルが収集され、$d$が入力次元、$m$がネットワークのニューロン数であることが証明される。さらに、このアプローチは最大$\mathcal o(d \times m_l)$ニューロンのネットワークに適用され、ここで$m_l$は層$l$の出力ニューロンの数である。エンタングル重みの層割り当てと、最小二乗でさらにヒューリスティックに得られるかもしれないスケーリングとシフトパラメータの残差に関する知識により、エンタングル重みはネットワークを完全に一意的に識別する。絡み合った重みの安定回復に関する理論的結果の妥当性を明らかにするため, 一般化重み付き多層ネットワークを頑健に同定し, 提案したアルゴリズムパイプラインにより一様に近似できることを示す数値実験を行った。対照的に、バックプロパゲーションはこの設定では安定に一般化することができず、常に比較的大きな均一誤差によって制限される。本研究は,入力出力情報をネットワークパラメータに一意かつ安定的に関連付けることができ,説明可能性の一形態を提供する。さらに, 過パラメータ化ネットワークの圧縮や, 最小複雑性ネットワークのトレーニングを行う方法を提案する。

In this paper we approach the problem of unique and stable identifiability of generic deep artificial neural networks with pyramidal shape and smooth activation functions from a finite number of input-output samples. More specifically we introduce the so-called entangled weights, which compose weights of successive layers intertwined with suitable diagonal and invertible matrices depending on the activation functions and their shifts. We prove that entangled weights are completely and stably approximated by an efficient and robust algorithm as soon as $\mathcal O(D^2 \times m)$ nonadaptive input-output samples of the network are collected, where $D$ is the input dimension and $m$ is the number of neurons of the network. Moreover, we empirically observe that the approach applies to networks with up to $\mathcal O(D \times m_L)$ neurons, where $m_L$ is the number of output neurons at layer $L$. Provided knowledge of layer assignments of entangled weights and of remaining scaling and shift parameters, which may be further heuristically obtained by least squares, the entangled weights identify the network completely and uniquely. To highlight the relevance of the theoretical result of stable recovery of entangled weights, we present numerical experiments, which demonstrate that multilayered networks with generic weights can be robustly identified and therefore uniformly approximated by the presented algorithmic pipeline. In contrast backpropagation cannot generalize stably very well in this setting, being always limited by relatively large uniform error. In terms of practical impact, our study shows that we can relate input-output information uniquely and stably to network parameters, providing a form of explainability. Moreover, our method paves the way for compression of overparametrized networks and for the training of minimal complexity networks.

翻訳日:2021-03-27 05:53:24 公開日:2021-01-18

# 不均一共有交通空間における道路利用者の運動モデルの一般化可能性について

On the Generalizability of Motion Models for Road Users in Heterogeneous Shared Traffic Spaces ( http://arxiv.org/abs/2101.06974v1 )

ライセンス: Link先を確認

Fatema T. Johora, Dongfang Yang, J\"org P. M\"uller, and \"Umit \"Ozg\"uner

(参考訳) 混合交通運動と相互作用のモデル化は,将来の都市部の安全性,効率,実現可能性を評価する上で重要である。交通規制の欠如、多様な輸送モード、共有空間のような混合交通ゾーンの動的な性質は、そのような環境の現実的なモデリングを困難にしている。本稿では, 動作モデルの一般化性, すなわち, 既存の作業に欠けている, 異なる環境条件下で現実的な行動を生成する能力に焦点を当てる。具体的には, 一般的な運動モデルを定式化し, このプロセスの応用として, ゲーム理論社会力モデル(GSFM, Game-theoretic Social Force Model)を, 異なる共有空間から歩行者や車の多種多様な運動行動を生成する汎用モデルへと拡張する。第2の貢献は、個人歩行者の運動関連特徴を調整し、グループ化することで、歩行者の異なる動きパターンを検討することである。 2つのクラスタリングアプローチを分析した。このモデルのキャリブレーションと評価は、3つの異なる共有空間データセット上で行われる。その結果, 本モデルでは, 様々な動作行動やインタラクションシナリオを現実的にシミュレートでき, モデルに歩行者の異なる動きパターンを加えることで, その性能が向上することがわかった。

Modeling mixed-traffic motion and interactions is crucial to assess safety, efficiency, and feasibility of future urban areas. The lack of traffic regulations, diverse transport modes, and the dynamic nature of mixed-traffic zones like shared spaces make realistic modeling of such environments challenging. This paper focuses on the generalizability of the motion model, i.e., its ability to generate realistic behavior in different environmental settings, an aspect which is lacking in existing works. Specifically, our first contribution is a novel and systematic process of formulating general motion models and application of this process is to extend our Game-Theoretic Social Force Model (GSFM) towards a general model for generating a large variety of motion behaviors of pedestrians and cars from different shared spaces. Our second contribution is to consider different motion patterns of pedestrians by calibrating motion-related features of individual pedestrian and clustering them into groups. We analyze two clustering approaches. The calibration and evaluation of our model are performed on three different shared space data sets. The results indicate that our model can realistically simulate a wide range of motion behaviors and interaction scenarios, and that adding different motion patterns of pedestrians into our model improves its performance.

翻訳日:2021-03-27 05:52:49 公開日:2021-01-18

# クロスモダリティ医療画像セグメンテーションのためのDeep Symmetric Adaptation Network

Deep Symmetric Adaptation Network for Cross-modality Medical Image Segmentation ( http://arxiv.org/abs/2101.06853v1 )

ライセンス: Link先を確認

Xiaoting Han, Lei Qi, Qian Yu, Ziqi Zhou, Yefeng Zheng, Yinghuan Shi, Yang Gao

(参考訳) 非教師なし領域適応 (UDA) 法は, 医療画像分割作業において有望な性能を示した。これらの典型的な手法は通常、翻訳ネットワークを使用して、ソースドメインからターゲットドメインへの画像変換や、翻訳されたソースイメージと元のターゲットイメージのみを使用してピクセルレベルの分類器をトレーニングする。しかし、ソースドメインとターゲットドメインの間に大きなドメインシフトが存在する場合、この非対称構造はドメインギャップを完全に排除することができない。本稿では,セグメンテーションサブネットワークと2つの対称ソースとターゲットドメイン翻訳サブネットワークからなる医用画像セグメンテーションのための,udaの新たな深対称アーキテクチャを提案する。具体的には,2つのサブネットワークをベースとして,共有エンコーダとプライベートデコーダによる双方向アライメント方式を導入し,1)ソースからターゲットドメイン,2)ターゲットドメインからソースドメインへのアライメントを同時に行うことにより,ドメイン間の差異を効果的に緩和する。さらに,セグメンテーションサブネットワークにおいて,元のターゲット画像と翻訳されたソース画像だけでなく,元のソース画像と翻訳されたターゲット画像を用いて画素レベルの分類器を訓練し,異なるスタイルの画像からの意味情報を十分に活用する。拡張実験により,Cardiac と BraTS のセグメンテーションタスクにおける最先端手法と比較して,本手法が顕著に優れていることが示された。

Unsupervised domain adaptation (UDA) methods have shown their promising performance in the cross-modality medical image segmentation tasks. These typical methods usually utilize a translation network to transform images from the source domain to target domain or train the pixel-level classifier merely using translated source images and original target images. However, when there exists a large domain shift between source and target domains, we argue that this asymmetric structure could not fully eliminate the domain gap. In this paper, we present a novel deep symmetric architecture of UDA for medical image segmentation, which consists of a segmentation sub-network, and two symmetric source and target domain translation sub-networks. To be specific, based on two translation sub-networks, we introduce a bidirectional alignment scheme via a shared encoder and private decoders to simultaneously align features 1) from source to target domain and 2) from target to source domain, which helps effectively mitigate the discrepancy between domains. Furthermore, for the segmentation sub-network, we train a pixel-level classifier using not only original target images and translated source images, but also original source images and translated target images, which helps sufficiently leverage the semantic information from the images with different styles. Extensive experiments demonstrate that our method has remarkable advantages compared to the state-of-the-art methods in both cross-modality Cardiac and BraTS segmentation tasks.

翻訳日:2021-03-27 05:52:28 公開日:2021-01-18

# ディープニューラルネットワークと信仰機能を用いたCovid-19分類

Covid-19 classification with deep neural network and belief functions ( http://arxiv.org/abs/2101.06958v1 )

ライセンス: Link先を確認

Ling Huang, Su Ruan, Thierry Denoeux

(参考訳) CT画像は、放射線医がCovid-19を診断するのに有用な情報を提供する。しかし,CTスキャンの視覚的解析には時間を要する。したがってCT画像からCovid-19を自動的に検出するアルゴリズムを開発する必要がある。本稿では,コビッド19症例を検出するための半教師付きトレーニングを用いた信念関数に基づく畳み込みニューラルネットワークを提案する。本手法はまず,深い特徴を抽出し,信頼度マップにマップし,最終的な分類決定を行う。我々の結果は従来のディープラーニングに基づく分類モデルよりも信頼性が高く説明しやすい。実験の結果,0.81,f10.812,auc0.875の精度で良好な性能を得ることができた。

Computed tomography (CT) image provides useful information for radiologists to diagnose Covid-19. However, visual analysis of CT scans is time-consuming. Thus, it is necessary to develop algorithms for automatic Covid-19 detection from CT images. In this paper, we propose a belief function-based convolutional neural network with semi-supervised training to detect Covid-19 cases. Our method first extracts deep features, maps them into belief degree maps and makes the final classification decision. Our results are more reliable and explainable than those of traditional deep learning-based classification models. Experimental results show that our approach is able to achieve a good performance with an accuracy of 0.81, an F1 of 0.812 and an AUC of 0.875.

翻訳日:2021-03-27 05:52:03 公開日:2021-01-18

# サイクリックリバースジェネレータを用いた反復顔画像インペインティング

Iterative Facial Image Inpainting using Cyclic Reverse Generator ( http://arxiv.org/abs/2101.07036v1 )

ライセンス: Link先を確認

Yahya Dogan and Hacer Yalim Keles

(参考訳) 顔画像のインペインティングは、顔にマスクされたキーコンポーネント(例えば目と鼻)の意味情報を含む新しいピクセルを生成する必要があるため、難しい問題である。近年,この分野では注目すべき手法が提案されている。これらのアプローチのほとんどはエンコーダ-デコーダアーキテクチャを使用しており、与えられた画像と特定のマスクのユニークな結果を可能にするといった制限がある。あるいは、ジェネレータネットワークと異なるマスクを使って有望な結果を生み出すアプローチもある。しかしながら、これらのアプローチは最適化ベースであり、通常多くのイテレーションを必要とする。本稿では, 循環逆生成器(cyclic reverse generator, crg)アーキテクチャを用いた, コーダ生成器モデルによる顔画像描画問題に対する効率的な解法を提案する。このエンコーダを用いて、生成領域に所定の画像を埋め込み、可算な画像が生成されるまでマスク領域を段階的に塗りつぶし、反復中に生成された画像を評価するために判別器ネットワークを利用する。提案したモデルを用いて実写画像を生成するには,数イテレーションで十分であることがわかった。生成プロセスの後、ポスト処理では、マスク境界に近いアーティファクトを修復するために、このタスクのために特別に訓練したUnetモデルを使用します。本手法では,様々なマスクタイプを用いてスケッチベースのインペインティングを適用でき,多種多様な結果が得られる。我々は,この手法を最先端のモデルと比較し,全てのマスクタイプにおいて他のモデルと競合することを観察した。

Facial image inpainting is a challenging problem as it requires generating new pixels that include semantic information for masked key components in a face, e.g., eyes and nose. Recently, remarkable methods have been proposed in this field. Most of these approaches use encoder-decoder architectures and have different limitations such as allowing unique results for a given image and a particular mask. Alternatively, some approaches generate promising results using different masks with generator networks. However, these approaches are optimization-based and usually require quite a number of iterations. In this paper, we propose an efficient solution to the facial image painting problem using the Cyclic Reverse Generator (CRG) architecture, which provides an encoder-generator model. We use the encoder to embed a given image to the generator space and incrementally inpaint the masked regions until a plausible image is generated; a discriminator network is utilized to assess the generated images during the iterations. We empirically observed that only a few iterations are sufficient to generate realistic images with the proposed model. After the generation process, for the post processing, we utilize a Unet model that we trained specifically for this task to remedy the artifacts close to the mask boundaries. Our method allows applying sketch-based inpaintings, using variety of mask types, and producing multiple and diverse results. We qualitatively compared our method with the state-of-the-art models and observed that our method can compete with the other models in all mask types; it is particularly better in images where larger masks are utilized.

翻訳日:2021-03-27 05:51:39 公開日:2021-01-18

# 共有潜在空間表現を用いた大腸内視鏡ビデオにおける欠損面の可視化

Visualizing Missing Surfaces In Colonoscopy Videos using Shared Latent Space Representations ( http://arxiv.org/abs/2101.07280v1 )

ライセンス: Link先を確認

Shawn Mathew, Saad Nadeem and Arie Kaufman

(参考訳) 最も普及している大腸癌スクリーニングツールである光大腸内視鏡(oc)は、大腸の形状(水平折りたたみや鋭い屈曲)、内科医の経験不足や疲労、内視鏡の視野などを含む多くの要因により、ミス率が高い。大腸内視鏡検査中にフレーム当たりの欠落領域を可視化する枠組みを提示し,有効な臨床ソリューションを提供する。具体的には、3D再構成仮想大腸内視鏡(VC)データと、VCとOCが同じ形状を共有しているが、OCドメインに埋め込まれた色、テクスチャ、スペキュラリフレクションが異なるという知見を用いる。 OCとVCのための強制的共有潜在空間を伴って、損失のない画像から画像への変換モデルを導入する。この共有潜在空間は、追加のガウス雑音入力に対して色、テクスチャ、スペック情報の生成を遅らせながら幾何情報をキャプチャする。この追加ノイズ入力を使用して、VCからOC、OCからOCへの1対多マッピングを生成することができる。

Optical colonoscopy (OC), the most prevalent colon cancer screening tool, has a high miss rate due to a number of factors, including the geometry of the colon (haustral fold and sharp bends occlusions), endoscopist inexperience or fatigue, endoscope field of view, etc. We present a framework to visualize the missed regions per-frame during the colonoscopy, and provides a workable clinical solution. Specifically, we make use of 3D reconstructed virtual colonoscopy (VC) data and the insight that VC and OC share the same underlying geometry but differ in color, texture and specular reflections, embedded in the OC domain. A lossy unpaired image-to-image translation model is introduced with enforced shared latent space for OC and VC. This shared latent space captures the geometric information while deferring the color, texture, and specular information creation to additional Gaussian noise input. This additional noise input can be utilized to generate one-to-many mappings from VC to OC and OC to OC.

翻訳日:2021-03-27 05:51:17 公開日:2021-01-18

# 摂動勾配降下の微分的にプライベートな性質について

On the Differentially Private Nature of Perturbed Gradient Descent ( http://arxiv.org/abs/2101.06847v1 )

ライセンス: Link先を確認

Thulasi Tholeti, Sheetal Kalyani

(参考訳) 勾配降下アルゴリズムを用いて,データベースを与えられた経験的リスク最小化の問題を考える。最適化される関数は、アルゴリズムの収束を妨げる鞍点からなる非凸であるかもしれないことに注意する。摂動勾配降下アルゴリズムは典型的にはこれらのサドル点から逃れるために用いられる。勾配を乱すこのアルゴリズムは本質的にデータのプライバシーを保っていることを示す。次に、得られたプライバシーを定量化するために、差分プライバシーフレームワークを使用します。また,問題次元やデータベース間の距離といったパラメータによって,プライバシーの変化を分析する。

We consider the problem of empirical risk minimization given a database, using the gradient descent algorithm. We note that the function to be optimized may be non-convex, consisting of saddle points which impede the convergence of the algorithm. A perturbed gradient descent algorithm is typically employed to escape these saddle points. We show that this algorithm, that perturbs the gradient, inherently preserves the privacy of the data. We then employ the differential privacy framework to quantify the privacy hence achieved. We also analyze the change in privacy with varying parameters such as problem dimension and the distance between the databases.

翻訳日:2021-03-27 05:50:39 公開日:2021-01-18

# GraphAttacker: 一般的なマルチタスクGraphAttackフレームワーク

GraphAttacker: A General Multi-Task GraphAttack Framework ( http://arxiv.org/abs/2101.06855v1 )

ライセンス: Link先を確認

Jinyin Chen, Dunjie Zhang, Zhaoyan Ming and Kejie Huang

(参考訳) グラフニューラルネットワーク(GNN)は多くの実世界のアプリケーションでグラフ解析タスクにうまく活用されている。しかしながら、GNNは攻撃者によって生成された敵のサンプルによって課される潜在的なセキュリティ上の問題があり、ほとんど知覚不能な摂動を伴う攻撃性能を達成している。これらの攻撃者の幅広い適用を制限するのは、ノード分類やリンク予測のような特定のグラフ分析タスクに対する手法の特異性である。そこで我々は,グラフ解析タスクに従って,構造や攻撃戦略を柔軟に調整できる新しい汎用グラフ攻撃フレームワークであるGraphAttackerを提案する。 GAN(Generative Adversarial Network)に基づいて、GraphAttackerは、3つの主要なコンポーネント、MAG(Multi-strategy Attack Generator)、SD(Simisity Discriminator)、AD(Attatity Discriminator)を交互にトレーニングすることで、敵のサンプルを生成する。さらに,摂動予算内で攻撃者を実現するために,ノード間の類似性を定量化する新しい類似性修正率(smr)を提案する。本研究では,ノード分類,グラフ分類,リンク予測のグラフ解析タスクにおいて,GraphAttackerが最先端攻撃性能を達成可能であることを示す。さらに,各タスクのユニークな特性と,それらの応答を統一攻撃フレームワークで分析する。将来の攻撃研究のためのオープンソースのシミュレーションプラットフォームとして、GraphAttackerをリリースします。

Graph Neural Networks (GNNs) have been successfully exploited in graph analysis tasks in many real-world applications. However, GNNs have been shown to have potential security issues imposed by adversarial samples generated by attackers, which achieved great attack performance with almost imperceptible perturbations. What limit the wide application of these attackers are their methods' specificity on a certain graph analysis task, such as node classification or link prediction. We thus propose GraphAttacker, a novel generic graph attack framework that can flexibly adjust the structures and the attack strategies according to the graph analysis tasks. Based on the Generative Adversarial Network (GAN), GraphAttacker generates adversarial samples through alternate training on three key components, the Multi-strategy Attack Generator (MAG), the Similarity Discriminator (SD), and the Attack Discriminator(AD). Furthermore, to achieve attackers within perturbation budget, we propose a novel Similarity Modification Rate (SMR) to quantify the similarity between nodes thus constrain the attack budget. We carry out extensive experiments and the results show that GraphAttacker can achieve state-of-the-art attack performance on graph analysis tasks of node classification, graph classification, and link prediction. Besides, we also analyze the unique characteristics of each task and their specific response in the unified attack framework. We will release GraphAttacker as an open-source simulation platform for future attack researches.

翻訳日:2021-03-27 05:50:33 公開日:2021-01-18

# まばらなオンライン学習のためのスクリーニング

Screening for Sparse Online Learning ( http://arxiv.org/abs/2101.06982v1 )

ライセンス: Link先を確認

Jingwei Liang and Clarice Poon

(参考訳) 正規化を促進させるスパーシティは、低複雑さ構造(例)を課すために広く用いられている。 l1-norm for sparsity) は教師あり学習の回帰係数である。決定論的最適化の領域では、反復アルゴリズム(近位勾配降下など)によって生成されたシーケンスは「有限のアクティビティ識別」を示す。しかし、ほとんどのオンラインアルゴリズム(近確率勾配降下など)は、消滅するステップサイズと非消滅的な分散のために性質を持たない。本稿では,スクリーニングルールと組み合わせることで,オンラインアルゴリズムが生成するイテレートの不要な特徴を解消し,有限なアクティビティ識別を実現する方法を示す。その結果、任意の収束オンラインアルゴリズムと組み合わせることで、正規化器によって課される空間特性を計算利得に利用することができる。数値的には、大きな加速が得られる。

Sparsity promoting regularizers are widely used to impose low-complexity structure (e.g. l1-norm for sparsity) to the regression coefficients of supervised learning. In the realm of deterministic optimization, the sequence generated by iterative algorithms (such as proximal gradient descent) exhibit "finite activity identification", namely, they can identify the low-complexity structure in a finite number of iterations. However, most online algorithms (such as proximal stochastic gradient descent) do not have the property owing to the vanishing step-size and non-vanishing variance. In this paper, by combining with a screening rule, we show how to eliminate useless features of the iterates generated by online algorithms, and thereby enforce finite activity identification. One consequence is that when combined with any convergent online algorithm, sparsity properties imposed by the regularizer can be exploited for computational gains. Numerically, significant acceleration can be obtained.

翻訳日:2021-03-27 05:50:07 公開日:2021-01-18

# 接続特徴と畳み込みニューラルネットワークを用いた感情脳波分類

Emotional EEG Classification using Connectivity Features and Convolutional Neural Networks ( http://arxiv.org/abs/2101.07069v1 )

ライセンス: Link先を確認

Seong-Eun Moon, Chun-Jui Chen, Cho-Jui Hsieh, Jane-Ling Wang, Jong-Seok Lee

(参考訳) 畳み込みニューラルネットワーク(CNN)は脳波(EEG)信号を通じてユーザの状態を認識するために広く用いられている。前回の研究では、脳波信号は通常、高次元の生データによってcnnに供給される。しかし,本手法では,機能的脳ネットワークを記述し,ユーザの知覚状態を推定する上で有効な脳接続情報の活用が困難である。我々は,脳とCNNの接続を利用した新しい分類システムを導入し,その効果を3種類の接続手段を用いて感情映像分類によって検証する。さらに,連結行列を構成するための2つのデータ駆動手法を提案し,分類性能を最大化する。さらに分析した結果,対象映像の感情特性に関連する脳接続の濃度が分類性能と相関していることが判明した。

Convolutional neural networks (CNNs) are widely used to recognize the user's state through electroencephalography (EEG) signals. In the previous studies, the EEG signals are usually fed into the CNNs in the form of high-dimensional raw data. However, this approach makes it difficult to exploit the brain connectivity information that can be effective in describing the functional brain network and estimating the perceptual state of the user. We introduce a new classification system that utilizes brain connectivity with a CNN and validate its effectiveness via the emotional video classification by using three different types of connectivity measures. Furthermore, two data-driven methods to construct the connectivity matrix are proposed to maximize classification performance. Further analysis reveals that the level of concentration of the brain connectivity related to the emotional property of the target video is correlated with classification performance.

翻訳日:2021-03-27 05:49:53 公開日:2021-01-18

# 埋め込みのアライメントと安定性:測定と推論の改善

Alignment and stability of embeddings: measurement and inference improvement ( http://arxiv.org/abs/2101.07251v1 )

ライセンス: Link先を確認

Furkan G\"ursoy, Mounir Haddad, C\'ecile Bothorel

(参考訳) 表現学習(rl)法は、情報が距離によって保存されるオブジェクトの潜在埋め込みを学習する。距離はある種の線型変換に不変であるため、同じ情報を保持しながら異なる埋め込みが得られる。力学系では、埋め込みの時間的差はシステムの安定性や任意の変換による埋め込みの誤用によって説明できる。文献では、埋め込みアライメントは公式に定義されておらず、理論的に、あるいは経験的に分析されていない。ここでは埋め込みアライメントとその部分を調査し,最初の形式的定義を提供し,アライメントと安定性を測定するための新しい指標を提案し,合成実験を通じてそれらの適合性を示す。実世界の実験では、静的RL法と動的RL法の両方が不整合な埋め込みを生成する傾向があり、そのような不整合は動的ネットワーク推論タスクの性能を悪化させる。アライメントを確保することで、予測精度は静的で最大90%上昇し、動的RL法では40%上昇する。

Representation learning (RL) methods learn objects' latent embeddings where information is preserved by distances. Since distances are invariant to certain linear transformations, one may obtain different embeddings while preserving the same information. In dynamic systems, a temporal difference in embeddings may be explained by the stability of the system or by the misalignment of embeddings due to arbitrary transformations. In the literature, embedding alignment has not been defined formally, explored theoretically, or analyzed empirically. Here, we explore the embedding alignment and its parts, provide the first formal definitions, propose novel metrics to measure alignment and stability, and show their suitability through synthetic experiments. Real-world experiments show that both static and dynamic RL methods are prone to produce misaligned embeddings and such misalignment worsens the performance of dynamic network inference tasks. By ensuring alignment, the prediction accuracy raises by up to 90% in static and by 40% in dynamic RL methods.

翻訳日:2021-03-27 05:49:43 公開日:2021-01-18

# セキュアなマルチパーティ計算に基づく高速プライバシー保護テキスト分類

Fast Privacy-Preserving Text Classification based on Secure Multiparty Computation ( http://arxiv.org/abs/2101.07365v1 )

ライセンス: Link先を確認

Amanda Resende, Davis Railsback, Rafael Dowsley, Anderson C. A. Nascimento, Diego F. Aranha

(参考訳) 本稿では,プライバシ保存型ベイズ分類器を提案し,プライベートテキスト分類問題に適用する。この設定では、あるパーティー(アリス)がテキストメッセージを持ち、別のパーティー(ボブ)が分類器を持っている。プロトコルの最後には、Aliceはテキスト入力に適用される分類器の結果のみを学習し、Bobは何も学習しない。我々のソリューションはセキュアマルチパーティ計算(SMC)に基づいている。我々のRust実装は、構造化されていないテキストの分類のための高速でセキュアなソリューションを提供します。スパム検出の場合(ソリューションは汎用的であり、ネイブベイズ分類器が使用できる他のシナリオで使用することができる)にソリューションを適用することで、Bobのモデルの辞書サイズがすべての単語(n = 5200)を含み、AliceのSMSが少なくともm = 160ユニグラムである場合、SMSをスパムまたはハムとして340ms未満で分類することができる。 n = 369 および m = 8 の場合(データベース内のスパムSMSの平均値)、我々のソリューションは21msしか必要としない。

We propose a privacy-preserving Naive Bayes classifier and apply it to the problem of private text classification. In this setting, a party (Alice) holds a text message, while another party (Bob) holds a classifier. At the end of the protocol, Alice will only learn the result of the classifier applied to her text input and Bob learns nothing. Our solution is based on Secure Multiparty Computation (SMC). Our Rust implementation provides a fast and secure solution for the classification of unstructured text. Applying our solution to the case of spam detection (the solution is generic, and can be used in any other scenario in which the Naive Bayes classifier can be employed), we can classify an SMS as spam or ham in less than 340ms in the case where the dictionary size of Bob's model includes all words (n = 5200) and Alice's SMS has at most m = 160 unigrams. In the case with n = 369 and m = 8 (the average of a spam SMS in the database), our solution takes only 21ms.

翻訳日:2021-03-27 05:49:28 公開日:2021-01-18

# 歌声変換のための階層的不整合表現学習

Hierarchical disentangled representation learning for singing voice conversion ( http://arxiv.org/abs/2101.06842v1 )

ライセンス: Link先を確認

Naoya Takahashi, Mayank Kumar Singh, Yuki Mitsufuji

(参考訳) 従来の歌唱音声変換(SVC)法は、高次元データによる高解像度音声の操作に悩まされることが多い。本稿では,複数の解像度を持つ不連続表現を独立に学習できる階層表現学習を提案する。学習された不整合表現を用いて、提案手法は低解像度から高解像度のSVCを段階的に実行する。実験の結果,提案手法は平均評価スコア(MOS),類似度スコア,ピッチ精度において,単一解像度で動作するベースラインよりも優れていた。

Conventional singing voice conversion (SVC) methods often suffer from operating in high-resolution audio owing to a high dimensionality of data. In this paper, we propose a hierarchical representation learning that enables the learning of disentangled representations with multiple resolutions independently. With the learned disentangled representations, the proposed method progressively performs SVC from low to high resolutions. Experimental results show that the proposed method outperforms baselines that operate with a single resolution in terms of mean opinion score (MOS), similarity score, and pitch accuracy.

翻訳日:2021-03-27 05:49:08 公開日:2021-01-18

# 電力系統におけるサイバー攻撃検出のためのマルチソースデータフュージョン

Multi-Source Data Fusion for Cyberattack Detection in Power Systems ( http://arxiv.org/abs/2101.06897v1 )

ライセンス: Link先を確認

Abhijeet Sahu and Zeyu Mao and Patrick Wlazlo and Hao Huang and Katherine Davis and Ana Goulart and Saman Zonouz

(参考訳) サイバー攻撃は早期に検出されない限り、電力システムに深刻な影響を与える可能性がある。しかしながら、重要なインフラストラクチャシステムにおける正確かつタイムリーな検出は、ゼロデイ脆弱性の搾取や、システムのサイバー物理的性質と、高い信頼性とレジリエンスの必要性によって、課題を呈している。産業制御システム(ICS)ネットワークにおけるゼロデイサイバー侵入を検出するには,従来のルールベースおよび異常ベース侵入検知システム(IDS)ツールが不十分である。そこで本研究では,複数のデータソースからの情報を融合することで,サイバーインシデントを識別し,偽陽性を低減できることを示す。具体的には,複数のデータソースによる核融合検出の正確な使用を防止するための障壁の認識と対処について述べる。我々は,実世界のデータソースをエミュレートする複数のセンサからサイバーや物理のデータを収集し,これらを合成して侵入を検知するアルゴリズムの機能とするサイバー・フィジカル・システムテストベッドにおいて,idをトレーニングするためのマルチソースデータ融合を行う。提案したデータ融合アプリケーションを用いてFalse DataとCommand InjectionベースのMan-in-The-Middle(MiTM)攻撃を推測する。 data fusionアプリケーションは、時間同期マージを使用して、idのパフォーマンスを評価するために教師付き、半教師付き、教師なしの学習モデルの前に、インプテーションやエンコーディングなどの前処理を行う。主な発見は、サイバー、セキュリティ、物理的ドメインの特徴の融合による検出精度の向上である。また,協調学習技術は,特徴を取り入れた指導的学習手法と同等に機能することを示した。

Cyberattacks can cause a severe impact on power systems unless detected early. However, accurate and timely detection in critical infrastructure systems presents challenges, e.g., due to zero-day vulnerability exploitations and the cyber-physical nature of the system coupled with the need for high reliability and resilience of the physical system. Conventional rule-based and anomaly-based intrusion detection system (IDS) tools are insufficient for detecting zero-day cyber intrusions in the industrial control system (ICS) networks. Hence, in this work, we show that fusing information from multiple data sources can help identify cyber-induced incidents and reduce false positives. Specifically, we present how to recognize and address the barriers that can prevent the accurate use of multiple data sources for fusion-based detection. We perform multi-source data fusion for training IDS in a cyber-physical power system testbed where we collect cyber and physical side data from multiple sensors emulating real-world data sources that would be found in a utility and synthesizes these into features for algorithms to detect intrusions. Results are presented using the proposed data fusion application to infer False Data and Command injection-based Man-in- The-Middle (MiTM) attacks. Post collection, the data fusion application uses time-synchronized merge and extracts features followed by pre-processing such as imputation and encoding before training supervised, semi-supervised, and unsupervised learning models to evaluate the performance of the IDS. A major finding is the improvement of detection accuracy by fusion of features from cyber, security, and physical domains. Additionally, we observed the co-training technique performs at par with supervised learning methods when fed with our features.

翻訳日:2021-03-27 05:47:50 公開日:2021-01-18

# 圧縮センシングアプリケーションを用いた未修正ReLUを用いたDNNネットワークの学習

Learning DNN networks using un-rectifying ReLU with compressed sensing application ( http://arxiv.org/abs/2101.06940v1 )

ライセンス: Link先を確認

Wen-Liang Hwang, Shih-Shuo Tung

(参考訳) 非修正技術は、データ依存変数として非線形のポイントワイドアクティベーション関数を表現し、その入力と出力と共にアクティベーション変数を全て最適化に利用することができる。この研究におけるReLUネットワークは未修正であり、活性化関数を方程式や制約の形でデータ依存のアクティベーション変数に置き換えることができる。不整合ReLUに関連するアクティベーション変数の離散的性質は、組合せ最適化の問題としてディープラーニング問題の再構成を可能にする。しかし,活性化変数の離散領域を閉区間に緩和することにより,組合せ最適化問題の最適解が維持可能であることを示す。これにより、実領域制約最適化のために開発された手法により、ネットワークの学習が容易になる。また、データ依存スラック変数を制約として導入することにより、拡張ラグランジアンアプローチに基づいてネットワークを最適化できることを示す。これは,理論上はグローバル収束を達成でき,全ての極限点が学習問題の臨界点であることを意味する。実験では,MNISTデータベースや自然画像に適用した場合,圧縮されたセンサリカバリ問題の解法により最先端の性能が得られた。

The un-rectifying technique expresses a non-linear point-wise activation function as a data-dependent variable, which means that the activation variable along with its input and output can all be employed in optimization. The ReLU network in this study was un-rectified means that the activation functions could be replaced with data-dependent activation variables in the form of equations and constraints. The discrete nature of activation variables associated with un-rectifying ReLUs allows the reformulation of deep learning problems as problems of combinatorial optimization. However, we demonstrate that the optimal solution to a combinatorial optimization problem can be preserved by relaxing the discrete domains of activation variables to closed intervals. This makes it easier to learn a network using methods developed for real-domain constrained optimization. We also demonstrate that by introducing data-dependent slack variables as constraints, it is possible to optimize a network based on the augmented Lagrangian approach. This means that our method could theoretically achieve global convergence and all limit points are critical points of the learning problem. In experiments, our novel approach to solving the compressed sensing recovery problem achieved state-of-the-art performance when applied to the MNIST database and natural images.

翻訳日:2021-03-27 05:47:21 公開日:2021-01-18

# 最適スイッチングレグレットによるオンラインキャッシング

Online Caching with Optimal Switching Regret ( http://arxiv.org/abs/2101.07043v1 )

ライセンス: Link先を確認

Samrat Mukhopadhyay, Abhishek Sinha

(参考訳) オンライン学習の観点から、古典的なアンコードキャッシュ問題を考察する。限られたストレージ容量のキャッシュは、大きなカタログから一度に$c$ファイルを保持することができる。ユーザは、各タイムスロットでカタログから任意のファイルをリクエストする。ユーザからのファイル要求が到着する前に、キャッシュポリシーは、その選択した$c$ファイルでキャッシュをポピュレートする。キャッシュヒットの場合、ポリシーは単位報酬を受け取り、それ以外は報酬を受け取らない。それに加えて、キャッシュへのファイルのフェッチに関連するコストがあります。目的は、キャッシュヒットによる報酬とファイルフェッチによる切り替えコストの両方を考慮して、最小限の後悔を招くキャッシュポリシーを設計することである。この論文の主な貢献は、リーダーベースの永続キャッシングポリシーに従う場合の切替後悔分析であり、これは順番に最適な切替後悔を有することを示している。そこで本研究では,商用cdnサーバから入手可能なトレースを用いて,さまざまなキャッシュポリシのパフォーマンスを比較することにより,この問題に対する最もよく知られたスイッチング後悔を,$\theta(\sqrt{c})という係数で改善する。

We consider the classical uncoded caching problem from an online learning point-of-view. A cache of limited storage capacity can hold $C$ files at a time from a large catalog. A user requests an arbitrary file from the catalog at each time slot. Before the file request from the user arrives, a caching policy populates the cache with any $C$ files of its choice. In the case of a cache-hit, the policy receives a unit reward and zero rewards otherwise. In addition to that, there is a cost associated with fetching files to the cache, which we refer to as the switching cost. The objective is to design a caching policy that incurs minimal regret while considering both the rewards due to cache-hits and the switching cost due to the file fetches. The main contribution of this paper is the switching regret analysis of a Follow the Perturbed Leader-based anytime caching policy, which is shown to have an order optimal switching regret. In this pursuit, we improve the best-known switching regret bound for this problem by a factor of $\Theta(\sqrt{C}).$ We conclude the paper by comparing the performance of different popular caching policies using a publicly available trace from a commercial CDN server.

翻訳日:2021-03-27 05:47:04 公開日:2021-01-18

# 入力/出力トレースからハイブリッドオートマタを学習するためのパッシブオンライン手法

A Passive Online Technique for Learning Hybrid Automata from Input/Output Traces ( http://arxiv.org/abs/2101.07053v1 )

ライセンス: Link先を確認

Iman Saberi, Fathiyeh Faghih, Farzad Sobhi Bavil

(参考訳) 仕様合成は、システムの入力出力トレースからモデルを導出する過程である。テスト設計、リバースエンジニアリング、システム識別に広く使われている。サイバー物理システムにおけるこのプロセスの成果物の一つはハイブリッドオートマトンである。直感的で、正確で、ツールに依存し、抽象度が高く、離散変数と連続変数の両方のシステムをモデル化できる。本稿では,非線形サイバー物理システムの入力出力トレースからハイブリッドオートマトンを合成する新しい手法を提案する。非線形挙動における類似性検出は、そのようなモデルを抽出するための大きな課題である。動的時間ワープ技術を用いてこの問題に対処する。私たちのアプローチは受動的であり、ログされたトレースからのオートマトン合成の間、システムとのインタラクションは不要であり、オンラインでは、各入出力トレースが手順で1回だけ使用されることを意味する。言い換えれば、それぞれの新しいトレースは、既に合成されたオートマトンを改善するために使用できる。我々は,本アルゴリズムを2つの産業・シミュレーションケーススタディで評価した。導出オートマトンの精度は有望な結果を示す。

Specification synthesis is the process of deriving a model from the input-output traces of a system. It is used extensively in test design, reverse engineering, and system identification. One type of the resulting artifact of this process for cyber-physical systems is hybrid automata. They are intuitive, precise, tool independent, and at a high level of abstraction, and can model systems with both discrete and continuous variables. In this paper, we propose a new technique for synthesizing hybrid automaton from the input-output traces of a non-linear cyber-physical system. Similarity detection in non-linear behaviors is the main challenge for extracting such models. We address this problem by utilizing the Dynamic Time Warping technique. Our approach is passive, meaning that it does not need interaction with the system during automata synthesis from the logged traces; and online, which means that each input/output trace is used only once in the procedure. In other words, each new trace can be used to improve the already synthesized automaton. We evaluated our algorithm in two industrial and simulated case studies. The accuracy of the derived automata show promising results.

翻訳日:2021-03-27 05:46:47 公開日:2021-01-18

# PRESTO: 時間モチーフ数の厳密な近似のためのシンプルでスケーラブルなサンプリング手法

PRESTO: Simple and Scalable Sampling Techniques for the Rigorous Approximation of Temporal Motif Counts ( http://arxiv.org/abs/2101.07152v1 )

ライセンス: Link先を確認

Ilie Sarpe, Fabio Vandin

(参考訳) ネットワークモチーフと呼ばれる小さなグラフパターンの識別とカウントは、ソーシャルネットワークから神経科学まで、様々な分野のネットワーク分析における基本的な原始である。静的ネットワークにおけるモチーフの発生を数えるためにいくつかの技術が設計され、近年では大規模ネットワークによる計算課題に焦点が当てられている。現代のネットワークデータセットには、ネットワークエッジによってモデル化されたイベントが発生した時間など、豊富な情報が含まれている。時間的モチーフと呼ばれる時間的ネットワークにおけるモチーフの分析は、現代のネットワーク化されたデータセットの分析において重要な要素となっている。最近、時間的ネットワークにおける時間的モチーフのインスタンス数をカウントするために、いくつかの手法が設計されている。このような手法は厳密であり、大規模ネットワークには適用できないが、生産する推定値に対する弱い保証しか提供せず、大規模ネットワークにはスケールしない。本研究では,時間モチーフ数を厳密に近似するために,効率的でスケーラブルなアルゴリズムを提案する。アルゴリズムは単純だが効果的なサンプリング手法に基づいており、非常に大きなデータセットに対してアルゴリズムを実用的なものにしている。実験により,本アルゴリズムは,最先端サンプリングアルゴリズムよりも精度の高い時間的モチーフ数を推定し,正確なアプローチよりも実行時間が有意に低く,従来考えられていたより大きい時間的モチーフを数十億のエッジネットワーク上で研究することができることを示した。

The identification and counting of small graph patterns, called network motifs, is a fundamental primitive in the analysis of networks, with application in various domains, from social networks to neuroscience. Several techniques have been designed to count the occurrences of motifs in static networks, with recent work focusing on the computational challenges provided by large networks. Modern networked datasets contain rich information, such as the time at which the events modeled by the networks edges happened, which can provide useful insights into the process modeled by the network. The analysis of motifs in temporal networks, called temporal motifs, is becoming an important component in the analysis of modern networked datasets. Several methods have been recently designed to count the number of instances of temporal motifs in temporal networks, which is even more challenging than its counterpart for static networks. Such methods are either exact, and not applicable to large networks, or approximate, but provide only weak guarantees on the estimates they produce and do not scale to very large networks. In this work we present an efficient and scalable algorithm to obtain rigorous approximations of the count of temporal motifs. Our algorithm is based on a simple but effective sampling approach, which renders our algorithm practical for very large datasets. Our extensive experimental evaluation shows that our algorithm provides estimates of temporal motif counts which are more accurate than the state-of-the-art sampling algorithms, with significantly lower running time than exact approaches, enabling the study of temporal motifs, of size larger than the ones considered in previous works, on billion edges networks.

翻訳日:2021-03-27 05:46:32 公開日:2021-01-18

# 組込みLQRコントローラによる深層強化学習

Deep Reinforcement Learning with Embedded LQR Controllers ( http://arxiv.org/abs/2101.07175v1 )

ライセンス: Link先を確認

Wouter Caarls

(参考訳) 強化学習は、環境との直接相互作用を通じて制御ポリシーを最適化するモデルフリーの最適制御方法である。規制に終止符を打つタスクには、ゴール状態のチャタリングのため、一般的な離散アクションメソッドが適していない。強化学習と古典的lqr制御を組み合わせることで,この問題を解決する3つの方法を比較した。特に,LQR制御をアクションセットに統合し,学習力学に基づく場合のリプレイメモリにおける計算制御の一般化と修正を回避する手法を提案する。また,LQR制御を連続動作法に組み込む。いずれの場合においても、lqr制御の追加によりパフォーマンスが向上するが、個別のアクションセットの強化に使用できる場合、その効果はより深くなる。

Reinforcement learning is a model-free optimal control method that optimizes a control policy through direct interaction with the environment. For reaching tasks that end in regulation, popular discrete-action methods are not well suited due to chattering in the goal state. We compare three different ways to solve this problem through combining reinforcement learning with classical LQR control. In particular, we introduce a method that integrates LQR control into the action set, allowing generalization and avoiding fixing the computed control in the replay memory if it is based on learned dynamics. We also embed LQR control into a continuous-action method. In all cases, we show that adding LQR control can improve performance, although the effect is more profound if it can be used to augment a discrete action set.

翻訳日:2021-03-27 05:46:06 公開日:2021-01-18

# 非侵襲負荷モニタリングへのコインシデント水データの導入

Incorporating Coincidental Water Data into Non-intrusive Load Monitoring ( http://arxiv.org/abs/2101.07190v1 )

ライセンス: Link先を確認

Mohammad-Mehdi Keramati, Elnaz Azizi, Hamidreza Momeni, Sadegh Bolouki

(参考訳) 集積電力信号から家電の使用パターンを抽出するプロセスとしての非侵入負荷監視(NILM)は,住宅エネルギー管理を支援するアプローチとして成功している。近年、パワープロファイルの大量データセットが利用可能になり、nilmの目的のために使われる分類手法をより効果的かつ正確にするために役立った。しかし、近接電力値を持つ多モードアプライアンスやアプライアンスの存在は、計算の複雑さを悪化させ、これらのアルゴリズムの精度を低下させることに影響を与え続けている。これらの課題に対処するため、我々はイベントベースの分類プロセスを提案し、その第1段階では、K$-nearest neighbors法を高速な分類手法として、排他的非重複パワー値を持つ家電の電力信号を抽出する。そこで,ネットワーク上の新しいシグネチャとして,一部の家電の水消費を考慮に入れた2つのディープラーニングモデルを用いて,重なり合うパワー値のアプライアンスを識別する。電力の分散化に加えて, 提案手法は, 特定の家電の水消費プロファイルも抽出する。提案手法を概説し,その効率性を検証するため,既存の分類に基づくNILM技術に対して,数値分類結果が顕著に改善されたAMPdを7つのアプライアンスとして検討した。

Non-intrusive load monitoring (NILM) as the process of extracting the usage pattern of appliances from the aggregated power signal is among successful approaches aiding residential energy management. In recent years, high volume datasets on power profiles have become available, which has helped make classification methods employed for the NILM purpose more effective and more accurate. However, the presence of multi-mode appliances and appliances with close power values have remained influential in worsening the computational complexity and diminishing the accuracy of these algorithms. To tackle these challenges, we propose an event-based classification process, in the first phase of which the $K$-nearest neighbors method, as a fast classification technique, is employed to extract power signals of appliances with exclusive non-overlapping power values. Then, two deep learning models, which consider the water consumption of some appliances as a novel signature in the network, are utilized to distinguish between appliances with overlapping power values. In addition to power disaggregation, the proposed process as well extracts the water consumption profiles of specific appliances. To illustrate the proposed process and validate its efficiency, seven appliances of the AMPds are considered, with the numerical classification results showing marked improvement with respect to the existing classification-based NILM techniques.

翻訳日:2021-03-27 05:45:53 公開日:2021-01-18

# メータ数に着目した解離困難度の定量化

Quantification of Disaggregation Difficulty with Respect to the Number of Meters ( http://arxiv.org/abs/2101.07191v1 )

ライセンス: Link先を確認

Elnaz Azizi, Mohammad T H Beheshti, Sadegh Bolouki

(参考訳) 効率的なエネルギー管理への有望なアプローチは、集約された消費信号を分析して住宅内の家電製品の消費プロファイルを抽出する非侵入負荷監視(nilm)である。効率的なNILM法には、集約された信号のイベントを検知し、それらを引き起こすアプライアンスに従って分類するイベントベースのアルゴリズムがある。多数のアプライアンスと消費が密接なアプライアンスの存在は、イベントベースのnilmメソッドの性能を制限することが知られている。これらの課題に取り組むために、ハードウェアコストの増大、インストールの複雑さ、消費者の快適さとプライバシに関する懸念をもたらす機能空間を強化することができる。これは、アプライアンスをブロックに分割し、各ブロックの消費を別々の電力メータで監視する、セミインタラクティブ負荷監視(silm)と呼ばれる別のアプローチの出現につながった。より多くのメーターがより正確なデアグリゲーションをもたらすが、これは負荷監視の金銭的コストを増大させ、この分野の重要なギャップを示すトレードオフを示している。本稿では,このギャップを解消するための包括的アプローチとして,電力値と消費者の利用行動の両方に基づいて,任意の家電群のイベントを監視することがいかに難しいかを定量化する「分散困難度メトリクス(ddm)」という概念を確立した。したがって、DDMは、アプライアンスの任意の分割のブロックにメーターを設置することにより、一般的なイベントベースのアルゴリズムの分解精度の観点から、どれだけの量が得られるかを本質的に定量化する。 REDDデータセットに基づく実験結果は、上記のトレードオフに対応するための提案手法の実用性を示している。

A promising approach toward efficient energy management is non-intrusive load monitoring (NILM), that is to extract the consumption profiles of appliances within a residence by analyzing the aggregated consumption signal. Among efficient NILM methods are event-based algorithms in which events of the aggregated signal are detected and classified in accordance with the appliances causing them. The large number of appliances and the presence of appliances with close consumption values are known to limit the performance of event-based NILM methods. To tackle these challenges, one could enhance the feature space which in turn results in extra hardware costs, installation complexity, and concerns regarding the consumer's comfort and privacy. This has led to the emergence of an alternative approach, namely semi-intrusive load monitoring (SILM), where appliances are partitioned into blocks and the consumption of each block is monitored via separate power meters. While a greater number of meters can result in more accurate disaggregation, it increases the monetary cost of load monitoring, indicating a trade-off that represents an important gap in this field. In this paper, we take a comprehensive approach to close this gap by establishing a so-called notion of "disaggregation difficulty metric (DDM)," which quantifies how difficult it is to monitor the events of any given group of appliances based on both their power values and the consumer's usage behavior. Thus, DDM in essence quantifies how much is expected to be gained in terms of disaggregation accuracy of a generic event-based algorithm by installing meters on the blocks of any partition of the appliances. Experimental results based on the REDD dataset illustrate the practicality of the proposed approach in addressing the aforementioned trade-off.

翻訳日:2021-03-27 05:45:31 公開日:2021-01-18

# 深層材料ネットワークにおける細胞分裂のマルチスケールひずみ局在モデリングへの応用

Cell division in deep material networks applied to multiscale strain localization modeling ( http://arxiv.org/abs/2101.07226v1 )

ライセンス: Link先を確認

Zeliang Liu

(参考訳) コンピュータ支援工学におけるひずみ局所化モデリング(例えば、故障解析)の重要性は高まっているが、複数の長さスケールにわたる関連物質挙動を一貫してモデル化するための効果的なアプローチは存在しない。このギャップを、ビルディングブロックに埋め込まれた物理ベースの機械学習モデルであるディープマテリアルネットワーク(DMN)のフレームワーク内で解決することを目指している。ネットワーク上のスケール遷移を追跡するために新しいセル分割スキームが提案され、その一貫性は適合パラメータの物理によって保証される。本質的には、下層の各マイクロスケールノードは、その次元がマクロスケールの材料点からバックプロパゲーションされた楕円体細胞によって記述される。セル内の新しい亀裂面は凝集層を濃縮することによってモデル化され、暗黙のDMN分析において亀裂発生と進展のための故障アルゴリズムが開発された。粒子強化複合管の動的破砕と炭素繊維強化ポリマー複合材料の各種試験について, マルチスケールモデルを同時マルチスケールシミュレーションに適用した。後者については,オフ軸引張試験試料の実験的検証も行う。

Despite the increasing importance of strain localization modeling (e.g., failure analysis) in computer-aided engineering, there is a lack of effective approaches to consistently modeling related material behaviors across multiple length scales. We aim to address this gap within the framework of deep material networks (DMN) - a physics-based machine learning model with embedded mechanics in the building blocks. A new cell division scheme is proposed to track the scale transition through the network, and its consistency is ensured by the physics of fitting parameters. Essentially, each microscale node in the bottom layer is described by an ellipsoidal cell with its dimensions back-propagated from the macroscale material point. New crack surfaces in the cell are modeled by enriching cohesive layers, and failure algorithms are developed for crack initiation and evolution in the implicit DMN analysis. Besides single material point studies, we apply the multiscale model to concurrent multiscale simulations for the dynamic crush of a particle-reinforced composite tube and various tests on carbon fiber reinforced polymer composites. For the latter, experimental validations on an off-axis tensile test specimen are also provided.

翻訳日:2021-03-27 05:45:01 公開日:2021-01-18

# 前立腺癌検出のための機械学習を用いたラマンケミカルイメージングとデジタル組織像の融合

Feature Fusion of Raman Chemical Imaging and Digital Histopathology using Machine Learning for Prostate Cancer Detection ( http://arxiv.org/abs/2101.07342v1 )

ライセンス: Link先を確認

Trevor Doherty, Susan McKeever, Nebras Al-Attar, Tiarnan Murphy, Claudia Aura, Arman Rahman, Amanda O'Neill, Stephen P Finn, Elaine Kay, William M. Gallagher, R. William G. Watson, Aoife Gowen and Patrick Jackman

(参考訳) 前立腺癌の診断はプレゼンテーションの多様性のため困難であり,非臨床的に重要な疾患の診断と治療が過度に行われている。正確な診断は患者の生活の質や予後に直接利益をもたらす。この問題に対処するために,前立腺癌の自動診断のための学習モデルを提案する。多くの前立腺がん研究ではラマン分光法が採用されているが、ラマン化学イメージング(Raman Chemical Imaging, RCI)と他の画像モダリティの組み合わせは利用されていない。本研究は, 染色デジタル組織学(DP)と非定常RCIを併用したマルチモーダル画像を用いた。本手法は,非癌性Gleason grade 3 (G3) およびグレード4 (G4) 組織マイクロアレイ標本を含む32例の臨床試料178例を用いて開発・試験した。病理組織学的にはDP-RCI画像対とラベルが付けられている。検証された仮説は、診断精度の観点から、マルチモーダル画像モデルが単一モダリティベースラインモデルより優れているかどうかである。 2種類の非癌/がんモデルとより困難なG3/G4の分化について検討した。 g3/g4分類では,マルチモーダルアプローチは73.8%,特異度88.1%,ベースラインdpモデルは54.1%,特異度84.7%であった。マルチモーダルアプローチは、統計学的に有意な12.7%のAUCの優位性を、RCIと中央ラマンスペクトルのみに基づくモデルよりも85.8%の値で証明した。 DPとRCIの特徴融合は、腫瘍識別のより簡単な作業を改善するものではなく、G3/G4識別において観察された優位性をもたらす。これらの有望な結果に基づいて、将来の研究には、拡張モデル一般化のためのより大きなデータセットの取得が含まれる。

The diagnosis of prostate cancer is challenging due to the heterogeneity of its presentations, leading to the over diagnosis and treatment of non-clinically important disease. Accurate diagnosis can directly benefit a patient's quality of life and prognosis. Towards addressing this issue, we present a learning model for the automatic identification of prostate cancer. While many prostate cancer studies have adopted Raman spectroscopy approaches, none have utilised the combination of Raman Chemical Imaging (RCI) and other imaging modalities. This study uses multimodal images formed from stained Digital Histopathology (DP) and unstained RCI. The approach was developed and tested on a set of 178 clinical samples from 32 patients, containing a range of non-cancerous, Gleason grade 3 (G3) and grade 4 (G4) tissue microarray samples. For each histological sample, there is a pathologist labelled DP - RCI image pair. The hypothesis tested was whether multimodal image models can outperform single modality baseline models in terms of diagnostic accuracy. Binary non-cancer/cancer models and the more challenging G3/G4 differentiation were investigated. Regarding G3/G4 classification, the multimodal approach achieved a sensitivity of 73.8% and specificity of 88.1% while the baseline DP model showed a sensitivity and specificity of 54.1% and 84.7% respectively. The multimodal approach demonstrated a statistically significant 12.7% AUC advantage over the baseline with a value of 85.8% compared to 73.1%, also outperforming models based solely on RCI and median Raman spectra. Feature fusion of DP and RCI does not improve the more trivial task of tumour identification but does deliver an observed advantage in G3/G4 discrimination. Building on these promising findings, future work could include the acquisition of larger datasets for enhanced model generalization.

翻訳日:2021-03-27 05:44:45 公開日:2021-01-18

# 学習キャッシュによるディープラーニング推論の高速化

Accelerating Deep Learning Inference via Learned Caches ( http://arxiv.org/abs/2101.07344v1 )

ライセンス: Link先を確認

Arjun Balasubramanian, Adarsh Kumar, Yuhan Liu, Han Cao, Shivaram Venkataraman, Aditya Akella

(参考訳) Deep Neural Networks(DNN)は、現実世界の問題を解決する上で、高い精度で複数のドメインが採用されるのを目撃している。しかし、この高い精度は、より深いネットワークを構築することによって達成され、ユーザ向けアプリケーションによって望まれる低レイテンシの推論に対する根本的な課題となっている。現在の低レイテンシソリューションは、正確性に関するトレードオフか、ワークロード提供の予測に固有の時間的局所性を活用できないかのどちらかだ。我々は、DNNの隠れ層出力をキャッシュすることで、推論要求が必要な計算量だけを消費する遅延バインディングの形式を導入することを観察する。これにより、低レイテンシを実現するためのメカニズムと、時間的局所性を活用する能力が組み合わされる。しかし、従来のキャッシュアプローチでは、高いメモリオーバーヘッドとルックアップのレイテンシが発生し、学習したキャッシュ – 継続的に更新される単純なmlモデルで構成されるキャッシュ – を設計することになります。低レイテンシDNN推論のための学習キャッシュを組み込んだエンドツーエンド予測サービスであるGATIの設計を提案する。その結果、GATIは現実的なワークロードにおいて、推論遅延を最大7.69倍削減できることがわかった。

Deep Neural Networks (DNNs) are witnessing increased adoption in multiple domains owing to their high accuracy in solving real-world problems. However, this high accuracy has been achieved by building deeper networks, posing a fundamental challenge to the low latency inference desired by user-facing applications. Current low latency solutions trade-off on accuracy or fail to exploit the inherent temporal locality in prediction serving workloads. We observe that caching hidden layer outputs of the DNN can introduce a form of late-binding where inference requests only consume the amount of computation needed. This enables a mechanism for achieving low latencies, coupled with an ability to exploit temporal locality. However, traditional caching approaches incur high memory overheads and lookup latencies, leading us to design learned caches - caches that consist of simple ML models that are continuously updated. We present the design of GATI, an end-to-end prediction serving system that incorporates learned caches for low-latency DNN inference. Results show that GATI can reduce inference latency by up to 7.69X on realistic workloads.

翻訳日:2021-03-27 05:44:14 公開日:2021-01-18

# データ管理レンズを通して:公正な分類の実験的分析と評価

Through the Data Management Lens: Experimental Analysis and Evaluation of Fair Classification ( http://arxiv.org/abs/2101.07361v1 )

ライセンス: Link先を確認

Maliha Tashfia Islam, Anna Fariha, Alexandra Meliou

(参考訳) データ駆動機械学習タスクである分類は、ローン承認や犯罪リスク評価といった重要な人間の判断を含む予測システムの増加を推進している。しかし、分類器はしばしば識別行動を示し、特にバイアスデータで示される場合である。その結果、分類の公平性は高優先度の研究領域として浮上した。データ管理研究は、公正分類のトピックを含む、データとアルゴリズムの公平性に関連するトピックの存在と関心を示している。公平な分類における学際的な取り組みは、機械学習の研究が最大の存在感を持ち、多くの公平性概念と、体系的に評価・比較されていない幅広いアプローチを生み出した。本稿では,その正確性,公平性,効率性,スケーラビリティ,安定性について,さまざまなメトリクスと実世界のデータセットを用いて,13の公正な分類アプローチと,さらに別のバリエーションを幅広く分析する。我々の分析は、異なるメトリクスとハイレベルなアプローチ特性がパフォーマンスの異なる側面に与える影響に関する新しい洞察を強調します。また、異なる実践的設定に適したアプローチを選択するための一般的な原則を議論し、データ管理中心のソリューションが最も影響を与える可能性のある領域を特定する。

Classification, a heavily-studied data-driven machine learning task, drives an increasing number of prediction systems involving critical human decisions such as loan approval and criminal risk assessment. However, classifiers often demonstrate discriminatory behavior, especially when presented with biased data. Consequently, fairness in classification has emerged as a high-priority research area. Data management research is showing an increasing presence and interest in topics related to data and algorithmic fairness, including the topic of fair classification. The interdisciplinary efforts in fair classification, with machine learning research having the largest presence, have resulted in a large number of fairness notions and a wide range of approaches that have not been systematically evaluated and compared. In this paper, we contribute a broad analysis of 13 fair classification approaches and additional variants, over their correctness, fairness, efficiency, scalability, and stability, using a variety of metrics and real-world datasets. Our analysis highlights novel insights on the impact of different metrics and high-level approach characteristics on different aspects of performance. We also discuss general principles for choosing approaches suitable for different practical settings, and identify areas where data-management-centric solutions are likely to have the most impact.

翻訳日:2021-03-27 05:43:58 公開日:2021-01-18

PDF登録状況（公開日: 20210118）