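Fugu-MT: Translations of arXiv Papers

This site provides Japanese translations of arXiv papers that are 30 pages or fewer and released under a Creative Commons license (CC 0, CC BY, CC BY-SA). For papers whose body is not CC-licensed, or that are too long, only the metadata is translated. (arXiv metadata is CC 0.) The translations are licensed under CC BY-SA 4.0. Translation is performed with Fugu-Machine Translator.

The operator of this site does not guarantee the quality of this site (including all information and translations) and accepts no responsibility for any consequences arising from the use of this site (including all information and translations).

Papers with a publication date of 20200801.

Title	Authors	Abstract	Publication date / Translation date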
# Multidimensional dark space and its underlying symmetries: towards dissipation-protected qubits ( http://arxiv.org/abs/2002.00237v2 ) License: see link	Raul A. Santos, Fernando Iemini, Alex Kamenev and Yuval Gefen	Quantum systems are always subject to interactions with an environment, typically resulting in decoherence and distortion of quantum correlations. It has been recently shown that a controlled interaction with the environment may actually help to create a state, dubbed "dark", which is immune to decoherence. To encode quantum information in the dark states, they need to span a space with a dimensionality larger than one, so different orthogonal states act as a computational basis. We devise a symmetry-based conceptual framework to engineer such degenerate dark spaces (DDS), protected from decoherence by the environment. We illustrate this construction with a model protocol, inspired by the fractional quantum Hall effect, where the DDS basis is isomorphic to a set of degenerate Laughlin states. The long-time steady state of our driven-dissipative model thus exhibits all the characteristics of degenerate vacua of a unitary topological system. This approach offers new possibilities for storing, protecting and manipulating quantum information in open systems.	Translated: 2023-06-05 00:39:14 Published: 2020-08-01
# Constant depth fault-tolerant Clifford circuits for multi-qubit large block codes ( http://arxiv.org/abs/2003.12328v2 ) License: see link	Yi-Cong Zheng, Ching-Yi Lai, Todd A. Brun, Leong-Chuan Kwek	Fault-tolerant quantum computation (FTQC) schemes using large block codes that encode $k>1$ qubits in $n$ physical qubits can potentially reduce the resource overhead to a great extent because of their high encoding rate. However, the fault-tolerant (FT) logical operations for the encoded qubits are difficult to find and implement, which usually takes not only a very large resource overhead but also long in-situ computation time. In this paper, we focus on Calderbank-Shor-Steane $[\![ n,k,d ]\!]$ (CSS) codes and their logical FT Clifford circuits. We show that the depth of an arbitrary logical Clifford circuit can be implemented fault-tolerantly in $O(1)$ steps in-situ via either Knill or Steane syndrome measurement circuits, with the qualified ancilla states efficiently prepared. Particularly, for those codes satisfying $k/n\sim \Theta(1)$, the resource scaling for Clifford circuit implementation on the logical level can be the same as on the physical level up to a constant, which is independent of code distance $d$. With a suitable pipeline to produce ancilla states, our scheme requires only a modest resource cost in physical qubits, physical gates, and computation time for very large scale FTQC.	Translated: 2023-05-27 18:33:19 Published: 2020-08-01
# Optical Response and Drift Matrix of Quadratic Optomechanical System ( http://arxiv.org/abs/2005.07065v3 ) License: see link	Akash Kundu	Nonlinear interactions in optomechanical systems play a crucial role in a growing number of interesting studies and phenomena, such as the existence of optomechanical chaos introduced by F. Monifi et al. [Nature Photonics 10, 399-405 (2016)] and optomechanical symmetry breaking proposed by Zhong-Peng Liu et al. [Phys. Rev. Lett. 117, 110802 (2016)]. In this article we theoretically examine a quadratically coupled optomechanical system containing two atomic levels. We first study the solution of various modes of the system at steady state, and then observe the variation of the Transmission Intensity (T) with several parameters of the system. Further, we extend our analysis to find the drift matrix of the quadratic optomechanical system and the stability conditions by adiabatically eliminating the atomic degree of freedom.	Translated: 2023-05-20 11:42:00 Published: 2020-08-01
# Emergence of Intra-Particle Entanglement and Time-Varying Violation of Bell's Inequality in Dirac Matter ( http://arxiv.org/abs/2007.01584v2 ) License: see link	Bruna Gabrielly de Moraes, Aron W. Cummings, and Stephan Roche	We demonstrate the emergence and dynamics of intra-particle entanglement in massless Dirac fermions. This entanglement, generated by spin-orbit coupling, arises between the spin and sublattice pseudospin of electrons in graphene. The entanglement is a complex dynamic quantity but is generally large, independent of the initial state. Its time dependence implies a dynamical violation of a Bell inequality, while its magnitude indicates that large intra-particle entanglement is a general feature of graphene on a substrate. These features are also expected to impact entanglement between pairs of particles, and may be detectable in experiments that combine Cooper pair splitting with nonlocal measurements of spin-spin correlation in mesoscopic devices based on Dirac materials.	Translated: 2023-05-11 18:34:14 Published: 2020-08-01
# Single ion-qubit exceeding one hour coherence time ( http://arxiv.org/abs/2008.00251v1 ) License: see link	Pengfei Wang, Chun-Yang Luan, Mu Qiao, Mark Um, Junhua Zhang, Ye Wang, Xiao Yuan, Mile Gu, Jingning Zhang, Kihwan Kim	Realizing a long coherence time quantum memory is a major challenge of current quantum technology. Here, we report a single Yb ion-qubit memory with over one hour coherence time, an order of improvement compared to the state-of-the-art record. The long coherence time memory is realized by addressing various technical challenges such as ambient magnetic-field noise, phase noise and leakage of the microwave oscillator. Moreover, we systematically study the decoherence process of our quantum memory by quantum process tomography, which enables us to apply a strict criterion of quantum coherence, the relative entropy of coherence. We also benchmark our quantum memory by its ability to preserve quantum information, i.e., the robustness of quantum memory, which clearly shows that over 6000 s, our quantum memory preserves non-classical quantum information. Our results verify the stability of the quantum memory on the scale of hours and indicate its versatile applicability in various scenarios.	Translated: 2023-05-07 10:38:48 Published: 2020-08-01
# Standardized Green View Index and Quantification of Different Metrics of Urban Green Vegetation ( http://arxiv.org/abs/2008.00229v1 ) License: see link	Yusuke Kumakoshi, Sau Yee Chan, Hideki Koizumi, Xiaojiang Li and Yuji Yoshimura	Urban greenery is considered an important factor in relation to sustainable development and people's quality of life in the city. Although ways to measure urban greenery have been proposed, the characteristics of each metric have not been fully established, rendering previous research vulnerable to changes in greenery metrics. To make estimation more robust, this study aims to (1) propose an improved indicator of greenery visibility for analytical use (standardized GVI; sGVI), and (2) quantify the relation between sGVI and other greenery metrics. Analyzing a data set for Yokohama city, Japan, it is shown that the sGVI, a weighted form of GVI aggregated to an area, mitigates the bias of densely located measurement sites. Also, by comparing sGVI and NDVI at city block level, we found that sGVI captures the presence of vegetation better in the city center, whereas NDVI is better at capturing vegetation in parks and forests. These tools provide a foundation for assessing the effect of vegetation in urban landscapes in a more robust manner, enabling comparison on any arbitrary geographical scale.	Translated: 2023-05-07 10:38:30 Published: 2020-08-01
# Driving-induced resonance narrowing in a strongly coupled cavity-qubit system ( http://arxiv.org/abs/2008.00224v1 ) License: see link	Eyal Buks, Paul Brookes, Eran Ginossar, Chunqing Deng, Jean-Luc F. X. Orgiazzi, Martin Otto and Adrian Lupascu	We study a system consisting of a superconducting flux qubit strongly coupled to a microwave cavity. Externally applied qubit driving is employed in order to manipulate the spectrum of dressed states. We observe resonance narrowing in the region where the splitting between the two dressed fundamental resonances is tuned to zero. The narrowing in this region of overlapping resonances can be exploited for long-time storage of quantum states. In addition, we measure the response to strong cavity mode driving, and find a qualitative deviation between the experimental results and the predictions of a semiclassical model. On the other hand, good agreement is obtained using theoretical predictions obtained by numerically integrating the master equation governing the system's dynamics. The observed response demonstrates a process of coherent cancellation of two meta-stable dressed states.	Translated: 2023-05-07 10:38:10 Published: 2020-08-01
# Device to Remotely Track and Locate the Position of a Child for Safety ( http://arxiv.org/abs/2008.00211v1 ) License: see link	S.M.K.C.S.B. Egodawela, H.M.D.M.B. Herath, R.D. Ranaweera, J.V. Wijayakulasooriya	Parents are always worried about the wellbeing of their children. As per the Statistics Report 2017 by Missing Children Europe Organization, a child is reported missing every 2 minutes. Due to the imminent threat, parents are prone to buy their children mobile phones to keep in touch with them. However, giving a mobile phone to a child can cause issues including cyber bullying, improper use of social networks, access to mature age and illicit content on the internet and possibly, phone theft. As an effort to tackle some of those issues, this paper proposes a solution which enables parents to call, locate and track their children using a child-friendly mobile device. The common scenario in which the device would come into play is in enhancing the safety of a child who travels alone on a typical route; for instance a child who walks from home to school and back. The device can be calibrated to keep track of a typical route of travel. Then, if the device detects some deviation from the usual route, it triggers a notification to parents. A novel algorithm based on a probability matrix is introduced to detect route deviation. Design details of the mobile device, along with the details of the route deviation detection algorithm, are presented in this paper.	Translated: 2023-05-07 10:37:21 Published: 2020-08-01
# BatNet: Data transmission between smartphones over ultrasound ( http://arxiv.org/abs/2008.00136v1 ) License: see link	Almos Zarandy, Ilia Shumailov, Ross Anderson	In this paper, we present BatNet, a data transmission mechanism using ultrasound signals over the built-in speakers and microphones of smartphones. Using phase shift keying with an 8-point constellation and frequencies between 20 and 24 kHz, it can transmit data at over 600 bit/s up to 6 m. The target application is a censorship-resistant mesh network. We also evaluated it for Covid contact tracing but concluded that in this application ultrasonic communications do not appear to offer enough advantage over Bluetooth Low Energy to be worth further development.	Translated: 2023-05-07 10:36:40 Published: 2020-08-01
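The BatNet entry above mentions phase shift keying with an 8-point constellation. As a minimal illustration of that general technique (not the authors' implementation), the Python sketch below maps 3-bit symbols to points on the unit circle and recovers them by nearest phase. The function names, the natural (non-Gray) bit mapping, and the zero-padding rule are assumptions made for the example; carrier generation in the 20 to 24 kHz band, synchronization, and error correction are all left out.

```python
import cmath
import math

M = 8  # 8-point constellation: each symbol carries 3 bits


def bits_to_symbols(bits):
    """Group bits into 3-bit integers (hypothetical helper; pads with zeros)."""
    bits = list(bits) + [0] * (-len(bits) % 3)
    return [bits[i] * 4 + bits[i + 1] * 2 + bits[i + 2]
            for i in range(0, len(bits), 3)]


def modulate(symbols):
    """Map each symbol s to the unit-circle point exp(2*pi*j*s/M)."""
    return [cmath.exp(2j * math.pi * s / M) for s in symbols]


def demodulate(points):
    """Recover each symbol by rounding the received phase to the nearest point."""
    return [round(cmath.phase(p) / (2 * math.pi / M)) % M for p in points]


symbols = bits_to_symbols([1, 0, 1, 0, 1, 1, 0])
received = modulate(symbols)
print(symbols, demodulate(received))
```

Decoding tolerates a phase error of up to half the spacing between neighbouring points (pi/8 radians); beyond that, a symbol is mistaken for its neighbour.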
# Comment on the paper "Third-harmonic generation investigated by a short-range bottomless exponential potential well" by M. Hu, K. Guo, Q. Yu, Z. Zhang [Superlattices and Microstructures, 122 (2018) 538-547] ( http://arxiv.org/abs/2008.01833v1 ) License: see link	A.M. Ishkhanyan and G.G. Demirkhanyan	We have discovered several severe errors in the recent paper M. Hu, K. Guo, Q. Yu, Z. Zhang [Superlattices and Microstructures, 122 (2018) 538-547]. Specifically, we demonstrate that both the solution of the Schrödinger equation and the bound-state wave functions used in the paper are incorrect.	Translated: 2023-05-07 10:30:22 Published: 2020-08-01
# Stability Analysis of Quantum Systems: a Lyapunov Criterion and an Invariance Principle ( http://arxiv.org/abs/2008.01534v1 ) License: see link	Muhammad F. Emzir, Matthew J. Woolley, Ian R. Petersen	In this article, we propose a Lyapunov stability approach to analyze the convergence of the density operator of a quantum system. In analogy to the classical probability measure for Markovian processes, we show that the set of invariant density operators is both closed and convex. We then show how to analyze the stability of this set via a candidate Lyapunov operator. We complete our analysis of the set of invariant density operators by introducing an analog of the Barbashin-Krasovskii-La Salle theorem on the dynamics of quantum systems.	Translated: 2023-05-07 10:30:09 Published: 2020-08-01
# Exactly-solvable quantum systems in terms of Lambert-W functions ( http://arxiv.org/abs/2008.01072v1 ) License: see link	A. Schulze-Halberg and A.M. Ishkhanyan	We construct a variety of new exactly-solvable quantum systems, the potentials of which are given in terms of Lambert-W functions. In particular, we generate Schrödinger models with energy-dependent potentials, conventional Schrödinger models using the supersymmetry formalism, and two-dimensional Dirac systems. In addition, we derive Wronskian integral formulas for Lambert-W functions.	Translated: 2023-05-07 10:29:58 Published: 2020-08-01
# Demonstrating Quantum Zeno Effect on IBM Quantum Experience ( http://arxiv.org/abs/2008.01070v1 ) License: see link	Subhashish Barik, Dhiman Kumar Kalita, Bikash K. Behera, Prasanta K. Panigrahi	Quantum Zeno Effect (QZE) has been one of the most interesting phenomena in quantum mechanics ever since its discovery in 1977 by Misra and Sudarshan [J. Math. Phys. 18, 756 (1977)]. There have been many attempts at experimental realization of the same. Here, we present the first ever simulation of QZE on the IBM quantum experience platform. We simulate a two-level system for Rabi-driven oscillation and then disturb the time evolution by intermediate repetitive measurements using quantum gates to increase the survival probability of the qubit in the initial state. The circuits are designed along with the added intermediate measurements and executed on the IBM quantum simulator, and the outcomes are shown to be consistent with the predictions. The increasing survival probability with the number of intermediate measurements demonstrates QZE. Furthermore, some alternative explanations for the obtained results are provided, which leads to some ambiguity in giving the exact reasoning for the observed outcomes.	Translated: 2023-05-07 10:29:51 Published: 2020-08-01
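The Quantum Zeno entry above describes survival probability rising with the number of intermediate measurements. A textbook idealization of that statement, offered only as a sketch and not as the circuits used in the paper, splits a Rabi rotation of total angle theta into N equal steps, each followed by an ideal projective measurement. The probability of finding the qubit in its initial state at every step is then cos^2(theta / 2N) raised to the power N. The function name and the default angle of pi (a full flip when unmeasured) are assumptions for the example.

```python
import math


def survival_probability(n_measurements, total_angle=math.pi):
    """Probability that all n ideal projective measurements find the initial state.

    Each step rotates the qubit by total_angle / n, so one measurement
    succeeds with probability cos^2(step / 2); n independent successes
    multiply to cos(step / 2) ** (2 * n).
    """
    step = total_angle / n_measurements
    return math.cos(step / 2) ** (2 * n_measurements)


for n in (1, 2, 4, 16, 64):
    print(n, round(survival_probability(n), 4))
```

In this idealized model the survival probability tends to 1 as N grows, which is the Zeno behaviour; real hardware adds gate and readout errors that this sketch ignores.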
# Few-mode Field Quantization of Arbitrary Electromagnetic Spectral Densities ( http://arxiv.org/abs/2008.00349v1 ) License: see link	Ivan Medina, Francisco J. García-Vidal, Antonio I. Fernández-Domínguez, Johannes Feist	We develop a framework that provides a few-mode master equation description of the interaction between a single quantum emitter and an arbitrary electromagnetic environment. The field quantization requires only the fitting of the spectral density, obtained through classical electromagnetic simulations, to a model system involving a small number of lossy and interacting modes. We illustrate the power and validity of our approach by describing the population and electric field dynamics in the spontaneous decay of an emitter placed in a complex hybrid plasmonic-photonic structure.	Translated: 2023-05-07 10:29:36 Published: 2020-08-01
# Quantum Theory Needs No 'Interpretation' But 'Theoretical Formal-Conceptual Unity' (Or: Escaping Adan Cabello's "Map of Madness" With the Help of David Deutsch's Explanations) ( http://arxiv.org/abs/2008.00321v1 ) License: see link	Christian de Ronde	In the year 2000, in a paper titled Quantum Theory Needs No 'Interpretation', Chris Fuchs and Asher Peres presented a series of instrumentalist arguments against the role played by 'interpretations' in QM. Since then, quite regardless of the publication of this paper, the number of interpretations has experienced a continuous growth, constituting what Adan Cabello has characterized as a "map of madness". In this work, we discuss the reasons behind this dangerous fragmentation in understanding and provide new arguments against the need of interpretations in QM which, opposite to those of Fuchs and Peres, are derived from a representational realist understanding of theories grounded in the writings of Einstein, Heisenberg and Pauli. Furthermore, we will argue that there are reasons to believe that the creation of 'interpretations' for the theory of quanta has functioned as a trap designed by anti-realists in order to imprison realists in a labyrinth with no exit. Taking as a standpoint the critical analysis by David Deutsch of the anti-realist understanding of physics, we attempt to address the references and roles played by 'theory' and 'observation'. In this respect, we will argue that the key to escaping the anti-realist trap of interpretation is to recognize that, as Einstein told Heisenberg almost one century ago, it is only the theory which can tell you what can be observed. Finally, we will conclude that what QM needs is not a new interpretation but instead a theoretical (formal-conceptual) consistent, coherent and unified scheme which allows us to understand what the theory is really talking about.	Translated: 2023-05-07 10:29:29 Published: 2020-08-01
# A fresh look at introductory data science ( http://arxiv.org/abs/2008.00315v1 ) License: see link	Mine Çetinkaya-Rundel and Victoria Ellison	The proliferation of vast quantities of available datasets that are large and complex in nature has challenged universities to keep up with the demand for graduates trained in both the statistical and the computational set of skills required to effectively plan, acquire, manage, analyze, and communicate the findings of such data. To keep up with this demand, attracting students early on to data science as well as providing them a solid foray into the field becomes increasingly important. We present a case study of an introductory undergraduate course in data science that is designed to address these needs. Offered at Duke University, this course has no pre-requisites and serves a wide audience of aspiring statistics and data science majors as well as humanities, social sciences, and natural sciences students. We discuss the unique set of challenges posed by offering such a course and, in light of these challenges, we present a detailed discussion of the pedagogical design elements, content, structure, computational infrastructure, and the assessment methodology of the course. We also offer a repository containing all teaching materials that are open-source, along with supplemental materials and the R code for reproducing the figures found in the paper.	Translated: 2023-05-07 10:28:53 Published: 2020-08-01
# On errors generated by unitary dynamics of bipartite quantum systems ( http://arxiv.org/abs/2008.00290v1 ) License: see link	G.G. Amosov, A.S. Mokeev	Given a quantum channel it is possible to define the non-commutative operator graph whose properties determine a possibility of error-free transmission of information via this channel. The corresponding graph has a straightforward definition through the Kraus operators determining quantum errors. We discuss the opposite problem of a proper definition of the errors that some graph corresponds to. Taking into account that any graph is generated by some POVM, we give a solution to such a problem by means of the Naimark dilatation theorem. Using our approach we construct errors corresponding to the graphs generated by unitary dynamics of bipartite quantum systems. The cases of POVMs on the circle group ${\mathbb Z}_n$ and the additive group $\mathbb R$ are discussed. As an example we construct the graph corresponding to the errors generated by the dynamics of a two-mode quantum oscillator.	Translated: 2023-05-07 10:28:30 Published: 2020-08-01
# Robust-fidelity hyperparallel controlled-phase-flip gate through microcavities ( http://arxiv.org/abs/2008.00258v1 ) License: see link	Hai-Rui Wei, Yan-Bei Zheng, Ming Hua, and Guo-Fu Xu	Hyperparallel quantum information processing outperforms its traditional parallel counterpart in terms of channel capacity, low loss rate, and processing speed. We present a way of implementing a robust hyperparallel optical controlled-phase-flip gate through microcavities. The gate acts on polarization and spatial degrees of freedom (DOFs) simultaneously, and the incomplete and undesired interactions between photons and quantum dots are prevented. Interestingly, unity fidelity of the gate can be achieved in principle, and the success of the gate is heralded by the single-photon detectors.	Translated: 2023-05-07 10:27:51 Published: 2020-08-01
# Modeling ASR Ambiguity for Dialogue State Tracking Using Word Confusion Networks ( http://arxiv.org/abs/2002.00768v2 ) License: see link	Vaishali Pal, Fabien Guillot, Manish Shrivastava, Jean-Michel Renders, Laurent Besacier	Spoken dialogue systems typically use a list of top-N ASR hypotheses for inferring the semantic meaning and tracking the state of the dialogue. However, ASR graphs, such as confusion networks (confnets), provide a compact representation of a richer hypothesis space than a top-N ASR list. In this paper, we study the benefits of using confusion networks with a state-of-the-art neural dialogue state tracker (DST). We encode the 2-dimensional confnet into a 1-dimensional sequence of embeddings using an attentional confusion network encoder which can be used with any DST system. Our confnet encoder is plugged into the state-of-the-art 'Global-locally Self-Attentive Dialogue State Tracker' (GLAD) model for DST and obtains significant improvements in both accuracy and inference time compared to using top-N ASR hypotheses.	Translated: 2023-01-04 08:30:06 Published: 2020-08-01
# Revisiting Training Strategies and Generalization Performance in Deep Metric Learning ( http://arxiv.org/abs/2002.08473v9 ) License: see link	Karsten Roth, Timo Milbich, Samarth Sinha, Prateek Gupta, Björn Ommer, Joseph Paul Cohen	Deep Metric Learning (DML) is arguably one of the most influential lines of research for learning visual similarities, with many proposed approaches every year. Although the field benefits from the rapid progress, the divergence in training protocols, architectures, and parameter choices makes an unbiased comparison difficult. To provide a consistent reference point, we revisit the most widely used DML objective functions and conduct a study of the crucial parameter choices as well as the commonly neglected mini-batch sampling process. Under consistent comparison, DML objectives show much higher saturation than indicated by the literature. Further, based on our analysis, we uncover a correlation between the embedding space density and compression and the generalization performance of DML models. Exploiting these insights, we propose a simple, yet effective, training regularization to reliably boost the performance of ranking-based DML models on various standard benchmark datasets. Code and a publicly accessible WandB-repo are available at https://github.com/Confusezius/Revisiting_Deep_Metric_Learning_PyTorch.	Translated: 2022-12-30 14:10:47 Published: 2020-08-01
# Inferring Spatial Uncertainty in Object Detection ( http://arxiv.org/abs/2003.03644v2 ) License: see link	Zining Wang, Di Feng, Yiyang Zhou, Lars Rosenbaum, Fabian Timm, Klaus Dietmayer, Masayoshi Tomizuka and Wei Zhan	The availability of real-world datasets is the prerequisite for developing object detection methods for autonomous driving. While ambiguity exists in object labels due to an error-prone annotation process or sensor observation noise, current object detection datasets only provide deterministic annotations without considering their uncertainty. This precludes an in-depth evaluation among different object detection methods, especially for those that explicitly model predictive probability. In this work, we propose a generative model to estimate bounding box label uncertainties from LiDAR point clouds, and define a new representation of the probabilistic bounding box through spatial distribution. Comprehensive experiments show that the proposed model represents uncertainties commonly seen in driving scenarios. Based on the spatial distribution, we further propose an extension of IoU, called the Jaccard IoU (JIoU), as a new evaluation metric that incorporates label uncertainty. Experiments on the KITTI and the Waymo Open Datasets show that JIoU is superior to IoU when evaluating probabilistic object detectors.	Translated: 2022-12-25 19:39:21 Published: 2020-08-01
# 畳み込み型占有ネットワーク Convolutional Occupancy Networks ( http://arxiv.org/abs/2003.04618v2 ) ライセンス: Link先を確認	Songyou Peng, Michael Niemeyer, Lars Mescheder, Marc Pollefeys, Andreas Geiger	(参考訳) 近年、暗黙の神経表現が学習に基づく3D再構成で人気を集めている。有望な結果を示す一方で、ほとんどの暗黙的なアプローチは単一のオブジェクトの単純な幾何学に限られており、より複雑で大規模なシーンにスケールしない。暗黙的手法の鍵となる制限要因は、観察中に局所的な情報を統合したり、翻訳等価性のような帰納的バイアスを組み込むことができない、単純な完全連結ネットワークアーキテクチャである。本稿では,オブジェクトと3Dシーンの詳細な再構築のための,より柔軟な暗黙的表現である畳み込みネットワークを提案する。畳み込みエンコーダと暗黙の占有デコーダを組み合わせたモデルでは,帰納的バイアスが組み込まれ,3次元空間における構造化推論が可能となる。ノイズ点雲と低分解能ボクセル表現から複素幾何を再構成することにより,提案表現の有効性を検討する。実験により,本手法は単一物体の微細な3次元再構成,大規模屋内シーンへのスケール,合成データから実データへの一般化を可能にした。 Recently, implicit neural representations have gained popularity for learning-based 3D reconstruction. While demonstrating promising results, most implicit approaches are limited to comparably simple geometry of single objects and do not scale to more complicated or large-scale scenes. The key limiting factor of implicit methods is their simple fully-connected network architecture which does not allow for integrating local information in the observations or incorporating inductive biases such as translational equivariance. In this paper, we propose Convolutional Occupancy Networks, a more flexible implicit representation for detailed reconstruction of objects and 3D scenes. By combining convolutional encoders with implicit occupancy decoders, our model incorporates inductive biases, enabling structured reasoning in 3D space. We investigate the effectiveness of the proposed representation by reconstructing complex geometry from noisy point clouds and low-resolution voxel representations. We empirically find that our method enables the fine-grained implicit 3D reconstruction of single objects, scales to large indoor scenes, and generalizes well from synthetic to real data.	翻訳日:2022-12-24 21:11:58 公開日:2020-08-01
# Emotions Don't Lie:Affective Cuesを用いたオーディオ・ビジュアルディープフェイク検出法 Emotions Don't Lie: An Audio-Visual Deepfake Detection Method Using Affective Cues ( http://arxiv.org/abs/2003.06711v3 ) ライセンス: Link先を確認	Trisha Mittal, Uttaran Bhattacharya, Rohan Chandra, Aniket Bera, Dinesh Manocha	(参考訳) 本稿では,実および偽のディープフェイクマルチメディアコンテンツを検出するための学習ベース手法を提案する。学習のための情報を最大化するために,同じビデオから2つのオーディオと視覚の類似性を抽出し,分析する。さらに,映像中の2つのモダリティから感情知覚に対応する感情的手がかりを抽出・比較し,入力映像が「リアル」か「フェイク」かを推定する。本稿では,シームズネットワークアーキテクチャと三重項損失にインスパイアされたディープラーニングネットワークを提案する。本モデルの有効性を検証するため,大規模深度検出データセットであるDeepFake-TIMIT DatasetとDFDCのAUC測定値について報告する。我々は,複数のSOTAディープフェイク検出手法とDFDCで84.4%,DF-TIMITデータセットで96.6%の動画AUCとを比較した。我々の知る限りでは、オーディオとビデオのモダリティを同時に活用する最初のアプローチであり、ディープフェイク検出のための2つのモダリティからの感情も認識する。 We present a learning-based method for detecting real and fake deepfake multimedia content. To maximize information for learning, we extract and analyze the similarity between the two audio and visual modalities from within the same video. Additionally, we extract and compare affective cues corresponding to perceived emotion from the two modalities within a video to infer whether the input video is "real" or "fake". We propose a deep learning network, inspired by the Siamese network architecture and the triplet loss. To validate our model, we report the AUC metric on two large-scale deepfake detection datasets, DeepFake-TIMIT Dataset and DFDC. We compare our approach with several SOTA deepfake detection methods and report per-video AUC of 84.4% on the DFDC and 96.6% on the DF-TIMIT datasets, respectively. To the best of our knowledge, ours is the first approach that simultaneously exploits audio and video modalities and also perceived emotions from the two modalities for deepfake detection.	翻訳日:2022-12-23 20:12:12 公開日:2020-08-01
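The abstract above says the network is inspired by the Siamese architecture and the triplet loss, but it does not state the exact objective. As a rough illustration only, the standard triplet margin loss can be sketched as follows; the function names, the toy embeddings, and the margin value are assumptions made for this example and are not taken from the paper.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss: max(0, d(a, p) - d(a, n) + margin).

    The loss is zero once the negative is at least `margin` farther from
    the anchor than the positive is. The margin value is an assumption.
    """
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(0.0, gap + margin)

# Hypothetical 2-D embeddings, for illustration only.
well_separated = triplet_loss([0.0, 0.0], [0.0, 0.0], [3.0, 4.0])
violating = triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 0.0])
```

In an audio-visual setting such as the one described, the anchor, positive, and negative would be embeddings of different modalities or videos; how the paper forms those triplets is not specified in the abstract.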
# DHP: HyperNetworksによる差別化可能なメタプルーニング DHP: Differentiable Meta Pruning via HyperNetworks ( http://arxiv.org/abs/2003.13683v3 ) ライセンス: Link先を確認	Yawei Li, Shuhang Gu, Kai Zhang, Luc Van Gool, Radu Timofte	(参考訳) ネットワークプルーニングは、ニューラルネットワークの加速とモデルストレージ/送信負荷の軽減の原動力となっている。 AutoMLとニューラルアーキテクチャサーチ(NAS)の出現により、プルーニングは自動メカニズムと検索に基づくアーキテクチャ最適化で話題になっている。しかし、現在の自動設計は強化学習か進化的アルゴリズムに依存している。これらのアルゴリズムの非微分性のため、プルーニングアルゴリズムは収束に到達する前に長い探索段階を必要とする。この問題を回避するために,ネットワークの自動刈り出しのためのハイパーネットによる識別可能な刈り出し方式を提案する。特別に設計されたハイパーネットは遅延ベクトルを入力として、バックボーンネットワークの重みパラメータを生成する。潜在ベクトルは、バックボーンネットワーク内の畳み込み層の出力チャネルを制御し、レイヤのプルーニングのハンドルとして機能する。潜ベクトルに$\ell_1$スパーシティ正規化を強制し、近位勾配ソルバを利用することにより、疎潜ベクトルを得ることができる。スパシファイド潜在ベクトルをハイパーネットワークスに通すと、生成された重みパラメータの対応するスライスを除去し、ネットワーク切断の効果を達成できる。すべてのレイヤの潜在ベクターがプルーピングされ、自動的にレイヤ構成が生成される。画像分類、単一画像の超解像、雑音除去など、様々なネットワーク上で広範な実験が行われている。実験の結果,提案手法が検証された。 Network pruning has been the driving force for the acceleration of neural networks and the alleviation of model storage/transmission burden. With the advent of AutoML and neural architecture search (NAS), pruning has become topical with automatic mechanism and searching based architecture optimization. Yet, current automatic designs rely on either reinforcement learning or evolutionary algorithm. Due to the non-differentiability of those algorithms, the pruning algorithm needs a long searching stage before reaching the convergence. To circumvent this problem, this paper introduces a differentiable pruning method via hypernetworks for automatic network pruning. The specifically designed hypernetworks take latent vectors as input and generate the weight parameters of the backbone network. The latent vectors control the output channels of the convolutional layers in the backbone network and act as a handle for the pruning of the layers. By enforcing $\ell_1$ sparsity regularization to the latent vectors and utilizing proximal gradient solver, sparse latent vectors can be obtained. 
Passing the sparsified latent vectors through the hypernetworks, the corresponding slices of the generated weight parameters can be removed, achieving the effect of network pruning. The latent vectors of all the layers are pruned together, resulting in an automatic layer configuration. Extensive experiments are conducted on various networks for image classification, single image super-resolution, and denoising. And the experimental results validate the proposed method.	翻訳日:2022-12-18 07:27:36 公開日:2020-08-01
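The sparsification step described in the DHP abstract (an $\ell_1$ penalty on the latent vectors, handled with a proximal gradient solver) can be sketched as below. This is a minimal illustration of a generic proximal gradient step with soft-thresholding, not the authors' implementation; the function names, learning rate, and penalty weight are assumptions.

```python
def soft_threshold(z, threshold):
    """Proximal operator of threshold * |z|: shrink z toward zero."""
    if z > threshold:
        return z - threshold
    if z < -threshold:
        return z + threshold
    return 0.0

def proximal_gradient_step(latent, grad, lr, l1_weight):
    """Gradient step on the smooth loss, then the l1 proximal map.

    `latent` and `grad` are plain lists of floats; `lr` and `l1_weight`
    are assumed hyperparameters, not values from the paper.
    """
    return [soft_threshold(z - lr * g, lr * l1_weight)
            for z, g in zip(latent, grad)]

def kept_channels(latent):
    """Indices whose latent entry is non-zero, i.e. channels that survive."""
    return [i for i, z in enumerate(latent) if z != 0.0]

# Toy latent vector with zero gradient, to isolate the shrinkage effect.
updated = proximal_gradient_step([1.0, 0.05, -0.5], [0.0, 0.0, 0.0],
                                 lr=0.1, l1_weight=1.0)
survivors = kept_channels(updated)
```

Entries whose magnitude falls below `lr * l1_weight` become exactly zero, which is what lets the corresponding slices of the generated weights be removed.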
# VisualCOMET:静止画像の動的コンテキストに関する推論 VisualCOMET: Reasoning about the Dynamic Context of a Still Image ( http://arxiv.org/abs/2004.10796v3 ) ライセンス: Link先を確認	Jae Sung Park, Chandra Bhagavatula, Roozbeh Mottaghi, Ali Farhadi, Yejin Choi	(参考訳) 静止画の1つのフレームからでも、人々はその画像のダイナミックなストーリーをフレームの前、後、そしてその向こうで考えることができる。例えば、水に浮くのに苦労している男のイメージを考えると、その男が過去に水に落ちたのは、その時の男の意図が生き残ることであり、近い将来に助けが必要であり、そうでなければ洗い流されることになる。我々はvisualcometを提案する。visual commonsense推論タスクの新しいフレームワークで、以前発生した可能性のあるイベント、次に発生した可能性のあるイベント、現在の人々の意図を予測する。視覚コモンセンス推論に向けた研究を支援するために,視覚コモンセンス推論の140万以上のテキスト記述からなり,それぞれが前後の短いビデオ要約と組み合わせて,様々な6万枚の画像セットに注意深く注釈付けされた視覚コモンセンス推論の大規模リポジトリを紹介する。さらに,画像に現れる人とテキストのコモンセンス記述で言及される人との人格的接点(つまりコリファレンスリンク)を提供し,画像とテキストのより緊密な統合を可能にした。我々は,この課題に対して強力なベースライン性能を確立し,視覚的およびテキスト的コモンセンス推論の統合が鍵であり,非統合的な代替手段に勝っていることを示す。 Even from a single frame of a still image, people can reason about the dynamic story of the image before, after, and beyond the frame. For example, given an image of a man struggling to stay afloat in water, we can reason that the man fell into the water sometime in the past, the intent of that man at the moment is to stay alive, and he will need help in the near future or else he will get washed away. We propose VisualComet, the novel framework of visual commonsense reasoning tasks to predict events that might have happened before, events that might happen next, and the intents of the people at present. To support research toward visual commonsense reasoning, we introduce the first large-scale repository of Visual Commonsense Graphs that consists of over 1.4 million textual descriptions of visual commonsense inferences carefully annotated over a diverse set of 60,000 images, each paired with short video summaries of before and after. In addition, we provide person-grounding (i.e., co-reference links) between people appearing in the image and people mentioned in the textual commonsense descriptions, allowing for tighter integration between images and text. 
We establish strong baseline performances on this task and demonstrate that integration between visual and textual commonsense reasoning is the key and wins over non-integrative alternatives.	翻訳日:2022-12-10 17:11:57 公開日:2020-08-01
# 一級アプローチによる新型コロナウイルス研究論文のターゲット特定マイニング Target specific mining of COVID-19 scholarly articles using one-class approach ( http://arxiv.org/abs/2004.11706v2 ) ライセンス: Link先を確認	Sanjay Kumar Sonbhadra, Sonali Agarwal and P. Nagabhushan	(参考訳) 近年では、重症急性呼吸症候群(SARS)、中東部呼吸症候群(MERS)、COVID-19など、コロナウイルスの分野でのいくつかの研究論文が公表されている。多くの研究論文が存在する中で、最も適した記事の抽出には時間がかかる。本研究の目的は,コロナウイルス関連研究論文の活動と動向を機械学習を用いて抽出することである。実験にはcovid-19 open research dataset(cord-19)が使用される一方で、いくつかのターゲットタスクと説明がドメイン知識に基づいて分類のために定義されている。クラスタリング技術は、利用可能な記事の異なるクラスタを作成するために使用され、その後、並列一クラスサポートベクターマシン(OCSVM)を使用してタスク割り当てが行われる。オリジナルと縮小された機能による実験は、アプローチのパフォーマンスを検証する。 k-meansクラスタリングアルゴリズムが並列なOCSVMに続き、オリジナルと縮小された特徴空間において他の手法よりも優れていることは明らかである。 In recent years, several research articles have been published in the field of corona-virus caused diseases like severe acute respiratory syndrome (SARS), middle east respiratory syndrome (MERS) and COVID-19. In the presence of numerous research articles, extracting best-suited articles is time-consuming and manually impractical. The objective of this paper is to extract the activity and trends of corona-virus related research articles using machine learning approaches. The COVID-19 open research dataset (CORD-19) is used for experiments, whereas several target-tasks along with explanations are defined for classification, based on domain knowledge. Clustering techniques are used to create the different clusters of available articles, and later the task assignment is performed using parallel one-class support vector machines (OCSVMs). Experiments with original and reduced features validate the performance of the approach. It is evident that the k-means clustering algorithm, followed by parallel OCSVMs, outperforms other methods for both original and reduced feature space.	翻訳日:2022-12-10 03:07:08 公開日:2020-08-01
# マルチビュースペクトルクラスタリングによるテンソル低ランク表現 Multi-View Spectral Clustering Tailored Tensor Low-Rank Representation ( http://arxiv.org/abs/2004.14705v2 ) ライセンス: Link先を確認	Yuheng Jia, Hui Liu, Junhui Hou, Sam Kwong, Qingfu Zhang	(参考訳) 本稿では,テンソル低ランクモデルに基づくマルチビュースペクトルクラスタリング(MVSC)の問題について検討する。 MVSCのテンソルの特殊特性を考慮せずに、既成のテンソル低ランクノルムを採用する既存の方法とは異なり、MVSCに合わせた構造付きテンソル低ランクノルムを設計する。具体的には、テンソルの前面スライスと水平スライスに対称な低ランク制約と構造的な低ランク制約を明示的に課し、ビュー内関係とビュー間関係を特徴付ける。さらに、この2つの制約は相互改善を達成するために共同で最適化できる。新たなテンソル低ランクノルムに基づいて, MVSCを凸低ランクテンソル回復問題として定式化し, 拡張ラグランジュ乗算法を反復的に解いた。 5つのベンチマークデータセットの広範な実験結果から,提案手法が最先端手法をかなり上回っていることがわかった。驚くべきことに、この手法は完璧なクラスタリングを実現できる。さらに,提案手法のパラメータの調整も容易であり,提案手法は異なるデータセットに対して頑健であり,実際にその可能性を示す。 This paper explores the problem of multi-view spectral clustering (MVSC) based on tensor low-rank modeling. Unlike the existing methods that all adopt an off-the-shelf tensor low-rank norm without considering the special characteristics of the tensor in MVSC, we design a novel structured tensor low-rank norm tailored to MVSC. Specifically, we explicitly impose a symmetric low-rank constraint and a structured sparse low-rank constraint on the frontal and horizontal slices of the tensor to characterize the intra-view and inter-view relationships, respectively. Moreover, the two constraints could be jointly optimized to achieve mutual refinement. On the basis of the novel tensor low-rank norm, we formulate MVSC as a convex low-rank tensor recovery problem, which is then efficiently solved with an augmented Lagrange multiplier based method iteratively. Extensive experimental results on five benchmark datasets show that the proposed method outperforms state-of-the-art methods to a significant extent. Impressively, our method is able to produce perfect clustering. In addition, the parameters of our method can be easily tuned, and the proposed model is robust to different datasets, demonstrating its potential in practice.	翻訳日:2022-12-08 02:54:16 公開日:2020-08-01
# ニューラルネットワーク翻訳におけるサブワードセグメンテーションのための動的プログラミング符号化 Dynamic Programming Encoding for Subword Segmentation in Neural Machine Translation ( http://arxiv.org/abs/2005.06606v2 ) ライセンス: Link先を確認	Xuanli He, Gholamreza Haffari, Mohammad Norouzi	(参考訳) 本稿では,文をサブワード単位にトークン化する新しいセグメンテーションアルゴリズムである動的プログラミング符号化(DPE)を紹介する。学習や推論のために限界化されるべき潜在変数として,出力文のサブワードセグメンテーションを考察する。高精度なログ辺縁確率推定と正確な地図推定を可能にし,最大後方確率のターゲットセグメンテーションを探索する混合文字・サブワードトランスを提案する。 DPEは、動的プログラミングを用いて出力文を分割する並列データを前処理する手段として、軽量な混合文字サブワード変換器を使用している。機械翻訳における実験結果から、DPEは出力文のセグメンテーションに有効であり、ソース文の確率的セグメンテーションにBPEドロップアウトと組み合わせることができることが示唆された。 DPEは、BPEよりも0.9BLEUの平均的な改善(Sennrich et al., 2016)とBPEよりも0.55BLEUの平均的な改善(Provilkov et al., 2019)を、英語<=>(ドイツ語、ルーマニア語、エストニア語、フィンランド語、ハンガリー語)を含むいくつかのWMTデータセットで達成している。 This paper introduces Dynamic Programming Encoding (DPE), a new segmentation algorithm for tokenizing sentences into subword units. We view the subword segmentation of output sentences as a latent variable that should be marginalized out for learning and inference. A mixed character-subword transformer is proposed, which enables exact log marginal likelihood estimation and exact MAP inference to find target segmentations with maximum posterior probability. DPE uses a lightweight mixed character-subword transformer as a means of pre-processing parallel data to segment output sentences using dynamic programming. Empirical results on machine translation suggest that DPE is effective for segmenting output sentences and can be combined with BPE dropout for stochastic segmentation of source sentences. DPE achieves an average improvement of 0.9 BLEU over BPE (Sennrich et al., 2016) and an average improvement of 0.55 BLEU over BPE dropout (Provilkov et al., 2019) on several WMT datasets including English <=> (German, Romanian, Estonian, Finnish, Hungarian).	翻訳日:2022-12-07 06:15:09 公開日:2020-08-01
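The two dynamic programs mentioned in the DPE abstract (exact log marginal likelihood over all segmentations, and exact MAP segmentation) can be illustrated with a much simpler model. The sketch below assumes context-independent subword log-probabilities stored in a dictionary, whereas the paper scores subwords with a mixed character-subword transformer; the function names and the toy vocabulary are hypothetical.

```python
import math

def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a non-empty list."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def log_marginal(text, logp):
    """Log of the summed probability of every segmentation of `text`.

    `logp` maps subword -> log-probability (an assumed unigram model).
    alpha[j] is the log marginal of the prefix text[:j].
    """
    n = len(text)
    alpha = [-math.inf] * (n + 1)
    alpha[0] = 0.0
    for j in range(1, n + 1):
        scores = [alpha[i] + logp[text[i:j]] for i in range(j)
                  if text[i:j] in logp and alpha[i] > -math.inf]
        if scores:
            alpha[j] = logsumexp(scores)
    return alpha[n]

def map_segmentation(text, logp):
    """Highest-probability segmentation (Viterbi) and its log-probability."""
    n = len(text)
    best = [-math.inf] * (n + 1)
    back = [None] * (n + 1)
    best[0] = 0.0
    for j in range(1, n + 1):
        for i in range(j):
            piece = text[i:j]
            if piece in logp and best[i] + logp[piece] > best[j]:
                best[j] = best[i] + logp[piece]
                back[j] = i
    if n > 0 and back[n] is None:
        raise ValueError("text cannot be segmented with this vocabulary")
    pieces, j = [], n
    while j > 0:  # follow back-pointers from the end of the string
        pieces.append(text[back[j]:j])
        j = back[j]
    return pieces[::-1], best[n]

# Hypothetical toy vocabulary.
toy_logp = {"a": math.log(0.5), "b": math.log(0.25), "ab": math.log(0.25)}
toy_marginal = math.exp(log_marginal("ab", toy_logp))
toy_pieces, toy_score = map_segmentation("ab", toy_logp)
```

Both routines share the same recurrence over prefixes; the marginal replaces the `max` with a log-sum-exp. Each runs in time quadratic in the string length under this simplified model.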
# 脳波時系列予測のための感情誘発深部構造(EiDS) Emotion-Inspired Deep Structure (EiDS) for EEG Time Series Forecasting ( http://arxiv.org/abs/2005.13520v2 ) ライセンス: Link先を確認	Mahboobeh Parsapoor	(参考訳) 脳波(EEG)時系列の正確な予測は、発作やてんかんなどの神経疾患の正確な診断に不可欠である。脳波時系列はカオス的であるため、従来の機械学習アルゴリズムの多くは次のステップを正確に予測できなかった。そこで本研究では,脳波の時系列を予測するために,感情(感情状態)の基盤となる神経構造から着想を得たモデルを提案する。このモデルは感情にインスパイアされた深層構造(EiDS)と呼ばれ、脳波時系列の短期と長期の両方を予測するのに使うことができる。本稿では,EiDSの性能を,長短期記憶(LSTM)ネットワークの他のバリエーションと比較する。 Accurate forecasting of an electroencephalogram (EEG) time series is crucial for the correct diagnosis of neurological disorders such as seizures and epilepsy. Since the EEG time series is chaotic, most traditional machine learning algorithms have failed to forecast its next steps accurately. Thus, we suggest a model, formed by taking inspiration from the neural structures that underlie feelings (emotional states), to forecast EEG time series. The model, referred to as the emotion-inspired deep structure (EiDS), can be used for both short- and long-term prediction of EEG time series. This paper also compares the performance of EiDS with other variations of long short-term memory (LSTM) networks.	翻訳日:2022-11-30 03:44:54 公開日:2020-08-01
# 予測から処方へ:COVID-19パンデミックにおける非薬剤的介入の進化的最適化 From Prediction to Prescription: Evolutionary Optimization of Non-Pharmaceutical Interventions in the COVID-19 Pandemic ( http://arxiv.org/abs/2005.13766v3 ) ライセンス: Link先を確認	Risto Miikkulainen, Olivier Francon, Elliot Meyerson, Xin Qiu, Elisa Canzani, and Babak Hodjat	(参考訳) 新型コロナウイルス(COVID-19)のパンデミックの広がりや、ソーシャルディスタンシングの規制や学校やビジネスの閉鎖など、非薬剤的介入(NPI)をどう含めるかを予測するために、いくつかのモデルが開発されている。本稿では,進化的AIが次のステップ,すなわち最も効果的な介入戦略を自動決定するためにどのように使用できるかを示す。進化的代理補助処方(ESP)により、多数の候補戦略を生成し、予測モデルで評価することができる。原則として、戦略は異なる国や地域向けにカスタマイズでき、パンデミックを包含する必要性と経済への影響を最小限に抑える必要性のバランスを取ることができる。まだ利用可能なデータには制限があるが、初期の実験では職場や学校の制限が最も重要であり、慎重に設計する必要があることを示唆している。また、制限の解除結果が信頼できないことも示しており、例えば時間とともに変更することで、制約をソフトに実装できる創造的な方法を提案している。より多くのデータが利用可能になるにつれて、このアプローチは新型コロナウイルス(COVID-19)と将来のパンデミックに対処するのにますます有用になる。 Several models have been developed to predict how the COVID-19 pandemic spreads, and how it could be contained with non-pharmaceutical interventions (NPIs) such as social distancing restrictions and school and business closures. This paper demonstrates how evolutionary AI could be used to facilitate the next step, i.e. determining most effective intervention strategies automatically. Through evolutionary surrogate-assisted prescription (ESP), it is possible to generate a large number of candidate strategies and evaluate them with predictive models. In principle, strategies can be customized for different countries and locales, and balance the need to contain the pandemic and the need to minimize their economic impact. While still limited by available data, early experiments suggest that workplace and school restrictions are the most important and need to be designed carefully. It also demonstrates that results of lifting restrictions can be unreliable, and suggests creative ways in which restrictions can be implemented softly, e.g. by alternating them over time. 
As more data becomes available, the approach can be increasingly useful in dealing with COVID-19 as well as possible future pandemics.	翻訳日:2022-11-27 04:27:12 公開日:2020-08-01
# ニューラルネットワーク圧縮の概観 An Overview of Neural Network Compression ( http://arxiv.org/abs/2006.03669v2 ) ライセンス: Link先を確認	James O' Neill	(参考訳) コンバージェンスに訓練された過パラメータネットワークは、コンピュータビジョンや自然言語処理といった領域で印象的なパフォーマンスを示している。これらの領域におけるサルエントタスクの最先端の推進は、メモリとストレージの要求の増加を考えると、カーボンフットプリントを大きくするだけでなく、機械学習実践者が使用するモデルが大きくなり、ますます難しくなっていることに対応する。このように、近年ではモデル圧縮技術が復活し、特に深い畳み込みニューラルネットワークや、トランスフォーマーのような自己注意ベースのネットワークが注目されている。そこで本論文では, プルーニング, 量子化, テンソル分解, 知識蒸留, それらの組み合わせを含む, ディープニューラルネットワークの古きと現在の圧縮技術について, タイムリーに概説する。ディープラーニングアーキテクチャ(入門としては Goodfellow et al., 2016 を参照)、すなわちリカレントニューラルネットワーク(RNN; Rumelhart et al., 1985; Hochreiter and Schmidhuber, 1997)、畳み込みニューラルネットワーク(Fukushima, 1980; 最新の概観は Khan et al., 2019 を参照)、自己注意ベースのネットワーク(Vaswani et al., 2017; 概観は Chaudhari et al., 2019、詳細と自然言語処理での使用については Hu, 2019 を参照)について、基本的な知識を前提とする。議論された論文のほとんどは、これらのDNNアーキテクチャの少なくとも1つの文脈で提案されている。 Overparameterized networks trained to convergence have shown impressive performance in domains such as computer vision and natural language processing. Pushing state of the art on salient tasks within these domains corresponds to these models becoming larger and more difficult for machine learning practitioners to use given the increasing memory and storage requirements, not to mention the larger carbon footprint. Thus, in recent years there has been a resurgence in model compression techniques, particularly for deep convolutional neural networks and self-attention based networks such as the Transformer. Hence, this paper provides a timely overview of both old and current compression techniques for deep neural networks, including pruning, quantization, tensor decomposition, knowledge distillation and combinations thereof. We assume a basic familiarity with deep learning architectures (for an introduction, see Goodfellow et al., 2016), namely Recurrent Neural Networks (RNNs; Rumelhart et al., 1985; Hochreiter and Schmidhuber, 1997), Convolutional Neural Networks (Fukushima, 1980; for an up-to-date overview, see Khan et al., 2019) and Self-Attention based networks (Vaswani et al., 2017; for a general overview, see Chaudhari et al., 2019, and for more detail and their use in natural language processing, see Hu, 2019). Most of the papers discussed are proposed in the context of at least one of these DNN architectures.	翻訳日:2022-11-25 02:58:04 公開日:2020-08-01
# MultiSpeech: トランスフォーマーを用いた多話者音声テキスト MultiSpeech: Multi-Speaker Text to Speech with Transformer ( http://arxiv.org/abs/2006.04664v2 ) ライセンス: Link先を確認	Mingjian Chen, Xu Tan, Yi Ren, Jin Xu, Hao Sun, Sheng Zhao, Tao Qin, Tie-Yan Liu	(参考訳) TransformerベースのText-to-Speech (TTS) モデル(例: Transformer TTS (Li et al., 2019), FastSpeech (Ren et al., 2019))は、トレーニングと推論における並列計算により、RNNベースのモデル(例: Tacotron (Shen et al., 2018))よりもトレーニングと推論効率の利点を示した。しかし、並列計算はトランスフォーマのテキストと音声のアライメントを学習しながら難易度を増大させ、ノイズデータと多彩な話者によるマルチスピーカーシナリオではさらに拡大され、マルチスピーカーTTSにおけるトランスフォーマの適用性が阻害される。本稿では,テキストから音声へのアライメントを改善するためのコンポーネント/技術をいくつか備えた,ロバストで高品質なマルチスピーカートランスフォーマーTTSシステムであるMultiSpeechを開発した。 1) 訓練及び推論において,エンコーダ・デコーダ注意の重量行列上の対角的制約 2) 位置情報をよりよく保存するための,エンコーダの音素埋め込みに対するレイヤ正規化 3) 連続音声フレーム間のコピーを防止するデコーダプリネットのボトルネック。 VCTKおよびLibriTTSマルチ話者データセットの実験は、MultiSpeechの有効性を実証している。 1) ナイーブなTransformerベースのTTSよりも頑健で高品質なマルチスピーカ音声を合成する。 2) 教師としてのMultiSpeechモデルを用いて, 非常に高速な推論速度を保ちながら, 品質劣化がほぼゼロの強力なマルチスピーカFastSpeechモデルを得る。 Transformer-based text to speech (TTS) models (e.g., Transformer TTS (Li et al., 2019), FastSpeech (Ren et al., 2019)) have shown advantages in training and inference efficiency over RNN-based models (e.g., Tacotron (Shen et al., 2018)) due to their parallel computation in training and/or inference. However, the parallel computation increases the difficulty of learning the alignment between text and speech in Transformer, which is further magnified in the multi-speaker scenario with noisy data and diverse speakers, and hinders the applicability of Transformer for multi-speaker TTS. In this paper, we develop a robust and high-quality multi-speaker Transformer TTS system called MultiSpeech, with several specially designed components/techniques to improve text-to-speech alignment: 1) a diagonal constraint on the weight matrix of encoder-decoder attention in both training and inference; 2) layer normalization on phoneme embedding in encoder to better preserve position information; 3) a bottleneck in decoder pre-net to prevent copy between consecutive speech frames. Experiments on VCTK and LibriTTS multi-speaker datasets demonstrate the effectiveness of MultiSpeech: 1) it synthesizes more robust and better quality multi-speaker voice than naive Transformer based TTS; 2) with a MultiSpeech model as the teacher, we obtain a strong multi-speaker FastSpeech model with almost zero quality degradation while enjoying extremely fast inference speed.	翻訳日:2022-11-24 01:17:12 公開日:2020-08-01
# 教師なしハイパースペクトル超解法におけるカップリングアンミックスネットのクロスアテンション Cross-Attention in Coupled Unmixing Nets for Unsupervised Hyperspectral Super-Resolution ( http://arxiv.org/abs/2007.05230v3 ) ライセンス: Link先を確認	Jing Yao, Danfeng Hong, Jocelyn Chanussot, Deyu Meng, Xiaoxiang Zhu, Zongben Xu	(参考訳) 近年のディープラーニング技術の進歩は、ハイパースペクトル画像超解像(HSI-SR)に大きな進歩をもたらした。しかし、この課題に対して教師なしのディープネットワークの開発は依然として困難である。そこで本研究では,高空間分解能マルチスペクトル画像(MSI)を用いて,HSIの空間分解能を高めるために,クロスアテンション機構CUCaNetを組み込んだ新しい非混合ネットワークを提案する。スペクトルアンミックスにインスパイアされた2ストリーム畳み込みオートエンコーダフレームワークをバックボーンとしてMSとHSデータをスペクトル的に有意な基底と対応する係数に分解する。 CUCaNetは、ネットワーク上の合理的な一貫性仮定を強制することにより、HS-MS対応からスペクトルおよび空間応答関数を適応的に学習することができる。さらに、ネットワークにおけるより効果的な空間スペクトル情報転送を実現するために、クロスアテンションモジュールが考案された。 HSI-SRモデルと比較して広く使われている3つのHS-MSデータセットに対して大規模な実験を行い、HSI-SRアプリケーションにおけるCUCaNetの優位性を実証した。さらに、コードとデータセットはhttps://github.com/danfenghong/eccv2020_cucanetで利用可能になる。 The recent advancement of deep learning techniques has made great progress on hyperspectral image super-resolution (HSI-SR). Yet the development of unsupervised deep networks remains challenging for this task. To this end, we propose a novel coupled unmixing network with a cross-attention mechanism, CUCaNet for short, to enhance the spatial resolution of HSI by means of higher-spatial-resolution multispectral image (MSI). Inspired by coupled spectral unmixing, a two-stream convolutional autoencoder framework is taken as backbone to jointly decompose MS and HS data into a spectrally meaningful basis and corresponding coefficients. CUCaNet is capable of adaptively learning spectral and spatial response functions from HS-MS correspondences by enforcing reasonable consistency assumptions on the networks. Moreover, a cross-attention module is devised to yield more effective spatial-spectral information transfer in networks. Extensive experiments are conducted on three widely-used HS-MS datasets in comparison with state-of-the-art HSI-SR models, demonstrating the superiority of the CUCaNet in the HSI-SR application. 
Furthermore, the codes and datasets will be available at: https://github.com/danfenghong/ECCV2020_CUCaNet.	翻訳日:2022-11-11 22:35:45 公開日:2020-08-01
# AirCapRL:Deep Reinforcement Learningを用いた自律飛行型人体モーションキャプチャ AirCapRL: Autonomous Aerial Human Motion Capture using Deep Reinforcement Learning ( http://arxiv.org/abs/2007.06343v2 ) ライセンス: Link先を確認	Rahul Tallamraju, Nitin Saini, Elia Bonetto, Michael Pabst, Yu Tang Liu, Michael J. Black and Aamir Ahmad	(参考訳) 本稿では,自律型空中人体モーションキャプチャ(MoCap)のための深部強化学習(RL)に基づくマルチロボット生成制御について紹介する。視覚をベースとしたMoCapに焦点をあて,複数のマイクロエアロ車両を用いた1人の移動体の姿勢と形状の軌跡を推定することを目的とする。この問題に対する最先端の解決策は、手作りのシステムと観測モデルに依存する古典的な制御法に基づいている。このようなモデルは、異なるシステム間で導出および一般化することが困難である。さらに、これらのモデルの非線形性や非凸性は、準最適制御につながる。本研究では,視覚に基づくモーションキャプチャ目的を達成するための逐次意思決定タスクとしてこの問題を定式化し,深層ニューラルネットワークを用いたRL法を用いて解決する。我々はPPOを利用して、構成制御のための確率的分散制御ポリシーを訓練する。ニューラルネットワークは、合成環境で並列化されたセットアップでトレーニングされる。我々はアプローチを検証するために広範囲なシミュレーション実験を行った。最後に、実ロボット実験により、我々のポリシーが現実の条件に一般化されることを示した。ビデオリンク: https://bit.ly/38SJfjo 補足: https://bit.ly/3evfo1O In this letter, we introduce a deep reinforcement learning (RL) based multi-robot formation controller for the task of autonomous aerial human motion capture (MoCap). We focus on vision-based MoCap, where the objective is to estimate the trajectory of body pose and shape of a single moving person using multiple micro aerial vehicles. State-of-the-art solutions to this problem are based on classical control methods, which depend on hand-crafted system and observation models. Such models are difficult to derive and generalize across different systems. Moreover, the non-linearity and non-convexities of these models lead to sub-optimal controls. In our work, we formulate this problem as a sequential decision making task to achieve the vision-based motion capture objectives, and solve it using a deep neural network-based RL method. We leverage proximal policy optimization (PPO) to train a stochastic decentralized control policy for formation control. The neural network is trained in a parallelized setup in synthetic environments. We performed extensive simulation experiments to validate our approach. 
Finally, real-robot experiments demonstrate that our policies generalize to real world conditions. Video Link: https://bit.ly/38SJfjo Supplementary: https://bit.ly/3evfo1O	翻訳日:2022-11-11 00:51:54 公開日:2020-08-01
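The AirCapRL abstract names proximal policy optimization (PPO) as the training method but gives no formulas. For orientation, the standard clipped surrogate objective at the core of PPO can be sketched as below. This shows only the generic objective, not the authors' multi-robot training setup; the function names and the clipping value 0.2 are assumptions for illustration.

```python
def clipped_surrogate(ratio, advantage, clip_eps=0.2):
    """PPO clipped surrogate for one sample.

    `ratio` is pi_new(a|s) / pi_old(a|s); `advantage` is an advantage
    estimate. Taking the minimum removes any incentive to move the
    ratio outside [1 - clip_eps, 1 + clip_eps].
    """
    clipped_ratio = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
    return min(ratio * advantage, clipped_ratio * advantage)

def ppo_objective(ratios, advantages, clip_eps=0.2):
    """Mean clipped surrogate over a batch (to be maximized)."""
    terms = [clipped_surrogate(r, a, clip_eps)
             for r, a in zip(ratios, advantages)]
    return sum(terms) / len(terms)

# Toy batch: one over-confident positive update, one negative-advantage update.
batch_value = ppo_objective([1.5, 0.5], [1.0, -1.0])
```

With a positive advantage the gain is capped once the ratio exceeds `1 + clip_eps`; with a negative advantage the objective stays pessimistic, which is what keeps policy updates conservative.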
# 頑健な低階表現による対向ロバスト性 Adversarial robustness via robust low rank representations ( http://arxiv.org/abs/2007.06555v2 ) ライセンス: Link先を確認	Pranjal Awasthi, Himanshu Jain, Ankit Singh Rawat, Aravindan Vijayaraghavan	(参考訳) 逆のロバスト性は、テスト時の入力に対する知覚できない摂動に対する分類器の感受性を測定する。本研究では、画像などの実データに対してしばしば存在する自然な低ランク表現の利点を強調し、確証された堅牢性を保証するニューラルネットワークのトレーニングを行う。最初の貢献は、$\ell_2$ normで測定された摂動に対する認証された堅牢性です。我々は、CIFAR-10やCIFAR-100のような標準ベンチマークデータセットに対して、最先端のランダム化スムーシングに基づくアプローチの改善を保証するために、低ランクデータ表現を利用する。第二の貢献は、$\ell_\infty$ normで測定された摂動に対する証明された堅牢性のより困難な設定である。我々は、自然な低階表現が本質的に堅牢性を持つことを実証的に証明し、それらの表現における$\ell_\infty$摂動に対する証明されたロバスト性を保証するために利用することができる。我々の $\ell_\infty$ robustness の証明は、表現に付随する $\infty \to 2$ matrix operator norm を含む自然量に依存し、ロバストネス保証を $\ell_2$ から $\ell_\infty$ 摂動に変換する。証明保証のための重要な技術的要素は、上記の行列ノルムに上限を与える乗法重み更新法に基づく証明可能な保証付き高速アルゴリズムである。我々のアルゴリズムによる保証は、この問題に対する技術の現状を改善し、独立した関心を持つかもしれない。 Adversarial robustness measures the susceptibility of a classifier to imperceptible perturbations made to the inputs at test time. In this work we highlight the benefits of natural low rank representations that often exist for real data such as images, for training neural networks with certified robustness guarantees. Our first contribution is for certified robustness to perturbations measured in $\ell_2$ norm. We exploit low rank data representations to provide improved guarantees over state-of-the-art randomized smoothing-based approaches on standard benchmark datasets such as CIFAR-10 and CIFAR-100. Our second contribution is for the more challenging setting of certified robustness to perturbations measured in $\ell_\infty$ norm. We demonstrate empirically that natural low rank representations have inherent robustness properties, that can be leveraged to provide significantly better guarantees for certified robustness to $\ell_\infty$ perturbations in those representations. 
Our certificate of $\ell_\infty$ robustness relies on a natural quantity involving the $\infty \to 2$ matrix operator norm associated with the representation, to translate robustness guarantees from $\ell_2$ to $\ell_\infty$ perturbations. A key technical ingredient for our certification guarantees is a fast algorithm with provable guarantees based on the multiplicative weights update method to provide upper bounds on the above matrix norm. Our algorithmic guarantees improve upon the state of the art for this problem, and may be of independent interest.	翻訳日:2022-11-10 23:42:28 公開日:2020-08-01
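The role of the $\infty \to 2$ operator norm in the abstract above can be made concrete with a small sketch. For a linear representation $A$, $\|A\delta\|_2 \le \|A\|_{\infty \to 2}\,\|\delta\|_\infty$, so an $\ell_2$ certificate of radius $r$ in representation space implies an $\ell_\infty$ certificate of radius $r / \|A\|_{\infty \to 2}$ in input space. The code below computes the norm exactly by brute force over sign vectors, which is only feasible for tiny matrices; the paper instead obtains upper bounds with a multiplicative weights method. The function names and the linear-representation assumption are illustrative.

```python
import itertools
import math

def inf_to_2_norm(matrix):
    """Exact ||A||_{inf->2} = max over x in {-1, +1}^n of ||A x||_2.

    The maximum of a convex function over the unit cube is attained at a
    vertex, so enumerating sign vectors suffices. Cost is O(2^n), so this
    is a toy reference implementation, not the paper's algorithm.
    """
    n = len(matrix[0])
    best = 0.0
    for signs in itertools.product((-1.0, 1.0), repeat=n):
        image = [sum(a * s for a, s in zip(row, signs)) for row in matrix]
        best = max(best, math.sqrt(sum(v * v for v in image)))
    return best

def linf_radius_from_l2(l2_radius, matrix):
    """Translate an l2 radius in representation space to an l_inf input radius."""
    norm = inf_to_2_norm(matrix)
    if norm == 0.0:
        raise ValueError("representation maps every input to zero")
    return l2_radius / norm

# Hypothetical 2x2 representation matrix.
toy_matrix = [[1.0, 1.0], [1.0, -1.0]]
toy_norm = inf_to_2_norm(toy_matrix)
toy_radius = linf_radius_from_l2(1.0, toy_matrix)
```

A smaller operator norm yields a larger certified $\ell_\infty$ radius, which is why tight upper bounds on this quantity matter for the certificate.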
# 形状CD:形状とニューロンを持つ時系列データにおける変化点検出 Shape-CD: Change-Point Detection in Time-Series Data with Shapes and Neurons ( http://arxiv.org/abs/2007.11985v3 ) ライセンス: Link先を確認	Varsha Suresh, Wei Tsang Ooi	(参考訳) 時系列における変更点検出は、時系列データを生成する未知の物理プロセスが変化した時点を検出することを目的としている。既存の手法は、基礎となるプロセスが複雑で、時系列に多様なパターンを生成する場合に精度が低下することがわかった。この欠点に対処するため,簡単な高速かつ高精度な変化点検出法であるShape-CDを提案する。 Shape-CDは形状に基づく特徴を用いてパターンをモデル化し、条件付きニューラルフィールドを用いて時間領域間の時間相関をモデル化する。最大2000クラスを含むExtraSensoryデータセットなど、4つの高ダイナミック時系列データセットを用いて,Shape-CDの性能評価を行った。 Shape-CDは従来の手法に比べて精度(AUCでは7-60%高い)と計算速度が向上した。さらに、Shape-CDモデルはわずか数百のパラメータで構成されており、他の深層教師あり学習モデルよりも訓練に必要なデータが少ない。 Change-point detection in a time series aims to discover the time points at which some unknown underlying physical process that generates the time-series data has changed. We found that existing approaches become less accurate when the underlying process is complex and generates large varieties of patterns in the time series. To address this shortcoming, we propose Shape-CD, a simple, fast, and accurate change-point detection method. Shape-CD uses shape-based features to model the patterns and a conditional neural field to model the temporal correlations among the time regions. We evaluated the performance of Shape-CD using four highly dynamic time-series datasets, including the ExtraSensory dataset with up to 2000 classes. Shape-CD demonstrated improved accuracy (7-60% higher in AUC) and faster computational speed compared to existing approaches. Furthermore, the Shape-CD model consists of only hundreds of parameters and requires less data to train than other deep supervised learning models.	翻訳日:2022-11-07 23:16:10 公開日:2020-08-01
# CelebA-Spoof: リッチアノテーション付き大規模顔アンチスプーフデータセット CelebA-Spoof: Large-Scale Face Anti-Spoofing Dataset with Rich Annotations ( http://arxiv.org/abs/2007.12342v3 ) ライセンス: Link先を確認	Yuanhan Zhang, Zhenfei Yin, Yidong Li, Guojun Yin, Junjie Yan, Jing Shao, and Ziwei Liu	(参考訳) 顔とのインタラクションシステムが普及するにつれ、これらのシステムのセキュリティと信頼性は重要な問題となり、かなりの研究が費やされる。そのうち、顔のなりすまし防止は重要な領域として現れ、その目的は提示された顔が生きているか偽造なのかを特定することである。有望な進歩は達成されたが、既存の作品では複雑なspoof攻撃の処理や現実のシナリオへの一般化が困難である。主な理由は、現在の対spoofingデータセットは量と多様性の両方に制限があるためである。これらの障害を克服するために,次のような特長を持つ大規模な顔アンチスプーフデータセットCelebA-Spoofを提供する。 1)量:CelebA-Spoofは10,177人の625,537枚の画像からなり、既存のデータセットよりもはるかに大きい。 2)多様性:スプーフ画像は10以上のセンサーで8つのシーン(2つの環境 × 4つの照明条件)から撮影される。 3) アノテーションのリッチ性: CelebA-Spoofには10のspoof型アノテーションと、オリジナルのCelebAデータセットから継承された40の属性アノテーションが含まれている。 CelebA-Spoofを用いて、統合マルチタスクフレームワークである Auxiliary Information Embedding Network (AENet) において既存手法を慎重にベンチマークし、いくつかの貴重な観測結果を明らかにする。 As facial interaction systems are prevalently deployed, security and reliability of these systems become a critical issue, with substantial research efforts devoted. Among them, face anti-spoofing emerges as an important area, whose objective is to identify whether a presented face is live or spoof. Though promising progress has been achieved, existing works still have difficulty in handling complex spoof attacks and generalizing to real-world scenarios. The main reason is that current face anti-spoofing datasets are limited in both quantity and diversity. To overcome these obstacles, we contribute a large-scale face anti-spoofing dataset, CelebA-Spoof, with the following appealing properties: 1) Quantity: CelebA-Spoof comprises 625,537 pictures of 10,177 subjects, significantly larger than the existing datasets. 2) Diversity: The spoof images are captured from 8 scenes (2 environments × 4 illumination conditions) with more than 10 sensors. 3) Annotation Richness: CelebA-Spoof contains 10 spoof type annotations, as well as the 40 attribute annotations inherited from the original CelebA dataset. Equipped with CelebA-Spoof, we carefully benchmark existing methods in a unified multi-task framework, Auxiliary Information Embedding Network (AENet), and reveal several valuable observations.	翻訳日:2022-11-07 06:50:26 公開日:2020-08-01
# 非線形最小二乗問題に対する拡張微分自由最適化 Scalable Derivative-Free Optimization for Nonlinear Least-Squares Problems ( http://arxiv.org/abs/2007.13243v2 ) ライセンス: Link先を確認	Coralia Cartis and Tyler Ferguson and Lindon Roberts	(参考訳) 微分自由(あるいはゼロオーダー)最適化(DFO)は、機械学習を含むさまざまなアプリケーション領域で、特に確率的で計算に高価な目的を含む問題を解く能力において、近年注目を集めている。本研究では,非線形最小二乗問題を解くためのモデルに基づく新しいDFO法を提案する。スケッチ手法を用いて観測空間の次元的低減を行い,局所モデル全体の構築を回避し,最先端のDFOを改善する。提案手法は,ビッグデータシステムにおける問題次元の線形化を図り,既存のソフトウェアと比較して,過度に決定された最小二乗問題に対する実行性能が劇的に向上したことを示す数値的証拠である。 Derivative-free - or zeroth-order - optimization (DFO) has gained recent attention for its ability to solve problems in a variety of application areas, including machine learning, particularly involving objectives which are stochastic and/or expensive to compute. In this work, we develop a novel model-based DFO method for solving nonlinear least-squares problems. We improve on state-of-the-art DFO by performing dimensionality reduction in the observational space using sketching methods, avoiding the construction of a full local model. Our approach has a per-iteration computational cost which is linear in problem dimension in a big data regime, and numerical evidence demonstrates that, compared to existing software, it has dramatically improved runtime performance on overdetermined least-squares problems.	翻訳日:2022-11-06 20:17:33 公開日:2020-08-01
# 地磁気嵐予測のための脳感情学習に基づく予測モデル Brain Emotional Learning-based Prediction Model For the Prediction of Geomagnetic Storms ( http://arxiv.org/abs/2007.15579v2 ) ライセンス: Link先を確認	Mahboobeh Parsapoor	(参考訳) 本研究では,地磁気嵐予測のための新しいデータ駆動モデルを提案する。脳情緒学習インスパイアされたモデル(BELIM)の例であるモデルは、脳情緒学習ベース予測モデル(BELPM)として知られている。 BELPMは4つの主要なサブシステムから構成されており、これらのサブシステム間の接続は感情システムの対応する領域によって模倣されている。これらのサブシステムの機能は適応ネットワークを用いて説明される。 BELPMの学習アルゴリズムは、最も急降下(SD)と最小二乗推定器(LSE)を用いて定義される。 BELPMは、Auroral Electrojet (AE) IndexとDisrupt Time (Dst) Indexという2つの磁気指標を用いて、地磁気嵐を予測するために使用される。 BELPMの性能を評価するため,ANFIS,WKNN,その他のBELIMと比較した。その結果,BELPMは短期および長期の地磁気嵐予測において妥当な精度を達成できることを確認した。 This study suggests a new data-driven model for the prediction of geomagnetic storm. The model which is an instance of Brain Emotional Learning Inspired Models (BELIMs), is known as the Brain Emotional Learning-based Prediction Model (BELPM). BELPM consists of four main subsystems; the connection between these subsystems has been mimicked by the corresponding regions of the emotional system. The functions of these subsystems are explained using adaptive networks. The learning algorithm of BELPM is defined using the steepest descent (SD) and the least square estimator (LSE). BELPM is employed to predict geomagnetic storms using two geomagnetic indices, Auroral Electrojet (AE) Index and Disturbance Time (Dst) Index. To evaluate the performance of BELPM, the obtained results have been compared with ANFIS, WKNN and other instances of BELIMs. The results verify that BELPM has the capability to achieve a reasonable accuracy for both the short-term and the long-term geomagnetic storms prediction.	翻訳日:2022-11-06 02:19:45 公開日:2020-08-01
# 楽譜評価のためのスコアインフォームドネットワーク Score-informed Networks for Music Performance Assessment ( http://arxiv.org/abs/2008.00203v1 ) ライセンス: Link先を確認	Jiawen Huang, Yun-Ning Hung, Ashis Pati, Siddharth Kumar Gururani, Alexander Lerch	(参考訳) ほとんどの場合の音楽演奏の評価は、演奏中の楽譜の基盤を考慮に入れている。演奏音声と楽譜の両方から抽出した特徴に基づく客観的音楽演奏評価(MPA)の自動手法はいくつかあるが,MPAモデルにスコア情報を組み込んだディープニューラルネットワークによる手法はまだ検討されていない。本稿では,スコアインフォームド性能評価が可能な3つのモデルを提案する。これらは (i)調整されたピッチ輪郭とスコアからなる単純な時系列入力を利用する畳み込みニューラルネットワーク (二)ピッチ輪郭と楽譜のジョイント潜在空間を学習するジョイント埋め込みモデル (iii)ピッチ輪郭と楽譜間の距離行列のパターンを利用して評価評価を行う距離行列に基づく畳み込みニューラルネットワーク。本結果は,異なるアーキテクチャと入力表現の適合性に関する知見を提供し,スコア非依存モデルと比較して,スコアインフォームドモデルの利点を示す。 The assessment of music performances in most cases takes into account the underlying musical score being performed. While there have been several automatic approaches for objective music performance assessment (MPA) based on extracted features from both the performance audio and the score, deep neural network-based methods incorporating score information into MPA models have not yet been investigated. In this paper, we introduce three different models capable of score-informed performance assessment. These are (i) a convolutional neural network that utilizes a simple time-series input comprising of aligned pitch contours and score, (ii) a joint embedding model which learns a joint latent space for pitch contours and scores, and (iii) a distance matrix-based convolutional neural network which utilizes patterns in the distance matrix between pitch contours and musical score to predict assessment ratings. Our results provide insights into the suitability of different architectures and input representations and demonstrate the benefits of score-informed models as compared to score-independent models.	翻訳日:2022-11-04 01:19:01 公開日:2020-08-01
# インテリジェントモノのインターネットのための深層強化学習に基づくモバイルエッジコンピューティング Deep Reinforcement Learning Based Mobile Edge Computing for Intelligent Internet of Things ( http://arxiv.org/abs/2008.00250v1 ) ライセンス: Link先を確認	Rui Zhao, Xinjie Wang, Junjuan Xia, and Liseng Fan	(参考訳) 本稿では,複数のユーザが複数の計算アクセスポイント(CAP)によって支援される計算処理を行う,インテリジェントなモノのインターネット(IoT)のための移動エッジコンピューティング(MEC)ネットワークについて検討する。いくつかのタスクをCAPにオフロードすることで、MECネットワークにおける2つの重要な指標であるレイテンシとエネルギー消費を削減することで、システムパフォーマンスを改善することができる。深層強化学習アルゴリズムを用いて,オフロード戦略をインテリジェントに提案することで,システムを考案する。このアルゴリズムでは、システム性能を最適化するために、ディープqネットワークを使用してオフロード決定を自動的に学習し、ニューラルネットワーク(nn)を訓練して、環境システムからトレーニングデータを生成するオフロード動作を予測する。また,複数の帯域割り当て方式が提案されているユーザとキャップ間のリンクに対して,無線帯域幅を最適化するために帯域割り当てを用いる。さらに、ユーザからの計算タスクを支援するために、ベストキャップを1つ選択するために、キャップ選択を用いる。シミュレーションの結果から,提案した強化学習オフロード戦略の有効性が示された。特に、深層強化学習に基づくアルゴリズムにより、レイテンシとエネルギー消費のシステムコストを大幅に削減することができる。 In this paper, we investigate mobile edge computing (MEC) networks for intelligent internet of things (IoT), where multiple users have some computational tasks assisted by multiple computational access points (CAPs). By offloading some tasks to the CAPs, the system performance can be improved through reducing the latency and energy consumption, which are the two important metrics of interest in the MEC networks. We devise the system by proposing the offloading strategy intelligently through the deep reinforcement learning algorithm. In this algorithm, Deep Q-Network is used to automatically learn the offloading decision in order to optimize the system performance, and a neural network (NN) is trained to predict the offloading action, where the training data is generated from the environmental system. Moreover, we employ the bandwidth allocation in order to optimize the wireless spectrum for the links between the users and CAPs, where several bandwidth allocation schemes are proposed. In further, we use the CAP selection in order to choose one best CAP to assist the computational tasks from the users. 
Simulation results are finally presented to show the effectiveness of the proposed reinforcement learning offloading strategy. In particular, the system cost of latency and energy consumption can be reduced significantly by the proposed deep reinforcement learning based algorithm.	翻訳日:2022-11-04 01:18:38 公開日:2020-08-01
# 心メタボリックシンドロームを応用したマルチオムリックデータに対する2段階のペナルドロジスティック回帰法 Two-step penalised logistic regression for multi-omic data with an application to cardiometabolic syndrome ( http://arxiv.org/abs/2008.00235v1 ) ライセンス: Link先を確認	Alessandra Cabassi, Denis Seyres, Mattia Frontini, Paul D. W. Kirk	(参考訳) 高次元マルチオミクスデータセットに基づいてバイナリクラスラベルを予測する分類モデルを構築することは、予測器の数、データの種類、ノイズレベルといった点において、データ層の特徴が広く異なるため、いくつかの課題を提起する。これまでの研究では、これらのデータセットに弾性ネットペナルティを用いた古典的ロジスティック回帰を適用すると、結果が低くなることが示されている(Liu et al., 2018)。本稿では,各層で変数選択を行い,第1ステップで選択した変数を用いて予測モデルを構築する,多段階ロジスティック回帰に対する2段階のアプローチを提案する。そこで本手法は, 同一目的に開発された他の手法と比較し, 既存のソフトウェアを多相線形回帰(Zhao and Zucknick, 2020)にロジスティック回帰設定に適用する。広範なシミュレーション研究により,提案手法は,最善の競争相手と同等の予測性能を実現するだけでなく,関連する予測要因を可能な限り多数選択することを望む場合に望ましいことが示された。モチベーション例は,2つの極端な表現型グループ(肥満10名,リポジストトロフィー10名)と185名の献血者を対象に,8種類の「異常データ型」からなる心メタボリックシンドロームデータセットである。提案手法により,分子レベルでの心筋メタボリックシンドロームの特徴を同定することができる。 Rコードはhttps://github.com/acabassi/logistic-regression-for-multi-omic-dataで入手できる。 Building classification models that predict a binary class label on the basis of high dimensional multi-omics datasets poses several challenges, due to the typically widely differing characteristics of the data layers in terms of number of predictors, type of data, and levels of noise. Previous research has shown that applying classical logistic regression with elastic-net penalty to these datasets can lead to poor results (Liu et al., 2018). We implement a two-step approach to multi-omic logistic regression in which variable selection is performed on each layer separately and a predictive model is then built using the variables selected in the first step. Here, our approach is compared to other methods that have been developed for the same purpose, and we adapt existing software for multi-omic linear regression (Zhao and Zucknick, 2020) to the logistic regression setting. 
Extensive simulation studies show that our approach should be preferred if the goal is to select as many relevant predictors as possible, as well as achieving prediction performances comparable to those of the best competitors. Our motivating example is a cardiometabolic syndrome dataset comprising eight 'omic data types for 2 extreme phenotype groups (10 obese and 10 lipodystrophy individuals) and 185 blood donors. Our proposed approach allows us to identify features that characterise cardiometabolic syndrome at the molecular level. R code is available at https://github.com/acabassi/logistic-regression-for-multi-omic-data.	翻訳日:2022-11-04 01:15:06 公開日:2020-08-01
# 深層学習モデルに基づくデータ合成管理 Data Synopses Management based on a Deep Learning Model ( http://arxiv.org/abs/2008.01560v1 ) ライセンス: Link先を確認	Panagiotis Fountas, Kostas Kolomvatsos, Christos Anagnostopoulos	(参考訳) 分散コンピューティングは、インテリジェントアプリケーションをサポートするためにエンドユーザに近い処理サービスを配置する。 IoT(Internet of Things)とエッジコンピューティング(Edge Computing)の出現により、前述のインフラストラクチャの相互接続において、さまざまなポイントにサービスを配置する余地を見つけることができる。重要な点は収集したデータの処理である。このような処理は、IoTデバイスと比較して計算能力の増大を示すECノード上で実現可能である。インテリジェントノードのエコシステムがECで作成され、協調モデルをサポートする機会を提供する。ノードは、iotデバイスレポートで定式化された地理的分散データセットのホストになる。データセットでは、いくつかのクエリ/タスクを実行できます。クエリ/タスクはパフォーマンス上の理由からオフロードできる。しかしながら、オフローディングアクションは、ホストノードに存在するデータと常に一致して、慎重に設計されるべきである。本稿では,ECインフラにおける協調的側面を支援するためのモデルを提案する。データシナプスをECノードに配信することで、ピアに存在するデータと完全に整合したオフロード決定を行えるようにしています。ノードはデータを交換して仲間に知らせる。本稿では,IoTデバイスがデータをECノードに報告する頻度が高いため,特にシナプスを頻繁に抽出する場合に,ネットワーク過負荷を回避するためのシナプス配布の適切な時間を検出する手法を提案する。我々のアプローチは、計算されたシナプスの分布を学習し、将来の傾向を推定するディープラーニングモデルである。これらのトレンドに基づいて、ピアノードにシナプスを提供する適切な時間を見つけることができます。提案するメカニズムを記述し,実際のデータセットに基づいて評価する。様々なシナリオに対する広範な実験は、数値的な結果を与えることでアプローチの長所と短所を明らかにする。 Pervasive computing involves the placement of processing services close to end users to support intelligent applications. With the advent of the Internet of Things (IoT) and the Edge Computing (EC), one can find room for placing services at various points in the interconnection of the aforementioned infrastructures. Of significant importance is the processing of the collected data. Such a processing can be realized upon the EC nodes that exhibit increased computational capabilities compared to IoT devices. An ecosystem of intelligent nodes is created at the EC giving the opportunity to support cooperative models. Nodes become the hosts of geo-distributed datasets formulated by the IoT devices reports. Upon the datasets, a number of queries/tasks can be executed. Queries/tasks can be offloaded for performance reasons. However, an offloading action should be carefully designed being always aligned with the data present to the hosting node. 
In this paper, we present a model to support the cooperative aspect in the EC infrastructure. We argue on the delivery of data synopses to EC nodes making them capable to take offloading decisions fully aligned with data present at peers. Nodes exchange data synopses to inform their peers. We propose a scheme that detects the appropriate time to distribute synopses trying to avoid the network overloading especially when synopses are frequently extracted due to the high rates at which IoT devices report data to EC nodes. Our approach involves a Deep Learning model for learning the distribution of calculated synopses and estimate future trends. Upon these trends, we are able to find the appropriate time to deliver synopses to peer nodes. We provide the description of the proposed mechanism and evaluate it based on real datasets. An extensive experimentation upon various scenarios reveals the pros and cons of the approach by giving numerical results.	翻訳日:2022-11-04 01:13:42 公開日:2020-08-01
# LDAとLSTMモデルを用いた2007年から2019年までのニューヨーク市の混雑課金に対する世論と重要グループの研究 Using LDA and LSTM Models to Study Public Opinions and Critical Groups Towards Congestion Pricing in New York City through 2007 to 2019 ( http://arxiv.org/abs/2008.07366v1 ) ライセンス: Link先を確認	Qian Ye, Xiaohong Chen, Onur Kalan, and Kaan Ozbay	(参考訳) 本研究は,ニューヨーク市の混雑課金の提案に対する人々の見方と反応が時間とともにどのように変化するのかを考察する。これらの反応を理解するために、Twitterデータを収集し分析する。活発なユーザと最も言及されたアカウントを統計的に分析することにより、繰り返されるプロセスにおける重要グループを検出し、テキストマイニングとLDAトピックモデリングやLSTM感情分類を含むハイブリッドな自然言語処理技術により、人々の長年の態度や関心の傾向を識別する。その結果、複数の利害団体が関与し、特に市長と知事、MTAおよび外縁区の代表者が重要な役割を演じた。大衆の関心の焦点は、計画の詳細からより広い都市の持続可能性と公平性へと移った。さらに、計画の承認はいくつかの要素、すなわち政治的プロセスで達した共同合意、現実世界での強い動機付け、複数の利益のバランスに基づくスキーム、課金の利益と必要性に対するグループの認識に依存する。 This study explores how people's views of and responses to the NYC congestion pricing proposals evolve over time. To understand these responses, Twitter data is collected and analyzed. Critical groups in the recurrent process are detected by statistically analyzing the active users and the most mentioned accounts, and the trends of people's attitudes and concerns over the years are identified with text mining and hybrid Natural Language Processing techniques, including LDA topic modeling and LSTM sentiment classification. The result shows that multiple interest groups were involved and played crucial roles during the proposal, especially the Mayor and Governor, the MTA, and outer-borough representatives. The public's focus of concern shifted from the plan details to the wider sustainability and fairness of the city. Furthermore, the plan's approval relies on several elements: the joint agreement reached in the political process, strong motivation in the real world, a scheme based on balancing multiple interests, and groups' awareness of tolling's benefits and necessity.	翻訳日:2022-11-04 01:13:17 公開日:2020-08-01
# プリフェッチャー適応のためのカスタムメイドのランダムフォレスト群 Custom Tailored Suite of Random Forests for Prefetcher Adaptation ( http://arxiv.org/abs/2008.00176v1 ) ライセンス: Link先を確認	Furkan Eris, Sadullah Canakci, Cansu Demirkiran, Ajay Joshi	(参考訳) メモリとプロセッサ間のギャップを埋め、パフォーマンスを向上させるために、データ/命令プリフェッチャー設計の分野では、多くの研究が行われている。プリフェッチャーはメモリ階層の各レベルにデプロイされるが、通常、各プリフェッチャーはシステム内の他のプリフェッチャーを包括的に考慮せずに設計される。結果として、これらの個別のプリフェッチャー設計は必ずしも相補的ではなく、平均的な性能向上が低くなったり、多くの負の外れ値が生じたりする。本研究では,ランダムフォレスト群を用いて,各メモリレベルでどのプリフェッチャーをオンにすべきかを実行時に決定し,それらが互いに補完し合うようにするハードウェアプリフェッチャーアダプタであるSuitAP(Suite of random forests for Adaptation of Prefetcher system configuration)を提案する。プリフェッチャーのない設計と比較して、SuitAPを使うことで、12KBのオーバーヘッドで、SPEC2017スイートから生成されるトレースの平均でIPCを46%改善する。また,SuitAPを用いることで負の外れ値も低減する。 To close the gap between memory and processors, and in turn improve performance, there has been an abundance of work in the area of data/instruction prefetcher designs. Prefetchers are deployed in each level of the memory hierarchy, but typically, each prefetcher gets designed without comprehensively accounting for other prefetchers in the system. As a result, these individual prefetcher designs do not always complement each other, and that leads to low average performance gains and/or many negative outliers. In this work, we propose SuitAP (Suite of random forests for Adaptation of Prefetcher system configuration), which is a hardware prefetcher adapter that uses a suite of random forests to determine at runtime which prefetcher should be ON at each memory level, such that they complement each other. Compared to a design with no prefetchers, using SuitAP we improve IPC by 46% on average across traces generated from the SPEC2017 suite with 12KB overhead. Moreover, we also reduce negative outliers using SuitAP.	翻訳日:2022-11-04 01:12:11 公開日:2020-08-01
# NFVシステムの障害を考慮したサービスチェーン構成:ゲーム理論の視点から Service Chain Composition with Failures in NFV Systems: A Game-Theoretic Perspective ( http://arxiv.org/abs/2008.00208v1 ) ライセンス: Link先を確認	Simeng Bian, Xi Huang, Ziyu Shao, Xin Gao, Yang Yang	(参考訳) 最先端のネットワーク機能仮想化(nfv)システムでは、超低リクエストレイテンシと最小ネットワーク混雑を持つ異なるネットワークサービス(nss)に対して、効果的なサービスチェーン構成を行うことが依然として重要な課題である。この目的のために、既存のソリューションは、プライバシーの問題を無視し、ユーザの非協力的な振る舞いを無視しながら、ネットワーク状態の完全な知識を必要とします。さらに、ユーザ不使用や仮想マシンのダウンといった予期せぬ失敗に直面している場合もあります。本稿では,非協調ゲームとして失敗するNFVシステムにおけるサービスチェーン構成の問題点を定式化する。このようなゲームが重み付きポテンシャルゲームであることを示すことによって、異なるNSのサービスチェーン組成をNash平衡状態(NE)へ誘導する2つの効果的な分散スキームを提案する。さらに, 深部強化学習 (DRL) とモンテカルロ木探索 (MCTS) に基づく2つの新しい学習支援スキームを比較として開発した。提案手法の有効性と, 故障時の適応性について理論的解析およびシミュレーションにより検証した。 For state-of-the-art network function virtualization (NFV) systems, it remains a key challenge to conduct effective service chain composition for different network services (NSs) with ultra-low request latencies and minimum network congestion. To this end, existing solutions often require full knowledge of the network state, while ignoring the privacy issues and overlooking the non-cooperative behaviors of users. What is more, they may fall short in the face of unexpected failures such as user unavailability and virtual machine breakdown. In this paper, we formulate the problem of service chain composition in NFV systems with failures as a non-cooperative game. By showing that such a game is a weighted potential game and exploiting the unique problem structure, we propose two effective distributed schemes that guide the service chain compositions of different NSs towards the Nash equilibrium (NE) state with both near-optimal latencies and minimum congestion. Besides, we develop two novel learning-aided schemes as comparisons, which are based on deep reinforcement learning (DRL) and Monte Carlo tree search (MCTS) techniques, respectively. 
Our theoretical analysis and simulation results demonstrate the effectiveness of our proposed schemes, as well as the adaptivity when faced with failures.	翻訳日:2022-11-04 01:06:14 公開日:2020-08-01
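The abstract above relies on the fact that, in a potential game, users who keep switching to their individually best option eventually reach a Nash equilibrium. The following is only a minimal sketch of that idea, not the authors' distributed schemes: it assumes a toy singleton congestion game in which the latency of a resource is a hypothetical slope times the number of users on it. The names `best_response_dynamics`, `is_nash_equilibrium`, and `slopes` are illustrative and do not come from the paper.

```python
def best_response_dynamics(num_users, slopes, max_rounds=1000):
    """Run best-response dynamics in a toy singleton congestion game.

    Assumed setup (not from the paper): each user picks one resource and
    the latency on resource r is slopes[r] * (number of users on r).
    """
    choice = [0] * num_users          # every user starts on resource 0
    load = [0] * len(slopes)
    load[0] = num_users

    def cost_if_on(user, r):
        # Latency the user would see on r, counting itself exactly once.
        extra = 0 if choice[user] == r else 1
        return slopes[r] * (load[r] + extra)

    for _ in range(max_rounds):
        moved = False
        for user in range(num_users):
            current = choice[user]
            best = min(range(len(slopes)), key=lambda r: cost_if_on(user, r))
            if cost_if_on(user, best) < cost_if_on(user, current):
                load[current] -= 1
                load[best] += 1
                choice[user] = best
                moved = True
        if not moved:                 # no user can improve: Nash equilibrium
            return choice, load
    raise RuntimeError("no equilibrium reached within max_rounds")


def is_nash_equilibrium(choice, load, slopes):
    """Check that no single user can lower its latency by switching."""
    for current in choice:
        current_cost = slopes[current] * load[current]
        for r in range(len(slopes)):
            if r != current and slopes[r] * (load[r] + 1) < current_cost:
                return False
    return True


choice, load = best_response_dynamics(num_users=6, slopes=[1, 2])
print(load, is_nash_equilibrium(choice, load, [1, 2]))
```

Every improving move strictly lowers a shared potential function, so the loop cannot cycle; this is why the sketch can stop as soon as one full round passes without a move. The weighted game and the failure handling studied in the paper are not modelled here.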
# 畳み込みニューラルネットワークを用いた糖尿病網膜症の診断 Diabetic Retinopathy Diagnosis based on Convolutional Neural Network ( http://arxiv.org/abs/2008.00148v1 ) ライセンス: Link先を確認	Mohammed hamzah abed, Lamia Abed Noor Muhammed, Sarah Hussein Toman	(参考訳) 糖尿病網膜症(DR)は、加齢や糖尿病の結果として多くの人に見られる一般的な疾患であり、失明を引き起こす可能性がある。そのため、特に早期にこの疾患を診断することで、多くの患者への影響を防ぐことができる。この診断には、網膜を継続的に検査する必要がある。したがって、この分野ではコンピュータビジョン技術に基づくコンピュータ支援ツールが使用できる。様々な機械学習技術を用いて様々な研究が行われている。畳み込みニューラルネットワーク(CNN)は有望な手法の一つであり,本稿では糖尿病網膜症の検出に用いる。また、提案手法は前処理フェーズにおける視覚的強調を含み、その後、健康な網膜画像と不健康な網膜画像を判別するために、認識および分類フェーズに向けてCNNモデルを訓練する。 3つの公開データセット DiaretDB0, DiaretDB1, DrimDB が実用的なテストに使用された。この研究の実装はMatlab R2019aに基づいており、Deep Learning ToolboxとDeep Network Designerを用いて畳み込みニューラルネットワークのアーキテクチャを設計し、訓練した。結果は複数の指標で評価され、その1つが精度である。達成された最高精度は、DiaretDB0で100%、DiaretDB1で99.495%、DrimDBで97.55%である。 Diabetic Retinopathy (DR) is a common disease that affects many people as a result of age or diabetes, and it can cause blindness. Therefore, diagnosing this disease, especially at an early stage, can prevent its effects for many patients. To achieve this diagnosis, the retina must be examined continuously. Therefore, computer-aided tools based on computer vision techniques can be used in this field. Different works have been performed using various machine learning techniques. The Convolutional Neural Network (CNN) is one of the promising methods, so it is used for Diabetic Retinopathy detection in this paper. The proposed work also includes visual enhancement in the pre-processing phase; the CNN model is then trained for the recognition and classification phase to distinguish healthy from unhealthy retina images. Three public datasets, DiaretDB0, DiaretDB1 and DrimDB, were used in practical testing. This work was implemented in Matlab R2019a, using the Deep Learning Toolbox and Deep Network Designer to design the architecture of the convolutional neural network and train it. 
The results were evaluated using different metrics, accuracy being one of them. The best accuracy achieved was 100% for DiaretDB0, 99.495% for DiaretDB1 and 97.55% for DrimDB.	翻訳日:2022-11-04 01:05:25 公開日:2020-08-01
# 画像分割用ファジィアクティブ輪郭モデルの現状 State-of-The-Art Fuzzy Active Contour Models for Image Segmentation ( http://arxiv.org/abs/2008.00175v1 ) ライセンス: Link先を確認	Ajoy Mondal and Kuntal Ghosh	(参考訳) イメージセグメンテーションは、すべての画像解析タスクの最初のステップである。数十年の間に、様々なセグメンテーションアルゴリズムが文献で提案され、ある程度の成功を収めた。その中で、ファジィエネルギーに基づくアクティブ輪郭モデルが過去10年間に研究者に注目され、様々な方法が開発されている。良いセグメンテーションアルゴリズムは、ノイズ、ぼかし、低コントラスト、領域の不均一性などを含む多数の画像でうまく機能するべきである。しかし、既存のファジィエネルギーに基づくアクティブ輪郭モデルの多くの性能は、通常、限られた数の画像に基づいて評価されている。本稿では,既存のファジィアクティブ輪郭モデルについて理論的観点から検討し,様々な条件下での大規模な画像集合に対して実験的に評価することを目的とする。多様な画像に基づく解析は、様々なファジィアクティブ輪郭モデルの強みと弱みについて客観的な洞察を与える。最後に,本トピックに関する課題と今後の研究方向性について考察する。 Image segmentation is the initial step for every image analysis task. A large variety of segmentation algorithms have been proposed in the literature over several decades, with mixed success. Among them, fuzzy energy-based active contour models have attracted researchers' attention during the last decade, resulting in the development of various methods. A good segmentation algorithm should perform well on a large number of images containing noise, blur, low contrast, region inhomogeneity, etc. However, the performance of most existing fuzzy energy-based active contour models has typically been evaluated on a limited number of images. In this article, our aim is to review the existing fuzzy active contour models from a theoretical point of view and also to evaluate them experimentally on a large set of images under various conditions. The analysis over a large variety of images provides objective insight into the strengths and weaknesses of various fuzzy active contour models. Finally, we discuss several issues and future research directions on this particular topic.	翻訳日:2022-11-04 01:05:04 公開日:2020-08-01
# PERCH 2.0 : オブジェクトポス推定による高速かつ高精度なGPU認識 PERCH 2.0 : Fast and Accurate GPU-based Perception via Search for Object Pose Estimation ( http://arxiv.org/abs/2008.00326v1 ) ライセンス: Link先を確認	Aditya Agarwal, Yupeng Han, Maxim Likhachev	(参考訳) 既知のオブジェクトのポース推定は、ロボットの把握や操作といったタスクに不可欠である。確実な把握の必要性は、動的環境における乱雑で隠蔽されたシーンのポーズ推定に厳密な精度要件を課す。現代の手法では,3次元モデルと観測データとの対応を見つけるために,大量のトレーニングデータを用いて特徴を学習する。しかし、これらの方法は根拠真理の広範な注釈を必要とする。別の方法として、レンダリング可能なシーンの空間で観察されたシーンの最良の説明を求めるアルゴリズムを使う方法がある。最近開発された PERCH (PErception Via SeaRCH) アルゴリズムは、深度データを用いて、特別に構築された木を探索して、グローバルに最適な解に収束する。 PERCHは精度に強い保証を提供するが、現在の定式化は高いランタイムのためスケーラビリティの低下に悩まされている。さらに、ポーズ推定のための深さデータのみに依存するため、アルゴリズムは2つのオブジェクトが同じ形状のシーンに制限される。本稿では,GPUアクセラレーションとRGBデータを活用する検索戦略による新しい認識手法であるPERCH 2.0を提案する。その結果,本手法は6自由度姿勢推定における最先端データ駆動アプローチよりも100倍のスピードアップを達成でき,トレーニングデータに基礎的真理をアノテートする必要がなく,精度が向上することが示された。私たちのコードとビデオはhttps://sbpl-cruz.github.io/perception/で閲覧できます。 Pose estimation of known objects is fundamental to tasks such as robotic grasping and manipulation. The need for reliable grasping imposes stringent accuracy requirements on pose estimation in cluttered, occluded scenes in dynamic environments. Modern methods employ large sets of training data to learn features in order to find correspondence between 3D models and observed data. However these methods require extensive annotation of ground truth poses. An alternative is to use algorithms that search for the best explanation of the observed scene in a space of possible rendered scenes. A recently developed algorithm, PERCH (PErception Via SeaRCH) does so by using depth data to converge to a globally optimum solution using a search over a specially constructed tree. While PERCH offers strong guarantees on accuracy, the current formulation suffers from low scalability owing to its high runtime. In addition, the sole reliance on depth data for pose estimation restricts the algorithm to scenes where no two objects have the same shape. 
In this work, we propose PERCH 2.0, a novel perception via search strategy that takes advantage of GPU acceleration and RGB data. We show that our approach can achieve a speedup of 100x over PERCH, as well as better accuracy than the state-of-the-art data-driven approaches on 6-DoF pose estimation without the need for annotating ground truth poses in the training data. Our code and video are available at https://sbpl-cruz.github.io/perception/.	翻訳日:2022-11-04 01:03:49 公開日:2020-08-01
# Fog-Assisted IoTシステムにおけるグリーンオフロード - 学習と制御を統合するオンラインパースペクティブ Green Offloading in Fog-Assisted IoT Systems: An Online Perspective Integrating Learning and Control ( http://arxiv.org/abs/2008.00199v1 ) ライセンス: Link先を確認	Xin Gao, Xi Huang, Ziyu Shao, Yang Yang	(参考訳) フォグアシスト型IoTシステムでは、タスク処理のレイテンシとエネルギー消費を減らすために、IoTデバイスから近隣のフォグノードにタスクをオフロードすることが一般的である。しかし, 処理能力や伝送速度などのシステムダイナミクスに不確実性があるため, オンラインのエネルギー効率の高いスキームの設計は依然として未解決の課題である。さらに、決定プロセスはフォグノードやIoTデバイスのリソース制限によって制約されるため、設計はさらに複雑になる。本稿では,未知のシステムダイナミクスを伴うタスクオフロード問題を,時間平均エネルギー消費の長期的制約を持つ組合せ多腕バンディット(CMAB)問題として定式化する。オンライン学習とオンライン制御の効果的な統合を通じて,Learning-Aided Green Offloading(LAGO)方式を提案する。 LAGOでは,活用と探索のトレードオフを扱うためにバンディット学習法を採用し,長期的制約に対処するために仮想キュー技術を利用する。理論的解析により,LAGOは有限の時間軸において調整可能な劣線形リグレット上界で平均タスク遅延を低減し,長期的な時間平均エネルギー制約を満たすことが示された。このような理論結果を検証するために,広範なシミュレーションを行う。 In fog-assisted IoT systems, it is a common practice to offload tasks from IoT devices to their nearby fog nodes to reduce task processing latencies and energy consumption. However, the design of an online energy-efficient scheme is still an open problem because of various uncertainties in system dynamics such as processing capacities and transmission rates. Moreover, the decision-making process is constrained by resource limits on fog nodes and IoT devices, making the design even more complicated. In this paper, we formulate such a task offloading problem with unknown system dynamics as a combinatorial multi-armed bandit (CMAB) problem with long-term constraints on time-averaged energy consumption. Through an effective integration of online learning and online control, we propose a Learning-Aided Green Offloading (LAGO) scheme. In LAGO, we employ bandit learning methods to handle the exploitation-exploration tradeoff and utilize virtual queue techniques to deal with the long-term constraints. 
Our theoretical analysis shows that LAGO can reduce the average task latency with a tunable sublinear regret bound over a finite time horizon and satisfy the long-term time-averaged energy constraints. We conduct extensive simulations to verify such theoretical results.	翻訳日:2022-11-04 01:02:54 公開日:2020-08-01
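The combination of bandit learning with a virtual queue described above can be illustrated with a small sketch. This is not the LAGO algorithm itself. It is a hedged toy version that assumes a single user choosing one fog node per slot, a known per-node energy cost, noisy latency feedback, and a hypothetical trade-off weight `v`. All function and parameter names are illustrative.

```python
import math
import random


def update_virtual_queue(q, energy_used, budget):
    """Virtual queue: grows when a slot spends more energy than the budget."""
    return max(q + energy_used - budget, 0.0)


def run_offloading(mean_latency, energy, budget, horizon, v=10.0, seed=0):
    """Toy bandit-plus-virtual-queue offloading loop (assumed model).

    mean_latency[i] is unknown to the learner and only used to simulate
    feedback; energy[i] is assumed known. Returns the average latency,
    the average energy per slot, and the final virtual queue length.
    """
    rng = random.Random(seed)
    k = len(mean_latency)
    counts = [0] * k
    est_latency = [0.0] * k
    q = 0.0
    total_latency = 0.0
    total_energy = 0.0
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1                      # try every node once
        else:
            def score(i):
                # Optimistic (lower-confidence) latency estimate.
                bonus = math.sqrt(2.0 * math.log(t) / counts[i])
                optimistic = max(est_latency[i] - bonus, 0.0)
                # Penalty (latency) weighted by v, plus queue-weighted energy.
                return v * optimistic + q * energy[i]
            arm = min(range(k), key=score)
        latency = max(rng.gauss(mean_latency[arm], 0.05), 0.0)
        counts[arm] += 1
        est_latency[arm] += (latency - est_latency[arm]) / counts[arm]
        q = update_virtual_queue(q, energy[arm], budget)
        total_latency += latency
        total_energy += energy[arm]
    return total_latency / horizon, total_energy / horizon, q


mean_lat, mean_energy, queue = run_offloading(
    mean_latency=[0.2, 0.5, 0.8], energy=[3.0, 2.0, 1.0],
    budget=2.0, horizon=2000)
print(mean_lat, mean_energy, queue)
```

Because the queue only grows by the per-slot overshoot, the average energy can exceed the budget by at most the final queue length divided by the horizon. A larger `v` favours latency over the energy constraint, which mirrors the tunable trade-off mentioned in the abstract.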
# シャドウセグメンテーションからシャドウの除去まで From Shadow Segmentation to Shadow Removal ( http://arxiv.org/abs/2008.00267v1 ) ライセンス: Link先を確認	Hieu Le and Dimitris Samaras	(参考訳) シャドウとシャドウのない画像のペアの必要性はシャドウ除去データセットのサイズと多様性を制限し、大規模なロバストなシャドウ除去アルゴリズムのトレーニングを妨げている。本研究では,影画像から抽出した陰影と非陰影パッチのみを用いて,陰影除去法を提案する。本手法は,影形成の物理モデルに従って,敵対的枠組みを用いて学習する。我々の中心的な貢献は、この逆行訓練を可能にする物理に基づく一連の制約である。提案手法は,完全対影画像と無影画像で訓練した最先端手法と比較して,競争力のあるシャドウ除去を実現する。私たちのトレーニング体制の利点は、ビデオのシャドウ除去においてさらに顕著です。本手法は,事前学習したシャドウ検出器で生成したシャドウマスクのみを用いて,テストビデオ上で微調整を行うことができる。本手法の利点を,提案するビデオシャドウ除去データセットに示す。 The requirement for paired shadow and shadow-free images limits the size and diversity of shadow removal datasets and hinders the possibility of training large-scale, robust shadow removal algorithms. We propose a shadow removal method that can be trained using only shadow and non-shadow patches cropped from the shadow images themselves. Our method is trained via an adversarial framework, following a physical model of shadow formation. Our central contribution is a set of physics-based constraints that enables this adversarial training. Our method achieves competitive shadow removal results compared to state-of-the-art methods that are trained with fully paired shadow and shadow-free images. The advantages of our training regime are even more pronounced in shadow removal for videos. Our method can be fine-tuned on a testing video with only the shadow masks generated by a pre-trained shadow detector and outperforms state-of-the-art methods on this challenging test. We illustrate the advantages of our method on our proposed video shadow removal dataset.	翻訳日:2022-11-04 00:57:11 公開日:2020-08-01
# ロバストな空間的・時間的特徴を用いたスケルトンに基づく行動認識の改善 Improving Skeleton-based Action Recognition with Robust Spatial and Temporal Features ( http://arxiv.org/abs/2008.00324v1 ) ライセンス: Link先を確認	Zeshi Yang and Kangkang Yin	(参考訳) 近年,骨格に基づく行動認識がコンピュータビジョンコミュニティにおいて顕著な進歩を遂げている。ほとんどの最先端アルゴリズムは、Graph Convolutional Networks (GCN)に基づいており、バックボーンGCN層のネットワーク構造の改善を目指している。本稿では,空間と時間におけるよりロバストな識別的特徴を学ぶための新しいメカニズムを提案する。より具体的には、ネットワークの最後の層にDiscriminative Feature Learning (DFL)ブランチを追加し、識別的な空間的特徴と時間的特徴を抽出して学習の正則化を補助する。また、ニューラルネットワークへの入力としてDirection-Invariant Features (DIF)の使用を正式に提唱する。これらのロバストな特徴を学習し使用すると、行動認識精度が向上することを示す。 NTU-RGBD60, NTU-RGBD120, SYSU 3DHOI, Skeleton-Kineticsの4つのデータセットにおいて、ST-GCNおよび関連手法との比較を行った。 Recently skeleton-based action recognition has made significant progress in the computer vision community. Most state-of-the-art algorithms are based on Graph Convolutional Networks (GCN), and target at improving the network structure of the backbone GCN layers. In this paper, we propose a novel mechanism to learn more robust discriminative features in space and time. More specifically, we add a Discriminative Feature Learning (DFL) branch to the last layers of the network to extract discriminative spatial and temporal features to help regularize the learning. We also formally advocate the use of Direction-Invariant Features (DIF) as input to the neural networks. We show that action recognition accuracy can be improved when these robust features are learned and used. We compare our results with those of ST-GCN and related methods on four datasets: NTU-RGBD60, NTU-RGBD120, SYSU 3DHOI and Skeleton-Kinetics.	翻訳日:2022-11-04 00:56:25 公開日:2020-08-01
# 時空間関係学習による不確実性に基づく交通事故予測 Uncertainty-based Traffic Accident Anticipation with Spatio-Temporal Relational Learning ( http://arxiv.org/abs/2008.00334v1 ) ライセンス: Link先を確認	Wentao Bao and Qi Yu and Yu Kong	(参考訳) 交通事故予測は、ダッシュカムのビデオから事故をできるだけ早く予測することを目的としており、安全性が保証された自動運転システムにとって重要である。交通シーンが雑然としており、視覚的手がかりが限られているため、早期に観測されたフレームから事故がどれだけ早く起こるかを予測することは、非常に難しい。多くの既存手法は,事故予測のために事故関連エージェントの特徴を学習するよう開発されており,それらの空間的・時間的関係の特徴を無視している。さらに、現在の決定論的ディープニューラルネットワークは誤った予測を過信する可能性があり、自動運転システムによる交通事故のリスクが高くなる。本稿では,時空間関係学習を用いた不確実性に基づく事故予測モデルを提案する。このモデルはダッシュカムビデオから交通事故発生確率を逐次予測する。具体的には,リレーショナル特徴学習におけるグラフ畳み込みとリカレントネットワークの活用を提案し,ベイズニューラルネットワークを用いて潜在関係表現の内在的変動に対処する。導出した不確実性に基づくランキング損失は,リレーショナル特徴の品質向上により,モデル性能を著しく向上させることがわかった。さらに,環境属性や事故理由アノテーションを含む,交通事故予測のための新しい自動車事故データセット(CCD)を収集した。新たにコンパイルされたデータセットとパブリックデータセットの両方の実験結果から,我々のモデルの最先端性能が示された。私たちのコードとCCDデータセットは https://github.com/Cogito2012/UString から入手可能です。 Traffic accident anticipation aims to predict accidents from dashcam videos as early as possible, which is critical to safety-guaranteed self-driving systems. With cluttered traffic scenes and limited visual cues, it is very challenging to predict how soon an accident will occur from early observed frames. Most existing approaches are developed to learn features of accident-relevant agents for accident anticipation, while ignoring the features of their spatial and temporal relations. Besides, current deterministic deep neural networks could be overconfident in false predictions, leading to high risk of traffic accidents caused by self-driving systems. In this paper, we propose an uncertainty-based accident anticipation model with spatio-temporal relational learning. It sequentially predicts the probability of traffic accident occurrence with dashcam videos. Specifically, we propose to take advantage of graph convolution and recurrent networks for relational feature learning, and leverage Bayesian neural networks to address the intrinsic variability of latent relational representations. 
The derived uncertainty-based ranking loss is found to significantly boost model performance by improving the quality of relational features. In addition, we collect a new Car Crash Dataset (CCD) for traffic accident anticipation which contains environmental attributes and accident reasons annotations. Experimental results on both public and the newly-compiled datasets show state-of-the-art performance of our model. Our code and CCD dataset are available at https://github.com/Cogito2012/UString.	翻訳日:2022-11-04 00:56:07 公開日:2020-08-01
# 抽出要約実験:整数線形プログラミング、項/文のスコーリング、およびタイトル駆動モデル Experiments in Extractive Summarization: Integer Linear Programming, Term/Sentence Scoring, and Title-driven Models ( http://arxiv.org/abs/2008.00140v1 ) ライセンス: Link先を確認	Daniel Lee and Rakesh Verma and Avisha Das and Arjun Mukherjee	(参考訳) 本稿では,教師なし単一文書要約の課題を再検討し,整数線形計画法(ilp)に基づくアルゴリズム,項・文スコアのパラメータ化正規化,要約のためのタイトル駆動アプローチについて検討する。我々は,新たなフレームワークであるNewsSummについて述べる。このフレームワークには,ILPやタイトル駆動アプローチを含む,要約のための既存および新しいアプローチが多数含まれている。 NewsSummの柔軟性は、異なるアルゴリズムと文のスコアリングスキームをシームレスに組み合わせることができる。文得点とilpと正規化を組み合わせることは,この話題に対するこれまでの研究とは対照的な結果であり,最適なパラメータの探索がより重要となることを示す。また,新たなタイトル駆動型削減アイデアは,検討中の非監督型と監督型の両方のアプローチのパフォーマンス向上につながることを示す。 In this paper, we revisit the challenging problem of unsupervised single-document summarization and study the following aspects: Integer linear programming (ILP) based algorithms, Parameterized normalization of term and sentence scores, and Title-driven approaches for summarization. We describe a new framework, NewsSumm, that includes many existing and new approaches for summarization including ILP and title-driven approaches. NewsSumm's flexibility allows to combine different algorithms and sentence scoring schemes seamlessly. Our results combining sentence scoring with ILP and normalization are in contrast to previous work on this topic, showing the importance of a broader search for optimal parameters. We also show that the new title-driven reduction idea leads to improvement in performance for both unsupervised and supervised approaches considered.	翻訳日:2022-11-04 00:55:12 公開日:2020-08-01
# 質問ベースシステムの明確化に関する実証的研究 An Empirical Study of Clarifying Question-Based Systems ( http://arxiv.org/abs/2008.00279v1 ) ライセンス: Link先を確認	Jie Zou, Evangelos Kanoulas, and Yiqun Liu	(参考訳) ユーザの情報ニーズをよりよく理解するために,質問を明確にするためのイニシアチブを取り入れた検索・レコメンデーションシステムは,研究コミュニティから注目を集めている。しかし、私たちの知る限りでは、ユーザがこれらの質問に答える意思があるかどうかを定量化するための実証的研究はない。本研究では,製品リポジトリに対する質問を明確にすることでユーザと対話する実験システムを展開することで,オンライン実験を行う。暗黙のインタラクション行動データとそれを示すユーザからの明示的なフィードバックの両方を収集します。 (a)ユーザーは、多くの明確な質問(平均11～21件)に答える意思があるが、それ以上は多くない。 b) ほとんどのユーザは,対象製品に到達するまで質問に回答するが,そのごく一部は,疲労や無関係な質問の受け取りによって停止する。 c) ユーザの回答の一部(12-17%)は,実際には対象製品の説明とは反対である。 (d) ユーザ(66～84%)の多くは,タスク完了に有用な質問ベースシステムを見出している。本研究の結果は,現在のシミュレーション評価の前提と矛盾するものが多いが,評価フレームワークの改善を示唆し,今後の対話型検索/リコンペンダーシステム設計に刺激を与える可能性がある。 Search and recommender systems that take the initiative to ask clarifying questions to better understand users' information needs are receiving increasing attention from the research community. However, to the best of our knowledge, there is no empirical study to quantify whether and to what extent users are willing or able to answer these questions. In this work, we conduct an online experiment by deploying an experimental system, which interacts with users by asking clarifying questions against a product repository. We collect both implicit interaction behavior data and explicit feedback from users showing that: (a) users are willing to answer a good number of clarifying questions (11-21 on average), but not many more than that; (b) most users answer questions until they reach the target product, but also a fraction of them stops due to fatigue or due to receiving irrelevant questions; (c) part of the users' answers (12-17%) are actually opposite to the description of the target product; while (d) most of the users (66-84%) find the question-based system helpful towards completing their tasks. 
Some of the findings of the study contradict current assumptions on simulated evaluations in the field, while they point towards improvements in the evaluation framework and can inspire future interactive search/recommender system designs.	翻訳日:2022-11-04 00:54:55 公開日:2020-08-01
# (K-means)-階層型並列遺伝的アルゴリズムによるクラスタベース情報検索 Cluster-Based Information Retrieval by using (K-means)- Hierarchical Parallel Genetic Algorithms Approach ( http://arxiv.org/abs/2008.00150v1 ) ライセンス: Link先を確認	Sarah Hussein Toman, Mohammed Hamzah Abed, Zinah Hussein Toman	(参考訳) クラスタベースの情報検索は、特徴を整理し、抽出し、類似性に応じてWebドキュメントを分類する情報検索(IR)ツールの1つである。従来のアプローチとは異なり、クラスタベースのIRはドキュメントの大きなデータセットを処理するのが速い。検索された文書の品質を高め、IRの効率を高め、ユーザ検索から無関係な文書を減らすために、本稿では,K-meansクラスタリングアルゴリズムとマルチデメとマスタ/スレーブPGのハイブリッドPGを組み合わせた(K-means)階層並列遺伝的アルゴリズムアプローチ(HPGA)を提案する。 K-meansは集団を k 個のサブポピュレーションにクラスタリングするために用いられ、クエリに最も関連するクラスタが2つのレベルの遺伝的並列性によって並列に処理される。これにより非関連文書はサブポピュレーションに含まれず、結果の質が向上する。 3つの共通データセット(NLP、CISI、CACM)は、リコール、精度、F測定平均を計算するために使用される。最後に、3つのデータセットの精度を遺伝的IRと古典IRと比較した。 IR-GAによるアプローチ精度の改善はCACMで45%,CISIで27%,NLPで25%であった。一方、Classic-IRと比較すると、(k-means)-HPGAはCACMが47%、CISIが28%、NLPが34%であった。 Cluster-based information retrieval is one of the information retrieval (IR) tools that organize, extract features from, and categorize web documents according to their similarity. Unlike traditional approaches, cluster-based IR is fast in processing large datasets of documents. To improve the quality of retrieved documents, increase the efficiency of IR, and reduce irrelevant documents in user searches, in this paper we propose a (K-means)-Hierarchical Parallel Genetic Algorithms Approach (HPGA) that combines the K-means clustering algorithm with a hybrid PG of multi-deme and master/slave PG algorithms. K-means is used to cluster the population into k subpopulations; the clusters most relevant to the query are then processed in parallel by the two levels of genetic parallelism, so that irrelevant documents are not included in the subpopulations, which improves the quality of results. Three common datasets (NLP, CISI, and CACM) are used to compute the recall, precision, and F-measure averages. Finally, we compared the precision values of three datasets with Genetic-IR and classic-IR. 
The precision improvements of the proposed approach over IR-GA were 45% on CACM, 27% on CISI, and 25% on NLP, while in comparison with Classic-IR, (k-means)-HPGA obtained 47% on CACM, 28% on CISI, and 34% on NLP.	翻訳日:2022-11-04 00:54:34 公開日:2020-08-01
# マルチリソースフェアネスを用いたフォグコンピューティングのためのオンラインタスクスケジューリング Online Task Scheduling for Fog Computing with Multi-Resource Fairness ( http://arxiv.org/abs/2008.00207v1 ) ライセンス: Link先を確認	Simeng Bian, Xi Huang, Ziyu Shao	(参考訳) フォグコンピューティングシステムでは、オンラインタスクスケジューリング、すなわち、エンドデバイスから連続的に生成されるタスクのリソース割り当てを決定することが重要な課題である。この設計は、フォグコンピューティングシステムに現れる様々な不確実性のために困難であり、例えば、実際の到着前にタスクのリソース要求が不明である。最近の研究は、オンラインタスクスケジューリングと様々な目的の改善のために、深層強化学習(DRL)技術を適用している。しかし、異なるタスクに対するマルチリソースの公平性を見落としており、これはタスク間で公平なリソース共有を実現するための鍵となるが、一般的には自明ではない。このように、マルチリソースフェアネスを備えたオンラインタスクスケジューリングスキームを設計することは、依然としてオープンな問題である。本稿では,上記の課題に対処する。特に,drl技術を活用して支配的資源公平性(drf)という考え方を採用することで,経験から直接学習し,タスク間の公平性を確保しつつ平均的なタスクスローダウンを効果的に短縮するオンラインタスクスケジューリングスキームであるfairtsを提案する。シミュレーションの結果、FairTSはタスクの遅くなり、リソースの公平性が向上し、最先端のスキームよりも優れていた。 In fog computing systems, one key challenge is online task scheduling, i.e., to decide the resource allocation for tasks that are continuously generated from end devices. The design is challenging because of various uncertainties manifested in fog computing systems; e.g., tasks' resource demands remain unknown before their actual arrivals. Recent works have applied deep reinforcement learning (DRL) techniques to conduct online task scheduling and improve various objectives. However, they overlook the multi-resource fairness for different tasks, which is key to achieving fair resource sharing among tasks but in general non-trivial to achieve. Thusly, it is still an open problem to design an online task scheduling scheme with multi-resource fairness. In this paper, we address the above challenges. Particularly, by leveraging DRL techniques and adopting the idea of dominant resource fairness (DRF), we propose FairTS, an online task scheduling scheme that learns directly from experience to effectively shorten average task slowdown while ensuring multi-resource fairness among tasks. 
Simulation results show that FairTS outperforms state-of-the-art schemes with an ultra-low task slowdown and better resource fairness.	翻訳日:2022-11-04 00:54:11 公開日:2020-08-01
# スパイキングニューラルネットワークを用いた輪郭追跡改善のための適応ケモタキシー Adaptive Chemotaxis for improved Contour Tracking using Spiking Neural Networks ( http://arxiv.org/abs/2008.00317v1 ) ライセンス: Link先を確認	Shashwat Shukla, Rohan Pathak, Vivek Saraswat and Udayan Ganguly	(参考訳) 本稿では,線虫Caenorhabditis elegansの走化性ネットワークに触発された自律ナビゲーションのためのスパイキングニューラルネットワーク(SNN)を提案する。特に、輪郭追跡の問題に焦点を当てる。この問題では、ロボットが所望の濃度設定点に到達し、その後それに追従する必要がある。 klinokinesisのみを使用した過去のスキームは、効率的に輪郭に従うことができるが、セットポイントに到達するのに過度な時間がかかる。我々は,従来提案されていた勾配クライミング回路を基盤とした新しい適応型klinotaxis機構を提案することで,この欠点に対処する。我々は,我々のklinotaxis回路が,勾配上昇や勾配降下を行うように自律的に構成され,その後無効化されることで,前述のklinokinesis回路とシームレスに統合できることを実証する。また,速度制御(orthokinesis)を取り入れ,輪郭追跡性能をさらに向上させた。これにより,klinokinesis,klinotaxis,orthokinesisの統合に成功したモデルを初めて提示する。輪郭追跡シミュレーションにより,提案手法がセットポイントに到達するまでの時間の2.4倍削減と,セットポイントからの平均偏差の8.7倍削減を同時に実現することを実証した。 In this paper we present a Spiking Neural Network (SNN) for autonomous navigation, inspired by the chemotaxis network of the worm Caenorhabditis elegans. In particular, we focus on the problem of contour tracking, wherein the bot must reach and subsequently follow a desired concentration setpoint. Past schemes that used only klinokinesis can follow the contour efficiently but take excessive time to reach the setpoint. We address this shortcoming by proposing a novel adaptive klinotaxis mechanism that builds upon a previously proposed gradient climbing circuit. We demonstrate how our klinotaxis circuit can autonomously be configured to perform gradient ascent, gradient descent and subsequently be disabled to seamlessly integrate with the aforementioned klinokinesis circuit. We also incorporate speed regulation (orthokinesis) to further improve contour tracking performance. Thus for the first time, we present a model that successfully integrates klinokinesis, klinotaxis and orthokinesis. We demonstrate via contour tracking simulations that our proposed scheme achieves a 2.4x reduction in the time to reach the setpoint, along with a simultaneous 8.7x reduction in average deviation from the setpoint.	翻訳日:2022-11-04 00:47:50 公開日:2020-08-01
# 指紋認識システムのホワイトボックス評価 White-Box Evaluation of Fingerprint Recognition Systems ( http://arxiv.org/abs/2008.00128v1 ) ライセンス: Link先を確認	Steven A. Grosz, Joshua J. Engelsma, Anil K. Jain	(参考訳) 指紋認証システムの典型的な評価は、全体的な識別や認証精度の観点から性能を評価するエンドツーエンドのブラックボックス評価である。しかしながら、これらのブラックボックステストは、画像取得、特徴抽出、マッチングを含む個々のモジュールのパフォーマンスに関する洞察を明らかにしていない。一方,本論文のトピックであるホワイトボックス評価では,各構成モジュールの性能を個別に測定する。指紋読取装置,特徴抽出装置,マッチング部品のホワイトボックス評価をいくつかの研究で行ったが,指紋認識システムの各段階で導入された不確実性に関するホワイトボックス分析の完全なシステムを提供していない。本研究では,指紋認識システムコンポーネントの過去のホワイトボックス評価を拡張し,集計されたホワイトボックス評価結果に基づいて指紋認識システム性能の詳細な分析を行う。特に, 指紋認証システムの各段階において, 不正な捕獲条件(照明, 水分, 圧力など)による不確実性について, 取得時の解析を行った。本実験では,ブラックボックス認識性能の面では,各サブモジュールのホワイトボックス解析でのみ確認可能な,指紋認識システムパイプラインの各モジュールにおいて,総合的に優れた性能を発揮できないことを示す。このような発見により、研究者たちは指紋認識システムの改善にもっと注力できる。 Typical evaluations of fingerprint recognition systems consist of end-to-end black-box evaluations, which assess performance in terms of overall identification or authentication accuracy. However, these black-box tests of system performance do not reveal insights into the performance of the individual modules, including image acquisition, feature extraction, and matching. On the other hand, white-box evaluations, the topic of this paper, measure the individual performance of each constituent module in isolation. While a few studies have conducted white-box evaluations of the fingerprint reader, feature extractor, and matching components, no existing study has provided a full system, white-box analysis of the uncertainty introduced at each stage of a fingerprint recognition system. In this work, we extend previous white-box evaluations of fingerprint recognition system components and provide a unified, in-depth analysis of fingerprint recognition system performance based on the aggregated white-box evaluation results. 
In particular, we analyze the uncertainty introduced at each stage of the fingerprint recognition system due to adverse capture conditions (i.e., varying illumination, moisture, and pressure) at the time of acquisition. Our experiments show that a system that performs better overall, in terms of black-box recognition performance, does not necessarily perform best at each module in the fingerprint recognition system pipeline, which can only be seen with white-box analysis of each sub-module. Findings such as these enable researchers to better focus their efforts in improving fingerprint recognition systems.	翻訳日:2022-11-04 00:47:29 公開日:2020-08-01
# PanoNet: 位置感性機能埋め込みによるリアルタイムパノプティクスセグメンテーション PanoNet: Real-time Panoptic Segmentation through Position-Sensitive Feature Embedding ( http://arxiv.org/abs/2008.00192v1 ) ライセンス: Link先を確認	Xia Chen, Jianren Wang, Martial Hebert	(参考訳) 我々は,パンオプティカルセグメンテーションのためのセマンティクスとインスタンスマスクを同時に生成する,シンプルで高速で柔軟なフレームワークを提案する。パノネットと呼ばれる我々の手法はクリーンで自然な構造設計を取り入れており、時間を要する検出処理が不要なセグメンテーションタスクとして問題に取り組む。また,物体の外観と空間的位置の両方を考慮し,位置感性埋め込みを例示する。全体的に、パノネットは高精細な都市景観画像のパンオプティカルな品質の結果をリアルタイムで得ることができ、同等の性能を持つ他の手法よりもかなり高速である。私たちのアプローチは、自律運転や拡張現実といった多くのアプリケーションにおいて、現実的なスピードとメモリ要件を十分に満たしています。 We propose a simple, fast, and flexible framework to generate simultaneously semantic and instance masks for panoptic segmentation. Our method, called PanoNet, incorporates a clean and natural structure design that tackles the problem purely as a segmentation task without the time-consuming detection process. We also introduce position-sensitive embedding for instance grouping by accounting for both object's appearance and its spatial location. Overall, PanoNet yields high panoptic quality results of high-resolution Cityscapes images in real-time, significantly faster than all other methods with comparable performance. Our approach well satisfies the practical speed and memory requirement for many applications like autonomous driving and augmented reality.	翻訳日:2022-11-04 00:46:12 公開日:2020-08-01
# 自己教師付き学習から視覚プライオリティーを蒸留する Distilling Visual Priors from Self-Supervised Learning ( http://arxiv.org/abs/2008.00261v1 ) ライセンス: Link先を確認	Bingchen Zhao, Xin Wen	(参考訳) 畳み込みニューラルネットワーク(CNN)は、小さなトレーニングデータセットに適合する傾向にある。本稿では,画像分類のためのcnnモデルの一般化能力を向上させるために,自己教師付き学習と知識蒸留を利用した2相パイプラインを提案する。第1段階は、自己教師型学習を通してリッチで一般化可能な視覚表現を持つ教師モデルを学習し、第2段階は、学生モデルを自己蒸留方式で蒸留し、一方、イメージ分類タスクの生徒モデルを微調整する。また,データ不足シナリオ下での表現をよりよく学習するために,自己指導型コントラスト学習プロキシタスクの新たなマージン損失を提案する。他のトリックとともに、VIPriors画像分類チャレンジにおいて競合性能を達成する。 Convolutional Neural Networks (CNNs) are prone to overfit small training datasets. We present a novel two-phase pipeline that leverages self-supervised learning and knowledge distillation to improve the generalization ability of CNN models for image classification under the data-deficient setting. The first phase is to learn a teacher model which possesses rich and generalizable visual representations via self-supervised learning, and the second phase is to distill the representations into a student model in a self-distillation manner, and meanwhile fine-tune the student model for the image classification task. We also propose a novel margin loss for the self-supervised contrastive learning proxy task to better learn the representation under the data-deficient scenario. Together with other tricks, we achieve competitive performance in the VIPriors image classification challenge.	翻訳日:2022-11-04 00:44:55 公開日:2020-08-01
# メタDRN:1ショット画像セグメンテーションのためのメタラーニング Meta-DRN: Meta-Learning for 1-Shot Image Segmentation ( http://arxiv.org/abs/2008.00247v1 ) ライセンス: Link先を確認	Atmadeep Banerjee	(参考訳) 現代のディープラーニングモデルはコンピュータビジョンの分野に革命をもたらした。しかし、これらのモデルの大きな欠点は、適切に一般化するために多数のラベル付き例を必要とすることである。数ショット学習の最近の発展は、この要件を緩和することを目的としている。本稿では,1ショット画像セグメンテーションのための新しい軽量cnnアーキテクチャを提案する。提案モデルは,セマンティックセグメンテーションのための優れたアーキテクチャからインスピレーションを得て,それを1ショットドメインに適応させることによって作成される。画像分類に有効である4つのメタ学習アルゴリズムを用いてモデルをトレーニングし、その結果を比較した。選択したデータセットに対して、提案したモデルは、ベンチマークよりも70%低いパラメータカウントを持ち、メタ学習アルゴリズムの4つすべてを用いて、より良いか同等の平均IoUスコアを持つ。 Modern deep learning models have revolutionized the field of computer vision. But, a significant drawback of most of these models is that they require a large number of labelled examples to generalize properly. Recent developments in few-shot learning aim to alleviate this requirement. In this paper, we propose a novel lightweight CNN architecture for 1-shot image segmentation. The proposed model is created by taking inspiration from well-performing architectures for semantic segmentation and adapting it to the 1-shot domain. We train our model using 4 meta-learning algorithms that have worked well for image classification and compare the results. For the chosen dataset, our proposed model has a 70% lower parameter count than the benchmark, while having better or comparable mean IoU scores using all 4 of the meta-learning algorithms.	翻訳日:2022-11-04 00:39:09 公開日:2020-08-01
# ワープによるアニメーション:高品質表情アニメーションの効率的な方法 Animating Through Warping: an Efficient Method for High-Quality Facial Expression Animation ( http://arxiv.org/abs/2008.00362v1 ) ライセンス: Link先を確認	Zili Yi, Qiang Tang, Vishnu Sanjay Ramiya Srinivasan, Zhan Xu	(参考訳) ディープニューラルネットワークの進歩は、3Dドメインを操作せずに静止画像をアニメーションする技術を大幅に改善した。一方、先行技術では、メモリの制限、トレーニングの難しさ、高解像度(hd)トレーニングデータセットの欠如により、小さな画像(典型的には512x512)しかアニメーションできない。ニューラルネットワークが生成する低分解能結果に高周波数残差を加えることでHD画像を生成することができるという考えから,我々は,HD画像の効率的なアニメーションを実現するための新しいフレームワークであるAnimating Through Warping(ATW)を提案する。具体的には、新しい2段階のニューラルネットワークジェネレータと、Animating Through Warping (ATW)として知られる新しい後処理モジュールの2つのモジュールで構成されている。ジェネレータを小さなイメージでトレーニングし、任意のサイズのイメージで推論することしか必要ありません。推論中、hd入力画像は低解像度成分(128x128)と対応する高周波残差に分解される。ジェネレータは、入力面を所望の状態(例えば、表現カテゴリまたはアクションユニット)に歪ませる動き場と同様に、低解像度の結果を予測する。最後に、reswarpモジュールは、動き場に基づいて残差をゆがめ、ゆがんだ残差を追加して、極端にアップサンプリングされた低解像度結果から最終的なhd結果を生成する。実験では,高分解能アニメーション生成における手法の有効性と効率を示す。提案するフレームワークは,従来のニューラルモデルでは達成されていない4K顔画像の認識に成功している。また,本手法は,生成したアニメーションの時間的一貫性を一般的に保証する。ソースコードは公開される予定だ。 Advances in deep neural networks have considerably improved the art of animating a still image without operating in 3D domain. Whereas, prior arts can only animate small images (typically no larger than 512x512) due to memory limitations, difficulty of training and lack of high-resolution (HD) training datasets, which significantly reduce their potential for applications in movie production and interactive systems. Motivated by the idea that HD images can be generated by adding high-frequency residuals to low-resolution results produced by a neural network, we propose a novel framework known as Animating Through Warping (ATW) to enable efficient animation of HD images. Specifically, the proposed framework consists of two modules, a novel two-stage neural-network generator and a novel post-processing module known as Animating Through Warping (ATW). It only requires the generator to be trained on small images and can do inference on an image of any size. 
During inference, an HD input image is decomposed into a low-resolution component(128x128) and its corresponding high-frequency residuals. The generator predicts the low-resolution result as well as the motion field that warps the input face to the desired status (e.g., expressions categories or action units). Finally, the ResWarp module warps the residuals based on the motion field and adding the warped residuals to generates the final HD results from the naively up-sampled low-resolution results. Experiments show the effectiveness and efficiency of our method in generating high-resolution animations. Our proposed framework successfully animates a 4K facial image, which has never been achieved by prior neural models. In addition, our method generally guarantee the temporal coherency of the generated animations. Source codes will be made publicly available.	翻訳日:2022-11-04 00:38:42 公開日:2020-08-01
# 小さな運動と大きな結果:運動拡大法を用いて乳児の震動を再現する Little Motion, Big Results: Using Motion Magnification to Reveal Subtle Tremors in Infants ( http://arxiv.org/abs/2008.04946v1 ) ライセンス: Link先を確認	Girik Malik and Ish K. Gulati	(参考訳) 震えの検出は人間と機械の両方にとって困難である。妊娠中にオピオイドに曝露した幼児は、しばしばヒトの眼では見逃し易い出生後の離脱の徴候や症状を示す。新生児アブスティネンス症候群(nas)と呼ばれる一連の臨床特徴は、震え、発作、刺激性などである。現在のケア基準は主観的評価に基づいてFinnegan Neonatal Abstinence Syndrome Scoring System (FNASS) を用いている。 FNASSによるモニタリングには高度に熟練した看護スタッフが必要である。本稿では,増幅動作信号を用いた自動震動検出システムを提案する。 NASの徴候を示す乳児のベッドサイドビデオで適用可能性を示す。さらに, 深層畳み込みネットワークに基づく運動拡大の異なるモードをテストし, ダイナミックモードが臨床設定において最も有効であり, 共通の方向変化に不変であることを示す。本研究では,NAS患者に対して,既存のプロトコルを補うために運動倍率を用いた退院・フォローアップ戦略を提案する。本研究は,現在の実践,訓練,資源利用のギャップを埋める手法を提案する。 Detecting tremors is challenging for both humans and machines. Infants exposed to opioids during pregnancy often show signs and symptoms of withdrawal after birth, which are easy to miss with the human eye. The constellation of clinical features, termed as Neonatal Abstinence Syndrome (NAS), include tremors, seizures, irritability, etc. The current standard of care uses Finnegan Neonatal Abstinence Syndrome Scoring System (FNASS), based on subjective evaluations. Monitoring with FNASS requires highly skilled nursing staff, making continuous monitoring difficult. In this paper we propose an automated tremor detection system using amplified motion signals. We demonstrate its applicability on bedside video of infant exhibiting signs of NAS. Further, we test different modes of deep convolutional network based motion magnification, and identify that dynamic mode works best in the clinical setting, being invariant to common orientational changes. We propose a strategy for discharge and follow up for NAS patients, using motion magnification to supplement the existing protocols. Overall our study suggests methods for bridging the gap in current practices, training and resource utilization.	翻訳日:2022-11-04 00:38:14 公開日:2020-08-01
# TactileSGNet: イベントベースの触覚物体認識のためのスパイクグラフニューラルネットワーク TactileSGNet: A Spiking Graph Neural Network for Event-based Tactile Object Recognition ( http://arxiv.org/abs/2008.08046v1 ) ライセンス: Link先を確認	Fuqiang Gu, Weicong Sng, Tasbolat Taunyazov and Harold Soh	(参考訳) 触覚知覚は、把持や手操作を含む様々なロボットタスクに不可欠である。フレキシブルでイベント駆動の電子皮膚の新しい進歩は、近いうちにロボットに人間に似たタッチ認識能力を与えるかもしれない。これらの電子皮膚は変化(例えば圧力、温度)に非同期に反応し、ロボットの体やエンドエフェクターに不規則にレイアウトすることができる。しかし、これらのユニークな特徴により、畳み込み特徴抽出器のような現在のディープラーニングアプローチは触覚学習に適さなくなる可能性がある。本稿では,イベントに基づく触覚物体認識のための新しいスパイキンググラフニューラルネットワークを提案する。タクセルの局所接続性を活用するために,触覚データをグラフ構造に整理するいくつかの手法を提案する。構築したグラフに基づいて,スパイキンググラフ畳み込みネットワークを開発した。スパイクニューラルネットワークのイベント駆動性は、イベントベースのデータを処理するのにより適していると考えられる。 2つの触覚データセットによる実験結果から,提案手法は他の最先端のスパイキング手法を上回り、様々な家庭オブジェクトの分類において約90%の高い精度を達成することがわかった。 Tactile perception is crucial for a variety of robot tasks including grasping and in-hand manipulation. New advances in flexible, event-driven, electronic skins may soon endow robots with touch perception capabilities similar to humans. These electronic skins respond asynchronously to changes (e.g., in pressure, temperature), and can be laid out irregularly on the robot's body or end-effector. However, these unique features may render current deep learning approaches such as convolutional feature extractors unsuitable for tactile learning. In this paper, we propose a novel spiking graph neural network for event-based tactile object recognition. To make use of local connectivity of taxels, we present several methods for organizing the tactile data in a graph structure. Based on the constructed graphs, we develop a spiking graph convolutional network. The event-driven nature of spiking neural network makes it arguably more suitable for processing the event-based data. Experimental results on two tactile datasets show that the proposed method outperforms other state-of-the-art spiking methods, achieving high accuracies of approximately 90% when classifying a variety of different household objects.	翻訳日:2022-11-04 00:37:54 公開日:2020-08-01
# トランスコーダシステムのテストセット The test set for the TransCoder system ( http://arxiv.org/abs/2008.00293v1 ) ライセンス: Link先を確認	Ernest Davis	(参考訳) TransCoderシステムは、Java、C++、Python 3間でソースコードを変換する。品質評価に使用されたテストセットには,クラスを定義して使用する機能や,再帰的以外のユーザ定義関数を呼び出す機能など,javaの重要な機能が欠落している。そのため、これらの特徴を持つプログラムに対するTransCoderの精度は未だ不明である。 The TransCoder system translates source code between Java, C++, and Python 3. The test set that was used to evaluate its quality is missing important features of Java, including the ability to define and use classes and the ability to call user-defined functions other than recursively. Therefore, the accuracy of TransCoder over programs with those features remains unknown.	翻訳日:2022-11-04 00:37:35 公開日:2020-08-01
# マイクロテキストから実行可能な情報を抽出する Extracting actionable information from microtexts ( http://arxiv.org/abs/2008.00343v1 ) ライセンス: Link先を確認	Ali Hürriyetoğlu	(参考訳) Twitterのようなマイクロブログは強力な情報ソースである。この情報の一部は、個々の投稿のレベルを超えて集約することができる。集約された情報のいくつかは、e-governanceや公共の安全、その他の公共の関心のレベルに関心を持って行動すべきイベントを指している。さらに、もし集約すれば、かなりの量の情報が既存の情報ネットワークを非自明な方法で補完することができる。この論文は、この目的を果たす実行可能な情報を抽出する半自動手法を提案する。まず,ドメイン内シナリオとクロスドメインシナリオの両方において,イベントまでの時間予測が可能であることを示す。第2に、アナリストのコンテキストに対する関連性の定義を容易にし、この定義を用いて新しいデータを分析する方法を提案する。最後に,機械学習に基づく関連情報分類手法とルールベースの情報分類手法を統合し,マイクロテキストを分類する手法を提案する。マイクロテキスト解析の完全自動化は、この研究プロジェクトの初日から私たちの目標です。この方向への取り組みは、この自動化がどの程度実現できるかを教えてくれました。主に自動化アプローチを開発し、その後、自動化アプローチのさまざまなステップで人間の介入を統合することで、それを拡張し、改善しました。我々の経験は、情報システムの設計、実現、評価によく設計された人間の介入や貢献が、その性能を改善するか、実現を可能にするかのどちらかであることを示す以前の研究を確認する。我々の研究と成果がその必要性と価値に向けられたので、私たちは人間の関与をデザインする以前の研究からインスピレーションを受け、人間の入力から利益を得るためのアプローチをカスタマイズしました。 Microblogs such as Twitter represent a powerful source of information. Part of this information can be aggregated beyond the level of individual posts. Some of this aggregated information is referring to events that could or should be acted upon in the interest of e-governance, public safety, or other levels of public interest. Moreover, a significant amount of this information, if aggregated, could complement existing information networks in a non-trivial way. This dissertation proposes a semi-automatic method for extracting actionable information that serves this purpose. First, we show that predicting time to event is possible for both in-domain and cross-domain scenarios. Second, we suggest a method which facilitates the definition of relevance for an analyst's context and the use of this definition to analyze new data. Finally, we propose a method to integrate the machine learning based relevant information classification method with a rule-based information classification technique to classify microtexts. 
Fully automatizing microtext analysis has been our goal since the first day of this research project. Our efforts in this direction informed us about the extent this automation can be realized. We mostly first developed an automated approach, then we extended and improved it by integrating human intervention at various steps of the automated approach. Our experience confirms previous work that states that a well-designed human intervention or contribution in design, realization, or evaluation of an information system either improves its performance or enables its realization. As our studies and results directed us toward its necessity and value, we were inspired from previous studies in designing human involvement and customized our approaches to benefit from human input.	翻訳日:2022-11-04 00:37:29 公開日:2020-08-01
# CLEF 2019 Lab ProtestNewsの概要: クロスコンテキスト設定でニュースから抗議を抽出する Overview of CLEF 2019 Lab ProtestNews: Extracting Protests from News in a Cross-context Setting ( http://arxiv.org/abs/2008.00345v1 ) ライセンス: Link先を確認	Ali Hürriyetoğlu, Erdem Yörük, Deniz Yüret, Çağrı Yoltar, Burak Gürel, Fırat Duruşan, Osman Mutlu, and Arda Akdemir	(参考訳) 我々は,一般化可能な自然言語処理の文脈において,ニュースから抗議を抽出するCLEF-2019 Lab ProtestNewsの概要を紹介する。この研究室は、文書、文、トークンレベルの情報分類および抽出タスクから構成されており、それぞれこの研究室の範囲内においてタスク1、タスク2、タスク3と呼ばれる。この作業では、参加者は、クロスコンテキスト設定(本研究室ではクロスカントリー設定)において、上記の1つ以上のレベルで、英語のローカルニュースから抗議活動に関連する情報を識別する必要があった。トレーニングと開発データはインドから収集され、テストデータはインドと中国から収集された。 58チームが参加し、このうち12チームが結果を、9チームが作業ノートを提出した。ニューラルネットワークが最良の結果をもたらすこと、およびクロスカントリー設定(中国)ではほとんどの投稿で性能が大きく低下することを確認した。 We present an overview of the CLEF-2019 Lab ProtestNews on Extracting Protests from News in the context of generalizable natural language processing. The lab consists of document, sentence, and token level information classification and extraction tasks that were referred to as task 1, task 2, and task 3 respectively in the scope of this lab. The tasks required the participants to identify protest relevant information from English local news at one or more aforementioned levels in a cross-context setting, which is cross-country in the scope of this lab. The training and development data were collected from India and test data was collected from India and China. The lab attracted 58 teams. Of these, 12 submitted results and 9 submitted working notes. We observed that neural networks yield the best results and that performance drops significantly for the majority of the submissions in the cross-country setting, which is China.	翻訳日:2022-11-04 00:37:03 公開日:2020-08-01
# 抗議イベント関連知識ベース構築のためのクロスコンテキストニュースコーパス Cross-context News Corpus for Protest Events related Knowledge Base Construction ( http://arxiv.org/abs/2008.00351v1 ) ライセンス: Link先を確認	Ali Hürriyetoğlu, Erdem Yörük, Deniz Yüret, Osman Mutlu, Çağrı Yoltar, Fırat Duruşan, Burak Gürel	(参考訳) 英語の様々な国からの様々な地域的・国際的ソースからなる抗議イベントの金本位制コーパスについて述べる。コーパスには文書、文、トークンレベルのアノテーションが含まれている。このコーパスは、ニュース記事を自動的に分類し、抗議イベント関連情報を抽出する機械学習モデルの作成を容易にし、社会科学と政治科学の比較研究を可能にする知識ベースを構築する。各ニュースソースについて、アノテーションはニュース記事のランダムなサンプルから始まり、アクティブな学習を用いて描画されたサンプルで続く。各サンプルのバッチは2人の社会・政治科学者によってアノテートされ、アノテーション監督官によって裁定され、アノテーションエラーを半自動的に識別することで改善された。クロスコンテキスト設定でテキスト分類とイベント抽出システムの開発とベンチマークを行う上で,コーパスには十分な多様性と品質があり,テキスト自動処理システムの汎用性と堅牢性に寄与することがわかった。このコーパスと報告された結果は、自動抗議イベント収集研究に現在欠けている共通基盤を築くだろう。 We describe a gold standard corpus of protest events that comprises various local and international sources from various countries in English. The corpus contains document, sentence, and token level annotations. This corpus facilitates creating machine learning models that automatically classify news articles and extract protest event-related information, constructing knowledge bases which enable comparative social and political science studies. For each news source, the annotation starts on random samples of news articles and continues with samples that are drawn using active learning. Each batch of samples was annotated by two social and political scientists, adjudicated by an annotation supervisor, and was improved by identifying annotation errors semi-automatically. We found that the corpus has the variety and quality to develop and benchmark text classification and event extraction systems in a cross-context setting, which contributes to the generalizability and robustness of automated text processing systems. This corpus and the reported results will set the currently lacking common ground in automated protest event collection studies.	翻訳日:2022-11-04 00:36:46 公開日:2020-08-01
# 状況分析と危機管理のための時制・局面・ムードに基づくイベント抽出 Tense, aspect and mood based event extraction for situation analysis and crisis management ( http://arxiv.org/abs/2008.01555v1 ) ライセンス: Link先を確認	Ali Hürriyetoğlu	(参考訳) 今日では、イベント抽出システムは主に、状況の時間的およびモード的資格に関する情報を比較的少ない量で処理し、主に過去時制でアサーション文を処理する。しかし、時制、アスペクト、ムードの幅が広いシステムは、より良い分析を提供し、より広い範囲のテキスト分析アプリケーションで使用することができる。この論文はトルコ語のこのような体系を発展させている。これは、オープンソース情報マイニング・分析(OPTIMA)研究グループのイベント抽出ソフトウェアを拡張し、セマンティック表現形式における適切な拡張を実装し、ExPRESSのTAM(Tense, Aspect and Mood)マーカー、副詞解析、マッチング機能を改善する部分文法を追加し、CORLEONEの標準で適切な辞書を構築することで達成される。これらの拡張は、時制、アスペクト、ムードに関連するカテゴリを分析するための言語横断的に適用可能な意味論的枠組みであるアンカー関係の理論(Temürcü, 2007, 2011)に基づいている。その結果、基本的なイベント構造を抽出するだけでなく、その時間的、モーダル的、そして意志的/発語内的な値に応じて、ニュースレポートに与えられる文章を分類できるシステムとなった。トルコ語の自然災害、病気の発生、人為的災害のニュースに焦点が当てられているが、他の言語、ドメイン、ジャンルに適応することができる。このイベント抽出・分類システムは、さらなる発展とともに、環境および人道上のリスクを防止するための自動ブラウジングシステムの基礎を提供することができる。 Nowadays event extraction systems mainly deal with a relatively small amount of information about temporal and modal qualifications of situations, primarily processing assertive sentences in the past tense. However, systems with a wider coverage of tense, aspect and mood can provide better analyses and can be used in a wider range of text analysis applications. This thesis develops such a system for Turkish language. This is accomplished by extending Open Source Information Mining and Analysis (OPTIMA) research group's event extraction software, by implementing appropriate extensions in the semantic representation format, by adding a partial grammar which improves the TAM (Tense, Aspect and Mood) marker, adverb analysis and matching functions of ExPRESS, and by constructing an appropriate lexicon in the standard of CORLEONE. These extensions are based on the theory of anchoring relations (Temürcü, 2007, 2011) which is a crosslinguistically applicable semantic framework for analyzing tense, aspect and mood related categories. 
The result is a system which can, in addition to extracting basic event structures, classify sentences given in news reports according to their temporal, modal and volitional/illocutionary values. Although the focus is on news reports of natural disasters, disease outbreaks and man-made disasters in Turkish language, the approach can be adapted to other languages, domains and genres. This event extraction and classification system, with further developments, can provide a basis for automated browsing systems for preventing environmental and humanitarian risk.	翻訳日:2022-11-04 00:36:30 公開日:2020-08-01
# L-CNN:マルチストリーム畳み込みニューラルネットワークのための格子相互融合戦略 L-CNN: A Lattice cross-fusion strategy for multistream convolutional neural networks ( http://arxiv.org/abs/2008.00157v1 ) ライセンス: Link先を確認	Ana Paula G. S. de Almeida and Flavio de Barros Vidal	(参考訳) 本稿では,マルチストリーム畳み込みネットワークの融合戦略であるLattice Cross Fusionを提案する。このアプローチは、プール層直前に数学的操作に基づく融合を行う畳み込み層からの信号と交差する。画像分類データセットであるCIFAR-10を改良したAlexNet-LCNNバージョンを用いて目的的に悪化させた結果,この手法は,より高速な収束,安定性,ロバスト性を備えたベースラインシングルストリームネットワークにおいて,46%向上した。 This paper proposes a fusion strategy for multistream convolutional networks, the Lattice Cross Fusion. This approach crosses signals from convolution layers performing mathematical operation-based fusions right before pooling layers. Results on a purposely worsened CIFAR-10, a popular image classification data set, with a modified AlexNet-LCNN version show that this novel method outperforms by 46% the baseline single stream network, with faster convergence, stability, and robustness.	翻訳日:2022-11-04 00:30:34 公開日:2020-08-01
# Eigen-CAM:主成分を用いたクラスアクティベーションマップ Eigen-CAM: Class Activation Map using Principal Components ( http://arxiv.org/abs/2008.00299v1 ) ライセンス: Link先を確認	Mohammed Bany Muhammad, Mohammed Yeasin	(参考訳) ディープニューラルネットワークは、モデル開発の容易さと他分野への影響により、広く普及している。この進歩の中心にあるのが畳み込みニューラルネットワーク(CNN)であり、与えられたデータから表現や特徴を学習することができる。このような複雑なモデル(数百万のパラメータと数百の層)を理解することは、開発者にとってもエンドユーザにとっても依然として難しい。これは部分的には、解釈可能性と透明性を提供するツールやインターフェースの欠如によるものだ。クラスアクティベーションマップ(class activation map, CAM)をはじめとする文献は増えつつあり、モデルがデータから何を学ぶのか、あるいは与えられたタスクでなぜうまく動作しないのかを理解することに焦点を当てている。本稿は、解釈可能でロバストで透明なモデルに対する需要の増加に対応するために、従来の考え方を基礎としている。我々のアプローチは、CAMを生成するためのよりシンプルで直感的な(あるいは馴染みのある)方法を提供する。提案するEigen-CAMは、畳み込み層から学習された特徴/表現の主成分を計算し可視化する。弱教師付き位置特定や、敵対的ノイズの存在下での物体位置特定といったベンチマークデータセットで評価することにより、Eigen-CAMと最先端手法(Grad-CAM、Grad-CAM++、CNN-fixationsなど)を比較する実験を行った。Eigen-CAMはCNNの全結合層による分類エラーに対して頑健であり、勾配のバックプロパゲーション、クラス関連度スコア、最大アクティベーション位置、その他いかなる形の特徴の重み付けにも依存しない。さらに、層の変更やモデルの再訓練を必要とせずに、すべてのCNNモデルで動作する。実験結果は、弱教師付き物体位置特定において、比較した手法のうち最良のものに対して最大12%の改善を示している。 Deep neural networks are ubiquitous due to the ease of developing models and their influence on other domains. At the heart of this progress are convolutional neural networks (CNNs) that are capable of learning representations or features given a set of data. Making sense of such complex models (i.e., millions of parameters and hundreds of layers) remains challenging for developers as well as the end-users. This is partially due to the lack of tools or interfaces capable of providing interpretability and transparency. A growing body of literature, for example, class activation map (CAM), focuses on making sense of what a model learns from the data or why it behaves poorly in a given task. This paper builds on previous ideas to cope with the increasing demand for interpretable, robust, and transparent models. Our approach provides a simpler and intuitive (or familiar) way of generating CAM. The proposed Eigen-CAM computes and visualizes the principal components of the learned features/representations from the convolutional layers. 
Empirical studies were performed to compare the Eigen-CAM with the state-of-the-art methods (such as Grad-CAM, Grad-CAM++, CNN-fixations) by evaluating on benchmark datasets such as weakly-supervised localization and localizing objects in the presence of adversarial noise. Eigen-CAM was found to be robust against classification errors made by fully connected layers in CNNs, does not rely on the backpropagation of gradients, class relevance score, maximum activation locations, or any other form of weighting features. In addition, it works with all CNN models without the need to modify layers or retrain models. Empirical results show up to 12% improvement over the best method among the methods compared on weakly supervised object localization.	翻訳日:2022-11-04 00:30:25 公開日:2020-08-01
# 視覚物体追跡のための効率的な敵対的攻撃 Efficient Adversarial Attacks for Visual Object Tracking ( http://arxiv.org/abs/2008.00217v1 ) ライセンス: Link先を確認	Siyuan Liang, Xingxing Wei, Siyuan Yao and Xiaochun Cao	(参考訳) 視覚物体追跡は、トラッカーが物体を迅速かつ正確に見つけることを求められる重要なタスクである。既存の最先端の物体トラッカー、すなわちSiameseベースのトラッカーは、DNNを使用して高精度を実現する。しかし、視覚追跡モデルの堅牢性はほとんど調査されていない。本稿では、Siameseネットワークに基づく物体トラッカーの弱点を分析し、敵対的サンプルを視覚物体追跡に拡張する。新たなドリフト損失と埋め込み特徴損失を組み合わせて、Siameseネットワークベースのトラッカーを攻撃するエンドツーエンドのネットワークFAN(Fast Attack Network)を提案する。単一のGPUの下で、FANは訓練速度の面で効率的で、強力な攻撃性能を持つ。FANは10msで敵対的サンプルを生成し、効果的な標的型攻撃(OTBで少なくとも40%の低下率)と非標的型攻撃(OTBで少なくとも70%の低下率)を達成する。 Visual object tracking is an important task that requires the tracker to find the objects quickly and accurately. The existing state-of-the-art object trackers, i.e., Siamese based trackers, use DNNs to attain high accuracy. However, the robustness of visual tracking models is seldom explored. In this paper, we analyze the weakness of object trackers based on the Siamese network and then extend adversarial examples to visual object tracking. We present an end-to-end network FAN (Fast Attack Network) that uses a novel drift loss combined with the embedded feature loss to attack the Siamese network based trackers. Under a single GPU, FAN is efficient in the training speed and has a strong attack performance. The FAN can generate an adversarial example at 10ms, achieve effective targeted attack (at least 40% drop rate on OTB) and untargeted attack (at least 70% drop rate on OTB).	翻訳日:2022-11-04 00:27:51 公開日:2020-08-01
# DaTscan画像上のLIMEを用いたパーキンソン病早期検出のための説明可能な機械学習モデル An Explainable Machine Learning Model for Early Detection of Parkinson's Disease using LIME on DaTscan Imagery ( http://arxiv.org/abs/2008.00238v1 ) ライセンス: Link先を確認	Pavan Rajkumar Magesh, Richard Delwin Myloth, Rijo Jackson Tom	(参考訳) パーキンソン病(PD)は、変性かつ進行性の神経疾患である。早期診断は患者の治療を改善し、SPECT DaTscanのようなドーパミン作動性イメージング技術によって行われる。本研究では、任意のDaTscanをパーキンソン病の有無で正確に分類し、さらにその予測のもっともらしい理由を提示する機械学習モデルを提案する。この種の推論は、Local Interpretable Model-Agnostic Explainer (LIME) の手法で生成された視覚的指標を用いて行われる。DaTscanはParkinson's Progression Markers Initiativeのデータベースから取得し、転移学習を用いてCNN(VGG16)で訓練した結果、95.2%の精度、97.5%の感度、90.9%の特異性を得た。特に医療分野ではモデルの解釈可能性が最も重要であることから、本研究はDaTscan上の視覚的スーパーピクセルを用いたLIMEの説明を利用して、PDと非PDを区別する。提案システムは、測定された解釈可能性と精度を併せ持つことで、パーキンソン病の早期診断において医療従事者を有効に支援しうると結論できる。 Parkinson's disease (PD) is a degenerative and progressive neurological condition. Early diagnosis can improve treatment for patients and is performed through dopaminergic imaging techniques like the SPECT DaTscan. In this study, we propose a machine learning model that accurately classifies any given DaTscan as having Parkinson's disease or not, in addition to providing a plausible reason for the prediction. This kind of reasoning is done through the use of visual indicators generated using Local Interpretable Model-Agnostic Explainer (LIME) methods. DaTscans were drawn from the Parkinson's Progression Markers Initiative database and trained on a CNN (VGG16) using transfer learning, yielding an accuracy of 95.2%, a sensitivity of 97.5%, and a specificity of 90.9%. Keeping model interpretability of paramount importance, especially in the healthcare field, this study utilises LIME explanations to distinguish PD from non-PD, using visual superpixels on the DaTscans. It could be concluded that the proposed system, in union with its measured interpretability and accuracy may effectively aid medical workers in the early diagnosis of Parkinson's Disease.	
翻訳日:2022-11-04 00:27:32 公開日:2020-08-01
# 敵対的機械学習の脆弱性: バイアスか分散か? Vulnerability Under Adversarial Machine Learning: Bias or Variance? ( http://arxiv.org/abs/2008.00138v1 ) ライセンス: Link先を確認	Hossein Aboutalebi, Mohammad Javad Shafiee, Michelle Karg, Christian Scharfenberger, and Alexander Wong	(参考訳) 先行研究により、敵の機械学習の文脈におけるディープニューラルネットワークの脆弱性が明らかにされ、この領域に最近注目が集まっている。まだ十分に検討されていない興味深い質問は、敵対的機械学習のバイアス-分散関係であり、この振る舞いに関する深い洞察を提供する可能性がある。バイアスと分散の概念は、機械学習モデルの一般化と信頼性を分析し評価するための主要なアプローチの1つである。他の機械学習モデルで広く使われているが、ディープラーニングの分野ではよく研究されておらず、敵対的な機械学習の分野でも研究されていない。本研究では,訓練された深層ニューラルネットワークのバイアスと分散に対する敵意機械学習の効果を調査し,敵意摂動がネットワークの一般化にどのように影響するかを分析する。 2つの主な損失関数に基づく分類および回帰適用のバイアス分散トレードオフを導出する。 (i)平均二乗誤差(MSE)、及び (ii)クロスエントロピー。さらに、シミュレーションデータと実データの両方を用いて定量的解析を行い、導出バイアス分散トレードオフとの整合性を実証的に評価する。我々の分析は、バイアス分散の観点から、ディープニューラルネットワークが逆方向の摂動の下で性能が劣る理由と、この種の摂動がネットワークの性能をどう変えるかに光を当てている。さらに,これらの新たな理論的知見を踏まえて,よく知られた機械学習戦略(例:pgd)よりも計算複雑性が低く,低摂動大小の深層ニューラルネットワークを騙すのに高い成功率を提供する新しい逆機械学習アルゴリズムを提案する。 Prior studies have unveiled the vulnerability of the deep neural networks in the context of adversarial machine learning, leading to great recent attention into this area. One interesting question that has yet to be fully explored is the bias-variance relationship of adversarial machine learning, which can potentially provide deeper insights into this behaviour. The notion of bias and variance is one of the main approaches to analyze and evaluate the generalization and reliability of a machine learning model. Although it has been extensively used in other machine learning models, it is not well explored in the field of deep learning and it is even less explored in the area of adversarial machine learning. In this study, we investigate the effect of adversarial machine learning on the bias and variance of a trained deep neural network and analyze how adversarial perturbations can affect the generalization of a network. 
We derive the bias-variance trade-off for both classification and regression applications based on two main loss functions: (i) mean squared error (MSE), and (ii) cross-entropy. Furthermore, we perform quantitative analysis with both simulated and real data to empirically evaluate consistency with the derived bias-variance tradeoffs. Our analysis sheds light on why the deep neural networks have poor performance under adversarial perturbation from a bias-variance point of view and how this type of perturbation would change the performance of a network. Moreover, given these new theoretical findings, we introduce a new adversarial machine learning algorithm with lower computational complexity than well-known adversarial machine learning strategies (e.g., PGD) while providing a high success rate in fooling deep neural networks in lower perturbation magnitudes.	翻訳日:2022-11-04 00:22:09 公開日:2020-08-01
# マルチノードBERT事前学習:コスト効率の高いアプローチ Multi-node Bert-pretraining: Cost-efficient Approach ( http://arxiv.org/abs/2008.00177v1 ) ライセンス: Link先を確認	Jiahuang Lin, Xin Li, Gennady Pekhimenko	(参考訳) 近年、BERT、GPT-2、XLNetなどの大規模なTransformerベースの言語モデルが、多くの自然言語処理(NLP)タスクの最先端結果に目覚ましい飛躍をもたらした。これらのモデルに共通する傾向の1つは、モデルの複雑さの著しい増加であり、重みと計算量の両方が増える。さらに、大規模な教師なしデータセットの出現に伴い、単一の訓練エポック内のデータサンプル数が増えるため、訓練時間がさらに延びる。結果として、これらのモデルを妥当な時間内に訓練するために、機械学習(ML)プログラマは、GPUを搭載した高価なNVIDIA DGXワークステーションや、GoogleのTPU Podのような専用アクセラレータといった高度なハードウェア構成を必要とすることが多い。我々の研究はこの制限に対処し、アルゴリズムとソフトウェアの慎重な最適化によって、広く入手可能なGPUからなる学術規模のクラスタ上で、BERT事前学習モデルを2週間以内に訓練できることを実証する。本稿では、単一デバイスでの訓練スループットの向上、複数のノードとGPUへの訓練ワークロードの分散、ネットワーク上の大規模なデータ交換によって生じる通信ボトルネックの克服に関する最適化について述べる。学術的な環境において、妥当な時間予算(12日)でBERTの事前学習を行えること、しかもNVIDIA DGXマシンやGoogleのTPU Podに基づく従来の産業環境での事例よりもはるかに安価で控えめなハードウェアリソース要件で行えることを示す。 Recently, large scale Transformer-based language models such as BERT, GPT-2, and XLNet have brought about exciting leaps in state-of-the-art results for many Natural Language Processing (NLP) tasks. One of the common trends in these recent models is a significant increase in model complexity, which introduces both more weights and computation. Moreover, with the advent of large-scale unsupervised datasets, training time is further extended due to the increased amount of data samples within a single training epoch. As a result, to train these models within a reasonable time, machine learning (ML) programmers often require advanced hardware setups such as the premium GPU-enabled NVIDIA DGX workstations or specialized accelerators such as Google's TPU Pods. Our work addresses this limitation and demonstrates that the BERT pre-trained model can be trained within 2 weeks on an academic-size cluster of widely available GPUs through careful algorithmic and software optimizations. 
In this paper, we present these optimizations on how to improve single device training throughput, distribute the training workload over multiple nodes and GPUs, and overcome the communication bottleneck introduced by the large data exchanges over the network. We show that we are able to perform pre-training on BERT within a reasonable time budget (12 days) in an academic setting, but with a much less expensive and less aggressive hardware resource requirement than in previously demonstrated industrial settings based on NVIDIA DGX machines or Google's TPU Pods.	翻訳日:2022-11-04 00:21:43 公開日:2020-08-01
# ブラックボックス予測モデルへの覗き込みのための因果レンズ--因果帰属による予測モデル解釈 A Causal Lens for Peeking into Black Box Predictive Models: Predictive Model Interpretation via Causal Attribution ( http://arxiv.org/abs/2008.00357v1 ) ライセンス: Link先を確認	Aria Khademi, Vasant Honavar	(参考訳) 機械学習を用いて訓練された予測モデルの採用が、医療、治安、刑事司法、財務、教育など、幅広い高度な応用にまたがって増加し、そのようなモデルとその予測を説明する効果的な技術の必要性が高まっている。我々は、予測モデルがブラックボックスである設定、すなわち、様々な入力に対するモデルの応答を観察できるだけでなく、予測モデルの内部構造、パラメータ、目的関数、およびモデルを最適化するのに使用されるアルゴリズムに関する知識を持たない設定でこの問題に対処することを目指している。モデル入力と対応する出力の観測から、ブラックボックス予測モデルをモデル出力に対する各モデル入力の因果効果を推定する方法に解釈する問題を低減させる。観測データから因果効果を推定するためのrubin neyman potential outcomes frameworkの変種を用いて,モデル入力のモデル出力に対する因果効果を推定する。モデル入力に対するモデル出力に対する責任の因果関係が、予測モデルを解釈し、その予測を説明するためにどのように使用できるかを示す。本研究では,1つの合成データセット(出力変数に影響を与える入力変数が設計上知られている)と2つの実世界のデータセット(手書き桁分類,パーキンソン病重症度予測)で訓練されたディープニューラルネットワークモデルにおいて,因果属性によるブラックボックス予測モデルの解釈の有効性を示す実験結果を示す。我々の手法は予測モデルアルゴリズムに関する知識を必要とせず、入力出力応答が観測可能であること以外はブラックボックス予測モデルに関する仮定を含まないため、原則としてブラックボックス予測モデルに適用することができる。 With the increasing adoption of predictive models trained using machine learning across a wide range of high-stakes applications, e.g., health care, security, criminal justice, finance, and education, there is a growing need for effective techniques for explaining such models and their predictions. We aim to address this problem in settings where the predictive model is a black box; That is, we can only observe the response of the model to various inputs, but have no knowledge about the internal structure of the predictive model, its parameters, the objective function, and the algorithm used to optimize the model. We reduce the problem of interpreting a black box predictive model to that of estimating the causal effects of each of the model inputs on the model output, from observations of the model inputs and the corresponding outputs. We estimate the causal effects of model inputs on model output using variants of the Rubin Neyman potential outcomes framework for estimating causal effects from observational data. 
We show how the resulting causal attribution of responsibility for model output to the different model inputs can be used to interpret the predictive model and to explain its predictions. We present results of experiments that demonstrate the effectiveness of our approach to the interpretation of black box predictive models via causal attribution in the case of deep neural network models trained on one synthetic data set (where the input variables that impact the output variable are known by design) and two real-world data sets: Handwritten digit classification, and Parkinson's disease severity prediction. Because our approach does not require knowledge about the predictive model algorithm and is free of assumptions regarding the black box predictive model except that its input-output responses be observable, it can be applied, in principle, to any black box predictive model.	翻訳日:2022-11-04 00:21:17 公開日:2020-08-01
# ニューラルネットワークにおける対比的説明 Contrastive Explanations in Neural Networks ( http://arxiv.org/abs/2008.00178v1 ) ライセンス: Link先を確認	Mohit Prabhushankar, Gukyeong Kwon, Dogancan Temel, and Ghassan AlRegib	(参考訳) 視覚的説明は、ニューラルネットワークによる予測を正当化する視覚的特徴に基づく論理的議論である。現在の視覚的説明のモードは、$`Why \text{ } P?'$という形式の質問に答える。これらの$Why$の質問は広義の文脈で動作し、場合によっては無関係な回答を提供する。我々は、これらの$Why$の質問を、あるコンテキストの$Q$に基づいて制限することを提案し、この説明は、$`Why \text{ } P, \text{} rather \text{ } than \text{ } Q?'$という形式の対照的な質問に答えるようにします。本稿では,ニューラルネットワークのためのコントラスト視覚説明の構造を定式化する。ニューラルネットワークに基づくコントラストを定義し,定義コントラストを抽出する手法を提案する。次に、抽出したコントラストを既存の $`Why \text{ } P?'$ テクニック、特に Grad-CAM 上のプラグインとして使用します。大規模認識,微粒化認識,地下地震解析,画像品質評価などの応用において,ネットワークとデータの両方を解析することの価値を実証する。 Visual explanations are logical arguments based on visual features that justify the predictions made by neural networks. Current modes of visual explanations answer questions of the form $`Why \text{ } P?'$. These $Why$ questions operate under broad contexts thereby providing answers that are irrelevant in some cases. We propose to constrain these $Why$ questions based on some context $Q$ so that our explanations answer contrastive questions of the form $`Why \text{ } P, \text{} rather \text{ } than \text{ } Q?'$. In this paper, we formalize the structure of contrastive visual explanations for neural networks. We define contrast based on neural networks and propose a methodology to extract defined contrasts. We then use the extracted contrasts as a plug-in on top of existing $`Why \text{ } P?'$ techniques, specifically Grad-CAM. We demonstrate their value in analyzing both networks and data in applications of large-scale recognition, fine-grained recognition, subsurface seismic analysis, and image quality assessment.	翻訳日:2022-11-04 00:20:43 公開日:2020-08-01
# SemEval-2020 タスク7: 編集されたニュース見出しにおけるユーモアの評価 SemEval-2020 Task 7: Assessing Humor in Edited News Headlines ( http://arxiv.org/abs/2008.00304v1 ) ライセンス: Link先を確認	Nabil Hossain, John Krumm, Michael Gamon and Henry Kautz	(参考訳) 本稿では、SemEval-2020共有タスク"Assessing Humor in Edited News Headlines"について述べる。タスクのデータセットには、短い編集を施して面白くしたニュースの見出しが含まれており、これらの編集された見出しの面白さはクラウドソーシングを使って評価された。このタスクには2つのサブタスクが含まれており、1つ目は、0から3の区間のユーモア尺度で見出しの面白さを推定することである。2つ目のサブタスクは、同じ元の見出しを編集した2つのバージョンのうち、どちらがより面白いかを予測することである。これまでのところ、このタスクは最も人気のある計算ユーモアの共有タスクであり、1つ目のサブタスクに48チーム、2つ目のサブタスクに31チームが参加した。 This paper describes the SemEval-2020 shared task "Assessing Humor in Edited News Headlines." The task's dataset contains news headlines in which short edits were applied to make them funny, and the funniness of these edited headlines was rated using crowdsourcing. This task includes two subtasks, the first of which is to estimate the funniness of headlines on a humor scale in the interval 0-3. The second subtask is to predict, for a pair of edited versions of the same original headline, which is the funnier version. To date, this task is the most popular shared computational humor task, attracting 48 teams for the first subtask and 31 teams for the second.	翻訳日:2022-11-04 00:20:23 公開日:2020-08-01
# LXPER Index:韓国のEFL学生を対象としたカリキュラム別テキスト可読性評価モデル LXPER Index: a curriculum-specific text readability assessment model for EFL students in Korea ( http://arxiv.org/abs/2008.01564v1 ) ライセンス: Link先を確認	Bruce W. Lee, Jason Hyung-Jong Lee	(参考訳) 自動可読性評価は、教育における自然言語処理(NLP)の最も重要な応用の1つである。自動可読性評価は、あらゆるレベルの習熟度において読み手の適切な読解教材を迅速に選択することを可能にするため、世界中の英語の外国語(efl)学生にとって特に有用である。ほとんどの可読性評価モデルは英語のネイティブ読者向けに開発されており、非ネイティブ英語教育(ELT)カリキュラムにおけるテキストの精度は低い。韓国のELTカリキュラムにおいて,非ネイティブなEFL読者を対象とした可読性評価モデルであるLXPER Indexを紹介した。実験の結果,韓国のELTカリキュラムのテキストコーパス(テキストコーパス)を用いて学習した新モデルは,韓国のELTカリキュラムにおけるテキストの自動可読性評価の精度を大幅に向上させることがわかった。 Automatic readability assessment is one of the most important applications of Natural Language Processing (NLP) in education. Since automatic readability assessment allows the fast selection of appropriate reading material for readers at all levels of proficiency, it can be particularly useful for the English education of English as Foreign Language (EFL) students around the world. Most readability assessment models are developed for the native readers of English and have low accuracy for texts in the non-native English Language Training (ELT) curriculum. We introduce LXPER Index, which is a readability assessment model for non-native EFL readers in the ELT curriculum of Korea. Our experiments show that our new model, trained with CoKEC-text (Text Corpus of the Korean ELT Curriculum), significantly improves the accuracy of automatic readability assessment for texts in the Korean ELT curriculum.	翻訳日:2022-11-04 00:20:10 公開日:2020-08-01
# ガウス過程回帰におけるスパース変分推論の収束 Convergence of Sparse Variational Inference in Gaussian Processes Regression ( http://arxiv.org/abs/2008.00323v1 ) ライセンス: Link先を確認	David R. Burt and Carl Edward Rasmussen and Mark van der Wilk	(参考訳) ガウス過程は関数上の分布であり、ベイズモデリングにおいて汎用的かつ数学的に扱いやすい事前分布である。しかし、厳密な推論に使われる行列演算のコストが観測数 $N$ に対して3乗で増えるため、観測数の多いデータでは使用が妨げられることが多い。$M \ll N$ 個の誘導変数に基づいて $\mathcal{O}(NM^2)$ のコストで近似を構成する多くの解法が提案されている。計算コストは $N$ について線形に見えるが、真の複雑さは、近似の一定の品質を保証するために $M$ を $N$ に対してどうスケールさせる必要があるかに依存する。本研究では、高品質な近似を保証するために $M$ が $N$ とともにどのように増加する必要があるかについて、上界と下界を調べる。ガウス雑音回帰モデルに対して、$M\ll N$ で近似モデルと厳密な事後分布の間のKLダイバージェンスを任意に小さくできることを示す。具体的には、よく使われる二乗指数カーネルと $D$ 次元のガウス分布に従う共変量に対して、$M=\mathcal{O}((\log N)^D)$ で十分であり、全体の計算コストが $\mathcal{O}(N(\log N)^{2D}(\log\log N)^2)$ の手法で推論を行うことができる。 Gaussian processes are distributions over functions that are versatile and mathematically convenient priors in Bayesian modelling. However, their use is often impeded for data with large numbers of observations, $N$, due to the cubic (in $N$) cost of matrix operations used in exact inference. Many solutions have been proposed that rely on $M \ll N$ inducing variables to form an approximation at a cost of $\mathcal{O}(NM^2)$. While the computational cost appears linear in $N$, the true complexity depends on how $M$ must scale with $N$ to ensure a certain quality of the approximation. In this work, we investigate upper and lower bounds on how $M$ needs to grow with $N$ to ensure high quality approximations. We show that we can make the KL-divergence between the approximate model and the exact posterior arbitrarily small for a Gaussian-noise regression model with $M\ll N$. Specifically, for the popular squared exponential kernel and $D$-dimensional Gaussian distributed covariates, $M=\mathcal{O}((\log N)^D)$ suffice and a method with an overall computational cost of $\mathcal{O}(N(\log N)^{2D}(\log\log N)^2)$ can be used to perform inference.	翻訳日:2022-11-04 00:19:35 公開日:2020-08-01
# 個人人口と公共人口の混合から学ぶ Learning from Mixtures of Private and Public Populations ( http://arxiv.org/abs/2008.00331v1 ) ライセンス: Link先を確認	Raef Bassily, Shay Moran and Anupama Nandi	(参考訳) 我々は,プライバシー制約下での教師あり学習の新しいモデルの研究を開始する。健康な人や不健康な人の集団からデータセットを採取する医療研究を想像してください。健康な個人がプライバシーに懸念を抱いていない場合(そのような場合、データを「公開」と呼ぶ)、不健康な個人がデータに対する厳格なプライバシー保護を望んでいると仮定する。この例では、人口(データ分布)は個人(不健康)と公共(健康)のサブ人口の混合であり、非常に異なる可能性がある。上記の例に触発されて、人口の$\mathcal{d}$が2つのサブ人口の混合であるモデルを考える: プライベートなサブ人口の$\mathcal{d}_{\sf priv}$ プライベートでセンシティブなデータと、プライバシーの懸念のないパブリックなサブ人口の$\mathcal{d}_{\sf pub}$である。 $\mathcal{D}$から引き出された各例は、その例がプライベートかパブリックかを示すプライバシー統計ビットを含むと仮定される。目標は、プライベートな例に対してのみ差分プライバシーを満たす学習アルゴリズムを設計することだ。この文脈における先行研究は、プライベートおよびパブリックデータが同じ分布から生じる均質な集団を仮定し、特にこの仮定を利用する設計されたソリューションを仮定した。本研究では, 線形分類器の学習問題である$\mathbb{r}^d$ を考えることにより, この仮定を回避できることを示す。プライバシステータスがターゲットラベルと相関している場合(上述の例のように)、古典的(非プライベートな)PAC学習に匹敵する複雑さを持つ、非依存的かつ実現可能な設定において、$\mathbb{R}^d$の線形分類器が学習可能であることを示す。すべてのデータをプライベートとみなすと、このタスクは不可能であることが知られている。 We initiate the study of a new model of supervised learning under privacy constraints. Imagine a medical study where a dataset is sampled from a population of both healthy and unhealthy individuals. Suppose healthy individuals have no privacy concerns (in such case, we call their data "public") while the unhealthy individuals desire stringent privacy protection for their data. In this example, the population (data distribution) is a mixture of private (unhealthy) and public (healthy) sub-populations that could be very different. Inspired by the above example, we consider a model in which the population $\mathcal{D}$ is a mixture of two sub-populations: a private sub-population $\mathcal{D}_{\sf priv}$ of private and sensitive data, and a public sub-population $\mathcal{D}_{\sf pub}$ of data with no privacy concerns. Each example drawn from $\mathcal{D}$ is assumed to contain a privacy-status bit that indicates whether the example is private or public. 
The goal is to design a learning algorithm that satisfies differential privacy only with respect to the private examples. Prior works in this context assumed a homogeneous population where private and public data arise from the same distribution, and in particular designed solutions which exploit this assumption. We demonstrate how to circumvent this assumption by considering, as a case study, the problem of learning linear classifiers in $\mathbb{R}^d$. We show that in the case where the privacy status is correlated with the target label (as in the above example), linear classifiers in $\mathbb{R}^d$ can be learned, in the agnostic as well as the realizable setting, with sample complexity which is comparable to that of the classical (non-private) PAC-learning. It is known that this task is impossible if all the data is considered private.	翻訳日:2022-11-04 00:19:04 公開日:2020-08-01
# エルゴード・アニーリング Ergodic Annealing ( http://arxiv.org/abs/2008.00234v1 ) ライセンス: Link先を確認	Carlo Baldassi, Fabio Maccheroni, Massimo Marinacci, Marco Pirazzini	(参考訳) シミュレーテッド・アニーリングは、コスト関数が既知のNP困難な最適化問題を解くためのマルコフ連鎖モンテカルロ法の最高の成果である。ここでは、シミュレーテッド・アニーリングのメトロポリス・エンジンを、Macau Algorithmと呼ぶ強化学習版に置き換えることで、コスト関数が未知で人工エージェントが学習しなければならない場合にも、シミュレーテッド・アニーリングのヒューリスティックが非常に有効であることを示す。 Simulated Annealing is the crowning glory of Markov Chain Monte Carlo Methods for the solution of NP-hard optimization problems in which the cost function is known. Here, by replacing the Metropolis engine of Simulated Annealing with a reinforcement learning variation -- that we call Macau Algorithm -- we show that the Simulated Annealing heuristic can be very effective also when the cost function is unknown and has to be learned by an artificial agent.	翻訳日:2022-11-04 00:18:23 公開日:2020-08-01
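
For readers unfamiliar with the "Metropolis engine" that the Ergodic Annealing abstract refers to, the following is a minimal sketch of classical simulated annealing with a known cost function. It is not the paper's Macau Algorithm, whose details are not given in the abstract. The function names, the geometric cooling schedule, and the toy integer problem are all assumptions chosen for illustration.

```python
import math
import random


def simulated_annealing(cost, neighbor, x0, t0=1.0, cooling=0.995, steps=5000, seed=0):
    """Classical simulated annealing with a Metropolis acceptance rule.

    cost:     known cost function to minimize (assumption: cheap to evaluate)
    neighbor: proposes a candidate state from the current one
    t0, cooling: hypothetical geometric cooling schedule t <- cooling * t
    """
    rng = random.Random(seed)
    x, fx = x0, cost(x0)
    best, fbest = x, fx
    t = t0
    for _ in range(steps):
        y = neighbor(x, rng)
        fy = cost(y)
        delta = fy - fx
        # Metropolis rule: always accept downhill moves,
        # accept uphill moves with probability exp(-delta / t).
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            x, fx = y, fy
            if fx < fbest:
                best, fbest = x, fx
        t *= cooling
    return best, fbest


# Toy problem (illustrative only): minimize (x - 3)^2 over the integers.
def toy_cost(x):
    return (x - 3) ** 2


def toy_neighbor(x, rng):
    return x + rng.choice((-1, 1))


best, fbest = simulated_annealing(toy_cost, toy_neighbor, x0=-20)
```

The acceptance test is the only place where the cost function enters. According to the abstract, that is the component the authors replace with a learned variant for the case where the cost is unknown.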

# 多量子ビット大ブロック符号に対する定数深さフォールトトレラント・クリフォード回路

Constant depth fault-tolerant Clifford circuits for multi-qubit large block codes ( http://arxiv.org/abs/2003.12328v2 )

ライセンス: Link先を確認

Yi-Cong Zheng, Ching-Yi Lai, Todd A. Brun, Leong-Chuan Kwek

(参考訳) $k>1$ 個の量子ビットを $n$ 個の物理量子ビットに符号化する大きなブロック符号を用いるフォールトトレラント量子計算 (FTQC) スキームは、高い符号化率のためリソースオーバーヘッドを大幅に削減できる可能性がある。しかし、符号化された量子ビットに対するフォールトトレラント(FT)論理演算は発見と実装が難しく、通常は非常に大きなリソースオーバーヘッドだけでなく、長い $\textit{in-situ}$ 計算時間も必要とする。本稿では、Calderbank-Shor-Steane (CSS) $[\![ n,k,d ]\!]$ 符号とその論理FTクリフォード回路に焦点を当てる。適格なアンシラ状態が効率的に準備されているもとで、任意の論理クリフォード回路を、KnillまたはSteaneのシンドローム測定回路を介して $O(1)$ ステップで \emph{in-situ} にフォールトトレラントに実装できることを示す。特に、$k/n\sim \Theta(1)$ を満たす符号の場合、論理レベルにおけるクリフォード回路の実装のリソーススケーリングは、符号距離 $d$ とは独立な定数倍を除いて物理レベルと同じにできる。アンシラ状態を生成する適切なパイプラインがあれば、本方式は超大規模FTQCに対して、物理量子ビット、物理ゲート、計算時間のいずれにおいても控えめなリソースコストしか必要としない。

Fault-tolerant quantum computation (FTQC) schemes using large block codes that encode $k>1$ qubits in $n$ physical qubits can potentially reduce the resource overhead to a great extent because of their high encoding rate. However, the fault-tolerant (FT) logical operations for the encoded qubits are difficult to find and implement, which usually takes not only a very large resource overhead but also long $\textit{in-situ}$ computation time. In this paper, we focus on Calderbank-Shor-Steane $[\![ n,k,d ]\!]$ (CSS) codes and their logical FT Clifford circuits. We show that the depth of an arbitrary logical Clifford circuit can be implemented fault-tolerantly in $O(1)$ steps \emph{in-situ} via either Knill or Steane syndrome measurement circuit, with the qualified ancilla states efficiently prepared. Particularly, for those codes satisfying $k/n\sim \Theta(1)$, the resource scaling for Clifford circuits implementation on the logical level can be the same as on the physical level up to a constant, which is independent of code distance $d$. With a suitable pipeline to produce ancilla states, our scheme requires only a modest resource cost in physical qubits, physical gates, and computation time for very large scale FTQC.

翻訳日:2023-05-27 18:33:19 公開日:2020-08-01

# 二次オプトメカニカル系の光応答とドリフト行列

Optical Response and Drift Matrix of Quadratic Optomechanical System ( http://arxiv.org/abs/2005.07065v3 )

ライセンス: Link先を確認

Akash Kundu

(参考訳) オプトメカニカル系における非線形相互作用は、F. Monifiらによって導入されたオプトメカニカルカオスの存在 [Nature Photonics 10, 399–405 (2016)] や、Zhong-Peng Liuらによって提案されたオプトメカニカルな対称性の破れ [Phys. Rev. Lett. 117, 110802 (2016)] など、近年増えつつある多くの興味深い研究や現象において重要な役割を果たす。本稿では、2つの原子準位を含む二次結合オプトメカニカル系を理論的に検討した。まず定常状態における系の各モードの解を調べ、次に系のいくつかのパラメータに対する透過強度(T)の変化を観測した。さらに、原子の自由度を断熱的に消去することで、二次オプトメカニカル系のドリフト行列と安定性条件を求めるところまで解析を拡張した。

Nonlinear interactions in optomechanical systems play a crucial role in a growing number of interesting studies and phenomena, such as the existence of optomechanical chaos introduced by F. Monifi et al. [Nature Photonics 10, 399–405 (2016)] and optomechanical symmetry breaking proposed by Zhong-Peng Liu et al. [Phys. Rev. Lett. 117, 110802 (2016)]. In this article we have theoretically examined a quadratically coupled optomechanical system containing two atomic levels. We have first studied the solution of various modes of the system at steady state and later we have observed the variation of Transmission Intensity (T) with several parameters of the system. Further we have extended our analysis to find the drift matrix of the quadratic optomechanical system and its stability conditions by adiabatically eliminating the atomic degree of freedom.

翻訳日:2023-05-20 11:42:00 公開日:2020-08-01
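
The abstract above mentions stability conditions derived from a drift matrix. As a rough illustration of what such a check usually looks like, here is a minimal sketch based on the standard criterion that linearized dynamics du/dt = A u are stable when every eigenvalue of A has a negative real part. The 2x2 matrices below describe a generic damped or anti-damped oscillator and are purely hypothetical. They are not the drift matrix of the paper, which the abstract does not give.

```python
import numpy as np


def is_stable(drift_matrix, tol=0.0):
    """Return True if all eigenvalues of the drift matrix have real part < -tol."""
    eigenvalues = np.linalg.eigvals(np.asarray(drift_matrix, dtype=float))
    return bool(np.all(eigenvalues.real < -tol))


# Hypothetical example: oscillator with frequency 1 and damping rate gamma.
# Eigenvalues solve lambda^2 + gamma * lambda + 1 = 0, so Re(lambda) = -gamma / 2.
damped = [[0.0, 1.0], [-1.0, -0.1]]       # gamma = +0.1
anti_damped = [[0.0, 1.0], [-1.0, 0.1]]   # gamma = -0.1 (gain instead of loss)
```

In this toy case `is_stable(damped)` holds and `is_stable(anti_damped)` does not. For larger drift matrices the same eigenvalue test is equivalent to the Routh-Hurwitz conditions.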

# ディラック物質における粒子内エンタングルメントの発生とベルの不等式の時間変化する破れ

Emergence of Intra-Particle Entanglement and Time-Varying Violation of Bell's Inequality in Dirac Matter ( http://arxiv.org/abs/2007.01584v2 )

ライセンス: Link先を確認

Bruna Gabrielly de Moraes, Aron W. Cummings, and Stephan Roche

(参考訳) 質量を持たないディラックフェルミオンにおける粒子内エンタングルメントの発生とダイナミクスを示す。スピン軌道結合によって生じるこのエンタングルメントは、グラフェン中の電子のスピンと副格子擬スピンの間に生じる。このエンタングルメントは複雑な動的量であるが、初期状態によらず一般に大きい。その時間依存性はベルの不等式の動的な破れを意味し、その大きさは、大きな粒子内エンタングルメントが基板上のグラフェンの一般的な特徴であることを示している。これらの特徴は粒子対の間のエンタングルメントにも影響すると予想され、ディラック材料に基づくメソスコピックデバイスにおいて、クーパー対分割とスピン間相関の非局所測定を組み合わせた実験で検出できる可能性がある。

We demonstrate the emergence and dynamics of intra-particle entanglement in massless Dirac fermions. This entanglement, generated by spin-orbit coupling, arises between the spin and sublattice pseudospin of electrons in graphene. The entanglement is a complex dynamic quantity but is generally large, independent of the initial state. Its time dependence implies a dynamical violation of a Bell inequality, while its magnitude indicates that large intra-particle entanglement is a general feature of graphene on a substrate. These features are also expected to impact entanglement between pairs of particles, and may be detectable in experiments that combine Cooper pair splitting with nonlocal measurements of spin-spin correlation in mesoscopic devices based on Dirac materials.

翻訳日:2023-05-11 18:34:14 公開日:2020-08-01

# 1時間コヒーレンス時間を超える単一イオン量子ビット

Single ion-qubit exceeding one hour coherence time ( http://arxiv.org/abs/2008.00251v1 )

ライセンス: Link先を確認

Pengfei Wang, Chun-Yang Luan, Mu Qiao, Mark Um, Junhua Zhang, Ye Wang, Xiao Yuan, Mile Gu, Jingning Zhang, Kihwan Kim

(参考訳) 長いコヒーレンス時間を持つ量子メモリの実現は、現在の量子技術の大きな課題である。本稿では、1時間を超えるコヒーレンス時間を持つ単一Ybイオン量子ビットメモリを報告する。これは従来の最高記録に比べて1桁の改善である。周囲の磁場ノイズ、位相ノイズ、マイクロ波発振器の漏れなど、様々な技術的課題に対処することで、この長いコヒーレンス時間を実現した。さらに、量子プロセストモグラフィによって量子メモリのデコヒーレンス過程を体系的に調べ、量子コヒーレンスの厳密な基準であるコヒーレンスの相対エントロピーを適用できるようにした。また、量子情報を保存する能力、すなわち量子メモリのロバストネスによってベンチマークを行い、6000秒以上にわたって量子メモリが非古典的な量子情報を保存することを明確に示した。本研究の結果は、時間単位での量子メモリの安定性を検証し、様々なシナリオへの幅広い適用可能性を示す。

Realizing a long coherence time quantum memory is a major challenge of current quantum technology. Here, we report a single $^{171}$Yb$^+$ ion-qubit memory with over one hour coherence time, an order of magnitude improvement compared to the state-of-the-art record. The long coherence time memory is realized by addressing various technical challenges such as ambient magnetic-field noise, phase noise and leakage of the microwave oscillator. Moreover, we systematically study the decoherence process of our quantum memory by quantum process tomography, which enables us to apply a strict criterion of quantum coherence, the relative entropy of coherence. We also benchmark our quantum memory by its ability to preserve quantum information, i.e., the robustness of quantum memory, which clearly shows that over 6000 s, our quantum memory preserves non-classical quantum information. Our results verify the stability of the quantum memory at the hour level and indicate its versatile applicability in various scenarios.

翻訳日:2023-05-07 10:38:48 公開日:2020-08-01

# 都市緑地植生の異なる指標の標準化グリーンビュー指標と定量化

Standardized Green View Index and Quantification of Different Metrics of Urban Green Vegetation ( http://arxiv.org/abs/2008.00229v1 )

ライセンス: Link先を確認

Yusuke Kumakoshi, Sau Yee Chan, Hideki Koizumi, Xiaojiang Li and Yuji Yoshimura

(参考訳) 都市緑化は、持続可能な開発と人々の生活の質との関係において重要な要素であると考えられている。都市緑化の測定方法が提案されているが、各指標の特徴は完全に確立されておらず、以前の研究は緑化指標の変化に弱い。本研究の目的は,(1)分析用緑化可視性向上指標(標準化されたGVI, sGVI)を提案し,(2)sGVIと他の緑化指標との関係を定量化することである。横浜市のデータセットを解析した結果,gviの重み付け型であるsgviが,密集した測定地点の偏りを緩和していることが示された。また,都市ブロックレベルでsGVIとNDVIを比較することで,sGVIは都市中心部の植生をよりよく捉えているのに対し,NDVIは公園や森林の植生を捉えるのに優れていることがわかった。これらのツールは、都市景観における植生の影響をより堅牢な方法でアクセスするための基盤を提供し、任意の地理的スケールの比較を可能にする。

Urban greenery is considered an important factor in relation to sustainable development and people's quality of life in the city. Although ways to measure urban greenery have been proposed, the characteristics of each metric have not been fully established, rendering previous research vulnerable to changes in greenery metrics. To make estimation more robust, this study aims to (1) propose an improved indicator of greenery visibility for analytical use (standardized GVI; sGVI), and (2) quantify the relation between sGVI and other greenery metrics. An analysis of a data set for Yokohama city, Japan, shows that the sGVI, a weighted form of GVI aggregated to an area, mitigates the bias of densely located measurement sites. Also, by comparing sGVI and NDVI at the city block level, we found that sGVI captures the presence of vegetation better in the city center, whereas NDVI is better at capturing vegetation in parks and forests. These tools provide a foundation for assessing the effect of vegetation in urban landscapes in a more robust manner, enabling comparison on any arbitrary geographical scale.

翻訳日:2023-05-07 10:38:30 公開日:2020-08-01

# 強結合キャビティ量子ビット系における駆動誘起共鳴狭窄化

Driving-induced resonance narrowing in a strongly coupled cavity-qubit system ( http://arxiv.org/abs/2008.00224v1 )

ライセンス: Link先を確認

Eyal Buks, Paul Brookes, Eran Ginossar, Chunqing Deng, Jean-Luc F. X. Orgiazzi, Martin Otto and Adrian Lupascu

(参考訳) マイクロ波空洞に強く結合した超伝導束量子ビットからなるシステムについて検討した。着飾った状態のスペクトルを操作するために外部応用クビット駆動を用いる。 2つの基本共鳴の分割が0に調整される領域における共鳴狭化を観察する。この領域における重なり合う共鳴の狭さは、量子状態の長期保存に利用することができる。さらに, キャビティモード駆動に対する応答を計測し, 実験結果と半古典的モデルの予測との質的偏差を求める。一方、システムの力学を規定するマスター方程式を数値的に統合した理論予測を用いて、良好な一致が得られる。観察された応答は、2つの準安定な服装状態のコヒーレントなキャンセルの過程を示す。

We study a system consisting of a superconducting flux qubit strongly coupled to a microwave cavity. Externally applied qubit driving is employed in order to manipulate the spectrum of dressed states. We observe resonance narrowing in the region where the splitting between the two dressed fundamental resonances is tuned to zero. The narrowing in this region of overlapping resonances can be exploited for long-time storage of quantum states. In addition, we measure the response to strong cavity mode driving, and find a qualitative deviation between the experimental results and the predictions of a semiclassical model. On the other hand, good agreement is obtained using theoretical predictions obtained by numerically integrating the master equation governing the system's dynamics. The observed response demonstrates a process of a coherent cancellation of two meta-stable dressed states.

翻訳日:2023-05-07 10:38:10 公開日:2020-08-01

# 安全のために子供の位置を遠隔追跡する装置

Device to Remotely Track and Locate the Position of a Child for Safety ( http://arxiv.org/abs/2008.00211v1 )

ライセンス: Link先を確認

S.M.K.C.S.B. Egodawela, H.M.D.M.B. Herath, R.D. Ranaweera, J.V. Wijayakulasooriya

(参考訳) 親はいつも子供の幸福を心配している。Missing Children Europeの2017年統計報告によると、2分ごとに1人の子どもの行方不明が報告されている。差し迫った脅威のために、親は子供と連絡を取り合うために携帯電話を買い与える傾向がある。しかし、子供に携帯電話を与えると、ネットいじめ、ソーシャルネットワークの不適切な利用、インターネット上の成人向けコンテンツや違法コンテンツへのアクセス、さらには電話の盗難などの問題を引き起こす可能性がある。そこで本研究では,子どもに優しい携帯端末を使って,親が子どもに電話をかけたり,居場所を特定したり,追跡したりできるソリューションを提案する。このデバイスが役立つ一般的なシナリオは、自宅と学校の往復のように、決まった経路を一人で移動する子どもの安全性を高めることである。デバイスは、通常の移動経路を追跡するように調整できる。そして、デバイスが通常の経路からの逸脱を検知すると、親への通知を発する。経路逸脱を検出するために、確率行列に基づく新しいアルゴリズムを導入する。本稿では、携帯端末の設計の詳細と経路逸脱検出アルゴリズムの詳細について述べる。

Parents are always worried about the wellbeing of their children. As per the Statistics Report 2017 by Missing Children Europe Organization, a child is reported missing every 2 minutes. Due to the imminent threat, parents are prone to buy their children mobile phones to keep in touch with them. However, giving a mobile phone to a child can cause issues including cyberbullying, improper use of social networks, access to mature-age and illicit content on the internet and, possibly, phone theft. In an effort to tackle some of those issues, this paper proposes a solution which enables parents to call, locate and track their children using a child-friendly mobile device. The common scenario in which the device would come into play is enhancing the safety of a child who travels alone on a typical route, for instance a child who walks from home to school and back. The device can be calibrated to keep track of a typical route of travel. Then, if the device detects some deviation from the usual route, it triggers a notification to the parents. A novel probability-matrix-based algorithm is introduced to detect route deviation. Design details of the mobile device, along with the details of the route deviation detection algorithm, are presented in this paper.

翻訳日:2023-05-07 10:37:21 公開日:2020-08-01

# BatNet: 超音波によるスマートフォン間のデータ伝送

BatNet: Data transmission between smartphones over ultrasound ( http://arxiv.org/abs/2008.00136v1 )

ライセンス: Link先を確認

Almos Zarandy, Ilia Shumailov, Ross Anderson

(参考訳) 本稿では,スマートフォンの内蔵スピーカーとマイクを介した超音波信号によるデータ伝送機構であるBatNetを提案する。8点のコンステレーションを持つ位相偏移変調と20-24kHzの周波数を用いることで、最大6mの距離において600bit/sを超える速度でデータを送信できる。対象とする用途は検閲耐性のあるメッシュネットワークである。Covidの接触追跡についても評価したが、この用途では超音波通信はBluetooth Low Energyに対してさらなる開発に値するほどの優位性を持たないと結論付けた。

In this paper, we present BatNet, a data transmission mechanism using ultrasound signals over the built-in speakers and microphones of smartphones. Using phase shift keying with an 8-point constellation and frequencies between 20 and 24 kHz, it can transmit data at over 600 bit/s at distances of up to 6 m. The target application is a censorship-resistant mesh network. We also evaluated it for Covid contact tracing but concluded that in this application ultrasonic communications do not appear to offer enough advantage over Bluetooth Low Energy to be worth further development.
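As a rough illustration of what phase shift keying with an 8-point constellation means, the sketch below maps each group of 3 bits to one of 8 equally spaced phases and recovers the bits by picking the nearest phase. It is not BatNet's implementation: the Gray labelling, the function names `modulate` and `demodulate`, and the absence of carrier generation, framing and error correction are all assumptions made for brevity.

```python
import cmath
import math

M = 8                # constellation size stated in the abstract
BITS_PER_SYMBOL = 3  # log2(8)

# Gray labels (an assumption): neighbouring phases differ in exactly one bit.
INDEX_TO_LABEL = [k ^ (k >> 1) for k in range(M)]
LABEL_TO_INDEX = {label: k for k, label in enumerate(INDEX_TO_LABEL)}


def modulate(bits):
    """Map a list of 0/1 values (length divisible by 3) to unit-circle symbols."""
    if len(bits) % BITS_PER_SYMBOL:
        raise ValueError("bit count must be a multiple of 3")
    symbols = []
    for i in range(0, len(bits), BITS_PER_SYMBOL):
        label = (bits[i] << 2) | (bits[i + 1] << 1) | bits[i + 2]
        k = LABEL_TO_INDEX[label]
        symbols.append(cmath.exp(2j * math.pi * k / M))
    return symbols


def demodulate(symbols):
    """Recover bits by rounding each symbol's phase to the nearest constellation point."""
    bits = []
    for s in symbols:
        angle = cmath.phase(s) % (2 * math.pi)
        k = round(angle / (2 * math.pi / M)) % M
        label = INDEX_TO_LABEL[k]
        bits.extend([(label >> 2) & 1, (label >> 1) & 1, label & 1])
    return bits
```

With 8 phases the decision regions are 45 degrees wide, so a received symbol decodes correctly as long as its phase error stays below 22.5 degrees; a real acoustic link would also need synchronisation and channel equalisation, which this sketch leaves out.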

翻訳日:2023-05-07 10:36:40 公開日:2020-08-01

# M. Hu, K. Guo, Q. Yu, Z. Zhangの論文"Third-harmonic generation investigated by a short-range bottomless exponential potential well"へのコメント [Superlattices and Microstructures, 122 (2018) 538-547]

Comment on the paper "Third-harmonic generation investigated by a short-range bottomless exponential potential well" by M. Hu, K. Guo, Q. Yu, Z. Zhang [Superlattices and Microstructures, 122 (2018) 538-547] ( http://arxiv.org/abs/2008.01833v1 )

ライセンス: Link先を確認

A.M. Ishkhanyan and G.G. Demirkhanyan

(参考訳) 我々は最近の論文M. Hu, K. Guo, Q. Yu, Z. Zhang [Superlattices and Microstructures, 122 (2018) 538-547] にいくつかの重大な誤りを発見した。具体的には、シュレーディンガー方程式の解と、論文で使われている束縛状態の波動関数の両方が誤りであることを示す。

We have discovered several severe errors in the recent paper M. Hu, K. Guo, Q. Yu, Z. Zhang [Superlattices and Microstructures, 122 (2018) 538-547]. Specifically, we demonstrate that both the solution of the Schrödinger equation and the bound-state wave functions used in the paper are incorrect.

翻訳日:2023-05-07 10:30:22 公開日:2020-08-01

# 量子系の安定性解析:リャプノフ基準と不変原理

Stability Analysis of Quantum Systems: a Lyapunov Criterion and an Invariance Principle ( http://arxiv.org/abs/2008.01534v1 )

ライセンス: Link先を確認

Muhammad F. Emzir, Matthew J. Woolley, Ian R. Petersen

(参考訳) 本稿では,量子系の密度作用素の収束を解析するためのリアプノフ安定性手法を提案する。マルコフ過程の古典的確率測度と類似して、不変密度作用素の集合は閉かつ凸であることを示す。次に、この集合の安定性を候補リアプノフ作用素を用いて解析する方法を示す。量子系の力学に関するBarbashin-Krasovskii-LaSalle定理のアナログを導入して、不変密度作用素の集合の解析を完成させる。

In this article, we propose a Lyapunov stability approach to analyze the convergence of the density operator of a quantum system. In analogy to the classical probability measure for Markovian processes, we show that the set of invariant density operators is both closed and convex. We then show how to analyze the stability of this set via a candidate Lyapunov operator. We complete our analysis of the set of invariant density operators by introducing an analog of the Barbashin-Krasovskii-LaSalle theorem on the dynamics of quantum systems.

翻訳日:2023-05-07 10:30:09 公開日:2020-08-01

# ランベルト-w関数を用いた完全可解量子系

Exactly-solvable quantum systems in terms of Lambert-W functions ( http://arxiv.org/abs/2008.01072v1 )

ライセンス: Link先を確認

A. Schulze-Halberg and A.M. Ishkhanyan

(参考訳) 我々は、ポテンシャルがランベルトW関数で与えられる、様々な新しい厳密に解ける量子系を構成する。特に、エネルギー依存ポテンシャルを持つシュレーディンガーモデル、超対称性形式を用いた従来のシュレーディンガーモデル、および2次元ディラック系を生成する。さらに、ランベルトW関数に対するロンスキアン積分公式も導出する。

We construct a variety of new exactly-solvable quantum systems, the potentials of which are given in terms of Lambert-W functions. In particular, we generate Schrödinger models with energy-dependent potentials, conventional Schrödinger models using the supersymmetry formalism, and two-dimensional Dirac systems. In addition, we derive Wronskian integral formulas for Lambert-W functions.
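For readers unfamiliar with the function named in this abstract: the Lambert W function is defined implicitly by $W(x)\,e^{W(x)} = x$. The sketch below evaluates its principal branch for non-negative arguments with Newton's method. It only illustrates the definition and is unrelated to the specific potentials constructed in the paper; the function name `lambert_w0`, the starting guess and the restriction to $x \ge 0$ are choices made for this example.

```python
import math


def lambert_w0(x, tol=1e-12, max_iter=100):
    """Principal branch W0(x) for x >= 0, via Newton's method on w*exp(w) - x."""
    if x < 0:
        raise ValueError("this sketch only covers x >= 0")
    # log(1 + x) lies at or above the root for x >= 0, so Newton descends to it.
    w = math.log1p(x)
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - x) / (ew * (w + 1.0))
        w -= step
        if abs(step) <= tol * (1.0 + abs(w)):
            break
    return w
```

A quick sanity check is that $W(e) = 1$, since $1 \cdot e^{1} = e$.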

翻訳日:2023-05-07 10:29:58 公開日:2020-08-01

# IBM Quantum Experienceにおける量子ゼノ効果の実証

Demonstrating Quantum Zeno Effect on IBM Quantum Experience ( http://arxiv.org/abs/2008.01070v1 )

ライセンス: Link先を確認

Subhashish Barik, Dhiman Kumar Kalita, Bikash K. Behera, Prasanta K. Panigrahi

(参考訳) 量子ゼノ効果(QZE)は、1977年にMisraとSudarshanによって発見されて以来 [J. Math. Phys. 18, 756 (1977)]、量子力学において最も興味深い現象の1つである。これを実験的に実現しようとする試みは数多くある。ここでは、IBM Quantum Experienceプラットフォーム上でQZEを初めてシミュレーションする。ラビ駆動振動する2準位系をシミュレートし、量子ゲートを用いた中間の繰り返し測定によって時間発展を乱すことで、量子ビットが初期状態に留まる生存確率を高める。中間測定を加えた回路を設計してIBMの量子シミュレータで実行し、結果が予測と一致することを示す。中間測定の回数とともに生存確率が増加することがQZEを示している。さらに、得られた結果に対するいくつかの別の説明も示すが、それにより観測結果の正確な理由付けにはある程度の曖昧さが残る。

Quantum Zeno Effect (QZE) has been one of the most interesting phenomena in quantum mechanics ever since its discovery in 1977 by Misra and Sudarshan [J. Math. Phys. 18, 756 (1977)]. There have been many attempts to realize it experimentally. Here, we present the first ever simulation of QZE on the IBM Quantum Experience platform. We simulate a two-level system undergoing Rabi-driven oscillation and then disturb the time evolution by intermediate repetitive measurements using quantum gates to increase the survival probability of the qubit in the initial state. The circuits are designed with the added intermediate measurements and executed on the IBM quantum simulator, and the outcomes are shown to be consistent with the predictions. The increasing survival probability with the number of intermediate measurements demonstrates QZE. Furthermore, some alternative explanations for the obtained results are provided, which lead to some ambiguity in the exact reasoning for the observed outcomes.
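The trend described above can be reproduced with a textbook idealisation, sketched below. It assumes a resonantly driven qubit that starts in $|0\rangle$, a total Rabi rotation angle split into equal segments, and an ideal projective measurement after each segment. Under those assumptions the probability of finding $|0\rangle$ every time is $\cos^{2N}(\theta/2N)$. This is not the circuit from the paper, and the function name `survival_probability` is hypothetical.

```python
import math


def survival_probability(total_angle, n_measurements):
    """Probability that all n equally spaced projective measurements return |0>.

    Each segment rotates the qubit by total_angle / n, so the |0> amplitude
    after one segment is cos(total_angle / (2 n)); a |0> outcome resets the state.
    """
    if n_measurements < 1:
        raise ValueError("need at least one measurement")
    segment = total_angle / n_measurements
    return math.cos(segment / 2.0) ** (2 * n_measurements)


if __name__ == "__main__":
    # A full pi rotation would flip the qubit if it were left unobserved.
    for n in (1, 2, 4, 10, 100):
        print(n, survival_probability(math.pi, n))
```

The survival probability rises towards 1 as the number of measurements grows, which is the qualitative signature the abstract refers to.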

翻訳日:2023-05-07 10:29:51 公開日:2020-08-01

# 任意電磁スペクトル密度のFewモード場量子化

Few-mode Field Quantization of Arbitrary Electromagnetic Spectral Densities ( http://arxiv.org/abs/2008.00349v1 )

ライセンス: Link先を確認

Ivan Medina, Francisco J. García-Vidal, Antonio I. Fernández-Domínguez, Johannes Feist

(参考訳) 我々は、単一量子エミッタと任意の電磁環境との相互作用を数モードのマスター方程式で記述するフレームワークを開発する。場の量子化は、古典的な電磁シミュレーションによって得られたスペクトル密度を、少数の損失モードと相互作用モードを含むモデルシステムにのみ適用する必要がある。複雑なハイブリッドプラズモン-フォトニック構造に配置されたエミッタの自然崩壊における個体群と電場ダイナミクスを記述し,本手法のパワーと妥当性について述べる。

We develop a framework that provides a few-mode master equation description of the interaction between a single quantum emitter and an arbitrary electromagnetic environment. The field quantization requires only the fitting of the spectral density, obtained through classical electromagnetic simulations, to a model system involving a small number of lossy and interacting modes. We illustrate the power and validity of our approach by describing the population and electric field dynamics in the spontaneous decay of an emitter placed in a complex hybrid plasmonic-photonic structure.

翻訳日:2023-05-07 10:29:36 公開日:2020-08-01

# 量子論は「解釈」を必要としないが「理論的形式的概念的ユニティ」(または、ダヴィッド・ドイチュの解説の助けを借りてアダン・カベロの「狂気の地図」を逃れる)

Quantum Theory Needs No 'Interpretation' But 'Theoretical Formal-Conceptual Unity' (Or: Escaping Adan Cabello's "Map of Madness" With the Help of David Deutsch's Explanations) ( http://arxiv.org/abs/2008.00321v1 )

ライセンス: Link先を確認

Christian de Ronde

(参考訳) 2000年、Chris FuchsとAsher PeresはQuantum Theory Needs No 'Interpretation'と題した論文で、QMにおける「解釈」によって演じられる役割に対する一連の器楽主義者の議論を発表した。それ以来、この論文の出版によらず、多くの解釈は、アダン・カベロが「狂気の地図」として特徴づけたものを構成する連続的な成長を経験した。本稿では、この危険な断片化の背景にある理由を論じ、アインシュタイン、ハイゼンベルク、パウリの著作に根ざした、理論の表現的実在論的な理解から(フクスとペレスの解釈に反する)QMの解釈の必要性に対する新たな論証を提供する。さらに、量子論における「解釈」の創出は、反現実主義者が出口のない迷路で現実主義者を投獄するためにデザインした罠として機能していると考える理由も考えられる。 david deutsch の批判的分析から反現実主義的物理学の理解の立場から、我々は「理論」と「観察」によって演じられる参照と役割に対処しようとする。この点に関して、我々は反現実主義的な解釈の罠から逃れる鍵は、アインシュタインがおよそ1世紀前にハイゼンベルクに語ったように、何が観察できるのかを教えてくれる理論にすぎないことを認識することにあると論じる。最後に、QMが必要とするものは新しい解釈ではなく、理論的(形式的-概念的)整合性、一貫性、統一的なスキームであり、理論が本当に何を言っているのかを理解することができると結論付ける。

In the year 2000, in a paper titled Quantum Theory Needs No 'Interpretation', Chris Fuchs and Asher Peres presented a series of instrumentalist arguments against the role played by 'interpretations' in QM. Since then, quite regardless of the publication of this paper, the number of interpretations has grown continuously, constituting what Adan Cabello has characterized as a "map of madness". In this work, we discuss the reasons behind this dangerous fragmentation in understanding and provide new arguments against the need for interpretations in QM which, opposite to those of Fuchs and Peres, are derived from a representational realist understanding of theories grounded in the writings of Einstein, Heisenberg and Pauli. Furthermore, we will argue that there are reasons to believe that the creation of 'interpretations' for the theory of quanta has functioned as a trap designed by anti-realists in order to imprison realists in a labyrinth with no exit. Taking as a standpoint David Deutsch's critical analysis of the anti-realist understanding of physics, we attempt to address the references and roles played by 'theory' and 'observation'. In this respect, we will argue that the key to escaping the anti-realist trap of interpretation is to recognize that, as Einstein told Heisenberg almost one century ago, it is only the theory which can tell you what can be observed. Finally, we will conclude that what QM needs is not a new interpretation but instead a consistent, coherent and unified theoretical (formal-conceptual) scheme which allows us to understand what the theory is really talking about.

翻訳日:2023-05-07 10:29:29 公開日:2020-08-01

# 入門データ科学の新展開

A fresh look at introductory data science ( http://arxiv.org/abs/2008.00315v1 )

ライセンス: Link先を確認

Mine Çetinkaya-Rundel and Victoria Ellison

(参考訳) 自然界で大規模で複雑なデータセットが大量に存在することから、大学は、データの発見を効果的に計画し、取得し、管理し、分析し、伝達するのに必要な、統計学と計算学の双方で訓練された卒業生の要求に応える必要がある。この需要に対応するために、データサイエンスに早くから学生を惹きつけ、この分野にしっかりと進出させることがますます重要になっている。本稿では,これらのニーズに対応するように設計されたデータサイエンス入門科のケーススタディについて述べる。デューク大学で提供されているこのコースには前提条件がなく、人文科学、社会科学、自然科学の学生だけでなく、統計学やデータサイエンスの専攻者も幅広く利用している。このようなコースを提供することによって生じる課題のユニークなセットについて議論し、これらの課題を踏まえて、教育設計要素、コンテンツ、構造、計算インフラ、およびコースの評価方法論について詳細な議論を行う。また、オープンソースである教材を全て含むリポジトリと、論文に見られる数字を再現するための補足資料とrコードも提供しています。

The proliferation of vast quantities of available datasets that are large and complex in nature has challenged universities to keep up with the demand for graduates trained in both the statistical and the computational set of skills required to effectively plan, acquire, manage, analyze, and communicate the findings of such data. To keep up with this demand, attracting students early on to data science as well as providing them a solid foray into the field becomes increasingly important. We present a case study of an introductory undergraduate course in data science that is designed to address these needs. Offered at Duke University, this course has no pre-requisites and serves a wide audience of aspiring statistics and data science majors as well as humanities, social sciences, and natural sciences students. We discuss the unique set of challenges posed by offering such a course and in light of these challenges, we present a detailed discussion into the pedagogical design elements, content, structure, computational infrastructure, and the assessment methodology of the course. We also offer a repository containing all teaching materials that are open-source, along with supplemental materials and the R code for reproducing the figures found in the paper.

翻訳日:2023-05-07 10:28:53 公開日:2020-08-01

# 二成分量子系のユニタリダイナミクスによる誤差について

On errors generated by unitary dynamics of bipartite quantum systems ( http://arxiv.org/abs/2008.00290v1 )

ライセンス: Link先を確認

G.G. Amosov, A.S. Mokeev

(参考訳) 量子チャネルが与えられると、このチャネルを介して情報の誤りのない送信の可能性を決定する性質を持つ非可換作用素グラフを定義することができる。対応するグラフは、クラウス作用素を通して量子誤差を決定するストレートな定義を持つ。我々は、あるグラフが対応するエラーの適切な定義の反対の問題について議論している。任意のグラフがある種のPOVMによって生成されることを考慮し、ナイマーク拡張定理を用いてそのような問題の解を与える。この手法を用いて、二部量子系のユニタリダイナミクスによって生成されるグラフに対応する誤差を構築する。円群 ${\mathbb Z}_n$ 上の POVM のケースと加法群 $\mathbb R$ について議論する。例えば、2モード量子発振器のダイナミクスによって生成される誤差に対応するグラフを構築する。

Given a quantum channel, it is possible to define the non-commutative operator graph whose properties determine the possibility of error-free transmission of information via this channel. The corresponding graph has a straightforward definition through the Kraus operators determining quantum errors. We discuss the opposite problem of properly defining the errors to which a given graph corresponds. Taking into account that any graph is generated by some POVM, we give a solution to this problem by means of the Naimark dilation theorem. Using our approach, we construct errors corresponding to the graphs generated by unitary dynamics of bipartite quantum systems. The cases of POVMs on the circle group ${\mathbb Z}_n$ and the additive group $\mathbb R$ are discussed. As an example, we construct the graph corresponding to the errors generated by the dynamics of a two-mode quantum oscillator.

翻訳日:2023-05-07 10:28:30 公開日:2020-08-01

# マイクロキャビティを経由するロバスト忠実性ハイパーパラレル制御相フリップゲート

Robust-fidelity hyperparallel controlled-phase-flip gate through microcavities ( http://arxiv.org/abs/2008.00258v1 )

ライセンス: Link先を確認

Hai-Rui Wei, Yan-Bei Zheng, Ming Hua, and Guo-Fu Xu

(参考訳) ハイパーパラレル量子情報処理は、チャネル容量、低損失率、処理速度の点で従来の並列処理よりも優れている。マイクロキャビティを用いた高並列光制御位相フリップゲートの実現手法を提案する。ゲートは同時に偏光と空間自由度(DOF)に作用し、光子と量子ドットの間の不完全で望ましくない相互作用が防止される。興味深いことに、ゲートの統一性は原則として達成でき、ゲートの成功は単光子検出器によって予測される。

Hyperparallel quantum information processing outperforms its traditional parallel counterpart in terms of channel capacity, loss rate, and processing speed. We present a way of implementing a robust hyperparallel optical controlled-phase-flip gate through microcavities. The gate acts on the polarization and spatial degrees of freedom (DOFs) simultaneously, and the incomplete and undesired interactions between photons and quantum dots are prevented. Interestingly, unity fidelity of the gate can be achieved in principle, and the success of the gate is heralded by single-photon detectors.

翻訳日:2023-05-07 10:27:51 公開日:2020-08-01

# 単語コンフュージョンネットワークを用いた対話状態追跡のためのASR曖昧性のモデル化

Modeling ASR Ambiguity for Dialogue State Tracking Using Word Confusion Networks ( http://arxiv.org/abs/2002.00768v2 )

ライセンス: Link先を確認

Vaishali Pal, Fabien Guillot, Manish Shrivastava, Jean-Michel Renders, Laurent Besacier

(参考訳) 音声対話システムは通常、意味を推測し対話の状態を追跡するためにトップNのASR仮説のリストを使用する。しかし、コンフュージョンネットワーク (confnets) のような ASR グラフは、トップNの ASR リストよりもリッチな仮説空間のコンパクトな表現を提供する。本稿では,最先端のニューラル対話状態トラッカー(DST)でコンフュージョンネットワークを用いる利点について検討する。任意のDSTシステムで使用可能な注意機構付きコンフュージョンネットワークエンコーダを用いて、2次元のconfnetを1次元の埋め込み列に符号化する。このconfnetエンコーダをDSTの最先端モデルである「Global-locally Self-Attentive Dialogue State Tracker」(GLAD)に組み込み、トップNのASR仮説を用いる場合と比べて精度と推論時間の両方で大きな改善を得た。

Spoken dialogue systems typically use a list of top-N ASR hypotheses for inferring the semantic meaning and tracking the state of the dialogue. However, ASR graphs, such as confusion networks (confnets), provide a compact representation of a richer hypothesis space than a top-N ASR list. In this paper, we study the benefits of using confusion networks with a state-of-the-art neural dialogue state tracker (DST). We encode the 2-dimensional confnet into a 1-dimensional sequence of embeddings using an attentional confusion network encoder which can be used with any DST system. Our confnet encoder is plugged into the state-of-the-art 'Global-locally Self-Attentive Dialogue State Tracker' (GLAD) model for DST and obtains significant improvements in both accuracy and inference time compared to using top-N ASR hypotheses.

翻訳日:2023-01-04 08:30:06 公開日:2020-08-01

# 深層距離学習におけるトレーニング戦略と一般化性能の再検討

Revisiting Training Strategies and Generalization Performance in Deep Metric Learning ( http://arxiv.org/abs/2002.08473v9 )

ライセンス: Link先を確認

Karsten Roth, Timo Milbich, Samarth Sinha, Prateek Gupta, Björn Ommer, Joseph Paul Cohen

(参考訳) ディープメトリック学習(dml)は、毎年提案されている多くのアプローチと視覚的な類似性を学ぶための最も影響力のある研究の1つである。フィールドは急速な進歩から恩恵を受けるが、トレーニングプロトコル、アーキテクチャ、パラメータの選択の相違はバイアスのない比較を難しくする。そこで我々は,最も広く使用されているDML対象関数を再検討し,重要なパラメータ選択と,一般的に無視されるミニバッチサンプリングプロセスについて検討する。一貫した比較では、DMLの目的は文学で示されるよりもはるかに高い飽和を示す。さらに解析により,DMLモデルの一般化性能に対する埋め込み空間密度と圧縮の相関関係を明らかにする。これらの知見をエクスプロイトし、様々な標準ベンチマークデータセット上でランキングベースのDMLモデルの性能を確実に向上させるための、シンプルで効果的なトレーニング正則化を提案する。コードとWandB-repoはhttps://github.com/Confusezius/Revisiting_Deep_Metric_Learning_PyTorchで公開されている。

Deep Metric Learning (DML) is arguably one of the most influential lines of research for learning visual similarities, with many approaches proposed every year. Although the field benefits from the rapid progress, the divergence in training protocols, architectures, and parameter choices makes an unbiased comparison difficult. To provide a consistent reference point, we revisit the most widely used DML objective functions and conduct a study of the crucial parameter choices as well as the commonly neglected mini-batch sampling process. Under consistent comparison, DML objectives show much higher saturation than indicated by the literature. Further, based on our analysis, we uncover a correlation of the embedding space density and compression with the generalization performance of DML models. Exploiting these insights, we propose a simple, yet effective, training regularization to reliably boost the performance of ranking-based DML models on various standard benchmark datasets. Code and a publicly accessible WandB-repo are available at https://github.com/Confusezius/Revisiting_Deep_Metric_Learning_PyTorch.

翻訳日:2022-12-30 14:10:47 公開日:2020-08-01

# 物体検出における空間的不確かさの推測

Inferring Spatial Uncertainty in Object Detection ( http://arxiv.org/abs/2003.03644v2 )

ライセンス: Link先を確認

Zining Wang, Di Feng, Yiyang Zhou, Lars Rosenbaum, Fabian Timm, Klaus Dietmayer, Masayoshi Tomizuka and Wei Zhan

(参考訳) 実世界のデータセットが利用可能であることは、自動運転のためのオブジェクト検出方法を開発するための前提条件である。オブジェクトラベルには、エラーが発生しやすいアノテーション処理やセンサーによる観測ノイズによる曖昧性が存在するが、現在のオブジェクト検出データセットは、その不確かさを考慮せずに決定論的アノテーションのみを提供する。これにより、特に予測確率を明示的にモデル化するオブジェクト検出手法の詳細な評価が妨げられる。本研究では,lidar点雲から境界ボックスラベルの不確かさを推定する生成モデルを提案し,空間分布を通じて確率的境界ボックスの新しい表現を定義する。総合実験により,提案モデルが運転シナリオでよく見られる不確実性を表すことを示す。空間分布に基づいて,ラベルの不確実性を考慮した新しい評価指標として,Jaccard IoU(JIoU)と呼ばれるIoUの拡張を提案する。 KITTIとWaymo Open Datasetsの実験により、JIoUは確率的物体検出器の評価においてIoUよりも優れていることが示された。

The availability of real-world datasets is the prerequisite for developing object detection methods for autonomous driving. While ambiguity exists in object labels due to an error-prone annotation process or sensor observation noise, current object detection datasets only provide deterministic annotations without considering their uncertainty. This precludes an in-depth evaluation among different object detection methods, especially for those that explicitly model predictive probability. In this work, we propose a generative model to estimate bounding box label uncertainties from LiDAR point clouds, and define a new representation of the probabilistic bounding box through spatial distribution. Comprehensive experiments show that the proposed model represents uncertainties commonly seen in driving scenarios. Based on the spatial distribution, we further propose an extension of IoU, called the Jaccard IoU (JIoU), as a new evaluation metric that incorporates label uncertainty. Experiments on the KITTI and the Waymo Open Datasets show that JIoU is superior to IoU when evaluating probabilistic object detectors.
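To give a feel for how an IoU-style score can be extended from hard boxes to spatial distributions, the sketch below computes a weighted Jaccard index on a shared discretised grid: the sum of pointwise minima divided by the sum of pointwise maxima. Whether this matches the paper's exact JIoU definition, including how the distributions are normalised, is an assumption; the function name `soft_jaccard` is made up for this example.

```python
def soft_jaccard(p, q):
    """Weighted Jaccard index of two non-negative grids given as flat lists.

    p[i] and q[i] are the weights that each distribution assigns to grid cell i.
    """
    if len(p) != len(q):
        raise ValueError("both grids must have the same number of cells")
    if any(v < 0 for v in p) or any(v < 0 for v in q):
        raise ValueError("weights must be non-negative")
    overlap = sum(min(a, b) for a, b in zip(p, q))
    union = sum(max(a, b) for a, b in zip(p, q))
    return overlap / union if union > 0 else 0.0
```

When both inputs are 0/1 occupancy masks, this reduces to the ordinary intersection-over-union of the two regions, while graded weights let uncertain cells near a box boundary contribute partially.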

翻訳日:2022-12-25 19:39:21 公開日:2020-08-01

# 畳み込み型占有ネットワーク

Convolutional Occupancy Networks ( http://arxiv.org/abs/2003.04618v2 )

ライセンス: Link先を確認

Songyou Peng, Michael Niemeyer, Lars Mescheder, Marc Pollefeys, Andreas Geiger

(参考訳) 近年、暗黙の神経表現が学習に基づく3D再構成で人気を集めている。有望な結果を示す一方で、ほとんどの暗黙的なアプローチは単一のオブジェクトの単純な幾何学に限られており、より複雑で大規模なシーンにスケールしない。暗黙的手法の鍵となる制限要因は、観察中に局所的な情報を統合したり、翻訳等価性のような帰納的バイアスを組み込むことができない、単純な完全連結ネットワークアーキテクチャである。本稿では,オブジェクトと3Dシーンの詳細な再構築のための,より柔軟な暗黙的表現である畳み込みネットワークを提案する。畳み込みエンコーダと暗黙の占有デコーダを組み合わせたモデルでは,帰納的バイアスが組み込まれ,3次元空間における構造化推論が可能となる。ノイズ点雲と低分解能ボクセル表現から複素幾何を再構成することにより,提案表現の有効性を検討する。実験により,本手法は単一物体の微細な3次元再構成,大規模屋内シーンへのスケール,合成データから実データへの一般化を可能にした。

Recently, implicit neural representations have gained popularity for learning-based 3D reconstruction. While demonstrating promising results, most implicit approaches are limited to the comparatively simple geometry of single objects and do not scale to more complicated or large-scale scenes. The key limiting factor of implicit methods is their simple fully-connected network architecture, which does not allow for integrating local information in the observations or incorporating inductive biases such as translational equivariance. In this paper, we propose Convolutional Occupancy Networks, a more flexible implicit representation for detailed reconstruction of objects and 3D scenes. By combining convolutional encoders with implicit occupancy decoders, our model incorporates inductive biases, enabling structured reasoning in 3D space. We investigate the effectiveness of the proposed representation by reconstructing complex geometry from noisy point clouds and low-resolution voxel representations. We empirically find that our method enables the fine-grained implicit 3D reconstruction of single objects, scales to large indoor scenes, and generalizes well from synthetic to real data.

翻訳日:2022-12-24 21:11:58 公開日:2020-08-01

# Emotions Don't Lie:Affective Cuesを用いたオーディオ・ビジュアルディープフェイク検出法

Emotions Don't Lie: An Audio-Visual Deepfake Detection Method Using Affective Cues ( http://arxiv.org/abs/2003.06711v3 )

ライセンス: Link先を確認

Trisha Mittal, Uttaran Bhattacharya, Rohan Chandra, Aniket Bera, Dinesh Manocha

(参考訳) 本稿では,実および偽のディープフェイクマルチメディアコンテンツを検出するための学習ベース手法を提案する。学習のための情報を最大化するために,同じビデオから2つのオーディオと視覚の類似性を抽出し,分析する。さらに,映像中の2つのモダリティから感情知覚に対応する感情的手がかりを抽出・比較し,入力映像が「リアル」か「フェイク」かを推定する。本稿では,シームズネットワークアーキテクチャと三重項損失にインスパイアされたディープラーニングネットワークを提案する。本モデルの有効性を検証するため,大規模深度検出データセットであるDeepFake-TIMIT DatasetとDFDCのAUC測定値について報告する。我々は,複数のSOTAディープフェイク検出手法とDFDCで84.4%,DF-TIMITデータセットで96.6%の動画AUCとを比較した。我々の知る限りでは、オーディオとビデオのモダリティを同時に活用する最初のアプローチであり、ディープフェイク検出のための2つのモダリティからの感情も認識する。

We present a learning-based method for distinguishing real from fake (deepfake) multimedia content. To maximize information for learning, we extract and analyze the similarity between the audio and visual modalities from within the same video. Additionally, we extract and compare affective cues corresponding to perceived emotion from the two modalities within a video to infer whether the input video is "real" or "fake". We propose a deep learning network, inspired by the Siamese network architecture and the triplet loss. To validate our model, we report the AUC metric on two large-scale deepfake detection datasets, DeepFake-TIMIT Dataset and DFDC. We compare our approach with several SOTA deepfake detection methods and report per-video AUC of 84.4% on the DFDC and 96.6% on the DF-TIMIT datasets, respectively. To the best of our knowledge, ours is the first approach that simultaneously exploits audio and video modalities and also perceived emotions from the two modalities for deepfake detection.
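The triplet loss mentioned above can be illustrated with its standard margin form, sketched here on plain Python lists. The paper's actual network, its embeddings and its exact loss formulation are not reproduced; the Euclidean distance, the default margin of 1.0 and the function names are assumptions for this example.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard margin triplet loss: zero once the negative is at least
    `margin` farther from the anchor than the positive is."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)
```

In a setting like the one described, one might take an embedding from one modality of a real video as the anchor, the matching embedding from the other modality as the positive, and an embedding from a fake video as the negative, so that training pulls matching pairs together and pushes mismatched pairs apart. That pairing is given here only as a plausible reading of the abstract.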

翻訳日:2022-12-23 20:12:12 公開日:2020-08-01

# DHP: HyperNetworksによる差別化可能なメタプルーニング

DHP: Differentiable Meta Pruning via HyperNetworks ( http://arxiv.org/abs/2003.13683v3 )

ライセンス: Link先を確認

Yawei Li, Shuhang Gu, Kai Zhang, Luc Van Gool, Radu Timofte

(参考訳) ネットワークプルーニングは、ニューラルネットワークの加速とモデルストレージ/送信負荷の軽減の原動力となっている。 AutoMLとニューラルアーキテクチャサーチ(NAS)の出現により、プルーニングは自動メカニズムと検索に基づくアーキテクチャ最適化で話題になっている。しかし、現在の自動設計は強化学習か進化的アルゴリズムに依存している。これらのアルゴリズムの非微分性のため、プルーニングアルゴリズムは収束に到達する前に長い探索段階を必要とする。この問題を回避するために,ネットワークの自動刈り出しのためのハイパーネットによる識別可能な刈り出し方式を提案する。特別に設計されたハイパーネットは遅延ベクトルを入力として、バックボーンネットワークの重みパラメータを生成する。潜在ベクトルは、バックボーンネットワーク内の畳み込み層の出力チャネルを制御し、レイヤのプルーニングのハンドルとして機能する。潜ベクトルに$\ell_1$スパーシティ正規化を強制し、近位勾配ソルバを利用することにより、疎潜ベクトルを得ることができる。スパシファイド潜在ベクトルをハイパーネットワークスに通すと、生成された重みパラメータの対応するスライスを除去し、ネットワーク切断の効果を達成できる。すべてのレイヤの潜在ベクターがプルーピングされ、自動的にレイヤ構成が生成される。画像分類、単一画像の超解像、雑音除去など、様々なネットワーク上で広範な実験が行われている。実験の結果,提案手法が検証された。

Network pruning has been the driving force for the acceleration of neural networks and the alleviation of model storage/transmission burden. With the advent of AutoML and neural architecture search (NAS), pruning has become topical with automatic mechanisms and search-based architecture optimization. Yet, current automatic designs rely on either reinforcement learning or evolutionary algorithms. Due to the non-differentiability of those algorithms, the pruning algorithm needs a long search stage before reaching convergence. To circumvent this problem, this paper introduces a differentiable pruning method via hypernetworks for automatic network pruning. The specifically designed hypernetworks take latent vectors as input and generate the weight parameters of the backbone network. The latent vectors control the output channels of the convolutional layers in the backbone network and act as a handle for the pruning of the layers. By enforcing $\ell_1$ sparsity regularization on the latent vectors and utilizing a proximal gradient solver, sparse latent vectors can be obtained. Passing the sparsified latent vectors through the hypernetworks, the corresponding slices of the generated weight parameters can be removed, achieving the effect of network pruning. The latent vectors of all the layers are pruned together, resulting in an automatic layer configuration. Extensive experiments are conducted on various networks for image classification, single image super-resolution, and denoising. The experimental results validate the proposed method.
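The combination of $\ell_1$ regularization with a proximal gradient solver can be sketched as follows: take an ordinary gradient step on the smooth part of the loss, then apply soft-thresholding, which is the proximal operator of the $\ell_1$ norm and sets small entries exactly to zero. This is the generic textbook update, not DHP's training code; the function names and the plain-list representation of a latent vector are assumptions.

```python
def soft_threshold(values, threshold):
    """Proximal operator of threshold * ||.||_1: shrink each entry towards zero."""
    shrunk = []
    for v in values:
        magnitude = max(abs(v) - threshold, 0.0)
        if magnitude == 0.0:
            shrunk.append(0.0)  # entries inside the threshold become exactly zero
        else:
            shrunk.append(magnitude if v > 0 else -magnitude)
    return shrunk


def proximal_step(latent, grad, lr, l1_weight):
    """One proximal gradient update on a latent vector.

    `grad` is the gradient of the smooth loss term only; the l1 penalty is
    handled by the soft-thresholding step with threshold lr * l1_weight.
    """
    stepped = [z - lr * g for z, g in zip(latent, grad)]
    return soft_threshold(stepped, lr * l1_weight)
```

Entries that land at exactly zero are the ones a pruning scheme of this kind would treat as removable channels.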

翻訳日:2022-12-18 07:27:36 公開日:2020-08-01
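
The proximal step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the standard proximal operator of the $\ell_1$ norm (soft thresholding); the names `soft_threshold` and `kept_channels` and the toy latent vector are hypothetical and not taken from the paper, and no hypernetwork is modeled here.

```python
def soft_threshold(latent, step, lam):
    """Proximal operator of step * lam * ||z||_1, applied elementwise."""
    t = step * lam
    out = []
    for v in latent:
        mag = max(abs(v) - t, 0.0)       # shrink the magnitude toward zero
        out.append(mag if v > 0 else -mag)
    return out


def kept_channels(latent):
    """Indices of output channels whose latent entry is still non-zero."""
    return [i for i, v in enumerate(latent) if v != 0.0]


z = [0.5, -0.05, 0.2, -0.8]              # toy latent vector, one entry per channel
z_sparse = soft_threshold(z, step=0.1, lam=1.0)
print(kept_channels(z_sparse))           # [0, 2, 3]
```

Entries that are shrunk exactly to zero mark channels that could be removed; in the paper the corresponding weight slices would be dropped from the hypernetwork output.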

# VisualCOMET:静止画像の動的コンテキストに関する推論

VisualCOMET: Reasoning about the Dynamic Context of a Still Image ( http://arxiv.org/abs/2004.10796v3 )

ライセンス: Link先を確認

Jae Sung Park, Chandra Bhagavatula, Roozbeh Mottaghi, Ali Farhadi, Yejin Choi

(参考訳) 静止画の1つのフレームからでも、人々はその画像のダイナミックなストーリーをフレームの前、後、そしてその向こうで考えることができる。例えば、水に浮くのに苦労している男のイメージを考えると、その男が過去に水に落ちたのは、その時の男の意図が生き残ることであり、近い将来に助けが必要であり、そうでなければ洗い流されることになる。我々はvisualcometを提案する。visual commonsense推論タスクの新しいフレームワークで、以前発生した可能性のあるイベント、次に発生した可能性のあるイベント、現在の人々の意図を予測する。視覚コモンセンス推論に向けた研究を支援するために,視覚コモンセンス推論の140万以上のテキスト記述からなり,それぞれが前後の短いビデオ要約と組み合わせて,様々な6万枚の画像セットに注意深く注釈付けされた視覚コモンセンス推論の大規模リポジトリを紹介する。さらに,画像に現れる人とテキストのコモンセンス記述で言及される人との人格的接点(つまりコリファレンスリンク)を提供し,画像とテキストのより緊密な統合を可能にした。我々は,この課題に対して強力なベースライン性能を確立し,視覚的およびテキスト的コモンセンス推論の統合が鍵であり,非統合的な代替手段に勝っていることを示す。

Even from a single frame of a still image, people can reason about the dynamic story of the image before, after, and beyond the frame. For example, given an image of a man struggling to stay afloat in water, we can reason that the man fell into the water sometime in the past, the intent of that man at the moment is to stay alive, and he will need help in the near future or else he will get washed away. We propose VisualComet, the novel framework of visual commonsense reasoning tasks to predict events that might have happened before, events that might happen next, and the intents of the people at present. To support research toward visual commonsense reasoning, we introduce the first large-scale repository of Visual Commonsense Graphs that consists of over 1.4 million textual descriptions of visual commonsense inferences carefully annotated over a diverse set of 60,000 images, each paired with short video summaries of before and after. In addition, we provide person-grounding (i.e., co-reference links) between people appearing in the image and people mentioned in the textual commonsense descriptions, allowing for tighter integration between images and text. We establish strong baseline performances on this task and demonstrate that integration between visual and textual commonsense reasoning is the key and wins over non-integrative alternatives.

翻訳日:2022-12-10 17:11:57 公開日:2020-08-01

# 一級アプローチによる新型コロナウイルス研究論文のターゲット特定マイニング

Target specific mining of COVID-19 scholarly articles using one-class approach ( http://arxiv.org/abs/2004.11706v2 )

ライセンス: Link先を確認

Sanjay Kumar Sonbhadra, Sonali Agarwal and P. Nagabhushan

(参考訳) 近年では、重症急性呼吸症候群(SARS)、中東部呼吸症候群(MERS)、COVID-19など、コロナウイルスの分野でのいくつかの研究論文が公表されている。多くの研究論文が存在する中で、最も適した記事の抽出には時間がかかる。本研究の目的は,コロナウイルス関連研究論文の活動と動向を機械学習を用いて抽出することである。実験にはcovid-19 open research dataset(cord-19)が使用される一方で、いくつかのターゲットタスクと説明がドメイン知識に基づいて分類のために定義されている。クラスタリング技術は、利用可能な記事の異なるクラスタを作成するために使用され、その後、並列一クラスサポートベクターマシン(OCSVM)を使用してタスク割り当てが行われる。オリジナルと縮小された機能による実験は、アプローチのパフォーマンスを検証する。 k-meansクラスタリングアルゴリズムが並列なOCSVMに続き、オリジナルと縮小された特徴空間において他の手法よりも優れていることは明らかである。

In recent years, several research articles have been published on diseases caused by coronaviruses, such as severe acute respiratory syndrome (SARS), Middle East respiratory syndrome (MERS) and COVID-19. Given the large number of research articles, manually extracting the best-suited articles is time-consuming and impractical. The objective of this paper is to extract the activity and trends of coronavirus-related research articles using machine learning approaches. The COVID-19 open research dataset (CORD-19) is used for experiments, and several target tasks, along with explanations, are defined for classification based on domain knowledge. Clustering techniques are used to create different clusters of the available articles, and the task assignment is then performed using parallel one-class support vector machines (OCSVMs). Experiments with original and reduced features validate the performance of the approach. The results show that the k-means clustering algorithm, followed by parallel OCSVMs, outperforms other methods for both the original and the reduced feature space.

翻訳日:2022-12-10 03:07:08 公開日:2020-08-01

# マルチビュースペクトルクラスタリングによるテンソル低ランク表現

Multi-View Spectral Clustering Tailored Tensor Low-Rank Representation ( http://arxiv.org/abs/2004.14705v2 )

ライセンス: Link先を確認

Yuheng Jia, Hui Liu, Junhui Hou, Sam Kwong, Qingfu Zhang

(参考訳) 本稿では,テンソル低ランクモデルに基づくマルチビュースペクトルクラスタリング(MVSC)の問題について検討する。 MVSCのテンソルの特殊特性を考慮せずに、既成のテンソル低ランクノルムを採用する既存の方法とは異なり、MVSCに合わせた構造付きテンソル低ランクノルムを設計する。具体的には、テンソルの前面スライスと水平スライスに対称な低ランク制約と構造的な低ランク制約を明示的に課し、ビュー内関係とビュー間関係を特徴付ける。さらに、この2つの制約は相互改善を達成するために共同で最適化できる。新たなテンソル低ランクノルムに基づいて, MVSCを凸低ランクテンソル回復問題として定式化し, 拡張ラグランジュ乗算法を反復的に解いた。 5つのベンチマークデータセットの広範な実験結果から,提案手法が最先端手法をかなり上回っていることがわかった。驚くべきことに、この手法は完璧なクラスタリングを実現できる。さらに,提案手法のパラメータの調整も容易であり,提案手法は異なるデータセットに対して頑健であり,実際にその可能性を示す。

This paper explores the problem of multi-view spectral clustering (MVSC) based on tensor low-rank modeling. Unlike the existing methods that all adopt an off-the-shelf tensor low-rank norm without considering the special characteristics of the tensor in MVSC, we design a novel structured tensor low-rank norm tailored to MVSC. Specifically, we explicitly impose a symmetric low-rank constraint and a structured sparse low-rank constraint on the frontal and horizontal slices of the tensor to characterize the intra-view and inter-view relationships, respectively. Moreover, the two constraints could be jointly optimized to achieve mutual refinement. On the basis of the novel tensor low-rank norm, we formulate MVSC as a convex low-rank tensor recovery problem, which is then efficiently solved with an augmented Lagrange multiplier based method iteratively. Extensive experimental results on five benchmark datasets show that the proposed method outperforms state-of-the-art methods to a significant extent. Impressively, our method is able to produce perfect clustering. In addition, the parameters of our method can be easily tuned, and the proposed model is robust to different datasets, demonstrating its potential in practice.

翻訳日:2022-12-08 02:54:16 公開日:2020-08-01

# ニューラルネットワーク翻訳におけるサブワードセグメンテーションのための動的プログラミング符号化

Dynamic Programming Encoding for Subword Segmentation in Neural Machine Translation ( http://arxiv.org/abs/2005.06606v2 )

ライセンス: Link先を確認

Xuanli He, Gholamreza Haffari, Mohammad Norouzi

(参考訳) 本稿では,文をサブワード単位にトークン化する新しいセグメンテーションアルゴリズムである動的プログラミング符号化(DPE)を紹介する。学習や推論のために限界化されるべき潜在変数として,出力文のサブワードセグメンテーションを考察する。高精度なログ辺縁確率推定と正確な地図推定を可能にし,最大後方確率のターゲットセグメンテーションを探索する混合文字・サブワードトランスを提案する。 DPEは、動的プログラミングを用いて出力文を分割する並列データを前処理する手段として、軽量な混合文字サブワード変換器を使用している。機械翻訳における実験結果から、DPEは出力文のセグメンテーションに有効であり、ソース文の確率的セグメンテーションにBPEドロップアウトと組み合わせることができることが示唆された。 DPEは、BPEよりも0.9BLEUの平均的な改善(Sennrich et al., 2016)とBPEよりも0.55BLEUの平均的な改善(Provilkov et al., 2019)を、英語<=>(ドイツ語、ルーマニア語、エストニア語、フィンランド語、ハンガリー語)を含むいくつかのWMTデータセットで達成している。

This paper introduces Dynamic Programming Encoding (DPE), a new segmentation algorithm for tokenizing sentences into subword units. We view the subword segmentation of output sentences as a latent variable that should be marginalized out for learning and inference. A mixed character-subword transformer is proposed, which enables exact log marginal likelihood estimation and exact MAP inference to find target segmentations with maximum posterior probability. DPE uses a lightweight mixed character-subword transformer as a means of pre-processing parallel data to segment output sentences using dynamic programming. Empirical results on machine translation suggest that DPE is effective for segmenting output sentences and can be combined with BPE dropout for stochastic segmentation of source sentences. DPE achieves an average improvement of 0.9 BLEU over BPE (Sennrich et al., 2016) and an average improvement of 0.55 BLEU over BPE dropout (Provilkov et al., 2019) on several WMT datasets including English <=> (German, Romanian, Estonian, Finnish, Hungarian).

翻訳日:2022-12-07 06:15:09 公開日:2020-08-01
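
The two dynamic programs named in the abstract above (exact marginalization over segmentations and exact MAP segmentation) can be sketched as follows. This is a simplified illustration, not the paper's model: it assumes a fixed, context-independent table of subword log-probabilities (`logp`, hypothetical), whereas DPE scores subwords with a mixed character-subword transformer.

```python
import math


def logsumexp(xs):
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))


def log_marginal(text, logp, max_len):
    """Log of the summed probability of every segmentation of `text`."""
    n = len(text)
    alpha = [-math.inf] * (n + 1)        # alpha[i]: log-marginal of text[:i]
    alpha[0] = 0.0
    for i in range(1, n + 1):
        terms = [alpha[j] + logp[text[j:i]]
                 for j in range(max(0, i - max_len), i)
                 if text[j:i] in logp and alpha[j] > -math.inf]
        if terms:
            alpha[i] = logsumexp(terms)
    return alpha[n]


def map_segmentation(text, logp, max_len):
    """Highest-probability segmentation (Viterbi-style recursion)."""
    n = len(text)
    best = [-math.inf] * (n + 1)
    back = [0] * (n + 1)
    best[0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(0, i - max_len), i):
            piece = text[j:i]
            if piece in logp and best[j] + logp[piece] > best[i]:
                best[i] = best[j] + logp[piece]
                back[i] = j
    if best[n] == -math.inf:
        raise ValueError("text cannot be segmented with this vocabulary")
    pieces, i = [], n
    while i > 0:                         # follow back-pointers to recover pieces
        pieces.append(text[back[i]:i])
        i = back[i]
    return pieces[::-1]


# Toy vocabulary with made-up probabilities.
toy = {"a": math.log(0.5), "b": math.log(0.2), "ab": math.log(0.3)}
```

With this toy table, the string "ab" has two segmentations, "a"+"b" (probability 0.1) and "ab" (probability 0.3), so the marginal is 0.4 and the MAP segmentation is the single piece "ab".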

# 脳波時系列予測のための感情誘発深部構造(EiDS)

Emotion-Inspired Deep Structure (EiDS) for EEG Time Series Forecasting ( http://arxiv.org/abs/2005.13520v2 )

ライセンス: Link先を確認

Mahboobeh Parsapoor

(参考訳) 脳波(EEG)時系列の正確な予測は、発作やてんかんなどの神経疾患の正確な診断に不可欠である。脳波時系列はカオスであるため、従来の機械学習アルゴリズムのほとんどは次のステップを正確に予測できなかった。そこで本研究では,脳波の時系列を予測するために,感情(感情状態)の基盤となる神経構造から着想を得たモデルを提案する。このモデルは感情にインスパイアされた深層構造(EiDS)と呼ばれ、脳波時系列の短期と長期の両方を予測するのに使うことができる。本稿では,EiDSの性能を,長短期記憶(LSTM)ネットワークの他のバリエーションと比較する。

Accurate forecasting of an electroencephalogram (EEG) time series is crucial for the correct diagnosis of neurological disorders such as seizures and epilepsy. Since the EEG time series is chaotic, most traditional machine learning algorithms have failed to forecast its next steps accurately. Thus, we suggest a model, formed by taking inspiration from the neural structures that underlie feelings (emotional states), to forecast EEG time series. The model, which is referred to as emotion-inspired deep structure (EiDS), can be used to make both short- and long-term predictions of EEG time series. This paper also compares the performance of EiDS with other variations of long short-term memory (LSTM) networks.

翻訳日:2022-11-30 03:44:54 公開日:2020-08-01

# 予測から処方へ:COVID-19パンデミックにおける非薬剤的介入の進化的最適化

From Prediction to Prescription: Evolutionary Optimization of Non-Pharmaceutical Interventions in the COVID-19 Pandemic ( http://arxiv.org/abs/2005.13766v3 )

ライセンス: Link先を確認

Risto Miikkulainen, Olivier Francon, Elliot Meyerson, Xin Qiu, Elisa Canzani, and Babak Hodjat

(参考訳) 新型コロナウイルス(COVID-19)のパンデミックの広がりや、ソーシャルディスタンシングの規制や学校やビジネスの閉鎖など、非薬剤的介入(NPI)をどう含めるかを予測するために、いくつかのモデルが開発されている。本稿では,進化的AIが次のステップ,すなわち最も効果的な介入戦略を自動決定するためにどのように使用できるかを示す。進化的代理補助処方(ESP)により、多数の候補戦略を生成し、予測モデルで評価することができる。原則として、戦略は異なる国や地域向けにカスタマイズでき、パンデミックを包含する必要性と経済への影響を最小限に抑える必要性のバランスを取ることができる。まだ利用可能なデータには制限があるが、初期の実験では職場や学校の制限が最も重要であり、慎重に設計する必要があることを示唆している。また、制限の解除結果が信頼できないことも示しており、例えば時間とともに変更することで、制約をソフトに実装できる創造的な方法を提案している。より多くのデータが利用可能になるにつれて、このアプローチは新型コロナウイルス(COVID-19)と将来のパンデミックに対処するのにますます有用になる。

Several models have been developed to predict how the COVID-19 pandemic spreads, and how it could be contained with non-pharmaceutical interventions (NPIs) such as social distancing restrictions and school and business closures. This paper demonstrates how evolutionary AI could be used to facilitate the next step, i.e. determining the most effective intervention strategies automatically. Through evolutionary surrogate-assisted prescription (ESP), it is possible to generate a large number of candidate strategies and evaluate them with predictive models. In principle, strategies can be customized for different countries and locales, and can balance the need to contain the pandemic against the need to minimize its economic impact. While still limited by available data, early experiments suggest that workplace and school restrictions are the most important and need to be designed carefully. The paper also demonstrates that the results of lifting restrictions can be unreliable, and suggests creative ways in which restrictions can be implemented softly, e.g. by alternating them over time. As more data becomes available, the approach can be increasingly useful in dealing with COVID-19 as well as possible future pandemics.

翻訳日:2022-11-27 04:27:12 公開日:2020-08-01
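
The generate-and-evaluate loop described in the abstract above can be sketched with a toy example. Everything here is a hypothetical stand-in: `surrogate_cases` is a made-up predictor with invented effectiveness numbers, strategies are plain lists of stringency levels (the paper evolves prescriptor models), and the two objectives are collapsed into one weighted sum for brevity.

```python
import random


def surrogate_cases(strategy):
    """Hypothetical predictor: higher stringency gives fewer predicted cases."""
    effects = [0.30, 0.25, 0.10, 0.05]            # made-up per-NPI effectiveness
    cases = 1000.0
    for level, eff in zip(strategy, effects):
        cases *= 1.0 - eff * level / 4.0          # levels range over 0..4
    return cases


def fitness(strategy, cost_weight=20.0):
    """Lower is better: predicted cases plus a stringency (economic) penalty."""
    return surrogate_cases(strategy) + cost_weight * sum(strategy)


def evolve(generations=30, pop_size=20, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 4) for _ in range(4)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=fitness)
        history.append(fitness(pop[0]))
        elite = pop[: pop_size // 4]              # keep the best quarter unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            child = list(rng.choice(elite))
            child[rng.randrange(4)] = rng.randint(0, 4)   # point mutation
            children.append(child)
        pop = elite + children
    pop.sort(key=fitness)
    return pop[0], history


best, history = evolve()
```

Because the elite is carried over unchanged, the best fitness recorded in `history` never gets worse from one generation to the next; candidate strategies are only ever scored by the surrogate, never by a real-world trial.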

# ニューラルネットワーク圧縮の概観

An Overview of Neural Network Compression ( http://arxiv.org/abs/2006.03669v2 )

ライセンス: Link先を確認

James O' Neill

(参考訳) 収束まで訓練された過パラメータネットワークは、コンピュータビジョンや自然言語処理といった領域で印象的なパフォーマンスを示している。これらの領域における主要なタスクで最先端を推し進めるにつれて、モデルは大きくなり、メモリとストレージの要求の増加により機械学習実践者にとって使用がますます難しくなり、カーボンフットプリントも大きくなっている。このように、近年ではモデル圧縮技術が復活し、特に深い畳み込みニューラルネットワークや、トランスフォーマーのような自己注意ベースのネットワークが注目されている。そこで本論文では, プルーニング, 量子化, テンソル分解, 知識蒸留, およびそれらの組み合わせを含む, ディープニューラルネットワークの古い圧縮技術と現在の圧縮技術について, タイムリーに概説する。本稿では、深層学習アーキテクチャ(深層学習の入門についてはGoodfellow et al., 2016を参照)、すなわちリカレントニューラルネットワーク(RNN; Rumelhart et al., 1985; Hochreiter and Schmidhuber, 1997)、畳み込みニューラルネットワーク(Fukushima, 1980; 最新の概要についてはKhan et al., 2019を参照)、自己注意ベースのネットワーク(Vaswani et al., 2017; 自己注意ネットワークの概要についてはChaudhari et al., 2019を、詳細と自然言語処理での使用についてはHu, 2019を参照)に関する基本的な知識を前提とする。議論された論文のほとんどは、これらのDNNアーキテクチャの少なくとも1つの文脈で提案されている。

Overparameterized networks trained to convergence have shown impressive performance in domains such as computer vision and natural language processing. Pushing the state of the art on salient tasks within these domains corresponds to these models becoming larger and more difficult for machine learning practitioners to use given the increasing memory and storage requirements, not to mention the larger carbon footprint. Thus, in recent years there has been a resurgence in model compression techniques, particularly for deep convolutional neural networks and self-attention based networks such as the Transformer. Hence, this paper provides a timely overview of both old and current compression techniques for deep neural networks, including pruning, quantization, tensor decomposition, knowledge distillation and combinations thereof. We assume a basic familiarity with deep learning architectures (for an introduction to deep learning, see Goodfellow et al., 2016), namely recurrent neural networks (RNNs; Rumelhart et al., 1985; Hochreiter and Schmidhuber, 1997), convolutional neural networks (Fukushima, 1980; for an up-to-date overview, see Khan et al., 2019) and self-attention based networks (Vaswani et al., 2017; for a general overview of self-attention networks, see Chaudhari et al., 2019, and for more detail and their use in natural language processing, see Hu, 2019). Most of the papers discussed are proposed in the context of at least one of these DNN architectures.

翻訳日:2022-11-25 02:58:04 公開日:2020-08-01
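
Two of the technique families listed in the abstract above, pruning and quantization, can be illustrated in their simplest textbook forms. This is a generic sketch on a flat list of weights; the function names are hypothetical and the code does not reproduce any specific method surveyed in the paper.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the given fraction of weights with the smallest magnitudes."""
    k = int(len(weights) * sparsity)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def quantize(weights, bits):
    """Symmetric uniform quantization; returns integer codes and the scale."""
    levels = 2 ** (bits - 1) - 1
    scale = (max(abs(w) for w in weights) / levels) or 1.0   # avoid a zero scale
    codes = [round(w / scale) for w in weights]
    return codes, scale


w = [0.1, -0.5, 0.05, 0.9]
pruned = magnitude_prune(w, sparsity=0.5)    # [0.0, -0.5, 0.0, 0.9]
codes, scale = quantize(w, bits=8)
restored = [c * scale for c in codes]        # dequantized approximation of w
```

Each restored weight differs from the original by at most half a quantization step (`scale / 2`), which is the usual error bound for round-to-nearest uniform quantization.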

# MultiSpeech: トランスフォーマーを用いた多話者音声テキスト

MultiSpeech: Multi-Speaker Text to Speech with Transformer ( http://arxiv.org/abs/2006.04664v2 )

ライセンス: Link先を確認

Mingjian Chen, Xu Tan, Yi Ren, Jin Xu, Hao Sun, Sheng Zhao, Tao Qin, Tie-Yan Liu

(参考訳) Transformerベースの音声合成(TTS)モデル(例: Transformer TTS (Li et al., 2019), FastSpeech (Ren et al., 2019))は、トレーニングや推論における並列計算により、RNNベースのモデル(例: Tacotron (Shen et al., 2018))よりもトレーニングと推論効率の利点を示した。しかし、並列計算はトランスフォーマのテキストと音声のアライメントを学習しながら難易度を増大させ、ノイズデータと多彩な話者によるマルチスピーカーシナリオではさらに拡大され、マルチスピーカーTTSにおけるトランスフォーマの適用性が阻害される。本稿では,テキストから音声へのアライメントを改善するためのコンポーネント/技術をいくつか備えた,ロバストで高品質なマルチスピーカートランスフォーマーTTSシステムであるMultiSpeechを開発した。 1) 訓練及び推論において,エンコーダ・デコーダ注意の重量行列上の対角的制約 2) 位置情報をよりよく保存するための、エンコーダの音素埋め込みに対するレイヤ正規化 3) 連続音声フレーム間のコピーを防止するデコーダプリネットのボトルネック。 VCTKおよびLibriTTSマルチ話者データセットの実験は、MultiSpeechの有効性を実証している。 1) ナイーブトランスフォーマーベースのTTSよりも頑健で高品質なマルチスピーカ音声を合成する。 2) 教師としてのMultiSpeechモデルを用いて, 非常に高速な推論速度を保ちながら, 品質劣化がほぼゼロの強力なマルチスピーカFastSpeechモデルを得る。

Transformer-based text to speech (TTS) models (e.g., Transformer TTS (Li et al., 2019), FastSpeech (Ren et al., 2019)) have shown advantages in training and inference efficiency over RNN-based models (e.g., Tacotron (Shen et al., 2018)) due to their parallel computation in training and/or inference. However, the parallel computation increases the difficulty of learning the alignment between text and speech in Transformer, which is further magnified in the multi-speaker scenario with noisy data and diverse speakers, and hinders the applicability of Transformer for multi-speaker TTS. In this paper, we develop a robust and high-quality multi-speaker Transformer TTS system called MultiSpeech, with several specially designed components/techniques to improve text-to-speech alignment: 1) a diagonal constraint on the weight matrix of encoder-decoder attention in both training and inference; 2) layer normalization on phoneme embedding in encoder to better preserve position information; 3) a bottleneck in decoder pre-net to prevent copy between consecutive speech frames. Experiments on VCTK and LibriTTS multi-speaker datasets demonstrate the effectiveness of MultiSpeech: 1) it synthesizes more robust and better quality multi-speaker voice than naive Transformer based TTS; 2) with a MultiSpeech model as the teacher, we obtain a strong multi-speaker FastSpeech model with almost zero quality degradation while enjoying extremely fast inference speed.

翻訳日:2022-11-24 01:17:12 公開日:2020-08-01
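
One plausible way to quantify the diagonal constraint on encoder-decoder attention mentioned in the abstract above is to measure how much attention mass lies in a band around the diagonal. The sketch below is an assumption about the general idea, not the paper's exact formulation; `diagonal_attention_rate` and its `bandwidth` parameter are hypothetical names.

```python
def diagonal_attention_rate(attn, bandwidth):
    """Fraction of attention mass lying within `bandwidth` of the diagonal.

    attn[t][s] is the weight that decoder frame t puts on input position s.
    """
    n_frames, n_inputs = len(attn), len(attn[0])
    slope = n_inputs / n_frames          # expected input position per frame
    inside = total = 0.0
    for t, row in enumerate(attn):
        center = slope * t
        for s, w in enumerate(row):
            total += w
            if abs(s - center) <= bandwidth:
                inside += w
    return inside / total


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
uniform = [[0.25] * 4 for _ in range(4)]
print(diagonal_attention_rate(identity, bandwidth=0))   # 1.0
print(diagonal_attention_rate(uniform, bandwidth=0))    # 0.25
```

A training loss could then penalize a low rate (for example by adding its negative to the objective), which pushes attention toward a monotonic, roughly diagonal text-to-speech alignment.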

# 教師なしハイパースペクトル超解法におけるカップリングアンミックスネットのクロスアテンション

Cross-Attention in Coupled Unmixing Nets for Unsupervised Hyperspectral Super-Resolution ( http://arxiv.org/abs/2007.05230v3 )

ライセンス: Link先を確認

Jing Yao, Danfeng Hong, Jocelyn Chanussot, Deyu Meng, Xiaoxiang Zhu, Zongben Xu

(参考訳) 近年のディープラーニング技術の進歩は、ハイパースペクトル画像超解像(HSI-SR)に大きな進歩をもたらした。しかし、この課題に対して教師なしのディープネットワークの開発は依然として困難である。そこで本研究では,高空間分解能マルチスペクトル画像(MSI)を用いて,HSIの空間分解能を高めるために,クロスアテンション機構CUCaNetを組み込んだ新しい非混合ネットワークを提案する。スペクトルアンミックスにインスパイアされた2ストリーム畳み込みオートエンコーダフレームワークをバックボーンとしてMSとHSデータをスペクトル的に有意な基底と対応する係数に分解する。 CUCaNetは、ネットワーク上の合理的な一貫性仮定を強制することにより、HS-MS対応からスペクトルおよび空間応答関数を適応的に学習することができる。さらに、ネットワークにおけるより効果的な空間スペクトル情報転送を実現するために、クロスアテンションモジュールが考案された。 HSI-SRモデルと比較して広く使われている3つのHS-MSデータセットに対して大規模な実験を行い、HSI-SRアプリケーションにおけるCUCaNetの優位性を実証した。さらに、コードとデータセットはhttps://github.com/danfenghong/eccv2020_cucanetで利用可能になる。

The recent advancement of deep learning techniques has made great progress on hyperspectral image super-resolution (HSI-SR). Yet the development of unsupervised deep networks remains challenging for this task. To this end, we propose a novel coupled unmixing network with a cross-attention mechanism, CUCaNet for short, to enhance the spatial resolution of an HSI by means of a higher-spatial-resolution multispectral image (MSI). Inspired by coupled spectral unmixing, a two-stream convolutional autoencoder framework is taken as the backbone to jointly decompose MS and HS data into a spectrally meaningful basis and corresponding coefficients. CUCaNet is capable of adaptively learning spectral and spatial response functions from HS-MS correspondences by enforcing reasonable consistency assumptions on the networks. Moreover, a cross-attention module is devised to yield more effective spatial-spectral information transfer in the networks. Extensive experiments are conducted on three widely-used HS-MS datasets in comparison with state-of-the-art HSI-SR models, demonstrating the superiority of CUCaNet in the HSI-SR application. Furthermore, the codes and datasets will be available at: https://github.com/danfenghong/ECCV2020_CUCaNet.

翻訳日:2022-11-11 22:35:45 公開日:2020-08-01
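
The coupled spectral unmixing idea referred to in the abstract above is commonly written with a linear mixing model. The notation below is an assumption chosen for illustration (the symbols $\mathbf{Z}$, $\mathbf{X}$, $\mathbf{Y}$, $\mathbf{S}$, $\mathbf{R}$, $\mathbf{A}$, $\mathbf{E}$ are not taken from the paper).

```latex
% Z: unknown high-resolution HSI (N pixels, L bands)
% X: observed low-resolution HSI (n pixels, L bands), n < N
% Y: observed high-resolution MSI (N pixels, l bands), l < L
\begin{align}
  \mathbf{X} &\approx \mathbf{S}\mathbf{Z}, & \mathbf{S} &\in \mathbb{R}^{n \times N}
      \quad \text{(spatial response)} \\
  \mathbf{Y} &\approx \mathbf{Z}\mathbf{R}, & \mathbf{R} &\in \mathbb{R}^{L \times l}
      \quad \text{(spectral response)} \\
  \mathbf{Z} &\approx \mathbf{A}\mathbf{E}, & \mathbf{A} &\in \mathbb{R}_{\ge 0}^{N \times p},\;
      \mathbf{E} \in \mathbb{R}_{\ge 0}^{p \times L}
      \quad \text{(abundances, endmembers)} \\
  \Rightarrow\quad \mathbf{X} &\approx (\mathbf{S}\mathbf{A})\,\mathbf{E}, &
  \mathbf{Y} &\approx \mathbf{A}\,(\mathbf{E}\mathbf{R})
\end{align}
```

Under these assumptions the two observations share the same endmembers and abundances up to the response functions, which is what lets a two-stream model estimate $\mathbf{A}$ from the MSI and $\mathbf{E}$ from the HSI and recombine them into $\mathbf{Z}$.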

# AirCapRL:Deep Reinforcement Learningを用いた自律飛行型人体モーションキャプチャ

AirCapRL: Autonomous Aerial Human Motion Capture using Deep Reinforcement Learning ( http://arxiv.org/abs/2007.06343v2 )

ライセンス: Link先を確認

Rahul Tallamraju, Nitin Saini, Elia Bonetto, Michael Pabst, Yu Tang Liu, Michael J. Black and Aamir Ahmad

(参考訳) 本稿では,自律型空中人体モーションキャプチャ(MoCap)のための深部強化学習(RL)に基づくマルチロボット生成制御について紹介する。視覚をベースとしたMoCapに焦点をあて,複数のマイクロエアロ車両を用いた1人の移動体の姿勢と形状の軌跡を推定することを目的とする。この問題に対する最先端の解決策は、手作りのシステムと観測モデルに依存する古典的な制御法に基づいている。このようなモデルは、異なるシステム間で導出および一般化することが困難である。さらに、これらのモデルの非線形性や非凸性は、準最適制御につながる。本研究では,視覚に基づくモーションキャプチャ目的を達成するための逐次意思決定タスクとしてこの問題を定式化し,深層ニューラルネットワークを用いたRL法を用いて解決する。我々はPPOを利用して、構成制御のための確率的分散制御ポリシーを訓練する。ニューラルネットワークは、合成環境で並列化されたセットアップでトレーニングされる。我々はアプローチを検証するために広範囲なシミュレーション実験を行った。最後に、実ロボット実験により、我々のポリシーが現実の条件に一般化されることを示した。ビデオリンク: https://bit.ly/38SJfjo 補足: https://bit.ly/3evfo1O

In this letter, we introduce a deep reinforcement learning (RL) based multi-robot formation controller for the task of autonomous aerial human motion capture (MoCap). We focus on vision-based MoCap, where the objective is to estimate the trajectory of body pose and shape of a single moving person using multiple micro aerial vehicles. State-of-the-art solutions to this problem are based on classical control methods, which depend on hand-crafted system and observation models. Such models are difficult to derive and generalize across different systems. Moreover, the non-linearity and non-convexities of these models lead to sub-optimal controls. In our work, we formulate this problem as a sequential decision making task to achieve the vision-based motion capture objectives, and solve it using a deep neural network-based RL method. We leverage proximal policy optimization (PPO) to train a stochastic decentralized control policy for formation control. The neural network is trained in a parallelized setup in synthetic environments. We performed extensive simulation experiments to validate our approach. Finally, real-robot experiments demonstrate that our policies generalize to real world conditions. Video Link: https://bit.ly/38SJfjo Supplementary: https://bit.ly/3evfo1O

翻訳日:2022-11-11 00:51:54 公開日:2020-08-01
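
The abstract above trains its policy with proximal policy optimization (PPO). As a reminder of what that objective looks like, here is a minimal sketch of the standard clipped surrogate from the PPO literature, written for plain Python lists; the function name and the default `eps` are illustrative choices, and nothing here is specific to the formation-control setup of the paper.

```python
def ppo_clip_objective(ratios, advantages, eps=0.2):
    """Mean clipped surrogate: min(r * A, clip(r, 1 - eps, 1 + eps) * A)."""
    total = 0.0
    for r, a in zip(ratios, advantages):
        clipped = min(max(r, 1.0 - eps), 1.0 + eps)
        total += min(r * a, clipped * a)
    return total / len(ratios)


# ratio = new_policy_prob / old_policy_prob for the action actually taken
print(ppo_clip_objective([1.0], [2.0]))    # 2.0: ratio inside the clip range
```

When the probability ratio moves outside `[1 - eps, 1 + eps]` in the direction that would increase the objective, the clipped term takes over and removes the incentive for a larger policy change.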

# 頑健な低階表現による対向ロバスト性

Adversarial robustness via robust low rank representations ( http://arxiv.org/abs/2007.06555v2 )

ライセンス: Link先を確認

Pranjal Awasthi, Himanshu Jain, Ankit Singh Rawat, Aravindan Vijayaraghavan

(参考訳) 逆のロバスト性は、テスト時の入力に対する知覚できない摂動に対する分類器の感受性を測定する。本研究では、画像などの実データに対してしばしば存在する自然な低ランク表現の利点を強調し、確証された堅牢性を保証するニューラルネットワークのトレーニングを行う。最初の貢献は、$\ell_2$ normで測定された摂動に対する認証された堅牢性です。我々は、CIFAR-10やCIFAR-100のような標準ベンチマークデータセットに対して、最先端のランダム化スムーシングに基づくアプローチの改善を保証するために、低ランクデータ表現を利用する。第二の貢献は、$\ell_\infty$ normで測定された摂動に対する証明された堅牢性のより困難な設定である。我々は、自然な低階表現が本質的に堅牢性を持つことを実証的に証明し、それらの表現における$\ell_\infty$摂動に対する証明されたロバスト性を保証するために利用することができる。我々の $\ell_\infty$ robustness の証明は、表現に付随する $\infty \to 2$ matrix operator norm を含む自然量に依存し、ロバストネス保証を $\ell_2$ から $\ell_\infty$ 摂動に変換する。証明保証のための重要な技術的要素は、上記の行列ノルムに上限を与える乗法重み更新法に基づく証明可能な保証付き高速アルゴリズムである。我々のアルゴリズムによる保証は、この問題に対する技術の現状を改善し、独立した関心を持つかもしれない。

Adversarial robustness measures the susceptibility of a classifier to imperceptible perturbations made to the inputs at test time. In this work we highlight the benefits of natural low rank representations that often exist for real data such as images, for training neural networks with certified robustness guarantees. Our first contribution is for certified robustness to perturbations measured in $\ell_2$ norm. We exploit low rank data representations to provide improved guarantees over state-of-the-art randomized smoothing-based approaches on standard benchmark datasets such as CIFAR-10 and CIFAR-100. Our second contribution is for the more challenging setting of certified robustness to perturbations measured in $\ell_\infty$ norm. We demonstrate empirically that natural low rank representations have inherent robustness properties, that can be leveraged to provide significantly better guarantees for certified robustness to $\ell_\infty$ perturbations in those representations. Our certificate of $\ell_\infty$ robustness relies on a natural quantity involving the $\infty \to 2$ matrix operator norm associated with the representation, to translate robustness guarantees from $\ell_2$ to $\ell_\infty$ perturbations. A key technical ingredient for our certification guarantees is a fast algorithm with provable guarantees based on the multiplicative weights update method to provide upper bounds on the above matrix norm. Our algorithmic guarantees improve upon the state of the art for this problem, and may be of independent interest.

翻訳日:2022-11-10 23:42:28 公開日:2020-08-01
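
The role of the $\infty \to 2$ operator norm in the abstract above follows from the inequality $\|P\delta\|_2 \le \|P\|_{\infty \to 2}\,\|\delta\|_\infty$ for a linear representation $P$. The sketch below computes that norm exactly by brute force over sign vectors, which is only feasible for tiny matrices; it is a swapped-in illustration and not the multiplicative-weights algorithm of the paper, and the function names are hypothetical.

```python
import itertools
import math


def inf_to_2_norm(P):
    """max ||P x||_2 over ||x||_inf <= 1, by enumerating sign vectors.

    The maximum of a convex function over the cube is attained at a vertex,
    so checking all 2^d sign patterns is exact (and exponential in d).
    """
    d = len(P[0])
    best = 0.0
    for signs in itertools.product((-1.0, 1.0), repeat=d):
        y = [sum(p * s for p, s in zip(row, signs)) for row in P]
        best = max(best, math.sqrt(sum(v * v for v in y)))
    return best


def linf_radius(l2_radius, P):
    """l_inf input radius implied by an l_2 certificate in representation space."""
    return l2_radius / inf_to_2_norm(P)


print(inf_to_2_norm([[1.0, 1.0], [1.0, -1.0]]))   # 2.0
```

If a classifier acting on $Px$ is certified against $\ell_2$ perturbations of radius $r$ in representation space, then any input perturbation with $\|\delta\|_\infty \le r / \|P\|_{\infty \to 2}$ stays inside that certified ball, which is why an upper bound on this norm translates directly into an $\ell_\infty$ guarantee.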

# 形状CD:形状とニューロンを持つ時系列データにおける変化点検出

Shape-CD: Change-Point Detection in Time-Series Data with Shapes and Neurons ( http://arxiv.org/abs/2007.11985v3 )

ライセンス: Link先を確認

Varsha Suresh, Wei Tsang Ooi

(参考訳) 時系列における変更点検出は、時系列データを生成する未知の物理プロセスが変化した時点を検出することを目的としている。既存の手法は、基礎となるプロセスが複雑になれば精度が低下し、時系列で大量のパターンが生成される。この欠点に対処するため,簡単な高速かつ高精度な変化点検出法であるShape-CDを提案する。 shape-cdは形状に基づく特徴を用いてパターンをモデル化し、条件付きニューラルネットワークを用いて時間領域間の時間相関をモデル化する。最大2000クラスを含む4つの高ダイナミック時系列データセットを用いて,shape-cdの性能評価を行った。形状-CDは従来の手法に比べて精度(AUCでは7-60%高い)と計算速度が向上した。さらに、Shape-CDモデルは数百のパラメータで構成されており、他の深層学習モデルよりも訓練に必要なデータが少ない。

Change-point detection in a time series aims to discover the time points at which some unknown underlying physical process that generates the time-series data has changed. We found that existing approaches become less accurate when the underlying process is complex and generates large varieties of patterns in the time series. To address this shortcoming, we propose Shape-CD, a simple, fast, and accurate change-point detection method. Shape-CD uses shape-based features to model the patterns and a conditional neural field to model the temporal correlations among the time regions. We evaluated the performance of Shape-CD using four highly dynamic time-series datasets, including the ExtraSensory dataset with up to 2000 classes. Shape-CD demonstrated improved accuracy (7-60% higher in AUC) and faster computational speed compared to existing approaches. Furthermore, the Shape-CD model consists of only hundreds of parameters and requires less data to train than other deep supervised learning models.

翻訳日:2022-11-07 23:16:10 公開日:2020-08-01

# CelebA-Spoof: リッチアノテーション付き大規模顔アンチスプーフデータセット

CelebA-Spoof: Large-Scale Face Anti-Spoofing Dataset with Rich Annotations ( http://arxiv.org/abs/2007.12342v3 )

ライセンス: Link先を確認

Yuanhan Zhang, Zhenfei Yin, Yidong Li, Guojun Yin, Junjie Yan, Jing Shao, and Ziwei Liu

(参考訳) 顔とのインタラクションシステムが普及するにつれ、これらのシステムのセキュリティと信頼性は重要な問題となり、かなりの研究が費やされる。そのうち、顔の偽造は重要な領域として現れ、その目的は提示された顔が生きているか偽造なのかを特定することである。有望な進歩は達成されたが、既存の作品では複雑なspoof攻撃の処理や現実のシナリオへの一般化が困難である。主な理由は、現在の対spoofingデータセットは量と多様性の両方に制限があるためである。これらの障害を克服するために,大規模な対スプーフ対策データセットceleba-spoofに,次のような魅力を付与する。 1)量:celeba-spoofは10,177人の625,537枚の画像からなる。 2)多様性:スプーフ画像は10以上のセンサーで8つのシーン(2つの環境*4つの照明条件)から撮影される。 3) アノテーションのリッチ性: CelebA-Spoofには10のspoof型アノテーションと、オリジナルのCelebAデータセットから継承された40の属性アノテーションが含まれている。 CelebA-Spoof と組み合わせた統合マルチタスクフレームワークである Auxiliary Information Embedding Network (AENet) の既存手法を慎重にベンチマークし、いくつかの貴重な観測結果を明らかにする。

As facial interaction systems are widely deployed, the security and reliability of these systems have become a critical issue that attracts substantial research effort. Within this area, face anti-spoofing has emerged as an important topic, whose objective is to identify whether a presented face is live or spoofed. Though promising progress has been achieved, existing works still have difficulty in handling complex spoof attacks and generalizing to real-world scenarios. The main reason is that current face anti-spoofing datasets are limited in both quantity and diversity. To overcome these obstacles, we contribute a large-scale face anti-spoofing dataset, CelebA-Spoof, with the following appealing properties: 1) Quantity: CelebA-Spoof comprises 625,537 pictures of 10,177 subjects, significantly larger than the existing datasets. 2) Diversity: The spoof images are captured from 8 scenes (2 environments * 4 illumination conditions) with more than 10 sensors. 3) Annotation Richness: CelebA-Spoof contains 10 spoof type annotations, as well as the 40 attribute annotations inherited from the original CelebA dataset. Equipped with CelebA-Spoof, we carefully benchmark existing methods in a unified multi-task framework, Auxiliary Information Embedding Network (AENet), and reveal several valuable observations.

翻訳日:2022-11-07 06:50:26 公開日:2020-08-01

# 非線形最小二乗問題に対する拡張微分自由最適化

Scalable Derivative-Free Optimization for Nonlinear Least-Squares Problems ( http://arxiv.org/abs/2007.13243v2 )

ライセンス: Link先を確認

Coralia Cartis and Tyler Ferguson and Lindon Roberts

(参考訳) 微分自由(あるいはゼロオーダー)最適化(DFO)は、機械学習を含むさまざまなアプリケーション領域で、特に確率的で計算に高価な目的を含む問題を解く能力において、近年注目を集めている。本研究では,非線形最小二乗問題を解くためのモデルに基づく新しいDFO法を提案する。スケッチ手法を用いて観測空間の次元的低減を行い,局所モデル全体の構築を回避し,最先端のDFOを改善する。提案手法は,ビッグデータシステムにおける問題次元の線形化を図り,既存のソフトウェアと比較して,過度に決定された最小二乗問題に対する実行性能が劇的に向上したことを示す数値的証拠である。

Derivative-free - or zeroth-order - optimization (DFO) has gained recent attention for its ability to solve problems in a variety of application areas, including machine learning, particularly involving objectives which are stochastic and/or expensive to compute. In this work, we develop a novel model-based DFO method for solving nonlinear least-squares problems. We improve on state-of-the-art DFO by performing dimensionality reduction in the observational space using sketching methods, avoiding the construction of a full local model. Our approach has a per-iteration computational cost which is linear in problem dimension in a big data regime, and numerical evidence demonstrates that, compared to existing software, it has dramatically improved runtime performance on overdetermined least-squares problems.

翻訳日:2022-11-06 20:17:33 公開日:2020-08-01
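
The idea of reducing dimension in the observation space, as described in the abstract above, can be written generically as a sketched Gauss-Newton-type subproblem. The formulation below is an illustrative assumption in standard notation (the sketching matrix $A_k$, its size $p$, and the trust-region radius $\Delta_k$ are hypothetical symbols); the paper's exact construction may differ.

```latex
% Nonlinear least squares with m residuals in n unknowns
\begin{align}
  \min_{x \in \mathbb{R}^n} \; f(x) &= \| r(x) \|_2^2,
      \qquad r : \mathbb{R}^n \to \mathbb{R}^m, \quad m \gg n \\
% Local linear model; in DFO, J_k is built from function values only
  r(x_k + s) &\approx r(x_k) + J_k s \\
% Sketched subproblem with A_k in R^{p x m}, p << m
  s_k &\in \arg\min_{\|s\|_2 \le \Delta_k}
      \big\| A_k \big( r(x_k) + J_k s \big) \big\|_2^2
\end{align}
```

Only the $p$ sketched residuals $A_k r(\cdot)$ need to be modeled at each iteration instead of all $m$, which is where a per-iteration cost that scales mildly with the problem size can come from.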

# 地磁気嵐予測のための脳感情学習に基づく予測モデル

Brain Emotional Learning-based Prediction Model For the Prediction of Geomagnetic Storms ( http://arxiv.org/abs/2007.15579v2 )

ライセンス: Link先を確認

Mahboobeh Parsapoor

(参考訳) 本研究では,地磁気嵐予測のための新しいデータ駆動モデルを提案する。脳情緒学習インスパイアされたモデル(BELIM)の例であるモデルは、脳情緒学習ベース予測モデル(BELPM)として知られている。 BELPMは4つの主要なサブシステムから構成されており、これらのサブシステム間の接続は感情システムの対応する領域間の接続を模倣している。これらのサブシステムの機能は適応ネットワークを用いて説明される。 BELPMの学習アルゴリズムは、最急降下法(SD)と最小二乗推定器(LSE)を用いて定義される。 BELPMは、Auroral Electrojet (AE) IndexとDisturbance Storm Time (Dst) Indexという2つの磁気指標を用いて、地磁気嵐を予測するために使用される。 BELPMの性能を評価するため,ANFIS,WKNN,その他のBELIMと比較した。その結果,BELPMは短期および長期の地磁気嵐予測において妥当な精度を達成できることを確認した。

This study suggests a new data-driven model for the prediction of geomagnetic storms. The model, which is an instance of Brain Emotional Learning Inspired Models (BELIMs), is known as the Brain Emotional Learning-based Prediction Model (BELPM). BELPM consists of four main subsystems; the connections between these subsystems mimic those between the corresponding regions of the emotional system. The functions of these subsystems are explained using adaptive networks. The learning algorithm of BELPM is defined using the steepest descent (SD) and the least square estimator (LSE). BELPM is employed to predict geomagnetic storms using two geomagnetic indices, the Auroral Electrojet (AE) Index and the Disturbance Storm Time (Dst) Index. To evaluate the performance of BELPM, the obtained results have been compared with ANFIS, WKNN and other instances of BELIMs. The results verify that BELPM can achieve reasonable accuracy for both short-term and long-term prediction of geomagnetic storms.

翻訳日:2022-11-06 02:19:45 公開日:2020-08-01

# 楽譜評価のためのスコアインフォームドネットワーク

Score-informed Networks for Music Performance Assessment ( http://arxiv.org/abs/2008.00203v1 )

ライセンス: Link先を確認

Jiawen Huang, Yun-Ning Hung, Ashis Pati, Siddharth Kumar Gururani, Alexander Lerch

(参考訳) ほとんどの場合の音楽演奏の評価は、演奏中の楽譜の基盤を考慮に入れている。演奏音声と楽譜の両方から抽出した特徴に基づく客観的音楽演奏評価(MPA)の自動手法はいくつかあるが,MPAモデルにスコア情報を組み込んだディープニューラルネットワークによる手法はまだ検討されていない。本稿では,スコアインフォームド性能評価が可能な3つのモデルを提案する。これらは (i)調整されたピッチ輪郭とスコアからなる単純な時系列入力を利用する畳み込みニューラルネットワーク (二)ピッチ輪郭と楽譜のジョイント潜在空間を学習するジョイント埋め込みモデル (iii)ピッチ輪郭と楽譜間の距離行列のパターンを利用して評価評価を行う距離行列に基づく畳み込みニューラルネットワーク。本結果は,異なるアーキテクチャと入力表現の適合性に関する知見を提供し,スコア非依存モデルと比較して,スコアインフォームドモデルの利点を示す。

The assessment of music performances in most cases takes into account the underlying musical score being performed. While there have been several automatic approaches for objective music performance assessment (MPA) based on extracted features from both the performance audio and the score, deep neural network-based methods incorporating score information into MPA models have not yet been investigated. In this paper, we introduce three different models capable of score-informed performance assessment. These are (i) a convolutional neural network that utilizes a simple time-series input comprising aligned pitch contours and score, (ii) a joint embedding model which learns a joint latent space for pitch contours and scores, and (iii) a distance matrix-based convolutional neural network which utilizes patterns in the distance matrix between pitch contours and musical score to predict assessment ratings. Our results provide insights into the suitability of different architectures and input representations and demonstrate the benefits of score-informed models as compared to score-independent models.

翻訳日:2022-11-04 01:19:01 公開日:2020-08-01
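
The input to model (iii) in the abstract above is a distance matrix between the performed pitch contour and the score. A minimal sketch of such a matrix follows, assuming both sequences are given as pitch values in MIDI note numbers and that absolute difference is the distance; both assumptions, and the toy values, are illustrative rather than taken from the paper.

```python
def distance_matrix(pitch_contour, score_pitches):
    """D[i][j] = |performed pitch at frame i - score pitch at position j|."""
    return [[abs(p - s) for s in score_pitches] for p in pitch_contour]


performance = [60.2, 62.1, 63.9]   # toy pitch contour (MIDI note numbers)
score = [60.0, 62.0, 64.0]         # toy score pitches
D = distance_matrix(performance, score)
```

An accurate, well-timed performance produces small values along the alignment path of `D` (here the main diagonal), while pitch errors or timing drift show up as larger or displaced values, which is the kind of pattern a convolutional network can pick up.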

# インテリジェントモノのインターネットのための深層強化学習に基づくモバイルエッジコンピューティング

Deep Reinforcement Learning Based Mobile Edge Computing for Intelligent Internet of Things ( http://arxiv.org/abs/2008.00250v1 )

ライセンス: Link先を確認

Rui Zhao, Xinjie Wang, Junjuan Xia, and Liseng Fan

(参考訳) 本稿では,複数のユーザが複数の計算アクセスポイント(CAP)の支援を受けて計算タスクを処理する,インテリジェントなモノのインターネット(IoT)のためのモバイルエッジコンピューティング(MEC)ネットワークについて検討する。一部のタスクをCAPにオフロードすることで、MECネットワークにおける2つの重要な指標であるレイテンシとエネルギー消費を削減し、システム性能を改善することができる。我々は,深層強化学習アルゴリズムによってオフロード戦略をインテリジェントに決定するシステムを考案する。このアルゴリズムでは、システム性能を最適化するためにDeep Q-Networkを用いてオフロード決定を自動的に学習し、ニューラルネットワーク(NN)を訓練してオフロード動作を予測する。訓練データは環境システムから生成される。また,ユーザとCAP間のリンクの無線スペクトルを最適化するために帯域割り当てを行い,複数の帯域割り当て方式を提案する。さらに、ユーザの計算タスクを支援する最良のCAPを1つ選ぶために、CAP選択を用いる。最後に,シミュレーション結果により,提案した強化学習オフロード戦略の有効性を示す。特に、提案した深層強化学習ベースのアルゴリズムにより、レイテンシとエネルギー消費からなるシステムコストを大幅に削減できる。

In this paper, we investigate mobile edge computing (MEC) networks for intelligent internet of things (IoT), where multiple users have some computational tasks assisted by multiple computational access points (CAPs). By offloading some tasks to the CAPs, the system performance can be improved through reducing the latency and energy consumption, which are the two important metrics of interest in the MEC networks. We devise the system by proposing the offloading strategy intelligently through the deep reinforcement learning algorithm. In this algorithm, Deep Q-Network is used to automatically learn the offloading decision in order to optimize the system performance, and a neural network (NN) is trained to predict the offloading action, where the training data is generated from the environmental system. Moreover, we employ the bandwidth allocation in order to optimize the wireless spectrum for the links between the users and CAPs, where several bandwidth allocation schemes are proposed. In further, we use the CAP selection in order to choose one best CAP to assist the computational tasks from the users. Simulation results are finally presented to show the effectiveness of the proposed reinforcement learning offloading strategy. In particular, the system cost of latency and energy consumption can be reduced significantly by the proposed deep reinforcement learning based algorithm.

翻訳日:2022-11-04 01:18:38 公開日:2020-08-01
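
The abstract above describes a system cost made up of latency and energy consumption, which the learned offloading policy tries to reduce. As a rough illustration only, the sketch below uses a weighted sum as that cost and picks the cheapest option by direct comparison. The weighted-sum form, the function names, and the numbers are assumptions; the paper learns the decision with a Deep Q-Network rather than enumerating options like this.

```python
def system_cost(latency, energy, weight=0.5):
    """Weighted sum of latency and energy (assumed form of the system cost)."""
    return weight * latency + (1.0 - weight) * energy


def best_action(options, weight=0.5):
    """Return the name of the option with the lowest system cost."""
    return min(options, key=lambda name: system_cost(*options[name], weight))


# Hypothetical (latency, energy) pairs for local execution and two CAPs.
options = {
    "local": (4.0, 3.0),
    "cap_1": (2.0, 1.5),
    "cap_2": (3.0, 1.0),
}

choice = best_action(options, weight=0.5)
print(choice)
```

Changing `weight` shifts the preference between latency and energy, which is the trade-off an offloading policy has to resolve.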

# マルチオミクスデータに対する2段階罰則付きロジスティック回帰と心代謝症候群への応用

Two-step penalised logistic regression for multi-omic data with an application to cardiometabolic syndrome ( http://arxiv.org/abs/2008.00235v1 )

ライセンス: Link先を確認

Alessandra Cabassi, Denis Seyres, Mattia Frontini, Paul D. W. Kirk

(参考訳) 高次元マルチオミクスデータセットに基づいて二値クラスラベルを予測する分類モデルの構築には、予測変数の数、データの種類、ノイズレベルといった点でデータ層の特性が大きく異なることが多いため、いくつかの課題がある。これまでの研究では、これらのデータセットにエラスティックネット罰則付きの古典的ロジスティック回帰を適用すると、良好な結果が得られない場合があることが示されている(Liu et al., 2018)。本稿では,まず各層で個別に変数選択を行い,次に第1ステップで選択した変数を用いて予測モデルを構築する,マルチオミクスロジスティック回帰に対する2段階のアプローチを実装する。本手法を同一目的で開発された他の手法と比較し,マルチオミクス線形回帰のための既存ソフトウェア(Zhao and Zucknick, 2020)をロジスティック回帰設定に適応させる。広範なシミュレーション研究により,最良の競合手法と同等の予測性能を達成しつつ,関連する予測変数を可能な限り多く選択することが目的である場合には,提案手法が望ましいことが示された。動機となった例は,2つの極端な表現型グループ(肥満10名,リポジストロフィー10名)と185名の献血者について8種類のオミクスデータ型からなる心代謝症候群データセットである。提案手法により,分子レベルで心代謝症候群を特徴づける特徴を同定することができる。 Rコードはhttps://github.com/acabassi/logistic-regression-for-multi-omic-dataで入手できる。

Building classification models that predict a binary class label on the basis of high dimensional multi-omics datasets poses several challenges, due to the typically widely differing characteristics of the data layers in terms of number of predictors, type of data, and levels of noise. Previous research has shown that applying classical logistic regression with elastic-net penalty to these datasets can lead to poor results (Liu et al., 2018). We implement a two-step approach to multi-omic logistic regression in which variable selection is performed on each layer separately and a predictive model is then built using the variables selected in the first step. Here, our approach is compared to other methods that have been developed for the same purpose, and we adapt existing software for multi-omic linear regression (Zhao and Zucknick, 2020) to the logistic regression setting. Extensive simulation studies show that our approach should be preferred if the goal is to select as many relevant predictors as possible, as well as achieving prediction performances comparable to those of the best competitors. Our motivating example is a cardiometabolic syndrome dataset comprising eight 'omic data types for 2 extreme phenotype groups (10 obese and 10 lipodystrophy individuals) and 185 blood donors. Our proposed approach allows us to identify features that characterise cardiometabolic syndrome at the molecular level. R code is available at https://github.com/acabassi/logistic-regression-for-multi-omic-data.

翻訳日:2022-11-04 01:15:06 公開日:2020-08-01

# 深層学習モデルに基づくデータシノプシス管理

Data Synopses Management based on a Deep Learning Model ( http://arxiv.org/abs/2008.01560v1 )

ライセンス: Link先を確認

Panagiotis Fountas, Kostas Kolomvatsos, Christos Anagnostopoulos

(参考訳) パーベイシブコンピューティングは、インテリジェントアプリケーションをサポートするためにエンドユーザの近くに処理サービスを配置する。 IoT(Internet of Things)とエッジコンピューティング(Edge Computing, EC)の出現により、これらのインフラストラクチャの相互接続におけるさまざまなポイントにサービスを配置する余地が生まれている。特に重要なのは収集したデータの処理である。このような処理は、IoTデバイスと比較して高い計算能力を持つECノード上で実現できる。ECにはインテリジェントノードのエコシステムが形成され、協調モデルをサポートする機会が得られる。ノードは、IoTデバイスのレポートによって形成された地理的分散データセットのホストになる。データセットに対しては、多数のクエリ/タスクを実行できる。クエリ/タスクは性能上の理由からオフロードできる。しかしながら、オフロードアクションは、ホストノードに存在するデータと常に整合するように慎重に設計されるべきである。本稿では,ECインフラにおける協調的側面を支援するためのモデルを提案する。データシノプシスをECノードに配信することで、ピアに存在するデータと完全に整合したオフロード決定を行えるようにする。ノードはデータシノプシスを交換してピアに知らせる。IoTデバイスがECノードにデータを報告する頻度が高いためにシノプシスが頻繁に抽出される場合には特に、ネットワークの過負荷を避ける必要がある。そこで本稿では,シノプシスを配布する適切なタイミングを検出する手法を提案する。我々のアプローチでは、計算されたシノプシスの分布を学習し、将来の傾向を推定するディープラーニングモデルを用いる。これらの傾向に基づいて、ピアノードにシノプシスを配信する適切なタイミングを見つけることができる。提案するメカニズムを記述し,実際のデータセットに基づいて評価する。様々なシナリオに対する広範な実験は、数値結果を示すことでアプローチの長所と短所を明らかにする。

Pervasive computing involves the placement of processing services close to end users to support intelligent applications. With the advent of the Internet of Things (IoT) and the Edge Computing (EC), one can find room for placing services at various points in the interconnection of the aforementioned infrastructures. Of significant importance is the processing of the collected data. Such a processing can be realized upon the EC nodes that exhibit increased computational capabilities compared to IoT devices. An ecosystem of intelligent nodes is created at the EC giving the opportunity to support cooperative models. Nodes become the hosts of geo-distributed datasets formulated by the IoT devices reports. Upon the datasets, a number of queries/tasks can be executed. Queries/tasks can be offloaded for performance reasons. However, an offloading action should be carefully designed being always aligned with the data present to the hosting node. In this paper, we present a model to support the cooperative aspect in the EC infrastructure. We argue on the delivery of data synopses to EC nodes making them capable to take offloading decisions fully aligned with data present at peers. Nodes exchange data synopses to inform their peers. We propose a scheme that detects the appropriate time to distribute synopses trying to avoid the network overloading especially when synopses are frequently extracted due to the high rates at which IoT devices report data to EC nodes. Our approach involves a Deep Learning model for learning the distribution of calculated synopses and estimate future trends. Upon these trends, we are able to find the appropriate time to deliver synopses to peer nodes. We provide the description of the proposed mechanism and evaluate it based on real datasets. An extensive experimentation upon various scenarios reveals the pros and cons of the approach by giving numerical results.

翻訳日:2022-11-04 01:13:42 公開日:2020-08-01

# LDAとLSTMモデルを用いた2007年から2019年までのニューヨーク市の混雑課金に対する世論と重要グループの研究

Using LDA and LSTM Models to Study Public Opinions and Critical Groups Towards Congestion Pricing in New York City through 2007 to 2019 ( http://arxiv.org/abs/2008.07366v1 )

ライセンス: Link先を確認

Qian Ye, Xiaohong Chen, Onur Kalan, and Kaan Ozbay

(参考訳) 本研究は,ニューヨーク市の混雑課金の提案に対する人々の見方や反応が時間とともにどのように変化してきたかを考察する。これらの反応を理解するために、Twitterデータを収集し分析する。アクティブなユーザと最も言及されたアカウントを統計的に分析することで、繰り返される提案プロセスにおける重要なグループを検出し、テキストマイニングと、LDAトピックモデリングやLSTM感情分類を含むハイブリッドな自然言語処理技術により、長年にわたる人々の態度や関心の傾向を特定する。その結果、複数の利害団体が関与し、特に市長と知事、MTA、および外郭区(outer-borough)の代表者が重要な役割を果たしたことが示された。大衆の関心の焦点は、計画の詳細から、より広い都市の持続可能性と公平性へと移った。さらに、計画の承認はいくつかの要素、すなわち政治的プロセスで達した共同合意、現実世界での強い動機、複数の利益のバランスに基づくスキーム、そして課金(tolling)の利益と必要性に対する各グループの認識に依存する。

This study explores how people's views of and responses to the NYC congestion pricing proposals evolve over time. To understand these responses, Twitter data is collected and analyzed. Critical groups in the recurrent process are detected by statistically analyzing the active users and the most mentioned accounts, and the trends of people's attitudes and concerns over the years are identified with text mining and hybrid Natural Language Processing techniques, including LDA topic modeling and LSTM sentiment classification. The results show that multiple interest groups were involved and played crucial roles during the proposal, especially the Mayor and Governor, the MTA, and outer-borough representatives. The public shifted its focus of concern from the plan details to the wider sustainability and fairness of the city. Furthermore, the plan's approval relies on several elements: the joint agreement reached in the political process, strong motivation in the real world, a scheme based on balancing multiple interests, and groups' awareness of tolling's benefits and necessity.

翻訳日:2022-11-04 01:13:17 公開日:2020-08-01

# プリフェッチャ適応のためにカスタマイズされたランダムフォレスト群

Custom Tailored Suite of Random Forests for Prefetcher Adaptation ( http://arxiv.org/abs/2008.00176v1 )

ライセンス: Link先を確認

Furkan Eris, Sadullah Canakci, Cansu Demirkiran, Ajay Joshi

(参考訳) メモリとプロセッサ間のギャップを埋め、性能を向上させるために、データ/命令プリフェッチャの設計に関して多くの研究が行われてきた。プリフェッチャはメモリ階層の各レベルに配置されるが、通常、各プリフェッチャはシステム内の他のプリフェッチャを包括的に考慮せずに設計される。その結果、これら個別のプリフェッチャ設計は必ずしも互いを補完せず、平均的な性能向上が小さくなったり、多くの負の外れ値が生じたりする。本研究では,一連のランダムフォレストを用いて,各メモリレベルでどのプリフェッチャをオンにすべきかを実行時に決定し,それらが互いに補完し合うようにするハードウェアプリフェッチャアダプタであるSuitAP(Suite of random forests for Adaptation of Prefetcher system configuration)を提案する。プリフェッチャのない設計と比較して、SuitAPを使うことで、SPEC2017スイートから生成されたトレースの平均でIPCを46%改善し、オーバーヘッドは12KBである。さらに,SuitAPを用いることで負の外れ値も低減する。

To close the gap between memory and processors, and in turn improve performance, there has been an abundance of work in the area of data/instruction prefetcher designs. Prefetchers are deployed in each level of the memory hierarchy, but typically, each prefetcher gets designed without comprehensively accounting for other prefetchers in the system. As a result, these individual prefetcher designs do not always complement each other, and that leads to low average performance gains and/or many negative outliers. In this work, we propose SuitAP (Suite of random forests for Adaptation of Prefetcher system configuration), which is a hardware prefetcher adapter that uses a suite of random forests to determine at runtime which prefetcher should be ON at each memory level, such that they complement each other. Compared to a design with no prefetchers, using SuitAP we improve IPC by 46% on average across traces generated from SPEC2017 suite with 12KB overhead. Moreover, we also reduce negative outliers using SuitAP.

翻訳日:2022-11-04 01:12:11 公開日:2020-08-01

# NFVシステムの障害を考慮したサービスチェーン構成:ゲーム理論の視点から

Service Chain Composition with Failures in NFV Systems: A Game-Theoretic Perspective ( http://arxiv.org/abs/2008.00208v1 )

ライセンス: Link先を確認

Simeng Bian, Xi Huang, Ziyu Shao, Xin Gao, Yang Yang

(参考訳) 最先端のネットワーク機能仮想化(NFV)システムでは、超低リクエストレイテンシと最小のネットワーク混雑で異なるネットワークサービス(NS)に対して効果的なサービスチェーン構成を行うことが依然として重要な課題である。この目的のために、既存のソリューションはネットワーク状態の完全な知識を必要とすることが多く、プライバシーの問題を無視し、ユーザの非協力的な振る舞いを見落としている。さらに、ユーザが利用不能になる、仮想マシンが停止するといった予期せぬ障害に対しては、十分に対処できない可能性がある。本稿では,障害を伴うNFVシステムにおけるサービスチェーン構成の問題を非協力ゲームとして定式化する。このゲームが重み付きポテンシャルゲームであることを示し,問題固有の構造を利用することで、異なるNSのサービスチェーン構成を、ほぼ最適なレイテンシと最小の混雑を両立するナッシュ均衡(NE)状態へ導く2つの効果的な分散スキームを提案する。さらに, 比較対象として,深層強化学習(DRL)とモンテカルロ木探索(MCTS)にそれぞれ基づく2つの新しい学習支援スキームを開発した。理論的解析とシミュレーション結果により,提案手法の有効性と障害発生時の適応性を示す。

For state-of-the-art network function virtualization (NFV) systems, it remains a key challenge to conduct effective service chain composition for different network services (NSs) with ultra-low request latencies and minimum network congestion. To this end, existing solutions often require full knowledge of the network state, while ignoring the privacy issues and overlooking the non-cooperative behaviors of users. What is more, they may fall short in the face of unexpected failures such as user unavailability and virtual machine breakdown. In this paper, we formulate the problem of service chain composition in NFV systems with failures as a non-cooperative game. By showing that such a game is a weighted potential game and exploiting the unique problem structure, we propose two effective distributed schemes that guide the service chain compositions of different NSs towards the Nash equilibrium (NE) state with both near-optimal latencies and minimum congestion. Besides, we develop two novel learning-aided schemes as comparisons, which are based on deep reinforcement learning (DRL) and Monte Carlo tree search (MCTS) techniques, respectively. Our theoretical analysis and simulation results demonstrate the effectiveness of our proposed schemes, as well as the adaptivity when faced with failures.

翻訳日:2022-11-04 01:06:14 公開日:2020-08-01
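
The abstract above relies on the fact that, in a potential game, users who repeatedly switch to a cheaper option eventually reach a Nash equilibrium. The sketch below illustrates that idea with a plain (unweighted) congestion game solved by best-response dynamics. It is not the authors' distributed scheme: the linear delay model, the function names, and the numbers are assumptions chosen to keep the example small.

```python
def user_cost(server, current, load, unit_delay):
    """Delay a user would see on `server`, given it currently sits on `current`."""
    users_there = load[server] if server == current else load[server] + 1
    return unit_delay[server] * users_there


def server_loads(choice, num_servers):
    load = [0] * num_servers
    for server in choice:
        load[server] += 1
    return load


def best_response_dynamics(num_users, unit_delay, max_rounds=100):
    """Let users switch servers one at a time until nobody can improve."""
    num_servers = len(unit_delay)
    choice = [0] * num_users  # every user starts on server 0
    for _ in range(max_rounds):
        changed = False
        for user in range(num_users):
            load = server_loads(choice, num_servers)
            current = choice[user]
            best = min(range(num_servers),
                       key=lambda s: user_cost(s, current, load, unit_delay))
            if user_cost(best, current, load, unit_delay) < \
                    user_cost(current, current, load, unit_delay):
                choice[user] = best
                changed = True
        if not changed:
            break
    return choice


def is_nash(choice, unit_delay):
    """True if no single user can lower its delay by switching servers."""
    load = server_loads(choice, len(unit_delay))
    return all(
        user_cost(current, current, load, unit_delay)
        <= user_cost(s, current, load, unit_delay)
        for current in choice
        for s in range(len(unit_delay))
    )


# Hypothetical instance: 6 users, server 1 is twice as slow per user as server 0.
assignment = best_response_dynamics(num_users=6, unit_delay=[1.0, 2.0])
print(assignment, is_nash(assignment, [1.0, 2.0]))
```

Each switch strictly lowers the potential function of the congestion game, which is why the loop terminates without any central coordinator.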

# 畳み込みニューラルネットワークを用いた糖尿病網膜症の診断

Diabetic Retinopathy Diagnosis based on Convolutional Neural Network ( http://arxiv.org/abs/2008.00148v1 )

ライセンス: Link先を確認

Mohammed Hamzah Abed, Lamia Abed Noor Muhammed, Sarah Hussein Toman

(参考訳) 糖尿病網膜症(DR)は、加齢や糖尿病の結果として多くの人にみられる疾患であり、失明を引き起こす可能性がある。そのため、特に早期にこの疾患を診断することで、多くの患者への影響を防ぐことができる。この診断には、網膜を継続的に検査する必要がある。したがって、コンピュータビジョン技術に基づくコンピュータ支援ツールをこの分野で利用できる。これまで様々な機械学習技術を用いた研究が行われている。畳み込みニューラルネットワーク(CNN)は有望な手法の1つであり、本稿では糖尿病網膜症の検出に用いた。また、本研究は前処理フェーズに視覚的強調を含み、その後CNNモデルを認識・分類フェーズ向けに訓練して、健康な網膜画像と不健康な網膜画像を診断する。 3つの公開データセット DiaretDB0, DiaretDB1, DrimDB を実用的なテストに使用した。実装はMatlab R2019a、Deep Learning Toolbox、Deep Network Designerに基づき、畳み込みニューラルネットワークのアーキテクチャを設計して訓練した。結果は複数の指標で評価し、正確さ(accuracy)はその1つである。達成された最高精度は、DiaretDB0で100%、DiaretDB1で99.495%、DrimDBで97.55%である。

Diabetic retinopathy (DR) is a common disease among many people as a result of age or diabetes, and it can cause blindness. Therefore, diagnosing this disease, especially at an early stage, can prevent its effects for many patients. To achieve this diagnosis, the retina must be examined continuously. Therefore, computer-aided tools based on computer vision techniques can be used in the field. Different works have been performed using various machine learning techniques. The convolutional neural network (CNN) is one of the promising methods, so it is used for diabetic retinopathy detection in this paper. The proposed work also contains visual enhancement in the pre-processing phase; the CNN model is then trained for the recognition and classification phase, to diagnose healthy and unhealthy retina images. Three public datasets, DiaretDB0, DiaretDB1 and DrimDB, were used in practical testing. The implementation of this work is based on Matlab R2019a, the Deep Learning Toolbox and Deep Network Designer, which were used to design the architecture of the convolutional neural network and train it. The results were evaluated with different metrics; accuracy is one of them. The best accuracy achieved was 100% for DiaretDB0, 99.495% for DiaretDB1 and 97.55% for DrimDB.

翻訳日:2022-11-04 01:05:25 公開日:2020-08-01

# 画像分割用ファジィアクティブ輪郭モデルの現状

State-of-The-Art Fuzzy Active Contour Models for Image Segmentation ( http://arxiv.org/abs/2008.00175v1 )

ライセンス: Link先を確認

Ajoy Mondal and Kuntal Ghosh

(参考訳) イメージセグメンテーションは、すべての画像解析タスクの最初のステップである。数十年の間に、様々なセグメンテーションアルゴリズムが文献で提案され、ある程度の成功を収めた。その中で、ファジィエネルギーに基づくアクティブな輪郭モデルが過去10年間に研究者に注目され、様々な方法が開発されている。良いセグメンテーションアルゴリズムは、ノイズ、ぼかし、低コントラスト、領域内均一性などを含む多数の画像でうまく機能するべきである。しかし、既存のファジィエネルギーに基づく活動輪郭モデルの性能は、通常、限られた数の画像に基づいて評価されている。本稿では,既存のファジィアクティブな輪郭モデルについて理論的観点から検討し,様々な条件下での大規模な画像に対して実験的に評価することを目的とする。様々な画像に基づく解析は、様々なファジィアクティブ輪郭モデルの強みと弱みについて客観的な洞察を与える。最後に,本トピックに関する課題と今後の研究方向性について考察する。

Image segmentation is the initial step for every image analysis task. A large variety of segmentation algorithm has been proposed in the literature during several decades with some mixed success. Among them, the fuzzy energy based active contour models get attention to the researchers during last decade which results in development of various methods. A good segmentation algorithm should perform well in a large number of images containing noise, blur, low contrast, region in-homogeneity, etc. However, the performances of the most of the existing fuzzy energy based active contour models have been evaluated typically on the limited number of images. In this article, our aim is to review the existing fuzzy active contour models from the theoretical point of view and also evaluate them experimentally on a large set of images under the various conditions. The analysis under a large variety of images provides objective insight into the strengths and weaknesses of various fuzzy active contour models. Finally, we discuss several issues and future research direction on this particular topic.

翻訳日:2022-11-04 01:05:04 公開日:2020-08-01

# PERCH 2.0 : オブジェクト姿勢推定のための探索による高速かつ高精度なGPUベース認識

PERCH 2.0 : Fast and Accurate GPU-based Perception via Search for Object Pose Estimation ( http://arxiv.org/abs/2008.00326v1 )

ライセンス: Link先を確認

Aditya Agarwal, Yupeng Han, Maxim Likhachev

(参考訳) 既知のオブジェクトの姿勢推定は、ロボットの把持や操作といったタスクの基礎である。確実な把持の必要性は、動的環境における乱雑で遮蔽のあるシーンの姿勢推定に厳しい精度要件を課す。現代の手法では,3次元モデルと観測データとの対応を見つけるために,大量のトレーニングデータを用いて特徴を学習する。しかし、これらの方法は正解姿勢の広範なアノテーションを必要とする。別の方法として、レンダリング可能なシーンの空間の中から、観察されたシーンを最もよく説明するものを探索するアルゴリズムを使う方法がある。最近開発された PERCH (PErception Via SeaRCH) アルゴリズムは、深度データを用い、特別に構築された木を探索することで、大域的最適解に収束する。 PERCHは精度に強い保証を提供するが、現在の定式化は実行時間が長いためスケーラビリティが低い。さらに、姿勢推定を深度データのみに依存しているため、アルゴリズムの適用は同じ形状のオブジェクトが2つ存在しないシーンに制限される。本稿では,GPUアクセラレーションとRGBデータを活用する新しい探索による認識手法であるPERCH 2.0を提案する。本手法は,PERCHに対して100倍の高速化を達成でき,トレーニングデータに正解姿勢をアノテートする必要なしに,6自由度姿勢推定における最先端のデータ駆動アプローチよりも高い精度が得られることを示す。私たちのコードとビデオはhttps://sbpl-cruz.github.io/perception/で閲覧できます。

Pose estimation of known objects is fundamental to tasks such as robotic grasping and manipulation. The need for reliable grasping imposes stringent accuracy requirements on pose estimation in cluttered, occluded scenes in dynamic environments. Modern methods employ large sets of training data to learn features in order to find correspondence between 3D models and observed data. However these methods require extensive annotation of ground truth poses. An alternative is to use algorithms that search for the best explanation of the observed scene in a space of possible rendered scenes. A recently developed algorithm, PERCH (PErception Via SeaRCH) does so by using depth data to converge to a globally optimum solution using a search over a specially constructed tree. While PERCH offers strong guarantees on accuracy, the current formulation suffers from low scalability owing to its high runtime. In addition, the sole reliance on depth data for pose estimation restricts the algorithm to scenes where no two objects have the same shape. In this work, we propose PERCH 2.0, a novel perception via search strategy that takes advantage of GPU acceleration and RGB data. We show that our approach can achieve a speedup of 100x over PERCH, as well as better accuracy than the state-of-the-art data-driven approaches on 6-DoF pose estimation without the need for annotating ground truth poses in the training data. Our code and video are available at https://sbpl-cruz.github.io/perception/.

翻訳日:2022-11-04 01:03:49 公開日:2020-08-01

# Fog-Assisted IoTシステムにおけるグリーンオフロード - 学習と制御を統合するオンラインパースペクティブ

Green Offloading in Fog-Assisted IoT Systems: An Online Perspective Integrating Learning and Control ( http://arxiv.org/abs/2008.00199v1 )

ライセンス: Link先を確認

Xin Gao, Xi Huang, Ziyu Shao, Yang Yang

(参考訳) フォグアシスト型IoTシステムでは、タスク処理のレイテンシとエネルギー消費を減らすために、IoTデバイスから近隣のフォグノードにタスクをオフロードすることが一般的である。しかし, 処理能力や伝送速度などのシステムダイナミクスに様々な不確実性があるため, エネルギー効率の高いオンラインスキームの設計は依然として未解決の課題である。さらに、意思決定プロセスはフォグノードやIoTデバイスのリソース制限によって制約されるため、設計はさらに複雑になる。本稿では,システムダイナミクスが未知である場合のタスクオフロード問題を,時間平均エネルギー消費に関する長期的制約を伴う組合せ型多腕バンディット(CMAB)問題として定式化する。オンライン学習とオンライン制御の効果的な統合を通じて,Learning-Aided Green Offloading (LAGO)方式を提案する。 LAGOでは,活用と探索のトレードオフを扱うためにバンディット学習法を採用し,長期的制約に対処するために仮想キュー技術を利用する。理論的解析により,LAGOは有限の時間軸にわたって調整可能な劣線形リグレット上界で平均タスクレイテンシを低減でき,長期的な時間平均エネルギー制約を満たすことが示された。このような理論結果を検証するために,広範なシミュレーションを行う。

In fog-assisted IoT systems, it is a common practice to offload tasks from IoT devices to their nearby fog nodes to reduce task processing latencies and energy consumptions. However, the design of online energy-efficient scheme is still an open problem because of various uncertainties in system dynamics such as processing capacities and transmission rates. Moreover, the decision-making process is constrained by resource limits on fog nodes and IoT devices, making the design even more complicated. In this paper, we formulate such a task offloading problem with unknown system dynamics as a combinatorial multi-armed bandit (CMAB) problem with long-term constraints on time-averaged energy consumptions. Through an effective integration of online learning and online control, we propose a Learning-Aided Green Offloading (LAGO) scheme. In LAGO, we employ bandit learning methods to handle the exploitation-exploration tradeoff and utilize virtual queue techniques to deal with the long-term constraints. Our theoretical analysis shows that LAGO can reduce the average task latency with a tunable sublinear regret bound over a finite time horizon and satisfy the long-term time-averaged energy constraints. We conduct extensive simulations to verify such theoretical results.

翻訳日:2022-11-04 01:02:54 公開日:2020-08-01
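
The abstract above combines bandit learning (for unknown latencies) with a virtual queue (for the long-term energy constraint). The sketch below shows one simple way these two pieces can fit together. It is a simplified single-choice illustration, not the LAGO algorithm itself: the UCB-style confidence bonus, the `tradeoff` parameter, the two fog nodes, and all numeric values are assumptions.

```python
import math
import random


def update_virtual_queue(queue, energy_used, energy_budget):
    """The queue grows when a round uses more energy than the budget allows."""
    return max(queue + energy_used - energy_budget, 0.0)


def run_offloading(num_rounds=2000, energy_budget=1.5, tradeoff=10.0, seed=0):
    rng = random.Random(seed)
    mean_latency = [1.0, 2.0]  # hypothetical, hidden from the learner
    energy_cost = [2.0, 1.0]   # hypothetical, known energy per offload
    num_nodes = len(mean_latency)
    counts = [0] * num_nodes
    latency_estimate = [0.0] * num_nodes
    queue = 0.0
    total_energy = 0.0

    for t in range(1, num_rounds + 1):
        def index(node):
            if counts[node] == 0:
                return -math.inf  # try every node at least once
            bonus = math.sqrt(2.0 * math.log(t) / counts[node])
            # Optimistic latency estimate plus queue-weighted energy price.
            return (tradeoff * (latency_estimate[node] - bonus)
                    + queue * energy_cost[node])

        node = min(range(num_nodes), key=index)
        latency = mean_latency[node] + rng.uniform(-0.5, 0.5)
        counts[node] += 1
        latency_estimate[node] += (latency - latency_estimate[node]) / counts[node]
        queue = update_virtual_queue(queue, energy_cost[node], energy_budget)
        total_energy += energy_cost[node]

    return counts, total_energy / num_rounds, queue


counts, average_energy, final_queue = run_offloading()
print(counts, average_energy, final_queue)
```

Because the queue never shrinks by more than the unused budget, the time-averaged energy is at most the budget plus the final queue length divided by the number of rounds. A larger `tradeoff` favours latency and lets the queue grow further before energy dominates the choice.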

# シャドウセグメンテーションからシャドウの除去まで

From Shadow Segmentation to Shadow Removal ( http://arxiv.org/abs/2008.00267v1 )

ライセンス: Link先を確認

Hieu Le and Dimitris Samaras

(参考訳) シャドウとシャドウのない画像のペアの必要性はシャドウ除去データセットのサイズと多様性を制限し、大規模なロバストなシャドウ除去アルゴリズムのトレーニングを妨げている。本研究では,影画像から抽出した陰影と非陰影パッチのみを用いて,陰影除去法を提案する。本手法は,影形成の物理モデルに従って,敵対的枠組みを用いて学習する。我々の中心的な貢献は、この逆行訓練を可能にする物理に基づく一連の制約である。提案手法は,完全対影画像と無影画像で訓練した最先端手法と比較して,競争力のあるシャドウ除去を実現する。私たちのトレーニング体制の利点は、ビデオのシャドウ除去においてさらに顕著です。本手法は,事前学習したシャドウ検出器で生成したシャドウマスクのみを用いて,テストビデオ上で微調整を行うことができる。本手法の利点を,提案するビデオシャドウ除去データセットに示す。

The requirement for paired shadow and shadow-free images limits the size and diversity of shadow removal datasets and hinders the possibility of training large-scale, robust shadow removal algorithms. We propose a shadow removal method that can be trained using only shadow and non-shadow patches cropped from the shadow images themselves. Our method is trained via an adversarial framework, following a physical model of shadow formation. Our central contribution is a set of physics-based constraints that enables this adversarial training. Our method achieves competitive shadow removal results compared to state-of-the-art methods that are trained with fully paired shadow and shadow-free images. The advantages of our training regime are even more pronounced in shadow removal for videos. Our method can be fine-tuned on a testing video with only the shadow masks generated by a pre-trained shadow detector and outperforms state-of-the-art methods on this challenging test. We illustrate the advantages of our method on our proposed video shadow removal dataset.

翻訳日:2022-11-04 00:57:11 公開日:2020-08-01

# ロバストな空間的・時間的特徴を用いたスケルトンに基づく行動認識の改善

Improving Skeleton-based Action Recognition with Robust Spatial and Temporal Features ( http://arxiv.org/abs/2008.00324v1 )

ライセンス: Link先を確認

Zeshi Yang and Kangkang Yin

(参考訳) 近年,骨格に基づく行動認識がコンピュータビジョンコミュニティにおいて顕著な進歩を遂げている。ほとんどの最先端アルゴリズムは、Graph Convolutional Networks (GCN)に基づいており、バックボーンGCN層のネットワーク構造の改善を目標としている。本稿では,空間と時間におけるよりロバストな識別的特徴を学ぶための新しいメカニズムを提案する。より具体的には、ネットワークの最後の層にDiscriminative Feature Learning (DFL)ブランチを追加し、識別的な空間的特徴と時間的特徴を抽出して学習の正則化を助ける。また、ニューラルネットワークへの入力としてDirection-Invariant Features (DIF)の使用を正式に提唱する。これらのロバストな特徴を学習し使用すると、行動認識精度が向上することを示す。 NTU-RGBD60, NTU-RGBD120, SYSU 3DHOI, Skeleton-Kineticsの4つのデータセットで,ST-GCNおよび関連手法と結果を比較した。

Recently skeleton-based action recognition has made significant progresses in the computer vision community. Most state-of-the-art algorithms are based on Graph Convolutional Networks (GCN), and target at improving the network structure of the backbone GCN layers. In this paper, we propose a novel mechanism to learn more robust discriminative features in space and time. More specifically, we add a Discriminative Feature Learning (DFL) branch to the last layers of the network to extract discriminative spatial and temporal features to help regularize the learning. We also formally advocate the use of Direction-Invariant Features (DIF) as input to the neural networks. We show that action recognition accuracy can be improved when these robust features are learned and used. We compare our results with those of ST-GCN and related methods on four datasets: NTU-RGBD60, NTU-RGBD120, SYSU 3DHOI and Skeleton-Kinetics.

翻訳日:2022-11-04 00:56:25 公開日:2020-08-01

# 時空間関係学習による不確実性に基づく交通事故予測

Uncertainty-based Traffic Accident Anticipation with Spatio-Temporal Relational Learning ( http://arxiv.org/abs/2008.00334v1 )

ライセンス: Link先を確認

Wentao Bao and Qi Yu and Yu Kong

(参考訳) 交通事故予測は、ドライブレコーダー(ダッシュカム)の映像から事故をできるだけ早く予測することを目的としており、安全性が保証された自動運転システムにとって重要である。交通シーンは雑然としており視覚的手がかりも限られているため、早期に観測されたフレームから事故発生までの時間を予測することは非常に難しい。既存手法の多くは,事故に関連するエージェントの特徴を学習するように開発されており,それらの空間的・時間的関係の特徴を無視している。さらに、現在の決定論的ディープニューラルネットワークは誤った予測を過信する可能性があり、自動運転システムによる交通事故のリスクを高める。本稿では,時空間関係学習を用いた不確実性に基づく事故予測モデルを提案する。このモデルは、ダッシュカム映像から交通事故の発生確率を逐次的に予測する。具体的には,関係特徴の学習にグラフ畳み込みとリカレントネットワークを活用し,ベイズニューラルネットワークを用いて潜在的な関係表現に内在する変動に対処する。導出した不確実性に基づくランキング損失は,関係特徴の品質を向上させることで,モデル性能を大きく高めることがわかった。さらに,環境属性と事故原因のアノテーションを含む,交通事故予測のための新しいCar Crash Dataset(CCD)を収集した。公開データセットと新たに構築したデータセットの両方での実験結果は,我々のモデルが最先端の性能を持つことを示す。私たちのコードとCCDデータセットはhttps://github.com/Cogito2012/UStringから入手可能です。

Traffic accident anticipation aims to predict accidents from dashcam videos as early as possible, which is critical to safety-guaranteed self-driving systems. With cluttered traffic scenes and limited visual cues, it is of great challenge to predict how long there will be an accident from early observed frames. Most existing approaches are developed to learn features of accident-relevant agents for accident anticipation, while ignoring the features of their spatial and temporal relations. Besides, current deterministic deep neural networks could be overconfident in false predictions, leading to high risk of traffic accidents caused by self-driving systems. In this paper, we propose an uncertainty-based accident anticipation model with spatio-temporal relational learning. It sequentially predicts the probability of traffic accident occurrence with dashcam videos. Specifically, we propose to take advantage of graph convolution and recurrent networks for relational feature learning, and leverage Bayesian neural networks to address the intrinsic variability of latent relational representations. The derived uncertainty-based ranking loss is found to significantly boost model performance by improving the quality of relational features. In addition, we collect a new Car Crash Dataset (CCD) for traffic accident anticipation which contains environmental attributes and accident reasons annotations. Experimental results on both public and the newly-compiled datasets show state-of-the-art performance of our model. Our code and CCD dataset are available at https://github.com/Cogito2012/UString.

翻訳日:2022-11-04 00:56:07 公開日:2020-08-01

# 抽出要約実験:整数線形プログラミング、項/文のスコーリング、およびタイトル駆動モデル

Experiments in Extractive Summarization: Integer Linear Programming, Term/Sentence Scoring, and Title-driven Models ( http://arxiv.org/abs/2008.00140v1 )

ライセンス: Link先を確認

Daniel Lee and Rakesh Verma and Avisha Das and Arjun Mukherjee

(参考訳) 本稿では,教師なし単一文書要約という難しい問題を再検討し,整数線形計画法(ILP)に基づくアルゴリズム,語・文スコアのパラメータ化された正規化,要約のためのタイトル駆動アプローチについて検討する。我々は,新たなフレームワークであるNewsSummについて述べる。このフレームワークには,ILPやタイトル駆動アプローチを含む,要約のための既存および新規のアプローチが多数含まれている。 NewsSummは柔軟であり、異なるアルゴリズムと文スコアリング方式をシームレスに組み合わせることができる。文スコアリングをILPおよび正規化と組み合わせた我々の結果は,この話題に関するこれまでの研究とは対照的であり,最適なパラメータをより広く探索することの重要性を示す。また,新たなタイトル駆動型の削減のアイデアは,検討した教師なしと教師ありの両方のアプローチの性能向上につながることを示す。

In this paper, we revisit the challenging problem of unsupervised single-document summarization and study the following aspects: Integer linear programming (ILP) based algorithms, Parameterized normalization of term and sentence scores, and Title-driven approaches for summarization. We describe a new framework, NewsSumm, that includes many existing and new approaches for summarization including ILP and title-driven approaches. NewsSumm's flexibility allows to combine different algorithms and sentence scoring schemes seamlessly. Our results combining sentence scoring with ILP and normalization are in contrast to previous work on this topic, showing the importance of a broader search for optimal parameters. We also show that the new title-driven reduction idea leads to improvement in performance for both unsupervised and supervised approaches considered.

翻訳日:2022-11-04 00:55:12 公開日:2020-08-01

# 明確化質問ベースシステムの実証的研究

An Empirical Study of Clarifying Question-Based Systems ( http://arxiv.org/abs/2008.00279v1 )

ライセンス: Link先を確認

Jie Zou, Evangelos Kanoulas, and Yiqun Liu

(参考訳) ユーザの情報ニーズをよりよく理解するために,自発的に明確化のための質問を行う検索・レコメンデーションシステムは,研究コミュニティから注目を集めている。しかし、私たちの知る限りでは、ユーザがこれらの質問に答える意思や能力があるかどうか、またどの程度あるかを定量化した実証的研究はない。本研究では,製品リポジトリに対して明確化のための質問を行うことでユーザと対話する実験システムを展開し,オンライン実験を行う。暗黙的なインタラクション行動データとユーザからの明示的なフィードバックの両方を収集し、以下のことを示す。 (a) ユーザは、かなりの数の明確化質問(平均11～21件)に答える意思があるが、それを大きく超える数には答えない。 (b) ほとんどのユーザは,対象製品に到達するまで質問に回答するが,一部のユーザは,疲労や無関係な質問を受け取ったことにより途中でやめる。 (c) ユーザの回答の一部(12～17%)は,実際には対象製品の説明と反対である。 (d) ユーザの大部分(66～84%)は,質問ベースのシステムがタスクの完了に役立つと感じている。本研究の結果の一部は,この分野におけるシミュレーション評価の現在の前提と矛盾するが,評価フレームワークの改善の方向を示し,今後の対話型検索/レコメンダーシステムの設計に刺激を与える可能性がある。

Search and recommender systems that take the initiative to ask clarifying questions to better understand users' information needs are receiving increasing attention from the research community. However, to the best of our knowledge, there is no empirical study to quantify whether and to what extent users are willing or able to answer these questions. In this work, we conduct an online experiment by deploying an experimental system, which interacts with users by asking clarifying questions against a product repository. We collect both implicit interaction behavior data and explicit feedback from users showing that: (a) users are willing to answer a good number of clarifying questions (11-21 on average), but not many more than that; (b) most users answer questions until they reach the target product, but also a fraction of them stops due to fatigue or due to receiving irrelevant questions; (c) part of the users' answers (12-17%) are actually opposite to the description of the target product; while (d) most of the users (66-84%) find the question-based system helpful towards completing their tasks. Some of the findings of the study contradict current assumptions on simulated evaluations in the field, while they point towards improvements in the evaluation framework and can inspire future interactive search/recommender system designs.

翻訳日:2022-11-04 00:54:55 公開日:2020-08-01

# (K-means)-階層型並列遺伝的アルゴリズムによるクラスタベース情報検索

Cluster-Based Information Retrieval by using (K-means)- Hierarchical Parallel Genetic Algorithms Approach ( http://arxiv.org/abs/2008.00150v1 )

ライセンス: Link先を確認

Sarah Hussein Toman, Mohammed Hamzah Abed, Zinah Hussein Toman

(参考訳) クラスタベースの情報検索は、Web文書を整理し、特徴を抽出し、類似性に応じて分類する情報検索(IR)ツールの1つである。従来のアプローチとは異なり、クラスタベースのIRは大規模な文書データセットを高速に処理できる。本稿では、検索される文書の品質を高め、IRの効率を上げ、ユーザの検索結果から無関係な文書を減らすために,K-meansクラスタリングアルゴリズムと、マルチデーム型およびマスタ/スレーブ型の並列遺伝的アルゴリズムのハイブリッドを組み合わせた(K-means)-階層型並列遺伝的アルゴリズムアプローチ(HPGA)を提案する。 K-meansは集団を k 個の部分集団にクラスタリングするために用い、クエリに最も関連するクラスタを取り出して2つのレベルの遺伝的並列性によって並列に処理する。これにより、無関係な文書は部分集団に含まれず、結果の質が改善される。 3つの一般的なデータセット(NLP、CISI、CACM)を用いて、再現率、適合率、F値の平均を計算した。最後に、3つのデータセットの適合率を遺伝的IRおよび古典的IRと比較した。 IR-GAと比較した提案手法の適合率の改善はCACMで45%,CISIで27%,NLPで25%であった。一方、Classic-IRと比較すると、(k-means)-HPGAはCACMで47%、CISIで28%、NLPで34%であった。

Cluster-based information retrieval is one of the information retrieval (IR) tools that organize, extract features from, and categorize web documents according to their similarity. Unlike traditional approaches, cluster-based IR is fast at processing large document datasets. To improve the quality of retrieved documents, increase the efficiency of IR, and reduce the number of irrelevant documents returned by a user search, in this paper we propose a (K-means)-Hierarchical Parallel Genetic Algorithms Approach (HPGA) that combines the K-means clustering algorithm with a hybrid PG of multi-deme and master/slave PG algorithms. K-means is used to cluster the population into k subpopulations; the clusters most relevant to the query are then processed in parallel by the two levels of genetic parallelism, so that irrelevant documents are not included in the subpopulations, which improves the quality of the results. Three common datasets (NLP, CISI, and CACM) are used to compute the average recall, precision, and F-measure. Finally, we compare the precision values on the three datasets with Genetic-IR and Classic-IR. The precision improvements of the proposed approach over IR-GA were 45% on CACM, 27% on CISI, and 25% on NLP. Compared with Classic-IR, (K-means)-HPGA achieved 47% on CACM, 28% on CISI, and 34% on NLP.
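The cluster-selection step and the evaluation measures mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`select_clusters`, `precision_recall_f`), the use of cosine similarity between the query and cluster centroids, and the `top_m` parameter are all assumptions made for illustration, and the centroids are assumed to come from a K-means run that is not shown.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def select_clusters(centroids, query, top_m):
    """Return the indices of the top_m centroids most similar to the query.

    Hypothetical helper: the abstract only says that the clusters most
    relevant to the query are kept; cosine similarity is an assumption.
    """
    ranked = sorted(
        range(len(centroids)),
        key=lambda i: cosine(centroids[i], query),
        reverse=True,
    )
    return ranked[:top_m]


def precision_recall_f(retrieved, relevant):
    """Standard set-based precision, recall, and balanced F-measure."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure
```

Only the documents in the selected clusters would then be passed on to the genetic search, which is how irrelevant documents are kept out of the subpopulations.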

翻訳日:2022-11-04 00:54:34 公開日:2020-08-01

# マルチリソースフェアネスを用いたフォグコンピューティングのためのオンラインタスクスケジューリング

Online Task Scheduling for Fog Computing with Multi-Resource Fairness ( http://arxiv.org/abs/2008.00207v1 )

ライセンス: Link先を確認

Simeng Bian, Xi Huang, Ziyu Shao

(参考訳) フォグコンピューティングシステムでは、オンラインタスクスケジューリング、すなわち、エンドデバイスから連続的に生成されるタスクのリソース割り当てを決定することが重要な課題である。この設計は、フォグコンピューティングシステムに現れる様々な不確実性のために困難であり、例えば、実際の到着前にタスクのリソース要求が不明である。最近の研究は、オンラインタスクスケジューリングと様々な目的の改善のために、深層強化学習(DRL)技術を適用している。しかし、異なるタスクに対するマルチリソースの公平性を見落としており、これはタスク間で公平なリソース共有を実現するための鍵となるが、一般的には自明ではない。このように、マルチリソースフェアネスを備えたオンラインタスクスケジューリングスキームを設計することは、依然としてオープンな問題である。本稿では,上記の課題に対処する。特に,drl技術を活用して支配的資源公平性(drf)という考え方を採用することで,経験から直接学習し,タスク間の公平性を確保しつつ平均的なタスクスローダウンを効果的に短縮するオンラインタスクスケジューリングスキームであるfairtsを提案する。シミュレーションの結果、FairTSはタスクの遅くなり、リソースの公平性が向上し、最先端のスキームよりも優れていた。

In fog computing systems, one key challenge is online task scheduling, i.e., deciding the resource allocation for tasks that are continuously generated from end devices. The design is challenging because of various uncertainties manifested in fog computing systems; e.g., tasks' resource demands remain unknown before their actual arrivals. Recent works have applied deep reinforcement learning (DRL) techniques to conduct online task scheduling and improve various objectives. However, they overlook multi-resource fairness for different tasks, which is key to achieving fair resource sharing among tasks but in general non-trivial to achieve. Thus, it is still an open problem to design an online task scheduling scheme with multi-resource fairness. In this paper, we address the above challenges. In particular, by leveraging DRL techniques and adopting the idea of dominant resource fairness (DRF), we propose FairTS, an online task scheduling scheme that learns directly from experience to effectively shorten average task slowdown while ensuring multi-resource fairness among tasks. Simulation results show that FairTS outperforms state-of-the-art schemes with an ultra-low task slowdown and better resource fairness.
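To make the notion of dominant resource fairness concrete, the following Python sketch implements a simplified progressive-filling allocator: a user's dominant share is its largest fractional usage over all resources, and the next task always goes to the user with the smallest dominant share. This is plain DRF only, not FairTS (the learned DRL scheduler is not modelled), and the function names and the rule of dropping a user once its next task no longer fits are assumptions of this sketch.

```python
def dominant_share(usage, capacity):
    """Largest fraction of any single resource that this user currently holds."""
    return max(u / c for u, c in zip(usage, capacity))


def drf_allocate(capacity, demands):
    """Simplified DRF by progressive filling.

    capacity: list of total amounts per resource.
    demands:  dict mapping user name -> per-task demand vector.
    Returns a dict mapping user name -> number of tasks granted.
    """
    n_res = len(capacity)
    used = [0.0] * n_res
    usage = {name: [0.0] * n_res for name in demands}
    tasks = {name: 0 for name in demands}
    active = set(demands)
    while active:
        # Pick the active user with the lowest dominant share (ties broken by name).
        name = min(sorted(active), key=lambda n: dominant_share(usage[n], capacity))
        demand = demands[name]
        # Assumption of this sketch: a user whose next task does not fit is retired.
        if any(used[r] + demand[r] > capacity[r] for r in range(n_res)):
            active.remove(name)
            continue
        for r in range(n_res):
            used[r] += demand[r]
            usage[name][r] += demand[r]
        tasks[name] += 1
    return tasks
```

With 9 CPUs and 18 GB of memory, a user A whose tasks need (1 CPU, 4 GB) and a user B whose tasks need (3 CPUs, 1 GB), `drf_allocate([9, 18], {"A": [1, 4], "B": [3, 1]})` grants three tasks to A and two to B, so both end with a dominant share of 2/3.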

翻訳日:2022-11-04 00:54:11 公開日:2020-08-01

# スパイキングニューラルネットワークを用いた輪郭追跡改善のための適応ケモタキシー

Adaptive Chemotaxis for improved Contour Tracking using Spiking Neural Networks ( http://arxiv.org/abs/2008.00317v1 )

ライセンス: Link先を確認

Shashwat Shukla, Rohan Pathak, Vivek Saraswat and Udayan Ganguly

(参考訳) 本稿では,線虫Caenorhabditis elegansの走化ネットワークに触発された自律ナビゲーションのためのスパイキングニューラルネットワーク(SNN)を提案する。特に、輪郭追跡の問題に焦点を当て、ロボットが到達し、次に所望の濃度設定点に従う必要がある。 klinokinesisのみを使用した過去のスキームは、効率的に輪郭に従うことができるが、セットポイントに到達するのに過度な時間がかかる。我々は,従来提案していた勾配クライミング回路を基盤とした適応型クリノ軸機構を提案することで,この欠点に対処する。我々は,我々のklinotaxis回路が,勾配上昇や勾配降下を行うように自律的に構成され,その後,前述のklinokinesis回路とシームレスに統合できないことを実証する。また,速度制御(orthokinesis)を取り入れ,輪郭追跡性能をさらに向上させた。そこで本研究では,klinokinesis,klinotaxis,ortokinesisを統合したモデルを提案する。輪郭追跡シミュレーションにより,提案手法がセットポイントに到達するまでの時間の2.4倍削減と,セットポイントからの平均偏差の8.7倍削減を実現することを実証した。

In this paper we present a Spiking Neural Network (SNN) for autonomous navigation, inspired by the chemotaxis network of the worm Caenorhabditis elegans. In particular, we focus on the problem of contour tracking, wherein the bot must reach and subsequently follow a desired concentration setpoint. Past schemes that used only klinokinesis can follow the contour efficiently but take excessive time to reach the setpoint. We address this shortcoming by proposing a novel adaptive klinotaxis mechanism that builds upon a previously proposed gradient climbing circuit. We demonstrate how our klinotaxis circuit can autonomously be configured to perform gradient ascent or gradient descent, and subsequently be disabled to integrate seamlessly with the aforementioned klinokinesis circuit. We also incorporate speed regulation (orthokinesis) to further improve contour tracking performance. Thus, for the first time, we present a model that successfully integrates klinokinesis, klinotaxis and orthokinesis. We demonstrate via contour tracking simulations that our proposed scheme achieves a 2.4x reduction in the time to reach the setpoint, along with a simultaneous 8.7x reduction in average deviation from the setpoint.

翻訳日:2022-11-04 00:47:50 公開日:2020-08-01

# 指紋認識システムのホワイトボックス評価

White-Box Evaluation of Fingerprint Recognition Systems ( http://arxiv.org/abs/2008.00128v1 )

ライセンス: Link先を確認

Steven A. Grosz, Joshua J. Engelsma, Anil K. Jain

(参考訳) 指紋認証システムの典型的な評価は、全体的な識別や認証精度の観点から性能を評価するエンドツーエンドのブラックボックス評価である。しかしながら、これらのブラックボックステストは、画像取得、特徴抽出、マッチングを含む個々のモジュールのパフォーマンスに関する洞察を明らかにしていない。一方,本論文のトピックであるホワイトボックス評価では,各構成モジュールの性能を個別に測定する。指紋読取装置,特徴抽出装置,マッチング部品のホワイトボックス評価をいくつかの研究で行ったが,指紋認識システムの各段階で導入された不確実性に関するホワイトボックス分析の完全なシステムを提供していない。本研究では,指紋認識システムコンポーネントの過去のホワイトボックス評価を拡張し,集計されたホワイトボックス評価結果に基づいて指紋認識システム性能の詳細な分析を行う。特に, 指紋認証システムの各段階において, 不正な捕獲条件(照明, 水分, 圧力など)による不確実性について, 取得時の解析を行った。本実験では,ブラックボックス認識性能の面では,各サブモジュールのホワイトボックス解析でのみ確認可能な,指紋認識システムパイプラインの各モジュールにおいて,総合的に優れた性能を発揮できないことを示す。このような発見により、研究者たちは指紋認識システムの改善にもっと注力できる。

Typical evaluations of fingerprint recognition systems consist of end-to-end black-box evaluations, which assess performance in terms of overall identification or authentication accuracy. However, these black-box tests of system performance do not reveal insights into the performance of the individual modules, including image acquisition, feature extraction, and matching. On the other hand, white-box evaluations, the topic of this paper, measure the individual performance of each constituent module in isolation. While a few studies have conducted white-box evaluations of the fingerprint reader, feature extractor, and matching components, no existing study has provided a full system, white-box analysis of the uncertainty introduced at each stage of a fingerprint recognition system. In this work, we extend previous white-box evaluations of fingerprint recognition system components and provide a unified, in-depth analysis of fingerprint recognition system performance based on the aggregated white-box evaluation results. In particular, we analyze the uncertainty introduced at each stage of the fingerprint recognition system due to adverse capture conditions (i.e., varying illumination, moisture, and pressure) at the time of acquisition. Our experiments show that a system that performs better overall, in terms of black-box recognition performance, does not necessarily perform best at each module in the fingerprint recognition system pipeline, which can only be seen with white-box analysis of each sub-module. Findings such as these enable researchers to better focus their efforts in improving fingerprint recognition systems.

翻訳日:2022-11-04 00:47:29 公開日:2020-08-01

# PanoNet: 位置感性機能埋め込みによるリアルタイムパノプティクスセグメンテーション

PanoNet: Real-time Panoptic Segmentation through Position-Sensitive Feature Embedding ( http://arxiv.org/abs/2008.00192v1 )

ライセンス: Link先を確認

Xia Chen, Jianren Wang, Martial Hebert

(参考訳) 我々は,パンオプティカルセグメンテーションのためのセマンティクスとインスタンスマスクを同時に生成する,シンプルで高速で柔軟なフレームワークを提案する。パノネットと呼ばれる我々の手法はクリーンで自然な構造設計を取り入れており、時間を要する検出処理が不要なセグメンテーションタスクとして問題に取り組む。また,物体の外観と空間的位置の両方を考慮し,位置感性埋め込みを例示する。全体的に、パノネットは高精細な都市景観画像のパンオプティカルな品質の結果をリアルタイムで得ることができ、同等の性能を持つ他の手法よりもかなり高速である。私たちのアプローチは、自律運転や拡張現実といった多くのアプリケーションにおいて、現実的なスピードとメモリ要件を十分に満たしています。

We propose a simple, fast, and flexible framework to simultaneously generate semantic and instance masks for panoptic segmentation. Our method, called PanoNet, incorporates a clean and natural structure design that tackles the problem purely as a segmentation task without the time-consuming detection process. We also introduce position-sensitive embedding for instance grouping by accounting for both an object's appearance and its spatial location. Overall, PanoNet yields high panoptic quality results on high-resolution Cityscapes images in real time, significantly faster than all other methods with comparable performance. Our approach satisfies the practical speed and memory requirements of many applications such as autonomous driving and augmented reality.

翻訳日:2022-11-04 00:46:12 公開日:2020-08-01

# 自己教師付き学習から視覚プライオリティーを蒸留する

Distilling Visual Priors from Self-Supervised Learning ( http://arxiv.org/abs/2008.00261v1 )

ライセンス: Link先を確認

Bingchen Zhao, Xin Wen

(参考訳) 畳み込みニューラルネットワーク(CNN)は、小さなトレーニングデータセットに適合する傾向にある。本稿では,画像分類のためのcnnモデルの一般化能力を向上させるために,自己教師付き学習と知識蒸留を利用した2相パイプラインを提案する。第1段階は、自己教師型学習を通してリッチで一般化可能な視覚表現を持つ教師モデルを学習し、第2段階は、学生モデルを自己蒸留方式で蒸留し、一方、イメージ分類タスクの生徒モデルを微調整する。また,データ不足シナリオ下での表現をよりよく学習するために,自己指導型コントラスト学習プロキシタスクの新たなマージン損失を提案する。他のトリックとともに、VIPriors画像分類チャレンジにおいて競合性能を達成する。

Convolutional Neural Networks (CNNs) are prone to overfit small training datasets. We present a novel two-phase pipeline that leverages self-supervised learning and knowledge distillation to improve the generalization ability of CNN models for image classification under the data-deficient setting. The first phase is to learn a teacher model which possesses rich and generalizable visual representations via self-supervised learning, and the second phase is to distill the representations into a student model in a self-distillation manner, and meanwhile fine-tune the student model for the image classification task. We also propose a novel margin loss for the self-supervised contrastive learning proxy task to better learn the representation under the data-deficient scenario. Together with other tricks, we achieve competitive performance in the VIPriors image classification challenge.

翻訳日:2022-11-04 00:44:55 公開日:2020-08-01

# メタDRN:1ショット画像セグメンテーションのためのメタラーニング

Meta-DRN: Meta-Learning for 1-Shot Image Segmentation ( http://arxiv.org/abs/2008.00247v1 )

ライセンス: Link先を確認

Atmadeep Banerjee

(参考訳) 現代のディープラーニングモデルはコンピュータビジョンの分野に革命をもたらした。しかし、これらのモデルの大きな欠点は、適切に一般化するために多数のラベル付き例を必要とすることである。数ショット学習の最近の発展は、この要件を緩和することを目的としている。本稿では,1ショット画像セグメンテーションのための新しい軽量cnnアーキテクチャを提案する。提案モデルは,セマンティックセグメンテーションのための優れたアーキテクチャからインスピレーションを得て,それを1ショットドメインに適応させることによって作成される。画像分類に有効である4つのメタ学習アルゴリズムを用いてモデルをトレーニングし、その結果を比較した。選択したデータセットに対して、提案したモデルは、ベンチマークよりも70%低いパラメータカウントを持ち、メタ学習アルゴリズムの4つすべてを用いて、より良いか同等の平均IoUスコアを持つ。

Modern deep learning models have revolutionized the field of computer vision. But, a significant drawback of most of these models is that they require a large number of labelled examples to generalize properly. Recent developments in few-shot learning aim to alleviate this requirement. In this paper, we propose a novel lightweight CNN architecture for 1-shot image segmentation. The proposed model is created by taking inspiration from well-performing architectures for semantic segmentation and adapting it to the 1-shot domain. We train our model using 4 meta-learning algorithms that have worked well for image classification and compare the results. For the chosen dataset, our proposed model has a 70% lower parameter count than the benchmark, while having better or comparable mean IoU scores using all 4 of the meta-learning algorithms.

翻訳日:2022-11-04 00:39:09 公開日:2020-08-01

# ワープによるアニメーション:高品質表情アニメーションの効率的な方法

Animating Through Warping: an Efficient Method for High-Quality Facial Expression Animation ( http://arxiv.org/abs/2008.00362v1 )

ライセンス: Link先を確認

Zili Yi, Qiang Tang, Vishnu Sanjay Ramiya Srinivasan, Zhan Xu

(参考訳) ディープニューラルネットワークの進歩は、3Dドメインを操作せずに静止画像をアニメーションする技術を大幅に改善した。一方、先行技術では、メモリの制限、トレーニングの難しさ、高解像度(hd)トレーニングデータセットの欠如により、小さな画像(典型的には512x512)しかアニメーションできない。ニューラルネットワークが生成する低分解能結果に高周波数残差を加えることでHD画像を生成することができるという考えから,我々は,HD画像の効率的なアニメーションを実現するための新しいフレームワークであるAnimating Through Warping(ATW)を提案する。具体的には、新しい2段階のニューラルネットワークジェネレータと、Animating Through Warping (ATW)として知られる新しい後処理モジュールの2つのモジュールで構成されている。ジェネレータを小さなイメージでトレーニングし、任意のサイズのイメージで推論することしか必要ありません。推論中、hd入力画像は低解像度成分(128x128)と対応する高周波残差に分解される。ジェネレータは、入力面を所望の状態(例えば、表現カテゴリまたはアクションユニット)に歪ませる動き場と同様に、低解像度の結果を予測する。最後に、reswarpモジュールは、動き場に基づいて残差をゆがめ、ゆがんだ残差を追加して、極端にアップサンプリングされた低解像度結果から最終的なhd結果を生成する。実験では,高分解能アニメーション生成における手法の有効性と効率を示す。提案するフレームワークは,従来のニューラルモデルでは達成されていない4K顔画像の認識に成功している。また,本手法は,生成したアニメーションの時間的一貫性を一般的に保証する。ソースコードは公開される予定だ。

Advances in deep neural networks have considerably improved the art of animating a still image without operating in the 3D domain. However, prior art can only animate small images (typically no larger than 512x512) due to memory limitations, difficulty of training and lack of high-resolution (HD) training datasets, which significantly reduces its potential for applications in movie production and interactive systems. Motivated by the idea that HD images can be generated by adding high-frequency residuals to low-resolution results produced by a neural network, we propose a novel framework known as Animating Through Warping (ATW) to enable efficient animation of HD images. Specifically, the proposed framework consists of two modules: a novel two-stage neural-network generator and a novel post-processing module, ResWarp. It only requires the generator to be trained on small images and can do inference on an image of any size. During inference, an HD input image is decomposed into a low-resolution component (128x128) and its corresponding high-frequency residuals. The generator predicts the low-resolution result as well as the motion field that warps the input face to the desired status (e.g., expression categories or action units). Finally, the ResWarp module warps the residuals based on the motion field and adds the warped residuals to the naively up-sampled low-resolution results to generate the final HD results. Experiments show the effectiveness and efficiency of our method in generating high-resolution animations. Our proposed framework successfully animates a 4K facial image, which has never been achieved by prior neural models. In addition, our method generally guarantees the temporal coherency of the generated animations. Source code will be made publicly available.
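The decompose, warp, and recombine idea described in the abstract can be sketched in a few lines of Python on nested lists. This is only an illustration under stated assumptions, not the paper's ResWarp module: block-average downsampling, nearest-neighbour upsampling, integer per-pixel offsets with border clamping, and every function name here (including the `generator` callable, which stands in for the trained network) are hypothetical choices.

```python
def downsample(img, f):
    """Block-average an image (list of rows) by an integer factor f."""
    h, w = len(img), len(img[0])
    return [
        [sum(img[y * f + dy][x * f + dx] for dy in range(f) for dx in range(f)) / (f * f)
         for x in range(w // f)]
        for y in range(h // f)
    ]


def upsample(img, f):
    """Naive nearest-neighbour upsampling by an integer factor f."""
    return [[img[y // f][x // f] for x in range(len(img[0]) * f)]
            for y in range(len(img) * f)]


def upsample_flow(flow, f):
    """Upsample a field of (dy, dx) offsets and scale the offsets by f."""
    return [[(flow[y // f][x // f][0] * f, flow[y // f][x // f][1] * f)
             for x in range(len(flow[0]) * f)]
            for y in range(len(flow) * f)]


def warp(img, flow):
    """Backward warp: output pixel (y, x) reads from (y + dy, x + dx), clamped to the border."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            dy, dx = flow[y][x]
            sy = min(max(y + dy, 0), h - 1)
            sx = min(max(x + dx, 0), w - 1)
            row.append(img[sy][sx])
        out.append(row)
    return out


def animate_hd(hd, generator, f):
    """Animate an HD image using a generator that only sees the low-resolution image.

    generator(low) must return (low_result, low_flow); it is a placeholder
    for the trained network and is an assumption of this sketch.
    """
    low = downsample(hd, f)
    up = upsample(low, f)
    # High-frequency residual: what the low-resolution component cannot represent.
    residual = [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(hd, up)]
    low_result, low_flow = generator(low)
    warped = warp(residual, upsample_flow(low_flow, f))
    up_result = upsample(low_result, f)
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(up_result, warped)]
```

A useful sanity check on this design is that a generator which returns its input unchanged together with an all-zero motion field reproduces the HD input, since the residual then exactly restores the detail lost by downsampling.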

翻訳日:2022-11-04 00:38:42 公開日:2020-08-01

# 小さな運動と大きな結果:運動拡大法を用いて乳児の震動を再現する

Little Motion, Big Results: Using Motion Magnification to Reveal Subtle Tremors in Infants ( http://arxiv.org/abs/2008.04946v1 )

ライセンス: Link先を確認

Girik Malik and Ish K. Gulati

(参考訳) 震えの検出は人間と機械の両方にとって困難である。妊娠中にオピオイドに曝露した幼児は、しばしばヒトの眼では見逃し易い出生後の離脱の徴候や症状を示す。新生児アブスティネンス症候群(nas)と呼ばれる一連の臨床特徴は、震え、発作、刺激性などである。現在のケア基準は主観的評価に基づいてFinnegan Neonatal Abstinence Syndrome Scoring System (FNASS) を用いている。 FNASSによるモニタリングには高度に熟練した看護スタッフが必要である。本稿では,増幅動作信号を用いた自動震動検出システムを提案する。 NASの徴候を示す乳児のベッドサイドビデオで適用可能性を示す。さらに, 深層畳み込みネットワークに基づく運動拡大の異なるモードをテストし, ダイナミックモードが臨床設定において最も有効であり, 共通の方向変化に不変であることを示す。本研究では,NAS患者に対して,既存のプロトコルを補うために運動倍率を用いた退院・フォローアップ戦略を提案する。本研究は,現在の実践,訓練,資源利用のギャップを埋める手法を提案する。

Detecting tremors is challenging for both humans and machines. Infants exposed to opioids during pregnancy often show signs and symptoms of withdrawal after birth, which are easy to miss with the human eye. The constellation of clinical features, termed Neonatal Abstinence Syndrome (NAS), includes tremors, seizures, irritability, etc. The current standard of care uses the Finnegan Neonatal Abstinence Syndrome Scoring System (FNASS), based on subjective evaluations. Monitoring with FNASS requires highly skilled nursing staff, making continuous monitoring difficult. In this paper we propose an automated tremor detection system using amplified motion signals. We demonstrate its applicability on bedside video of an infant exhibiting signs of NAS. Further, we test different modes of deep convolutional network based motion magnification, and identify that dynamic mode works best in the clinical setting, being invariant to common orientational changes. We propose a strategy for discharge and follow-up for NAS patients, using motion magnification to supplement the existing protocols. Overall, our study suggests methods for bridging the gap in current practices, training and resource utilization.

翻訳日:2022-11-04 00:38:14 公開日:2020-08-01

# tactilesgnet: イベントベースの触覚物体認識のためのスパイクグラフニューラルネットワーク

TactileSGNet: A Spiking Graph Neural Network for Event-based Tactile Object Recognition ( http://arxiv.org/abs/2008.08046v1 )

ライセンス: Link先を確認

Fuqiang Gu, Weicong Sng, Tasbolat Taunyazov and Harold Soh

(参考訳) 触覚知覚は、把持や手操作を含む様々なロボットタスクに不可欠である。フレキシブルでイベント駆動の電子皮膚の新しい進歩は、すぐに人間に似たタッチ認識能力を持つロボットに与えられるかもしれない。これらの電子皮膚は変化(例えば圧力、温度)に非同期に反応し、ロボットの体やエンドエフェクターに不規則にレイアウトすることができる。しかし、これらのユニークな特徴は、触覚学習には適さない畳み込み特徴抽出器のような現在のディープラーニングアプローチをもたらす可能性がある。本稿では,イベントに基づく触覚物体認識のための新しいスパイキンググラフニューラルネットワークを提案する。そこで本研究では,タクセルの局所接続性を活用するために,触覚データをグラフ構造に整理する手法を提案する。構築したグラフに基づいて,スパイキンググラフ畳み込みネットワークを開発した。スパイクニューラルネットワークのイベント駆動性は、イベントベースのデータを処理するのに間違いなく適している。 2つの触覚データセットによる実験結果から,提案手法は様々な家庭オブジェクトの分類において,約90%の精度で高い精度を達成できることがわかった。

Tactile perception is crucial for a variety of robot tasks including grasping and in-hand manipulation. New advances in flexible, event-driven, electronic skins may soon endow robots with touch perception capabilities similar to humans. These electronic skins respond asynchronously to changes (e.g., in pressure, temperature), and can be laid out irregularly on the robot's body or end-effector. However, these unique features may render current deep learning approaches such as convolutional feature extractors unsuitable for tactile learning. In this paper, we propose a novel spiking graph neural network for event-based tactile object recognition. To make use of local connectivity of taxels, we present several methods for organizing the tactile data in a graph structure. Based on the constructed graphs, we develop a spiking graph convolutional network. The event-driven nature of spiking neural networks makes them arguably more suitable for processing event-based data. Experimental results on two tactile datasets show that the proposed method outperforms other state-of-the-art spiking methods, achieving high accuracies of approximately 90% when classifying a variety of different household objects.

翻訳日:2022-11-04 00:37:54 公開日:2020-08-01

# トランスコーダシステムのテストセット

The test set for the TransCoder system ( http://arxiv.org/abs/2008.00293v1 )

ライセンス: Link先を確認

Ernest Davis

(参考訳) TransCoderシステムは、Java、C++、Python 3間でソースコードを変換する。品質評価に使用されたテストセットには,クラスを定義して使用する機能や,再帰的以外のユーザ定義関数を呼び出す機能など,javaの重要な機能が欠落している。そのため、これらの特徴を持つプログラムに対するTransCoderの精度は未だ不明である。

The TransCoder system translates source code between Java, C++, and Python 3. The test set that was used to evaluate its quality is missing important features of Java, including the ability to define and use classes and the ability to call user-defined functions other than recursively. Therefore, the accuracy of TransCoder over programs with those features remains unknown.

翻訳日:2022-11-04 00:37:35 公開日:2020-08-01

# マイクロテキストから実行可能な情報を抽出する

Extracting actionable information from microtexts ( http://arxiv.org/abs/2008.00343v1 )

ライセンス: Link先を確認

Ali Hürriyetoğlu

(参考訳) Twitterのようなマイクロブログは強力な情報ソースである。この情報の一部は、個々の投稿のレベルを超えて集約することができる。集約された情報のいくつかは、e-governanceや公共の安全、その他の公共の関心のレベルに関心を持って行動すべきイベントを指している。さらに、もし集約すれば、かなりの量の情報が既存の情報ネットワークを非自明な方法で補完することができる。この論文は、この目的を果たす実行可能な情報を抽出する半自動手法を提案する。まず,ドメイン内シナリオとクロスドメインシナリオの両方において,イベントまでの時間予測が可能であることを示す。第2に、アナリストのコンテキストに対する関連性の定義を容易にし、この定義を用いて新しいデータを分析する方法を提案する。最後に,機械学習に基づく関連情報分類手法とルールベースの情報分類手法を統合し,マイクロテキストを分類する手法を提案する。マイクロテキスト解析の完全自動化は、この研究プロジェクトの初日から私たちの目標です。この方向への取り組みは、この自動化がどの程度実現できるかを教えてくれました。主に自動化アプローチを開発し、その後、自動化アプローチのさまざまなステップで人間の介入を統合することで、それを拡張し、改善しました。我々の経験は、情報システムの設計、実現、評価によく設計された人間の介入や貢献が、その性能を改善するか、実現を可能にするかのどちらかであることを示す以前の研究を確認する。我々の研究と成果がその必要性と価値に向けられたので、私たちは人間の関与をデザインする以前の研究からインスピレーションを受け、人間の入力から利益を得るためのアプローチをカスタマイズしました。

Microblogs such as Twitter represent a powerful source of information. Part of this information can be aggregated beyond the level of individual posts. Some of this aggregated information refers to events that could or should be acted upon in the interest of e-governance, public safety, or other levels of public interest. Moreover, a significant amount of this information, if aggregated, could complement existing information networks in a non-trivial way. This dissertation proposes a semi-automatic method for extracting actionable information that serves this purpose. First, we show that predicting time to event is possible for both in-domain and cross-domain scenarios. Second, we suggest a method which facilitates the definition of relevance for an analyst's context and the use of this definition to analyze new data. Finally, we propose a method to integrate the machine learning based relevant information classification method with a rule-based information classification technique to classify microtexts. Fully automating microtext analysis has been our goal since the first day of this research project. Our efforts in this direction informed us about the extent to which this automation can be realized. In most cases we first developed an automated approach, then extended and improved it by integrating human intervention at various steps of the automated approach. Our experience confirms previous work stating that a well-designed human intervention or contribution in the design, realization, or evaluation of an information system either improves its performance or enables its realization. As our studies and results directed us toward its necessity and value, we were inspired by previous studies in designing human involvement and customized our approaches to benefit from human input.

翻訳日:2022-11-04 00:37:29 公開日:2020-08-01

# CLEF 2019 Lab ProtestNewsの概要: クロスコンテキスト設定でニュースから抗議を抽出する

Overview of CLEF 2019 Lab ProtestNews: Extracting Protests from News in a Cross-context Setting ( http://arxiv.org/abs/2008.00345v1 )

ライセンス: Link先を確認

Ali Hürriyetoğlu, Erdem Yörük, Deniz Yüret, Çağrı Yoltar, Burak Gürel, Fırat Duruşan, Osman Mutlu, and Arda Akdemir

(参考訳) 我々は,一般化可能な自然言語処理の文脈において,ニュースからの抗議を取り出すためのclef-2019 lab protestnewsの概要を紹介する。この研究室は、文書、文、トークンレベルの情報分類および抽出タスクから構成されており、それぞれこの研究室の範囲内においてタスク1、タスク2、タスク3と呼ばれる。この作業では、参加者は、上記の1つ以上のレベルにおいて、英語のローカルニュースから抗議活動に関連する情報を識別する必要があった。トレーニングと開発データはインドから収集され、テストデータはインドと中国から収集された。 58チームが参加している。これらのチームの12チームと9チームがそれぞれ結果と作業ノートを提出した。我々は、ニューラルネットワークが最良の結果をもたらすのを観察し、その性能低下は、中国であるクロスカントリーセッティングにおけるほとんどの投稿に対して顕著である。

We present an overview of the CLEF-2019 Lab ProtestNews on Extracting Protests from News in the context of generalizable natural language processing. The lab consists of document, sentence, and token level information classification and extraction tasks, referred to as task 1, task 2, and task 3, respectively. The tasks required the participants to identify protest-relevant information from English local news at one or more of these levels in a cross-context setting, which is cross-country in the scope of this lab. The training and development data were collected from India, and the test data were collected from India and China. The lab attracted 58 teams; 12 and 9 of these teams submitted results and working notes, respectively. We observed that neural networks yield the best results and that performance drops significantly for the majority of the submissions in the cross-country setting, i.e., on the data from China.

翻訳日:2022-11-04 00:37:03 公開日:2020-08-01

# 抗議イベント関連知識ベース構築のためのクロスコンテキストニュースコーパス

Cross-context News Corpus for Protest Events related Knowledge Base Construction ( http://arxiv.org/abs/2008.00351v1 )

ライセンス: Link先を確認

Ali Hürriyetoğlu, Erdem Yörük, Deniz Yüret, Osman Mutlu, Çağrı Yoltar, Fırat Duruşan, Burak Gürel

(参考訳) 英語の様々な国からの様々な地域的・国際的ソースからなる抗議イベントの金本位制コーパスについて述べる。コーパスには文書、文、トークンレベルのアノテーションが含まれている。このコーパスは、ニュース記事を自動的に分類し、抗議イベント関連情報を抽出する機械学習モデルの作成を容易にし、社会科学と政治科学の比較研究を可能にする知識ベースを構築する。各ニュースソースについて、アノテーションはニュース記事のランダムなサンプルから始まり、アクティブな学習を用いて描画されたサンプルで続く。各サンプルのバッチは2人の社会・政治科学者によって注釈監督官によってアノテートされ、アノテーションエラーを半自動的に識別することで改善された。テキスト分類とイベント抽出システムの開発とベンチマークを行う上で,コーパスには多様性と品質があり,テキスト自動処理システムの汎用性と堅牢性に寄与することがわかった。このコーパスと報告された結果は、現在、自動抗議イベント収集研究の共通基盤を欠いている。

We describe a gold standard corpus of protest events that comprises various local and international English-language sources from various countries. The corpus contains document, sentence, and token level annotations. This corpus facilitates creating machine learning models that automatically classify news articles and extract protest event-related information, constructing knowledge bases which enable comparative social and political science studies. For each news source, the annotation starts on random samples of news articles and continues with samples that are drawn using active learning. Each batch of samples was annotated by two social and political scientists, adjudicated by an annotation supervisor, and improved by identifying annotation errors semi-automatically. We found that the corpus has the variety and quality to develop and benchmark text classification and event extraction systems in a cross-context setting, which contributes to the generalizability and robustness of automated text processing systems. This corpus and the reported results will provide the common ground that automated protest event collection studies currently lack.

翻訳日:2022-11-04 00:36:46 公開日:2020-08-01

# 状況分析と危機管理のための時制・局面・ムードに基づくイベント抽出

Tense, aspect and mood based event extraction for situation analysis and crisis management ( http://arxiv.org/abs/2008.01555v1 )

ライセンス: Link先を確認

Ali Hürriyetoğlu

(参考訳) 今日では、イベント抽出システムは主に、状況の時間的およびモード的資格に関する情報を比較的少ない量で処理し、主に過去時制でアサーション文を処理する。しかし、時制、アスペクト、ムードの幅が広いシステムは、より良い分析を提供し、より広い範囲のテキスト分析アプリケーションで使用することができる。この論文はトルコ語のこのような体系を発展させている。これは、オープンソース情報マイニング・分析(OPTIMA)研究グループのイベント抽出ソフトウェアを拡張し、セマンティック表現形式における適切な拡張を実装し、TAM(Tense, Aspect and Mood)マーカーを改善した部分文法、ExPRESSの副詞解析とマッチング機能を追加し、CORLEONEの標準で適切な辞書を構築することで達成される。これらの拡張はアンカー関係の理論(Temürcü, 2007, 2011)に基づいている。その結果、基本的なイベント構造を抽出するだけでなく、その時間的、モーダル的、そして意志的/反復的な値に応じて、ニュースレポートに与えられる文章を分類できるシステムとなった。トルコ語の自然災害、病気の発生、人為的災害のニュースに焦点が当てられているが、他の言語、ドメイン、ジャンルに適応することができる。このイベント抽出・分類システムは、さらなる発展とともに、環境および人道上のリスクを防止するための自動ブラウジングシステムの基礎を提供することができる。

Nowadays event extraction systems mainly deal with a relatively small amount of information about temporal and modal qualifications of situations, primarily processing assertive sentences in the past tense. However, systems with a wider coverage of tense, aspect and mood can provide better analyses and can be used in a wider range of text analysis applications. This thesis develops such a system for the Turkish language. This is accomplished by extending the event extraction software of the Open Source Information Mining and Analysis (OPTIMA) research group, by implementing appropriate extensions in the semantic representation format, by adding a partial grammar which improves the TAM (Tense, Aspect and Mood) marker, adverb analysis and matching functions of ExPRESS, and by constructing an appropriate lexicon in the standard of CORLEONE. These extensions are based on the theory of anchoring relations (Temürcü, 2007, 2011), which is a crosslinguistically applicable semantic framework for analyzing tense, aspect and mood related categories. The result is a system which can, in addition to extracting basic event structures, classify sentences given in news reports according to their temporal, modal and volitional/illocutionary values. Although the focus is on news reports of natural disasters, disease outbreaks and man-made disasters in the Turkish language, the approach can be adapted to other languages, domains and genres. This event extraction and classification system, with further developments, can provide a basis for automated browsing systems for preventing environmental and humanitarian risk.

翻訳日:2022-11-04 00:36:30 公開日:2020-08-01

# L-CNN:マルチストリーム畳み込みニューラルネットワークのための格子相互融合戦略

L-CNN: A Lattice cross-fusion strategy for multistream convolutional neural networks ( http://arxiv.org/abs/2008.00157v1 )

ライセンス: Link先を確認

Ana Paula G. S. de Almeida and Flavio de Barros Vidal

(参考訳) 本稿では,マルチストリーム畳み込みネットワークの融合戦略であるLattice Cross Fusionを提案する。このアプローチは、プール層直前に数学的操作に基づく融合を行う畳み込み層からの信号と交差する。画像分類データセットであるCIFAR-10を改良したAlexNet-LCNNバージョンを用いて目的的に悪化させた結果,この手法は,より高速な収束,安定性,ロバスト性を備えたベースラインシングルストリームネットワークにおいて,46%向上した。

This paper proposes a fusion strategy for multistream convolutional networks, the Lattice Cross Fusion. This approach crosses signals from convolution layers, performing mathematical operation-based fusions right before pooling layers. Results on a purposely worsened CIFAR-10, a popular image classification data set, with a modified AlexNet-LCNN version show that this novel method outperforms the baseline single-stream network by 46%, with faster convergence, stability, and robustness.

翻訳日:2022-11-04 00:30:34 公開日:2020-08-01

# eigen-cam:主成分を用いたクラスアクティベーションマップ

Eigen-CAM: Class Activation Map using Principal Components ( http://arxiv.org/abs/2008.00299v1 )

ライセンス: Link先を確認

Mohammed Bany Muhammad, Mohammed Yeasin

(参考訳) ディープニューラルネットワークは、モデルの開発と他のドメインへの影響により、ユビキタスである。この進歩の中心は畳み込みニューラルネットワーク(cnns)であり、一連のデータから表現や特徴を学習することができる。このような複雑なモデル(数百万のパラメータと数百のレイヤ)を理解することは、開発者だけでなくエンドユーザにとっても難しい。これは部分的には、解釈性と透明性を提供するツールやインターフェースの欠如によるものだ。クラスアクティベーションマップ(クラスアクティベーションマップ,class activation map, CAM)は、モデルがデータから何を学ぶか、あるいはそれが与えられたタスクでどのように振る舞うかを理解することに焦点を当てている。本稿では,解釈可能でロバストで透明なモデルに対する需要の増加に対応するために,従来の考え方を基礎としている。私たちのアプローチは、CAMを生成するためのシンプルで直感的な(あるいは慣れ親しんだ)方法を提供します。提案する固有camは畳み込み層から学習した特徴/表現の原理成分を計算・可視化する。逆雑音の存在下での弱教師付き局所化や局所化などのベンチマークデータセットを評価することにより,固有camと最先端手法(grad-cam,grad-cam++,cnn-fixationsなど)を比較する実験を行った。固有camはcnnの完全連結層による分類エラーに対して頑健であり、勾配のバックプロパゲーションやクラス妥当性スコア、最大アクティベーション位置、その他の重み付け機能には依存していない。さらに、レイヤの変更やモデルの再トレーニングを必要とせずに、すべてのCNNモデルで動作する。その結果, 弱教師付き物体定位法と比較して, 最良手法に比べて最大12%改善した。

Deep neural networks are ubiquitous due to the ease of developing models and their influence on other domains. At the heart of this progress are convolutional neural networks (CNNs), which are capable of learning representations or features given a set of data. Making sense of such complex models (i.e., millions of parameters and hundreds of layers) remains challenging for developers as well as end-users. This is partially due to the lack of tools or interfaces capable of providing interpretability and transparency. A growing body of literature, for example on class activation maps (CAM), focuses on making sense of what a model learns from the data or why it behaves poorly in a given task. This paper builds on previous ideas to cope with the increasing demand for interpretable, robust, and transparent models. Our approach provides a simpler and more intuitive (or familiar) way of generating CAM. The proposed Eigen-CAM computes and visualizes the principal components of the learned features/representations from the convolutional layers. Empirical studies were performed to compare Eigen-CAM with state-of-the-art methods (such as Grad-CAM, Grad-CAM++, CNN-fixations) by evaluating on benchmark tasks such as weakly-supervised localization and localizing objects in the presence of adversarial noise. Eigen-CAM was found to be robust against classification errors made by fully connected layers in CNNs, and it does not rely on the backpropagation of gradients, class relevance scores, maximum activation locations, or any other form of weighting features. In addition, it works with all CNN models without the need to modify layers or retrain models. Empirical results show up to 12% improvement over the best of the compared methods on weakly supervised object localization.
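The core computation, projecting convolutional activations onto their first principal component, can be sketched in plain Python. This is a minimal illustration rather than the authors' code: it assumes the activations are given as `activations[channel][y][x]`, uses power iteration in place of a full SVD, applies no mean-centering, and resolves the sign ambiguity of the singular vector by making the map sum non-negative. The function names are hypothetical.

```python
import math


def first_right_singular_vector(rows, iters=200):
    """Approximate the top right singular vector of a matrix by power iteration on A^T A.

    Caveat: the uniform starting vector fails in the rare case where it is
    exactly orthogonal to the top singular vector.
    """
    n_cols = len(rows[0])
    v = [1.0 / math.sqrt(n_cols)] * n_cols
    for _ in range(iters):
        av = [sum(a * b for a, b in zip(row, v)) for row in rows]                   # A v
        w = [sum(rows[i][j] * av[i] for i in range(len(rows))) for j in range(n_cols)]  # A^T (A v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v


def eigen_cam(activations):
    """Project each spatial location's channel vector onto the first principal direction.

    activations: nested list indexed as [channel][y][x].
    Returns an H x W saliency map (list of rows).
    """
    c, h, w = len(activations), len(activations[0]), len(activations[0][0])
    # One row per spatial location, one column per channel.
    rows = [[activations[k][y][x] for k in range(c)] for y in range(h) for x in range(w)]
    v = first_right_singular_vector(rows)
    proj = [sum(a * b for a, b in zip(row, v)) for row in rows]
    if sum(proj) < 0:  # singular vectors are defined only up to sign
        proj = [-p for p in proj]
    return [proj[y * w:(y + 1) * w] for y in range(h)]
```

Because the map depends only on the activations, no gradients or class scores are needed, which matches the property highlighted in the abstract; in practice the resulting map would typically be resized to the input image and normalized before display.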

翻訳日:2022-11-04 00:30:25 公開日:2020-08-01

# 視覚物体追跡のための効率的な敵対的攻撃

Efficient Adversarial Attacks for Visual Object Tracking ( http://arxiv.org/abs/2008.00217v1 )

ライセンス: Link先を確認

Siyuan Liang, Xingxing Wei, Siyuan Yao and Xiaochun Cao

(参考訳) 視覚物体追跡は、トラッカーが物体を迅速かつ正確に見つけることを要求される重要なタスクである。既存の最先端の物体トラッカー、すなわちSiameseベースのトラッカーは、DNNを用いて高い精度を達成している。しかし、視覚追跡モデルの頑健性はほとんど調査されていない。本稿では、Siameseネットワークに基づく物体トラッカーの弱点を分析し、敵対的サンプルを視覚物体追跡に拡張する。新たなドリフト損失と埋め込み特徴損失を組み合わせてSiameseネットワークベースのトラッカーを攻撃するエンドツーエンドネットワークFAN(Fast Attack Network)を提案する。単一のGPU上で、FANは学習速度が速く、強力な攻撃性能を持つ。FANは敵対的サンプルを10msで生成でき、効果的な標的型攻撃(OTBで少なくとも40%の低下率)と非標的型攻撃(OTBで少なくとも70%の低下率)を達成する。

Visual object tracking is an important task that requires the tracker to find the objects quickly and accurately. The existing state-of-the-art object trackers, i.e., Siamese-based trackers, use DNNs to attain high accuracy. However, the robustness of visual tracking models is seldom explored. In this paper, we analyze the weakness of object trackers based on the Siamese network and then extend adversarial examples to visual object tracking. We present an end-to-end network FAN (Fast Attack Network) that uses a novel drift loss combined with the embedded feature loss to attack Siamese-network-based trackers. On a single GPU, FAN trains efficiently and has strong attack performance. FAN can generate an adversarial example in 10 ms and achieves effective targeted attacks (at least 40% drop rate on OTB) and untargeted attacks (at least 70% drop rate on OTB).

翻訳日:2022-11-04 00:27:51 公開日:2020-08-01

# DaTscan画像上のLIMEを用いたパーキンソン病早期検出のための説明可能な機械学習モデル

An Explainable Machine Learning Model for Early Detection of Parkinson's Disease using LIME on DaTscan Imagery ( http://arxiv.org/abs/2008.00238v1 )

ライセンス: Link先を確認

Pavan Rajkumar Magesh, Richard Delwin Myloth, Rijo Jackson Tom

(参考訳) パーキンソン病(Parkinson's disease、PD)は、変性性かつ進行性の神経疾患である。早期診断は患者の治療を改善でき、SPECT DaTscanのようなドーパミン作動性イメージング技術によって行われる。本研究では、与えられたDaTscanがパーキンソン病であるか否かを正確に分類し、さらにその予測に対する妥当な理由を提示する機械学習モデルを提案する。この種の推論は、Local Interpretable Model-Agnostic Explainer (LIME) 手法で生成された視覚的指標を用いて行われる。DaTscanはParkinson's Progression Markers Initiativeのデータベースから取得し、転移学習を用いてCNN(VGG16)を訓練した結果、精度95.2%、感度97.5%、特異度90.9%を得た。特に医療分野ではモデルの解釈可能性が最重要であることから、本研究ではDaTscan上の視覚的スーパーピクセルを用いたLIMEの説明を利用して、PDと非PDを区別する。提案システムは、測定された解釈可能性と精度を併せ持つことで、パーキンソン病の早期診断において医療従事者を効果的に支援しうると結論づけられる。

Parkinson's disease (PD) is a degenerative and progressive neurological condition. Early diagnosis can improve treatment for patients and is performed through dopaminergic imaging techniques like the SPECT DaTscan. In this study, we propose a machine learning model that accurately classifies any given DaTscan as having Parkinson's disease or not, in addition to providing a plausible reason for the prediction. This kind of reasoning is done through the use of visual indicators generated using Local Interpretable Model-Agnostic Explainer (LIME) methods. DaTscans were drawn from the Parkinson's Progression Markers Initiative database and used to train a CNN (VGG16) with transfer learning, yielding an accuracy of 95.2%, a sensitivity of 97.5%, and a specificity of 90.9%. Keeping model interpretability of paramount importance, especially in the healthcare field, this study utilises LIME explanations to distinguish PD from non-PD, using visual superpixels on the DaTscans. It could be concluded that the proposed system, in combination with its measured interpretability and accuracy, may effectively aid medical workers in the early diagnosis of Parkinson's disease.
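For readers less familiar with the three reported metrics, the sketch below shows how accuracy, sensitivity, and specificity follow from a binary confusion matrix. The helper name `binary_metrics` and the counts are invented for illustration; they are not the study's confusion matrix and do not reproduce its figures.

```python
def binary_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Accuracy, sensitivity (recall on PD), and specificity (recall on non-PD)."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

# Made-up counts for illustration only; not taken from the paper.
metrics = binary_metrics(tp=45, fn=5, tn=40, fp=10)
```

Sensitivity depends only on the PD scans and specificity only on the non-PD scans, so the two can diverge when the classes are imbalanced.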

翻訳日:2022-11-04 00:27:32 公開日:2020-08-01

# 敵対的機械学習の脆弱性: バイアスか分散か?

Vulnerability Under Adversarial Machine Learning: Bias or Variance? ( http://arxiv.org/abs/2008.00138v1 )

ライセンス: Link先を確認

Hossein Aboutalebi, Mohammad Javad Shafiee, Michelle Karg, Christian Scharfenberger, and Alexander Wong

(参考訳) 先行研究により、敵対的機械学習の文脈におけるディープニューラルネットワークの脆弱性が明らかにされ、この領域に近年大きな注目が集まっている。まだ十分に検討されていない興味深い問いは、敵対的機械学習におけるバイアスと分散の関係であり、これはこの振る舞いに関するより深い洞察を与える可能性がある。バイアスと分散の概念は、機械学習モデルの汎化と信頼性を分析・評価するための主要なアプローチの1つである。他の機械学習モデルでは広く使われているが、ディープラーニングの分野では十分に研究されておらず、敵対的機械学習の分野ではさらに研究が少ない。本研究では、訓練済みディープニューラルネットワークのバイアスと分散に対する敵対的機械学習の効果を調査し、敵対的摂動がネットワークの汎化にどのように影響するかを分析する。2つの主要な損失関数、(i) 平均二乗誤差(MSE)と (ii) クロスエントロピーに基づき、分類と回帰の両方についてバイアス-分散トレードオフを導出する。さらに、シミュレーションデータと実データの両方を用いて定量的解析を行い、導出したバイアス-分散トレードオフとの整合性を実証的に評価する。我々の分析は、バイアス-分散の観点から、ディープニューラルネットワークが敵対的摂動の下で性能が劣る理由と、この種の摂動がネットワークの性能をどう変えるかに光を当てる。さらに、これらの新たな理論的知見を踏まえて、よく知られた敵対的機械学習手法(例: PGD)よりも計算複雑性が低く、かつより小さな摂動の大きさでディープニューラルネットワークを騙す高い成功率を持つ、新しい敵対的機械学習アルゴリズムを提案する。

Prior studies have unveiled the vulnerability of the deep neural networks in the context of adversarial machine learning, leading to great recent attention into this area. One interesting question that has yet to be fully explored is the bias-variance relationship of adversarial machine learning, which can potentially provide deeper insights into this behaviour. The notion of bias and variance is one of the main approaches to analyze and evaluate the generalization and reliability of a machine learning model. Although it has been extensively used in other machine learning models, it is not well explored in the field of deep learning and it is even less explored in the area of adversarial machine learning. In this study, we investigate the effect of adversarial machine learning on the bias and variance of a trained deep neural network and analyze how adversarial perturbations can affect the generalization of a network. We derive the bias-variance trade-off for both classification and regression applications based on two main loss functions: (i) mean squared error (MSE), and (ii) cross-entropy. Furthermore, we perform quantitative analysis with both simulated and real data to empirically evaluate consistency with the derived bias-variance tradeoffs. Our analysis sheds light on why the deep neural networks have poor performance under adversarial perturbation from a bias-variance point of view and how this type of perturbation would change the performance of a network. Moreover, given these new theoretical findings, we introduce a new adversarial machine learning algorithm with lower computational complexity than well-known adversarial machine learning strategies (e.g., PGD) while providing a high success rate in fooling deep neural networks in lower perturbation magnitudes.
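As background for the MSE case, the textbook decomposition for a noise-free target says that the expected squared error equals squared bias plus variance. The sketch below checks that identity numerically; it is the standard decomposition, not the paper's derivation under adversarial perturbation, and the target value, offset, and spread are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
target = 1.5                                   # noise-free value to predict
# Hypothetical predictions from 1000 models trained on different samples:
# a systematic offset of 0.3 (bias) plus random spread (variance).
predictions = target + 0.3 + 0.5 * rng.standard_normal(1000)

mse = np.mean((predictions - target) ** 2)
bias_sq = (predictions.mean() - target) ** 2
variance = predictions.var()                   # population variance (ddof=0)
# For empirical moments with ddof=0, mse equals bias_sq + variance up to rounding.
```

With noisy targets an additional irreducible noise term appears on the right-hand side.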

翻訳日:2022-11-04 00:22:09 公開日:2020-08-01

# マルチノードBERT事前学習:コスト効率の高いアプローチ

Multi-node Bert-pretraining: Cost-efficient Approach ( http://arxiv.org/abs/2008.00177v1 )

ライセンス: Link先を確認

Jiahuang Lin, Xin Li, Gennady Pekhimenko

(参考訳) 近年、BERT、GPT-2、XLNetなどの大規模なTransformerベースの言語モデルが、多くの自然言語処理(NLP)タスクの最先端結果に目覚ましい飛躍をもたらした。これらのモデルに共通する傾向の1つは、モデルの複雑さの著しい増加であり、それにより重みと計算量の両方が増える。さらに、大規模な教師なしデータセットの出現に伴い、1回の訓練エポックあたりのデータサンプル数が増加し、訓練時間がさらに延びている。その結果、これらのモデルを妥当な時間内に訓練するために、機械学習(ML)プログラマは、GPUを搭載した高価なNVIDIA DGXワークステーションや、GoogleのTPU Podのような専用アクセラレータといった高度なハードウェア構成を必要とすることが多い。我々の研究はこの制限に対処し、アルゴリズムとソフトウェアの慎重な最適化により、広く利用可能なGPUからなる学術規模のクラスタ上で、BERTの事前学習モデルを2週間以内に訓練できることを実証する。本稿では、単一デバイスでの訓練スループットの向上、複数のノードとGPUへの訓練ワークロードの分散、ネットワーク上での大規模なデータ交換に起因する通信ボトルネックの克服のための最適化について述べる。学術的な環境において、NVIDIA DGXマシンやGoogleのTPU Podに基づく従来の産業環境よりもはるかに安価で控えめなハードウェアリソース要件のもと、妥当な時間予算(12日)でBERTの事前学習を実行できることを示す。

Recently, large scale Transformer-based language models such as BERT, GPT-2, and XLNet have brought about exciting leaps in state-of-the-art results for many Natural Language Processing (NLP) tasks. One of the common trends in these recent models is a significant increase in model complexity, which introduces both more weights and computation. Moreover, with the advent of large-scale unsupervised datasets, training time is further extended due to the increased amount of data samples within a single training epoch. As a result, to train these models within a reasonable time, machine learning (ML) programmers often require advanced hardware setups such as the premium GPU-enabled NVIDIA DGX workstations or specialized accelerators such as Google's TPU Pods. Our work addresses this limitation and demonstrates that the BERT pre-trained model can be trained within 2 weeks on an academic-size cluster of widely available GPUs through careful algorithmic and software optimizations. In this paper, we present these optimizations on how to improve single device training throughput, distribute the training workload over multiple nodes and GPUs, and overcome the communication bottleneck introduced by the large data exchanges over the network. We show that we are able to perform pre-training on BERT within a reasonable time budget (12 days) in an academic setting, but with a much less expensive and less aggressive hardware resource requirement than in previously demonstrated industrial settings based on NVIDIA DGX machines or Google's TPU Pods.

翻訳日:2022-11-04 00:21:43 公開日:2020-08-01

# ブラックボックス予測モデルへの覗き込みのための因果レンズ--因果帰属による予測モデル解釈

A Causal Lens for Peeking into Black Box Predictive Models: Predictive Model Interpretation via Causal Attribution ( http://arxiv.org/abs/2008.00357v1 )

ライセンス: Link先を確認

Aria Khademi, Vasant Honavar

(参考訳) 機械学習を用いて訓練された予測モデルが、医療、セキュリティ、刑事司法、金融、教育など、重大な影響を伴う幅広い応用で採用されるにつれ、そのようなモデルとその予測を説明する効果的な技術の必要性が高まっている。我々は、予測モデルがブラックボックスである設定、すなわち、様々な入力に対するモデルの応答を観察できるだけで、予測モデルの内部構造、パラメータ、目的関数、およびモデルの最適化に使用されるアルゴリズムに関する知識を持たない設定で、この問題に取り組む。ブラックボックス予測モデルを解釈する問題を、モデル入力と対応する出力の観測から、各モデル入力がモデル出力に与える因果効果を推定する問題に帰着させる。観測データから因果効果を推定するためのRubin Neymanの潜在的結果フレームワークの変種を用いて、モデル入力のモデル出力に対する因果効果を推定する。その結果得られる、モデル出力に対する責任の各モデル入力への因果帰属が、予測モデルを解釈し、その予測を説明するためにどのように使用できるかを示す。1つの合成データセット(出力変数に影響を与える入力変数が設計上既知である)と2つの実世界データセット(手書き数字分類、パーキンソン病重症度予測)で訓練されたディープニューラルネットワークモデルについて、因果帰属によるブラックボックス予測モデルの解釈の有効性を示す実験結果を示す。我々の手法は予測モデルのアルゴリズムに関する知識を必要とせず、入出力応答が観測可能であること以外にブラックボックス予測モデルに関する仮定を置かないため、原理的には任意のブラックボックス予測モデルに適用できる。

With the increasing adoption of predictive models trained using machine learning across a wide range of high-stakes applications, e.g., health care, security, criminal justice, finance, and education, there is a growing need for effective techniques for explaining such models and their predictions. We aim to address this problem in settings where the predictive model is a black box; that is, we can only observe the response of the model to various inputs, but have no knowledge about the internal structure of the predictive model, its parameters, the objective function, and the algorithm used to optimize the model. We reduce the problem of interpreting a black box predictive model to that of estimating the causal effects of each of the model inputs on the model output, from observations of the model inputs and the corresponding outputs. We estimate the causal effects of model inputs on model output using variants of the Rubin Neyman potential outcomes framework for estimating causal effects from observational data. We show how the resulting causal attribution of responsibility for model output to the different model inputs can be used to interpret the predictive model and to explain its predictions. We present results of experiments that demonstrate the effectiveness of our approach to the interpretation of black box predictive models via causal attribution in the case of deep neural network models trained on one synthetic data set (where the input variables that impact the output variable are known by design) and two real-world data sets: handwritten digit classification, and Parkinson's disease severity prediction. Because our approach does not require knowledge about the predictive model algorithm and is free of assumptions regarding the black box predictive model except that its input-output responses be observable, it can be applied, in principle, to any black box predictive model.
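The notion of attributing a model's output to its inputs through causal effects can be illustrated with a deliberately simplified sketch. The abstract describes estimation from observational input-output data using variants of the potential outcomes framework. The code below instead queries a hypothetical black box under both settings of a binary input and averages the difference, which is an assumption made to keep the example short. The names `black_box` and `average_effect` are invented here.

```python
import numpy as np

def black_box(x: np.ndarray) -> np.ndarray:
    """Hypothetical opaque model; only its input-output behaviour is used below."""
    return 2.0 * x[:, 0] + x[:, 0] * x[:, 2]   # feature 1 is ignored by design

def average_effect(model, inputs: np.ndarray, feature: int) -> float:
    """Mean difference in output when a binary feature is set to 1 versus 0."""
    treated, control = inputs.copy(), inputs.copy()
    treated[:, feature] = 1.0
    control[:, feature] = 0.0
    return float(np.mean(model(treated) - model(control)))

rng = np.random.default_rng(0)
data = rng.integers(0, 2, size=(500, 3)).astype(float)
effects = [average_effect(black_box, data, j) for j in range(3)]
```

In this toy setup the ignored feature receives an effect of exactly zero, while the other two receive positive effects, which is the kind of attribution the abstract uses to explain predictions.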

翻訳日:2022-11-04 00:21:17 公開日:2020-08-01

# ニューラルネットワークにおける対比的説明

Contrastive Explanations in Neural Networks ( http://arxiv.org/abs/2008.00178v1 )

ライセンス: Link先を確認

Mohit Prabhushankar, Gukyeong Kwon, Dogancan Temel, and Ghassan AlRegib

(参考訳) 視覚的説明とは、ニューラルネットワークによる予測を正当化する、視覚的特徴に基づく論理的議論である。現在の視覚的説明の手法は、$`Why \text{ } P?'$という形式の質問に答える。これらの$Why$の質問は広い文脈で働くため、場合によっては無関係な回答を与える。我々は、これらの$Why$の質問をある文脈$Q$に基づいて制約し、説明が$`Why \text{ } P, \text{} rather \text{ } than \text{ } Q?'$という形式の対比的な質問に答えるようにすることを提案する。本稿では、ニューラルネットワークのための対比的な視覚的説明の構造を定式化する。ニューラルネットワークに基づいてコントラストを定義し、定義したコントラストを抽出する手法を提案する。次に、抽出したコントラストを、既存の$`Why \text{ } P?'$手法、特にGrad-CAMの上に載せるプラグインとして使用する。大規模認識、細粒度認識、地下地震探査解析、画像品質評価といった応用において、ネットワークとデータの両方を分析する上での価値を実証する。

Visual explanations are logical arguments based on visual features that justify the predictions made by neural networks. Current modes of visual explanations answer questions of the form $`Why \text{ } P?'$. These $Why$ questions operate under broad contexts thereby providing answers that are irrelevant in some cases. We propose to constrain these $Why$ questions based on some context $Q$ so that our explanations answer contrastive questions of the form $`Why \text{ } P, \text{} rather \text{ } than \text{ } Q?'$. In this paper, we formalize the structure of contrastive visual explanations for neural networks. We define contrast based on neural networks and propose a methodology to extract defined contrasts. We then use the extracted contrasts as a plug-in on top of existing $`Why \text{ } P?'$ techniques, specifically Grad-CAM. We demonstrate their value in analyzing both networks and data in applications of large-scale recognition, fine-grained recognition, subsurface seismic analysis, and image quality assessment.

翻訳日:2022-11-04 00:20:43 公開日:2020-08-01

# SemEval-2020 タスク7: 編集されたニュース見出しにおけるユーモアの評価

SemEval-2020 Task 7: Assessing Humor in Edited News Headlines ( http://arxiv.org/abs/2008.00304v1 )

ライセンス: Link先を確認

Nabil Hossain, John Krumm, Michael Gamon and Henry Kautz

(参考訳) 本稿では、SemEval-2020共有タスク「Assessing Humor in Edited News Headlines」について述べる。タスクのデータセットには、短い編集を加えて面白くしたニュースの見出しが含まれており、これらの編集済み見出しの面白さはクラウドソーシングを用いて評価された。このタスクには2つのサブタスクがあり、1つ目は、0から3の区間のユーモア尺度で見出しの面白さを推定することである。2つ目のサブタスクは、同じ元の見出しを編集した2つのバージョンについて、どちらがより面白いかを予測することである。現時点で、このタスクは計算ユーモアの共有タスクとして最も人気があり、1つ目のサブタスクに48チーム、2つ目に31チームが参加した。

This paper describes the SemEval-2020 shared task "Assessing Humor in Edited News Headlines." The task's dataset contains news headlines in which short edits were applied to make them funny, and the funniness of these edited headlines was rated using crowdsourcing. This task includes two subtasks, the first of which is to estimate the funniness of headlines on a humor scale in the interval 0-3. The second subtask is to predict, for a pair of edited versions of the same original headline, which is the funnier version. To date, this task is the most popular shared computational humor task, attracting 48 teams for the first subtask and 31 teams for the second.

翻訳日:2022-11-04 00:20:23 公開日:2020-08-01

# LXPER Index:韓国のEFL学生を対象としたカリキュラム別テキスト可読性評価モデル

LXPER Index: a curriculum-specific text readability assessment model for EFL students in Korea ( http://arxiv.org/abs/2008.01564v1 )

ライセンス: Link先を確認

Bruce W. Lee, Jason Hyung-Jong Lee

(参考訳) 自動可読性評価は、教育における自然言語処理(NLP)の最も重要な応用の1つである。自動可読性評価は、あらゆる習熟度の読み手に適した読解教材を迅速に選択できるようにするため、世界中の外国語としての英語(EFL)学習者の英語教育に特に有用である。ほとんどの可読性評価モデルは英語のネイティブ読者向けに開発されており、非ネイティブ向けの英語教育(ELT)カリキュラムのテキストに対しては精度が低い。本稿では、韓国のELTカリキュラムにおける非ネイティブのEFL読者を対象とした可読性評価モデルであるLXPER Indexを紹介する。実験の結果、CoKEC-text(韓国ELTカリキュラムのテキストコーパス)で学習した新モデルは、韓国のELTカリキュラムのテキストに対する自動可読性評価の精度を大幅に向上させることが示された。

Automatic readability assessment is one of the most important applications of Natural Language Processing (NLP) in education. Since automatic readability assessment allows the fast selection of appropriate reading material for readers at all levels of proficiency, it can be particularly useful for the English education of English as a Foreign Language (EFL) students around the world. Most readability assessment models are developed for native readers of English and have low accuracy for texts in the non-native English Language Training (ELT) curriculum. We introduce LXPER Index, which is a readability assessment model for non-native EFL readers in the ELT curriculum of Korea. Our experiments show that our new model, trained with CoKEC-text (Text Corpus of the Korean ELT Curriculum), significantly improves the accuracy of automatic readability assessment for texts in the Korean ELT curriculum.

翻訳日:2022-11-04 00:20:10 公開日:2020-08-01

# ガウス過程回帰におけるスパース変分推論の収束

Convergence of Sparse Variational Inference in Gaussian Processes Regression ( http://arxiv.org/abs/2008.00323v1 )

ライセンス: Link先を確認

David R. Burt and Carl Edward Rasmussen and Mark van der Wilk

(参考訳) ガウス過程は関数上の分布であり、ベイズモデリングにおいて汎用的かつ数学的に扱いやすい事前分布である。しかし、厳密な推論に用いる行列演算のコストが観測数$N$の3乗であるため、観測数の多いデータではその利用が妨げられることが多い。$M \ll N$個の誘導変数に依拠して$\mathcal{O}(NM^2)$のコストで近似を構成する多くの解法が提案されている。計算コストは$N$に対して線形に見えるが、真の複雑さは、近似の一定の品質を保証するために$M$を$N$に対してどのようにスケールさせる必要があるかに依存する。本研究では、高品質な近似を保証するために$M$が$N$とともにどのように増加する必要があるかについて、上界と下界を調べる。ガウス雑音回帰モデルに対して、$M\ll N$で近似モデルと厳密な事後分布の間のKLダイバージェンスを任意に小さくできることを示す。具体的には、広く用いられる二乗指数カーネルと$D$次元のガウス分布に従う共変量に対しては、$M=\mathcal{O}((\log N)^D)$で十分であり、全体の計算コストが$\mathcal{O}(N(\log N)^{2D}(\log\log N)^2)$の手法で推論を実行できる。

Gaussian processes are distributions over functions that are versatile and mathematically convenient priors in Bayesian modelling. However, their use is often impeded for data with large numbers of observations, $N$, due to the cubic (in $N$) cost of matrix operations used in exact inference. Many solutions have been proposed that rely on $M \ll N$ inducing variables to form an approximation at a cost of $\mathcal{O}(NM^2)$. While the computational cost appears linear in $N$, the true complexity depends on how $M$ must scale with $N$ to ensure a certain quality of the approximation. In this work, we investigate upper and lower bounds on how $M$ needs to grow with $N$ to ensure high quality approximations. We show that we can make the KL-divergence between the approximate model and the exact posterior arbitrarily small for a Gaussian-noise regression model with $M\ll N$. Specifically, for the popular squared exponential kernel and $D$-dimensional Gaussian distributed covariates, $M=\mathcal{O}((\log N)^D)$ suffices and a method with an overall computational cost of $\mathcal{O}(N(\log N)^{2D}(\log\log N)^2)$ can be used to perform inference.
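To make the $\mathcal{O}(NM^2)$ cost and the role of $M$ concrete, the following sketch computes the trace of the Nyström residual $K_{ff} - K_{fu}K_{uu}^{-1}K_{uf}$ for a squared exponential kernel. That quantity is commonly used as an indicator of how well $M$ inducing points summarize $N$ inputs. Treating it as the quality measure, picking the first $M$ data points as inducing inputs, and the jitter value are all assumptions of this illustration rather than the paper's procedure.

```python
import numpy as np

def se_kernel(a: np.ndarray, b: np.ndarray, lengthscale: float = 1.0) -> np.ndarray:
    """Squared exponential kernel matrix between the rows of a and b."""
    sq_dist = (np.sum(a**2, axis=1)[:, None]
               + np.sum(b**2, axis=1)[None, :]
               - 2.0 * a @ b.T)
    return np.exp(-0.5 * np.maximum(sq_dist, 0.0) / lengthscale**2)

def nystrom_residual_trace(x: np.ndarray, z: np.ndarray, jitter: float = 1e-6) -> float:
    """tr(Kff - Kfu Kuu^{-1} Kuf); the dominant cost is O(N M^2)."""
    kuu = se_kernel(z, z) + jitter * np.eye(len(z))   # (M, M), jitter for stability
    kuf = se_kernel(z, x)                             # (M, N)
    chol = np.linalg.cholesky(kuu)
    a = np.linalg.solve(chol, kuf)                    # a.T @ a = Kfu Kuu^{-1} Kuf
    # The diagonal of Kff is 1 for this kernel, so tr(Kff) = N.
    return float(len(x) - np.sum(a**2))

rng = np.random.default_rng(0)
x = rng.standard_normal((200, 1))                     # N = 200 Gaussian covariates, D = 1
residuals = [nystrom_residual_trace(x, x[:m]) for m in (2, 5, 10)]
```

Because the inducing sets here are nested, the residual cannot increase as $M$ grows. How quickly it shrinks is what determines how small $M$ can be relative to $N$.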

翻訳日:2022-11-04 00:19:35 公開日:2020-08-01

# プライベートな集団とパブリックな集団の混合からの学習

Learning from Mixtures of Private and Public Populations ( http://arxiv.org/abs/2008.00331v1 )

ライセンス: Link先を確認

Raef Bassily, Shay Moran and Anupama Nandi

(参考訳) 我々は、プライバシー制約下での教師あり学習の新しいモデルの研究に着手する。健康な人と不健康な人の両方を含む集団からデータセットを抽出する医学研究を想像してほしい。健康な個人はプライバシーに懸念を持たず(この場合、そのデータを「パブリック」と呼ぶ)、不健康な個人は自分のデータに対する厳格なプライバシー保護を望むとする。この例では、集団(データ分布)はプライベート(不健康)とパブリック(健康)の部分集団の混合であり、両者は大きく異なりうる。上記の例に着想を得て、集団$\mathcal{D}$が2つの部分集団の混合であるモデルを考える。すなわち、プライベートでセンシティブなデータからなるプライベート部分集団$\mathcal{D}_{\sf priv}$と、プライバシーの懸念のないデータからなるパブリック部分集団$\mathcal{D}_{\sf pub}$である。$\mathcal{D}$から抽出された各例は、その例がプライベートかパブリックかを示すプライバシー状態ビットを含むと仮定する。目標は、プライベートな例に関してのみ差分プライバシーを満たす学習アルゴリズムを設計することである。この文脈の先行研究は、プライベートデータとパブリックデータが同じ分布から生じる均質な集団を仮定し、特にこの仮定を利用する解法を設計していた。本研究では、事例研究として$\mathbb{R}^d$における線形分類器の学習問題を考えることで、この仮定を回避する方法を示す。プライバシー状態がターゲットラベルと相関している場合(上述の例のように)、$\mathbb{R}^d$の線形分類器は、agnostic設定でも実現可能設定でも、古典的な(非プライベートな)PAC学習に匹敵するサンプル複雑度で学習可能であることを示す。すべてのデータをプライベートとみなす場合、このタスクは不可能であることが知られている。

We initiate the study of a new model of supervised learning under privacy constraints. Imagine a medical study where a dataset is sampled from a population of both healthy and unhealthy individuals. Suppose healthy individuals have no privacy concerns (in such case, we call their data "public") while the unhealthy individuals desire stringent privacy protection for their data. In this example, the population (data distribution) is a mixture of private (unhealthy) and public (healthy) sub-populations that could be very different. Inspired by the above example, we consider a model in which the population $\mathcal{D}$ is a mixture of two sub-populations: a private sub-population $\mathcal{D}_{\sf priv}$ of private and sensitive data, and a public sub-population $\mathcal{D}_{\sf pub}$ of data with no privacy concerns. Each example drawn from $\mathcal{D}$ is assumed to contain a privacy-status bit that indicates whether the example is private or public. The goal is to design a learning algorithm that satisfies differential privacy only with respect to the private examples. Prior works in this context assumed a homogeneous population where private and public data arise from the same distribution, and in particular designed solutions which exploit this assumption. We demonstrate how to circumvent this assumption by considering, as a case study, the problem of learning linear classifiers in $\mathbb{R}^d$. We show that in the case where the privacy status is correlated with the target label (as in the above example), linear classifiers in $\mathbb{R}^d$ can be learned, in the agnostic as well as the realizable setting, with sample complexity which is comparable to that of the classical (non-private) PAC-learning. It is known that this task is impossible if all the data is considered private.

翻訳日:2022-11-04 00:19:04 公開日:2020-08-01

# エルゴード・アニーリング

Ergodic Annealing ( http://arxiv.org/abs/2008.00234v1 )

ライセンス: Link先を確認

Carlo Baldassi, Fabio Maccheroni, Massimo Marinacci, Marco Pirazzini

(参考訳) シミュレーテッド・アニーリング(Simulated Annealing)は、コスト関数が既知であるNP困難な最適化問題を解くためのマルコフ連鎖モンテカルロ法の最高傑作である。ここでは、Simulated AnnealingのMetropolisエンジンを、Macau Algorithmと呼ぶ強化学習の変種に置き換えることで、コスト関数が未知で人工エージェントが学習しなければならない場合にも、Simulated Annealingのヒューリスティックが非常に有効になりうることを示す。

Simulated Annealing is the crowning glory of Markov Chain Monte Carlo Methods for the solution of NP-hard optimization problems in which the cost function is known. Here, by replacing the Metropolis engine of Simulated Annealing with a reinforcement learning variation -- that we call Macau Algorithm -- we show that the Simulated Annealing heuristic can be very effective also when the cost function is unknown and has to be learned by an artificial agent.
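For context, the classic baseline that the abstract starts from, Simulated Annealing with a Metropolis acceptance step and a known cost function, can be sketched as follows. This is not the Macau Algorithm, whose details the abstract does not give. The toy cost function, neighbourhood move, and geometric cooling schedule are assumptions chosen for illustration.

```python
import math
import random

def simulated_annealing(cost, neighbour, start, steps=5000, t0=1.0, cooling=0.999, seed=0):
    """Classic simulated annealing with the Metropolis acceptance rule."""
    rng = random.Random(seed)
    current, current_cost = start, cost(start)
    best, best_cost = current, current_cost
    temperature = t0
    for _ in range(steps):
        candidate = neighbour(current, rng)
        candidate_cost = cost(candidate)
        delta = candidate_cost - current_cost
        # Always accept improvements; accept a worse move with
        # probability exp(-delta / T), which shrinks as T cools.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
        temperature *= cooling
    return best, best_cost

# Toy problem (hypothetical): a bumpy function over the integers in [-50, 50].
def toy_cost(x):
    return (x - 7) ** 2 + 10 * math.cos(x)

def step(x, rng):
    return max(-50, min(50, x + rng.choice((-1, 1))))

best_x, best_cost = simulated_annealing(toy_cost, step, start=-40)
```

The acceptance test needs the cost of every candidate, which is exactly the requirement the abstract proposes to relax when the cost function has to be learned.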

翻訳日:2022-11-04 00:18:23 公開日:2020-08-01

PDF登録状況（公開日: 20200801）