Fugu-MT: arxivの論文翻訳

このサイトではarxivの論文のうち、30ページ以下でCreative Commonsライセンス（CC 0, CC BY, CC BY-SA）の論文を日本語訳しています。本文がCCでない論文、長すぎる論文はメタデータのみを翻訳しています。（arxivのメタデータは CC 0です。）翻訳文のライセンスはCC BY-SA 4.0です。翻訳にはFugu-Machine Translatorを利用しています。

本サイトの運営者は本サイト（すべての情報・翻訳含む）の品質を保証せず、本サイト（すべての情報・翻訳含む）を使用して発生したあらゆる結果について一切の責任を負いません。

公開日が20200310となっている論文です。

Title	Authors	Abstract	論文公表日・翻訳日
# オーブリー相におけるイオントラップ量子コンピュータのフォノンモードの特性 Properties of phonon modes of ion trap quantum computer in the Aubry phase ( http://arxiv.org/abs/2002.03730v2 ) ライセンス: Link先を確認	Justin Loye, Jos\'e Lages and Dima L. Shepelyansky	(参考訳) イオン量子コンピュータにおけるフォノンモードの特性を解析的および数値的に研究する。 The ion chain is placed in a harmonic trap with an additional periodic potential which dimensionless amplitude $K$ determines three main phases available for quantum computations: at zero $K$ we have the case of Cirac-Zoller quantum computer, below a certain critical amplitude $K<K_c$ the ions are in the Kolmogorov-Arnold-Moser (KAM) phase, with delocalized phonon modes and free chain sliding, and above the critical amplitude $K>K_c$ ions are in the pinned Aubry phase with a finite frequency gap protecting quantum gates from temperature and other external fluctuations. オーブリー相では、円-ゾラー相とKAM相とは対照的に、フォノンギャップはトラップ中心の周囲に固定されたイオン密度を保持するトラップ内に配置されるイオンの数とは独立である。オーブリー相では, フォノンモードはCirac-Zoller と KAM のケースと比較して, より局所化されている。したがって、オーブリー相ではリコイルパルスはイオンの局所的な振動を引き起こすが、他の2つの相ではイオン鎖全体に急速に拡散し、外部のゆらぎにかなり敏感である。オーブリー相における局所化フォノンモードとフォノンギャップの性質は、多くのイオンを持つこの相におけるイオン量子計算の利点をもたらすと主張する。 We study analytically and numerically the properties of phonon modes in an ion quantum computer. The ion chain is placed in a harmonic trap with an additional periodic potential which dimensionless amplitude $K$ determines three main phases available for quantum computations: at zero $K$ we have the case of Cirac-Zoller quantum computer, below a certain critical amplitude $K<K_c$ the ions are in the Kolmogorov-Arnold-Moser (KAM) phase, with delocalized phonon modes and free chain sliding, and above the critical amplitude $K>K_c$ ions are in the pinned Aubry phase with a finite frequency gap protecting quantum gates from temperature and other external fluctuations. For the Aubry phase, in contrast to the Cirac-Zoller and KAM phases, the phonon gap remains independent of the number of ions placed in the trap keeping a fixed ion density around the trap center. We show that in the Aubry phase the phonon modes are much better localized comparing to the Cirac-Zoller and KAM cases. Thus in the Aubry phase the recoil pulses lead to local oscillations of ions while in other two phases they spread rapidly over the whole ion chains making them rather sensible to external fluctuations. We argue that the properties of localized phonon modes and phonon gap in the Aubry phase provide advantages for the ion quantum computations in this phase with a large number of ions.	翻訳日:2023-06-04 01:54:29 公開日:2020-03-10
# 量子化条件 1900-1927 Quantization Conditions, 1900-1927 ( http://arxiv.org/abs/2003.04466v1 ) ライセンス: Link先を確認	Anthony Duncan, Michel Janssen	(参考訳) 量子化条件の進化は、プランクによる1900年の黒体放射の処理における新しい基本定数(h)の導入から、1927年のハイゼンベルクの不確実性原理による現代の量子力学の可換関係の解釈まで遡る。 We trace the evolution of quantization conditions from Planck's introduction of a new fundamental constant (h) in his treatment of blackbody radiation in 1900 to Heisenberg's interpretation of the commutation relations of modern quantum mechanics in terms of his uncertainty principle in 1927.	翻訳日:2023-05-30 01:19:06 公開日:2020-03-10
# 放射光による量子相互作用 Quantum interactions with pulses of radiation ( http://arxiv.org/abs/2003.04573v1 ) ライセンス: Link先を確認	Alexander Holm Kiilerich and Klaus M{\o}lmer	(参考訳) 本稿では,量子放射の進行パルスと局所量子システムとの相互作用に関する一般マスター方程式の定式化について述べる。移動場は自由空間放射モードの連続体を発生させ、空洞の離散固有モードに有効なJaynes-Cummings模型は適用されない。我々は、任意の入射パルスによる量子系の駆動と、任意の所望の時間モードに放出される場の量子状態を記述する完全な入力出力理論を開発する。我々の理論は、幅広い物質量子系への結合による放射パルスの変換と相互作用に適用できる。パルスと離散放射固有モードとの量子相互作用の最も本質的な違いを考察し、光・マイクロ波・音響波を用いた量子情報プロトコルに関する例を示す。 This article presents a general master equation formalism for the interaction between travelling pulses of quantum radiation and localized quantum systems. Traveling fields populate a continuum of free space radiation modes and the Jaynes-Cummings model, valid for a discrete eigenmode of a cavity, does not apply. We develop a complete input-output theory to describe the driving of quantum systems by arbitrary incident pulses of radiation and the quantum state of the field emitted into any desired outgoing temporal mode. Our theory is applicable to the transformation and interaction of pulses of radiation by their coupling to a wide class of material quantum systems. We discuss the most essential differences between quantum interactions with pulses and with discrete radiative eigenmodes and present examples relevant to quantum information protocols with optical, microwave and acoustic waves.	翻訳日:2023-05-30 01:15:48 公開日:2020-03-10
# 非古典性の幾何測度による量子相転移の同定 Identifying quantum phase transitions via geometric measures of nonclassicality ( http://arxiv.org/abs/2003.04527v1 ) ライセンス: Link先を確認	Kok Chuan Tan	(参考訳) 本稿では、量子相転移を識別するための一般的なツールとして、非古典性の幾何学的測度の使用を理論的に支援する。非古典性の幾何測度の感受性のばらつきは任意の温度で相転移を特定するのに十分な条件であると主張する。このことは、あらゆる量子資源理論において、非古典性の幾何学的測度が量子系の相転移を研究するための一般的なツールであることを証明している。ゼロ温度では、量子コヒーレンスの幾何学的測度は、特に一階量子相転移の同定に有用であり、量子相関の測度を用いる他のアプローチに対する特に堅牢な代替手段であることを示す。 In this article, we provide theoretical support for the use of geometric measures of nonclassicality as a general tool to identify quantum phase transitions. We argue that divergences in the susceptibility of any geometric measure of nonclassicality are sufficient conditions to identify phase transitions at arbitrary temperature. This establishes that geometric measures of nonclassicality, in any quantum resource theory, are generic tools to investigate phase transitions in quantum systems. At zero temperature, we show that geometric measures of quantum coherence are especially useful for identifying first order quantum phase transitions, and can be a particularly robust alternative to other approaches employing measures of quantum correlations.	翻訳日:2023-05-30 01:15:13 公開日:2020-03-10
# 透明ガタブル超伝導シャドージャンクション Transparent Gatable Superconducting Shadow Junctions ( http://arxiv.org/abs/2003.04487v1 ) ライセンス: Link先を確認	Sabbir A. Khan, Charalampos Lampadaris, Ajuan Cui, Lukas Stampfer, Yu Liu, S. J. Pauka, Martin E. Cachaza, Elisabetta M. Fiordaliso, Jung-Hyun Kang, Svetlana Korneychuk, Timo Mutas, Joachim E. Sestoft, Filip Krizek, Rawa Tanta, M. C. Cassidy, Thomas S. Jespersen, Peter Krogstrup	(参考訳) ゲート可変接合は、ハイブリッド半導体-超伝導材料に基づく量子デバイスにおける鍵となる要素である。それらは、トンネル分光プローブからゲートモンおよびトポロジカルキュービットにおける電圧制御キュービット演算まで、多目的に機能する。一般的には、ジャンクションの透明性が重要な役割を果たす。本研究では, 単結晶InAs, InSb, $\mathrm{InAs_{1-x}Sb_x}$ナノワイヤ, エピタキシャル超伝導体, In-situシャドウ接合体を1段階分子線エピタキシャルプロセスで成長させる。本研究は, 接合の加工パラメータ, 接合形態, 電子輸送特性の相関関係について検討し, 実験対象の陰影接合がエッチング接合よりも著しく高品質であることを示す。シャドージャンクションのエッジシャープ性を変化させることで、最もシャープなエッジは3つの半導体の最も高いジャンクション透過性をもたらすことを示した。さらに、臨界超電流測定では、 KO$-$2 の極限に近い非常に高い$I_\mathrm{C} R_\mathrm{N}$が示される。本研究は, ゲート可変超伝導量子ビットへの有望な技術経路を示す。 Gate tunable junctions are key elements in quantum devices based on hybrid semiconductor-superconductor materials. They serve multiple purposes ranging from tunnel spectroscopy probes to voltage-controlled qubit operations in gatemon and topological qubits. Common to all is that junction transparency plays a critical role. In this study, we grow single crystalline InAs, InSb and $\mathrm{InAs_{1-x}Sb_x}$ nanowires with epitaxial superconductors and in-situ shadowed junctions in a single-step molecular beam epitaxy process. We investigate correlations between fabrication parameters, junction morphologies, and electronic transport properties of the junctions and show that the examined in-situ shadowed junctions are of significantly higher quality than the etched junctions. By varying the edge sharpness of the shadow junctions we show that the sharpest edges yield the highest junction transparency for all three examined semiconductors. Further, critical supercurrent measurements reveal an extraordinarily high $I_\mathrm{C} R_\mathrm{N}$, close to the KO$-$2 limit. This study demonstrates a promising engineering path towards reliable gate-tunable superconducting qubits.	翻訳日:2023-05-30 01:14:43 公開日:2020-03-10
# 光シングルサイドバンド変調器を用いた単光子周波数の精密調整 Precise tuning of single-photon frequency using optical single sideband modulator ( http://arxiv.org/abs/2003.04486v1 ) ライセンス: Link先を確認	Hsin-Pin Lo, and Hiroki Takesue	(参考訳) 量子特性を保ちながら単一光子の周波数変換は、フォトニック量子通信システムのフレキシブルネットワークにとって重要な技術である。本稿では,光シングルサイドバンド(ossb)変調器を用いて異なる色光子を結合するフレキシブルスキームを示す。変調器を駆動する電波信号を変更することで、単一光子の周波数をシフトし、正確に調整することができる。 ossb変調器を用いて、非退化光子対の周波数識別性を消去し、視認性90%以上のhong-ou-mandel干渉を得ることに成功した。また,ossb変調器によって識別性レベルを正確に制御できることを実証した。 OSSB変調器は、高度な量子情報システムを実現するためのシンプルで柔軟なフォトニックインタフェースを提供することを期待している。 Frequency translation of single photons while preserving their quantum characteristics is an important technology for flexible networking of photonic quantum communication systems. Here we demonstrate a flexible scheme to interface different-color photons using an optical single sideband (OSSB) modulator. By changing the radio-frequency signal that drives the modulators, we can easily shift and precisely tune the frequency of single photons. Using the OSSB modulator, we successfully erased the frequency distinguishability of non-degenerated photon pairs to obtain the Hong-Ou-Mandel interference with a visibility exceeding 90%. We also demonstrated that the level of distinguishability can be precisely controlled by the OSSB modulator. We expect that the OSSB modulator will provide a simple and flexible photonic interface for realizing advanced quantum information systems.	翻訳日:2023-05-30 01:14:21 公開日:2020-03-10
# 時間ビン量子ビットの制御位相ゲートを用いた絡み合い生成 Entanglement generation using a controlled-phase gate for time-bin qubits ( http://arxiv.org/abs/2003.04483v1 ) ライセンス: Link先を確認	Hsin-Pin Lo, Takuya Ikuta, Nobuyuki Matsuda, Toshimori Honjo, and Hiroki Takesue	(参考訳) 量子論理ゲートは、多くの物理系における量子計算や量子情報処理において重要である。時間ビン量子ビットは光ファイバー上の量子通信に適しているが、多くの重要な量子論理ゲートはまだ実現されていない。そこで我々は,光変調器を用いた2x2光スイッチを用いた時間ビン量子ビットの制御位相(C-Phase)ゲートを実証した。ホン・ウー・マンデル干渉測定の結果、スイッチは時間依存のビームスプリッタとして機能することが分かった。スイッチによるC-Phaseゲート操作により2つの独立した時間ビン量子ビットが絡み合っていることを確認した。 Quantum logic gates are important for quantum computations and quantum information processing in numerous physical systems. While time-bin qubits are suited for quantum communications over optical fiber, many essential quantum logic gates for them have not yet been realized. Here, we demonstrated a controlled-phase (C-Phase) gate for time-bin qubits that uses a 2x2 optical switch based on an electro-optic modulator. A Hong-Ou-Mandel interference measurement showed that the switch could work as a time-dependent beam splitter with a variable spitting ratio. We confirmed that two independent time-bin qubits were entangled as a result of the C-Phase gate operation with the switch.	翻訳日:2023-05-30 01:14:08 公開日:2020-03-10
# 時間ビン量子ビット制御位相ゲートの量子プロセストモグラフィー Quantum process tomography of a controlled-phase gate for time-bin qubits ( http://arxiv.org/abs/2003.04473v1 ) ライセンス: Link先を確認	Hsin-Pin Lo, Takuya Ikuta, Nobuyuki Matsuda, Toshimori Honjo, William J. Munro, and Hiroki Takesue	(参考訳) 情報を1つの光子に異なるタイミングで符号化するタイムビン量子ビットは、光ファイバーや導波路に基づく量子通信で広く使われている。近年の分散量子計算の発展により、時間ビン符号化量子ビットがその文脈で有用かどうかを問うことは論理的である。我々は最近,ニオブ酸リチウム導波路をベースとした2X2光スイッチを用いた時間ビン量子ビット制御相(C-Phase)ゲートを実現し,絡み合った状態の生成を実証した。しかし、実験は入力状態のペアだけで行われ、C-Phaseゲートの機能は完全には検証されなかった。本研究では,量子プロセストモグラフィーを用いて,プロセス忠実度97.1%の確立を行った。さらに,プロセス忠実度が94%以上の制御なしゲート動作を実演した。本研究は,量子量子回路における2量子論理ゲートを時間ビン量子ビットで実装できることを確認し,時間ビン量子ビットに基づく分散量子計算の実現に向けた重要な一歩である。 Time-bin qubits, where information is encoded in a single photon at different times, have been widely used in optical fiber and waveguide based quantum communications. With the recent developments in distributed quantum computation, it is logical to ask whether time-bin encoded qubits may be useful in that context. We have recently realized a time-bin qubit controlled-phase (C-Phase) gate using a 2 X 2 optical switch based on a lithium niobate waveguide, with which we demonstrated the generation of an entangled state. However, the experiment was performed with only a pair of input states, and thus the functionality of the C-Phase gate was not fully verified. In this research, we used quantum process tomography to establish a process fidelity of 97.1%. Furthermore, we demonstrated the controlled-NOT gate operation with a process fidelity greater than 94%. This study confirms that typical two-qubit logic gates used in quantum computational circuits can be implemented with time-bin qubits, and thus it is a significant step forward for realization of distributed quantum computation based on time-bin qubits.	翻訳日:2023-05-30 01:13:57 公開日:2020-03-10
# 無伴奏移民のサポーター支援--社会的-生態学的レジリエンスをめざして Supporting the Supporters of Unaccompanied Migrant Youth: Designing for Social-ecological Resilience ( http://arxiv.org/abs/2003.04799v1 ) ライセンス: Link先を確認	Franziska Tachtler, Toni Michel, Petr Slov\'ak, Geraldine Fitzpatrick	(参考訳) 両親なしで新しい国に逃れる移民の若者は、メンタルヘルスのリスクにさらされている。レジリエンスの介入はそのようなリスクを軽減するが、アクセスはシステム的および個人的障壁によって妨げられる。最近多くの研究がメンタルヘルスを促進するテクノロジーの設計に取り組んできたが、これらの人口のニーズに焦点を絞ったものはない。本稿では,18名の専門職/ボランティア支援作業員と5名の無伴奏移民の若者を対象に,3つのデザインワークショップを開催する。結果は、若者のレジリエンス開発を促進する多様なシステムを示している。若年者とメンターとしてのボランティアの関係は特にレジリエンスを高める上で重要であるが、課題が伴う。このことは、若年者を支援するためにメンターを支援する技術の設計に焦点を当てた、社会的・生態的なレジリエンスモデルとの関連性を示唆している。最後に、メンタサポートのためにデザインスペースをマッピングする。 Unaccompanied migrant youth, fleeing to a new country without their parents, are exposed to mental health risks. Resilience interventions mitigate such risks, but access can be hindered by systemic and personal barriers. While much work has recently addressed designing technology to promote mental health, none has focused on the needs of these populations. This paper presents the results of interviews with 18 professional/ volunteer support workers and 5 unaccompanied migrant youths, followed by three design workshops. The results point to the diverse systems that can facilitate youths' resilience development. The relationship between the youth and volunteers acting as mentors is particularly important for increasing resilience but comes with challenges. This suggests the relevance of a social-ecological model of resilience with a focus on designing technology to support the mentors in order to help them better support the youth. We conclude by mapping out the design space for mentor support.	翻訳日:2023-05-30 01:05:24 公開日:2020-03-10
# 測定する、または測定しない、それが質問である To Measure, or Not to Measure, That is the Question ( http://arxiv.org/abs/2003.04683v1 ) ライセンス: Link先を確認	Juzar Thingna and Peter Talkner	(参考訳) ポインタ状態との接触中に取られる可観測性の値の総和を推定する手法を提案する。これにより、ポインタの状態はシステムに接触しながら更新され、システムが時間内に進化する間は、連絡先間で変化しない。所定の数に接触した後、射影測定によりポインタの位置が決定される。この結果は、ユニタリとマルコフの散逸ダイナミクスの確率分布関数を用いて特定され、観測可能と見なされる観測対象の一般化ガウス測定結果と比較される。特定の例として、量子ビットは、ハミルトニアン系と可換でないポインタに接触する可観測性を持つ。 A method is proposed that allows one to infer the sum of the values of an observable taken during contacts with a pointer state. Hereby the state of the pointer is updated while contacted with the system and remains unchanged between contacts while the system evolves in time. After a prescribed number of such contacts the position of the pointer is determined by means of a projective measurement. The outcome is specified in terms of a probability distribution function for unitary and Markovian dissipative dynamics and compared with the results of the same number of generalized Gaussian measurements of the considered observable. As a particular example a qubit is considered with an observable contacting to the pointer that does not commute with the system Hamiltonian.	翻訳日:2023-05-30 01:04:24 公開日:2020-03-10
# トラップされたイオン量子ビットの量子マスター方程式ダイナミクスの直接再構成 Direct reconstruction of the quantum master equation dynamics of a trapped ion qubit ( http://arxiv.org/abs/2003.04678v1 ) ライセンス: Link先を確認	Eitan Ben Av, Yotam Shapira, Nitzan Akerman and Roee Ozeri	(参考訳) マルコフ開量子系の物理学は量子マスター方程式によって記述できる。これらは力学方程式であり、ハミルトニアンおよびジャンプ作用素を包含し、系の時間発展を生成する。システムのハミルトニアンの再構成とその測定からの環境への結合は、基礎研究と量子機械の性能評価の両方において重要である。本稿では,選択した可観測物の期待値の集合から直接,オープン量子系の力学方程式を再構成する手法を提案する。我々は自発的光子散乱下で捕捉された$^{88}\text{sr}^+$イオンのダイナミクスを測定することで,シミュレーションと実験の両方でこの技術をベンチマークする。 The physics of Markovian open quantum systems can be described by quantum master equations. These are dynamical equations, that incorporate the Hamiltonian and jump operators, and generate the system's time evolution. Reconstructing the system's Hamiltonian and and its coupling to the environment from measurements is important both for fundamental research as well as for performance-evaluation of quantum machines. In this paper we introduce a method that reconstructs the dynamical equation of open quantum systems, directly from a set of expectation values of selected observables. We benchmark our technique both by a simulation and experimentally, by measuring the dynamics of a trapped $^{88}\text{Sr}^+$ ion under spontaneous photon scattering.	翻訳日:2023-05-30 01:04:13 公開日:2020-03-10
# マイクロ波光子を2つまで数える超伝導検出器 A superconducting detector that counts microwave photons up to two ( http://arxiv.org/abs/2003.04625v1 ) ライセンス: Link先を確認	Andrii M. Sokolov and Frank K. Wilhelm	(参考訳) 本研究では,真空状態,1光子状態,および2光子以上の状態の区別が可能なマイクロ波光子の検出器を提案する。その動作はバイアス付きジョセフソン接合における2光子遷移に基づいており、超伝導状態から正常状態への切り替え時に検出される。検出器を理論的にモデル化する。検出器は、数マイクロ秒間に90%以上の成功確率で実行される。 8.2GHzの光子に敏感である。動作周波数は、およそ1GHzから20GHzの範囲の設計段階で設定できる。 We propose a detector of microwave photons which can distinguish the vacuum state, one-photon state, and the states with two or more photons. Its operation is based on the two-photon transition in a biased Josephson junction and detection occurs when it switches from a superconducting to a normal state. We model the detector theoretically. The detector performs with more than 90% success probability in several microseconds. It is sensitive for the 8.2GHz photons. The working frequency could be set at the design stage in the range from about 1GHz to 20GHz.	翻訳日:2023-05-30 01:03:13 公開日:2020-03-10
# unruh効果の検出器としてのデコヒーレンス Decoherence as Detector of the Unruh Effect ( http://arxiv.org/abs/2003.05014v1 ) ライセンス: Link先を確認	Alexander I Nesterov, Gennady P Berman, Manuel A Rodr\'iguez Fern\'andez and Xidi Wang	(参考訳) 本研究では,無質量量子スカラー場と相互作用する検出器の密度行列のデコヒーレンスを計測する新しいタイプのUnruh-DeWitt検出器を提案する。慣性および加速基準系ではデコヒーレンス減衰率が異なることが判明した。指数的位相崩壊は,比較的低加速度で観測でき,unruh効果の測定条件を大幅に改善できることを示した。 We propose a new type of the Unruh-DeWitt detector which measures the decoherence of the reduced density matrix of the detector interacting with the massless quantum scalar field. We find that the decoherence decay rates are different in the inertial and accelerated reference frames. We show that the exponential phase decay can be observed for relatively low accelerations, that can significantly improve the conditions for measuring the Unruh effect.	翻訳日:2023-05-30 00:56:59 公開日:2020-03-10
# 同一性を持つ資源効率の良いゼロノイズ外挿 Resource Efficient Zero Noise Extrapolation with Identity Insertions ( http://arxiv.org/abs/2003.04941v1 ) ライセンス: Link先を確認	Andre He, Benjamin Nachman, Wibe A. de Jong, and Christian W. Bauer	(参考訳) 読み出し誤差に加えて、2量子ゲートノイズは、ノイズの多い中間スケール量子(NISQ)コンピュータ上の複雑な量子アルゴリズムの主要な課題である。これらの誤りは、量子化学、核物理学、高エネルギー物理学、その他の新興科学・産業応用の正確な計算を行う上で重要な課題である。 2ビットゲートエラーの軽減には、エラー訂正符号とゼロノイズ外挿という2つの提案がある。本稿では,後者に着目し,それを詳細に研究し,既存アプローチへの変更を提案する。特に,従来の固定id挿入法 (fiim) よりもはるかに少ないゲートで競争的漸近的精度を達成するためのランダムid挿入法 (riim) を提案する。例えば、先頭方向の非偏極ゲートノイズを修正するには、RIIMでは$n_\text{CNOT}+2$ゲートが必要であり、FIIMでは$3n_\text{CNOT}$ゲートが必要である。この重要なリソース節約により、近未来の量子ハードウェアにおける最先端の計算結果をより正確にすることができる。 In addition to readout errors, two-qubit gate noise is the main challenge for complex quantum algorithms on noisy intermediate-scale quantum (NISQ) computers. These errors are a significant challenge for making accurate calculations for quantum chemistry, nuclear physics, high energy physics, and other emerging scientific and industrial applications. There are two proposals for mitigating two-qubit gate errors: error-correcting codes and zero-noise extrapolation. This paper focuses on the latter, studying it in detail and proposing modifications to existing approaches. In particular, we propose a random identity insertion method (RIIM) that can achieve competitive asymptotic accuracy with far fewer gates than the traditional fixed identity insertion method (FIIM). For example, correcting the leading order depolarizing gate noise requires $n_\text{CNOT}+2$ gates for RIIM instead of $3n_\text{CNOT}$ gates for FIIM. This significant resource saving may enable more accurate results for state-of-the-art calculations on near term quantum hardware.	翻訳日:2023-05-30 00:56:19 公開日:2020-03-10
# 大規模ネットワークにおける量子ページランクのためのTensorFlowソルバー TensorFlow Solver for Quantum PageRank in Large-Scale Networks ( http://arxiv.org/abs/2003.04930v1 ) ライセンス: Link先を確認	Hao Tang, Tian-Shen He, Ruo-Xi Shi, Yan-Yan Zhu, Marcus Lee, Tian-Yu Wang, Xian-Min Jin	(参考訳) Google PageRankは、ネットワーク内のノードやWebサイトの重要度をランク付けするための一般的かつ有用なアルゴリズムである。量子ページランクアルゴリズムは本質的に量子確率的ウォークに基づいており、リンドブラッドマスター方程式を用いて表現することができるが、これはo(n^4)次元のクロネッカー積を解く必要があり、ネットワーク内のノードn数が150を超える場合、非常に大きなメモリと時間を必要とする。本稿では,Lange-Kutta法を用いて行列次元をO(N^2)に減らし,TensorFlowを用いてGPU並列計算を行うことにより,量子PageRankの効率的な解法を提案する。最大922ノードを持つ米国の主要航空会社ネットワークに対して、量子PageRankを解く際の性能を実証する。従来の量子ページランクソルバと比較して,100秒未満でメモリ4-8gbの通常のコンピュータで動作するためには,必要なメモリと時間を1%と0.2%に劇的に削減できる。この大規模量子PageRankと量子確率ウォークの効率的な解法は、現実の応用における量子情報の研究を大いに促進する。 Google PageRank is a prevalent and useful algorithm for ranking the significance of nodes or websites in a network, and a recent quantum counterpart for PageRank algorithm has been raised to suggest a higher accuracy of ranking comparing to Google PageRank. The quantum PageRank algorithm is essentially based on quantum stochastic walks and can be expressed using Lindblad master equation, which, however, needs to solve the Kronecker products of an O(N^4) dimension and requires severely large memory and time when the number of nodes N in a network increases above 150. Here, we present an efficient solver for quantum PageRank by using the Runge-Kutta method to reduce the matrix dimension to O(N^2) and employing TensorFlow to conduct GPU parallel computing. We demonstrate its performance in solving quantum PageRank for the USA major airline network with up to 922 nodes. Compared with the previous quantum PageRank solver, our solver dramatically reduces the required memory and time to only 1% and 0.2%, respectively, making it practical to work in a normal computer with a memory of 4-8 GB in no more than 100 seconds. This efficient solver for large-scale quantum PageRank and quantum stochastic walks would greatly facilitate studies of quantum information in real-life applications.	翻訳日:2023-05-30 00:56:01 公開日:2020-03-10
# 相互作用する2次元格子ゲージ理論における無秩序局在 Disorder-free localization in an interacting two-dimensional lattice gauge theory ( http://arxiv.org/abs/2003.04901v1 ) ライセンス: Link先を確認	P. Karpov, R. Verdel, Y.-P. Huang, M. Schmitt, and M. Heyl	(参考訳) 乱れのない局所化は、ゲージ不変性によって課される局所的制約によって引き起こされる低次元の均質格子ゲージ理論におけるエルゴード性破壊のメカニズムとして最近導入された。また, 2次元空間における真に相互作用する系は, この機構の結果として非エルゴード化できることを示した。具体的には、古典的相関パーコレーション問題を通じて局所化-非局在化遷移の厳密な束を得て量子リンクモデルにおける非エルゴード挙動を証明し、遷移の非エルゴード側でヒルベルト空間の断片化を示唆する。本システムにおける量子力学を,古典スピンの変動ネットワークと人工ニューラルネットワークとの類似性の観点から,効率的かつ摂動的に制御された波動関数の表現を用いて研究する。線形欠陥の伝播を研究することにより, 局所化相とエルゴード相の異なる光円錐構造を生成することにより, 動的特徴を識別する。この研究で導入された手法は、空間次元に関係なく有限次元局所ヒルベルト空間を持つ任意の格子ゲージ理論に適用できる。 Disorder-free localization has been recently introduced as a mechanism for ergodicity breaking in low-dimensional homogeneous lattice gauge theories caused by local constraints imposed by gauge invariance. We show that also genuinely interacting systems in two spatial dimensions can become nonergodic as a consequence of this mechanism. Specifically, we prove nonergodic behavior in the quantum link model by obtaining a rigorous bound on the localization-delocalization transition through a classical correlated percolation problem implying a fragmentation of Hilbert space on the nonergodic side of the transition. We study the quantum dynamics in this system by means of an efficient and perturbatively controlled representation of the wavefunction in terms of a variational network of classical spins akin to artificial neural networks. We identify a distinguishing dynamical signature by studying the propagation of line defects, yielding different light cone structures in the localized and ergodic phases, respectively. The methods we introduce in this work can be applied to any lattice gauge theory with finite-dimensional local Hilbert spaces irrespective of spatial dimensionality.	翻訳日:2023-05-30 00:55:23 公開日:2020-03-10
# 大規模コラボレーションによるスマートシティIoTサービス構築 Smart City IoT Services Creation through Large Scale Collaboration ( http://arxiv.org/abs/2003.04843v1 ) ライセンス: Link先を確認	Flavio Cirillo, David G\'omez, Luis Diez, Ignacio Elicegui Maestro, Thomas Barrie Juel Gilbert, Reza Akhavan	(参考訳) スマートシティソリューションは、センサーデータハンドリングから提供されたサービスまで、モノリシックに実装されることが多い。新しい都市のすべての新しいソリューションに対して、同じ課題が、さまざまな開発者によって頻繁に直面しています。専門知識とノウハウが再利用され、作業が共有される。本稿では,新しいスマートシティソリューションを実装し,コンポーネントの共有を最大化する努力を最小化する手法を提案する。最後の目標は、スマートシティアプリケーション開発者のライブ技術コミュニティを作ることだ。この活動の結果は、ヨーロッパと韓国の27都市で35の都市サービスを実施している。努力を共有するため、開発者はモジュラーアプローチを使ってアプリケーションを開発することを推奨します。他の都市サービスで再利用可能な単一機能コンポーネントはパッケージ化され、アトミックサービスと呼ばれるスタンドアロンコンポーネントとして公開される。データ分析、データ評価、データ統合、データ検証、可視化におけるスマートシティの課題に対処する15のアトミックサービスを特定します。原子サービスの38のインスタンスは、すでにいくつかのスマートシティサービスで運用されている。この記事では、アトミックサービス例、いくつかのデータ予測コンポーネントとして詳述します。さらに、サンタンデールとデンマークの3都市における実世界の原子サービス利用について述べる。結果として生じるアトミックサービスは、スマートシティソリューションのサイドマーケットも生み出しており、専門知識とノウハウを異なる利害関係者が再利用することができる。 Smart cities solutions are often monolithically implemented, from sensors data handling through to the provided services. The same challenges are regularly faced by different developers, for every new solution in a new city. Expertise and know-how can be re-used and the effort shared. In this article we present the methodologies to minimize the efforts of implementing new smart city solutions and maximizing the sharing of components. The final target is to have a live technical community of smart city application developers. The results of this activity comes from the implementation of 35 city services in 27 cities between Europe and South Korea. To share efforts, we encourage developers to devise applications using a modular approach. Single-function components that are re-usable by other city services are packaged and published as standalone components, named Atomic Services. We identify 15 atomic services addressing smart city challenges in data analytics, data evaluation, data integration, data validation, and visualization. 38 instances of the atomic services are already operational in several smart city services. We detail in this article, as atomic service examples, some data predictor components. Furthermore, we describe real-world atomic services usage in the scenarios of Santander and three Danish cities. The resulting atomic services also generate a side market for smart city solutions, allowing expertise and know-how to be re-used by different stakeholders.	翻訳日:2023-05-30 00:54:12 公開日:2020-03-10
# anyon と Gentile の統計量間の変換の中間対称構築 Intermediate symmetric construction of transformation between anyon and Gentile statistics ( http://arxiv.org/abs/2003.06235v1 ) ライセンス: Link先を確認	Yao Shen	(参考訳) 遺伝統計学は、職業数表現における分数統計システムを記述する。任意の統計はこれらの系を巻数表現で研究する。どちらもボース=アインシュタイン統計とフェルミ=ディラック統計の間の中間統計である。 Gentile統計の第二の量子化は、多くの利点を示している。波動関数の対称性の要求に従って、アノンとジェンティル統計の間の変換の一般的な構成を与える。言い換えれば、エノンの第二量子化形式を簡単な方法で導入する。また, 2次量子化作用素, コヒーレント状態, ベリー相の基本関係についても考察した。 Gentile statistics describes fractional statistical systems in the occupation number representation. Anyon statistics researches those systems in the winding number representation. Both of them are intermediate statistics between Bose-Einstein and Fermi-Dirac statistics. The second quantization of Gentile statistics shows a lot of advantages. According to the symmetry requirement of the wave function, we give the general construction of transformation between anyon and Gentile statistics. In other words, we introduce the second quantization form of anyons in a easy way. Basic relations of second quantization operators, the coherent state and Berry phase are also discussed.	翻訳日:2023-05-30 00:44:16 公開日:2020-03-10
# 低学費の大学生研究 Undergraduate Student Research With Low Faculty Cost ( http://arxiv.org/abs/2003.05719v1 ) ライセンス: Link先を確認	Sindhu Kutty, Mark Guzdial	(参考訳) 大学院生は、コンピュータサイエンスの研究が何であるかを知らないなら、コンピュータサイエンスの大学院研究を考えることもない。学部生を研究に紹介する多くのプログラムは大学院研究プログラムのように構成されており、少数の学部生が学部顧問と協力している。さらに、女性、少数派、第1世代の学生は威圧的すぎるか、あるいは研究のアイデアがアモルファスすぎるため、これらのプログラムを見逃してしまう可能性がある。その結果,CS研究の多様性向上の機会を失うことになった。我々は,機械学習と関連分野に焦点をあてた研究グループの一環として,多数の学生(約2ダース)が1人の教員とともに作業する試験プログラムを当社の部門で開始した。このプログラムの目的は、学生に研究キャリアを追求するよう説得することではなく、彼らが将来の研究に望む役割についてより深い決定をさせることである。提案手法を評価するため,匿名の出口調査を2回実施し,学生の体験を抽出した。学生は、コンピュータ科学の研究がどんなものであるかをよりよく理解していると報告している。彼らの研究への関心は、研究を行う能力に対する信頼が報告されているように高まったが、すべての学生がコンピュータ科学の研究機会を追求したいとは思っていなかった。女子学生の報告された経験を踏まえると、このプログラムはCS研究のさらなる多様性の出発点となる。 Undergraduates are unlikely to even consider graduate research in Computer Science if they do not know what Computer Science research is. Many programs aimed at introducing undergraduate to research are structured like graduate research programs, with a small number of undergraduates working with a faculty advisor. Further, females, under-represented minorities, and first generation students may be too intimidated or the idea of research may be too amorphous, so that they miss out on these programs. As a consequence, we lose out on opportunities for greater diversity in CS research. We have started a pilot program in our department where a larger number of students (close to two dozen) work with a single faculty member as part of a research group focused on Machine Learning and related areas. The goal of this program is not to convince students to pursue a research career but rather to enable them to make a more informed decision about what role they would like research to play in their future. In order to evaluate our approach, we elicited student experience via two anonymized exit surveys. Students report that they develop a better understanding of what research in Computer Science is. Their interest in research was increased as was their reported confidence in their ability to do research, although not all students wanted to further pursue computer science research opportunities. Given the reported experience of female students, this program can offer a starting point for greater diversity in CS research.	翻訳日:2023-05-30 00:44:09 公開日:2020-03-10
# SensAI+Expanse Emotional Valence Prediction Study with Cognition and Memory Integration SensAI+Expanse Emotional Valence Prediction Studies with Cognition and Memory Integration ( http://arxiv.org/abs/2001.09746v3 ) ライセンス: Link先を確認	Nuno A. C. Henriques, Helder Coelho, Leonel Garcia-Marques	(参考訳) 人間は感情的で認知的な存在であり、個人や社会的アイデンティティの記憶に依存している。また、人間のダイアド結合は、より良い相互作用のために共感行動のような共通の信念を必要とする。この意味で、人間とエージェントの相互作用に関する研究は、影響、認知、記憶の統合に資源を供給すべきである。開発された人工エージェントシステム(SensAI+Expanse)は、機械学習アルゴリズム、ヒューリスティックス、記憶を認知として、相互作用する人間の感情価予測に役立てる。また、人間を識別可能な相互作用結果に結びつけるために、常に適応共感スコアが存在する。 [...]エージェントはデータの収集に寛容であり、適切な文脈化予測のための学習最善策として、その認知過程を個人に適応させる。この研究は、達成された適応プロセスを活用する。また,従来の研究では,学習アルゴリズムと評価指標の特定の選択肢を用いた個人予測モデルを用いた。達成された解は、高い性能の予測能力、効率的なエネルギー使用、予測確率に対する特徴重要説明を含む。本研究の結果,年齢と性の組み合わせによって有意な感情的有意性行動の差異が認められた。したがって、この研究は認知科学研究を支援することができる人工知能エージェントに寄与する。この能力は、空間と時間で文脈化された人間の感情的原子価を予測することによって、情緒的障害に関するものである。さらに、学習過程やヒューリスティックスは、認知の経済性や環境に対処するための記憶などのタスクに適合する。最後に、これらの貢献には、コンテキストにおける感情的ヴァレンス状態の予測において、達成された年齢と性別の中立性が含まれます。 The humans are affective and cognitive beings relying on memories for their individual and social identities. Also, human dyadic bonds require some common beliefs such as empathetic behaviour for better interaction. In this sense, research studies involving human-agent interaction should resource on affect, cognition, and memory integration. The developed artificial agent system (SensAI+Expanse) includes machine learning algorithms, heuristics, and memory as cognition aids towards emotional valence prediction on the interacting human. Further, an adaptive empathy score is always present in order to engage the human in a recognisable interaction outcome. [...] The agent is resilient on collecting data, adapts its cognitive processes to each human individual in a learning best effort for proper contextualised prediction. The current study make use of an achieved adaptive process. Also, the use of individual prediction models with specific options of the learning algorithm and evaluation metric from a previous research study. The accomplished solution includes a highly performant prediction ability, an efficient energy use, and feature importance explanation for predicted probabilities. Results of the present study show evidence of significant emotional valence behaviour differences between some age ranges and gender combinations. Therefore, this work contributes with an artificial intelligent agent able to assist on cognitive science studies. This ability is about affective disturbances by means of predicting human emotional valence contextualised in space and time. Moreover, contributes with learning processes and heuristics fit to the task including economy of cognition and memory to cope with the environment. Finally, these contributions include an achieved age and gender neutrality on predicting emotional valence states in context and with very good performance for each individual.	翻訳日:2023-01-14 18:14:09 公開日:2020-03-10
# スピンボソンモデルの量子力学写像 The quantum dynamical map of the spin boson model ( http://arxiv.org/abs/2001.04236v2 ) ライセンス: Link先を確認	In\'es de Vega	(参考訳) 量子コンピュータにおける環境の影響を分析する主要なフレームワークの1つは、量子ビットのダイナミックスをよく知られた動的マップの観点で特徴づけることができる純粋退化である。本研究では,このような写像の非摂動的拡張を,この単純な純粋デファッショニングケース,すなわち熱状態のボソニック環境に結合した一般スピンに対して有効であることを示す。この目的のために、トローター分解とマグヌス展開を用いて相互作用図におけるユニタリ進化演算子を単純化する。提案された導出は、多くの体、初期系環境相関状態、多重時間相関関数、量子情報プロトコルを含む他の有限レベルの開量子系にも拡張することができる。 One of the main frameworks to analyze the effects of the environment in a quantum computer is that of pure dephasing, where the dynamics of qubits can be characterised in terms of a well-known dynamical map. In this work we present a non-peturbative extension of such map beyond this simple pure-dephasing case, i.e. that is valid for a general spin coupled to a bosonic environment in a thermal state. To this aim, we use a Trotter decomposition and a Magnus expansion to simplify the unitary evolution operator in interaction picture. The proposed derivation can be extended to other finite-level open quantum systems including many body, initial system-environment correlated states, multiple-time correlation functions or quantum information protocols.	翻訳日:2023-01-11 23:59:57 公開日:2020-03-10
# mixpath:ワンショットニューラルネットワーク検索のための統一アプローチ MixPath: A Unified Approach for One-shot Neural Architecture Search ( http://arxiv.org/abs/2001.05887v3 ) ライセンス: Link先を確認	Xiangxiang Chu, Xudong Li, Shun Lu, Bo Zhang, and Jixiang Li	(参考訳) 複数の畳み込みカーネルのブレンディングは、ニューラルアーキテクチャ設計において有利であることが証明されている。しかし、現在のニューラルアーキテクチャ探索手法は主にスタック化されたシングルパス探索空間に限られている。マルチパスモデルのワンショットドクトリン検索は、まだ未解決のままである。具体的には、候補アーキテクチャを正確に評価するために、マルチパススーパーネットをトレーニングする動機があります。本稿では, 探索空間において, 複数の経路から要約された特徴ベクトルが, スーパーネットトレーニングとそのランク付け能力を乱す単一経路からの特徴ベクトルのほぼ倍であることを示す。本稿では,異なる特徴統計を正規化するシャドウバッチ正規化(sbn)と呼ばれる新しいメカニズムを提案する。大規模な実験により、SBNはトレーニングを安定化し、ランキングパフォーマンスを改善することができる(例えば、NAS-Bench-101でテストされたKendall Tau 0.597)。当社の統一マルチパスワンショットアプローチをmixpathと呼び、imagenetで最先端の結果を得る一連のモデルを生成します。 Blending multiple convolutional kernels is proved advantageous in neural architectural design. However, current neural architecture search approaches are mainly limited to stacked single-path search space. How can the one-shot doctrine search for multi-path models remains unresolved. Specifically, we are motivated to train a multi-path supernet to accurately evaluate the candidate architectures. In this paper, we discover that in the studied search space, feature vectors summed from multiple paths are nearly multiples of those from a single path, which perturbs supernet training and its ranking ability. In this regard, we propose a novel mechanism called Shadow Batch Normalization(SBN) to regularize the disparate feature statistics. Extensive experiments prove that SBN is capable of stabilizing the training and improving the ranking performance (e.g. Kendall Tau 0.597 tested on NAS-Bench-101). We call our unified multi-path one-shot approach as MixPath, which generates a series of models that achieve state-of-the-art results on ImageNet.	翻訳日:2023-01-10 23:27:10 公開日:2020-03-10
# ファウショット学習のための連続的局所的置換 Continual Local Replacement for Few-shot Learning ( http://arxiv.org/abs/2001.08366v2 ) ライセンス: Link先を確認	Canyu Le, Zhonggui Chen, Xihan Wei, Biao Wang, Lei Zhang	(参考訳) 少数ショット学習の目標は、1つまたは複数のトレーニングデータに基づいて新しいクラスを認識できるモデルを学ぶことである。 1)新規クラスの優れた特徴表現が欠如している、(2)ラベル付きデータのいくつかは真のデータ分布を正確に表現できないため、分類のよい決定関数を学ぶのは難しい、という2つの側面から課題となっている。本研究では,高度なネットワークアーキテクチャを用いて,より優れた特徴表現を学習し,第2の課題に注目する。データ不足問題に対処するために,新たな局所的置換戦略を提案する。ラベルのない画像のコンテンツを活用することで、ラベル付き画像が継続的に強化される。具体的には、フライ時に意味的に類似した画像を常に選択するために擬似ラベリング法を採用する。オリジナルラベル付き画像は、次のエポックトレーニングのために選択された画像に局所的に置き換えられる。このように、ラベルのない画像から直接新しい意味情報を学習することができ、埋め込み空間における教師付き信号の容量を大幅に拡大することができる。これにより、モデルは一般化を改善し、分類のためのより良い決定境界を学ぶことができる。私たちの方法は概念的にシンプルで実装が簡単です。大規模な実験により、様々な数ショット画像認識ベンチマークで最先端の結果が得られることが示された。 The goal of few-shot learning is to learn a model that can recognize novel classes based on one or few training data. It is challenging mainly due to two aspects: (1) it lacks good feature representation of novel classes; (2) a few of labeled data could not accurately represent the true data distribution and thus it's hard to learn a good decision function for classification. In this work, we use a sophisticated network architecture to learn better feature representation and focus on the second issue. A novel continual local replacement strategy is proposed to address the data deficiency problem. It takes advantage of the content in unlabeled images to continually enhance labeled ones. Specifically, a pseudo labeling method is adopted to constantly select semantically similar images on the fly. Original labeled images will be locally replaced by the selected images for the next epoch training. In this way, the model can directly learn new semantic information from unlabeled images and the capacity of supervised signals in the embedding space can be significantly enlarged. This allows the model to improve generalization and learn a better decision boundary for classification. Our method is conceptually simple and easy to implement. Extensive experiments demonstrate that it can achieve state-of-the-art results on various few-shot image recognition benchmarks.	翻訳日:2023-01-07 10:03:42 公開日:2020-03-10
# psc-net:学習部空間共起による歩行者検出 PSC-Net: Learning Part Spatial Co-occurrence for Occluded Pedestrian Detection ( http://arxiv.org/abs/2001.09252v2 ) ライセンス: Link先を確認	Jin Xie and Yanwei Pang and Hisham Cholakkal and Rao Muhammad Anwer and Fahad Shahbaz Khan and Ling Shao	(参考訳) 特に重厚な閉塞下で歩行者を検知することは、現実の多くの応用において難しいコンピュータビジョン問題である。本稿では,歩行者検出のための新しいアプローチをPSC-Netと呼ぶ。提案したPSC-Netは、グラフ畳み込みネットワーク(GCN)を介して、異なる歩行者体のパーツ間の共起情報を明示的にキャプチャする専用モジュールを含む。部分的および部分的共起情報は、部分的から重度な咬合まで、様々な咬合レベルを扱うための特徴表現の改善に寄与する。我々のPSC-Netは歩行者のトポロジ的構造を利用しており、空間的共起を学習するために、部分ベースのアノテーションや視覚的バウンディングボックス(VBB)情報を必要としない。総合的な実験は、citypersonsとcaltech datasetsという2つの挑戦的なデータセットで行われている。提案したPSC-Netは,両者の最先端検出性能を実現する。 CityPerosns テストセットのヘビーオクルード (\textbf{HO}) セットでは、当社のPSC-Net は、同じバックボーン、入力スケール、追加の VBB 監督を使わずに、平均誤差率の4.0 % の絶対ゲインを得る。さらに、PSC-Netは、Caltech(\textbf{HO})テストセットのログ平均ミス率の観点から、最先端の37.9から34.8に改善している。 Detecting pedestrians, especially under heavy occlusions, is a challenging computer vision problem with numerous real-world applications. This paper introduces a novel approach, termed as PSC-Net, for occluded pedestrian detection. The proposed PSC-Net contains a dedicated module that is designed to explicitly capture both inter and intra-part co-occurrence information of different pedestrian body parts through a Graph Convolutional Network (GCN). Both inter and intra-part co-occurrence information contribute towards improving the feature representation for handling varying level of occlusions, ranging from partial to severe occlusions. Our PSC-Net exploits the topological structure of pedestrian and does not require part-based annotations or additional visible bounding-box (VBB) information to learn part spatial co-occurrence. Comprehensive experiments are performed on two challenging datasets: CityPersons and Caltech datasets. The proposed PSC-Net achieves state-of-the-art detection performance on both. On the heavy occluded (\textbf{HO}) set of CityPerosns test set, our PSC-Net obtains an absolute gain of 4.0\% in terms of log-average miss rate over the state-of-the-art with same backbone, input scale and without using additional VBB supervision. Further, PSC-Net improves the state-of-the-art from 37.9 to 34.8 in terms of log-average miss rate on Caltech (\textbf{HO}) test set.	翻訳日:2023-01-07 00:19:04 公開日:2020-03-10
# music2dance:音楽駆動ダンス生成のためのダンスネット Music2Dance: DanceNet for Music-driven Dance Generation ( http://arxiv.org/abs/2002.03761v2 ) ライセンス: Link先を確認	Wenlin Zhuang, Congyi Wang, Siyu Xia, Jinxiang Chai, Yangang Wang	(参考訳) 音楽、すなわち音楽からダンスへの人間の動きを合成することは魅力的であり、近年多くの研究関心を集めている。ダンスにリアルで複雑な人間の動きを必要とするだけでなく、より重要なこととして、合成された動きは音楽のスタイル、リズム、メロディと一致すべきである。本稿では,音楽のスタイル,リズム,メロディを制御信号として捉え,高いリアリズムと多様性を持つ3Dダンスモーションを生成するための,新しい自己回帰生成モデルDanceNetを提案する。提案モデルの性能向上のために,プロのダンサーによる複数の同期音楽ダンスペアをキャプチャし,高品質な音楽ダンスペアデータセットを構築する。実験により,提案手法は最先端の結果が得られることを示した。 Synthesize human motions from music, i.e., music to dance, is appealing and attracts lots of research interests in recent years. It is challenging due to not only the requirement of realistic and complex human motions for dance, but more importantly, the synthesized motions should be consistent with the style, rhythm and melody of the music. In this paper, we propose a novel autoregressive generative model, DanceNet, to take the style, rhythm and melody of music as the control signals to generate 3D dance motions with high realism and diversity. To boost the performance of our proposed model, we capture several synchronized music-dance pairs by professional dancers, and build a high-quality music-dance pair dataset. Experiments have demonstrated that the proposed method can achieve the state-of-the-art results.	翻訳日:2023-01-04 20:21:49 公開日:2020-03-10
# 不完全ラベルを用いたマルチタスク感情認識 Multitask Emotion Recognition with Incomplete Labels ( http://arxiv.org/abs/2002.03557v2 ) ライセンス: Link先を確認	Didan Deng, Zhaokang Chen, Bertram E. Shi	(参考訳) 顔行動単位の検出,表情分類,ヴァレンス覚醒推定の3つのタスクを統一したモデルを訓練した。 3つのタスクを学ぶ上での2つの大きな課題に対処します。まず、既存のデータセットの多くは高度に不均衡です。第二に、既存のデータセットのほとんどは、3つのタスクのラベルを含まない。最初の課題に取り組むために、実験データセットにデータバランシング技術を適用する。第2の課題に取り組むために,マルチタスクモデルにおけるラベルの欠落から学習するアルゴリズムを提案する。このアルゴリズムには2つのステップがある。まず3つのタスクすべてを実行するために教師モデルをトレーニングし、各インスタンスは対応するタスクの基底真理ラベルでトレーニングされます。次に,教師モデルの出力をソフトラベルと呼ぶ。学生モデルをトレーニングするために、ソフトラベルと基礎的な真実を使用します。学生のモデルのほとんどは、教師のモデルよりも3つのタスクで優れています。最後に、3つのタスクのパフォーマンスをさらに向上するためにモデルアンサンブルを使用します。 We train a unified model to perform three tasks: facial action unit detection, expression classification, and valence-arousal estimation. We address two main challenges of learning the three tasks. First, most existing datasets are highly imbalanced. Second, most existing datasets do not contain labels for all three tasks. To tackle the first challenge, we apply data balancing techniques to experimental datasets. To tackle the second challenge, we propose an algorithm for the multitask model to learn from missing (incomplete) labels. This algorithm has two steps. We first train a teacher model to perform all three tasks, where each instance is trained by the ground truth label of its corresponding task. Secondly, we refer to the outputs of the teacher model as the soft labels. We use the soft labels and the ground truth to train the student model. We find that most of the student models outperform their teacher model on all the three tasks. Finally, we use model ensembling to boost performance further on the three tasks.	翻訳日:2023-01-02 09:48:55 公開日:2020-03-10
# マンハッタンのような反復環境のトポロジマッピング Topological Mapping for Manhattan-like Repetitive Environments ( http://arxiv.org/abs/2002.06575v3 ) ライセンス: Link先を確認	Sai Shubodh Puligilla, Satyajit Tourani, Tushar Vaidya, Udit Singh Parihar, Ravi Kiran Sarvadevabhatla and K. Madhava Krishna	(参考訳) 我々は,屋内倉庫環境に挑戦するためのトポロジカルマッピングフレームワークを紹介する。最も抽象的なレベルでは、倉庫は、グラフのノードが特定の倉庫のトポロジー構造(例えばラックスペース、回廊)を表し、エッジが隣接する2つのノードまたはトポロジの間の経路の存在を表すトポロジーグラフとして表現される。中間レベルでは、マップはマンハッタングラフとして表現され、ノードとエッジはマンハッタンの特性によって特徴づけられ、最下層の細部ではポーズグラフとして表現される。トポロジ的構造はDeep Convolutional Networkを通じて学習され、トポロジ的インスタンス間の関係性はSiameseスタイルのニューラルネットワークを介して学習される。本稿では,トポロジカルグラフやマンハッタングラフなどの抽象化の維持が,高度に最適化されていないポーズグラフから正確なポーズグラフを復元する上で有効であることを示す。背景となるPose Graph最適化フレームワークの制約として,トポロジ的およびマンハッタン的関係とManhattan Graphのループクロージャ関係を組み込むことによって,これを実現できることを示す。実世界の屋内倉庫シーンにおける地中近傍ポーズグラフの復元は,提案手法の有効性を実証するものである。 We showcase a topological mapping framework for a challenging indoor warehouse setting. At the most abstract level, the warehouse is represented as a Topological Graph where the nodes of the graph represent a particular warehouse topological construct (e.g. rackspace, corridor) and the edges denote the existence of a path between two neighbouring nodes or topologies. At the intermediate level, the map is represented as a Manhattan Graph where the nodes and edges are characterized by Manhattan properties and as a Pose Graph at the lower-most level of detail. The topological constructs are learned via a Deep Convolutional Network while the relational properties between topological instances are learnt via a Siamese-style Neural Network. In the paper, we show that maintaining abstractions such as Topological Graph and Manhattan Graph help in recovering an accurate Pose Graph starting from a highly erroneous and unoptimized Pose Graph. We show how this is achieved by embedding topological and Manhattan relations as well as Manhattan Graph aided loop closure relations as constraints in the backend Pose Graph optimization framework. The recovery of near ground-truth Pose Graph on real-world indoor warehouse scenes vindicate the efficacy of the proposed framework.	翻訳日:2022-12-31 18:15:31 公開日:2020-03-10
# AdaEnsemble Learning Approach for Metro Passenger Flow Forecasting AdaEnsemble Learning Approach for Metro Passenger Flow Forecasting ( http://arxiv.org/abs/2002.07575v2 ) ライセンス: Link先を確認	Shaolong Sun, Dongchuan Yang, Ju-e Guo, Shouyang Wang	(参考訳) 正確な時間的かつタイムリーな乗客フロー予測は、インテリジェント交通システムの導入の成功に不可欠である。しかし,首都圏の旅客流のランダム性や変動に起因して,効率的かつロバストな予測手法を提案することは極めて困難である。そこで本研究では, 変動モード分解(VMD), 季節的自己回帰統合移動平均化(SARIMA), 多層パーセプトロンネットワーク(MLP), 長期記憶(LSTM)ネットワークの相補的利点を組み合わせた適応型アンサンブル(AdaEnsemble)学習手法を提案する。 AdaEnsembleの学習アプローチは3つの重要な段階で構成されている。第1段階では、VMDを適用して、メトロ旅客フローデータを周期成分、決定成分、ボラティリティ成分に分解する。次に、周期成分の予測にSARIMAモデル、決定論的成分の学習と予測にLSTMネットワーク、揮発性成分の予測にMLPネットワークを用いる。最終段階では、様々な予測コンポーネントが別のMLPネットワークによって再構成される。実験の結果,AdaEnsembleの学習手法は,最先端のモデルと比較して最高の予測性能を持つだけでなく,深セン地下鉄の歴史的乗客フローデータといくつかの標準評価基準に基づいて,最も有望かつ堅牢であることがわかった。 Accurate and timely metro passenger flow forecasting is critical for the successful deployment of intelligent transportation systems. However, it is quite challenging to propose an efficient and robust forecasting approach due to the inherent randomness and variations of metro passenger flow. In this study, we present a novel adaptive ensemble (AdaEnsemble) learning approach to accurately forecast the volume of metro passenger flows, and it combines the complementary advantages of variational mode decomposition (VMD), seasonal autoregressive integrated moving averaging (SARIMA), multilayer perceptron network (MLP) and long short-term memory (LSTM) network. The AdaEnsemble learning approach consists of three important stages. The first stage applies VMD to decompose the metro passenger flows data into periodic component, deterministic component and volatility component. Then we employ SARIMA model to forecast the periodic component, LSTM network to learn and forecast deterministic component and MLP network to forecast volatility component. In the last stage, the diverse forecasted components are reconstructed by another MLP network. The empirical results show that our proposed AdaEnsemble learning approach not only has the best forecasting performance compared with the state-of-the-art models but also appears to be the most promising and robust based on the historical passenger flow data in Shenzhen subway system and several standard evaluation measures.	翻訳日:2022-12-30 20:27:21 公開日:2020-03-10
# 観光客到着の季節予測と傾向予測--適応型マルチスケールアンサンブル学習アプローチ Seasonal and Trend Forecasting of Tourist Arrivals: An Adaptive Multiscale Ensemble Learning Approach ( http://arxiv.org/abs/2002.08021v2 ) ライセンス: Link先を確認	Shaolong Suna, Dan Bi, Ju-e Guo, Shouyang Wang	(参考訳) 観光客の到着の正確な季節予測と傾向予測は、非常に難しい課題である。来訪者の季節・傾向予測の重要性を念頭において、限定的な研究がこれらに注意を向けた。本研究では,観光客到着の短期・中・長期の季節・傾向予測のための変分モード分解 (vmd) と最小二乗支持ベクトル回帰 (lssvr) を組み込んだ適応型マルチスケールアンサンブル (ame) 学習手法を開発した。開発したame学習手法の定式化において,本シリーズは,まず傾向,季節,残りのボラティリティ成分に分解される。次に、ARIMAはトレンド成分の予測に使用され、SARIMAは12ヶ月周期で季節成分の予測に使用され、LSSVRは残りの変動成分の予測に使用される。最後に, 3つのコンポーネントの予測結果を集約し, LSSVRに基づく非線形アンサンブル手法により, 旅行者の到着を予測したアンサンブルを生成する。さらに、マルチステップアヘッド予測を実装するために直接戦略を用いる。 2つの精度測定とDiebold-Marianoテストから,本研究で使用した他のベンチマークと比較すると,AME学習手法が高度かつ指向性予測の精度を達成できることが実証された。 The accurate seasonal and trend forecasting of tourist arrivals is a very challenging task. In the view of the importance of seasonal and trend forecasting of tourist arrivals, and limited research work paid attention to these previously. In this study, a new adaptive multiscale ensemble (AME) learning approach incorporating variational mode decomposition (VMD) and least square support vector regression (LSSVR) is developed for short-, medium-, and long-term seasonal and trend forecasting of tourist arrivals. In the formulation of our developed AME learning approach, the original tourist arrivals series are first decomposed into the trend, seasonal and remainders volatility components. Then, the ARIMA is used to forecast the trend component, the SARIMA is used to forecast seasonal component with a 12-month cycle, while the LSSVR is used to forecast remainder volatility components. Finally, the forecasting results of the three components are aggregated to generate an ensemble forecasting of tourist arrivals by the LSSVR based nonlinear ensemble approach. Furthermore, a direct strategy is used to implement multi-step-ahead forecasting. Taking two accuracy measures and the Diebold-Mariano test, the empirical results demonstrate that our proposed AME learning approach can achieve higher level and directional forecasting accuracy compared with other benchmarks used in this study, indicating that our proposed approach is a promising model for forecasting tourist arrivals with high seasonality and volatility.	翻訳日:2022-12-30 14:39:05 公開日:2020-03-10
# DSSLP: 半教師付きリンク予測のための分散フレームワーク DSSLP: A Distributed Framework for Semi-supervised Link Prediction ( http://arxiv.org/abs/2002.12056v2 ) ライセンス: Link先を確認	Dalong Zhang, Xianzheng Song, Ziqi Liu, Zhiqiang Zhang, Xin Huang, Lin Wang, Jun Zhou	(参考訳) リンク予測は、商人の推薦、不正取引検出など、様々な産業用途で広く利用されている。しかし、数十億のノードとエッジを持つ産業規模のグラフ上でリンク予測モデルをトレーニングし、デプロイすることは大きな課題です。本研究では,産業規模のグラフを扱える半教師付きリンク予測問題(DSSLP)のためのスケーラブルで分散的なフレームワークを提案する。 DSSLPは、グラフ全体のトレーニングモデルではなく、ミニバッチ設定でノードの「emph{$k$-hops neighborhood}」でトレーニングすることを提案しており、入力グラフのスケールを小さくし、トレーニング手順を分散するのに役立つ。負の例を効果的に生成するために、DSSLPは分散バッチ実行時サンプリングモジュールを含んでいる。均一および動的サンプリングアプローチを実装し、トレーニングプロセスのガイドとして、正および負のサンプルを適応的に構築することができる。さらにdsslpはリンク予測タスクの推論処理速度を高速化するためのモデル分割戦略を提案する。実験により,産業規模グラフのリアルタイムデータセットだけでなく,サービス公開データセットにおけるDSSLPの有効性と効率が示された。 Link prediction is widely used in a variety of industrial applications, such as merchant recommendation, fraudulent transaction detection, and so on. However, it's a great challenge to train and deploy a link prediction model on industrial-scale graphs with billions of nodes and edges. In this work, we present a scalable and distributed framework for semi-supervised link prediction problem (named DSSLP), which is able to handle industrial-scale graphs. Instead of training model on the whole graph, DSSLP is proposed to train on the \emph{$k$-hops neighborhood} of nodes in a mini-batch setting, which helps reduce the scale of the input graph and distribute the training procedure. In order to generate negative examples effectively, DSSLP contains a distributed batched runtime sampling module. It implements uniform and dynamic sampling approaches, and is able to adaptively construct positive and negative examples to guide the training process. Moreover, DSSLP proposes a model-split strategy to accelerate the speed of inference process of the link prediction task. Experimental results demonstrate that the effectiveness and efficiency of DSSLP in serval public datasets as well as real-world datasets of industrial-scale graphs.	翻訳日:2022-12-28 09:24:36 公開日:2020-03-10
# 逆襲を受ける深部3次元点雲モデルの等尺性について On Isometry Robustness of Deep 3D Point Cloud Models under Adversarial Attacks ( http://arxiv.org/abs/2002.12222v2 ) ライセンス: Link先を確認	Yue Zhao, Yuwei Wu, Caihua Chen, Andrew Lim	(参考訳) 3D領域でのディープラーニングは多くのタスクにおいて革命的なパフォーマンスを達成したが、これらのモデルの堅牢性は十分に研究されていない。 3次元逆数サンプルについて、既存の研究のほとんどは局所点の操作に焦点をあてており、これはユークリッド距離、すなわち等距離を保存する線形射影の下でのロバスト性のような大域幾何学的性質を起こさない可能性がある。本研究では,既存の最先端3次元モデルがアイソメトリー変換に対して極めて脆弱であることを示す。トンプソンサンプリングを用いて、modelnet40データセットで95%以上の成功率を持つブラックボックス攻撃を開発した。制限等尺特性を組み込んだスペクトルノルムに基づく摂動の上に,ホワイトボックス攻撃の新たな枠組みを提案する。従来の研究とは対照的に,我々の反対サンプルは強く伝達可能であることが実験的に示されている。一般的な3dモデルで評価すると、ホワイトボックス攻撃は98.88%から100%の成功率を達成している。許容できない回転範囲$[\pm 2.81^{\circ}]$でも95%以上の攻撃率を維持している。 While deep learning in 3D domain has achieved revolutionary performance in many tasks, the robustness of these models has not been sufficiently studied or explored. Regarding the 3D adversarial samples, most existing works focus on manipulation of local points, which may fail to invoke the global geometry properties, like robustness under linear projection that preserves the Euclidean distance, i.e., isometry. In this work, we show that existing state-of-the-art deep 3D models are extremely vulnerable to isometry transformations. Armed with the Thompson Sampling, we develop a black-box attack with success rate over 95% on ModelNet40 data set. Incorporating with the Restricted Isometry Property, we propose a novel framework of white-box attack on top of spectral norm based perturbation. In contrast to previous works, our adversarial samples are experimentally shown to be strongly transferable. Evaluated on a sequence of prevailing 3D models, our white-box attack achieves success rates from 98.88% to 100%. It maintains a successful attack rate over 95% even within an imperceptible rotation range $[\pm 2.81^{\circ}]$.	翻訳日:2022-12-28 07:48:27 公開日:2020-03-10
# DeepMAL -- マルウェアのトラフィック検出と分類のためのディープラーニングモデル DeepMAL -- Deep Learning Models for Malware Traffic Detection and Classification ( http://arxiv.org/abs/2003.04079v2 ) ライセンス: Link先を確認	Gonzalo Mar\'in, Pedro Casas, Germ\'an Capdehourat	(参考訳) ロバストネットワークセキュリティシステムは、ネットワーク攻撃の持続的発生による害を予防し軽減するために不可欠である。近年、機械学習ベースのシステムはネットワークセキュリティアプリケーションで人気を博しており、通常は、専門家による手作り入力機能の注意深いエンジニアリングに依存する浅いモデルの応用を考慮している。このアプローチの主な制限は、さまざまなシナリオやタイプの攻撃下で手作りの機能がうまく機能しないことだ。ディープラーニング(dl)モデルは、生の非処理データから特徴表現を学習する能力を使って、この制限を解決できる。本稿では,マルウェアネットワークトラフィックの検出と分類に関する特定の問題に対するDLモデルのパワーについて検討する。技術の現状に関して大きな利点として,監視されたバイトのストリームから直接得られる生の計測を提案モデルへの入力として検討し,パケットやフローレベルを含む様々な生のトラヒックの特徴表現を評価する。我々は、悪質なトラフィックの基盤となる統計を、専門的な手作り機能なしで把握できるDLモデルであるDeepMALを紹介した。異なる種類のマルウェアトラフィックを含む公開トラフィックトレースを用いて、DeepMALはマルウェアフローを高精度に検出・分類し、従来の浅層モデルより優れた性能を発揮することを示す。 Robust network security systems are essential to prevent and mitigate the harming effects of the ever-growing occurrence of network attacks. In recent years, machine learning-based systems have gain popularity for network security applications, usually considering the application of shallow models, which rely on the careful engineering of expert, handcrafted input features. The main limitation of this approach is that handcrafted features can fail to perform well under different scenarios and types of attacks. Deep Learning (DL) models can solve this limitation using their ability to learn feature representations from raw, non-processed data. In this paper we explore the power of DL models on the specific problem of detection and classification of malware network traffic. As a major advantage with respect to the state of the art, we consider raw measurements coming directly from the stream of monitored bytes as input to the proposed models, and evaluate different raw-traffic feature representations, including packet and flow-level ones. We introduce DeepMAL, a DL model which is able to capture the underlying statistics of malicious traffic, without any sort of expert handcrafted features. Using publicly available traffic traces containing different families of malware traffic, we show that DeepMAL can detect and classify malware flows with high accuracy, outperforming traditional, shallow-like models.	翻訳日:2022-12-26 23:39:29 公開日:2020-03-10
# 確率最適化アルゴリズムのハイパーパラメータチューニングについて On Hyper-parameter Tuning for Stochastic Optimization Algorithms ( http://arxiv.org/abs/2003.02038v2 ) ライセンス: Link先を確認	Haotian Zhang, Jianyong Sun and Zongben Xu	(参考訳) 本稿では,強化学習に基づく確率最適化アルゴリズムのハイパーパラメータをチューニングするためのアルゴリズムフレームワークを提案する。ハイパーパラメータは進化的アルゴリズム(EA)やメタヒューリスティックスなどの確率最適化アルゴリズムの性能に大きな影響を及ぼす。しかし、これらのアルゴリズムの確率的性質から最適なハイパーパラメータを決定するのは非常に時間がかかる。我々は,マルコフ決定過程としてチューニング手順をモデル化し,ハイパーパラメータをチューニングするためのポリシー勾配アルゴリズムを適用することを提案する。異なる最適化問題(連続的および離散的)に対して、異なる種類のハイパーパラメータ(連続的および離散的)を持つ確率的アルゴリズムをチューニングする実験は、提案するハイパーパラメータチューニングアルゴリズムがベイズ最適化法よりも確率的アルゴリズムの実行時間が少なくないことを示している。提案フレームワークは確率アルゴリズムにおけるハイパーパラメータチューニングの標準ツールとして利用できる。 This paper proposes the first-ever algorithmic framework for tuning hyper-parameters of stochastic optimization algorithm based on reinforcement learning. Hyper-parameters impose significant influences on the performance of stochastic optimization algorithms, such as evolutionary algorithms (EAs) and meta-heuristics. Yet, it is very time-consuming to determine optimal hyper-parameters due to the stochastic nature of these algorithms. We propose to model the tuning procedure as a Markov decision process, and resort the policy gradient algorithm to tune the hyper-parameters. Experiments on tuning stochastic algorithms with different kinds of hyper-parameters (continuous and discrete) for different optimization problems (continuous and discrete) show that the proposed hyper-parameter tuning algorithms do not require much less running times of the stochastic algorithms than bayesian optimization method. The proposed framework can be used as a standard tool for hyper-parameter tuning in stochastic algorithms.	翻訳日:2022-12-26 12:07:11 公開日:2020-03-10
# 放送マッチングビデオにおける個人選手追跡のためのハイブリッド手法 A Hybrid Approach for Tracking Individual Players in Broadcast Match Videos ( http://arxiv.org/abs/2003.03271v2 ) ライセンス: Link先を確認	Roberto L. Castro, Diego Andrade, Basilio Fraguela	(参考訳) ビデオシーケンスで人々を追跡することは、多くの観点からアプローチされた難しいタスクです。このタスクは、放送されたスポーツイベントの選手であるときにさらに複雑になるが、その理由として、頻繁なカメラの動きやスイッチ、プレイヤー間の全体的および部分的閉塞、ビデオの凝固アルゴリズムによるぼやけたフレームなどの困難が存在する。本稿では,高速かつ高精度な選手追跡ソリューションを提案する。これにより、プレイヤーを正確にリアルタイムで追跡することができる。このアプローチは、比較的控えめなハードウェアで同時に実行される複数のモデルを組み合わせており、手ラベルのブロードキャストビデオシーケンスに対して精度が検証されている。精度については,本手法の曲線下領域 (auc) は約0.6であり, art 解の汎用状態と類似していることを示す。性能に関しては80fpsで高精細ビデオ(1920x1080px)を処理できる。 Tracking people in a video sequence is a challenging task that has been approached from many perspectives. This task becomes even more complicated when the person to track is a player in a broadcasted sport event, the reasons being the existence of difficulties such as frequent camera movements or switches, total and partial occlusions between players, and blurry frames due to the codification algorithm of the video. This paper introduces a player tracking solution which is both fast and accurate. This allows to track a player precisely in real-time. The approach combines several models that are executed concurrently in a relatively modest hardware, and whose accuracy has been validated against hand-labeled broadcast video sequences. Regarding the accuracy, the tests show that the area under curve (AUC) of our approach is around 0.6, which is similar to generic state of the art solutions. As for performance, our proposal can process high definition videos (1920x1080 px) at 80 fps.	翻訳日:2022-12-26 01:38:57 公開日:2020-03-10
# Mind the Gap: Open Set Domain Adaptationにおけるドメインギャップの拡大 Mind the Gap: Enlarging the Domain Gap in Open Set Domain Adaptation ( http://arxiv.org/abs/2003.03787v2 ) ライセンス: Link先を確認	Dongliang Chang, Aneeshan Sain, Zhanyu Ma, Yi-Zhe Song, Jun Guo	(参考訳) 教師なしのドメイン適応は、ソースドメインからのラベル付きデータを活用して、ラベルなしのターゲットドメインの分類子を学ぶことを目的としています。その多くの変種の中で、open set domain adaptation (osda) はおそらく最も困難であり、ターゲットドメインに未知のクラスの存在を想定している。本稿では,osda について,より大きな領域間隙を横断する能力の強化に特に焦点をあてて検討する。第一に、既存の最先端手法は、特にOSDA用に再設計された新しいデータセット(PACS)において、より大きなドメインギャップが存在する場合、大幅なパフォーマンス低下を被ることを示す。次に、より大きなドメインギャップに対処する新しいフレームワークを提案する。重要な洞察は、2つのネットワーク間の相互に有益な情報をどのように活用するかである。 a) 既知のクラスと未知のクラスのサンプルを分離する。 b) 未知のサンプルの影響を受けずに、ソースとターゲットドメイン間のドメインの混乱を最大化する。その通りです (a)及び (b)相互に監督し、収束するまで交代する。 Office-31、Office-Home、PACSのデータセットで大規模な実験を行い、他の最先端技術と比較して、我々の手法の優位性を実証した。コードはhttps://github.com/dongliangchang/mutual-to-separate/。 Unsupervised domain adaptation aims to leverage labeled data from a source domain to learn a classifier for an unlabeled target domain. Among its many variants, open set domain adaptation (OSDA) is perhaps the most challenging, as it further assumes the presence of unknown classes in the target domain. In this paper, we study OSDA with a particular focus on enriching its ability to traverse across larger domain gaps. Firstly, we show that existing state-of-the-art methods suffer a considerable performance drop in the presence of larger domain gaps, especially on a new dataset (PACS) that we re-purposed for OSDA. We then propose a novel framework to specifically address the larger domain gaps. The key insight lies with how we exploit the mutually beneficial information between two networks; (a) to separate samples of known and unknown classes, (b) to maximize the domain confusion between source and target domain without the influence of unknown samples. It follows that (a) and (b) will mutually supervise each other and alternate until convergence. Extensive experiments are conducted on Office-31, Office-Home, and PACS datasets, demonstrating the superiority of our method in comparison to other state-of-the-arts. Code available at https://github.com/dongliangchang/Mutual-to-Separate/	翻訳日:2022-12-25 14:34:22 公開日:2020-03-10
# SQUIRL:長軸ロボットマニピュレーションタスクのビデオデモによるロバストで効率的な学習 SQUIRL: Robust and Efficient Learning from Video Demonstration of Long-Horizon Robotic Manipulation Tasks ( http://arxiv.org/abs/2003.04956v1 ) ライセンス: Link先を確認	Bohan Wu, Feng Xu, Zhanpeng He, Abhi Gupta, and Peter K. Allen	(参考訳) 深部強化学習(RL)の最近の進歩は、複雑なロボット操作タスクを学習する可能性を示している。しかし、RLはロボットに大量の現実世界の体験を収集する必要がある。この問題に対処するため、近年の研究では、少数の専門家によるデモンストレーションだけで堅牢なパフォーマンスを実現する能力から、特に逆強化学習(irl)を通じて、エキスパートデモンストレーション(lfd)からの学習を提案している。それでも、実際のロボットにIRLをデプロイすることは、大量のロボット体験を必要とするため、依然として難しい。本稿では,この拡張性に頑健で,サンプル効率が高く,かつ汎用的なメタIRLアルゴリズムであるSQUIRLを用いて取り組むことを目的としている。このアルゴリズムはまず,行動クローニング(BC)を用いたタスクエンコーダとタスク条件付きポリシーの学習をブートストラップする。そして、実際のロボット体験を収集し、報酬学習を回避し、組み合わせたロボットと専門家の軌道からQ関数を直接回収する。次に、このアルゴリズムはQ関数を用いて、ロボットが収集した累積体験を再評価し、ポリシーを迅速に改善する。結局、このポリシーは、テスト時に試行錯誤を必要とせず、新しいタスクでbcよりも堅牢に(90%以上の成功)する。最後に、我々の実ロボットとシミュレーション実験は、異なる状態空間、アクション空間、視覚に基づく操作タスク、例えばピック・プール・プレースやピック・キャリー・ドロップにおけるアルゴリズムの一般化を実証する。 Recent advances in deep reinforcement learning (RL) have demonstrated its potential to learn complex robotic manipulation tasks. However, RL still requires the robot to collect a large amount of real-world experience. To address this problem, recent works have proposed learning from expert demonstrations (LfD), particularly via inverse reinforcement learning (IRL), given its ability to achieve robust performance with only a small number of expert demonstrations. Nevertheless, deploying IRL on real robots is still challenging due to the large number of robot experiences it requires. This paper aims to address this scalability challenge with a robust, sample-efficient, and general meta-IRL algorithm, SQUIRL, that performs a new but related long-horizon task robustly given only a single video demonstration. First, this algorithm bootstraps the learning of a task encoder and a task-conditioned policy using behavioral cloning (BC). It then collects real-robot experiences and bypasses reward learning by directly recovering a Q-function from the combined robot and expert trajectories. Next, this algorithm uses the Q-function to re-evaluate all cumulative experiences collected by the robot to improve the policy quickly. In the end, the policy performs more robustly (90%+ success) than BC on new tasks while requiring no trial-and-errors at test time. Finally, our real-robot and simulated experiments demonstrate our algorithm's generality across different state spaces, action spaces, and vision-based manipulation tasks, e.g., pick-pour-place and pick-carry-drop.	翻訳日:2022-12-24 21:59:08 公開日:2020-03-10
# PL${}_{1}$P -- 3つの視点における部分可視性の下でのポイントライン最小問題 PL${}_{1}$P -- Point-line Minimal Problems under Partial Visibility in Three Views ( http://arxiv.org/abs/2003.05015v1 ) ライセンス: Link先を確認	Timothy Duff, Kathl\'en Kohn, Anton Leykin, Tomas Pajdla	(参考訳) 本稿では,各直線が少なくとも1点に入射した場合に,空間内の点と線の一般的な配置に関する最小限の問題を3つの校正視点カメラで部分的に観察する。これは、オクルージョンによる画像の観察の欠如と検出の欠如を可能にする、興味深い極小問題の大きなクラスである。そのような最小限の問題は無限に存在するが、過剰な特徴を取り除き、カメラを回避し、140616の等価クラスに還元できることが示される。また,最小解法の設計に実用的なカメラミニマル問題を導入し,最小問題ごとに最も単純なカメラミニマル問題を選択する方法を示す。この単純化により、74575同値類が得られる。 76種のみが知られており、残りは新種である。画像マッチングと3次元再構成の実用的解決の可能性を持つ問題を特定するため,300未満の汎用データに対する解を持つカメラ最小問題に対する解数を計算するとともに,カメラ最小問題のより小さなサブファミリ数をいくつか提示する。 We present a complete classification of minimal problems for generic arrangements of points and lines in space observed partially by three calibrated perspective cameras when each line is incident to at most one point. This is a large class of interesting minimal problems that allows missing observations in images due to occlusions and missed detections. There is an infinite number of such minimal problems; however, we show that they can be reduced to 140616 equivalence classes by removing superfluous features and relabeling the cameras. We also introduce camera-minimal problems, which are practical for designing minimal solvers, and show how to pick a simplest camera-minimal problem for each minimal problem. This simplification results in 74575 equivalence classes. Only 76 of these were known; the rest are new. In order to identify problems that have potential for practical solving of image matching and 3D reconstruction, we present several smaller natural subfamilies of camera-minimal problems as well as compute solution counts for all camera-minimal problems which have less than 300 solutions for generic data.	翻訳日:2022-12-24 21:58:08 公開日:2020-03-10
# テラヘルツ通信ネットワークにおける間欠干渉緩和のための強化学習 Reinforcement Learning for Mitigating Intermittent Interference in Terahertz Communication Networks ( http://arxiv.org/abs/2003.04832v1 ) ライセンス: Link先を確認	Reza Barazideh and Omid Semiari and Solmaz Niknam and Balasubramaniam Natarajan	(参考訳) リアルタイム拡張現実感アプリケーションのような極めて高いデータレート要求の無線サービスを創り出すには、将来の無線ネットワークの容量をさらに増やす新しいソリューションを義務付ける必要がある。この場合、テラヘルツ周波数帯における大きな帯域幅を活用することが鍵となる。これらの高周波数での大きな伝搬損失を克服するためには、高方向リンク上での伝送を管理することは避けられない。しかし、多数のユーザによる非コーディネート指向送信はテラヘルツネットワークにかなりの干渉を引き起こす可能性がある。このような干渉は短いランダムな時間間隔で受信されるが、受信した電力は大きい。本研究では,適応型マルチthresholding戦略を用いて,時間領域における方向リンクからの間欠的干渉を効率的に検出し緩和する,強化学習に基づく新しいフレームワークを提案する。最適しきい値を求めるために、問題は多次元多腕バンディットシステムとして定式化される。次に、受信者が非常に低い複雑性で最適な閾値を学習できるアルゴリズムを提案する。提案手法のもう1つの重要な利点は、干渉統計に関する事前の知識に依存しないため、動的シナリオにおける干渉緩和に適していることである。シミュレーションの結果,従来の2つの時間領域干渉緩和法と比較して,提案手法のビットエラーレート性能が良好であることが確認された。 Emerging wireless services with extremely high data rate requirements, such as real-time extended reality applications, mandate novel solutions to further increase the capacity of future wireless networks. In this regard, leveraging large available bandwidth at terahertz frequency bands is seen as a key enabler. To overcome the large propagation loss at these very high frequencies, it is inevitable to manage transmissions over highly directional links. However, uncoordinated directional transmissions by a large number of users can cause substantial interference in terahertz networks. While such interference will be received over short random time intervals, the received power can be large. In this work, a new framework based on reinforcement learning is proposed that uses an adaptive multi-thresholding strategy to efficiently detect and mitigate the intermittent interference from directional links in the time domain. To find the optimal thresholds, the problem is formulated as a multidimensional multi-armed bandit system. Then, an algorithm is proposed that allows the receiver to learn the optimal thresholds with very low complexity. Another key advantage of the proposed approach is that it does not rely on any prior knowledge about the interference statistics, and hence, it is suitable for interference mitigation in dynamic scenarios. Simulation results confirm the superior bit-error-rate performance of the proposed method compared with two traditional time-domain interference mitigation approaches.	翻訳日:2022-12-24 21:56:26 公開日:2020-03-10
# O&G機械学習モデルのデータリニアジ管理:シェールユースケースのためのスイートスポット Managing Data Lineage of O&G Machine Learning Models: The Sweet Spot for Shale Use Case ( http://arxiv.org/abs/2003.04915v1 ) ライセンス: Link先を確認	Raphael Thiago, Renan Souza, L. Azevedo, E. Soares, Rodrigo Santos, Wallas Santos, Max De Bayser, M. Cardoso, M. Moreno, and Renato Cerqueira	(参考訳) 機械学習(ML)は、いくつかの業界で欠かせない役割を担っている。しかしながら、"このモデルをトレーニングするために使用されるデータセットはどこから来たのか?"、いくつかの新しいデータ保護法の導入、データガバナンス要件の必要性など、データ系統のトレーニングに関する疑問は、現実の世界におけるMLモデルの採用を妨げている。本稿では,シェールオイルとガス生産のためのスイートスポットを発見するためのMLモデルを構築するために,MLライフサイクルの恩恵を受けるために,データ系統をどのように活用できるかを論じる。 Machine Learning (ML) has increased its role, becoming essential in several industries. However, questions around training data lineage, such as "where has the dataset used to train this model come from?"; the introduction of several new data protection legislation; and, the need for data governance requirements, have hindered the adoption of ML models in the real world. In this paper, we discuss how data lineage can be leveraged to benefit the ML lifecycle to build ML models to discover sweet-spots for shale oil and gas production, a major application in the Oil and Gas O&G Industry.	翻訳日:2022-12-24 21:56:05 公開日:2020-03-10
# コミュニケーション効率のよい分散ディープラーニング:包括的調査 Communication-Efficient Distributed Deep Learning: A Comprehensive Survey ( http://arxiv.org/abs/2003.06307v1 ) ライセンス: Link先を確認	Zhenheng Tang, Shaohuai Shi, Xiaowen Chu, Wei Wang, Bo Li	(参考訳) 分散ディープラーニングは、ディープモデルやデータセットのサイズが大きくなるにつれて、複数のコンピューティングデバイス(gpuやtpuなど)を活用することで、トレーニング時間全体の削減に非常に一般的なものになる。しかしながら、コンピューティングデバイス間のデータ通信は、システムのスケーラビリティを制限する潜在的なボトルネックになり得る。分散ディープラーニングにおけるコミュニケーション問題への対処法は,近年ホットな研究トピックになりつつある。本稿では,システムレベルの最適化とアルゴリズムレベルの最適化の両方において,通信効率のよい分散学習アルゴリズムの包括的調査を行う。システムレベルでは、通信コストを削減するため、システム設計と実装をデミスティフィケートする。アルゴリズムレベルでは、異なるアルゴリズムと理論収束境界と通信複雑性を比較する。具体的には、まず、通信同期、システムアーキテクチャ、圧縮技術、通信とコンピューティングの並列性という4つの主次元を含むデータ並列分散トレーニングアルゴリズムの分類法を提案する。次に,コミュニケーションコストを比較するために,4次元の問題に対処する研究について述べる。さらに、異なるアルゴリズムの収束率を比較することで、反復の観点からアルゴリズムがどの程度の速度で解に収束できるかを知ることができる。システムレベルの通信コスト分析と理論収束速度比較により、特定の分散環境においてどのアルゴリズムがより効率的かを理解し、潜在的な方向を推定し、さらなる最適化を行うことができる。 Distributed deep learning becomes very common to reduce the overall training time by exploiting multiple computing devices (e.g., GPUs/TPUs) as the size of deep models and data sets increases. However, data communication between computing devices could be a potential bottleneck to limit the system scalability. How to address the communication problem in distributed deep learning is becoming a hot research topic recently. In this paper, we provide a comprehensive survey of the communication-efficient distributed training algorithms in both system-level and algorithmic-level optimizations. In the system-level, we demystify the system design and implementation to reduce the communication cost. In algorithmic-level, we compare different algorithms with theoretical convergence bounds and communication complexity. Specifically, we first propose the taxonomy of data-parallel distributed training algorithms, which contains four main dimensions: communication synchronization, system architectures, compression techniques, and parallelism of communication and computing. Then we discuss the studies in addressing the problems of the four dimensions to compare the communication cost. We further compare the convergence rates of different algorithms, which enable us to know how fast the algorithms can converge to the solution in terms of iterations. According to the system-level communication cost analysis and theoretical convergence speed comparison, we provide the readers to understand what algorithms are more efficient under specific distributed environments and extrapolate potential directions for further optimizations.	翻訳日:2022-12-24 21:55:28 公開日:2020-03-10
# JS-son - リーンで拡張可能なJavaScriptエージェントプログラミングライブラリ JS-son -- A Lean, Extensible JavaScript Agent Programming Library ( http://arxiv.org/abs/2003.04690v1 ) ライセンス: Link先を確認	Timotheus Kampik and Juan Carlos Nieves	(参考訳) エージェント指向のソフトウェアエンジニアリングフレームワークは数多く存在し、そのほとんどは学術的マルチエージェントシステムコミュニティによって開発されている。しかしながら、これらのフレームワークは、JavaScriptやPythonのようなモダンなハイレベルプログラミング言語に慣れているエンジニアのために学ぶのが難しい、プログラミングパラダイムをユーザに課すことが多い。ソフトウェア工学の主流によるエージェント指向プログラミングの導入がいかに容易かを示すため、推論ループエージェントを実装するためのリーンJavaScriptライブラリのプロトタイプを提供する。このライブラリはコアエージェントプログラミングの概念に重点を置いており、プログラミングアプローチにさらなる制限を課すことを控えている。その有用性を説明するために、このライブラリをweb上のマルチエージェントシステムシミュレーションに適用し、クラウドでホストされたfunction-as-a-service環境にデプロイし、pythonベースのデータサイエンスツールに組み込む方法を示す。 A multitude of agent-oriented software engineering frameworks exist, most of which are developed by the academic multi-agent systems community. However, these frameworks often impose programming paradigms on their users that are challenging to learn for engineers who are used to modern high-level programming languages such as JavaScript and Python. To show how the adoption of agent-oriented programming by the software engineering mainstream can be facilitated, we provide a lean JavaScript library prototype for implementing reasoning-loop agents. The library focuses on core agent programming concepts and refrains from imposing further restrictions on the programming approach. To illustrate its usefulness, we show how the library can be applied to multi-agent systems simulations on the web, deployed to cloud-hosted function-as-a-service environments, and embedded in Python-based data science tools.	翻訳日:2022-12-24 21:48:40 公開日:2020-03-10
# 移動目標モンテカルロ Moving Target Monte Carlo ( http://arxiv.org/abs/2003.04873v1 ) ライセンス: Link先を確認	Haoyun Ying, Keheng Mao, Klaus Mosegaard	(参考訳) マルコフ連鎖モンテカルロ法(mcmc)は、高次元確率変数 $\mathbf{x}$ と非正規化確率密度 $p$ と観測データ $\mathbf{d}$ からのサンプリングを考える際によく用いられる。しかし、MCMCは、受容率を構成する際に、提案された候補$\mathbf{x}$の後方分布$p(\mathbf{x}\|\mathbf{d})$を評価する必要がある。このような評価が難しければコストがかかる。本稿では,移動目標モンテカルロ (MTMC) と呼ばれる非マルコフ型サンプリングアルゴリズムを提案する。 n$-th での受け入れ率は、$p(\mathbf{x}\|\mathbf{d})$の代わりに、後方分布 $a_n(\mathbf{x})$ の反復的に更新された近似を用いて構成される。後方の$p(\mathbf{x}\|\mathbf{d})$ の真の値は、候補 $\mathbf{x}$ が受け入れられる場合にのみ計算される。近似$a_n$はこれらの評価を利用し、$n \rightarrow \infty$として$p$に収束する。異なる状況における収束の証明と収束率の推定が与えられる。 The Markov Chain Monte Carlo (MCMC) methods are popular when considering sampling from a high-dimensional random variable $\mathbf{x}$ with possibly unnormalised probability density $p$ and observed data $\mathbf{d}$. However, MCMC requires evaluating the posterior distribution $p(\mathbf{x}\|\mathbf{d})$ of the proposed candidate $\mathbf{x}$ at each iteration when constructing the acceptance rate. This is costly when such evaluations are intractable. In this paper, we introduce a new non-Markovian sampling algorithm called Moving Target Monte Carlo (MTMC). The acceptance rate at $n$-th iteration is constructed using an iteratively updated approximation of the posterior distribution $a_n(\mathbf{x})$ instead of $p(\mathbf{x}\|\mathbf{d})$. The true value of the posterior $p(\mathbf{x}\|\mathbf{d})$ is only calculated if the candidate $\mathbf{x}$ is accepted. The approximation $a_n$ utilises these evaluations and converges to $p$ as $n \rightarrow \infty$. A proof of convergence and estimation of convergence rate in different situations are given.	翻訳日:2022-12-24 21:48:18 公開日:2020-03-10
# 通信効率ばらつき低減確率勾配降下 Communication-efficient Variance-reduced Stochastic Gradient Descent ( http://arxiv.org/abs/2003.04686v1 ) ライセンス: Link先を確認	Hossein S. Ghadikolaei and Sindri Magnusson	(参考訳) 複数のノードが各イテレーションで重要なアルゴリズム情報を交換して大きな問題を解決する通信効率のよい分散最適化の問題を考える。特に,確率的分散還元勾配に着目し,通信効率を高めるための新しい手法を提案する。すなわち、元の非圧縮アルゴリズムの線形収束率を維持しながら、通信された情報を数ビットに圧縮する。実データ集合の包括的理論的および数値的解析により,本アルゴリズムは通信の複雑さを最大 95 % 削減できることを明らかにした。さらに、分散最適化問題を解くための最先端アルゴリズムよりも、量子化(真の最小値と収束率の維持の観点から)がはるかに堅牢である。この結果は,モノのインターネットやモバイルネットワーク上での機械学習利用に重要な意味を持つ。 We consider the problem of communication efficient distributed optimization where multiple nodes exchange important algorithm information in every iteration to solve large problems. In particular, we focus on the stochastic variance-reduced gradient and propose a novel approach to make it communication-efficient. That is, we compress the communicated information to a few bits while preserving the linear convergence rate of the original uncompressed algorithm. Comprehensive theoretical and numerical analyses on real datasets reveal that our algorithm can significantly reduce the communication complexity, by as much as 95\%, with almost no noticeable penalty. Moreover, it is much more robust to quantization (in terms of maintaining the true minimizer and the convergence rate) than the state-of-the-art algorithms for solving distributed optimization problems. Our results have important implications for using machine learning over internet-of-things and mobile networks.	翻訳日:2022-12-24 21:47:26 公開日:2020-03-10
# ヘテロティックラインバンドルモデルによる探索と爆発 Explore and Exploit with Heterotic Line Bundle Models ( http://arxiv.org/abs/2003.04817v1 ) ライセンス: Link先を確認	Magdalena Larfors and Robin Schneider	(参考訳) 我々は、完全区間カラビヤウ(CICY)多様体上のラインバンドル和から構築されたヘテロティック $SU(5)$ GUT モデルのクラスを、深層強化学習を用いて探索する。我々は,a3cエージェントがモデル探索を訓練する実験を複数実施する。これらのエージェントはランダムな探索よりも優れており、ユニークなモデルを見つける上で最も好ましい設定は1700倍である。さらに、訓練されたエージェントが新しい多様体上のランダムウォーカーよりも優れているという証拠も発見する。エージェントは圧縮データ中に隠れた構造を検知し,その一部が一般の性質であることがわかった。実験は$h^{(1,1)}$でうまくスケールし、大きな$h^{(1,1)}$でCICY上でのモデル構築の鍵を提供するかもしれない。 We use deep reinforcement learning to explore a class of heterotic $SU(5)$ GUT models constructed from line bundle sums over Complete Intersection Calabi Yau (CICY) manifolds. We perform several experiments where A3C agents are trained to search for such models. These agents significantly outperform random exploration, in the most favourable settings by a factor of 1700 when it comes to finding unique models. Furthermore, we find evidence that the trained agents also outperform random walkers on new manifolds. We conclude that the agents detect hidden structures in the compactification data, which is partly of general nature. The experiments scale well with $h^{(1,1)}$, and may thus provide the key to model building on CICYs with large $h^{(1,1)}$.	翻訳日:2022-12-24 21:47:13 公開日:2020-03-10
# データ分析によるレジリエンス犯罪ネットワークの破壊:シチリア・マフィアの事例 Disrupting Resilient Criminal Networks through Data Analysis: The case of Sicilian Mafia ( http://arxiv.org/abs/2003.05303v1 ) ライセンス: Link先を確認	Lucia Cavallaro, Annamaria Ficara, Pasquale De Meo, Giacomo Fiumara, Salvatore Catanese, Ovidiu Bagdasar and Antonio Liotta	(参考訳) 他のタイプのソーシャルネットワークと比較すると、犯罪ネットワークは破壊に対する強い弾力性があり、法執行機関に厳しいハードルをもたらすため、困難な課題を呈している。ここではソーシャルネットワーク分析から手法やツールを借りて (i)現実世界の2つのデータセットに基づくシチリアマフィアギャングの構造を明らかにし、 (ii)それらを効率的にディスラプトする方法についての洞察を得る。マフィアネットワークには、リンクの分布と強度により、他のソーシャルネットワークとは大きく異なる特徴があり、外因性摂動に対して非常に堅牢である。アナリストはまた、ギャングの内部構造と外界との関係を正確に記述する信頼できるデータセットの収集が困難であることに直面している。私たちの研究の付加価値は、2000年代前半にシチリアで活動したマフィア組織に関連する、法律的な行為に由来する生のデータに基づく、2つの現実世界のデータセットの生成です。 2つの異なるネットワークを作り、それぞれ電話と物理的な会議を捉えました。ネットワーク破壊分析は異なる介入手順をシミュレートしました一一度に一人の犯罪者を逮捕すること(次回的ノード削除) (ii)警察の襲撃(ノードの取り外し)。各アプローチの有効性を,複数のネットワーク集中度指標を用いて測定した。その中では、アフィリエイトの5%だけを中和することで、ネットワーク接続が70%低下したことを示す。また,重み付きネットワーク分析と非重み付きネットワーク分析では,犯罪ネットワークにおける特異な相互作用タイプ(すなわち相互作用頻度の分布)が有意な差は認められなかった。我々の研究は犯罪やテロリストのネットワークに対処するための重要な実践的応用を持っている。 Compared to other types of social networks, criminal networks present hard challenges, due to their strong resilience to disruption, which poses severe hurdles to law-enforcement agencies. Herein, we borrow methods and tools from Social Network Analysis to (i) unveil the structure of Sicilian Mafia gangs, based on two real-world datasets, and (ii) gain insights as to how to efficiently disrupt them. Mafia networks have peculiar features, due to the links distribution and strength, which makes them very different from other social networks, and extremely robust to exogenous perturbations. Analysts are also faced with the difficulty in collecting reliable datasets that accurately describe the gangs' internal structure and their relationships with the external world, which is why earlier studies are largely qualitative, elusive and incomplete. An added value of our work is the generation of two real-world datasets, based on raw data derived from juridical acts, relating to a Mafia organization that operated in Sicily during the first decade of 2000s. We created two different networks, capturing phone calls and physical meetings, respectively. Our network disruption analysis simulated different intervention procedures: (i) arresting one criminal at a time (sequential node removal); and (ii) police raids (node block removal). We measured the effectiveness of each approach through a number of network centrality metrics. We found Betweeness Centrality to be the most effective metric, showing how, by neutralizing only the 5% of the affiliates, network connectivity dropped by 70%. We also identified that, due the peculiar type of interactions in criminal networks (namely, the distribution of the interactions frequency) no significant differences exist between weighted and unweighted network analysis. Our work has significant practical applications for tackling criminal and terrorist networks.	翻訳日:2022-12-24 21:40:49 公開日:2020-03-10
# DymSLAM:4次元幾何学的モーションセグメンテーションに基づく動的シーン再構成 DymSLAM:4D Dynamic Scene Reconstruction Based on Geometrical Motion Segmentation ( http://arxiv.org/abs/2003.04569v1 ) ライセンス: Link先を確認	Chenjie Wang and Bin Luo and Yun Zhang and Qing Zhao and Lu Yin and Wei Wang and Xin Su and Yajun Wang and Chengyuan Li	(参考訳) ほとんどのSLAMアルゴリズムは、シーンが静的であるという仮定に基づいている。しかし実際には、ほとんどのシーンは動的で、通常動く物体を含んでいるが、これらの手法は適していない。本稿では、4D(3D + Time)動的シーンを剛体移動物体で再構成できる動的ステレオ視覚SLAMシステムDymSLAMを紹介する。 DymSLAMの唯一の入力はステレオビデオであり、その出力には、静止環境の密度の高いマップ、移動物体の3Dモデル、カメラと移動物体の軌跡が含まれる。まず、従来のSLAM手法を用いて、連続するフレーム間の興味深い点を検出し、マッチングする。次に、異なる運動モデルに属する興味深いポイント(剛体移動物体のエゴ運動や運動モデルを含む)をマルチモデルフィッティングアプローチで分割する。エゴモーションに属する興味深い点に基づいて、カメラの軌跡を推定し、静的な背景を再構築することができる。剛体移動物体の運動モデルに属する興味深い点は、相対運動モデルとカメラとの相対運動モデルの推定と、物体の3次元モデル再構築に使用される。次に、グローバル参照フレーム内の移動物体の軌道に対する相対運動を変換する。最後に,移動物体の3次元モデルを環境の3次元マップに融合させ,その運動軌跡を考慮し,4D(3D+time)配列を得る。 dymslamは、それを無視する代わりに動的オブジェクトに関する情報を取得し、未知の剛体オブジェクトに適している。そこで,提案システムでは,ロボットを動的物体に対する障害物回避などのハイレベルなタスクに活用することができる。我々は,カメラと物体が広範囲に移動している実環境において実験を行った。 Most SLAM algorithms are based on the assumption that the scene is static. However, in practice, most scenes are dynamic which usually contains moving objects, these methods are not suitable. In this paper, we introduce DymSLAM, a dynamic stereo visual SLAM system being capable of reconstructing a 4D (3D + time) dynamic scene with rigid moving objects. The only input of DymSLAM is stereo video, and its output includes a dense map of the static environment, 3D model of the moving objects and the trajectories of the camera and the moving objects. We at first detect and match the interesting points between successive frames by using traditional SLAM methods. Then the interesting points belonging to different motion models (including ego-motion and motion models of rigid moving objects) are segmented by a multi-model fitting approach. Based on the interesting points belonging to the ego-motion, we are able to estimate the trajectory of the camera and reconstruct the static background. The interesting points belonging to the motion models of rigid moving objects are then used to estimate their relative motion models to the camera and reconstruct the 3D models of the objects. We then transform the relative motion to the trajectories of the moving objects in the global reference frame. Finally, we then fuse the 3D models of the moving objects into the 3D map of the environment by considering their motion trajectories to obtain a 4D (3D+time) sequence. DymSLAM obtains information about the dynamic objects instead of ignoring them and is suitable for unknown rigid objects. Hence, the proposed system allows the robot to be employed for high-level tasks, such as obstacle avoidance for dynamic objects. We conducted experiments in a real-world environment where both the camera and the objects were moving in a wide range.	翻訳日:2022-12-24 21:40:22 公開日:2020-03-10
# アンサンブル色空間モデルを用いた逆例への取り組み Using an ensemble color space model to tackle adversarial examples ( http://arxiv.org/abs/2003.05005v1 ) ライセンス: Link先を確認	Shreyank N Gowda, Chun Yuan	(参考訳) 画像中の微小ピクセルの変化は、ディープラーニングモデルが生み出す予測を大幅に変える。例えば、これが原因で起こりうる最も重要な問題の1つは、自動運転だ。これに対処するために様々な方法が提案されている。このような攻撃を防御する3段階の手法を提案する。まず,統計的手法を用いて画像の識別を行う。第二に、同じモデルに複数の色空間を採用することで、各色空間がそれ自身に明示的な特徴を検出することによって、これらの敵対的攻撃と戦うことができることを示す。最後に、生成された特徴マップを拡大し、入力として送り返してさらに小さな特徴を得る。提案モデルは,特定の攻撃を防御するために訓練される必要はなく,本質的にはブラックボックス,ホワイトボックス,グレイボックスの対向攻撃技術に頑健であることを示す。特に、モデルが敵対的な例の訓練を受けていない場合、ホワイトボックス攻撃の場合、このモデルは比較モデルよりも56.12%頑丈である。 Minute pixel changes in an image drastically change the prediction that the deep learning model makes. One of the most significant problems that could arise due to this, for instance, is autonomous driving. Many methods have been proposed to combat this with varying amounts of success. We propose a 3 step method for defending such attacks. First, we denoise the image using statistical methods. Second, we show that adopting multiple color spaces in the same model can help us to fight these adversarial attacks further as each color space detects certain features explicit to itself. Finally, the feature maps generated are enlarged and sent back as an input to obtain even smaller features. We show that the proposed model does not need to be trained to defend an particular type of attack and is inherently more robust to black-box, white-box, and grey-box adversarial attack techniques. In particular, the model is 56.12 percent more robust than compared models in case of white box attacks when the models are not subject to adversarial example training.	翻訳日:2022-12-24 21:37:39 公開日:2020-03-10
# 医療画像セグメンテーションのための組込み集合知識の多レベルコンテキストゲーティング Multi-level Context Gating of Embedded Collective Knowledge for Medical Image Segmentation ( http://arxiv.org/abs/2003.05056v1 ) ライセンス: Link先を確認	Maryam Asadi-Aghbolaghi, Reza Azad, Mahmood Fathy, and Sergio Escalera	(参考訳) 様々な症例で解剖学的に大きな違いがあるため, 医用画像のセグメンテーションは非常に困難である。ディープラーニングフレームワークの最近の進歩は、画像セグメンテーションの高速で正確なパフォーマンスを示している。既存のネットワークのうち、u-netは医療画像のセグメンテーションにうまく適用されている。本稿では,U-Net,Squeeze and Excitation (SE) ブロック,双方向 ConvLSTM (BConvLSTM) および高密度畳み込みのメカニズムを最大限に活用する医用画像分割のためのU-Netの拡張を提案する。 (I) U-Net内のSEモジュールを利用することでセグメンテーション性能を向上し、モデルの複雑さに小さな影響を与える。これらのブロックは、特徴マップのグローバルな情報埋め込みの自己ゲーティング機構を利用して、チャネルワイドな特徴応答を適応的に補正する。 (II) 特徴伝播を強化し,特徴再利用を促進するため,符号化パスの最後の畳み込み層に密結合した畳み込みを用いる。 (III) U-Netのスキップ接続における単純な結合の代わりに、BConvLSTMをネットワークのすべてのレベルで使用し、対応する符号化パスから抽出された特徴マップと、以前のデコードアップ畳み込みレイヤを非線形に組み合わせる。提案モデルは,isic 2017と2018の6つのデータセット,肺分画,$ph^2$,細胞核分画で評価され,最新性能が得られた。 Medical image segmentation has been very challenging due to the large variation of anatomy across different cases. Recent advances in deep learning frameworks have exhibited faster and more accurate performance in image segmentation. Among the existing networks, U-Net has been successfully applied on medical image segmentation. In this paper, we propose an extension of U-Net for medical image segmentation, in which we take full advantages of U-Net, Squeeze and Excitation (SE) block, bi-directional ConvLSTM (BConvLSTM), and the mechanism of dense convolutions. (I) We improve the segmentation performance by utilizing SE modules within the U-Net, with a minor effect on model complexity. These blocks adaptively recalibrate the channel-wise feature responses by utilizing a self-gating mechanism of the global information embedding of the feature maps. (II) To strengthen feature propagation and encourage feature reuse, we use densely connected convolutions in the last convolutional layer of the encoding path. (III) Instead of a simple concatenation in the skip connection of U-Net, we employ BConvLSTM in all levels of the network to combine the feature maps extracted from the corresponding encoding path and the previous decoding up-convolutional layer in a non-linear way. The proposed model is evaluated on six datasets DRIVE, ISIC 2017 and 2018, lung segmentation, $PH^2$, and cell nuclei segmentation, achieving state-of-the-art performance.	翻訳日:2022-12-24 21:37:23 公開日:2020-03-10
# 深海塩分予測アーキテクチャ Tidying Deep Saliency Prediction Architectures ( http://arxiv.org/abs/2003.04942v1 ) ライセンス: Link先を確認	Navyasri Reddy, Samyak Jain, Pradeep Yarlagadda, Vineet Gandhi	(参考訳) 視覚注意のための計算モデル(サリエンシー推定)の学習は、機械やロボットを人間の視覚認知能力に近づける努力である。データ駆動の取り組みは、ディープニューラルネットワークアーキテクチャの導入以来、ランドスケープを支配してきた。ディープラーニングの研究において、アーキテクチャ設計の選択はしばしば経験的であり、必要以上に複雑なモデルにつながる。複雑さはアプリケーションの要求を妨げます。本稿では,saliencyモデルの4つのキーコンポーネント,すなわち入力機能,マルチレベル統合,読み出しアーキテクチャ,損失関数について述べる。これら4つの構成要素について,既存の技術モデルについて概観し,新しい,よりシンプルな代替案を提案する。そこで,本稿では,simplenet と mdnsal という2つの新しいエンド・ツー・エンドのアーキテクチャを提案する。 SimpleNetは最適化されたエンコーダ-デコーダアーキテクチャであり、SALICONデータセット(最大の唾液度ベンチマーク)で顕著なパフォーマンス向上をもたらす。 MDNSalは、GMM分布のパラメータを直接予測するパラメトリックモデルであり、予測マップにさらなる解釈可能性をもたらすことを目的としている。提案した精度モデルは25fpsで推定でき、リアルタイムアプリケーションに適している。コードと事前トレーニングされたモデルはhttps://github.com/samyak0210/saliencyで利用可能である。 Learning computational models for visual attention (saliency estimation) is an effort to inch machines/robots closer to human visual cognitive abilities. Data-driven efforts have dominated the landscape since the introduction of deep neural network architectures. In deep learning research, the choices in architecture design are often empirical and frequently lead to more complex models than necessary. The complexity, in turn, hinders the application requirements. In this paper, we identify four key components of saliency models, i.e., input features, multi-level integration, readout architecture, and loss functions. We review the existing state of the art models on these four components and propose novel and simpler alternatives. As a result, we propose two novel end-to-end architectures called SimpleNet and MDNSal, which are neater, minimal, more interpretable and achieve state of the art performance on public saliency benchmarks. SimpleNet is an optimized encoder-decoder architecture and brings notable performance gains on the SALICON dataset (the largest saliency benchmark). MDNSal is a parametric model that directly predicts parameters of a GMM distribution and is aimed to bring more interpretability to the prediction maps. The proposed saliency models can be inferred at 25fps, making them suitable for real-time applications. Code and pre-trained models are available at https://github.com/samyak0210/saliency.	翻訳日:2022-12-24 21:30:56 公開日:2020-03-10
# ラベルのないビデオからビデオオブジェクトのセグメンテーションを学ぶ Learning Video Object Segmentation from Unlabeled Videos ( http://arxiv.org/abs/2003.05020v1 ) ライセンス: Link先を確認	Xiankai Lu, Wenguan Wang, Jianbing Shen, Yu-Wing Tai, David Crandall, and Steven C. H. Hoi	(参考訳) 本稿では,ビデオオブジェクトセグメンテーション(VOS)の新たな手法を提案する。この手法は,広範囲な注釈付きデータに大きく依存する既存の手法とは異なり,未ラベルビデオからのオブジェクトパターン学習に対処する。複数の粒度で VOS 固有の特性を包括的にキャプチャする,教師なし/弱教師付き学習フレームワーク MuG を導入する。我々のアプローチは、VOSにおける視覚パターンの理解を深め、アノテーションの負担を大幅に軽減するのに役立つ。慎重に設計されたアーキテクチャと強力な表現学習能力により、学習モデルは、オブジェクトレベルのゼロショットVOS、インスタンスレベルのゼロショットVOS、ワンショットVOSなど、多様なVOS設定に適用できる。実験は、これらの設定で有望な性能を示すとともに、ラベルのないデータを利用してセグメント化精度をさらに向上させるmugの可能性を示す。 We propose a new method for video object segmentation (VOS) that addresses object pattern learning from unlabeled videos, unlike most existing methods which rely heavily on extensive annotated data. We introduce a unified unsupervised/weakly supervised learning framework, called MuG, that comprehensively captures intrinsic properties of VOS at multiple granularities. Our approach can help advance understanding of visual patterns in VOS and significantly reduce annotation burden. With a carefully-designed architecture and strong representation learning ability, our learned model can be applied to diverse VOS settings, including object-level zero-shot VOS, instance-level zero-shot VOS, and one-shot VOS. Experiments demonstrate promising performance in these settings, as well as the potential of MuG in leveraging unlabeled data to further improve the segmentation accuracy.	翻訳日:2022-12-24 21:30:35 公開日:2020-03-10
# 顔のグラフィカル認識のためのクロスモーダルマルチタスク学習 Cross-modal Multi-task Learning for Graphic Recognition of Caricature Face ( http://arxiv.org/abs/2003.05787v1 ) ライセンス: Link先を確認	Zuheng Ming, Jean-Christophe Burie, Muhammad Muzzamil Luqman	(参考訳) 写実的な視覚画像の顔認識は、近年、よく研究され、大きな進歩を遂げている。現実的な視覚画像とは異なり、似顔絵の顔認識は視覚画像のパフォーマンスからかけ離れている。これは、顔の特徴を誇張して文字を強めることによってもたらされた似顔絵の極端な非剛性歪みによるものである。似顔絵と視覚画像の不均一性から、似顔絵・視覚画像の認識はクロスモーダル問題である。本稿では,マルチタスク学習による顔画像認識を実現する手法を提案する。タスクの重みを固定した従来のマルチタスク学習よりも,タスクの重要性に応じてタスクの重みを学習するアプローチを提案する。提案した動的タスク重み付きマルチタスク学習は,従来の方法のように過度に学習しやすいタスクに留まらず,難易度と難易度を適切にトレーニングすることができる。提案する動的マルチタスク学習のクロスモーダル・カカチュアル・ビジュアル顔認識における効果を実験的に検証した。 CaVIとWebCaricatureのデータセットのパフォーマンスは、最先端のメソッドよりも優れていることを示している。 Face recognition of realistic visual images has been well studied and made a significant progress in the recent decade. Unlike the realistic visual images, the face recognition of the caricatures is far from the performance of the visual images. This is largely due to the extreme non-rigid distortions of the caricatures introduced by exaggerating the facial features to strengthen the characters. The heterogeneous modalities of the caricatures and the visual images result the caricature-visual face recognition is a cross-modal problem. In this paper, we propose a method to conduct caricature-visual face recognition via multi-task learning. Rather than the conventional multi-task learning with fixed weights of tasks, this work proposes an approach to learn the weights of tasks according to the importance of tasks. The proposed multi-task learning with dynamic tasks weights enables to appropriately train the hard task and easy task instead of being stuck in the over-training easy task as conventional methods. The experimental results demonstrate the effectiveness of the proposed dynamic multi-task learning for cross-modal caricature-visual face recognition. The performances on the datasets CaVI and WebCaricature show the superiority over the state-of-art methods.	翻訳日:2022-12-24 21:29:59 公開日:2020-03-10
# 機能重要度ランキングのためのmatlabツールボックス A Matlab Toolbox for Feature Importance Ranking ( http://arxiv.org/abs/2003.08737v1 ) ライセンス: Link先を確認	Shaode Yu, Zhicheng Zhang, Xiaokun Liang, Junjie Wu, Erlei Zhang, Wenjian Qin, and Yaoqin Xie	(参考訳) 特に、インテリジェントな診断とパーソナライズド医療のために何千もの特徴を抽出できる場合には、機能重要度ランキング(FIR)により多くの注意が払われている。多数のFIRアプローチが提案されているが、比較や実環境への応用のために統合されているものはほとんどない。本研究では,matlabツールボックスを提示し,合計30のアルゴリズムを収集した。さらに、ツールボックスを163枚の超音波画像のデータベース上で評価する。各乳房病変に対して,15個の特徴を抽出した。分類に最適な特徴のサブセットを明らかにするために、全ての特徴の組み合わせをテストし、超音波画像にアノテートされた病変の悪性度予測にリニアサポートベクターマシンを用いる。最終的に、性能比較に基づいてFIRの有効性を解析する。ツールボックスはオンライン(https://github.com/NicoYuCN/matFIR)である。今後の作業では、より多くのFIRメソッド、特徴選択メソッド、機械学習分類器が統合されます。 More attention is being paid for feature importance ranking (FIR), in particular when thousands of features can be extracted for intelligent diagnosis and personalized medicine. A large number of FIR approaches have been proposed, while few are integrated for comparison and real-life applications. In this study, a matlab toolbox is presented and a total of 30 algorithms are collected. Moreover, the toolbox is evaluated on a database of 163 ultrasound images. To each breast mass lesion, 15 features are extracted. To figure out the optimal subset of features for classification, all combinations of features are tested and linear support vector machine is used for the malignancy prediction of lesions annotated in ultrasound images. At last, the effectiveness of FIR is analyzed according to performance comparison. The toolbox is online (https://github.com/NicoYuCN/matFIR). In our future work, more FIR methods, feature selection methods and machine learning classifiers will be integrated.	翻訳日:2022-12-24 21:29:43 公開日:2020-03-10
# 競合する言語の共存について On the coexistence of competing languages ( http://arxiv.org/abs/2003.04748v1 ) ライセンス: Link先を確認	Jean-Marc Luck and Anita Mehta	(参考訳) 従来の文献では、結果が常に他の言語よりも1つの言語が支配されていることを示唆している。言語共存は現実的に観察されるため,言語競合の問題を再考し,共存の出現の方法を明らかにすることに注力する。一つの地理的領域における言語話者の人口動態の不均衡に関する第1のシナリオと、異なる地理的領域に言語嗜好が特有な空間的異質性に関連する第2のシナリオである。これらのそれぞれについて、パラダイム的状況の調査は、言語共存につながる条件の定量的理解に繋がる。また,様々なモデルパラメータの関数として,生存言語数の予測も行う。 We investigate the evolution of competing languages, a subject where much previous literature suggests that the outcome is always the domination of one language over all the others. Since coexistence of languages is observed in reality, we here revisit the question of language competition, with an emphasis on uncovering the ways in which coexistence might emerge. We find that this emergence is related to symmetry breaking, and explore two particular scenarios -- the first relating to an imbalance in the population dynamics of language speakers in a single geographical area, and the second to do with spatial heterogeneity, where language preferences are specific to different geographical regions. For each of these, the investigation of paradigmatic situations leads us to a quantitative understanding of the conditions leading to language coexistence. We also obtain predictions of the number of surviving languages as a function of various model parameters.	翻訳日:2022-12-24 21:29:26 公開日:2020-03-10
# 視覚フィードバックを用いたロボット制御のための生成モデル学習 Learning a generative model for robot control using visual feedback ( http://arxiv.org/abs/2003.04474v1 ) ライセンス: Link先を確認	Nishad Gothoskar, Miguel L\'azaro-Gredilla, Abhishek Agarwal, Yasemin Bekiroglu, Dileep George	(参考訳) ロボット制御に視覚フィードバックを取り入れた新しい定式化を提案する。我々は、アクションから、エンドエフェクタの機能のイメージ観察まで、生成モデルを定義する。モデルにおける推論により,特徴のターゲット位置に対応するロボット状態を推測することができる。これにより、ロボットの動きをガイドし、最先端のビジュアルサーボ法よりもはるかに少ないステップで特徴のターゲット位置をマッチングすることができる。本モデルのトレーニング手順により,キネマティクス,特徴構造,カメラパラメータの学習を同時に行うことができる。これは、それを観察するロボット、構造、およびカメラに関する事前情報なしで行うことができる。学習はサンプル効率よく行われ、テストデータに対して強い一般化を示す。フォーミュレーションはモジュール化されているので、カメラやオブジェクトなどのセットアップのコンポーネントを変更して、オンラインで素早く再学習することができます。本手法は,我々が操作するコントローラの観測状態とノイズのノイズを処理できる。本手法は,不正確な制御器を有するロボットに対して把持および密接な挿入を行うことにより,その効果を実証する。 We introduce a novel formulation for incorporating visual feedback in controlling robots. We define a generative model from actions to image observations of features on the end-effector. Inference in the model allows us to infer the robot state corresponding to target locations of the features. This, in turn, guides motion of the robot and allows for matching the target locations of the features in significantly fewer steps than state-of-the-art visual servoing methods. The training procedure for our model enables effective learning of the kinematics, feature structure, and camera parameters, simultaneously. This can be done with no prior information about the robot, structure, and cameras that observe it. Learning is done sample-efficiently and shows strong generalization to test data. Since our formulation is modular, we can modify components of our setup, like cameras and objects, and relearn them quickly online. Our method can handle noise in the observed state and noise in the controllers that we interact with. We demonstrate the effectiveness of our method by executing grasping and tight-fit insertions on robots with inaccurate controllers.	翻訳日:2022-12-24 21:29:15 公開日:2020-03-10
# 帯域限定環境におけるコロボティックビジョンに基づく探索のためのアクティブリワード学習 Active Reward Learning for Co-Robotic Vision Based Exploration in Bandwidth Limited Environments ( http://arxiv.org/abs/2003.05016v1 ) ライセンス: Link先を確認	Stewart Jamieson, Jonathan P. How, Yogesh Girdhar	(参考訳) 我々は,人間の操作者との通信能力に制限があるため,新たな科学的関連画像の収集場所を自律的に決定しなければならないロボットのための新しいPOMDP問題定式化を提案する。この定式化から,このようなロボットの観察モデル,報酬モデル,コミュニケーション戦略に対する制約と設計原則を導出し,非常に高次元の観察空間と関連する訓練データの不足に対処する手法を探求する。提案手法は,ロボットがオンラインの「レグレット」を最小化するためのクエリ作成に基づく,新たな能動的報酬学習戦略を導入し,シミュレーションによる自律的な視覚探索の適性を評価する。帯域制限のある環境では、この新たな後悔に基づく基準により、ロボット探検家は次の最高の基準よりも、1ミッションあたり最大17%の報酬を集めることができる。 We present a novel POMDP problem formulation for a robot that must autonomously decide where to go to collect new and scientifically relevant images given a limited ability to communicate with its human operator. From this formulation we derive constraints and design principles for the observation model, reward model, and communication strategy of such a robot, exploring techniques to deal with the very high-dimensional observation space and scarcity of relevant training data. We introduce a novel active reward learning strategy based on making queries to help the robot minimize path "regret" online, and evaluate it for suitability in autonomous visual exploration through simulations. We demonstrate that, in some bandwidth-limited environments, this novel regret-based criterion enables the robotic explorer to collect up to 17% more reward per mission than the next-best criterion.	翻訳日:2022-12-24 21:29:02 公開日:2020-03-10
# レーティングに基づくハイブリッドベイズネットワークによるアジアハンディキャップサッカーの賭け Asian Handicap football betting with Rating-based Hybrid Bayesian Networks ( http://arxiv.org/abs/2003.09384v1 ) ライセンス: Link先を確認	Anthony Constantinou	(参考訳) アジア・ハンディキャップ(ah)のサッカー賭け市場は盛んに人気があるが、関連する文献では十分に研究されていない。本稿では,レーティングシステムとハイブリッドベイズネットワークを組み合わせることで,AH賭け市場の予測と評価に特化して開発された最初のモデルを示す。結果はイングランドのプレミアリーグ13シーズンに基づいており、従来の1x2市場と比較される。異なる賭け状況が調べられました a) 平均値と最大値の両方の市場確率 b) 予測された確率と公表された確率の間の決定しきい値 c)再投資と利益の両面での最適化 d) 1x2 及び ah 市場で同等利益を目標にする場合におけるリターンのばらつきがどう変化するかを調査するための簡単なステークス調整。 ah市場は従来の1x2市場の非効率性を共有しているが、興味深い違いと両者の類似性が示されている。 Despite the massive popularity of the Asian Handicap (AH) football betting market, it has not been adequately studied by the relevant literature. This paper combines rating systems with hybrid Bayesian networks and presents the first published model specifically developed for prediction and assessment of the AH betting market. The results are based on 13 English Premier League seasons and are compared to the traditional 1X2 market. Different betting situations have been examined including a) both average and maximum (best available) market odds, b) all possible betting decision thresholds between predicted and published odds, c) optimisations for both return-on-investment and profit, and d) simple stake adjustments to investigate how the variance of returns changes when targeting equivalent profit in both 1X2 and AH markets. While the AH market is found to share the inefficiencies of the traditional 1X2 market, the findings reveal both interesting differences as well as similarities between the two.	翻訳日:2022-12-24 21:28:46 公開日:2020-03-10
# Nested Reduced-Rank Regularizationによる多変量機能回帰 Multivariate Functional Regression via Nested Reduced-Rank Regularization ( http://arxiv.org/abs/2003.04786v1 ) ライセンス: Link先を確認	Xiaokang Liu, Shujie Ma, Kun Chen	(参考訳) 本稿では,多変量関数応答と予測器を備えた回帰モデルにネステッド・レグレッション (nrrr) アプローチを適用し,調整された次元縮小を達成し,結果の関数モデルの解釈/可視化を容易にする。提案手法は,機能回帰面に課される2段階の低ランク構造に基づく。グローバルな低ランク構造は、下層の回帰関係を駆動する潜在主機能応答と予測器の小さなセットを特定する。局所的な低ランク構造は、主機能応答と予測器の関係の複雑さと滑らかさを制御する。基底展開アプローチにより、関数問題は興味深い統合行列近似タスクへと導かれる。そこでは、統合された低ランク行列のブロックまたはサブマトリクスが共通の行空間と/または列空間を共有する。収束保証付き反復アルゴリズムを開発した。我々は,nrrrの一貫性を確立し,非漸近的解析により,低ランク回帰のそれと少なくとも同等の誤差率が得られることを示す。シミュレーション研究はNRRRの有効性を示す。我々は,nrrrを電力需要問題に適用し,1日あたりの電力消費の軌跡と1日あたりの気温の関係を明らかにした。 We propose a nested reduced-rank regression (NRRR) approach in fitting regression model with multivariate functional responses and predictors, to achieve tailored dimension reduction and facilitate interpretation/visualization of the resulting functional model. Our approach is based on a two-level low-rank structure imposed on the functional regression surfaces. A global low-rank structure identifies a small set of latent principal functional responses and predictors that drives the underlying regression association. A local low-rank structure then controls the complexity and smoothness of the association between the principal functional responses and predictors. Through a basis expansion approach, the functional problem boils down to an interesting integrated matrix approximation task, where the blocks or submatrices of an integrated low-rank matrix share some common row space and/or column space. An iterative algorithm with convergence guarantee is developed. We establish the consistency of NRRR and also show through non-asymptotic analysis that it can achieve at least a comparable error rate to that of the reduced-rank regression. Simulation studies demonstrate the effectiveness of NRRR. We apply NRRR in an electricity demand problem, to relate the trajectories of the daily electricity consumption with those of the daily temperatures.	翻訳日:2022-12-24 21:28:32 公開日:2020-03-10
# Deep Blindビデオの超高解像度化 Deep Blind Video Super-resolution ( http://arxiv.org/abs/2003.04716v1 ) ライセンス: Link先を確認	Jinshan Pan, Songsheng Cheng, Jiawei Zhang, Jinhui Tang	(参考訳) 既存のビデオ超解像(SR)アルゴリズムは通常、劣化過程におけるぼやけたカーネルが知られており、復元過程におけるぼやけたカーネルをモデル化していないと仮定する。しかし、この仮定はビデオSRには当てはまらないため、通常は過度に滑らかな超解像につながる。本稿では,ぼかしカーネルモデリング手法を用いてビデオsrを解くための深層畳み込みニューラルネットワーク(cnn)モデルを提案する。提案するディープcnnモデルは, 動きのぼかし推定, 動き推定, 潜在画像復元モジュールで構成される。モーションボケ推定モジュールは、信頼できるボケカーネルを提供するために使用される。推定したぼやけたカーネルを用いて,ビデオSRの画像形成モデルに基づく画像デコンボリューション手法を開発し,画像内容の鮮明な復元を可能にする。しかし、生成された中間潜伏画像にはアーティファクトが含まれている可能性がある。高品質な画像を生成するために,移動推定モジュールを用いて隣接するフレームから情報を探索する。提案アルゴリズムは,より微細な構造情報でより鮮明な画像を生成することができることを示す。実験結果から,提案アルゴリズムは最先端手法に対して好適に動作することが示された。 Existing video super-resolution (SR) algorithms usually assume that the blur kernels in the degradation process are known and do not model the blur kernels in the restoration. However, this assumption does not hold for video SR and usually leads to over-smoothed super-resolved images. In this paper, we propose a deep convolutional neural network (CNN) model to solve video SR by a blur kernel modeling approach. The proposed deep CNN model consists of motion blur estimation, motion estimation, and latent image restoration modules. The motion blur estimation module is used to provide reliable blur kernels. With the estimated blur kernel, we develop an image deconvolution method based on the image formation model of video SR to generate intermediate latent images so that some sharp image contents can be restored well. However, the generated intermediate latent images may contain artifacts. To generate high-quality images, we use the motion estimation module to explore the information from adjacent frames, where the motion estimation can constrain the deep CNN model for better image restoration. We show that the proposed algorithm is able to generate clearer images with finer structural details. Extensive experimental results show that the proposed algorithm performs favorably against state-of-the-art methods.	翻訳日:2022-12-24 21:22:18 公開日:2020-03-10
# 雨のスクリーン:屋内で雨のデータセットを収集 Rainy screens: Collecting rainy datasets, indoors ( http://arxiv.org/abs/2003.04742v1 ) ライセンス: Link先を確認	Horia Porav, Valentina-Nicoleta Musat, Tom Bruls, Paul Newman	(参考訳) 適切な地上の真理の保証や、所望の気象条件との同期が難しいため、ロボット工学における不都合な状況を伴うデータの取得は厄介な作業である。本稿では,既存のクリア・グラウンド・ルース・イメージから多彩な雨画像を生成するための高精細なスクリーンを簡易に記録する手法を提案する。このセットアップにより、セマンティクスセグメンテーションやオブジェクト位置など、補助的なタスク基底データによる既存のデータセットの多様性を活用できます。都市景観とBDDに基づく降雨量と降雨量と実際の付着液滴を用いた降雨画像を生成し,デライニングモデルを訓練する。本稿では,画像再構成とセマンティックセグメンテーションの定量的な結果と,サンプル外領域の定性的な結果を示す。 Acquisition of data with adverse conditions in robotics is a cumbersome task due to the difficulty in guaranteeing proper ground truth and synchronising with desired weather conditions. In this paper, we present a simple method - recording a high resolution screen - for generating diverse rainy images from existing clear ground-truth images that is domain- and source-agnostic, simple and scales up. This setup allows us to leverage the diversity of existing datasets with auxiliary task ground-truth data, such as semantic segmentation, object positions etc. We generate rainy images with real adherent droplets and rain streaks based on Cityscapes and BDD, and train a de-raining model. We present quantitative results for image reconstruction and semantic segmentation, and qualitative results for an out-of-sample domain, showing that models trained with our data generalize well.	翻訳日:2022-12-24 21:21:59 公開日:2020-03-10
# 3次元LiDARデータを用いたオフロード乾燥領域抽出 Off-Road Drivable Area Extraction Using 3D LiDAR Data ( http://arxiv.org/abs/2003.04780v1 ) ライセンス: Link先を確認	Biao Gao, Anran Xu, Yancheng Pan, Xijun Zhao, Wen Yao, Huijing Zhao	(参考訳) 本研究では,3次元lidarデータを用いたオフロード自由領域抽出手法を提案する。特定のディープラーニングフレームワークは、オフロード環境における大きな課題の1つである曖昧な領域に対処するように設計されている。ネットワークトレーニングのための人手によるアノテートデータの需要を大幅に減らすため,大量の車両経路や自動生成障害物ラベルからの情報を利用する。これらの自動生成アノテーションを使用することで、提案されたネットワークは弱い教師付きまたは半教師付きメソッドを使ってトレーニングすることができる。このデータセットの実験は、我々のフレームワークの理性と弱く半教師ありの手法の有効性を示すものである。 We propose a method for off-road drivable area extraction using 3D LiDAR data with the goal of autonomous driving application. A specific deep learning framework is designed to deal with the ambiguous area, which is one of the main challenges in the off-road environment. To reduce the considerable demand for human-annotated data for network training, we utilize the information from vast quantities of vehicle paths and auto-generated obstacle labels. Using these autogenerated annotations, the proposed network can be trained using weakly supervised or semi-supervised methods, which can achieve better performance with fewer human annotations. The experiments on our dataset illustrate the reasonability of our framework and the validity of our weakly and semi-supervised methods.	翻訳日:2022-12-24 21:21:42 公開日:2020-03-10
# SAD:敵対的事例に対する衛生ベースの防衛 SAD: Saliency-based Defenses Against Adversarial Examples ( http://arxiv.org/abs/2003.04820v1 ) ライセンス: Link先を確認	Richard Tran, David Patrick, Michael Geyer, Amanda Fernandez	(参考訳) 機械学習モデルやディープラーニングモデルの人気が高まり、悪意のある入力に対する脆弱性への注目が高まっている。これらの逆の例では、ネットワークの本来の意図からモデル予測を逸脱させ、実践的セキュリティの懸念が高まっている。これらの攻撃に対抗するために、ニューラルネットワークは従来の画像処理アプローチや最先端の防御モデルを利用して、データの摂動を減らすことができる。ノイズ低減にグローバルアプローチを採用する防御アプローチは、敵の攻撃に対して有効であるが、その損失アプローチはしばしば画像内の重要なデータを歪ませる。本研究では, 対人攻撃の影響を受けやすいクリーニングデータに対する視覚的サリエンシに基づくアプローチを提案する。本モデルでは, 対象画像内の損失を相対的に低減しつつ, 対象画像のサルエント領域を有効活用する。攻撃前, 攻撃前, 清掃後において, 最先端の衛生手法の有効性を評価することにより, モデルの精度を評価する。提案手法は,2つのサリエンシーデータセットにまたがって,関連する防御手法や確立された敵対的攻撃手法と比較し,有効性を示す。対象としたアプローチでは,従来のアプローチと最先端のアプローチと比較して,標準統計値と距離塩分値の指標が大幅に改善されている。 With the rise in popularity of machine and deep learning models, there is an increased focus on their vulnerability to malicious inputs. These adversarial examples drift model predictions away from the original intent of the network and are a growing concern in practical security. In order to combat these attacks, neural networks can leverage traditional image processing approaches or state-of-the-art defensive models to reduce perturbations in the data. Defensive approaches that take a global approach to noise reduction are effective against adversarial attacks, however their lossy approach often distorts important data within the image. In this work, we propose a visual saliency based approach to cleaning data affected by an adversarial attack. Our model leverages the salient regions of an adversarial image in order to provide a targeted countermeasure while comparatively reducing loss within the cleaned images. We measure the accuracy of our model by evaluating the effectiveness of state-of-the-art saliency methods prior to attack, under attack, and after application of cleaning methods. We demonstrate the effectiveness of our proposed approach in comparison with related defenses and against established adversarial attack methods, across two saliency datasets. Our targeted approach shows significant improvements in a range of standard statistical and distance saliency metrics, in comparison with both traditional and state-of-the-art approaches.	翻訳日:2022-12-24 21:21:31 公開日:2020-03-10
# PANDA:ギガピクセルレベルの人間中心のビデオデータセット PANDA: A Gigapixel-level Human-centric Video Dataset ( http://arxiv.org/abs/2003.04852v1 ) ライセンス: Link先を確認	Xueyang Wang, Xiya Zhang, Yinheng Zhu, Yuchen Guo, Xiaoyun Yuan, Liuyu Xiang, Zerun Wang, Guiguang Ding, David J Brady, Qionghai Dai, Lu Fang	(参考訳) 大規模・長期・多目的視覚分析のための,最初のギガPixelレベルのフガン中心のViDeo dAtasetであるPANDAを提示する。 PANDAのビデオはギガピクセルカメラで撮影され、広視野(約1km)と高解像度(〜ギガピクセルレベル/フレーム)の両方で現実世界のシーンをカバーしている。シーンは、100倍以上のスケールの4Kヘッド数を含むことができる。 PANDAは15,974.6kのバウンディングボックス、111.8kの微粒な属性ラベル、12.7kの軌道、2.2kのグループ、2.9kの相互作用を含む、リッチで階層的な基底構造アノテーションを提供する。人間の検出と追跡のタスクをベンチマークします。歩行者のポーズ, スケール, 閉塞, 軌道の多様さから, 既存のアプローチは精度と効率の両面から挑戦されている。広帯域FoVと高解像度のPANDAの特異性を考えると,対話型グループ検出の新たな課題が紹介される。我々は,グローバルトラジェクタと局所的な相互作用を同時にエンコードし,有望な結果をもたらす「グローバルからローカルへのズームイン」フレームワークを設計する。我々はPANDAが、大規模な現実世界のシーンにおける人間の行動や相互作用を理解することによって、人工知能と実践学のコミュニティに貢献すると考えている。 panda webサイト: http://www.panda-dataset.com We present PANDA, the first gigaPixel-level humAN-centric viDeo dAtaset, for large-scale, long-term, and multi-object visual analysis. The videos in PANDA were captured by a gigapixel camera and cover real-world scenes with both wide field-of-view (~1 square kilometer area) and high-resolution details (~gigapixel-level/frame). The scenes may contain 4k head counts with over 100x scale variation. PANDA provides enriched and hierarchical ground-truth annotations, including 15,974.6k bounding boxes, 111.8k fine-grained attribute labels, 12.7k trajectories, 2.2k groups and 2.9k interactions. We benchmark the human detection and tracking tasks. Due to the vast variance of pedestrian pose, scale, occlusion and trajectory, existing approaches are challenged by both accuracy and efficiency. Given the uniqueness of PANDA with both wide FoV and high resolution, a new task of interaction-aware group detection is introduced. We design a 'global-to-local zoom-in' framework, where global trajectories and local interactions are simultaneously encoded, yielding promising results. We believe PANDA will contribute to the community of artificial intelligence and praxeology by understanding human behaviors and interactions in large-scale real-world scenes. PANDA Website: http://www.panda-dataset.com.	翻訳日:2022-12-24 21:20:48 公開日:2020-03-10
# 乳がん診断のための深層学習アプローチ Deep learning approach for breast cancer diagnosis ( http://arxiv.org/abs/2003.04480v1 ) ライセンス: Link先を確認	Essam A. Rashed and M. Samir Abou El Seoud	(参考訳) 乳がんは、早期発見時に高いリスクコントロールを持つ世界でも有数の致命的な疾患の一つである。乳房検診の従来の方法はX線マンモグラフィーであり,早期発見が難しいことが知られている。画像の圧縮による乳房の高密度構造は, 微小な異常を認識するのが困難であった。また,乳房組織の異種間および異種間は,手作りの特徴を用いた高い診断精度を達成することが極めて困難である。ディープラーニングは、比較的高い計算能力を必要とする、新しい機械学習技術である。しかし、それは人間の知能のレベルで意思決定を必要とするいくつかの難しいタスクにおいて非常に効果的であることが判明した。本稿では,乳がんを効果的かつ早期に検出できるU-net構造に着想を得た新しいネットワークアーキテクチャを開発する。その結果,臨床応用において提案手法の有用性を示す感度と特異性が高いことが示唆された。 Breast cancer is one of the leading fatal disease worldwide with high risk control if early discovered. Conventional method for breast screening is x-ray mammography, which is known to be challenging for early detection of cancer lesions. The dense breast structure produced due to the compression process during imaging lead to difficulties to recognize small size abnormalities. Also, inter- and intra-variations of breast tissues lead to significant difficulties to achieve high diagnosis accuracy using hand-crafted features. Deep learning is an emerging machine learning technology that requires a relatively high computation power. Yet, it proved to be very effective in several difficult tasks that requires decision making at the level of human intelligence. In this paper, we develop a new network architecture inspired by the U-net structure that can be used for effective and early detection of breast cancer. Results indicate a high rate of sensitivity and specificity that indicate potential usefulness of the proposed approach in clinical use.	翻訳日:2022-12-24 21:14:25 公開日:2020-03-10
# PBRnet: 物体位置決め精度を向上させるためのピラミッド境界ボックスリファインメント PBRnet: Pyramidal Bounding Box Refinement to Improve Object Localization Accuracy ( http://arxiv.org/abs/2003.04541v1 ) ライセンス: Link先を確認	Li Xiao, Yufan Luo, Chunlong Luo, Lianhe Zhao, Quanshui Fu, Guoqing Yang, Anpeng Huang, Yi Zhao	(参考訳) 近年,粗粒から微粒までの提案を分類・回帰する段階を数段階含む粗粒度フレームワークに着目した物体検出装置が多数開発され,より高精度な検出が徐々に実現されている。特徴ピラミッドネットワーク(FPN)のようなマルチレゾリューションモデルは、異なる解像度の情報を統合し、性能を効果的に改善する。以前の研究でも、ローカライゼーションをさらに改善できることが判明している。 1) より翻訳的な変種であるきめ細かい情報を使用すること 2)地域境界情報により焦点を絞った地域を精錬する。これらの原理に基づき、我々は、粗粒度フレームワークとピラミッド境界箱微細化ネットワーク(PBRnet)という特徴ピラミッド構造を組み合わせることにより、局所化精度を向上させるために、新しい境界改善アーキテクチャを設計した。大規模な実験はMS-COCOデータセット上で行われる。 PBRnetは、FPNやLibra R-CNNに追加すると、約3ポイントのmAP$が大幅に向上する。さらに、カスケードR-CNNを粗大な検出器として扱い、PBRnetの回帰器によってローカライゼーションブランチを置き換えることで、1.5$mAP$の余分な性能向上を実現し、最大5ポイントのmAP$まで性能が向上する。 Many recently developed object detectors focused on coarse-to-fine framework which contains several stages that classify and regress proposals from coarse-grain to fine-grain, and obtains more accurate detection gradually. Multi-resolution models such as Feature Pyramid Network(FPN) integrate information of different levels of resolution and effectively improve the performance. Previous researches also have revealed that localization can be further improved by: 1) using fine-grained information which is more translational variant; 2) refining local areas which is more focused on local boundary information. Based on these principles, we designed a novel boundary refinement architecture to improve localization accuracy by combining coarse-to-fine framework with feature pyramid structure, named as Pyramidal Bounding Box Refinement network(PBRnet), which parameterizes gradually focused boundary areas of objects and leverages lower-level feature maps to extract finer local information when refining the predicted bounding boxes. Extensive experiments are performed on the MS-COCO dataset. The PBRnet brings a significant performance gains by roughly 3 point of $mAP$ when added to FPN or Libra R-CNN. Moreover, by treating Cascade R-CNN as a coarse-to-fine detector and replacing its localization branch by the regressor of PBRnet, it leads an extra performance improvement by 1.5 $mAP$, yielding a total performance boosting by as high as 5 point of $mAP$.	翻訳日:2022-12-24 21:13:48 公開日:2020-03-10
# ハイパースペクトル画像復調のための3次元準リカレントニューラルネットワーク 3D Quasi-Recurrent Neural Network for Hyperspectral Image Denoising ( http://arxiv.org/abs/2003.04547v1 ) ライセンス: Link先を確認	Kaixuan Wei, Ying Fu, Hua Huang	(参考訳) 本稿では,ハイパースペクトル画像(hsi)デノイジングのための交互方向3次元準リカレントニューラルネットワークを提案し,スペクトルに沿った領域知識 -- 構造空間スペクトル相関と大域相関を効果的に組み込む。具体的には、3次元畳み込みを用いてHSIの構造空間-スペクトル相関を抽出し、準再帰プール関数を用いてスペクトルに沿った大域的相関を捉える。さらに,計算コストを増すことなく因果依存性を排除するために,方向の交互構造を導入する。提案モデルは、任意のバンド数でHSIに対する柔軟性を保ちながら、スペクトル依存性をモデル化することができる。 HSI復調に関する大規模な実験は、復元精度と計算時間の両方の観点から、様々な騒音条件下での最先端技術よりも大幅に改善されている。私たちのコードはhttps://github.com/vandermode/qrnn3dで利用可能です。 In this paper, we propose an alternating directional 3D quasi-recurrent neural network for hyperspectral image (HSI) denoising, which can effectively embed the domain knowledge -- structural spatio-spectral correlation and global correlation along spectrum. Specifically, 3D convolution is utilized to extract structural spatio-spectral correlation in an HSI, while a quasi-recurrent pooling function is employed to capture the global correlation along spectrum. Moreover, alternating directional structure is introduced to eliminate the causal dependency with no additional computation cost. The proposed model is capable of modeling spatio-spectral dependency while preserving the flexibility towards HSIs with arbitrary number of bands. Extensive experiments on HSI denoising demonstrate significant improvement over state-of-the-arts under various noise settings, in terms of both restoration accuracy and computation time. Our code is available at https://github.com/Vandermode/QRNN3D.	翻訳日:2022-12-24 21:12:51 公開日:2020-03-10
# クラスごとの注釈付き1点のみに基づく複合運転シーンにおける画素レベルセマンティック学習の実現 Realizing Pixel-Level Semantic Learning in Complex Driving Scenes based on Only One Annotated Pixel per Class ( http://arxiv.org/abs/2003.04671v1 ) ライセンス: Link先を確認	Xi Li, Huimin Ma, Sheng Yi, Yanxian Chen	(参考訳) 弱教師付き条件に基づくセマンティックセグメンテーションタスクは、軽量なラベリングプロセスを実現するために進められている。いくつかのカテゴリのみを含む単純な画像の場合、画像レベルのアノテーションに基づく研究は許容できる性能を達成した。しかし、複雑な場面に直面すると、画像には大量のクラスが含まれているため、画像タグに基づいて視覚的な外観を学ぶことが困難になる。この場合、画像レベルのアノテーションは情報提供に有効ではない。そこで,各カテゴリに1つのアノテートされた画素のみを割り当てるタスクを新たに設定した。より軽量で情報的な条件に基づいて、擬似ラベル生成のための3段階のプロセスが構築され、各カテゴリの最適な特徴表現、画像推論、コンテキストロケーションに基づくリファインメントを段階的に実装する。特に,高レベルセマンティクスと低レベルイメージング機能は,運転場面の各クラスで異なる識別能力を有するため,各カテゴリを「オブジェクト」または「シーン」に分け,その2つのタイプの異なる操作を提供する。さらに、cnnベースの画像間共通意味学習と画像内修正処理を組み合わせたセグメンテーション性能を徐々に向上させるために、交互に反復構造が確立される。 Cityscapesデータセットの実験では、複雑な運転シーン下で弱教師付きセマンティックセマンティックセグメンテーションタスクを解決するための提案手法が実現可能であることが示された。 Semantic segmentation tasks based on weakly supervised condition have been put forward to achieve a lightweight labeling process. For simple images that only include a few categories, researches based on image-level annotations have achieved acceptable performance. However, when facing complex scenes, since image contains a large amount of classes, it becomes difficult to learn visual appearance based on image tags. In this case, image-level annotations are not effective in providing information. Therefore, we set up a new task in which only one annotated pixel is provided for each category. Based on the more lightweight and informative condition, a three step process is built for pseudo labels generation, which progressively implement optimal feature representation for each category, image inference and context-location based refinement. In particular, since high-level semantics and low-level imaging feature have different discriminative ability for each class under driving scenes, we divide each category into "object" or "scene" and then provide different operations for the two types separately. Further, an alternate iterative structure is established to gradually improve segmentation performance, which combines CNN-based inter-image common semantic learning and imaging prior based intra-image modification process. Experiments on Cityscapes dataset demonstrate that the proposed method provides a feasible way to solve weakly supervised semantic segmentation task under complex driving scenes.	翻訳日:2022-12-24 21:10:59 公開日:2020-03-10
# HeatNet: 熱画像を用いたセマンティックセグメンテーションにおける日中ドメインギャップのブリッジ HeatNet: Bridging the Day-Night Domain Gap in Semantic Segmentation with Thermal Images ( http://arxiv.org/abs/2003.04645v1 ) ライセンス: Link先を確認	Johan Vertens, Jannik Z\"urn, Wolfram Burgard	(参考訳) 学習に基づくセマンティックセグメンテーション手法の大部分は、昼間のシナリオや好ましい照明条件に最適化されている。しかし、現実の運転シナリオは、既存のアプローチの課題である夜間照明やグレアのような有害な環境条件を伴っている。本研究では,日中と夜間に適用可能なマルチモーダル意味セグメンテーションモデルを提案する。この目的のために、RGB画像以外にも、熱画像を活用し、ネットワークをはるかに堅牢にする。我々は、既存の昼間RGBデータセットを活用して、夜間画像の高価なアノテーションを避けるとともに、夜間領域にデータセットの知識を伝達する教師学習アプローチを提案する。さらに,学習した特徴空間の整列化にドメイン適応法を適用し,新しい二段階学習法を提案する。さらに, 自動走行用サーマルデータが不足しているため, 時間同期とRGB熱画像ペアの整列が2万を超える新しいデータセットを提案する。そこで,本研究では,ロバストなサーマルカメラキャリブレーションを実現するための新しいターゲットレスキャリブレーション手法を提案する。中でも,夜間セマンティックセグメンテーションの最先端結果を示すために,我々の新しいデータセットを用いた。 The majority of learning-based semantic segmentation methods are optimized for daytime scenarios and favorable lighting conditions. Real-world driving scenarios, however, entail adverse environmental conditions such as nighttime illumination or glare which remain a challenge for existing approaches. In this work, we propose a multimodal semantic segmentation model that can be applied during daytime and nighttime. To this end, besides RGB images, we leverage thermal images, making our network significantly more robust. We avoid the expensive annotation of nighttime images by leveraging an existing daytime RGB-dataset and propose a teacher-student training approach that transfers the dataset's knowledge to the nighttime domain. We further employ a domain adaptation method to align the learned feature spaces across the domains and propose a novel two-stage training scheme. Furthermore, due to a lack of thermal data for autonomous driving, we present a new dataset comprising over 20,000 time-synchronized and aligned RGB-thermal image pairs. In this context, we also present a novel target-less calibration method that allows for automatic robust extrinsic and intrinsic thermal camera calibration. Among others, we employ our new dataset to show state-of-the-art results for nighttime semantic segmentation.	翻訳日:2022-12-24 21:04:56 公開日:2020-03-10
# 手術用ジェスチャ認識と進捗予測のためのマルチタスクリカレントニューラルネットワーク Multi-Task Recurrent Neural Network for Surgical Gesture Recognition and Progress Prediction ( http://arxiv.org/abs/2003.04772v1 ) ライセンス: Link先を確認	Beatrice van Amsterdam, Matthew J. Clarkson, Danail Stoyanov	(参考訳) 手術用ジェスチャー認識は手術用データサイエンスおよびコンピュータ支援介入において重要である。ロボティックキネマティックな情報であっても、手術手順を自動的に分割することは、手術のデモがスタイル、持続時間、行動の順序において高い変動性によって特徴づけられるため、多くの課題を生じさせる。運動信号から識別的特徴を抽出し,認識精度を高めるために,手術動作の同時認識と手術進行の新たな定式化を行うマルチタスクリカレントニューラルネットワークを提案する。提案手法の有効性を示すため,ロボットキネマティックデータを用いた外科的ジェスチャー認識用データセットとして現在唯一公開されているJIGSAWSデータセットについて,その適用性を評価する。マルチタスクフレームワークでは,手動ラベリングやトレーニングを伴わずに,進捗推定による認識性能が向上することが実証された。 Surgical gesture recognition is important for surgical data science and computer-aided intervention. Even with robotic kinematic information, automatically segmenting surgical steps presents numerous challenges because surgical demonstrations are characterized by high variability in style, duration and order of actions. In order to extract discriminative features from the kinematic signals and boost recognition accuracy, we propose a multi-task recurrent neural network for simultaneous recognition of surgical gestures and estimation of a novel formulation of surgical task progress. To show the effectiveness of the presented approach, we evaluate its application on the JIGSAWS dataset, that is currently the only publicly available dataset for surgical gesture recognition featuring robot kinematic data. We demonstrate that recognition performance improves in multi-task frameworks with progress estimation without any additional manual labelling and training.	翻訳日:2022-12-24 21:04:08 公開日:2020-03-10
# 形状変形のための小型分光ディスクリプタ A Compact Spectral Descriptor for Shape Deformations ( http://arxiv.org/abs/2003.08758v1 ) ライセンス: Link先を確認	Skylar Sible, Rodrigo Iza-Teran, Jochen Garcke, Nikola Aulig, Patricia Wollstadt	(参考訳) 工学領域における現代の製品設計は、有限要素に基づくシミュレーション、計算最適化、機械学習のような現代的なデータ分析技術を含む計算分析によってますます推進されている。これらの手法を適用するには、開発中のコンポーネントや関連する設計基準に適したデータ表現が必要となる。コンポーネントの幾何学は一般にポリゴン表面メッシュで表されるが、効率的な計算解析を実現するために重要な設計特性をどのようにパラメトリズするかは明確ではない。本研究では,自動車の衝突挙動を最適化する場合など,多くの応用分野において重要な設計基準となる応力下での部品の塑性変形挙動をパラメータ化するための新しい手法を提案する。既存のパラメータ化は計算解析を比較的単純な変形に制限し、一般に専門家による広範な入力を必要とし、設計プロセスは集中的でコストがかかる。そこで本研究では, スペクトルメッシュ処理に基づく変形挙動のコンパクトな記述子を導出し, 同様に複雑な変形の低次元表現を可能にする手法を提案する。提案するディスクリプタは, 幾何学的変形挙動のパラメトリゼーションに対する新しいアプローチを提供し, 塑性変形挙動に関する工学的課題に対する機械学習などの最先端データ解析技術の利用を可能にする。 Modern product design in the engineering domain is increasingly driven by computational analysis including finite-element based simulation, computational optimization, and modern data analysis techniques such as machine learning. To apply these methods, suitable data representations for components under development as well as for related design criteria have to be found. While a component's geometry is typically represented by a polygon surface mesh, it is often not clear how to parametrize critical design properties in order to enable efficient computational analysis. In the present work, we propose a novel methodology to obtain a parameterization of a component's plastic deformation behavior under stress, which is an important design criterion in many application domains, for example, when optimizing the crash behavior in the automotive context. Existing parameterizations limit computational analysis to relatively simple deformations and typically require extensive input by an expert, making the design process time intensive and costly. Hence, we propose a way to derive a compact descriptor of deformation behavior that is based on spectral mesh processing and enables a low-dimensional representation of also complex deformations.We demonstrate the descriptor's ability to represent relevant deformation behavior by applying it in a nearest-neighbor search to identify similar simulation results in a filtering task. The proposed descriptor provides a novel approach to the parametrization of geometric deformation behavior and enables the use of state-of-the-art data analysis techniques such as machine learning to engineering tasks concerned with plastic deformation behavior.	翻訳日:2022-12-24 21:03:24 公開日:2020-03-10
# 機械読解ゴールド標準の評価フレームワーク A Framework for Evaluation of Machine Reading Comprehension Gold Standards ( http://arxiv.org/abs/2003.04642v1 ) ライセンス: Link先を確認	Viktor Schlegel, Marco Valentino, Andr\'e Freitas, Goran Nenadic, Riza Batista-Navarro	(参考訳) 機械読解(英語: Machine Reading Comprehension、MRC)とは、1段落の文章で質問に答える作業である。ニューラルMCCシステムは人気を博し、顕著な性能を達成する一方で、それらの性能を確立するために使用される方法論、特にそれらの評価に使用される金の標準のデータ設計に関して問題が提起されている。このデータに存在する課題について、限られた理解しかできないため、比較を引いて信頼できる仮説を定式化することは困難である。本稿では,この問題を解消するための第一歩として,現在の言語的特徴,必要な推論と背景知識,事実的正確性,そして語彙的手がかりの存在を,理解要件の下限として体系的に検討するための統一的枠組みを提案する。本稿では,第1の定性的なアノテーションスキーマと後者の近似指標のセットを提案する。本フレームワークの第一の応用として, 現代のMRCゴールド標準を分析し, 語彙的曖昧性に寄与する特徴の欠如, 期待する回答の様々な事実的正しさ, 語彙的手がかりの存在などについて述べる。 Machine Reading Comprehension (MRC) is the task of answering a question over a paragraph of text. While neural MRC systems gain popularity and achieve noticeable performance, issues are being raised with the methodology used to establish their performance, particularly concerning the data design of gold standards that are used to evaluate them. There is but a limited understanding of the challenges present in this data, which makes it hard to draw comparisons and formulate reliable hypotheses. As a first step towards alleviating the problem, this paper proposes a unifying framework to systematically investigate the present linguistic features, required reasoning and background knowledge and factual correctness on one hand, and the presence of lexical cues as a lower bound for the requirement of understanding on the other hand. We propose a qualitative annotation schema for the first and a set of approximative metrics for the latter. In a first application of the framework, we analyse modern MRC gold standards and present our findings: the absence of features that contribute towards lexical ambiguity, the varying factual correctness of the expected answers and the presence of lexical cues, all of which potentially lower the reading comprehension complexity and quality of the evaluation data.	翻訳日:2022-12-24 21:02:40 公開日:2020-03-10
# デュアルセンスエンコーダを用いた効率的なインテント検出 Efficient Intent Detection with Dual Sentence Encoders ( http://arxiv.org/abs/2003.04807v1 ) ライセンス: Link先を確認	I\~nigo Casanueva, Tadas Tem\v{c}inas, Daniela Gerz, Matthew Henderson, Ivan Vuli\'c	(参考訳) 新しいドメインと追加機能で会話システムを構築するには、低データ状態下で動くリソース効率のモデルが必要となる。これらの要件により、USEやConveRTのような事前訓練された二重文エンコーダによるインテント検出手法を導入する。提案するインテント検出器の有用性と幅広い適用性を示す。 1 目的検出装置は、完全なBERTラージモデルを微調整し、又は三種類の目的検出データセットの固定ブラックボックスエンコーダとしてBERTを使用する。 2 利得は、特に少額の設定で発音される(すなわち、意図ごとの注記例が10又は30件のみである)。 3)我々の意図検出器は,1つのcpu上で数分で訓練することができる。 4) 異なるハイパーパラメータ設定で安定している。意図検出に焦点をあてた研究の促進と民主化を期待し、コードをリリースし、77以上のインテントに注釈付き例を含む、新たな挑戦的な1ドメインインテント検出データセットをリリースします。 Building conversational systems in new domains and with added functionality requires resource-efficient models that work under low-data regimes (i.e., in few-shot setups). Motivated by these requirements, we introduce intent detection methods backed by pretrained dual sentence encoders such as USE and ConveRT. We demonstrate the usefulness and wide applicability of the proposed intent detectors, showing that: 1) they outperform intent detectors based on fine-tuning the full BERT-Large model or using BERT as a fixed black-box encoder on three diverse intent detection data sets; 2) the gains are especially pronounced in few-shot setups (i.e., with only 10 or 30 annotated examples per intent); 3) our intent detectors can be trained in a matter of minutes on a single CPU; and 4) they are stable across different hyperparameter settings. In hope of facilitating and democratizing research focused on intention detection, we release our code, as well as a new challenging single-domain intent detection dataset comprising 13,083 annotated examples over 77 intents.	翻訳日:2022-12-24 21:02:20 公開日:2020-03-10
# multi-simlex:多言語・言語間意味類似性の大規模評価 Multi-SimLex: A Large-Scale Evaluation of Multilingual and Cross-Lingual Lexical Semantic Similarity ( http://arxiv.org/abs/2003.04866v1 ) ライセンス: Link先を確認	Ivan Vuli\'c, Simon Baker, Edoardo Maria Ponti, Ulla Petti, Ira Leviant, Kelly Wing, Olga Majewska, Eden Bar, Matt Malone, Thierry Poibeau, Roi Reichart, Anna Korhonen	(参考訳) 大規模な語彙資源と評価ベンチマークであるMulti-SimLexを導入し、主要な言語(中国語、スペイン語、ロシア語など)や低リソースの言語(ウェールズ語、キスワヒリ語など)を含む、12の類型的に多様な言語のデータセットをカバーした。各言語データセットは、意味的類似性の語彙的関係に注釈付けされ、1,888の意味的整合概念ペアを含み、単語クラス(名詞、動詞、形容詞、副詞)、頻度ランク、類似度間隔、語彙フィールド、具体性レベルを代表的にカバーする。さらに、言語間の概念のアラインメントにより、66の言語間の意味的類似性データセットを提供する。広範にわたるサイズと言語カバレッジのため、マルチsimlexは実験的な評価と分析のための全く新しい機会を提供する。モノリンガルおよびクロスリンガルのベンチマークでは,静的および文脈化された単語埋め込み(fastText, M-BERT, XLM など)や外部情報による語彙表現,さらには完全に教師のない(弱く)教師付き言語間単語埋め込みなど,最新のモノリンガルおよびクロスリンガル表現モデルの評価と解析を行った。また、追加言語のための一貫性のあるマルチシンプレックススタイルのリソースを作成するためのステップバイステップのデータセット生成プロトコルを提案する。我々は、これらの貢献 -- マルチsimlexデータセットのパブリックリリース、それらの作成プロトコル、強力なベースライン結果、そして多言語語彙意味論と表現学習の将来の発展を導くのに役立つ深い分析 -- を、コミュニティがより多くの言語にマルチsimlexをさらに拡張するための努力を促すwebサイトを通じて提供します。このような大規模セマンティックリソースは、言語間でのNLPのさらなる進歩を引き起こす可能性がある。 We introduce Multi-SimLex, a large-scale lexical resource and evaluation benchmark covering datasets for 12 typologically diverse languages, including major languages (e.g., Mandarin Chinese, Spanish, Russian) as well as less-resourced ones (e.g., Welsh, Kiswahili). Each language dataset is annotated for the lexical relation of semantic similarity and contains 1,888 semantically aligned concept pairs, providing a representative coverage of word classes (nouns, verbs, adjectives, adverbs), frequency ranks, similarity intervals, lexical fields, and concreteness levels. Additionally, owing to the alignment of concepts across languages, we provide a suite of 66 cross-lingual semantic similarity datasets. Due to its extensive size and language coverage, Multi-SimLex provides entirely novel opportunities for experimental evaluation and analysis. On its monolingual and cross-lingual benchmarks, we evaluate and analyze a wide array of recent state-of-the-art monolingual and cross-lingual representation models, including static and contextualized word embeddings (such as fastText, M-BERT and XLM), externally informed lexical representations, as well as fully unsupervised and (weakly) supervised cross-lingual word embeddings. We also present a step-by-step dataset creation protocol for creating consistent, Multi-Simlex-style resources for additional languages. We make these contributions -- the public release of Multi-SimLex datasets, their creation protocol, strong baseline results, and in-depth analyses which can be be helpful in guiding future developments in multilingual lexical semantics and representation learning -- available via a website which will encourage community effort in further expansion of Multi-Simlex to many more languages. Such a large-scale semantic resource could inspire significant further advances in NLP across languages.	翻訳日:2022-12-24 21:02:00 公開日:2020-03-10
# 制約プログラミングによる道路利用者追跡 Tracking Road Users using Constraint Programming ( http://arxiv.org/abs/2003.04468v1 ) ライセンス: Link先を確認	Alexandre Pineault, Guillaume-Alexandre Bilodeau, Gilles Pesant	(参考訳) 本稿では,都市景観における道路利用者の追跡を改善することを目的とする。本稿では,マルチオブジェクトトラッキング(MOT)問題のトラッキング・バイ・検出パラダイムに見られるデータアソシエーションフェーズに対する制約プログラミング(CP)アプローチを提案する。このようなアプローチは、グラフベースの手法よりも効率的にデータ関連問題を解決でき、複数のフレームを分析した時に発生する組合せ爆発をよりうまく扱うことができる。データアソシエーションの問題に焦点が当てられているため、MOT法では各フレームの中心位置と色である単純な画像特徴のみを用いる。制約はこれらの2つの特徴と一般的なMOT問題に基づいて定義される。例えば、軌跡に対して色覚の保存を強制し、フレーム間の動きの程度を制限する。フィルタ層は、CPを使用する前に検出候補を除去し、CPソルバが生成するダミー軌道を除去するために用いられる。提案手法は車両追跡データを用いてテストし,UA-DETRACベンチマークの上位手法よりも優れた結果を得た。 In this paper, we aim at improving the tracking of road users in urban scenes. We present a constraint programming (CP) approach for the data association phase found in the tracking-by-detection paradigm of the multiple object tracking (MOT) problem. Such an approach can solve the data association problem more efficiently than graph-based methods and can handle better the combinatorial explosion occurring when multiple frames are analyzed. Because our focus is on the data association problem, our MOT method only uses simple image features, which are the center position and color of detections for each frame. Constraints are defined on these two features and on the general MOT problem. For example, we enforce color appearance preservation over trajectories and constrain the extent of motion between frames. Filtering layers are used in order to eliminate detection candidates before using CP and to remove dummy trajectories produced by the CP solver. Our proposed method was tested on a motorized vehicles tracking dataset and produces results that outperform the top methods of the UA-DETRAC benchmark.	翻訳日:2022-12-24 21:01:23 公開日:2020-03-10
# ディープリカレントオートエンコーダを用いた蜂の異常検出 Anomaly Detection in Beehives using Deep Recurrent Autoencoders ( http://arxiv.org/abs/2003.04576v1 ) ライセンス: Link先を確認	Padraig Davidson, Michael Steininger, Florian Lautenschlager, Konstantin Kobs, Anna Krause and Andreas Hotho	(参考訳) 精密ビーキーピングは、ハチにセンサーを装着することで、ハチの生活状態をモニタリングする。これらのハイブによって記録されたデータは、機械学習モデルによって分析され、ミツバチコロニーにおける異常な事象の行動パターンを学習または探索することができる。典型的なターゲットの1つは、経済的理由からミツバチの群れを早期発見することである。先進的な方法は、ハチの病気や技術上の理由、例えばセンサーの故障に起因する他の異常または異常な行動を検出することができる。本稿では,その起源に依存しないデータの任意の種類の異常を検出する深層学習モデルであるオートエンコーダを提案する。我々のモデルは、単純なルールベースのSwarm検出アルゴリズムと同一のSwarmを明らかにすることができるが、他の異常によっても引き起こされる。我々は,異なるヒブと異なるセンサーで収集した実世界のデータセットを用いて,我々のモデルを評価した。 Precision beekeeping allows to monitor bees' living conditions by equipping beehives with sensors. The data recorded by these hives can be analyzed by machine learning models to learn behavioral patterns of or search for unusual events in bee colonies. One typical target is the early detection of bee swarming as apiarists want to avoid this due to economical reasons. Advanced methods should be able to detect any other unusual or abnormal behavior arising from illness of bees or from technical reasons, e.g. sensor failure. In this position paper we present an autoencoder, a deep learning model, which detects any type of anomaly in data independent of its origin. Our model is able to reveal the same swarms as a simple rule-based swarm detection algorithm but is also triggered by any other anomaly. We evaluated our model on real world data sets that were collected on different hives and with different sensor setups.	翻訳日:2022-12-24 20:55:05 公開日:2020-03-10
# データ駆動意思決定におけるグループフェアネスの複数の指標に対処する Addressing multiple metrics of group fairness in data-driven decision making ( http://arxiv.org/abs/2003.04794v1 ) ライセンス: Link先を確認	Marius Miron, Song\"ul Tolan, Emilia G\'omez, Carlos Castillo	(参考訳) 機械学習(fat-ml)文献における公平性、説明可能性、透明性は、性別や人種などの保護された特徴によって特徴付けられる社会デミックグループに対する差別を測定するために、様々な集団フェアネス指標を提案する。私たちは、これらのメトリクスのいくつかが、同じグループと機械学習の方法のために、2、3つのメインクラスタにまとめられるのを観察し、経験的にそうしている。さらに,グループフェアネス尺度の主成分分析(PCA)を用いて,2次元の多次元フェアネスを可視化する頑健な手法を提案する。複数のデータセットに対する実験結果から,PCA分解では測定値のばらつきを1～3成分で説明できることがわかった。 The Fairness, Accountability, and Transparency in Machine Learning (FAT-ML) literature proposes a varied set of group fairness metrics to measure discrimination against socio-demographic groups that are characterized by a protected feature, such as gender or race.Such a system can be deemed as either fair or unfair depending on the choice of the metric. Several metrics have been proposed, some of them incompatible with each other.We do so empirically, by observing that several of these metrics cluster together in two or three main clusters for the same groups and machine learning methods. In addition, we propose a robust way to visualize multidimensional fairness in two dimensions through a Principal Component Analysis (PCA) of the group fairness metrics. Experimental results on multiple datasets show that the PCA decomposition explains the variance between the metrics with one to three components.	翻訳日:2022-12-24 20:54:49 公開日:2020-03-10
# ブートストラップによるSketched SVDの誤差推定 Error Estimation for Sketched SVD via the Bootstrap ( http://arxiv.org/abs/2003.04937v1 ) ライセンス: Link先を確認	Miles E. Lopes and N. Benjamin Erichson and Michael W. Mahoney	(参考訳) 非常に大きな行列の特異値分解(SVD)に対する高速な近似を計算するために、ランダム化されたスケッチアルゴリズムが主要なアプローチとなっている。しかし,SVDをスケッチする上で重要な難しさは,スケッチした特異ベクトル/値と正確な値との距離がわからない点である。実際、ユーザは与えられた問題のユニークな構造を考慮しない分析的な最悪のエラー境界に頼らざるを得ない。結果として、エラー推定ツールの欠如は、本当に必要なものよりもはるかに多くの計算につながることが多い。これらの課題を克服するため,本稿では,スケッチした特異ベクトル/値の実際の誤差を数値的に推定する,データ駆動ブートストラップ法を開発した。特にこれは、ユーザが粗い初期スケッチされたsvdの品質を検査し、所定のエラー許容度に達するのに必要な余分な作業量を適応的に予測することを可能にする。さらに、この方法は、スケッチされたオブジェクトのみで動作し、分解される全行列をパスする必要がなくなるため、計算量的に安価である。最後に、この手法は理論的な保証と非常に奨励的な実験結果によって支持される。 In order to compute fast approximations to the singular value decompositions (SVD) of very large matrices, randomized sketching algorithms have become a leading approach. However, a key practical difficulty of sketching an SVD is that the user does not know how far the sketched singular vectors/values are from the exact ones. Indeed, the user may be forced to rely on analytical worst-case error bounds, which do not account for the unique structure of a given problem. As a result, the lack of tools for error estimation often leads to much more computation than is really necessary. To overcome these challenges, this paper develops a fully data-driven bootstrap method that numerically estimates the actual error of sketched singular vectors/values. In particular, this allows the user to inspect the quality of a rough initial sketched SVD, and then adaptively predict how much extra work is needed to reach a given error tolerance. Furthermore, the method is computationally inexpensive, because it operates only on sketched objects, and it requires no passes over the full matrix being factored. Lastly, the method is supported by theoretical guarantees and a very encouraging set of experimental results.	翻訳日:2022-12-24 20:54:10 公開日:2020-03-10
# 熱帯低気圧に対するベイズ間隔の予測 Prediction of Bayesian Intervals for Tropical Storms ( http://arxiv.org/abs/2003.05024v1 ) ライセンス: Link先を確認	Max Chiswick and Sam Ganzfried	(参考訳) リカレントニューラルネットワーク(RNN)を用いたハリケーンの軌道予測に関する最近の研究に基づいて,改良手法を開発し,単純な点推定に加えてベイズ区間の予測手法を一般化した。熱帯の嵐は深刻な被害を引き起こす可能性があるため、その軌道を正確に予測することは、特に気候変動の影響により、都市や生活に大きな利益をもたらす可能性がある。 RNNにおける降雨量を用いたベイズ区間の実施により, 降水地域を推定するなど, 予測の動作性の向上が図られる。我々は嵐の軌跡を6時間間隔で予測するためにRNNを使用した。大西洋で約500の熱帯嵐の統計ハリケーン強度予測スキーム(SHIPS)データから,緯度,経度,風速,気圧の特徴を抽出した。結果は,ニューラルネットワークのドロップアウト値が予測と間隔にどのように影響するかを示す。 Building on recent research for prediction of hurricane trajectories using recurrent neural networks (RNNs), we have developed improved methods and generalized the approach to predict Bayesian intervals in addition to simple point estimates. Tropical storms are capable of causing severe damage, so accurately predicting their trajectories can bring significant benefits to cities and lives, especially as they grow more intense due to climate change effects. By implementing the Bayesian interval using dropout in an RNN, we improve the actionability of the predictions, for example by estimating the areas to evacuate in the landfall region. We used an RNN to predict the trajectory of the storms at 6-hour intervals. We used latitude, longitude, windspeed, and pressure features from a Statistical Hurricane Intensity Prediction Scheme (SHIPS) dataset of about 500 tropical storms in the Atlantic Ocean. Our results show how neural network dropout values affect predictions and intervals.	翻訳日:2022-12-24 20:53:51 公開日:2020-03-10
# 頂点時間自己回帰モデルを用いたグラフ上の適応信号処理法 Methods of Adaptive Signal Processing on Graphs Using Vertex-Time Autoregressive Models ( http://arxiv.org/abs/2003.05729v1 ) ライセンス: Link先を確認	Thiernithi Variddhisai, Danilo Mandic	(参考訳) ランダムプロセスの概念は、最近グラフ信号に拡張され、ランダムグラフプロセスは、係数が \textit{graph-topological} 構造を持つ行列である多変量確率過程のクラスである。したがって、ランダムグラフプロセスのシステム同定問題は、その基礎となるトポロジーを決定すること、または数学的にグラフシフト演算子(gsos)、すなわち隣接行列やラプラシアン行列を決定することで解決される。ランダムグラフ処理を導入したのと同じ研究で、gso の解法である \textit{batch} の最適化手法が \textit{causal} 頂点時間自己回帰モデルに基づくランダムグラフプロセスに対して提案されている。この目的のために,適応フィルタリングの枠組みを用いて,最適化問題のオンライン版を提案した。修正確率勾配投影法は, 正規化最小二乗の目的に応用し, フィルタを試作した。再帰は3つの正規化サブプロブレムに分けられ、多重凸性、疎性、可換性、バイアスといった問題に対処する。収束分析に関する議論も含んでいる。最後に,提案アルゴリズムの性能を,従来のMSE測度から,正しい値に拘わらず良好な回復率まで,その可能性,限界,および本研究の可能性に光を当てる実験を行った。 The concept of a random process has been recently extended to graph signals, whereby random graph processes are a class of multivariate stochastic processes whose coefficients are matrices with a \textit{graph-topological} structure. The system identification problem of a random graph process therefore revolves around determining its underlying topology, or mathematically, the graph shift operators (GSOs) i.e. an adjacency matrix or a Laplacian matrix. In the same work that introduced random graph processes, a \textit{batch} optimization method to solve for the GSO was also proposed for the random graph process based on a \textit{causal} vertex-time autoregressive model. To this end, the online version of this optimization problem was proposed via the framework of adaptive filtering. The modified stochastic gradient projection method was employed on the regularized least squares objective to create the filter. The recursion is divided into 3 regularized sub-problems to address issues like multi-convexity, sparsity, commutativity and bias. A discussion on convergence analysis is also included. Finally, experiments are conducted to illustrate the performance of the proposed algorithm, from traditional MSE measure to successful recovery rate regardless correct values, all of which to shed light on the potential, the limit and the possible research attempt of this work.	翻訳日:2022-12-24 20:53:37 公開日:2020-03-10
# 機械学習による電力グリッド内のCO2排出強度の短期予測 Short-Term Forecasting of CO2 Emission Intensity in Power Grids by Machine Learning ( http://arxiv.org/abs/2003.05740v1 ) ライセンス: Link先を確認	Kenneth Leerbeck and Peder Bacher and Rune Junker and Goran Goranovi\'c and Olivier Corradi and Razgar Ebrahimy and Anna Tveit and Henrik Madsen	(参考訳) デンマークの入札ゾーンDK2における電力グリッドのCO2排出強度を予測し、平均と限界の排出量を区別する機械学習アルゴリズムを開発した。この分析は、電力生産、需要、輸入、気象条件など、選択された近隣地域から収集された膨大な数(473)の説明変数からなるデータセット上で行われた。この数は、lasso (penalized linear regression analysis) と前方特徴選択アルゴリズムの両方を用いて50未満に削減された。データの異なる側面(非線形性や変数の結合など)を捉えた3つの線形回帰モデルを作成し,ソフトマックス重み付き平均を用いて最終モデルに組み合わせた。残差を補正するために実装された脱バイアスおよび自己回帰移動平均モデル(ARIMA)に対してクロスバリデーションを行い、最終モデルは外因性入力(ARIMAX)を持つ変種とする。対応する不確実性の予測は6時間以下と2つの時間軸で与えられる。限界放射はdk2ゾーンのあらゆる条件とは独立に発生し、限界発生器は近隣のゾーンにあることを示唆している。開発手法は欧州電力網の入札ゾーンに適用でき、このゾーンに関する詳細な知識を必要とせずに適用できる。 A machine learning algorithm is developed to forecast the CO2 emission intensities in electrical power grids in the Danish bidding zone DK2, distinguishing between average and marginal emissions. The analysis was done on data set comprised of a large number (473) of explanatory variables such as power production, demand, import, weather conditions etc. collected from selected neighboring zones. The number was reduced to less than 50 using both LASSO (a penalized linear regression analysis) and a forward feature selection algorithm. Three linear regression models that capture different aspects of the data (non-linearities and coupling of variables etc.) were created and combined into a final model using Softmax weighted average. Cross-validation is performed for debiasing and autoregressive moving average model (ARIMA) implemented to correct the residuals, making the final model the variant with exogenous inputs (ARIMAX). The forecasts with the corresponding uncertainties are given for two time horizons, below and above six hours. Marginal emissions came up independent of any conditions in the DK2 zone, suggesting that the marginal generators are located in the neighbouring zones. The developed methodology can be applied to any bidding zone in the European electricity network without requiring detailed knowledge about the zone.	翻訳日:2022-12-24 20:53:13 公開日:2020-03-10
# 創始者理論の解明に向けて Towards Clarifying the Theory of the Deconfounder ( http://arxiv.org/abs/2003.04948v1 ) ライセンス: Link先を確認	Yixin Wang, David M. Blei	(参考訳) Wang and Blei (2019) は複数の因果推論を研究し、デコンファウンデーションアルゴリズムを提案する。理論的要件を論じ,実証的研究を行う。創始者理論に関するいくつかの改良が提案されている。このうち、今井と江は「観測されていない単一原因の共同設立者」という仮定を明確にした。それらの仮定を用いて、本論文は理論を明確にする。さらに、ogburn et al. (2020) はこの理論の反例を提案する。しかし、提案された反例は要求された仮定を満たさない。 Wang and Blei (2019) studies multiple causal inference and proposes the deconfounder algorithm. The paper discusses theoretical requirements and presents empirical studies. Several refinements have been suggested around the theory of the deconfounder. Among these, Imai and Jiang clarified the assumption of "no unobserved single-cause confounders." Using their assumption, this paper clarifies the theory. Furthermore, Ogburn et al. (2020) proposes counterexamples to the theory. But the proposed counterexamples do not satisfy the required assumptions.	翻訳日:2022-12-24 20:45:57 公開日:2020-03-10
# ステッカーで応答する学習:マルチターンダイアログにおけるマルチモーダルの統合フレームワーク Learning to Respond with Stickers: A Framework of Unifying Multi-Modality in Multi-Turn Dialog ( http://arxiv.org/abs/2003.04679v1 ) ライセンス: Link先を確認	Shen Gao, Xiuying Chen, Chang Liu, Li Liu, Dongyan Zhao and Rui Yan	(参考訳) オンラインメッセージングアプリでは、鮮やかで魅力的な表情のステッカーが人気を集めており、ステッカーのテキストラベルと以前の発話をマッチさせることで、ステッカー応答を自動的に選択する作業もある。しかし、その量が多いため、すべてのステッカーにテキストラベルを必要とするのは現実的ではない。そこで本稿では,外部ラベルを使わずに,複数ターンのダイアログコンテキスト履歴に基づいて適切なステッカーをユーザに推奨する。この課題には2つの大きな課題がある。 1つは、対応するテキストラベルなしでステッカーの意味を学ぶことである。もう一つの課題は、マルチターンダイアログコンテキストで候補ステッカーを共同でモデル化することである。これらの課題に対処するために、ステッカー応答セレクタ(SRS)モデルを提案する。具体的には、まず、畳み込みベースのステッカー画像エンコーダとセルフアテンションベースのマルチターンダイアログエンコーダを使用して、ステッカーと発話の表現を得る。次に,対話履歴内の各発話とステッカー間の深いマッチングを行うために,ディープインタラクションネットワークを提案する。次にsrsは、フュージョンネットワークによってすべてのインタラクション結果間の短期的および長期的な依存関係を学び、最終マッチングスコアを出力する。提案手法を評価するために,最も人気のあるオンラインチャットプラットフォームの1つであるステッカーを用いた大規模実世界のダイアログデータセットを収集した。このデータセットで行った大規模な実験により、我々のモデルは、一般的に使用されるすべてのメトリクスに対して最先端のパフォーマンスを達成することが示された。実験はまた、SRSの各コンポーネントの有効性を検証する。ステッカー選択フィールドのさらなる研究を容易にするため,このデータセットを340Kマルチターンダイアログとステッカーペアでリリースする。 Stickers with vivid and engaging expressions are becoming increasingly popular in online messaging apps, and some works are dedicated to automatically select sticker response by matching text labels of stickers with previous utterances. However, due to their large quantities, it is impractical to require text labels for the all stickers. Hence, in this paper, we propose to recommend an appropriate sticker to user based on multi-turn dialog context history without any external labels. Two main challenges are confronted in this task. One is to learn semantic meaning of stickers without corresponding text labels. Another challenge is to jointly model the candidate sticker with the multi-turn dialog context. To tackle these challenges, we propose a sticker response selector (SRS) model. Specifically, SRS first employs a convolutional based sticker image encoder and a self-attention based multi-turn dialog encoder to obtain the representation of stickers and utterances. Next, deep interaction network is proposed to conduct deep matching between the sticker with each utterance in the dialog history. SRS then learns the short-term and long-term dependency between all interaction results by a fusion network to output the the final matching score. To evaluate our proposed method, we collect a large-scale real-world dialog dataset with stickers from one of the most popular online chatting platform. Extensive experiments conducted on this dataset show that our model achieves the state-of-the-art performance for all commonly-used metrics. Experiments also verify the effectiveness of each component of SRS. To facilitate further research in sticker selection field, we release this dataset of 340K multi-turn dialog and sticker pairs.	翻訳日:2022-12-24 20:45:11 公開日:2020-03-10
# PnP-Net: ハイブリッドなパースペクティブnポイントネットワーク PnP-Net: A hybrid Perspective-n-Point Network ( http://arxiv.org/abs/2003.04626v1 ) ライセンス: Link先を確認	Roy Sheffer, Ami Wiesel	(参考訳) 我々は,ディープラーニングとモデルベースアルゴリズムを組み合わせたハイブリッド手法を用いて,pnp問題を考える。 PnPは、世界の3Dポイントのセットと、画像中の対応する2Dプロジェクションが与えられたキャリブレーションカメラのポーズを推定する問題である。より困難なロバストなバージョンでは、いくつかの対応がミスマッチし、効率的に破棄されなければならない。古典的解法は、問題の幾何を利用するが不正確なか計算集約的な反復的頑健な非線形最小二乗法を介してPnPに対処する。対照的に、深層学習の初期フェーズとモデルに基づく微調整フェーズを組み合わせることを提案する。 pnp-netで表されるこのハイブリッドアプローチは、応答誤差と雑音の下で未知のポーズパラメータを、計算複雑性の低さと固定の要件で推定することに成功している。合成データと実世界のデータの両方にその利点を示す。 We consider the robust Perspective-n-Point (PnP) problem using a hybrid approach that combines deep learning with model based algorithms. PnP is the problem of estimating the pose of a calibrated camera given a set of 3D points in the world and their corresponding 2D projections in the image. In its more challenging robust version, some of the correspondences may be mismatched and must be efficiently discarded. Classical solutions address PnP via iterative robust non-linear least squares method that exploit the problem's geometry but are either inaccurate or computationally intensive. In contrast, we propose to combine a deep learning initial phase followed by a model-based fine tuning phase. This hybrid approach, denoted by PnP-Net, succeeds in estimating the unknown pose parameters under correspondence errors and noise, with low and fixed computational complexity requirements. We demonstrate its advantages on both synthetic data and real world data.	翻訳日:2022-12-24 20:44:43 公開日:2020-03-10
# $\beta$-VAEの潜在空間を用いたマルチラベルデータセットの分布外検出 Out-of-Distribution Detection in Multi-Label Datasets using Latent Space of $\beta$-VAE ( http://arxiv.org/abs/2003.08740v1 ) ライセンス: Link先を確認	Vijaya Kumar Sundar, Shreyas Ramakrishna, Zahra Rahiminasab, Arvind Easwaran, Abhishek Dubey	(参考訳) 学習可能コンポーネント(LEC)は、イメージセグメンテーション、オブジェクト検出、エンドツーエンドの駆動など、さまざまな認識に基づく自律的なタスクで広く使用されている。これらのコンポーネントは、気象条件や日時、トラフィック密度といったマルチモーダルな要素を持つ大規模なイメージデータセットでトレーニングされる。 LECはトレーニング中にこれらの要因から学習し、これらの要因にばらつきがあるかどうかをテストする間、コンポーネントは混乱し、信頼性が低い。トレーニング中に見えない要因のイメージは、一般的にout-of-Distribution (OOD)と呼ばれる。安全な自律のためには、OOD画像の識別が重要であり、適切な緩和戦略が実行可能である。 SVMやSVDDのような古典的な一級分類器はOOD検出に使用される。しかし、これらのデータセットのイメージにアタッチされた複数のラベルは、これらのテクニックの直接適用を制限する。我々は、$\beta$-variational autoencoder ($\beta$-vae) の潜在空間を用いてこの問題に対処する。適切に選択された$\beta$-vae によって生成されるコンパクトな潜在空間は、これらの因子に関する情報をいくつかの潜在変数にエンコードし、計算的に安価な検出に使うことができる。我々はnuScenesデータセットに対するアプローチを評価し,この結果から生成因子の値の変化をエンコードするために$\beta$-VAEの潜伏空間が敏感であることが示唆された。 Learning Enabled Components (LECs) are widely being used in a variety of perception based autonomy tasks like image segmentation, object detection, end-to-end driving, etc. These components are trained with large image datasets with multimodal factors like weather conditions, time-of-day, traffic-density, etc. The LECs learn from these factors during training, and while testing if there is variation in any of these factors, the components get confused resulting in low confidence predictions. The images with factors not seen during training is commonly referred to as Out-of-Distribution (OOD). For safe autonomy it is important to identify the OOD images, so that a suitable mitigation strategy can be performed. Classical one-class classifiers like SVM and SVDD are used to perform OOD detection. However, the multiple labels attached to the images in these datasets, restricts the direct application of these techniques. We address this problem using the latent space of the $\beta$-Variational Autoencoder ($\beta$-VAE). We use the fact that compact latent space generated by an appropriately selected $\beta$-VAE will encode the information about these factors in a few latent variables, and that can be used for computationally inexpensive detection. We evaluate our approach on the nuScenes dataset, and our results shows the latent space of $\beta$-VAE is sensitive to encode changes in the values of the generative factor.	翻訳日:2022-12-24 20:44:27 公開日:2020-03-10
# 単語埋め込み規則化とソフト類似度尺度を用いたテキスト分類 Text classification with word embedding regularization and soft similarity measure ( http://arxiv.org/abs/2003.05019v1 ) ライセンス: Link先を確認	V\'it Novotn\'y, Eniafe Festus Ayetiran, Michal \v{S}tef\'anik, and Petr Sojka	(参考訳) Mikolovらの独創的な作品以来、単語の埋め込みは多くの自然言語処理タスクにおいて好まれる単語表現となっている。 SCM(Soft Cosine measure)やWord Mover's Distance(Word Mover's Distance)などの単語埋め込みから抽出した文書類似度尺度を報告し,意味的テキスト類似度とテキスト分類の最先端性能を実現する。テキスト分類と意味的テキスト類似性においてWMDの強い性能にもかかわらず、その超キュービック平均時間複雑性は実用的ではない。 SCMは2次最悪の時間複雑性を持つが、テキスト分類における性能はWMDと比較されることはなかった。近年, 2つの単語埋め込み正規化手法が, 記憶コストと記憶コストの低減, 学習速度の向上, 文書処理速度の向上, 単語アナロジー, 単語類似性, 意味テキスト類似性の向上に寄与した。しかし,これらの手法がテキスト分類に与える影響についてはまだ研究されていない。本研究では,文書処理速度と文書分類におけるscmとwmdのタスク性能に対する2つの単語埋め込み正規化手法の個人および共同効果について検討した。評価には、$k$NN分類器と、BBCSport、TWITTER、OHSUMED、REUTERS-21578、AMAZON、20NEWSの6つの標準データセットを使用する。正規化単語埋め込みによる平均$k$NNテスト誤差の39%を非正規化単語埋め込みと比較した。本稿では,コレスキー分解による正規化埋め込みの導出について述べる。また、正規化語埋め込みによるSCMはテキスト分類においてWMDよりも優れ、1万倍以上高速であることを示す。 Since the seminal work of Mikolov et al., word embeddings have become the preferred word representations for many natural language processing tasks. Document similarity measures extracted from word embeddings, such as the soft cosine measure (SCM) and the Word Mover's Distance (WMD), were reported to achieve state-of-the-art performance on semantic text similarity and text classification. Despite the strong performance of the WMD on text classification and semantic text similarity, its super-cubic average time complexity is impractical. The SCM has quadratic worst-case time complexity, but its performance on text classification has never been compared with the WMD. Recently, two word embedding regularization techniques were shown to reduce storage and memory costs, and to improve training speed, document processing speed, and task performance on word analogy, word similarity, and semantic text similarity. However, the effect of these techniques on text classification has not yet been studied. In our work, we investigate the individual and joint effect of the two word embedding regularization techniques on the document processing speed and the task performance of the SCM and the WMD on text classification. For evaluation, we use the $k$NN classifier and six standard datasets: BBCSPORT, TWITTER, OHSUMED, REUTERS-21578, AMAZON, and 20NEWS. We show 39% average $k$NN test error reduction with regularized word embeddings compared to non-regularized word embeddings. We describe a practical procedure for deriving such regularized embeddings through Cholesky factorization. We also show that the SCM with regularized word embeddings significantly outperforms the WMD on text classification and is over 10,000 times faster.	翻訳日:2022-12-24 20:44:01 公開日:2020-03-10
# 車線維持車両のエンド・ツー・エンド制御のためのスパイクニューラルネットワークの間接的および直接訓練 Indirect and Direct Training of Spiking Neural Networks for End-to-End Control of a Lane-Keeping Vehicle ( http://arxiv.org/abs/2003.04603v1 ) ライセンス: Link先を確認	Zhenshan Bing, Claus Meschede, Guang Chen, Alois Knoll, Kai Huang	(参考訳) 生物学的シナプス可塑性に基づくスパイクニューラルネットワーク(snn)の構築は、高速でエネルギー効率の良いコンピューティングを実現する有望な可能性を秘めている。しかし,ロボット分野におけるSNNの実装は,実践的な訓練方法の欠如により制限されている。そこで本稿では,車線維持車両におけるSNNの間接的および直接的エンドツーエンドのトレーニング手法を紹介する。まず,<textcolor{black}{Deep Q-Learning} (DQN) アルゴリズムを用いて学習し,その後,教師あり学習を用いてSNNに転送する。第二に, 強化学習の利点とstdp(spike-timing-dependent plasticity)の利点を併せ持つため, 直接sns訓練にr-stdp(reward-modulated spike-timing-dependent plasticity)を採用する。イベントベースニューロモルフィック視覚センサを用いて,ロボットが車線標識内に留まるように制御される3つのシナリオにおいて提案手法を検討する。本稿では,R-STDP手法の利点を,他の3つのアルゴリズムと比較することにより,横方向のローカライゼーション精度とトレーニング時間ステップの観点から明らかにする。 Building spiking neural networks (SNNs) based on biological synaptic plasticities holds a promising potential for accomplishing fast and energy-efficient computing, which is beneficial to mobile robotic applications. However, the implementations of SNNs in robotic fields are limited due to the lack of practical training methods. In this paper, we therefore introduce both indirect and direct end-to-end training methods of SNNs for a lane-keeping vehicle. First, we adopt a policy learned using the \textcolor{black}{Deep Q-Learning} (DQN) algorithm and then subsequently transfer it to an SNN using supervised learning. Second, we adopt the reward-modulated spike-timing-dependent plasticity (R-STDP) for training SNNs directly, since it combines the advantages of both reinforcement learning and the well-known spike-timing-dependent plasticity (STDP). We examine the proposed approaches in three scenarios in which a robot is controlled to keep within lane markings by using an event-based neuromorphic vision sensor. We further demonstrate the advantages of the R-STDP approach in terms of the lateral localization accuracy and training time steps by comparing them with other three algorithms presented in this paper.	翻訳日:2022-12-24 20:43:18 公開日:2020-03-10
# マルチウェイデータのためのオンラインテンソル学習 Online Tensor-Based Learning for Multi-Way Data ( http://arxiv.org/abs/2003.04497v1 ) ライセンス: Link先を確認	Ali Anaissi, Basem Suleiman, Seid Miad Zandavi	(参考訳) テンソル $\mathcal{X} \in \mathbb{R} ^{I_1 \times \dots \times I_N} $ に格納されたマルチウェイデータのオンライン解析は、基礎となる構造を捕捉し、予測モデルを学ぶのに使用できるセンシティブな特徴を抽出するための重要なツールとなっている。しかし、データ分布はしばしば時間とともに進化し、現在の予測モデルは将来十分に代表されないかもしれない。したがって、このような状況ではテンソルベースの特徴とモデル係数を段階的に更新する必要がある。オンラインの$CANDECOMP/PARAFAC$ (CP)分解において, テンソルを用いた新しい効率的な特徴抽出法NeSGDを提案する。 nesgdの結果から得られた新しい特徴によると、オンライン予測モデルの更新プロセスのために新しい基準がトリガーされる。実験室ベースおよび実生活構造データセットを用いた構造健康モニタリングの分野での実験的な評価は,既存のオンラインテンソル解析やモデル学習と比較して,より正確な結果が得られることを示している。その結果,提案手法は分類誤り率を大幅に改善し,時間とともに正のデータ分布の変化を同化することができ,全てのケーススタディにおいて高い予測精度を維持した。 The online analysis of multi-way data stored in a tensor $\mathcal{X} \in \mathbb{R} ^{I_1 \times \dots \times I_N} $ has become an essential tool for capturing the underlying structures and extracting the sensitive features which can be used to learn a predictive model. However, data distributions often evolve with time and a current predictive model may not be sufficiently representative in the future. Therefore, incrementally updating the tensor-based features and model coefficients are required in such situations. A new efficient tensor-based feature extraction, named NeSGD, is proposed for online $CANDECOMP/PARAFAC$ (CP) decomposition. According to the new features obtained from the resultant matrices of NeSGD, a new criteria is triggered for the updated process of the online predictive model. Experimental evaluation in the field of structural health monitoring using laboratory-based and real-life structural datasets show that our methods provide more accurate results compared with existing online tensor analysis and model learning. The results showed that the proposed methods significantly improved the classification error rates, were able to assimilate the changes in the positive data distribution over time, and maintained a high predictive accuracy in all case studies.	翻訳日:2022-12-24 20:36:56 公開日:2020-03-10
# 非均一密度入力のためのニューラルネットワークの周波数バイアス Frequency Bias in Neural Networks for Input of Non-Uniform Density ( http://arxiv.org/abs/2003.04560v1 ) ライセンス: Link先を確認	Ronen Basri, Meirav Galun, Amnon Geifman, David Jacobs, Yoni Kasten, Shira Kritchman	(参考訳) 最近の研究は、過パラメータニューラルネットの一般化能力を周波数バイアスに帰している。一様分布から引き出されたデータに勾配降下を訓練したネットワークは、高周波のニューラルネットワークよりも低い周波数に適合する。現実的なトレーニングセットは均一な分布から引き出されないため、我々はニューラルネットワーク・タンジェント・カーネル(NTK)モデルを用いて、学習力学における変動密度の影響を探索する。その結果、周波数の純調和関数である$\kappa$ を学習すると、点 $\x \in \sphere^{d-1}$ での収束は時刻 $o(\kappa^d/p(\x))$ ここで $p(\x)$ は局所密度 $\x$ を表す。具体的には、$\Sphere^1$のデータに対して、2層ネットワークのNTKに関連するカーネルの固有関数を解析的に導出する。さらに、NTKのスペクトル分解に関して、深い完全連結ネットワークに対する収束結果を証明した。実験では,このモデルにおける深層ネットワークと浅層ネットワークの類似性と差異に注目した。 Recent works have partly attributed the generalization ability of over-parameterized neural networks to frequency bias -- networks trained with gradient descent on data drawn from a uniform distribution find a low frequency fit before high frequency ones. As realistic training sets are not drawn from a uniform distribution, we here use the Neural Tangent Kernel (NTK) model to explore the effect of variable density on training dynamics. Our results, which combine analytic and empirical observations, show that when learning a pure harmonic function of frequency $\kappa$, convergence at a point $\x \in \Sphere^{d-1}$ occurs in time $O(\kappa^d/p(\x))$ where $p(\x)$ denotes the local density at $\x$. Specifically, for data in $\Sphere^1$ we analytically derive the eigenfunctions of the kernel associated with the NTK for two-layer networks. We further prove convergence results for deep, fully connected networks with respect to the spectral decomposition of the NTK. Our empirical study highlights similarities and differences between deep and shallow networks in this model.	翻訳日:2022-12-24 20:35:09 公開日:2020-03-10
# 時系列データの曖昧性:連続モデルによる不確実な未来予測 Ambiguity in Sequential Data: Predicting Uncertain Futures with Recurrent Models ( http://arxiv.org/abs/2003.10381v1 ) ライセンス: Link先を確認	Alessandro Berlati, Oliver Scheel, Luigi Di Stefano, Federico Tombari	(参考訳) あいまいさは多くの機械学習タスクに本質的に存在するが、特にシーケンシャルモデルでは、ほとんどの場合単一の予測しか出力しないため、ほとんど考慮されない。本研究では,逐次データを用いた曖昧な予測を扱うために,多重仮説予測(multiple hypothesis prediction,mhp)モデルの拡張を提案する。我々のアプローチは最も一般的な繰り返しアーキテクチャに適用でき、損失関数で使用できます。さらに,不確かさを考慮し,複数のラベルが存在する場合の正確さの直感的な理解と一致した,あいまいな問題に対する新しい尺度を提案する。提案手法は, 軌道予測や操作予測などの時系列データを扱う様々なタスクにおいて, 有望な結果を達成するために, 実験を行った。 Ambiguity is inherently present in many machine learning tasks, but especially for sequential models seldom accounted for, as most only output a single prediction. In this work we propose an extension of the Multiple Hypothesis Prediction (MHP) model to handle ambiguous predictions with sequential data, which is of special importance, as often multiple futures are equally likely. Our approach can be applied to the most common recurrent architectures and can be used with any loss function. Additionally, we introduce a novel metric for ambiguous problems, which is better suited to account for uncertainties and coincides with our intuitive understanding of correctness in the presence of multiple labels. We test our method on several experiments and across diverse tasks dealing with time series data, such as trajectory forecasting and maneuver prediction, achieving promising results.	翻訳日:2022-12-24 20:26:22 公開日:2020-03-10
# 日本人における人間の行動記述のためのビデオ字幕データセット Video Caption Dataset for Describing Human Actions in Japanese ( http://arxiv.org/abs/2003.04865v1 ) ライセンス: Link先を確認	Yutaro Shigeto, Yuya Yoshikawa, Jiaqing Lin, Akikazu Takeuchi	(参考訳) 近年,自動字幕生成が注目されている。本稿では,人間の行動を記述するための日本語字幕の生成に焦点をあてる。現在利用可能なほとんどのビデオキャプションデータセットは英語で構築されているが、同等の日本語データセットはない。そこで我々は,79,822本,399,233本からなる大規模日本ビデオキャプションデータセットを構築した。データセットの各キャプションは、"誰がどこで何をするのか"という形式でビデオを記述する。人間の行動を説明するには、人、場所、行動の詳細を特定することが重要である。実際、人間の行動を説明するとき、通常、場面、人物、行動について言及する。本実験では,2つのキャプション生成手法を評価し,ベンチマーク結果を得た。さらに,これらの生成手法が「何をどこで行うか」を特定できるかどうかを検討した。 In recent years, automatic video caption generation has attracted considerable attention. This paper focuses on the generation of Japanese captions for describing human actions. While most currently available video caption datasets have been constructed for English, there is no equivalent Japanese dataset. To address this, we constructed a large-scale Japanese video caption dataset consisting of 79,822 videos and 399,233 captions. Each caption in our dataset describes a video in the form of "who does what and where." To describe human actions, it is important to identify the details of a person, place, and action. Indeed, when we describe human actions, we usually mention the scene, person, and action. In our experiments, we evaluated two caption generation methods to obtain benchmark results. Further, we investigated whether those generation methods could specify "who does what and where."	翻訳日:2022-12-24 20:25:23 公開日:2020-03-10
# TyDi QA: タイポロジー多言語における情報探索質問回答のベンチマーク TyDi QA: A Benchmark for Information-Seeking Question Answering in Typologically Diverse Languages ( http://arxiv.org/abs/2003.05002v1 ) ライセンス: Link先を確認	Jonathan H. Clark, Eunsol Choi, Michael Collins, Dan Garrette, Tom Kwiatkowski, Vitaly Nikolaev, and Jennimaria Palomaki	(参考訳) 多言語モデリングを確実に進めるためには、挑戦的で信頼できる評価が必要である。我々はTyDi QA--204Kの問合せ対を持つ11の類型的多様言語を対象とした質問応答データセットを提案する。 tydi qaの言語は、それぞれの言語が表現する言語的特徴のセットという、そのタイポロジーに関して多種多様であり、このセットでうまく機能するモデルが、世界の多くの言語にまたがって一般化することを期待しています。本稿では、英語のみのコーパスでは見つからない観察された言語現象のデータ品質と例レベルの質的言語分析について定量的に分析する。リアルな情報検索タスクであって、プライミング効果を回避し、回答を知りたいがまだ答えがわからない人々によって質問が書かれ、翻訳を使わずに各言語でデータを直接収集する。 Confidently making progress on multilingual modeling requires challenging, trustworthy evaluations. We present TyDi QA---a question answering dataset covering 11 typologically diverse languages with 204K question-answer pairs. The languages of TyDi QA are diverse with regard to their typology---the set of linguistic features each language expresses---such that we expect models performing well on this set to generalize across a large number of the world's languages. We present a quantitative analysis of the data quality and example-level qualitative linguistic analyses of observed language phenomena that would not be found in English-only corpora. To provide a realistic information-seeking task and avoid priming effects, questions are written by people who want to know the answer, but don't know the answer yet, and the data is collected directly in each language without the use of translation.	翻訳日:2022-12-24 20:25:12 公開日:2020-03-10
# 生成モデルを用いた大規模自然言語逆例の生成 Generating Natural Language Adversarial Examples on a Large Scale with Generative Models ( http://arxiv.org/abs/2003.10388v1 ) ライセンス: Link先を確認	Yankun Ren and Jianbin Lin and Siliang Tang and Jun Zhou and Shuang Yang and Yuan Qi and Xiang Ren	(参考訳) 現在、テキスト分類モデルは広く使われている。しかし、これらの分類器は逆例によって容易に騙される。幸いなことに、標準的な攻撃方法は、対向テキストを生成する。つまり、逆テキストは、いくつかの単語を置き換えることで、現実世界のテキストからのみ生成することができる。多くのアプリケーションでは、これらのテキストは数に制限があるため、その逆の例はしばしば多様ではなく、時には読みにくいため、人間が容易に検出でき、大規模にカオスを起こすことができない。本稿では,テキストの摂動に制限されない生成モデルを用いて,テキストをスクラッチから効率的に生成するエンド・ツー・エンドのソリューションを提案する。これを非制限逆テキスト生成と呼ぶ。具体的には,条件付き変分オートエンコーダ(VAE)を学習し,さらに逆転損失を加えて,逆転例の生成を誘導する。さらに,敵対的テキストの妥当性を向上させるために,実データと一致するように,識別器とGAN(Generative Adversarial Network)のトレーニングフレームワークを利用する。感情分析実験により,本手法のスケーラビリティと効率性を示す。既存の手法よりも高い成功率でテキスト分類モデルを攻撃することができ、一方で人間には許容できる品質を提供する。 Today text classification models have been widely used. However, these classifiers are found to be easily fooled by adversarial examples. Fortunately, standard attacking methods generate adversarial texts in a pair-wise way, that is, an adversarial text can only be created from a real-world text by replacing a few words. In many applications, these texts are limited in numbers, therefore their corresponding adversarial examples are often not diverse enough and sometimes hard to read, thus can be easily detected by humans and cannot create chaos at a large scale. In this paper, we propose an end to end solution to efficiently generate adversarial texts from scratch using generative models, which are not restricted to perturbing the given texts. We call it unrestricted adversarial text generation. Specifically, we train a conditional variational autoencoder (VAE) with an additional adversarial loss to guide the generation of adversarial examples. Moreover, to improve the validity of adversarial texts, we utilize discrimators and the training framework of generative adversarial networks (GANs) to make adversarial texts consistent with real data. Experimental results on sentiment analysis demonstrate the scalability and efficiency of our method. It can attack text classification models with a higher success rate than existing methods, and provide acceptable quality for humans in the meantime.	翻訳日:2022-12-24 20:17:52 公開日:2020-03-10
# 高度不均衡データに基づく適応的名前認識 Adaptive Name Entity Recognition under Highly Unbalanced Data ( http://arxiv.org/abs/2003.10296v1 ) ライセンス: Link先を確認	Thong Nguyen, Duy Nguyen, Pramod Rao	(参考訳) 自然言語処理(nlp)において、情報抽出、感情分析、チャットボットなどいくつかの目的において、名前付きエンティティ認識(ner)は、テキスト中のエンティティを人名、場所、量、組織、パーセンテージなどの予め定義されたグループに分類し分類する上で重要な役割を担っている。本稿では,両方向LSTM(BI-LSTM)層上に積み重ねた条件付きランダムフィールド(CRF)層からなるニューラルアーキテクチャの実験を行った。さらに、巨大なコーパス上で事前学習された埋め込みベクトル(glove, bert)の融合入力を用いて、モデルの一般化能力を高める。残念ながら、重いアンバランスな分散クロストレーニングデータのために、両方のアプローチはトレーニングの少ないサンプルクラスで悪いパフォーマンスを達成した。この課題を克服するために、文を弱クラスと強クラスに分割し、各セットのパフォーマンスを最適化するために2つのBi-LSTM-CRFモデルを適切に設計するアドオン分類モデルを導入する。テストセット上でのモデル評価を行い,他のクラスと比較して非常に小さなデータセット(約 0.45 %)を使用することで,Weakクラスの性能を著しく向上できることを確認した。 For several purposes in Natural Language Processing (NLP), such as Information Extraction, Sentiment Analysis or Chatbot, Named Entity Recognition (NER) holds an important role as it helps to determine and categorize entities in text into predefined groups such as the names of persons, locations, quantities, organizations or percentages, etc. In this report, we present our experiments on a neural architecture composed of a Conditional Random Field (CRF) layer stacked on top of a Bi-directional LSTM (BI-LSTM) layer for solving NER tasks. Besides, we also employ a fusion input of embedding vectors (Glove, BERT), which are pre-trained on the huge corpus to boost the generalization capacity of the model. Unfortunately, due to the heavy unbalanced distribution cross-training data, both approaches just attained a bad performance on less training samples classes. To overcome this challenge, we introduce an add-on classification model to split sentences into two different sets: Weak and Strong classes and then designing a couple of Bi-LSTM-CRF models properly to optimize performance on each set. We evaluated our models on the test set and discovered that our method can improve performance for Weak classes significantly by using a very small data set (approximately 0.45\%) compared to the rest classes.	翻訳日:2022-12-24 20:17:26 公開日:2020-03-10
# グローバルオプティマイザになるための学習 Learning to be Global Optimizer ( http://arxiv.org/abs/2003.04521v1 ) ライセンス: Link先を確認	Haotian Zhang, Jianyong Sun and Zongben Xu	(参考訳) 人工知能の進歩は、最適化アルゴリズムの開発に新たな光を当てている。本稿では,スムーズな非凸関数に対する2相(最小化フェーズとエスケープフェーズを含む)グローバル最適化アルゴリズムについて述べる。最小化フェーズにおいて、凸関数に対する履歴情報の非線形結合として形式化された降下方向の更新規則を学習するモデル駆動深層学習法を開発した。提案する適応方向のアルゴリズムによって凸関数の収束が保証されることを示す。実験的な研究により、学習アルゴリズムは勾配降下、共役降下、BFGSなどの古典最適化アルゴリズムを著しく上回り、不適切な関数に対してよく機能することが示された。局所最適からの脱出フェーズは、固定避難ポリシーを持つマルコフ決定プロセスとしてモデル化される。さらに,強化学習による最適避難政策の学習も提案する。合成関数を最適化し、CIFAR画像分類のためのディープニューラルネットワークを訓練することにより、エスケープポリシーの有効性を検証する。学習した2相大域最適化アルゴリズムは、いくつかのベンチマーク関数と機械学習タスクで有望な大域探索能力を示す。 The advancement of artificial intelligence has cast a new light on the development of optimization algorithm. This paper proposes to learn a two-phase (including a minimization phase and an escaping phase) global optimization algorithm for smooth non-convex functions. For the minimization phase, a model-driven deep learning method is developed to learn the update rule of descent direction, which is formalized as a nonlinear combination of historical information, for convex functions. We prove that the resultant algorithm with the proposed adaptive direction guarantees convergence for convex functions. Empirical study shows that the learned algorithm significantly outperforms some well-known classical optimization algorithms, such as gradient descent, conjugate descent and BFGS, and performs well on ill-posed functions. The escaping phase from local optimum is modeled as a Markov decision process with a fixed escaping policy. We further propose to learn an optimal escaping policy by reinforcement learning. The effectiveness of the escaping policies is verified by optimizing synthesized functions and training a deep neural network for CIFAR image classification. The learned two-phase global optimization algorithm demonstrates a promising global search capability on some benchmark functions and machine learning tasks.	翻訳日:2022-12-24 20:16:44 公開日:2020-03-10

Title

Authors

Abstract

論文公表日・翻訳日

# オーブリー相におけるイオントラップ量子コンピュータのフォノンモードの特性

Properties of phonon modes of ion trap quantum computer in the Aubry phase ( http://arxiv.org/abs/2002.03730v2 )

ライセンス: Link先を確認

Justin Loye, Jos\'e Lages and Dima L. Shepelyansky

(参考訳) イオン量子コンピュータにおけるフォノンモードの特性を解析的および数値的に研究する。 The ion chain is placed in a harmonic trap with an additional periodic potential which dimensionless amplitude $K$ determines three main phases available for quantum computations: at zero $K$ we have the case of Cirac-Zoller quantum computer, below a certain critical amplitude $K<K_c$ the ions are in the Kolmogorov-Arnold-Moser (KAM) phase, with delocalized phonon modes and free chain sliding, and above the critical amplitude $K>K_c$ ions are in the pinned Aubry phase with a finite frequency gap protecting quantum gates from temperature and other external fluctuations. オーブリー相では、円-ゾラー相とKAM相とは対照的に、フォノンギャップはトラップ中心の周囲に固定されたイオン密度を保持するトラップ内に配置されるイオンの数とは独立である。オーブリー相では, フォノンモードはCirac-Zoller と KAM のケースと比較して, より局所化されている。したがって、オーブリー相ではリコイルパルスはイオンの局所的な振動を引き起こすが、他の2つの相ではイオン鎖全体に急速に拡散し、外部のゆらぎにかなり敏感である。オーブリー相における局所化フォノンモードとフォノンギャップの性質は、多くのイオンを持つこの相におけるイオン量子計算の利点をもたらすと主張する。

We study analytically and numerically the properties of phonon modes in an ion quantum computer. The ion chain is placed in a harmonic trap with an additional periodic potential which dimensionless amplitude $K$ determines three main phases available for quantum computations: at zero $K$ we have the case of Cirac-Zoller quantum computer, below a certain critical amplitude $K<K_c$ the ions are in the Kolmogorov-Arnold-Moser (KAM) phase, with delocalized phonon modes and free chain sliding, and above the critical amplitude $K>K_c$ ions are in the pinned Aubry phase with a finite frequency gap protecting quantum gates from temperature and other external fluctuations. For the Aubry phase, in contrast to the Cirac-Zoller and KAM phases, the phonon gap remains independent of the number of ions placed in the trap keeping a fixed ion density around the trap center. We show that in the Aubry phase the phonon modes are much better localized comparing to the Cirac-Zoller and KAM cases. Thus in the Aubry phase the recoil pulses lead to local oscillations of ions while in other two phases they spread rapidly over the whole ion chains making them rather sensible to external fluctuations. We argue that the properties of localized phonon modes and phonon gap in the Aubry phase provide advantages for the ion quantum computations in this phase with a large number of ions.

翻訳日:2023-06-04 01:54:29 公開日:2020-03-10

# 量子化条件 1900-1927

Quantization Conditions, 1900-1927 ( http://arxiv.org/abs/2003.04466v1 )

ライセンス: Link先を確認

Anthony Duncan, Michel Janssen

(参考訳) 量子化条件の進化は、プランクによる1900年の黒体放射の処理における新しい基本定数(h)の導入から、1927年のハイゼンベルクの不確実性原理による現代の量子力学の可換関係の解釈まで遡る。

We trace the evolution of quantization conditions from Planck's introduction of a new fundamental constant (h) in his treatment of blackbody radiation in 1900 to Heisenberg's interpretation of the commutation relations of modern quantum mechanics in terms of his uncertainty principle in 1927.

翻訳日:2023-05-30 01:19:06 公開日:2020-03-10

# 放射光による量子相互作用

Quantum interactions with pulses of radiation ( http://arxiv.org/abs/2003.04573v1 )

ライセンス: Link先を確認

Alexander Holm Kiilerich and Klaus M{\o}lmer

(参考訳) 本稿では,量子放射の進行パルスと局所量子システムとの相互作用に関する一般マスター方程式の定式化について述べる。移動場は自由空間放射モードの連続体を発生させ、空洞の離散固有モードに有効なJaynes-Cummings模型は適用されない。我々は、任意の入射パルスによる量子系の駆動と、任意の所望の時間モードに放出される場の量子状態を記述する完全な入力出力理論を開発する。我々の理論は、幅広い物質量子系への結合による放射パルスの変換と相互作用に適用できる。パルスと離散放射固有モードとの量子相互作用の最も本質的な違いを考察し、光・マイクロ波・音響波を用いた量子情報プロトコルに関する例を示す。

This article presents a general master equation formalism for the interaction between travelling pulses of quantum radiation and localized quantum systems. Traveling fields populate a continuum of free space radiation modes and the Jaynes-Cummings model, valid for a discrete eigenmode of a cavity, does not apply. We develop a complete input-output theory to describe the driving of quantum systems by arbitrary incident pulses of radiation and the quantum state of the field emitted into any desired outgoing temporal mode. Our theory is applicable to the transformation and interaction of pulses of radiation by their coupling to a wide class of material quantum systems. We discuss the most essential differences between quantum interactions with pulses and with discrete radiative eigenmodes and present examples relevant to quantum information protocols with optical, microwave and acoustic waves.

翻訳日:2023-05-30 01:15:48 公開日:2020-03-10

# 非古典性の幾何測度による量子相転移の同定

Identifying quantum phase transitions via geometric measures of nonclassicality ( http://arxiv.org/abs/2003.04527v1 )

ライセンス: Link先を確認

Kok Chuan Tan

(参考訳) 本稿では、量子相転移を識別するための一般的なツールとして、非古典性の幾何学的測度の使用を理論的に支援する。非古典性の幾何測度の感受性のばらつきは任意の温度で相転移を特定するのに十分な条件であると主張する。このことは、あらゆる量子資源理論において、非古典性の幾何学的測度が量子系の相転移を研究するための一般的なツールであることを証明している。ゼロ温度では、量子コヒーレンスの幾何学的測度は、特に一階量子相転移の同定に有用であり、量子相関の測度を用いる他のアプローチに対する特に堅牢な代替手段であることを示す。

In this article, we provide theoretical support for the use of geometric measures of nonclassicality as a general tool to identify quantum phase transitions. We argue that divergences in the susceptibility of any geometric measure of nonclassicality are sufficient conditions to identify phase transitions at arbitrary temperature. This establishes that geometric measures of nonclassicality, in any quantum resource theory, are generic tools to investigate phase transitions in quantum systems. At zero temperature, we show that geometric measures of quantum coherence are especially useful for identifying first order quantum phase transitions, and can be a particularly robust alternative to other approaches employing measures of quantum correlations.

翻訳日:2023-05-30 01:15:13 公開日:2020-03-10

# 透明ガタブル超伝導シャドージャンクション

Transparent Gatable Superconducting Shadow Junctions ( http://arxiv.org/abs/2003.04487v1 )

ライセンス: Link先を確認

Sabbir A. Khan, Charalampos Lampadaris, Ajuan Cui, Lukas Stampfer, Yu Liu, S. J. Pauka, Martin E. Cachaza, Elisabetta M. Fiordaliso, Jung-Hyun Kang, Svetlana Korneychuk, Timo Mutas, Joachim E. Sestoft, Filip Krizek, Rawa Tanta, M. C. Cassidy, Thomas S. Jespersen, Peter Krogstrup

(参考訳) ゲート可変接合は、ハイブリッド半導体-超伝導材料に基づく量子デバイスにおける鍵となる要素である。それらは、トンネル分光プローブからゲートモンおよびトポロジカルキュービットにおける電圧制御キュービット演算まで、多目的に機能する。一般的には、ジャンクションの透明性が重要な役割を果たす。本研究では, 単結晶InAs, InSb, $\mathrm{InAs_{1-x}Sb_x}$ナノワイヤ, エピタキシャル超伝導体, In-situシャドウ接合体を1段階分子線エピタキシャルプロセスで成長させる。本研究は, 接合の加工パラメータ, 接合形態, 電子輸送特性の相関関係について検討し, 実験対象の陰影接合がエッチング接合よりも著しく高品質であることを示す。シャドージャンクションのエッジシャープ性を変化させることで、最もシャープなエッジは3つの半導体の最も高いジャンクション透過性をもたらすことを示した。さらに、臨界超電流測定では、 KO$-$2 の極限に近い非常に高い$I_\mathrm{C} R_\mathrm{N}$が示される。本研究は, ゲート可変超伝導量子ビットへの有望な技術経路を示す。

Gate tunable junctions are key elements in quantum devices based on hybrid semiconductor-superconductor materials. They serve multiple purposes ranging from tunnel spectroscopy probes to voltage-controlled qubit operations in gatemon and topological qubits. Common to all is that junction transparency plays a critical role. In this study, we grow single crystalline InAs, InSb and $\mathrm{InAs_{1-x}Sb_x}$ nanowires with epitaxial superconductors and in-situ shadowed junctions in a single-step molecular beam epitaxy process. We investigate correlations between fabrication parameters, junction morphologies, and electronic transport properties of the junctions and show that the examined in-situ shadowed junctions are of significantly higher quality than the etched junctions. By varying the edge sharpness of the shadow junctions we show that the sharpest edges yield the highest junction transparency for all three examined semiconductors. Further, critical supercurrent measurements reveal an extraordinarily high $I_\mathrm{C} R_\mathrm{N}$, close to the KO$-$2 limit. This study demonstrates a promising engineering path towards reliable gate-tunable superconducting qubits.

翻訳日:2023-05-30 01:14:43 公開日:2020-03-10

# 光シングルサイドバンド変調器を用いた単光子周波数の精密調整

Precise tuning of single-photon frequency using optical single sideband modulator ( http://arxiv.org/abs/2003.04486v1 )

ライセンス: Link先を確認

Hsin-Pin Lo, and Hiroki Takesue

(参考訳) 量子特性を保ちながら単一光子の周波数変換は、フォトニック量子通信システムのフレキシブルネットワークにとって重要な技術である。本稿では,光シングルサイドバンド(ossb)変調器を用いて異なる色光子を結合するフレキシブルスキームを示す。変調器を駆動する電波信号を変更することで、単一光子の周波数をシフトし、正確に調整することができる。 ossb変調器を用いて、非退化光子対の周波数識別性を消去し、視認性90%以上のhong-ou-mandel干渉を得ることに成功した。また,ossb変調器によって識別性レベルを正確に制御できることを実証した。 OSSB変調器は、高度な量子情報システムを実現するためのシンプルで柔軟なフォトニックインタフェースを提供することを期待している。

Frequency translation of single photons while preserving their quantum characteristics is an important technology for flexible networking of photonic quantum communication systems. Here we demonstrate a flexible scheme to interface different-color photons using an optical single sideband (OSSB) modulator. By changing the radio-frequency signal that drives the modulators, we can easily shift and precisely tune the frequency of single photons. Using the OSSB modulator, we successfully erased the frequency distinguishability of non-degenerated photon pairs to obtain the Hong-Ou-Mandel interference with a visibility exceeding 90%. We also demonstrated that the level of distinguishability can be precisely controlled by the OSSB modulator. We expect that the OSSB modulator will provide a simple and flexible photonic interface for realizing advanced quantum information systems.

翻訳日:2023-05-30 01:14:21 公開日:2020-03-10

# 時間ビン量子ビットの制御位相ゲートを用いた絡み合い生成

Entanglement generation using a controlled-phase gate for time-bin qubits ( http://arxiv.org/abs/2003.04483v1 )

ライセンス: Link先を確認

Hsin-Pin Lo, Takuya Ikuta, Nobuyuki Matsuda, Toshimori Honjo, and Hiroki Takesue

(参考訳) 量子論理ゲートは、多くの物理系における量子計算や量子情報処理において重要である。時間ビン量子ビットは光ファイバー上の量子通信に適しているが、多くの重要な量子論理ゲートはまだ実現されていない。そこで我々は,光変調器を用いた2x2光スイッチを用いた時間ビン量子ビットの制御位相(C-Phase)ゲートを実証した。ホン・ウー・マンデル干渉測定の結果、スイッチは時間依存のビームスプリッタとして機能することが分かった。スイッチによるC-Phaseゲート操作により2つの独立した時間ビン量子ビットが絡み合っていることを確認した。

Quantum logic gates are important for quantum computations and quantum information processing in numerous physical systems. While time-bin qubits are suited for quantum communications over optical fiber, many essential quantum logic gates for them have not yet been realized. Here, we demonstrated a controlled-phase (C-Phase) gate for time-bin qubits that uses a 2x2 optical switch based on an electro-optic modulator. A Hong-Ou-Mandel interference measurement showed that the switch could work as a time-dependent beam splitter with a variable spitting ratio. We confirmed that two independent time-bin qubits were entangled as a result of the C-Phase gate operation with the switch.

翻訳日:2023-05-30 01:14:08 公開日:2020-03-10

# 時間ビン量子ビット制御位相ゲートの量子プロセストモグラフィー

Quantum process tomography of a controlled-phase gate for time-bin qubits ( http://arxiv.org/abs/2003.04473v1 )

ライセンス: Link先を確認

Hsin-Pin Lo, Takuya Ikuta, Nobuyuki Matsuda, Toshimori Honjo, William J. Munro, and Hiroki Takesue

(参考訳) 情報を1つの光子に異なるタイミングで符号化するタイムビン量子ビットは、光ファイバーや導波路に基づく量子通信で広く使われている。近年の分散量子計算の発展により、時間ビン符号化量子ビットがその文脈で有用かどうかを問うことは論理的である。我々は最近,ニオブ酸リチウム導波路をベースとした2X2光スイッチを用いた時間ビン量子ビット制御相(C-Phase)ゲートを実現し,絡み合った状態の生成を実証した。しかし、実験は入力状態のペアだけで行われ、C-Phaseゲートの機能は完全には検証されなかった。本研究では,量子プロセストモグラフィーを用いて,プロセス忠実度97.1%の確立を行った。さらに,プロセス忠実度が94%以上の制御なしゲート動作を実演した。本研究は,量子量子回路における2量子論理ゲートを時間ビン量子ビットで実装できることを確認し,時間ビン量子ビットに基づく分散量子計算の実現に向けた重要な一歩である。

Time-bin qubits, where information is encoded in a single photon at different times, have been widely used in optical fiber and waveguide based quantum communications. With the recent developments in distributed quantum computation, it is logical to ask whether time-bin encoded qubits may be useful in that context. We have recently realized a time-bin qubit controlled-phase (C-Phase) gate using a 2 X 2 optical switch based on a lithium niobate waveguide, with which we demonstrated the generation of an entangled state. However, the experiment was performed with only a pair of input states, and thus the functionality of the C-Phase gate was not fully verified. In this research, we used quantum process tomography to establish a process fidelity of 97.1%. Furthermore, we demonstrated the controlled-NOT gate operation with a process fidelity greater than 94%. This study confirms that typical two-qubit logic gates used in quantum computational circuits can be implemented with time-bin qubits, and thus it is a significant step forward for realization of distributed quantum computation based on time-bin qubits.

翻訳日:2023-05-30 01:13:57 公開日:2020-03-10

# 無伴奏移民のサポーター支援--社会的-生態学的レジリエンスをめざして

Supporting the Supporters of Unaccompanied Migrant Youth: Designing for Social-ecological Resilience ( http://arxiv.org/abs/2003.04799v1 )

ライセンス: Link先を確認

Franziska Tachtler, Toni Michel, Petr Slov\'ak, Geraldine Fitzpatrick

(参考訳) 両親なしで新しい国に逃れる移民の若者は、メンタルヘルスのリスクにさらされている。レジリエンスの介入はそのようなリスクを軽減するが、アクセスはシステム的および個人的障壁によって妨げられる。最近多くの研究がメンタルヘルスを促進するテクノロジーの設計に取り組んできたが、これらの人口のニーズに焦点を絞ったものはない。本稿では,18名の専門職/ボランティア支援作業員と5名の無伴奏移民の若者を対象に,3つのデザインワークショップを開催する。結果は、若者のレジリエンス開発を促進する多様なシステムを示している。若年者とメンターとしてのボランティアの関係は特にレジリエンスを高める上で重要であるが、課題が伴う。このことは、若年者を支援するためにメンターを支援する技術の設計に焦点を当てた、社会的・生態的なレジリエンスモデルとの関連性を示唆している。最後に、メンタサポートのためにデザインスペースをマッピングする。

Unaccompanied migrant youth, fleeing to a new country without their parents, are exposed to mental health risks. Resilience interventions mitigate such risks, but access can be hindered by systemic and personal barriers. While much work has recently addressed designing technology to promote mental health, none has focused on the needs of these populations. This paper presents the results of interviews with 18 professional/ volunteer support workers and 5 unaccompanied migrant youths, followed by three design workshops. The results point to the diverse systems that can facilitate youths' resilience development. The relationship between the youth and volunteers acting as mentors is particularly important for increasing resilience but comes with challenges. This suggests the relevance of a social-ecological model of resilience with a focus on designing technology to support the mentors in order to help them better support the youth. We conclude by mapping out the design space for mentor support.

翻訳日:2023-05-30 01:05:24 公開日:2020-03-10

# 測定する、または測定しない、それが質問である

To Measure, or Not to Measure, That is the Question ( http://arxiv.org/abs/2003.04683v1 )

ライセンス: Link先を確認

Juzar Thingna and Peter Talkner

(参考訳) ポインタ状態との接触中に取られる可観測性の値の総和を推定する手法を提案する。これにより、ポインタの状態はシステムに接触しながら更新され、システムが時間内に進化する間は、連絡先間で変化しない。所定の数に接触した後、射影測定によりポインタの位置が決定される。この結果は、ユニタリとマルコフの散逸ダイナミクスの確率分布関数を用いて特定され、観測可能と見なされる観測対象の一般化ガウス測定結果と比較される。特定の例として、量子ビットは、ハミルトニアン系と可換でないポインタに接触する可観測性を持つ。

A method is proposed that allows one to infer the sum of the values of an observable taken during contacts with a pointer state. Hereby the state of the pointer is updated while contacted with the system and remains unchanged between contacts while the system evolves in time. After a prescribed number of such contacts the position of the pointer is determined by means of a projective measurement. The outcome is specified in terms of a probability distribution function for unitary and Markovian dissipative dynamics and compared with the results of the same number of generalized Gaussian measurements of the considered observable. As a particular example a qubit is considered with an observable contacting to the pointer that does not commute with the system Hamiltonian.

翻訳日:2023-05-30 01:04:24 公開日:2020-03-10

# トラップされたイオン量子ビットの量子マスター方程式ダイナミクスの直接再構成

Direct reconstruction of the quantum master equation dynamics of a trapped ion qubit ( http://arxiv.org/abs/2003.04678v1 )

ライセンス: Link先を確認

Eitan Ben Av, Yotam Shapira, Nitzan Akerman and Roee Ozeri

(参考訳) マルコフ開量子系の物理学は量子マスター方程式によって記述できる。これらは力学方程式であり、ハミルトニアンおよびジャンプ作用素を包含し、系の時間発展を生成する。システムのハミルトニアンの再構成とその測定からの環境への結合は、基礎研究と量子機械の性能評価の両方において重要である。本稿では,選択した可観測物の期待値の集合から直接,オープン量子系の力学方程式を再構成する手法を提案する。我々は自発的光子散乱下で捕捉された$^{88}\text{sr}^+$イオンのダイナミクスを測定することで,シミュレーションと実験の両方でこの技術をベンチマークする。

The physics of Markovian open quantum systems can be described by quantum master equations. These are dynamical equations, that incorporate the Hamiltonian and jump operators, and generate the system's time evolution. Reconstructing the system's Hamiltonian and and its coupling to the environment from measurements is important both for fundamental research as well as for performance-evaluation of quantum machines. In this paper we introduce a method that reconstructs the dynamical equation of open quantum systems, directly from a set of expectation values of selected observables. We benchmark our technique both by a simulation and experimentally, by measuring the dynamics of a trapped $^{88}\text{Sr}^+$ ion under spontaneous photon scattering.

翻訳日:2023-05-30 01:04:13 公開日:2020-03-10

# マイクロ波光子を2つまで数える超伝導検出器

A superconducting detector that counts microwave photons up to two ( http://arxiv.org/abs/2003.04625v1 )

ライセンス: Link先を確認

Andrii M. Sokolov and Frank K. Wilhelm

(参考訳) 本研究では,真空状態,1光子状態,および2光子以上の状態の区別が可能なマイクロ波光子の検出器を提案する。その動作はバイアス付きジョセフソン接合における2光子遷移に基づいており、超伝導状態から正常状態への切り替え時に検出される。検出器を理論的にモデル化する。検出器は、数マイクロ秒間に90%以上の成功確率で実行される。 8.2GHzの光子に敏感である。動作周波数は、およそ1GHzから20GHzの範囲の設計段階で設定できる。

We propose a detector of microwave photons which can distinguish the vacuum state, one-photon state, and the states with two or more photons. Its operation is based on the two-photon transition in a biased Josephson junction and detection occurs when it switches from a superconducting to a normal state. We model the detector theoretically. The detector performs with more than 90% success probability in several microseconds. It is sensitive for the 8.2GHz photons. The working frequency could be set at the design stage in the range from about 1GHz to 20GHz.

翻訳日:2023-05-30 01:03:13 公開日:2020-03-10

# unruh効果の検出器としてのデコヒーレンス

Decoherence as Detector of the Unruh Effect ( http://arxiv.org/abs/2003.05014v1 )

ライセンス: Link先を確認

Alexander I Nesterov, Gennady P Berman, Manuel A Rodr\'iguez Fern\'andez and Xidi Wang

(参考訳) 本研究では,無質量量子スカラー場と相互作用する検出器の密度行列のデコヒーレンスを計測する新しいタイプのUnruh-DeWitt検出器を提案する。慣性および加速基準系ではデコヒーレンス減衰率が異なることが判明した。指数的位相崩壊は,比較的低加速度で観測でき,unruh効果の測定条件を大幅に改善できることを示した。

We propose a new type of the Unruh-DeWitt detector which measures the decoherence of the reduced density matrix of the detector interacting with the massless quantum scalar field. We find that the decoherence decay rates are different in the inertial and accelerated reference frames. We show that the exponential phase decay can be observed for relatively low accelerations, that can significantly improve the conditions for measuring the Unruh effect.

翻訳日:2023-05-30 00:56:59 公開日:2020-03-10

# 同一性を持つ資源効率の良いゼロノイズ外挿

Resource Efficient Zero Noise Extrapolation with Identity Insertions ( http://arxiv.org/abs/2003.04941v1 )

ライセンス: Link先を確認

Andre He, Benjamin Nachman, Wibe A. de Jong, and Christian W. Bauer

(参考訳) 読み出し誤差に加えて、2量子ゲートノイズは、ノイズの多い中間スケール量子(NISQ)コンピュータ上の複雑な量子アルゴリズムの主要な課題である。これらの誤りは、量子化学、核物理学、高エネルギー物理学、その他の新興科学・産業応用の正確な計算を行う上で重要な課題である。 2ビットゲートエラーの軽減には、エラー訂正符号とゼロノイズ外挿という2つの提案がある。本稿では,後者に着目し,それを詳細に研究し,既存アプローチへの変更を提案する。特に,従来の固定id挿入法 (fiim) よりもはるかに少ないゲートで競争的漸近的精度を達成するためのランダムid挿入法 (riim) を提案する。例えば、先頭方向の非偏極ゲートノイズを修正するには、RIIMでは$n_\text{CNOT}+2$ゲートが必要であり、FIIMでは$3n_\text{CNOT}$ゲートが必要である。この重要なリソース節約により、近未来の量子ハードウェアにおける最先端の計算結果をより正確にすることができる。

In addition to readout errors, two-qubit gate noise is the main challenge for complex quantum algorithms on noisy intermediate-scale quantum (NISQ) computers. These errors are a significant challenge for making accurate calculations for quantum chemistry, nuclear physics, high energy physics, and other emerging scientific and industrial applications. There are two proposals for mitigating two-qubit gate errors: error-correcting codes and zero-noise extrapolation. This paper focuses on the latter, studying it in detail and proposing modifications to existing approaches. In particular, we propose a random identity insertion method (RIIM) that can achieve competitive asymptotic accuracy with far fewer gates than the traditional fixed identity insertion method (FIIM). For example, correcting the leading order depolarizing gate noise requires $n_\text{CNOT}+2$ gates for RIIM instead of $3n_\text{CNOT}$ gates for FIIM. This significant resource saving may enable more accurate results for state-of-the-art calculations on near term quantum hardware.

翻訳日:2023-05-30 00:56:19 公開日:2020-03-10

# 大規模ネットワークにおける量子ページランクのためのTensorFlowソルバー

TensorFlow Solver for Quantum PageRank in Large-Scale Networks ( http://arxiv.org/abs/2003.04930v1 )

ライセンス: Link先を確認

Hao Tang, Tian-Shen He, Ruo-Xi Shi, Yan-Yan Zhu, Marcus Lee, Tian-Yu Wang, Xian-Min Jin

(参考訳) Google PageRankは、ネットワーク内のノードやWebサイトの重要度をランク付けするための一般的かつ有用なアルゴリズムである。量子ページランクアルゴリズムは本質的に量子確率的ウォークに基づいており、リンドブラッドマスター方程式を用いて表現することができるが、これはo(n^4)次元のクロネッカー積を解く必要があり、ネットワーク内のノードn数が150を超える場合、非常に大きなメモリと時間を必要とする。本稿では,Lange-Kutta法を用いて行列次元をO(N^2)に減らし,TensorFlowを用いてGPU並列計算を行うことにより,量子PageRankの効率的な解法を提案する。最大922ノードを持つ米国の主要航空会社ネットワークに対して、量子PageRankを解く際の性能を実証する。従来の量子ページランクソルバと比較して,100秒未満でメモリ4-8gbの通常のコンピュータで動作するためには,必要なメモリと時間を1%と0.2%に劇的に削減できる。この大規模量子PageRankと量子確率ウォークの効率的な解法は、現実の応用における量子情報の研究を大いに促進する。

Google PageRank is a prevalent and useful algorithm for ranking the significance of nodes or websites in a network, and a recent quantum counterpart for PageRank algorithm has been raised to suggest a higher accuracy of ranking comparing to Google PageRank. The quantum PageRank algorithm is essentially based on quantum stochastic walks and can be expressed using Lindblad master equation, which, however, needs to solve the Kronecker products of an O(N^4) dimension and requires severely large memory and time when the number of nodes N in a network increases above 150. Here, we present an efficient solver for quantum PageRank by using the Runge-Kutta method to reduce the matrix dimension to O(N^2) and employing TensorFlow to conduct GPU parallel computing. We demonstrate its performance in solving quantum PageRank for the USA major airline network with up to 922 nodes. Compared with the previous quantum PageRank solver, our solver dramatically reduces the required memory and time to only 1% and 0.2%, respectively, making it practical to work in a normal computer with a memory of 4-8 GB in no more than 100 seconds. This efficient solver for large-scale quantum PageRank and quantum stochastic walks would greatly facilitate studies of quantum information in real-life applications.

翻訳日:2023-05-30 00:56:01 公開日:2020-03-10

# 相互作用する2次元格子ゲージ理論における無秩序局在

Disorder-free localization in an interacting two-dimensional lattice gauge theory ( http://arxiv.org/abs/2003.04901v1 )

ライセンス: Link先を確認

P. Karpov, R. Verdel, Y.-P. Huang, M. Schmitt, and M. Heyl

(参考訳) 乱れのない局所化は、ゲージ不変性によって課される局所的制約によって引き起こされる低次元の均質格子ゲージ理論におけるエルゴード性破壊のメカニズムとして最近導入された。また, 2次元空間における真に相互作用する系は, この機構の結果として非エルゴード化できることを示した。具体的には、古典的相関パーコレーション問題を通じて局所化-非局在化遷移の厳密な束を得て量子リンクモデルにおける非エルゴード挙動を証明し、遷移の非エルゴード側でヒルベルト空間の断片化を示唆する。本システムにおける量子力学を,古典スピンの変動ネットワークと人工ニューラルネットワークとの類似性の観点から,効率的かつ摂動的に制御された波動関数の表現を用いて研究する。線形欠陥の伝播を研究することにより, 局所化相とエルゴード相の異なる光円錐構造を生成することにより, 動的特徴を識別する。この研究で導入された手法は、空間次元に関係なく有限次元局所ヒルベルト空間を持つ任意の格子ゲージ理論に適用できる。

Disorder-free localization has been recently introduced as a mechanism for ergodicity breaking in low-dimensional homogeneous lattice gauge theories caused by local constraints imposed by gauge invariance. We show that also genuinely interacting systems in two spatial dimensions can become nonergodic as a consequence of this mechanism. Specifically, we prove nonergodic behavior in the quantum link model by obtaining a rigorous bound on the localization-delocalization transition through a classical correlated percolation problem implying a fragmentation of Hilbert space on the nonergodic side of the transition. We study the quantum dynamics in this system by means of an efficient and perturbatively controlled representation of the wavefunction in terms of a variational network of classical spins akin to artificial neural networks. We identify a distinguishing dynamical signature by studying the propagation of line defects, yielding different light cone structures in the localized and ergodic phases, respectively. The methods we introduce in this work can be applied to any lattice gauge theory with finite-dimensional local Hilbert spaces irrespective of spatial dimensionality.

翻訳日:2023-05-30 00:55:23 公開日:2020-03-10

# 大規模コラボレーションによるスマートシティIoTサービス構築

Smart City IoT Services Creation through Large Scale Collaboration ( http://arxiv.org/abs/2003.04843v1 )

ライセンス: Link先を確認

Flavio Cirillo, David G\'omez, Luis Diez, Ignacio Elicegui Maestro, Thomas Barrie Juel Gilbert, Reza Akhavan

(参考訳) スマートシティソリューションは、センサーデータハンドリングから提供されたサービスまで、モノリシックに実装されることが多い。新しい都市のすべての新しいソリューションに対して、同じ課題が、さまざまな開発者によって頻繁に直面しています。専門知識とノウハウが再利用され、作業が共有される。本稿では,新しいスマートシティソリューションを実装し,コンポーネントの共有を最大化する努力を最小化する手法を提案する。最後の目標は、スマートシティアプリケーション開発者のライブ技術コミュニティを作ることだ。この活動の結果は、ヨーロッパと韓国の27都市で35の都市サービスを実施している。努力を共有するため、開発者はモジュラーアプローチを使ってアプリケーションを開発することを推奨します。他の都市サービスで再利用可能な単一機能コンポーネントはパッケージ化され、アトミックサービスと呼ばれるスタンドアロンコンポーネントとして公開される。データ分析、データ評価、データ統合、データ検証、可視化におけるスマートシティの課題に対処する15のアトミックサービスを特定します。原子サービスの38のインスタンスは、すでにいくつかのスマートシティサービスで運用されている。この記事では、アトミックサービス例、いくつかのデータ予測コンポーネントとして詳述します。さらに、サンタンデールとデンマークの3都市における実世界の原子サービス利用について述べる。結果として生じるアトミックサービスは、スマートシティソリューションのサイドマーケットも生み出しており、専門知識とノウハウを異なる利害関係者が再利用することができる。

Smart cities solutions are often monolithically implemented, from sensors data handling through to the provided services. The same challenges are regularly faced by different developers, for every new solution in a new city. Expertise and know-how can be re-used and the effort shared. In this article we present the methodologies to minimize the efforts of implementing new smart city solutions and maximizing the sharing of components. The final target is to have a live technical community of smart city application developers. The results of this activity comes from the implementation of 35 city services in 27 cities between Europe and South Korea. To share efforts, we encourage developers to devise applications using a modular approach. Single-function components that are re-usable by other city services are packaged and published as standalone components, named Atomic Services. We identify 15 atomic services addressing smart city challenges in data analytics, data evaluation, data integration, data validation, and visualization. 38 instances of the atomic services are already operational in several smart city services. We detail in this article, as atomic service examples, some data predictor components. Furthermore, we describe real-world atomic services usage in the scenarios of Santander and three Danish cities. The resulting atomic services also generate a side market for smart city solutions, allowing expertise and know-how to be re-used by different stakeholders.

翻訳日:2023-05-30 00:54:12 公開日:2020-03-10

# anyon と Gentile の統計量間の変換の中間対称構築

Intermediate symmetric construction of transformation between anyon and Gentile statistics ( http://arxiv.org/abs/2003.06235v1 )

ライセンス: Link先を確認

Yao Shen

(参考訳) 遺伝統計学は、職業数表現における分数統計システムを記述する。任意の統計はこれらの系を巻数表現で研究する。どちらもボース=アインシュタイン統計とフェルミ=ディラック統計の間の中間統計である。 Gentile統計の第二の量子化は、多くの利点を示している。波動関数の対称性の要求に従って、アノンとジェンティル統計の間の変換の一般的な構成を与える。言い換えれば、エノンの第二量子化形式を簡単な方法で導入する。また, 2次量子化作用素, コヒーレント状態, ベリー相の基本関係についても考察した。

Gentile statistics describes fractional statistical systems in the occupation number representation. Anyon statistics researches those systems in the winding number representation. Both of them are intermediate statistics between Bose-Einstein and Fermi-Dirac statistics. The second quantization of Gentile statistics shows a lot of advantages. According to the symmetry requirement of the wave function, we give the general construction of transformation between anyon and Gentile statistics. In other words, we introduce the second quantization form of anyons in a easy way. Basic relations of second quantization operators, the coherent state and Berry phase are also discussed.

翻訳日:2023-05-30 00:44:16 公開日:2020-03-10

# 低学費の大学生研究

Undergraduate Student Research With Low Faculty Cost ( http://arxiv.org/abs/2003.05719v1 )

ライセンス: Link先を確認

Sindhu Kutty, Mark Guzdial

(参考訳) 大学院生は、コンピュータサイエンスの研究が何であるかを知らないなら、コンピュータサイエンスの大学院研究を考えることもない。学部生を研究に紹介する多くのプログラムは大学院研究プログラムのように構成されており、少数の学部生が学部顧問と協力している。さらに、女性、少数派、第1世代の学生は威圧的すぎるか、あるいは研究のアイデアがアモルファスすぎるため、これらのプログラムを見逃してしまう可能性がある。その結果,CS研究の多様性向上の機会を失うことになった。我々は,機械学習と関連分野に焦点をあてた研究グループの一環として,多数の学生(約2ダース)が1人の教員とともに作業する試験プログラムを当社の部門で開始した。このプログラムの目的は、学生に研究キャリアを追求するよう説得することではなく、彼らが将来の研究に望む役割についてより深い決定をさせることである。提案手法を評価するため,匿名の出口調査を2回実施し,学生の体験を抽出した。学生は、コンピュータ科学の研究がどんなものであるかをよりよく理解していると報告している。彼らの研究への関心は、研究を行う能力に対する信頼が報告されているように高まったが、すべての学生がコンピュータ科学の研究機会を追求したいとは思っていなかった。女子学生の報告された経験を踏まえると、このプログラムはCS研究のさらなる多様性の出発点となる。

Undergraduates are unlikely to even consider graduate research in Computer Science if they do not know what Computer Science research is. Many programs aimed at introducing undergraduate to research are structured like graduate research programs, with a small number of undergraduates working with a faculty advisor. Further, females, under-represented minorities, and first generation students may be too intimidated or the idea of research may be too amorphous, so that they miss out on these programs. As a consequence, we lose out on opportunities for greater diversity in CS research. We have started a pilot program in our department where a larger number of students (close to two dozen) work with a single faculty member as part of a research group focused on Machine Learning and related areas. The goal of this program is not to convince students to pursue a research career but rather to enable them to make a more informed decision about what role they would like research to play in their future. In order to evaluate our approach, we elicited student experience via two anonymized exit surveys. Students report that they develop a better understanding of what research in Computer Science is. Their interest in research was increased as was their reported confidence in their ability to do research, although not all students wanted to further pursue computer science research opportunities. Given the reported experience of female students, this program can offer a starting point for greater diversity in CS research.

翻訳日:2023-05-30 00:44:09 公開日:2020-03-10

# SensAI+Expanse Emotional Valence Prediction Study with Cognition and Memory Integration

SensAI+Expanse Emotional Valence Prediction Studies with Cognition and Memory Integration ( http://arxiv.org/abs/2001.09746v3 )

ライセンス: Link先を確認

Nuno A. C. Henriques, Helder Coelho, Leonel Garcia-Marques

(参考訳) 人間は感情的で認知的な存在であり、個人や社会的アイデンティティの記憶に依存している。また、人間のダイアド結合は、より良い相互作用のために共感行動のような共通の信念を必要とする。この意味で、人間とエージェントの相互作用に関する研究は、影響、認知、記憶の統合に資源を供給すべきである。開発された人工エージェントシステム(SensAI+Expanse)は、機械学習アルゴリズム、ヒューリスティックス、記憶を認知として、相互作用する人間の感情価予測に役立てる。また、人間を識別可能な相互作用結果に結びつけるために、常に適応共感スコアが存在する。 [...]エージェントはデータの収集に寛容であり、適切な文脈化予測のための学習最善策として、その認知過程を個人に適応させる。この研究は、達成された適応プロセスを活用する。また,従来の研究では,学習アルゴリズムと評価指標の特定の選択肢を用いた個人予測モデルを用いた。達成された解は、高い性能の予測能力、効率的なエネルギー使用、予測確率に対する特徴重要説明を含む。本研究の結果,年齢と性の組み合わせによって有意な感情的有意性行動の差異が認められた。したがって、この研究は認知科学研究を支援することができる人工知能エージェントに寄与する。この能力は、空間と時間で文脈化された人間の感情的原子価を予測することによって、情緒的障害に関するものである。さらに、学習過程やヒューリスティックスは、認知の経済性や環境に対処するための記憶などのタスクに適合する。最後に、これらの貢献には、コンテキストにおける感情的ヴァレンス状態の予測において、達成された年齢と性別の中立性が含まれます。

The humans are affective and cognitive beings relying on memories for their individual and social identities. Also, human dyadic bonds require some common beliefs such as empathetic behaviour for better interaction. In this sense, research studies involving human-agent interaction should resource on affect, cognition, and memory integration. The developed artificial agent system (SensAI+Expanse) includes machine learning algorithms, heuristics, and memory as cognition aids towards emotional valence prediction on the interacting human. Further, an adaptive empathy score is always present in order to engage the human in a recognisable interaction outcome. [...] The agent is resilient on collecting data, adapts its cognitive processes to each human individual in a learning best effort for proper contextualised prediction. The current study make use of an achieved adaptive process. Also, the use of individual prediction models with specific options of the learning algorithm and evaluation metric from a previous research study. The accomplished solution includes a highly performant prediction ability, an efficient energy use, and feature importance explanation for predicted probabilities. Results of the present study show evidence of significant emotional valence behaviour differences between some age ranges and gender combinations. Therefore, this work contributes with an artificial intelligent agent able to assist on cognitive science studies. This ability is about affective disturbances by means of predicting human emotional valence contextualised in space and time. Moreover, contributes with learning processes and heuristics fit to the task including economy of cognition and memory to cope with the environment. Finally, these contributions include an achieved age and gender neutrality on predicting emotional valence states in context and with very good performance for each individual.

翻訳日:2023-01-14 18:14:09 公開日:2020-03-10

# スピンボソンモデルの量子力学写像

The quantum dynamical map of the spin boson model ( http://arxiv.org/abs/2001.04236v2 )

ライセンス: Link先を確認

In\'es de Vega

(参考訳) 量子コンピュータにおける環境の影響を分析する主要なフレームワークの1つは、量子ビットのダイナミックスをよく知られた動的マップの観点で特徴づけることができる純粋退化である。本研究では,このような写像の非摂動的拡張を,この単純な純粋デファッショニングケース,すなわち熱状態のボソニック環境に結合した一般スピンに対して有効であることを示す。この目的のために、トローター分解とマグヌス展開を用いて相互作用図におけるユニタリ進化演算子を単純化する。提案された導出は、多くの体、初期系環境相関状態、多重時間相関関数、量子情報プロトコルを含む他の有限レベルの開量子系にも拡張することができる。

One of the main frameworks to analyze the effects of the environment in a quantum computer is that of pure dephasing, where the dynamics of qubits can be characterised in terms of a well-known dynamical map. In this work we present a non-peturbative extension of such map beyond this simple pure-dephasing case, i.e. that is valid for a general spin coupled to a bosonic environment in a thermal state. To this aim, we use a Trotter decomposition and a Magnus expansion to simplify the unitary evolution operator in interaction picture. The proposed derivation can be extended to other finite-level open quantum systems including many body, initial system-environment correlated states, multiple-time correlation functions or quantum information protocols.

翻訳日:2023-01-11 23:59:57 公開日:2020-03-10

# mixpath:ワンショットニューラルネットワーク検索のための統一アプローチ

MixPath: A Unified Approach for One-shot Neural Architecture Search ( http://arxiv.org/abs/2001.05887v3 )

ライセンス: Link先を確認

Xiangxiang Chu, Xudong Li, Shun Lu, Bo Zhang, and Jixiang Li

(参考訳) 複数の畳み込みカーネルのブレンディングは、ニューラルアーキテクチャ設計において有利であることが証明されている。しかし、現在のニューラルアーキテクチャ探索手法は主にスタック化されたシングルパス探索空間に限られている。マルチパスモデルのワンショットドクトリン検索は、まだ未解決のままである。具体的には、候補アーキテクチャを正確に評価するために、マルチパススーパーネットをトレーニングする動機があります。本稿では, 探索空間において, 複数の経路から要約された特徴ベクトルが, スーパーネットトレーニングとそのランク付け能力を乱す単一経路からの特徴ベクトルのほぼ倍であることを示す。本稿では,異なる特徴統計を正規化するシャドウバッチ正規化(sbn)と呼ばれる新しいメカニズムを提案する。大規模な実験により、SBNはトレーニングを安定化し、ランキングパフォーマンスを改善することができる(例えば、NAS-Bench-101でテストされたKendall Tau 0.597)。当社の統一マルチパスワンショットアプローチをmixpathと呼び、imagenetで最先端の結果を得る一連のモデルを生成します。

Blending multiple convolutional kernels is proved advantageous in neural architectural design. However, current neural architecture search approaches are mainly limited to stacked single-path search space. How can the one-shot doctrine search for multi-path models remains unresolved. Specifically, we are motivated to train a multi-path supernet to accurately evaluate the candidate architectures. In this paper, we discover that in the studied search space, feature vectors summed from multiple paths are nearly multiples of those from a single path, which perturbs supernet training and its ranking ability. In this regard, we propose a novel mechanism called Shadow Batch Normalization(SBN) to regularize the disparate feature statistics. Extensive experiments prove that SBN is capable of stabilizing the training and improving the ranking performance (e.g. Kendall Tau 0.597 tested on NAS-Bench-101). We call our unified multi-path one-shot approach as MixPath, which generates a series of models that achieve state-of-the-art results on ImageNet.

翻訳日:2023-01-10 23:27:10 公開日:2020-03-10

# ファウショット学習のための連続的局所的置換

Continual Local Replacement for Few-shot Learning ( http://arxiv.org/abs/2001.08366v2 )

ライセンス: Link先を確認

Canyu Le, Zhonggui Chen, Xihan Wei, Biao Wang, Lei Zhang

(参考訳) 少数ショット学習の目標は、1つまたは複数のトレーニングデータに基づいて新しいクラスを認識できるモデルを学ぶことである。 1)新規クラスの優れた特徴表現が欠如している、(2)ラベル付きデータのいくつかは真のデータ分布を正確に表現できないため、分類のよい決定関数を学ぶのは難しい、という2つの側面から課題となっている。本研究では,高度なネットワークアーキテクチャを用いて,より優れた特徴表現を学習し,第2の課題に注目する。データ不足問題に対処するために,新たな局所的置換戦略を提案する。ラベルのない画像のコンテンツを活用することで、ラベル付き画像が継続的に強化される。具体的には、フライ時に意味的に類似した画像を常に選択するために擬似ラベリング法を採用する。オリジナルラベル付き画像は、次のエポックトレーニングのために選択された画像に局所的に置き換えられる。このように、ラベルのない画像から直接新しい意味情報を学習することができ、埋め込み空間における教師付き信号の容量を大幅に拡大することができる。これにより、モデルは一般化を改善し、分類のためのより良い決定境界を学ぶことができる。私たちの方法は概念的にシンプルで実装が簡単です。大規模な実験により、様々な数ショット画像認識ベンチマークで最先端の結果が得られることが示された。

The goal of few-shot learning is to learn a model that can recognize novel classes based on one or few training data. It is challenging mainly due to two aspects: (1) it lacks good feature representation of novel classes; (2) a few of labeled data could not accurately represent the true data distribution and thus it's hard to learn a good decision function for classification. In this work, we use a sophisticated network architecture to learn better feature representation and focus on the second issue. A novel continual local replacement strategy is proposed to address the data deficiency problem. It takes advantage of the content in unlabeled images to continually enhance labeled ones. Specifically, a pseudo labeling method is adopted to constantly select semantically similar images on the fly. Original labeled images will be locally replaced by the selected images for the next epoch training. In this way, the model can directly learn new semantic information from unlabeled images and the capacity of supervised signals in the embedding space can be significantly enlarged. This allows the model to improve generalization and learn a better decision boundary for classification. Our method is conceptually simple and easy to implement. Extensive experiments demonstrate that it can achieve state-of-the-art results on various few-shot image recognition benchmarks.

翻訳日:2023-01-07 10:03:42 公開日:2020-03-10

# psc-net:学習部空間共起による歩行者検出

PSC-Net: Learning Part Spatial Co-occurrence for Occluded Pedestrian Detection ( http://arxiv.org/abs/2001.09252v2 )

ライセンス: Link先を確認

Jin Xie and Yanwei Pang and Hisham Cholakkal and Rao Muhammad Anwer and Fahad Shahbaz Khan and Ling Shao

(参考訳) 特に重厚な閉塞下で歩行者を検知することは、現実の多くの応用において難しいコンピュータビジョン問題である。本稿では,歩行者検出のための新しいアプローチをPSC-Netと呼ぶ。提案したPSC-Netは、グラフ畳み込みネットワーク(GCN)を介して、異なる歩行者体のパーツ間の共起情報を明示的にキャプチャする専用モジュールを含む。部分的および部分的共起情報は、部分的から重度な咬合まで、様々な咬合レベルを扱うための特徴表現の改善に寄与する。我々のPSC-Netは歩行者のトポロジ的構造を利用しており、空間的共起を学習するために、部分ベースのアノテーションや視覚的バウンディングボックス(VBB)情報を必要としない。総合的な実験は、citypersonsとcaltech datasetsという2つの挑戦的なデータセットで行われている。提案したPSC-Netは,両者の最先端検出性能を実現する。 CityPerosns テストセットのヘビーオクルード (\textbf{HO}) セットでは、当社のPSC-Net は、同じバックボーン、入力スケール、追加の VBB 監督を使わずに、平均誤差率の4.0 % の絶対ゲインを得る。さらに、PSC-Netは、Caltech(\textbf{HO})テストセットのログ平均ミス率の観点から、最先端の37.9から34.8に改善している。

Detecting pedestrians, especially under heavy occlusions, is a challenging computer vision problem with numerous real-world applications. This paper introduces a novel approach, termed as PSC-Net, for occluded pedestrian detection. The proposed PSC-Net contains a dedicated module that is designed to explicitly capture both inter and intra-part co-occurrence information of different pedestrian body parts through a Graph Convolutional Network (GCN). Both inter and intra-part co-occurrence information contribute towards improving the feature representation for handling varying level of occlusions, ranging from partial to severe occlusions. Our PSC-Net exploits the topological structure of pedestrian and does not require part-based annotations or additional visible bounding-box (VBB) information to learn part spatial co-occurrence. Comprehensive experiments are performed on two challenging datasets: CityPersons and Caltech datasets. The proposed PSC-Net achieves state-of-the-art detection performance on both. On the heavy occluded (\textbf{HO}) set of CityPerosns test set, our PSC-Net obtains an absolute gain of 4.0\% in terms of log-average miss rate over the state-of-the-art with same backbone, input scale and without using additional VBB supervision. Further, PSC-Net improves the state-of-the-art from 37.9 to 34.8 in terms of log-average miss rate on Caltech (\textbf{HO}) test set.

翻訳日:2023-01-07 00:19:04 公開日:2020-03-10

# music2dance:音楽駆動ダンス生成のためのダンスネット

Music2Dance: DanceNet for Music-driven Dance Generation ( http://arxiv.org/abs/2002.03761v2 )

ライセンス: Link先を確認

Wenlin Zhuang, Congyi Wang, Siyu Xia, Jinxiang Chai, Yangang Wang

(参考訳) 音楽、すなわち音楽からダンスへの人間の動きを合成することは魅力的であり、近年多くの研究関心を集めている。ダンスにリアルで複雑な人間の動きを必要とするだけでなく、より重要なこととして、合成された動きは音楽のスタイル、リズム、メロディと一致すべきである。本稿では,音楽のスタイル,リズム,メロディを制御信号として捉え,高いリアリズムと多様性を持つ3Dダンスモーションを生成するための,新しい自己回帰生成モデルDanceNetを提案する。提案モデルの性能向上のために,プロのダンサーによる複数の同期音楽ダンスペアをキャプチャし,高品質な音楽ダンスペアデータセットを構築する。実験により,提案手法は最先端の結果が得られることを示した。

Synthesize human motions from music, i.e., music to dance, is appealing and attracts lots of research interests in recent years. It is challenging due to not only the requirement of realistic and complex human motions for dance, but more importantly, the synthesized motions should be consistent with the style, rhythm and melody of the music. In this paper, we propose a novel autoregressive generative model, DanceNet, to take the style, rhythm and melody of music as the control signals to generate 3D dance motions with high realism and diversity. To boost the performance of our proposed model, we capture several synchronized music-dance pairs by professional dancers, and build a high-quality music-dance pair dataset. Experiments have demonstrated that the proposed method can achieve the state-of-the-art results.

翻訳日:2023-01-04 20:21:49 公開日:2020-03-10

# 不完全ラベルを用いたマルチタスク感情認識

Multitask Emotion Recognition with Incomplete Labels ( http://arxiv.org/abs/2002.03557v2 )

ライセンス: Link先を確認

Didan Deng, Zhaokang Chen, Bertram E. Shi

(参考訳) 顔行動単位の検出,表情分類,ヴァレンス覚醒推定の3つのタスクを統一したモデルを訓練した。 3つのタスクを学ぶ上での2つの大きな課題に対処します。まず、既存のデータセットの多くは高度に不均衡です。第二に、既存のデータセットのほとんどは、3つのタスクのラベルを含まない。最初の課題に取り組むために、実験データセットにデータバランシング技術を適用する。第2の課題に取り組むために,マルチタスクモデルにおけるラベルの欠落から学習するアルゴリズムを提案する。このアルゴリズムには2つのステップがある。まず3つのタスクすべてを実行するために教師モデルをトレーニングし、各インスタンスは対応するタスクの基底真理ラベルでトレーニングされます。次に,教師モデルの出力をソフトラベルと呼ぶ。学生モデルをトレーニングするために、ソフトラベルと基礎的な真実を使用します。学生のモデルのほとんどは、教師のモデルよりも3つのタスクで優れています。最後に、3つのタスクのパフォーマンスをさらに向上するためにモデルアンサンブルを使用します。

We train a unified model to perform three tasks: facial action unit detection, expression classification, and valence-arousal estimation. We address two main challenges of learning the three tasks. First, most existing datasets are highly imbalanced. Second, most existing datasets do not contain labels for all three tasks. To tackle the first challenge, we apply data balancing techniques to experimental datasets. To tackle the second challenge, we propose an algorithm for the multitask model to learn from missing (incomplete) labels. This algorithm has two steps. We first train a teacher model to perform all three tasks, where each instance is trained by the ground truth label of its corresponding task. Secondly, we refer to the outputs of the teacher model as the soft labels. We use the soft labels and the ground truth to train the student model. We find that most of the student models outperform their teacher model on all the three tasks. Finally, we use model ensembling to boost performance further on the three tasks.

翻訳日:2023-01-02 09:48:55 公開日:2020-03-10

# マンハッタンのような反復環境のトポロジマッピング

Topological Mapping for Manhattan-like Repetitive Environments ( http://arxiv.org/abs/2002.06575v3 )

ライセンス: Link先を確認

Sai Shubodh Puligilla, Satyajit Tourani, Tushar Vaidya, Udit Singh Parihar, Ravi Kiran Sarvadevabhatla and K. Madhava Krishna

(参考訳) 我々は,屋内倉庫環境に挑戦するためのトポロジカルマッピングフレームワークを紹介する。最も抽象的なレベルでは、倉庫は、グラフのノードが特定の倉庫のトポロジー構造(例えばラックスペース、回廊)を表し、エッジが隣接する2つのノードまたはトポロジの間の経路の存在を表すトポロジーグラフとして表現される。中間レベルでは、マップはマンハッタングラフとして表現され、ノードとエッジはマンハッタンの特性によって特徴づけられ、最下層の細部ではポーズグラフとして表現される。トポロジ的構造はDeep Convolutional Networkを通じて学習され、トポロジ的インスタンス間の関係性はSiameseスタイルのニューラルネットワークを介して学習される。本稿では,トポロジカルグラフやマンハッタングラフなどの抽象化の維持が,高度に最適化されていないポーズグラフから正確なポーズグラフを復元する上で有効であることを示す。背景となるPose Graph最適化フレームワークの制約として,トポロジ的およびマンハッタン的関係とManhattan Graphのループクロージャ関係を組み込むことによって,これを実現できることを示す。実世界の屋内倉庫シーンにおける地中近傍ポーズグラフの復元は,提案手法の有効性を実証するものである。

We showcase a topological mapping framework for a challenging indoor warehouse setting. At the most abstract level, the warehouse is represented as a Topological Graph where the nodes of the graph represent a particular warehouse topological construct (e.g. rackspace, corridor) and the edges denote the existence of a path between two neighbouring nodes or topologies. At the intermediate level, the map is represented as a Manhattan Graph where the nodes and edges are characterized by Manhattan properties and as a Pose Graph at the lower-most level of detail. The topological constructs are learned via a Deep Convolutional Network while the relational properties between topological instances are learnt via a Siamese-style Neural Network. In the paper, we show that maintaining abstractions such as Topological Graph and Manhattan Graph help in recovering an accurate Pose Graph starting from a highly erroneous and unoptimized Pose Graph. We show how this is achieved by embedding topological and Manhattan relations as well as Manhattan Graph aided loop closure relations as constraints in the backend Pose Graph optimization framework. The recovery of near ground-truth Pose Graph on real-world indoor warehouse scenes vindicate the efficacy of the proposed framework.

翻訳日:2022-12-31 18:15:31 公開日:2020-03-10

# AdaEnsemble Learning Approach for Metro Passenger Flow Forecasting

AdaEnsemble Learning Approach for Metro Passenger Flow Forecasting ( http://arxiv.org/abs/2002.07575v2 )

ライセンス: Link先を確認

Shaolong Sun, Dongchuan Yang, Ju-e Guo, Shouyang Wang

(参考訳) 正確な時間的かつタイムリーな乗客フロー予測は、インテリジェント交通システムの導入の成功に不可欠である。しかし,首都圏の旅客流のランダム性や変動に起因して,効率的かつロバストな予測手法を提案することは極めて困難である。そこで本研究では, 変動モード分解(VMD), 季節的自己回帰統合移動平均化(SARIMA), 多層パーセプトロンネットワーク(MLP), 長期記憶(LSTM)ネットワークの相補的利点を組み合わせた適応型アンサンブル(AdaEnsemble)学習手法を提案する。 AdaEnsembleの学習アプローチは3つの重要な段階で構成されている。第1段階では、VMDを適用して、メトロ旅客フローデータを周期成分、決定成分、ボラティリティ成分に分解する。次に、周期成分の予測にSARIMAモデル、決定論的成分の学習と予測にLSTMネットワーク、揮発性成分の予測にMLPネットワークを用いる。最終段階では、様々な予測コンポーネントが別のMLPネットワークによって再構成される。実験の結果,AdaEnsembleの学習手法は,最先端のモデルと比較して最高の予測性能を持つだけでなく,深セン地下鉄の歴史的乗客フローデータといくつかの標準評価基準に基づいて,最も有望かつ堅牢であることがわかった。

Accurate and timely metro passenger flow forecasting is critical for the successful deployment of intelligent transportation systems. However, it is quite challenging to propose an efficient and robust forecasting approach due to the inherent randomness and variations of metro passenger flow. In this study, we present a novel adaptive ensemble (AdaEnsemble) learning approach to accurately forecast the volume of metro passenger flows, and it combines the complementary advantages of variational mode decomposition (VMD), seasonal autoregressive integrated moving averaging (SARIMA), multilayer perceptron network (MLP) and long short-term memory (LSTM) network. The AdaEnsemble learning approach consists of three important stages. The first stage applies VMD to decompose the metro passenger flows data into periodic component, deterministic component and volatility component. Then we employ SARIMA model to forecast the periodic component, LSTM network to learn and forecast deterministic component and MLP network to forecast volatility component. In the last stage, the diverse forecasted components are reconstructed by another MLP network. The empirical results show that our proposed AdaEnsemble learning approach not only has the best forecasting performance compared with the state-of-the-art models but also appears to be the most promising and robust based on the historical passenger flow data in Shenzhen subway system and several standard evaluation measures.

翻訳日:2022-12-30 20:27:21 公開日:2020-03-10

# 観光客到着の季節予測と傾向予測--適応型マルチスケールアンサンブル学習アプローチ

Seasonal and Trend Forecasting of Tourist Arrivals: An Adaptive Multiscale Ensemble Learning Approach ( http://arxiv.org/abs/2002.08021v2 )

ライセンス: Link先を確認

Shaolong Suna, Dan Bi, Ju-e Guo, Shouyang Wang

(参考訳) 観光客の到着の正確な季節予測と傾向予測は、非常に難しい課題である。来訪者の季節・傾向予測の重要性を念頭において、限定的な研究がこれらに注意を向けた。本研究では,観光客到着の短期・中・長期の季節・傾向予測のための変分モード分解 (vmd) と最小二乗支持ベクトル回帰 (lssvr) を組み込んだ適応型マルチスケールアンサンブル (ame) 学習手法を開発した。開発したame学習手法の定式化において,本シリーズは,まず傾向,季節,残りのボラティリティ成分に分解される。次に、ARIMAはトレンド成分の予測に使用され、SARIMAは12ヶ月周期で季節成分の予測に使用され、LSSVRは残りの変動成分の予測に使用される。最後に, 3つのコンポーネントの予測結果を集約し, LSSVRに基づく非線形アンサンブル手法により, 旅行者の到着を予測したアンサンブルを生成する。さらに、マルチステップアヘッド予測を実装するために直接戦略を用いる。 2つの精度測定とDiebold-Marianoテストから,本研究で使用した他のベンチマークと比較すると,AME学習手法が高度かつ指向性予測の精度を達成できることが実証された。

The accurate seasonal and trend forecasting of tourist arrivals is a very challenging task. In the view of the importance of seasonal and trend forecasting of tourist arrivals, and limited research work paid attention to these previously. In this study, a new adaptive multiscale ensemble (AME) learning approach incorporating variational mode decomposition (VMD) and least square support vector regression (LSSVR) is developed for short-, medium-, and long-term seasonal and trend forecasting of tourist arrivals. In the formulation of our developed AME learning approach, the original tourist arrivals series are first decomposed into the trend, seasonal and remainders volatility components. Then, the ARIMA is used to forecast the trend component, the SARIMA is used to forecast seasonal component with a 12-month cycle, while the LSSVR is used to forecast remainder volatility components. Finally, the forecasting results of the three components are aggregated to generate an ensemble forecasting of tourist arrivals by the LSSVR based nonlinear ensemble approach. Furthermore, a direct strategy is used to implement multi-step-ahead forecasting. Taking two accuracy measures and the Diebold-Mariano test, the empirical results demonstrate that our proposed AME learning approach can achieve higher level and directional forecasting accuracy compared with other benchmarks used in this study, indicating that our proposed approach is a promising model for forecasting tourist arrivals with high seasonality and volatility.

翻訳日:2022-12-30 14:39:05 公開日:2020-03-10

# DSSLP: 半教師付きリンク予測のための分散フレームワーク

DSSLP: A Distributed Framework for Semi-supervised Link Prediction ( http://arxiv.org/abs/2002.12056v2 )

ライセンス: Link先を確認

Dalong Zhang, Xianzheng Song, Ziqi Liu, Zhiqiang Zhang, Xin Huang, Lin Wang, Jun Zhou

(参考訳) リンク予測は、商人の推薦、不正取引検出など、様々な産業用途で広く利用されている。しかし、数十億のノードとエッジを持つ産業規模のグラフ上でリンク予測モデルをトレーニングし、デプロイすることは大きな課題です。本研究では,産業規模のグラフを扱える半教師付きリンク予測問題(DSSLP)のためのスケーラブルで分散的なフレームワークを提案する。 DSSLPは、グラフ全体のトレーニングモデルではなく、ミニバッチ設定でノードの「emph{$k$-hops neighborhood}」でトレーニングすることを提案しており、入力グラフのスケールを小さくし、トレーニング手順を分散するのに役立つ。負の例を効果的に生成するために、DSSLPは分散バッチ実行時サンプリングモジュールを含んでいる。均一および動的サンプリングアプローチを実装し、トレーニングプロセスのガイドとして、正および負のサンプルを適応的に構築することができる。さらにdsslpはリンク予測タスクの推論処理速度を高速化するためのモデル分割戦略を提案する。実験により,産業規模グラフのリアルタイムデータセットだけでなく,サービス公開データセットにおけるDSSLPの有効性と効率が示された。

Link prediction is widely used in a variety of industrial applications, such as merchant recommendation, fraudulent transaction detection, and so on. However, it's a great challenge to train and deploy a link prediction model on industrial-scale graphs with billions of nodes and edges. In this work, we present a scalable and distributed framework for semi-supervised link prediction problem (named DSSLP), which is able to handle industrial-scale graphs. Instead of training model on the whole graph, DSSLP is proposed to train on the \emph{$k$-hops neighborhood} of nodes in a mini-batch setting, which helps reduce the scale of the input graph and distribute the training procedure. In order to generate negative examples effectively, DSSLP contains a distributed batched runtime sampling module. It implements uniform and dynamic sampling approaches, and is able to adaptively construct positive and negative examples to guide the training process. Moreover, DSSLP proposes a model-split strategy to accelerate the speed of inference process of the link prediction task. Experimental results demonstrate that the effectiveness and efficiency of DSSLP in serval public datasets as well as real-world datasets of industrial-scale graphs.

翻訳日:2022-12-28 09:24:36 公開日:2020-03-10

# 逆襲を受ける深部3次元点雲モデルの等尺性について

On Isometry Robustness of Deep 3D Point Cloud Models under Adversarial Attacks ( http://arxiv.org/abs/2002.12222v2 )

ライセンス: Link先を確認

Yue Zhao, Yuwei Wu, Caihua Chen, Andrew Lim

(参考訳) 3D領域でのディープラーニングは多くのタスクにおいて革命的なパフォーマンスを達成したが、これらのモデルの堅牢性は十分に研究されていない。 3次元逆数サンプルについて、既存の研究のほとんどは局所点の操作に焦点をあてており、これはユークリッド距離、すなわち等距離を保存する線形射影の下でのロバスト性のような大域幾何学的性質を起こさない可能性がある。本研究では,既存の最先端3次元モデルがアイソメトリー変換に対して極めて脆弱であることを示す。トンプソンサンプリングを用いて、modelnet40データセットで95%以上の成功率を持つブラックボックス攻撃を開発した。制限等尺特性を組み込んだスペクトルノルムに基づく摂動の上に,ホワイトボックス攻撃の新たな枠組みを提案する。従来の研究とは対照的に,我々の反対サンプルは強く伝達可能であることが実験的に示されている。一般的な3dモデルで評価すると、ホワイトボックス攻撃は98.88%から100%の成功率を達成している。許容できない回転範囲$[\pm 2.81^{\circ}]$でも95%以上の攻撃率を維持している。

While deep learning in 3D domain has achieved revolutionary performance in many tasks, the robustness of these models has not been sufficiently studied or explored. Regarding the 3D adversarial samples, most existing works focus on manipulation of local points, which may fail to invoke the global geometry properties, like robustness under linear projection that preserves the Euclidean distance, i.e., isometry. In this work, we show that existing state-of-the-art deep 3D models are extremely vulnerable to isometry transformations. Armed with the Thompson Sampling, we develop a black-box attack with success rate over 95% on ModelNet40 data set. Incorporating with the Restricted Isometry Property, we propose a novel framework of white-box attack on top of spectral norm based perturbation. In contrast to previous works, our adversarial samples are experimentally shown to be strongly transferable. Evaluated on a sequence of prevailing 3D models, our white-box attack achieves success rates from 98.88% to 100%. It maintains a successful attack rate over 95% even within an imperceptible rotation range $[\pm 2.81^{\circ}]$.

翻訳日:2022-12-28 07:48:27 公開日:2020-03-10

# DeepMAL -- マルウェアのトラフィック検出と分類のためのディープラーニングモデル

DeepMAL -- Deep Learning Models for Malware Traffic Detection and Classification ( http://arxiv.org/abs/2003.04079v2 )

ライセンス: Link先を確認

Gonzalo Mar\'in, Pedro Casas, Germ\'an Capdehourat

(参考訳) ロバストネットワークセキュリティシステムは、ネットワーク攻撃の持続的発生による害を予防し軽減するために不可欠である。近年、機械学習ベースのシステムはネットワークセキュリティアプリケーションで人気を博しており、通常は、専門家による手作り入力機能の注意深いエンジニアリングに依存する浅いモデルの応用を考慮している。このアプローチの主な制限は、さまざまなシナリオやタイプの攻撃下で手作りの機能がうまく機能しないことだ。ディープラーニング(dl)モデルは、生の非処理データから特徴表現を学習する能力を使って、この制限を解決できる。本稿では,マルウェアネットワークトラフィックの検出と分類に関する特定の問題に対するDLモデルのパワーについて検討する。技術の現状に関して大きな利点として,監視されたバイトのストリームから直接得られる生の計測を提案モデルへの入力として検討し,パケットやフローレベルを含む様々な生のトラヒックの特徴表現を評価する。我々は、悪質なトラフィックの基盤となる統計を、専門的な手作り機能なしで把握できるDLモデルであるDeepMALを紹介した。異なる種類のマルウェアトラフィックを含む公開トラフィックトレースを用いて、DeepMALはマルウェアフローを高精度に検出・分類し、従来の浅層モデルより優れた性能を発揮することを示す。

Robust network security systems are essential to prevent and mitigate the harming effects of the ever-growing occurrence of network attacks. In recent years, machine learning-based systems have gain popularity for network security applications, usually considering the application of shallow models, which rely on the careful engineering of expert, handcrafted input features. The main limitation of this approach is that handcrafted features can fail to perform well under different scenarios and types of attacks. Deep Learning (DL) models can solve this limitation using their ability to learn feature representations from raw, non-processed data. In this paper we explore the power of DL models on the specific problem of detection and classification of malware network traffic. As a major advantage with respect to the state of the art, we consider raw measurements coming directly from the stream of monitored bytes as input to the proposed models, and evaluate different raw-traffic feature representations, including packet and flow-level ones. We introduce DeepMAL, a DL model which is able to capture the underlying statistics of malicious traffic, without any sort of expert handcrafted features. Using publicly available traffic traces containing different families of malware traffic, we show that DeepMAL can detect and classify malware flows with high accuracy, outperforming traditional, shallow-like models.

翻訳日:2022-12-26 23:39:29 公開日:2020-03-10

# 確率最適化アルゴリズムのハイパーパラメータチューニングについて

On Hyper-parameter Tuning for Stochastic Optimization Algorithms ( http://arxiv.org/abs/2003.02038v2 )

ライセンス: Link先を確認

Haotian Zhang, Jianyong Sun and Zongben Xu

(参考訳) 本稿では,強化学習に基づく確率最適化アルゴリズムのハイパーパラメータをチューニングするためのアルゴリズムフレームワークを提案する。ハイパーパラメータは進化的アルゴリズム(EA)やメタヒューリスティックスなどの確率最適化アルゴリズムの性能に大きな影響を及ぼす。しかし、これらのアルゴリズムの確率的性質から最適なハイパーパラメータを決定するのは非常に時間がかかる。我々は,マルコフ決定過程としてチューニング手順をモデル化し,ハイパーパラメータをチューニングするためのポリシー勾配アルゴリズムを適用することを提案する。異なる最適化問題(連続的および離散的)に対して、異なる種類のハイパーパラメータ(連続的および離散的)を持つ確率的アルゴリズムをチューニングする実験は、提案するハイパーパラメータチューニングアルゴリズムがベイズ最適化法よりも確率的アルゴリズムの実行時間が少なくないことを示している。提案フレームワークは確率アルゴリズムにおけるハイパーパラメータチューニングの標準ツールとして利用できる。

This paper proposes the first-ever algorithmic framework for tuning hyper-parameters of stochastic optimization algorithm based on reinforcement learning. Hyper-parameters impose significant influences on the performance of stochastic optimization algorithms, such as evolutionary algorithms (EAs) and meta-heuristics. Yet, it is very time-consuming to determine optimal hyper-parameters due to the stochastic nature of these algorithms. We propose to model the tuning procedure as a Markov decision process, and resort the policy gradient algorithm to tune the hyper-parameters. Experiments on tuning stochastic algorithms with different kinds of hyper-parameters (continuous and discrete) for different optimization problems (continuous and discrete) show that the proposed hyper-parameter tuning algorithms do not require much less running times of the stochastic algorithms than bayesian optimization method. The proposed framework can be used as a standard tool for hyper-parameter tuning in stochastic algorithms.

翻訳日:2022-12-26 12:07:11 公開日:2020-03-10

# 放送マッチングビデオにおける個人選手追跡のためのハイブリッド手法

A Hybrid Approach for Tracking Individual Players in Broadcast Match Videos ( http://arxiv.org/abs/2003.03271v2 )

ライセンス: Link先を確認

Roberto L. Castro, Diego Andrade, Basilio Fraguela

(参考訳) ビデオシーケンスで人々を追跡することは、多くの観点からアプローチされた難しいタスクです。このタスクは、放送されたスポーツイベントの選手であるときにさらに複雑になるが、その理由として、頻繁なカメラの動きやスイッチ、プレイヤー間の全体的および部分的閉塞、ビデオの凝固アルゴリズムによるぼやけたフレームなどの困難が存在する。本稿では,高速かつ高精度な選手追跡ソリューションを提案する。これにより、プレイヤーを正確にリアルタイムで追跡することができる。このアプローチは、比較的控えめなハードウェアで同時に実行される複数のモデルを組み合わせており、手ラベルのブロードキャストビデオシーケンスに対して精度が検証されている。精度については,本手法の曲線下領域 (auc) は約0.6であり, art 解の汎用状態と類似していることを示す。性能に関しては80fpsで高精細ビデオ(1920x1080px)を処理できる。

Tracking people in a video sequence is a challenging task that has been approached from many perspectives. This task becomes even more complicated when the person to track is a player in a broadcasted sport event, the reasons being the existence of difficulties such as frequent camera movements or switches, total and partial occlusions between players, and blurry frames due to the codification algorithm of the video. This paper introduces a player tracking solution which is both fast and accurate. This allows to track a player precisely in real-time. The approach combines several models that are executed concurrently in a relatively modest hardware, and whose accuracy has been validated against hand-labeled broadcast video sequences. Regarding the accuracy, the tests show that the area under curve (AUC) of our approach is around 0.6, which is similar to generic state of the art solutions. As for performance, our proposal can process high definition videos (1920x1080 px) at 80 fps.

翻訳日:2022-12-26 01:38:57 公開日:2020-03-10

# Mind the Gap: Open Set Domain Adaptationにおけるドメインギャップの拡大

Mind the Gap: Enlarging the Domain Gap in Open Set Domain Adaptation ( http://arxiv.org/abs/2003.03787v2 )

ライセンス: Link先を確認

Dongliang Chang, Aneeshan Sain, Zhanyu Ma, Yi-Zhe Song, Jun Guo

(参考訳) 教師なしのドメイン適応は、ソースドメインからのラベル付きデータを活用して、ラベルなしのターゲットドメインの分類子を学ぶことを目的としています。その多くの変種の中で、open set domain adaptation (osda) はおそらく最も困難であり、ターゲットドメインに未知のクラスの存在を想定している。本稿では,osda について,より大きな領域間隙を横断する能力の強化に特に焦点をあてて検討する。第一に、既存の最先端手法は、特にOSDA用に再設計された新しいデータセット(PACS)において、より大きなドメインギャップが存在する場合、大幅なパフォーマンス低下を被ることを示す。次に、より大きなドメインギャップに対処する新しいフレームワークを提案する。重要な洞察は、2つのネットワーク間の相互に有益な情報をどのように活用するかである。 a) 既知のクラスと未知のクラスのサンプルを分離する。 b) 未知のサンプルの影響を受けずに、ソースとターゲットドメイン間のドメインの混乱を最大化する。その通りです (a)及び (b)相互に監督し、収束するまで交代する。 Office-31、Office-Home、PACSのデータセットで大規模な実験を行い、他の最先端技術と比較して、我々の手法の優位性を実証した。コードはhttps://github.com/dongliangchang/mutual-to-separate/。

Unsupervised domain adaptation aims to leverage labeled data from a source domain to learn a classifier for an unlabeled target domain. Among its many variants, open set domain adaptation (OSDA) is perhaps the most challenging, as it further assumes the presence of unknown classes in the target domain. In this paper, we study OSDA with a particular focus on enriching its ability to traverse across larger domain gaps. Firstly, we show that existing state-of-the-art methods suffer a considerable performance drop in the presence of larger domain gaps, especially on a new dataset (PACS) that we re-purposed for OSDA. We then propose a novel framework to specifically address the larger domain gaps. The key insight lies with how we exploit the mutually beneficial information between two networks; (a) to separate samples of known and unknown classes, (b) to maximize the domain confusion between source and target domain without the influence of unknown samples. It follows that (a) and (b) will mutually supervise each other and alternate until convergence. Extensive experiments are conducted on Office-31, Office-Home, and PACS datasets, demonstrating the superiority of our method in comparison to other state-of-the-arts. Code available at https://github.com/dongliangchang/Mutual-to-Separate/

翻訳日:2022-12-25 14:34:22 公開日:2020-03-10

# SQUIRL:長軸ロボットマニピュレーションタスクのビデオデモによるロバストで効率的な学習

SQUIRL: Robust and Efficient Learning from Video Demonstration of Long-Horizon Robotic Manipulation Tasks ( http://arxiv.org/abs/2003.04956v1 )

ライセンス: Link先を確認

Bohan Wu, Feng Xu, Zhanpeng He, Abhi Gupta, and Peter K. Allen

(参考訳) 深部強化学習(RL)の最近の進歩は、複雑なロボット操作タスクを学習する可能性を示している。しかし、RLはロボットに大量の現実世界の体験を収集する必要がある。この問題に対処するため、近年の研究では、少数の専門家によるデモンストレーションだけで堅牢なパフォーマンスを実現する能力から、特に逆強化学習(irl)を通じて、エキスパートデモンストレーション(lfd)からの学習を提案している。それでも、実際のロボットにIRLをデプロイすることは、大量のロボット体験を必要とするため、依然として難しい。本稿では,この拡張性に頑健で,サンプル効率が高く,かつ汎用的なメタIRLアルゴリズムであるSQUIRLを用いて取り組むことを目的としている。このアルゴリズムはまず,行動クローニング(BC)を用いたタスクエンコーダとタスク条件付きポリシーの学習をブートストラップする。そして、実際のロボット体験を収集し、報酬学習を回避し、組み合わせたロボットと専門家の軌道からQ関数を直接回収する。次に、このアルゴリズムはQ関数を用いて、ロボットが収集した累積体験を再評価し、ポリシーを迅速に改善する。結局、このポリシーは、テスト時に試行錯誤を必要とせず、新しいタスクでbcよりも堅牢に(90%以上の成功)する。最後に、我々の実ロボットとシミュレーション実験は、異なる状態空間、アクション空間、視覚に基づく操作タスク、例えばピック・プール・プレースやピック・キャリー・ドロップにおけるアルゴリズムの一般化を実証する。

Recent advances in deep reinforcement learning (RL) have demonstrated its potential to learn complex robotic manipulation tasks. However, RL still requires the robot to collect a large amount of real-world experience. To address this problem, recent works have proposed learning from expert demonstrations (LfD), particularly via inverse reinforcement learning (IRL), given its ability to achieve robust performance with only a small number of expert demonstrations. Nevertheless, deploying IRL on real robots is still challenging due to the large number of robot experiences it requires. This paper aims to address this scalability challenge with a robust, sample-efficient, and general meta-IRL algorithm, SQUIRL, that performs a new but related long-horizon task robustly given only a single video demonstration. First, this algorithm bootstraps the learning of a task encoder and a task-conditioned policy using behavioral cloning (BC). It then collects real-robot experiences and bypasses reward learning by directly recovering a Q-function from the combined robot and expert trajectories. Next, this algorithm uses the Q-function to re-evaluate all cumulative experiences collected by the robot to improve the policy quickly. In the end, the policy performs more robustly (90%+ success) than BC on new tasks while requiring no trial-and-errors at test time. Finally, our real-robot and simulated experiments demonstrate our algorithm's generality across different state spaces, action spaces, and vision-based manipulation tasks, e.g., pick-pour-place and pick-carry-drop.

翻訳日:2022-12-24 21:59:08 公開日:2020-03-10

# PL${}_{1}$P -- 3つの視点における部分可視性の下でのポイントライン最小問題

PL${}_{1}$P -- Point-line Minimal Problems under Partial Visibility in Three Views ( http://arxiv.org/abs/2003.05015v1 )

ライセンス: Link先を確認

Timothy Duff, Kathl\'en Kohn, Anton Leykin, Tomas Pajdla

(参考訳) 本稿では,各直線が少なくとも1点に入射した場合に,空間内の点と線の一般的な配置に関する最小限の問題を3つの校正視点カメラで部分的に観察する。これは、オクルージョンによる画像の観察の欠如と検出の欠如を可能にする、興味深い極小問題の大きなクラスである。そのような最小限の問題は無限に存在するが、過剰な特徴を取り除き、カメラを回避し、140616の等価クラスに還元できることが示される。また,最小解法の設計に実用的なカメラミニマル問題を導入し,最小問題ごとに最も単純なカメラミニマル問題を選択する方法を示す。この単純化により、74575同値類が得られる。 76種のみが知られており、残りは新種である。画像マッチングと3次元再構成の実用的解決の可能性を持つ問題を特定するため,300未満の汎用データに対する解を持つカメラ最小問題に対する解数を計算するとともに,カメラ最小問題のより小さなサブファミリ数をいくつか提示する。

We present a complete classification of minimal problems for generic arrangements of points and lines in space observed partially by three calibrated perspective cameras when each line is incident to at most one point. This is a large class of interesting minimal problems that allows missing observations in images due to occlusions and missed detections. There is an infinite number of such minimal problems; however, we show that they can be reduced to 140616 equivalence classes by removing superfluous features and relabeling the cameras. We also introduce camera-minimal problems, which are practical for designing minimal solvers, and show how to pick a simplest camera-minimal problem for each minimal problem. This simplification results in 74575 equivalence classes. Only 76 of these were known; the rest are new. In order to identify problems that have potential for practical solving of image matching and 3D reconstruction, we present several smaller natural subfamilies of camera-minimal problems as well as compute solution counts for all camera-minimal problems which have less than 300 solutions for generic data.

翻訳日:2022-12-24 21:58:08 公開日:2020-03-10

# テラヘルツ通信ネットワークにおける間欠干渉緩和のための強化学習

Reinforcement Learning for Mitigating Intermittent Interference in Terahertz Communication Networks ( http://arxiv.org/abs/2003.04832v1 )

ライセンス: Link先を確認

Reza Barazideh and Omid Semiari and Solmaz Niknam and Balasubramaniam Natarajan

(参考訳) リアルタイム拡張現実感アプリケーションのような極めて高いデータレート要求の無線サービスを創り出すには、将来の無線ネットワークの容量をさらに増やす新しいソリューションを義務付ける必要がある。この場合、テラヘルツ周波数帯における大きな帯域幅を活用することが鍵となる。これらの高周波数での大きな伝搬損失を克服するためには、高方向リンク上での伝送を管理することは避けられない。しかし、多数のユーザによる非コーディネート指向送信はテラヘルツネットワークにかなりの干渉を引き起こす可能性がある。このような干渉は短いランダムな時間間隔で受信されるが、受信した電力は大きい。本研究では,適応型マルチthresholding戦略を用いて,時間領域における方向リンクからの間欠的干渉を効率的に検出し緩和する,強化学習に基づく新しいフレームワークを提案する。最適しきい値を求めるために、問題は多次元多腕バンディットシステムとして定式化される。次に、受信者が非常に低い複雑性で最適な閾値を学習できるアルゴリズムを提案する。提案手法のもう1つの重要な利点は、干渉統計に関する事前の知識に依存しないため、動的シナリオにおける干渉緩和に適していることである。シミュレーションの結果,従来の2つの時間領域干渉緩和法と比較して,提案手法のビットエラーレート性能が良好であることが確認された。

Emerging wireless services with extremely high data rate requirements, such as real-time extended reality applications, mandate novel solutions to further increase the capacity of future wireless networks. In this regard, leveraging large available bandwidth at terahertz frequency bands is seen as a key enabler. To overcome the large propagation loss at these very high frequencies, it is inevitable to manage transmissions over highly directional links. However, uncoordinated directional transmissions by a large number of users can cause substantial interference in terahertz networks. While such interference will be received over short random time intervals, the received power can be large. In this work, a new framework based on reinforcement learning is proposed that uses an adaptive multi-thresholding strategy to efficiently detect and mitigate the intermittent interference from directional links in the time domain. To find the optimal thresholds, the problem is formulated as a multidimensional multi-armed bandit system. Then, an algorithm is proposed that allows the receiver to learn the optimal thresholds with very low complexity. Another key advantage of the proposed approach is that it does not rely on any prior knowledge about the interference statistics, and hence, it is suitable for interference mitigation in dynamic scenarios. Simulation results confirm the superior bit-error-rate performance of the proposed method compared with two traditional time-domain interference mitigation approaches.

翻訳日:2022-12-24 21:56:26 公開日:2020-03-10

# O&G機械学習モデルのデータリニアジ管理:シェールユースケースのためのスイートスポット

Managing Data Lineage of O&G Machine Learning Models: The Sweet Spot for Shale Use Case ( http://arxiv.org/abs/2003.04915v1 )

ライセンス: Link先を確認

Raphael Thiago, Renan Souza, L. Azevedo, E. Soares, Rodrigo Santos, Wallas Santos, Max De Bayser, M. Cardoso, M. Moreno, and Renato Cerqueira

(参考訳) 機械学習(ML)は、いくつかの業界で欠かせない役割を担っている。しかしながら、"このモデルをトレーニングするために使用されるデータセットはどこから来たのか?"、いくつかの新しいデータ保護法の導入、データガバナンス要件の必要性など、データ系統のトレーニングに関する疑問は、現実の世界におけるMLモデルの採用を妨げている。本稿では,シェールオイルとガス生産のためのスイートスポットを発見するためのMLモデルを構築するために,MLライフサイクルの恩恵を受けるために,データ系統をどのように活用できるかを論じる。

Machine Learning (ML) has increased its role, becoming essential in several industries. However, questions around training data lineage, such as "where has the dataset used to train this model come from?"; the introduction of several new data protection legislation; and, the need for data governance requirements, have hindered the adoption of ML models in the real world. In this paper, we discuss how data lineage can be leveraged to benefit the ML lifecycle to build ML models to discover sweet-spots for shale oil and gas production, a major application in the Oil and Gas O&G Industry.

翻訳日:2022-12-24 21:56:05 公開日:2020-03-10

# コミュニケーション効率のよい分散ディープラーニング:包括的調査

Communication-Efficient Distributed Deep Learning: A Comprehensive Survey ( http://arxiv.org/abs/2003.06307v1 )

ライセンス: Link先を確認

Zhenheng Tang, Shaohuai Shi, Xiaowen Chu, Wei Wang, Bo Li

(参考訳) 分散ディープラーニングは、ディープモデルやデータセットのサイズが大きくなるにつれて、複数のコンピューティングデバイス(gpuやtpuなど)を活用することで、トレーニング時間全体の削減に非常に一般的なものになる。しかしながら、コンピューティングデバイス間のデータ通信は、システムのスケーラビリティを制限する潜在的なボトルネックになり得る。分散ディープラーニングにおけるコミュニケーション問題への対処法は,近年ホットな研究トピックになりつつある。本稿では,システムレベルの最適化とアルゴリズムレベルの最適化の両方において,通信効率のよい分散学習アルゴリズムの包括的調査を行う。システムレベルでは、通信コストを削減するため、システム設計と実装をデミスティフィケートする。アルゴリズムレベルでは、異なるアルゴリズムと理論収束境界と通信複雑性を比較する。具体的には、まず、通信同期、システムアーキテクチャ、圧縮技術、通信とコンピューティングの並列性という4つの主次元を含むデータ並列分散トレーニングアルゴリズムの分類法を提案する。次に,コミュニケーションコストを比較するために,4次元の問題に対処する研究について述べる。さらに、異なるアルゴリズムの収束率を比較することで、反復の観点からアルゴリズムがどの程度の速度で解に収束できるかを知ることができる。システムレベルの通信コスト分析と理論収束速度比較により、特定の分散環境においてどのアルゴリズムがより効率的かを理解し、潜在的な方向を推定し、さらなる最適化を行うことができる。

Distributed deep learning becomes very common to reduce the overall training time by exploiting multiple computing devices (e.g., GPUs/TPUs) as the size of deep models and data sets increases. However, data communication between computing devices could be a potential bottleneck to limit the system scalability. How to address the communication problem in distributed deep learning is becoming a hot research topic recently. In this paper, we provide a comprehensive survey of the communication-efficient distributed training algorithms in both system-level and algorithmic-level optimizations. In the system-level, we demystify the system design and implementation to reduce the communication cost. In algorithmic-level, we compare different algorithms with theoretical convergence bounds and communication complexity. Specifically, we first propose the taxonomy of data-parallel distributed training algorithms, which contains four main dimensions: communication synchronization, system architectures, compression techniques, and parallelism of communication and computing. Then we discuss the studies in addressing the problems of the four dimensions to compare the communication cost. We further compare the convergence rates of different algorithms, which enable us to know how fast the algorithms can converge to the solution in terms of iterations. According to the system-level communication cost analysis and theoretical convergence speed comparison, we provide the readers to understand what algorithms are more efficient under specific distributed environments and extrapolate potential directions for further optimizations.

翻訳日:2022-12-24 21:55:28 公開日:2020-03-10

# JS-son - リーンで拡張可能なJavaScriptエージェントプログラミングライブラリ

JS-son -- A Lean, Extensible JavaScript Agent Programming Library ( http://arxiv.org/abs/2003.04690v1 )

ライセンス: Link先を確認

Timotheus Kampik and Juan Carlos Nieves

(参考訳) エージェント指向のソフトウェアエンジニアリングフレームワークは数多く存在し、そのほとんどは学術的マルチエージェントシステムコミュニティによって開発されている。しかしながら、これらのフレームワークは、JavaScriptやPythonのようなモダンなハイレベルプログラミング言語に慣れているエンジニアのために学ぶのが難しい、プログラミングパラダイムをユーザに課すことが多い。ソフトウェア工学の主流によるエージェント指向プログラミングの導入がいかに容易かを示すため、推論ループエージェントを実装するためのリーンJavaScriptライブラリのプロトタイプを提供する。このライブラリはコアエージェントプログラミングの概念に重点を置いており、プログラミングアプローチにさらなる制限を課すことを控えている。その有用性を説明するために、このライブラリをweb上のマルチエージェントシステムシミュレーションに適用し、クラウドでホストされたfunction-as-a-service環境にデプロイし、pythonベースのデータサイエンスツールに組み込む方法を示す。

A multitude of agent-oriented software engineering frameworks exist, most of which are developed by the academic multi-agent systems community. However, these frameworks often impose programming paradigms on their users that are challenging to learn for engineers who are used to modern high-level programming languages such as JavaScript and Python. To show how the adoption of agent-oriented programming by the software engineering mainstream can be facilitated, we provide a lean JavaScript library prototype for implementing reasoning-loop agents. The library focuses on core agent programming concepts and refrains from imposing further restrictions on the programming approach. To illustrate its usefulness, we show how the library can be applied to multi-agent systems simulations on the web, deployed to cloud-hosted function-as-a-service environments, and embedded in Python-based data science tools.

翻訳日:2022-12-24 21:48:40 公開日:2020-03-10

# 移動目標モンテカルロ

Moving Target Monte Carlo ( http://arxiv.org/abs/2003.04873v1 )

ライセンス: Link先を確認

Haoyun Ying, Keheng Mao, Klaus Mosegaard

(参考訳) マルコフ連鎖モンテカルロ法(mcmc)は、高次元確率変数 $\mathbf{x}$ と非正規化確率密度 $p$ と観測データ $\mathbf{d}$ からのサンプリングを考える際によく用いられる。しかし、MCMCは、受容率を構成する際に、提案された候補$\mathbf{x}$の後方分布$p(\mathbf{x}|\mathbf{d})$を評価する必要がある。このような評価が難しければコストがかかる。本稿では,移動目標モンテカルロ (MTMC) と呼ばれる非マルコフ型サンプリングアルゴリズムを提案する。 n$-th での受け入れ率は、$p(\mathbf{x}|\mathbf{d})$の代わりに、後方分布 $a_n(\mathbf{x})$ の反復的に更新された近似を用いて構成される。後方の$p(\mathbf{x}|\mathbf{d})$ の真の値は、候補 $\mathbf{x}$ が受け入れられる場合にのみ計算される。近似$a_n$はこれらの評価を利用し、$n \rightarrow \infty$として$p$に収束する。異なる状況における収束の証明と収束率の推定が与えられる。

The Markov Chain Monte Carlo (MCMC) methods are popular when considering sampling from a high-dimensional random variable $\mathbf{x}$ with possibly unnormalised probability density $p$ and observed data $\mathbf{d}$. However, MCMC requires evaluating the posterior distribution $p(\mathbf{x}|\mathbf{d})$ of the proposed candidate $\mathbf{x}$ at each iteration when constructing the acceptance rate. This is costly when such evaluations are intractable. In this paper, we introduce a new non-Markovian sampling algorithm called Moving Target Monte Carlo (MTMC). The acceptance rate at $n$-th iteration is constructed using an iteratively updated approximation of the posterior distribution $a_n(\mathbf{x})$ instead of $p(\mathbf{x}|\mathbf{d})$. The true value of the posterior $p(\mathbf{x}|\mathbf{d})$ is only calculated if the candidate $\mathbf{x}$ is accepted. The approximation $a_n$ utilises these evaluations and converges to $p$ as $n \rightarrow \infty$. A proof of convergence and estimation of convergence rate in different situations are given.

翻訳日:2022-12-24 21:48:18 公開日:2020-03-10

# 通信効率ばらつき低減確率勾配降下

Communication-efficient Variance-reduced Stochastic Gradient Descent ( http://arxiv.org/abs/2003.04686v1 )

ライセンス: Link先を確認

Hossein S. Ghadikolaei and Sindri Magnusson

(参考訳) 複数のノードが各イテレーションで重要なアルゴリズム情報を交換して大きな問題を解決する通信効率のよい分散最適化の問題を考える。特に,確率的分散還元勾配に着目し,通信効率を高めるための新しい手法を提案する。すなわち、元の非圧縮アルゴリズムの線形収束率を維持しながら、通信された情報を数ビットに圧縮する。実データ集合の包括的理論的および数値的解析により,本アルゴリズムは通信の複雑さを最大 95 % 削減できることを明らかにした。さらに、分散最適化問題を解くための最先端アルゴリズムよりも、量子化(真の最小値と収束率の維持の観点から)がはるかに堅牢である。この結果は,モノのインターネットやモバイルネットワーク上での機械学習利用に重要な意味を持つ。

We consider the problem of communication efficient distributed optimization where multiple nodes exchange important algorithm information in every iteration to solve large problems. In particular, we focus on the stochastic variance-reduced gradient and propose a novel approach to make it communication-efficient. That is, we compress the communicated information to a few bits while preserving the linear convergence rate of the original uncompressed algorithm. Comprehensive theoretical and numerical analyses on real datasets reveal that our algorithm can significantly reduce the communication complexity, by as much as 95\%, with almost no noticeable penalty. Moreover, it is much more robust to quantization (in terms of maintaining the true minimizer and the convergence rate) than the state-of-the-art algorithms for solving distributed optimization problems. Our results have important implications for using machine learning over internet-of-things and mobile networks.

翻訳日:2022-12-24 21:47:26 公開日:2020-03-10

# ヘテロティックラインバンドルモデルによる探索と爆発

Explore and Exploit with Heterotic Line Bundle Models ( http://arxiv.org/abs/2003.04817v1 )

ライセンス: Link先を確認

Magdalena Larfors and Robin Schneider

(参考訳) 我々は、完全区間カラビヤウ(CICY)多様体上のラインバンドル和から構築されたヘテロティック $SU(5)$ GUT モデルのクラスを、深層強化学習を用いて探索する。我々は,a3cエージェントがモデル探索を訓練する実験を複数実施する。これらのエージェントはランダムな探索よりも優れており、ユニークなモデルを見つける上で最も好ましい設定は1700倍である。さらに、訓練されたエージェントが新しい多様体上のランダムウォーカーよりも優れているという証拠も発見する。エージェントは圧縮データ中に隠れた構造を検知し,その一部が一般の性質であることがわかった。実験は$h^{(1,1)}$でうまくスケールし、大きな$h^{(1,1)}$でCICY上でのモデル構築の鍵を提供するかもしれない。

We use deep reinforcement learning to explore a class of heterotic $SU(5)$ GUT models constructed from line bundle sums over Complete Intersection Calabi Yau (CICY) manifolds. We perform several experiments where A3C agents are trained to search for such models. These agents significantly outperform random exploration, in the most favourable settings by a factor of 1700 when it comes to finding unique models. Furthermore, we find evidence that the trained agents also outperform random walkers on new manifolds. We conclude that the agents detect hidden structures in the compactification data, which is partly of general nature. The experiments scale well with $h^{(1,1)}$, and may thus provide the key to model building on CICYs with large $h^{(1,1)}$.

翻訳日:2022-12-24 21:47:13 公開日:2020-03-10

# データ分析によるレジリエンス犯罪ネットワークの破壊:シチリア・マフィアの事例

Disrupting Resilient Criminal Networks through Data Analysis: The case of Sicilian Mafia ( http://arxiv.org/abs/2003.05303v1 )

ライセンス: Link先を確認

Lucia Cavallaro, Annamaria Ficara, Pasquale De Meo, Giacomo Fiumara, Salvatore Catanese, Ovidiu Bagdasar and Antonio Liotta

(参考訳) 他のタイプのソーシャルネットワークと比較すると、犯罪ネットワークは破壊に対する強い弾力性があり、法執行機関に厳しいハードルをもたらすため、困難な課題を呈している。ここではソーシャルネットワーク分析から手法やツールを借りて (i)現実世界の2つのデータセットに基づくシチリアマフィアギャングの構造を明らかにし、 (ii)それらを効率的にディスラプトする方法についての洞察を得る。マフィアネットワークには、リンクの分布と強度により、他のソーシャルネットワークとは大きく異なる特徴があり、外因性摂動に対して非常に堅牢である。アナリストはまた、ギャングの内部構造と外界との関係を正確に記述する信頼できるデータセットの収集が困難であることに直面している。私たちの研究の付加価値は、2000年代前半にシチリアで活動したマフィア組織に関連する、法律的な行為に由来する生のデータに基づく、2つの現実世界のデータセットの生成です。 2つの異なるネットワークを作り、それぞれ電話と物理的な会議を捉えました。ネットワーク破壊分析は異なる介入手順をシミュレートしました一一度に一人の犯罪者を逮捕すること(次回的ノード削除) (ii)警察の襲撃(ノードの取り外し)。各アプローチの有効性を,複数のネットワーク集中度指標を用いて測定した。その中では、アフィリエイトの5%だけを中和することで、ネットワーク接続が70%低下したことを示す。また,重み付きネットワーク分析と非重み付きネットワーク分析では,犯罪ネットワークにおける特異な相互作用タイプ(すなわち相互作用頻度の分布)が有意な差は認められなかった。我々の研究は犯罪やテロリストのネットワークに対処するための重要な実践的応用を持っている。

Compared to other types of social networks, criminal networks present hard challenges, due to their strong resilience to disruption, which poses severe hurdles to law-enforcement agencies. Herein, we borrow methods and tools from Social Network Analysis to (i) unveil the structure of Sicilian Mafia gangs, based on two real-world datasets, and (ii) gain insights as to how to efficiently disrupt them. Mafia networks have peculiar features, due to the links distribution and strength, which makes them very different from other social networks, and extremely robust to exogenous perturbations. Analysts are also faced with the difficulty in collecting reliable datasets that accurately describe the gangs' internal structure and their relationships with the external world, which is why earlier studies are largely qualitative, elusive and incomplete. An added value of our work is the generation of two real-world datasets, based on raw data derived from juridical acts, relating to a Mafia organization that operated in Sicily during the first decade of 2000s. We created two different networks, capturing phone calls and physical meetings, respectively. Our network disruption analysis simulated different intervention procedures: (i) arresting one criminal at a time (sequential node removal); and (ii) police raids (node block removal). We measured the effectiveness of each approach through a number of network centrality metrics. We found Betweeness Centrality to be the most effective metric, showing how, by neutralizing only the 5% of the affiliates, network connectivity dropped by 70%. We also identified that, due the peculiar type of interactions in criminal networks (namely, the distribution of the interactions frequency) no significant differences exist between weighted and unweighted network analysis. Our work has significant practical applications for tackling criminal and terrorist networks.

翻訳日:2022-12-24 21:40:49 公開日:2020-03-10

# DymSLAM:4次元幾何学的モーションセグメンテーションに基づく動的シーン再構成

DymSLAM:4D Dynamic Scene Reconstruction Based on Geometrical Motion Segmentation ( http://arxiv.org/abs/2003.04569v1 )

ライセンス: Link先を確認

Chenjie Wang and Bin Luo and Yun Zhang and Qing Zhao and Lu Yin and Wei Wang and Xin Su and Yajun Wang and Chengyuan Li

(参考訳) ほとんどのSLAMアルゴリズムは、シーンが静的であるという仮定に基づいている。しかし実際には、ほとんどのシーンは動的で、通常動く物体を含んでいるが、これらの手法は適していない。本稿では、4D(3D + Time)動的シーンを剛体移動物体で再構成できる動的ステレオ視覚SLAMシステムDymSLAMを紹介する。 DymSLAMの唯一の入力はステレオビデオであり、その出力には、静止環境の密度の高いマップ、移動物体の3Dモデル、カメラと移動物体の軌跡が含まれる。まず、従来のSLAM手法を用いて、連続するフレーム間の興味深い点を検出し、マッチングする。次に、異なる運動モデルに属する興味深いポイント(剛体移動物体のエゴ運動や運動モデルを含む)をマルチモデルフィッティングアプローチで分割する。エゴモーションに属する興味深い点に基づいて、カメラの軌跡を推定し、静的な背景を再構築することができる。剛体移動物体の運動モデルに属する興味深い点は、相対運動モデルとカメラとの相対運動モデルの推定と、物体の3次元モデル再構築に使用される。次に、グローバル参照フレーム内の移動物体の軌道に対する相対運動を変換する。最後に,移動物体の3次元モデルを環境の3次元マップに融合させ,その運動軌跡を考慮し,4D(3D+time)配列を得る。 dymslamは、それを無視する代わりに動的オブジェクトに関する情報を取得し、未知の剛体オブジェクトに適している。そこで,提案システムでは,ロボットを動的物体に対する障害物回避などのハイレベルなタスクに活用することができる。我々は,カメラと物体が広範囲に移動している実環境において実験を行った。

Most SLAM algorithms are based on the assumption that the scene is static. However, in practice, most scenes are dynamic which usually contains moving objects, these methods are not suitable. In this paper, we introduce DymSLAM, a dynamic stereo visual SLAM system being capable of reconstructing a 4D (3D + time) dynamic scene with rigid moving objects. The only input of DymSLAM is stereo video, and its output includes a dense map of the static environment, 3D model of the moving objects and the trajectories of the camera and the moving objects. We at first detect and match the interesting points between successive frames by using traditional SLAM methods. Then the interesting points belonging to different motion models (including ego-motion and motion models of rigid moving objects) are segmented by a multi-model fitting approach. Based on the interesting points belonging to the ego-motion, we are able to estimate the trajectory of the camera and reconstruct the static background. The interesting points belonging to the motion models of rigid moving objects are then used to estimate their relative motion models to the camera and reconstruct the 3D models of the objects. We then transform the relative motion to the trajectories of the moving objects in the global reference frame. Finally, we then fuse the 3D models of the moving objects into the 3D map of the environment by considering their motion trajectories to obtain a 4D (3D+time) sequence. DymSLAM obtains information about the dynamic objects instead of ignoring them and is suitable for unknown rigid objects. Hence, the proposed system allows the robot to be employed for high-level tasks, such as obstacle avoidance for dynamic objects. We conducted experiments in a real-world environment where both the camera and the objects were moving in a wide range.

翻訳日:2022-12-24 21:40:22 公開日:2020-03-10

# アンサンブル色空間モデルを用いた逆例への取り組み

Using an ensemble color space model to tackle adversarial examples ( http://arxiv.org/abs/2003.05005v1 )

ライセンス: Link先を確認

Shreyank N Gowda, Chun Yuan

(参考訳) 画像中の微小ピクセルの変化は、ディープラーニングモデルが生み出す予測を大幅に変える。例えば、これが原因で起こりうる最も重要な問題の1つは、自動運転だ。これに対処するために様々な方法が提案されている。このような攻撃を防御する3段階の手法を提案する。まず,統計的手法を用いて画像の識別を行う。第二に、同じモデルに複数の色空間を採用することで、各色空間がそれ自身に明示的な特徴を検出することによって、これらの敵対的攻撃と戦うことができることを示す。最後に、生成された特徴マップを拡大し、入力として送り返してさらに小さな特徴を得る。提案モデルは,特定の攻撃を防御するために訓練される必要はなく,本質的にはブラックボックス,ホワイトボックス,グレイボックスの対向攻撃技術に頑健であることを示す。特に、モデルが敵対的な例の訓練を受けていない場合、ホワイトボックス攻撃の場合、このモデルは比較モデルよりも56.12%頑丈である。

Minute pixel changes in an image drastically change the prediction that the deep learning model makes. One of the most significant problems that could arise due to this, for instance, is autonomous driving. Many methods have been proposed to combat this with varying amounts of success. We propose a 3 step method for defending such attacks. First, we denoise the image using statistical methods. Second, we show that adopting multiple color spaces in the same model can help us to fight these adversarial attacks further as each color space detects certain features explicit to itself. Finally, the feature maps generated are enlarged and sent back as an input to obtain even smaller features. We show that the proposed model does not need to be trained to defend an particular type of attack and is inherently more robust to black-box, white-box, and grey-box adversarial attack techniques. In particular, the model is 56.12 percent more robust than compared models in case of white box attacks when the models are not subject to adversarial example training.

翻訳日:2022-12-24 21:37:39 公開日:2020-03-10

# 医療画像セグメンテーションのための組込み集合知識の多レベルコンテキストゲーティング

Multi-level Context Gating of Embedded Collective Knowledge for Medical Image Segmentation ( http://arxiv.org/abs/2003.05056v1 )

ライセンス: Link先を確認

Maryam Asadi-Aghbolaghi, Reza Azad, Mahmood Fathy, and Sergio Escalera

(参考訳) 様々な症例で解剖学的に大きな違いがあるため, 医用画像のセグメンテーションは非常に困難である。ディープラーニングフレームワークの最近の進歩は、画像セグメンテーションの高速で正確なパフォーマンスを示している。既存のネットワークのうち、u-netは医療画像のセグメンテーションにうまく適用されている。本稿では,U-Net,Squeeze and Excitation (SE) ブロック,双方向 ConvLSTM (BConvLSTM) および高密度畳み込みのメカニズムを最大限に活用する医用画像分割のためのU-Netの拡張を提案する。 (I) U-Net内のSEモジュールを利用することでセグメンテーション性能を向上し、モデルの複雑さに小さな影響を与える。これらのブロックは、特徴マップのグローバルな情報埋め込みの自己ゲーティング機構を利用して、チャネルワイドな特徴応答を適応的に補正する。 (II) 特徴伝播を強化し,特徴再利用を促進するため,符号化パスの最後の畳み込み層に密結合した畳み込みを用いる。 (III) U-Netのスキップ接続における単純な結合の代わりに、BConvLSTMをネットワークのすべてのレベルで使用し、対応する符号化パスから抽出された特徴マップと、以前のデコードアップ畳み込みレイヤを非線形に組み合わせる。提案モデルは,isic 2017と2018の6つのデータセット,肺分画,$ph^2$,細胞核分画で評価され,最新性能が得られた。

Medical image segmentation has been very challenging due to the large variation of anatomy across different cases. Recent advances in deep learning frameworks have exhibited faster and more accurate performance in image segmentation. Among the existing networks, U-Net has been successfully applied on medical image segmentation. In this paper, we propose an extension of U-Net for medical image segmentation, in which we take full advantages of U-Net, Squeeze and Excitation (SE) block, bi-directional ConvLSTM (BConvLSTM), and the mechanism of dense convolutions. (I) We improve the segmentation performance by utilizing SE modules within the U-Net, with a minor effect on model complexity. These blocks adaptively recalibrate the channel-wise feature responses by utilizing a self-gating mechanism of the global information embedding of the feature maps. (II) To strengthen feature propagation and encourage feature reuse, we use densely connected convolutions in the last convolutional layer of the encoding path. (III) Instead of a simple concatenation in the skip connection of U-Net, we employ BConvLSTM in all levels of the network to combine the feature maps extracted from the corresponding encoding path and the previous decoding up-convolutional layer in a non-linear way. The proposed model is evaluated on six datasets DRIVE, ISIC 2017 and 2018, lung segmentation, $PH^2$, and cell nuclei segmentation, achieving state-of-the-art performance.

翻訳日:2022-12-24 21:37:23 公開日:2020-03-10

# 深海塩分予測アーキテクチャ

Tidying Deep Saliency Prediction Architectures ( http://arxiv.org/abs/2003.04942v1 )

ライセンス: Link先を確認

Navyasri Reddy, Samyak Jain, Pradeep Yarlagadda, Vineet Gandhi

(参考訳) 視覚注意のための計算モデル(サリエンシー推定)の学習は、機械やロボットを人間の視覚認知能力に近づける努力である。データ駆動の取り組みは、ディープニューラルネットワークアーキテクチャの導入以来、ランドスケープを支配してきた。ディープラーニングの研究において、アーキテクチャ設計の選択はしばしば経験的であり、必要以上に複雑なモデルにつながる。複雑さはアプリケーションの要求を妨げます。本稿では,saliencyモデルの4つのキーコンポーネント,すなわち入力機能,マルチレベル統合,読み出しアーキテクチャ,損失関数について述べる。これら4つの構成要素について,既存の技術モデルについて概観し,新しい,よりシンプルな代替案を提案する。そこで,本稿では,simplenet と mdnsal という2つの新しいエンド・ツー・エンドのアーキテクチャを提案する。 SimpleNetは最適化されたエンコーダ-デコーダアーキテクチャであり、SALICONデータセット(最大の唾液度ベンチマーク)で顕著なパフォーマンス向上をもたらす。 MDNSalは、GMM分布のパラメータを直接予測するパラメトリックモデルであり、予測マップにさらなる解釈可能性をもたらすことを目的としている。提案した精度モデルは25fpsで推定でき、リアルタイムアプリケーションに適している。コードと事前トレーニングされたモデルはhttps://github.com/samyak0210/saliencyで利用可能である。

Learning computational models for visual attention (saliency estimation) is an effort to inch machines/robots closer to human visual cognitive abilities. Data-driven efforts have dominated the landscape since the introduction of deep neural network architectures. In deep learning research, the choices in architecture design are often empirical and frequently lead to more complex models than necessary. The complexity, in turn, hinders the application requirements. In this paper, we identify four key components of saliency models, i.e., input features, multi-level integration, readout architecture, and loss functions. We review the existing state of the art models on these four components and propose novel and simpler alternatives. As a result, we propose two novel end-to-end architectures called SimpleNet and MDNSal, which are neater, minimal, more interpretable and achieve state of the art performance on public saliency benchmarks. SimpleNet is an optimized encoder-decoder architecture and brings notable performance gains on the SALICON dataset (the largest saliency benchmark). MDNSal is a parametric model that directly predicts parameters of a GMM distribution and is aimed to bring more interpretability to the prediction maps. The proposed saliency models can be inferred at 25fps, making them suitable for real-time applications. Code and pre-trained models are available at https://github.com/samyak0210/saliency.

翻訳日:2022-12-24 21:30:56 公開日:2020-03-10

# ラベルのないビデオからビデオオブジェクトのセグメンテーションを学ぶ

Learning Video Object Segmentation from Unlabeled Videos ( http://arxiv.org/abs/2003.05020v1 )

ライセンス: Link先を確認

Xiankai Lu, Wenguan Wang, Jianbing Shen, Yu-Wing Tai, David Crandall, and Steven C. H. Hoi

(参考訳) 本稿では,ビデオオブジェクトセグメンテーション(VOS)の新たな手法を提案する。この手法は,広範囲な注釈付きデータに大きく依存する既存の手法とは異なり,未ラベルビデオからのオブジェクトパターン学習に対処する。複数の粒度で VOS 固有の特性を包括的にキャプチャする,教師なし/弱教師付き学習フレームワーク MuG を導入する。我々のアプローチは、VOSにおける視覚パターンの理解を深め、アノテーションの負担を大幅に軽減するのに役立つ。慎重に設計されたアーキテクチャと強力な表現学習能力により、学習モデルは、オブジェクトレベルのゼロショットVOS、インスタンスレベルのゼロショットVOS、ワンショットVOSなど、多様なVOS設定に適用できる。実験は、これらの設定で有望な性能を示すとともに、ラベルのないデータを利用してセグメント化精度をさらに向上させるmugの可能性を示す。

We propose a new method for video object segmentation (VOS) that addresses object pattern learning from unlabeled videos, unlike most existing methods which rely heavily on extensive annotated data. We introduce a unified unsupervised/weakly supervised learning framework, called MuG, that comprehensively captures intrinsic properties of VOS at multiple granularities. Our approach can help advance understanding of visual patterns in VOS and significantly reduce annotation burden. With a carefully-designed architecture and strong representation learning ability, our learned model can be applied to diverse VOS settings, including object-level zero-shot VOS, instance-level zero-shot VOS, and one-shot VOS. Experiments demonstrate promising performance in these settings, as well as the potential of MuG in leveraging unlabeled data to further improve the segmentation accuracy.

翻訳日:2022-12-24 21:30:35 公開日:2020-03-10

# 顔のグラフィカル認識のためのクロスモーダルマルチタスク学習

Cross-modal Multi-task Learning for Graphic Recognition of Caricature Face ( http://arxiv.org/abs/2003.05787v1 )

ライセンス: Link先を確認

Zuheng Ming, Jean-Christophe Burie, Muhammad Muzzamil Luqman

(参考訳) 写実的な視覚画像の顔認識は、近年、よく研究され、大きな進歩を遂げている。現実的な視覚画像とは異なり、似顔絵の顔認識は視覚画像のパフォーマンスからかけ離れている。これは、顔の特徴を誇張して文字を強めることによってもたらされた似顔絵の極端な非剛性歪みによるものである。似顔絵と視覚画像の不均一性から、似顔絵・視覚画像の認識はクロスモーダル問題である。本稿では,マルチタスク学習による顔画像認識を実現する手法を提案する。タスクの重みを固定した従来のマルチタスク学習よりも,タスクの重要性に応じてタスクの重みを学習するアプローチを提案する。提案した動的タスク重み付きマルチタスク学習は,従来の方法のように過度に学習しやすいタスクに留まらず,難易度と難易度を適切にトレーニングすることができる。提案する動的マルチタスク学習のクロスモーダル・カカチュアル・ビジュアル顔認識における効果を実験的に検証した。 CaVIとWebCaricatureのデータセットのパフォーマンスは、最先端のメソッドよりも優れていることを示している。

Face recognition of realistic visual images has been well studied and made a significant progress in the recent decade. Unlike the realistic visual images, the face recognition of the caricatures is far from the performance of the visual images. This is largely due to the extreme non-rigid distortions of the caricatures introduced by exaggerating the facial features to strengthen the characters. The heterogeneous modalities of the caricatures and the visual images result the caricature-visual face recognition is a cross-modal problem. In this paper, we propose a method to conduct caricature-visual face recognition via multi-task learning. Rather than the conventional multi-task learning with fixed weights of tasks, this work proposes an approach to learn the weights of tasks according to the importance of tasks. The proposed multi-task learning with dynamic tasks weights enables to appropriately train the hard task and easy task instead of being stuck in the over-training easy task as conventional methods. The experimental results demonstrate the effectiveness of the proposed dynamic multi-task learning for cross-modal caricature-visual face recognition. The performances on the datasets CaVI and WebCaricature show the superiority over the state-of-art methods.

翻訳日:2022-12-24 21:29:59 公開日:2020-03-10

# 機能重要度ランキングのためのmatlabツールボックス

A Matlab Toolbox for Feature Importance Ranking ( http://arxiv.org/abs/2003.08737v1 )

ライセンス: Link先を確認

Shaode Yu, Zhicheng Zhang, Xiaokun Liang, Junjie Wu, Erlei Zhang, Wenjian Qin, and Yaoqin Xie

(参考訳) 特に、インテリジェントな診断とパーソナライズド医療のために何千もの特徴を抽出できる場合には、機能重要度ランキング(FIR)により多くの注意が払われている。多数のFIRアプローチが提案されているが、比較や実環境への応用のために統合されているものはほとんどない。本研究では,matlabツールボックスを提示し,合計30のアルゴリズムを収集した。さらに、ツールボックスを163枚の超音波画像のデータベース上で評価する。各乳房病変に対して,15個の特徴を抽出した。分類に最適な特徴のサブセットを明らかにするために、全ての特徴の組み合わせをテストし、超音波画像にアノテートされた病変の悪性度予測にリニアサポートベクターマシンを用いる。最終的に、性能比較に基づいてFIRの有効性を解析する。ツールボックスはオンライン(https://github.com/NicoYuCN/matFIR)である。今後の作業では、より多くのFIRメソッド、特徴選択メソッド、機械学習分類器が統合されます。

More attention is being paid for feature importance ranking (FIR), in particular when thousands of features can be extracted for intelligent diagnosis and personalized medicine. A large number of FIR approaches have been proposed, while few are integrated for comparison and real-life applications. In this study, a matlab toolbox is presented and a total of 30 algorithms are collected. Moreover, the toolbox is evaluated on a database of 163 ultrasound images. To each breast mass lesion, 15 features are extracted. To figure out the optimal subset of features for classification, all combinations of features are tested and linear support vector machine is used for the malignancy prediction of lesions annotated in ultrasound images. At last, the effectiveness of FIR is analyzed according to performance comparison. The toolbox is online (https://github.com/NicoYuCN/matFIR). In our future work, more FIR methods, feature selection methods and machine learning classifiers will be integrated.

翻訳日:2022-12-24 21:29:43 公開日:2020-03-10

# 競合する言語の共存について

On the coexistence of competing languages ( http://arxiv.org/abs/2003.04748v1 )

ライセンス: Link先を確認

Jean-Marc Luck and Anita Mehta

(参考訳) 従来の文献では、結果が常に他の言語よりも1つの言語が支配されていることを示唆している。言語共存は現実的に観察されるため,言語競合の問題を再考し,共存の出現の方法を明らかにすることに注力する。一つの地理的領域における言語話者の人口動態の不均衡に関する第1のシナリオと、異なる地理的領域に言語嗜好が特有な空間的異質性に関連する第2のシナリオである。これらのそれぞれについて、パラダイム的状況の調査は、言語共存につながる条件の定量的理解に繋がる。また,様々なモデルパラメータの関数として,生存言語数の予測も行う。

We investigate the evolution of competing languages, a subject where much previous literature suggests that the outcome is always the domination of one language over all the others. Since coexistence of languages is observed in reality, we here revisit the question of language competition, with an emphasis on uncovering the ways in which coexistence might emerge. We find that this emergence is related to symmetry breaking, and explore two particular scenarios -- the first relating to an imbalance in the population dynamics of language speakers in a single geographical area, and the second to do with spatial heterogeneity, where language preferences are specific to different geographical regions. For each of these, the investigation of paradigmatic situations leads us to a quantitative understanding of the conditions leading to language coexistence. We also obtain predictions of the number of surviving languages as a function of various model parameters.

翻訳日:2022-12-24 21:29:26 公開日:2020-03-10

# 視覚フィードバックを用いたロボット制御のための生成モデル学習

Learning a generative model for robot control using visual feedback ( http://arxiv.org/abs/2003.04474v1 )

ライセンス: Link先を確認

Nishad Gothoskar, Miguel L\'azaro-Gredilla, Abhishek Agarwal, Yasemin Bekiroglu, Dileep George

(参考訳) ロボット制御に視覚フィードバックを取り入れた新しい定式化を提案する。我々は、アクションから、エンドエフェクタの機能のイメージ観察まで、生成モデルを定義する。モデルにおける推論により,特徴のターゲット位置に対応するロボット状態を推測することができる。これにより、ロボットの動きをガイドし、最先端のビジュアルサーボ法よりもはるかに少ないステップで特徴のターゲット位置をマッチングすることができる。本モデルのトレーニング手順により,キネマティクス,特徴構造,カメラパラメータの学習を同時に行うことができる。これは、それを観察するロボット、構造、およびカメラに関する事前情報なしで行うことができる。学習はサンプル効率よく行われ、テストデータに対して強い一般化を示す。フォーミュレーションはモジュール化されているので、カメラやオブジェクトなどのセットアップのコンポーネントを変更して、オンラインで素早く再学習することができます。本手法は,我々が操作するコントローラの観測状態とノイズのノイズを処理できる。本手法は,不正確な制御器を有するロボットに対して把持および密接な挿入を行うことにより,その効果を実証する。

We introduce a novel formulation for incorporating visual feedback in controlling robots. We define a generative model from actions to image observations of features on the end-effector. Inference in the model allows us to infer the robot state corresponding to target locations of the features. This, in turn, guides motion of the robot and allows for matching the target locations of the features in significantly fewer steps than state-of-the-art visual servoing methods. The training procedure for our model enables effective learning of the kinematics, feature structure, and camera parameters, simultaneously. This can be done with no prior information about the robot, structure, and cameras that observe it. Learning is done sample-efficiently and shows strong generalization to test data. Since our formulation is modular, we can modify components of our setup, like cameras and objects, and relearn them quickly online. Our method can handle noise in the observed state and noise in the controllers that we interact with. We demonstrate the effectiveness of our method by executing grasping and tight-fit insertions on robots with inaccurate controllers.

翻訳日:2022-12-24 21:29:15 公開日:2020-03-10

# 帯域限定環境におけるコロボティックビジョンに基づく探索のためのアクティブリワード学習

Active Reward Learning for Co-Robotic Vision Based Exploration in Bandwidth Limited Environments ( http://arxiv.org/abs/2003.05016v1 )

ライセンス: Link先を確認

Stewart Jamieson, Jonathan P. How, Yogesh Girdhar

(参考訳) 我々は,人間の操作者との通信能力に制限があるため,新たな科学的関連画像の収集場所を自律的に決定しなければならないロボットのための新しいPOMDP問題定式化を提案する。この定式化から,このようなロボットの観察モデル,報酬モデル,コミュニケーション戦略に対する制約と設計原則を導出し,非常に高次元の観察空間と関連する訓練データの不足に対処する手法を探求する。提案手法は,ロボットがオンラインの「レグレット」を最小化するためのクエリ作成に基づく,新たな能動的報酬学習戦略を導入し,シミュレーションによる自律的な視覚探索の適性を評価する。帯域制限のある環境では、この新たな後悔に基づく基準により、ロボット探検家は次の最高の基準よりも、1ミッションあたり最大17%の報酬を集めることができる。

We present a novel POMDP problem formulation for a robot that must autonomously decide where to go to collect new and scientifically relevant images given a limited ability to communicate with its human operator. From this formulation we derive constraints and design principles for the observation model, reward model, and communication strategy of such a robot, exploring techniques to deal with the very high-dimensional observation space and scarcity of relevant training data. We introduce a novel active reward learning strategy based on making queries to help the robot minimize path "regret" online, and evaluate it for suitability in autonomous visual exploration through simulations. We demonstrate that, in some bandwidth-limited environments, this novel regret-based criterion enables the robotic explorer to collect up to 17% more reward per mission than the next-best criterion.

翻訳日:2022-12-24 21:29:02 公開日:2020-03-10

# レーティングに基づくハイブリッドベイズネットワークによるアジアハンディキャップサッカーの賭け

Asian Handicap football betting with Rating-based Hybrid Bayesian Networks ( http://arxiv.org/abs/2003.09384v1 )

ライセンス: Link先を確認

Anthony Constantinou

(参考訳) アジア・ハンディキャップ(ah)のサッカー賭け市場は盛んに人気があるが、関連する文献では十分に研究されていない。本稿では,レーティングシステムとハイブリッドベイズネットワークを組み合わせることで,AH賭け市場の予測と評価に特化して開発された最初のモデルを示す。結果はイングランドのプレミアリーグ13シーズンに基づいており、従来の1x2市場と比較される。異なる賭け状況が調べられました a) 平均値と最大値の両方の市場確率 b) 予測された確率と公表された確率の間の決定しきい値 c)再投資と利益の両面での最適化 d) 1x2 及び ah 市場で同等利益を目標にする場合におけるリターンのばらつきがどう変化するかを調査するための簡単なステークス調整。 ah市場は従来の1x2市場の非効率性を共有しているが、興味深い違いと両者の類似性が示されている。

Despite the massive popularity of the Asian Handicap (AH) football betting market, it has not been adequately studied by the relevant literature. This paper combines rating systems with hybrid Bayesian networks and presents the first published model specifically developed for prediction and assessment of the AH betting market. The results are based on 13 English Premier League seasons and are compared to the traditional 1X2 market. Different betting situations have been examined including a) both average and maximum (best available) market odds, b) all possible betting decision thresholds between predicted and published odds, c) optimisations for both return-on-investment and profit, and d) simple stake adjustments to investigate how the variance of returns changes when targeting equivalent profit in both 1X2 and AH markets. While the AH market is found to share the inefficiencies of the traditional 1X2 market, the findings reveal both interesting differences as well as similarities between the two.

翻訳日:2022-12-24 21:28:46 公開日:2020-03-10

# Nested Reduced-Rank Regularizationによる多変量機能回帰

Multivariate Functional Regression via Nested Reduced-Rank Regularization ( http://arxiv.org/abs/2003.04786v1 )

ライセンス: Link先を確認

Xiaokang Liu, Shujie Ma, Kun Chen

(参考訳) 本稿では,多変量関数応答と予測器を備えた回帰モデルにネステッド・レグレッション (nrrr) アプローチを適用し,調整された次元縮小を達成し,結果の関数モデルの解釈/可視化を容易にする。提案手法は,機能回帰面に課される2段階の低ランク構造に基づく。グローバルな低ランク構造は、下層の回帰関係を駆動する潜在主機能応答と予測器の小さなセットを特定する。局所的な低ランク構造は、主機能応答と予測器の関係の複雑さと滑らかさを制御する。基底展開アプローチにより、関数問題は興味深い統合行列近似タスクへと導かれる。そこでは、統合された低ランク行列のブロックまたはサブマトリクスが共通の行空間と/または列空間を共有する。収束保証付き反復アルゴリズムを開発した。我々は,nrrrの一貫性を確立し,非漸近的解析により,低ランク回帰のそれと少なくとも同等の誤差率が得られることを示す。シミュレーション研究はNRRRの有効性を示す。我々は,nrrrを電力需要問題に適用し,1日あたりの電力消費の軌跡と1日あたりの気温の関係を明らかにした。

We propose a nested reduced-rank regression (NRRR) approach in fitting regression model with multivariate functional responses and predictors, to achieve tailored dimension reduction and facilitate interpretation/visualization of the resulting functional model. Our approach is based on a two-level low-rank structure imposed on the functional regression surfaces. A global low-rank structure identifies a small set of latent principal functional responses and predictors that drives the underlying regression association. A local low-rank structure then controls the complexity and smoothness of the association between the principal functional responses and predictors. Through a basis expansion approach, the functional problem boils down to an interesting integrated matrix approximation task, where the blocks or submatrices of an integrated low-rank matrix share some common row space and/or column space. An iterative algorithm with convergence guarantee is developed. We establish the consistency of NRRR and also show through non-asymptotic analysis that it can achieve at least a comparable error rate to that of the reduced-rank regression. Simulation studies demonstrate the effectiveness of NRRR. We apply NRRR in an electricity demand problem, to relate the trajectories of the daily electricity consumption with those of the daily temperatures.

翻訳日:2022-12-24 21:28:32 公開日:2020-03-10

# Deep Blindビデオの超高解像度化

Deep Blind Video Super-resolution ( http://arxiv.org/abs/2003.04716v1 )

ライセンス: Link先を確認

Jinshan Pan, Songsheng Cheng, Jiawei Zhang, Jinhui Tang

(参考訳) 既存のビデオ超解像(SR)アルゴリズムは通常、劣化過程におけるぼやけたカーネルが知られており、復元過程におけるぼやけたカーネルをモデル化していないと仮定する。しかし、この仮定はビデオSRには当てはまらないため、通常は過度に滑らかな超解像につながる。本稿では,ぼかしカーネルモデリング手法を用いてビデオsrを解くための深層畳み込みニューラルネットワーク(cnn)モデルを提案する。提案するディープcnnモデルは, 動きのぼかし推定, 動き推定, 潜在画像復元モジュールで構成される。モーションボケ推定モジュールは、信頼できるボケカーネルを提供するために使用される。推定したぼやけたカーネルを用いて,ビデオSRの画像形成モデルに基づく画像デコンボリューション手法を開発し,画像内容の鮮明な復元を可能にする。しかし、生成された中間潜伏画像にはアーティファクトが含まれている可能性がある。高品質な画像を生成するために,移動推定モジュールを用いて隣接するフレームから情報を探索する。提案アルゴリズムは,より微細な構造情報でより鮮明な画像を生成することができることを示す。実験結果から,提案アルゴリズムは最先端手法に対して好適に動作することが示された。

Existing video super-resolution (SR) algorithms usually assume that the blur kernels in the degradation process are known and do not model the blur kernels in the restoration. However, this assumption does not hold for video SR and usually leads to over-smoothed super-resolved images. In this paper, we propose a deep convolutional neural network (CNN) model to solve video SR by a blur kernel modeling approach. The proposed deep CNN model consists of motion blur estimation, motion estimation, and latent image restoration modules. The motion blur estimation module is used to provide reliable blur kernels. With the estimated blur kernel, we develop an image deconvolution method based on the image formation model of video SR to generate intermediate latent images so that some sharp image contents can be restored well. However, the generated intermediate latent images may contain artifacts. To generate high-quality images, we use the motion estimation module to explore the information from adjacent frames, where the motion estimation can constrain the deep CNN model for better image restoration. We show that the proposed algorithm is able to generate clearer images with finer structural details. Extensive experimental results show that the proposed algorithm performs favorably against state-of-the-art methods.

翻訳日:2022-12-24 21:22:18 公開日:2020-03-10

# 雨のスクリーン:屋内で雨のデータセットを収集

Rainy screens: Collecting rainy datasets, indoors ( http://arxiv.org/abs/2003.04742v1 )

ライセンス: Link先を確認

Horia Porav, Valentina-Nicoleta Musat, Tom Bruls, Paul Newman

(参考訳) 適切な地上の真理の保証や、所望の気象条件との同期が難しいため、ロボット工学における不都合な状況を伴うデータの取得は厄介な作業である。本稿では,既存のクリア・グラウンド・ルース・イメージから多彩な雨画像を生成するための高精細なスクリーンを簡易に記録する手法を提案する。このセットアップにより、セマンティクスセグメンテーションやオブジェクト位置など、補助的なタスク基底データによる既存のデータセットの多様性を活用できます。都市景観とBDDに基づく降雨量と降雨量と実際の付着液滴を用いた降雨画像を生成し,デライニングモデルを訓練する。本稿では,画像再構成とセマンティックセグメンテーションの定量的な結果と,サンプル外領域の定性的な結果を示す。

Acquisition of data with adverse conditions in robotics is a cumbersome task due to the difficulty in guaranteeing proper ground truth and synchronising with desired weather conditions. In this paper, we present a simple method - recording a high resolution screen - for generating diverse rainy images from existing clear ground-truth images that is domain- and source-agnostic, simple and scales up. This setup allows us to leverage the diversity of existing datasets with auxiliary task ground-truth data, such as semantic segmentation, object positions etc. We generate rainy images with real adherent droplets and rain streaks based on Cityscapes and BDD, and train a de-raining model. We present quantitative results for image reconstruction and semantic segmentation, and qualitative results for an out-of-sample domain, showing that models trained with our data generalize well.

翻訳日:2022-12-24 21:21:59 公開日:2020-03-10

# 3次元LiDARデータを用いたオフロード乾燥領域抽出

Off-Road Drivable Area Extraction Using 3D LiDAR Data ( http://arxiv.org/abs/2003.04780v1 )

ライセンス: Link先を確認

Biao Gao, Anran Xu, Yancheng Pan, Xijun Zhao, Wen Yao, Huijing Zhao

(参考訳) 本研究では,3次元lidarデータを用いたオフロード自由領域抽出手法を提案する。特定のディープラーニングフレームワークは、オフロード環境における大きな課題の1つである曖昧な領域に対処するように設計されている。ネットワークトレーニングのための人手によるアノテートデータの需要を大幅に減らすため,大量の車両経路や自動生成障害物ラベルからの情報を利用する。これらの自動生成アノテーションを使用することで、提案されたネットワークは弱い教師付きまたは半教師付きメソッドを使ってトレーニングすることができる。このデータセットの実験は、我々のフレームワークの理性と弱く半教師ありの手法の有効性を示すものである。

We propose a method for off-road drivable area extraction using 3D LiDAR data with the goal of autonomous driving application. A specific deep learning framework is designed to deal with the ambiguous area, which is one of the main challenges in the off-road environment. To reduce the considerable demand for human-annotated data for network training, we utilize the information from vast quantities of vehicle paths and auto-generated obstacle labels. Using these autogenerated annotations, the proposed network can be trained using weakly supervised or semi-supervised methods, which can achieve better performance with fewer human annotations. The experiments on our dataset illustrate the reasonability of our framework and the validity of our weakly and semi-supervised methods.

翻訳日:2022-12-24 21:21:42 公開日:2020-03-10

# SAD:敵対的事例に対する衛生ベースの防衛

SAD: Saliency-based Defenses Against Adversarial Examples ( http://arxiv.org/abs/2003.04820v1 )

ライセンス: Link先を確認

Richard Tran, David Patrick, Michael Geyer, Amanda Fernandez

(参考訳) 機械学習モデルやディープラーニングモデルの人気が高まり、悪意のある入力に対する脆弱性への注目が高まっている。これらの逆の例では、ネットワークの本来の意図からモデル予測を逸脱させ、実践的セキュリティの懸念が高まっている。これらの攻撃に対抗するために、ニューラルネットワークは従来の画像処理アプローチや最先端の防御モデルを利用して、データの摂動を減らすことができる。ノイズ低減にグローバルアプローチを採用する防御アプローチは、敵の攻撃に対して有効であるが、その損失アプローチはしばしば画像内の重要なデータを歪ませる。本研究では, 対人攻撃の影響を受けやすいクリーニングデータに対する視覚的サリエンシに基づくアプローチを提案する。本モデルでは, 対象画像内の損失を相対的に低減しつつ, 対象画像のサルエント領域を有効活用する。攻撃前, 攻撃前, 清掃後において, 最先端の衛生手法の有効性を評価することにより, モデルの精度を評価する。提案手法は,2つのサリエンシーデータセットにまたがって,関連する防御手法や確立された敵対的攻撃手法と比較し,有効性を示す。対象としたアプローチでは,従来のアプローチと最先端のアプローチと比較して,標準統計値と距離塩分値の指標が大幅に改善されている。

With the rise in popularity of machine and deep learning models, there is an increased focus on their vulnerability to malicious inputs. These adversarial examples drift model predictions away from the original intent of the network and are a growing concern in practical security. In order to combat these attacks, neural networks can leverage traditional image processing approaches or state-of-the-art defensive models to reduce perturbations in the data. Defensive approaches that take a global approach to noise reduction are effective against adversarial attacks, however their lossy approach often distorts important data within the image. In this work, we propose a visual saliency based approach to cleaning data affected by an adversarial attack. Our model leverages the salient regions of an adversarial image in order to provide a targeted countermeasure while comparatively reducing loss within the cleaned images. We measure the accuracy of our model by evaluating the effectiveness of state-of-the-art saliency methods prior to attack, under attack, and after application of cleaning methods. We demonstrate the effectiveness of our proposed approach in comparison with related defenses and against established adversarial attack methods, across two saliency datasets. Our targeted approach shows significant improvements in a range of standard statistical and distance saliency metrics, in comparison with both traditional and state-of-the-art approaches.

翻訳日:2022-12-24 21:21:31 公開日:2020-03-10

# PANDA:ギガピクセルレベルの人間中心のビデオデータセット

PANDA: A Gigapixel-level Human-centric Video Dataset ( http://arxiv.org/abs/2003.04852v1 )

ライセンス: Link先を確認

Xueyang Wang, Xiya Zhang, Yinheng Zhu, Yuchen Guo, Xiaoyun Yuan, Liuyu Xiang, Zerun Wang, Guiguang Ding, David J Brady, Qionghai Dai, Lu Fang

(参考訳) 大規模・長期・多目的視覚分析のための,最初のギガPixelレベルのフガン中心のViDeo dAtasetであるPANDAを提示する。 PANDAのビデオはギガピクセルカメラで撮影され、広視野(約1km)と高解像度(〜ギガピクセルレベル/フレーム)の両方で現実世界のシーンをカバーしている。シーンは、100倍以上のスケールの4Kヘッド数を含むことができる。 PANDAは15,974.6kのバウンディングボックス、111.8kの微粒な属性ラベル、12.7kの軌道、2.2kのグループ、2.9kの相互作用を含む、リッチで階層的な基底構造アノテーションを提供する。人間の検出と追跡のタスクをベンチマークします。歩行者のポーズ, スケール, 閉塞, 軌道の多様さから, 既存のアプローチは精度と効率の両面から挑戦されている。広帯域FoVと高解像度のPANDAの特異性を考えると,対話型グループ検出の新たな課題が紹介される。我々は,グローバルトラジェクタと局所的な相互作用を同時にエンコードし,有望な結果をもたらす「グローバルからローカルへのズームイン」フレームワークを設計する。我々はPANDAが、大規模な現実世界のシーンにおける人間の行動や相互作用を理解することによって、人工知能と実践学のコミュニティに貢献すると考えている。 panda webサイト: http://www.panda-dataset.com

We present PANDA, the first gigaPixel-level humAN-centric viDeo dAtaset, for large-scale, long-term, and multi-object visual analysis. The videos in PANDA were captured by a gigapixel camera and cover real-world scenes with both wide field-of-view (~1 square kilometer area) and high-resolution details (~gigapixel-level/frame). The scenes may contain 4k head counts with over 100x scale variation. PANDA provides enriched and hierarchical ground-truth annotations, including 15,974.6k bounding boxes, 111.8k fine-grained attribute labels, 12.7k trajectories, 2.2k groups and 2.9k interactions. We benchmark the human detection and tracking tasks. Due to the vast variance of pedestrian pose, scale, occlusion and trajectory, existing approaches are challenged by both accuracy and efficiency. Given the uniqueness of PANDA with both wide FoV and high resolution, a new task of interaction-aware group detection is introduced. We design a 'global-to-local zoom-in' framework, where global trajectories and local interactions are simultaneously encoded, yielding promising results. We believe PANDA will contribute to the community of artificial intelligence and praxeology by understanding human behaviors and interactions in large-scale real-world scenes. PANDA Website: http://www.panda-dataset.com.

翻訳日:2022-12-24 21:20:48 公開日:2020-03-10

# 乳がん診断のための深層学習アプローチ

Deep learning approach for breast cancer diagnosis ( http://arxiv.org/abs/2003.04480v1 )

ライセンス: Link先を確認

Essam A. Rashed and M. Samir Abou El Seoud

(参考訳) 乳がんは、早期発見時に高いリスクコントロールを持つ世界でも有数の致命的な疾患の一つである。乳房検診の従来の方法はX線マンモグラフィーであり,早期発見が難しいことが知られている。画像の圧縮による乳房の高密度構造は, 微小な異常を認識するのが困難であった。また,乳房組織の異種間および異種間は,手作りの特徴を用いた高い診断精度を達成することが極めて困難である。ディープラーニングは、比較的高い計算能力を必要とする、新しい機械学習技術である。しかし、それは人間の知能のレベルで意思決定を必要とするいくつかの難しいタスクにおいて非常に効果的であることが判明した。本稿では,乳がんを効果的かつ早期に検出できるU-net構造に着想を得た新しいネットワークアーキテクチャを開発する。その結果,臨床応用において提案手法の有用性を示す感度と特異性が高いことが示唆された。

Breast cancer is one of the leading fatal disease worldwide with high risk control if early discovered. Conventional method for breast screening is x-ray mammography, which is known to be challenging for early detection of cancer lesions. The dense breast structure produced due to the compression process during imaging lead to difficulties to recognize small size abnormalities. Also, inter- and intra-variations of breast tissues lead to significant difficulties to achieve high diagnosis accuracy using hand-crafted features. Deep learning is an emerging machine learning technology that requires a relatively high computation power. Yet, it proved to be very effective in several difficult tasks that requires decision making at the level of human intelligence. In this paper, we develop a new network architecture inspired by the U-net structure that can be used for effective and early detection of breast cancer. Results indicate a high rate of sensitivity and specificity that indicate potential usefulness of the proposed approach in clinical use.

翻訳日:2022-12-24 21:14:25 公開日:2020-03-10

# PBRnet: 物体位置決め精度を向上させるためのピラミッド境界ボックスリファインメント

PBRnet: Pyramidal Bounding Box Refinement to Improve Object Localization Accuracy ( http://arxiv.org/abs/2003.04541v1 )

ライセンス: Link先を確認

Li Xiao, Yufan Luo, Chunlong Luo, Lianhe Zhao, Quanshui Fu, Guoqing Yang, Anpeng Huang, Yi Zhao

(参考訳) 近年,粗粒から微粒までの提案を分類・回帰する段階を数段階含む粗粒度フレームワークに着目した物体検出装置が多数開発され,より高精度な検出が徐々に実現されている。特徴ピラミッドネットワーク(FPN)のようなマルチレゾリューションモデルは、異なる解像度の情報を統合し、性能を効果的に改善する。以前の研究でも、ローカライゼーションをさらに改善できることが判明している。 1) より翻訳的な変種であるきめ細かい情報を使用すること 2)地域境界情報により焦点を絞った地域を精錬する。これらの原理に基づき、我々は、粗粒度フレームワークとピラミッド境界箱微細化ネットワーク(PBRnet)という特徴ピラミッド構造を組み合わせることにより、局所化精度を向上させるために、新しい境界改善アーキテクチャを設計した。大規模な実験はMS-COCOデータセット上で行われる。 PBRnetは、FPNやLibra R-CNNに追加すると、約3ポイントのmAP$が大幅に向上する。さらに、カスケードR-CNNを粗大な検出器として扱い、PBRnetの回帰器によってローカライゼーションブランチを置き換えることで、1.5$mAP$の余分な性能向上を実現し、最大5ポイントのmAP$まで性能が向上する。

Many recently developed object detectors focused on coarse-to-fine framework which contains several stages that classify and regress proposals from coarse-grain to fine-grain, and obtains more accurate detection gradually. Multi-resolution models such as Feature Pyramid Network(FPN) integrate information of different levels of resolution and effectively improve the performance. Previous researches also have revealed that localization can be further improved by: 1) using fine-grained information which is more translational variant; 2) refining local areas which is more focused on local boundary information. Based on these principles, we designed a novel boundary refinement architecture to improve localization accuracy by combining coarse-to-fine framework with feature pyramid structure, named as Pyramidal Bounding Box Refinement network(PBRnet), which parameterizes gradually focused boundary areas of objects and leverages lower-level feature maps to extract finer local information when refining the predicted bounding boxes. Extensive experiments are performed on the MS-COCO dataset. The PBRnet brings a significant performance gains by roughly 3 point of $mAP$ when added to FPN or Libra R-CNN. Moreover, by treating Cascade R-CNN as a coarse-to-fine detector and replacing its localization branch by the regressor of PBRnet, it leads an extra performance improvement by 1.5 $mAP$, yielding a total performance boosting by as high as 5 point of $mAP$.

翻訳日:2022-12-24 21:13:48 公開日:2020-03-10

# ハイパースペクトル画像復調のための3次元準リカレントニューラルネットワーク

3D Quasi-Recurrent Neural Network for Hyperspectral Image Denoising ( http://arxiv.org/abs/2003.04547v1 )

ライセンス: Link先を確認

Kaixuan Wei, Ying Fu, Hua Huang

(参考訳) 本稿では,ハイパースペクトル画像(hsi)デノイジングのための交互方向3次元準リカレントニューラルネットワークを提案し,スペクトルに沿った領域知識 -- 構造空間スペクトル相関と大域相関を効果的に組み込む。具体的には、3次元畳み込みを用いてHSIの構造空間-スペクトル相関を抽出し、準再帰プール関数を用いてスペクトルに沿った大域的相関を捉える。さらに,計算コストを増すことなく因果依存性を排除するために,方向の交互構造を導入する。提案モデルは、任意のバンド数でHSIに対する柔軟性を保ちながら、スペクトル依存性をモデル化することができる。 HSI復調に関する大規模な実験は、復元精度と計算時間の両方の観点から、様々な騒音条件下での最先端技術よりも大幅に改善されている。私たちのコードはhttps://github.com/vandermode/qrnn3dで利用可能です。

In this paper, we propose an alternating directional 3D quasi-recurrent neural network for hyperspectral image (HSI) denoising, which can effectively embed the domain knowledge -- structural spatio-spectral correlation and global correlation along spectrum. Specifically, 3D convolution is utilized to extract structural spatio-spectral correlation in an HSI, while a quasi-recurrent pooling function is employed to capture the global correlation along spectrum. Moreover, alternating directional structure is introduced to eliminate the causal dependency with no additional computation cost. The proposed model is capable of modeling spatio-spectral dependency while preserving the flexibility towards HSIs with arbitrary number of bands. Extensive experiments on HSI denoising demonstrate significant improvement over state-of-the-arts under various noise settings, in terms of both restoration accuracy and computation time. Our code is available at https://github.com/Vandermode/QRNN3D.

翻訳日:2022-12-24 21:12:51 公開日:2020-03-10

# クラスごとの注釈付き1点のみに基づく複合運転シーンにおける画素レベルセマンティック学習の実現

Realizing Pixel-Level Semantic Learning in Complex Driving Scenes based on Only One Annotated Pixel per Class ( http://arxiv.org/abs/2003.04671v1 )

ライセンス: Link先を確認

Xi Li, Huimin Ma, Sheng Yi, Yanxian Chen

(参考訳) 弱教師付き条件に基づくセマンティックセグメンテーションタスクは、軽量なラベリングプロセスを実現するために進められている。いくつかのカテゴリのみを含む単純な画像の場合、画像レベルのアノテーションに基づく研究は許容できる性能を達成した。しかし、複雑な場面に直面すると、画像には大量のクラスが含まれているため、画像タグに基づいて視覚的な外観を学ぶことが困難になる。この場合、画像レベルのアノテーションは情報提供に有効ではない。そこで,各カテゴリに1つのアノテートされた画素のみを割り当てるタスクを新たに設定した。より軽量で情報的な条件に基づいて、擬似ラベル生成のための3段階のプロセスが構築され、各カテゴリの最適な特徴表現、画像推論、コンテキストロケーションに基づくリファインメントを段階的に実装する。特に,高レベルセマンティクスと低レベルイメージング機能は,運転場面の各クラスで異なる識別能力を有するため,各カテゴリを「オブジェクト」または「シーン」に分け,その2つのタイプの異なる操作を提供する。さらに、cnnベースの画像間共通意味学習と画像内修正処理を組み合わせたセグメンテーション性能を徐々に向上させるために、交互に反復構造が確立される。 Cityscapesデータセットの実験では、複雑な運転シーン下で弱教師付きセマンティックセマンティックセグメンテーションタスクを解決するための提案手法が実現可能であることが示された。

Semantic segmentation tasks based on weakly supervised condition have been put forward to achieve a lightweight labeling process. For simple images that only include a few categories, researches based on image-level annotations have achieved acceptable performance. However, when facing complex scenes, since image contains a large amount of classes, it becomes difficult to learn visual appearance based on image tags. In this case, image-level annotations are not effective in providing information. Therefore, we set up a new task in which only one annotated pixel is provided for each category. Based on the more lightweight and informative condition, a three step process is built for pseudo labels generation, which progressively implement optimal feature representation for each category, image inference and context-location based refinement. In particular, since high-level semantics and low-level imaging feature have different discriminative ability for each class under driving scenes, we divide each category into "object" or "scene" and then provide different operations for the two types separately. Further, an alternate iterative structure is established to gradually improve segmentation performance, which combines CNN-based inter-image common semantic learning and imaging prior based intra-image modification process. Experiments on Cityscapes dataset demonstrate that the proposed method provides a feasible way to solve weakly supervised semantic segmentation task under complex driving scenes.

翻訳日:2022-12-24 21:10:59 公開日:2020-03-10

# HeatNet: 熱画像を用いたセマンティックセグメンテーションにおける日中ドメインギャップのブリッジ

HeatNet: Bridging the Day-Night Domain Gap in Semantic Segmentation with Thermal Images ( http://arxiv.org/abs/2003.04645v1 )

ライセンス: Link先を確認

Johan Vertens, Jannik Z\"urn, Wolfram Burgard

(参考訳) 学習に基づくセマンティックセグメンテーション手法の大部分は、昼間のシナリオや好ましい照明条件に最適化されている。しかし、現実の運転シナリオは、既存のアプローチの課題である夜間照明やグレアのような有害な環境条件を伴っている。本研究では,日中と夜間に適用可能なマルチモーダル意味セグメンテーションモデルを提案する。この目的のために、RGB画像以外にも、熱画像を活用し、ネットワークをはるかに堅牢にする。我々は、既存の昼間RGBデータセットを活用して、夜間画像の高価なアノテーションを避けるとともに、夜間領域にデータセットの知識を伝達する教師学習アプローチを提案する。さらに,学習した特徴空間の整列化にドメイン適応法を適用し,新しい二段階学習法を提案する。さらに, 自動走行用サーマルデータが不足しているため, 時間同期とRGB熱画像ペアの整列が2万を超える新しいデータセットを提案する。そこで,本研究では,ロバストなサーマルカメラキャリブレーションを実現するための新しいターゲットレスキャリブレーション手法を提案する。中でも,夜間セマンティックセグメンテーションの最先端結果を示すために,我々の新しいデータセットを用いた。

The majority of learning-based semantic segmentation methods are optimized for daytime scenarios and favorable lighting conditions. Real-world driving scenarios, however, entail adverse environmental conditions such as nighttime illumination or glare which remain a challenge for existing approaches. In this work, we propose a multimodal semantic segmentation model that can be applied during daytime and nighttime. To this end, besides RGB images, we leverage thermal images, making our network significantly more robust. We avoid the expensive annotation of nighttime images by leveraging an existing daytime RGB-dataset and propose a teacher-student training approach that transfers the dataset's knowledge to the nighttime domain. We further employ a domain adaptation method to align the learned feature spaces across the domains and propose a novel two-stage training scheme. Furthermore, due to a lack of thermal data for autonomous driving, we present a new dataset comprising over 20,000 time-synchronized and aligned RGB-thermal image pairs. In this context, we also present a novel target-less calibration method that allows for automatic robust extrinsic and intrinsic thermal camera calibration. Among others, we employ our new dataset to show state-of-the-art results for nighttime semantic segmentation.

翻訳日:2022-12-24 21:04:56 公開日:2020-03-10

# 手術用ジェスチャ認識と進捗予測のためのマルチタスクリカレントニューラルネットワーク

Multi-Task Recurrent Neural Network for Surgical Gesture Recognition and Progress Prediction ( http://arxiv.org/abs/2003.04772v1 )

ライセンス: Link先を確認

Beatrice van Amsterdam, Matthew J. Clarkson, Danail Stoyanov

(参考訳) 手術用ジェスチャー認識は手術用データサイエンスおよびコンピュータ支援介入において重要である。ロボティックキネマティックな情報であっても、手術手順を自動的に分割することは、手術のデモがスタイル、持続時間、行動の順序において高い変動性によって特徴づけられるため、多くの課題を生じさせる。運動信号から識別的特徴を抽出し,認識精度を高めるために,手術動作の同時認識と手術進行の新たな定式化を行うマルチタスクリカレントニューラルネットワークを提案する。提案手法の有効性を示すため,ロボットキネマティックデータを用いた外科的ジェスチャー認識用データセットとして現在唯一公開されているJIGSAWSデータセットについて,その適用性を評価する。マルチタスクフレームワークでは,手動ラベリングやトレーニングを伴わずに,進捗推定による認識性能が向上することが実証された。

Surgical gesture recognition is important for surgical data science and computer-aided intervention. Even with robotic kinematic information, automatically segmenting surgical steps presents numerous challenges because surgical demonstrations are characterized by high variability in style, duration and order of actions. In order to extract discriminative features from the kinematic signals and boost recognition accuracy, we propose a multi-task recurrent neural network for simultaneous recognition of surgical gestures and estimation of a novel formulation of surgical task progress. To show the effectiveness of the presented approach, we evaluate its application on the JIGSAWS dataset, that is currently the only publicly available dataset for surgical gesture recognition featuring robot kinematic data. We demonstrate that recognition performance improves in multi-task frameworks with progress estimation without any additional manual labelling and training.

翻訳日:2022-12-24 21:04:08 公開日:2020-03-10

# 形状変形のための小型分光ディスクリプタ

A Compact Spectral Descriptor for Shape Deformations ( http://arxiv.org/abs/2003.08758v1 )

ライセンス: Link先を確認

Skylar Sible, Rodrigo Iza-Teran, Jochen Garcke, Nikola Aulig, Patricia Wollstadt

(参考訳) 工学領域における現代の製品設計は、有限要素に基づくシミュレーション、計算最適化、機械学習のような現代的なデータ分析技術を含む計算分析によってますます推進されている。これらの手法を適用するには、開発中のコンポーネントや関連する設計基準に適したデータ表現が必要となる。コンポーネントの幾何学は一般にポリゴン表面メッシュで表されるが、効率的な計算解析を実現するために重要な設計特性をどのようにパラメトリズするかは明確ではない。本研究では,自動車の衝突挙動を最適化する場合など,多くの応用分野において重要な設計基準となる応力下での部品の塑性変形挙動をパラメータ化するための新しい手法を提案する。既存のパラメータ化は計算解析を比較的単純な変形に制限し、一般に専門家による広範な入力を必要とし、設計プロセスは集中的でコストがかかる。そこで本研究では, スペクトルメッシュ処理に基づく変形挙動のコンパクトな記述子を導出し, 同様に複雑な変形の低次元表現を可能にする手法を提案する。提案するディスクリプタは, 幾何学的変形挙動のパラメトリゼーションに対する新しいアプローチを提供し, 塑性変形挙動に関する工学的課題に対する機械学習などの最先端データ解析技術の利用を可能にする。

Modern product design in the engineering domain is increasingly driven by computational analysis including finite-element based simulation, computational optimization, and modern data analysis techniques such as machine learning. To apply these methods, suitable data representations for components under development as well as for related design criteria have to be found. While a component's geometry is typically represented by a polygon surface mesh, it is often not clear how to parametrize critical design properties in order to enable efficient computational analysis. In the present work, we propose a novel methodology to obtain a parameterization of a component's plastic deformation behavior under stress, which is an important design criterion in many application domains, for example, when optimizing the crash behavior in the automotive context. Existing parameterizations limit computational analysis to relatively simple deformations and typically require extensive input by an expert, making the design process time intensive and costly. Hence, we propose a way to derive a compact descriptor of deformation behavior that is based on spectral mesh processing and enables a low-dimensional representation of also complex deformations.We demonstrate the descriptor's ability to represent relevant deformation behavior by applying it in a nearest-neighbor search to identify similar simulation results in a filtering task. The proposed descriptor provides a novel approach to the parametrization of geometric deformation behavior and enables the use of state-of-the-art data analysis techniques such as machine learning to engineering tasks concerned with plastic deformation behavior.

翻訳日:2022-12-24 21:03:24 公開日:2020-03-10

# 機械読解ゴールド標準の評価フレームワーク

A Framework for Evaluation of Machine Reading Comprehension Gold Standards ( http://arxiv.org/abs/2003.04642v1 )

ライセンス: Link先を確認

Viktor Schlegel, Marco Valentino, Andr\'e Freitas, Goran Nenadic, Riza Batista-Navarro

(参考訳) 機械読解(英語: Machine Reading Comprehension、MRC)とは、1段落の文章で質問に答える作業である。ニューラルMCCシステムは人気を博し、顕著な性能を達成する一方で、それらの性能を確立するために使用される方法論、特にそれらの評価に使用される金の標準のデータ設計に関して問題が提起されている。このデータに存在する課題について、限られた理解しかできないため、比較を引いて信頼できる仮説を定式化することは困難である。本稿では,この問題を解消するための第一歩として,現在の言語的特徴,必要な推論と背景知識,事実的正確性,そして語彙的手がかりの存在を,理解要件の下限として体系的に検討するための統一的枠組みを提案する。本稿では,第1の定性的なアノテーションスキーマと後者の近似指標のセットを提案する。本フレームワークの第一の応用として, 現代のMRCゴールド標準を分析し, 語彙的曖昧性に寄与する特徴の欠如, 期待する回答の様々な事実的正しさ, 語彙的手がかりの存在などについて述べる。

Machine Reading Comprehension (MRC) is the task of answering a question over a paragraph of text. While neural MRC systems gain popularity and achieve noticeable performance, issues are being raised with the methodology used to establish their performance, particularly concerning the data design of gold standards that are used to evaluate them. There is but a limited understanding of the challenges present in this data, which makes it hard to draw comparisons and formulate reliable hypotheses. As a first step towards alleviating the problem, this paper proposes a unifying framework to systematically investigate the present linguistic features, required reasoning and background knowledge and factual correctness on one hand, and the presence of lexical cues as a lower bound for the requirement of understanding on the other hand. We propose a qualitative annotation schema for the first and a set of approximative metrics for the latter. In a first application of the framework, we analyse modern MRC gold standards and present our findings: the absence of features that contribute towards lexical ambiguity, the varying factual correctness of the expected answers and the presence of lexical cues, all of which potentially lower the reading comprehension complexity and quality of the evaluation data.

翻訳日:2022-12-24 21:02:40 公開日:2020-03-10

# デュアルセンスエンコーダを用いた効率的なインテント検出

Efficient Intent Detection with Dual Sentence Encoders ( http://arxiv.org/abs/2003.04807v1 )

ライセンス: Link先を確認

I\~nigo Casanueva, Tadas Tem\v{c}inas, Daniela Gerz, Matthew Henderson, Ivan Vuli\'c

(参考訳) 新しいドメインと追加機能で会話システムを構築するには、低データ状態下で動くリソース効率のモデルが必要となる。これらの要件により、USEやConveRTのような事前訓練された二重文エンコーダによるインテント検出手法を導入する。提案するインテント検出器の有用性と幅広い適用性を示す。 1 目的検出装置は、完全なBERTラージモデルを微調整し、又は三種類の目的検出データセットの固定ブラックボックスエンコーダとしてBERTを使用する。 2 利得は、特に少額の設定で発音される(すなわち、意図ごとの注記例が10又は30件のみである)。 3)我々の意図検出器は,1つのcpu上で数分で訓練することができる。 4) 異なるハイパーパラメータ設定で安定している。意図検出に焦点をあてた研究の促進と民主化を期待し、コードをリリースし、77以上のインテントに注釈付き例を含む、新たな挑戦的な1ドメインインテント検出データセットをリリースします。

Building conversational systems in new domains and with added functionality requires resource-efficient models that work under low-data regimes (i.e., in few-shot setups). Motivated by these requirements, we introduce intent detection methods backed by pretrained dual sentence encoders such as USE and ConveRT. We demonstrate the usefulness and wide applicability of the proposed intent detectors, showing that: 1) they outperform intent detectors based on fine-tuning the full BERT-Large model or using BERT as a fixed black-box encoder on three diverse intent detection data sets; 2) the gains are especially pronounced in few-shot setups (i.e., with only 10 or 30 annotated examples per intent); 3) our intent detectors can be trained in a matter of minutes on a single CPU; and 4) they are stable across different hyperparameter settings. In hope of facilitating and democratizing research focused on intention detection, we release our code, as well as a new challenging single-domain intent detection dataset comprising 13,083 annotated examples over 77 intents.

翻訳日:2022-12-24 21:02:20 公開日:2020-03-10

# multi-simlex:多言語・言語間意味類似性の大規模評価

Multi-SimLex: A Large-Scale Evaluation of Multilingual and Cross-Lingual Lexical Semantic Similarity ( http://arxiv.org/abs/2003.04866v1 )

ライセンス: Link先を確認

Ivan Vuli\'c, Simon Baker, Edoardo Maria Ponti, Ulla Petti, Ira Leviant, Kelly Wing, Olga Majewska, Eden Bar, Matt Malone, Thierry Poibeau, Roi Reichart, Anna Korhonen

(参考訳) 大規模な語彙資源と評価ベンチマークであるMulti-SimLexを導入し、主要な言語(中国語、スペイン語、ロシア語など)や低リソースの言語(ウェールズ語、キスワヒリ語など)を含む、12の類型的に多様な言語のデータセットをカバーした。各言語データセットは、意味的類似性の語彙的関係に注釈付けされ、1,888の意味的整合概念ペアを含み、単語クラス(名詞、動詞、形容詞、副詞)、頻度ランク、類似度間隔、語彙フィールド、具体性レベルを代表的にカバーする。さらに、言語間の概念のアラインメントにより、66の言語間の意味的類似性データセットを提供する。広範にわたるサイズと言語カバレッジのため、マルチsimlexは実験的な評価と分析のための全く新しい機会を提供する。モノリンガルおよびクロスリンガルのベンチマークでは,静的および文脈化された単語埋め込み(fastText, M-BERT, XLM など)や外部情報による語彙表現,さらには完全に教師のない(弱く)教師付き言語間単語埋め込みなど,最新のモノリンガルおよびクロスリンガル表現モデルの評価と解析を行った。また、追加言語のための一貫性のあるマルチシンプレックススタイルのリソースを作成するためのステップバイステップのデータセット生成プロトコルを提案する。我々は、これらの貢献 -- マルチsimlexデータセットのパブリックリリース、それらの作成プロトコル、強力なベースライン結果、そして多言語語彙意味論と表現学習の将来の発展を導くのに役立つ深い分析 -- を、コミュニティがより多くの言語にマルチsimlexをさらに拡張するための努力を促すwebサイトを通じて提供します。このような大規模セマンティックリソースは、言語間でのNLPのさらなる進歩を引き起こす可能性がある。

We introduce Multi-SimLex, a large-scale lexical resource and evaluation benchmark covering datasets for 12 typologically diverse languages, including major languages (e.g., Mandarin Chinese, Spanish, Russian) as well as less-resourced ones (e.g., Welsh, Kiswahili). Each language dataset is annotated for the lexical relation of semantic similarity and contains 1,888 semantically aligned concept pairs, providing a representative coverage of word classes (nouns, verbs, adjectives, adverbs), frequency ranks, similarity intervals, lexical fields, and concreteness levels. Additionally, owing to the alignment of concepts across languages, we provide a suite of 66 cross-lingual semantic similarity datasets. Due to its extensive size and language coverage, Multi-SimLex provides entirely novel opportunities for experimental evaluation and analysis. On its monolingual and cross-lingual benchmarks, we evaluate and analyze a wide array of recent state-of-the-art monolingual and cross-lingual representation models, including static and contextualized word embeddings (such as fastText, M-BERT and XLM), externally informed lexical representations, as well as fully unsupervised and (weakly) supervised cross-lingual word embeddings. We also present a step-by-step dataset creation protocol for creating consistent, Multi-Simlex-style resources for additional languages. We make these contributions -- the public release of Multi-SimLex datasets, their creation protocol, strong baseline results, and in-depth analyses which can be be helpful in guiding future developments in multilingual lexical semantics and representation learning -- available via a website which will encourage community effort in further expansion of Multi-Simlex to many more languages. Such a large-scale semantic resource could inspire significant further advances in NLP across languages.

翻訳日:2022-12-24 21:02:00 公開日:2020-03-10

# 制約プログラミングによる道路利用者追跡

Tracking Road Users using Constraint Programming ( http://arxiv.org/abs/2003.04468v1 )

ライセンス: Link先を確認

Alexandre Pineault, Guillaume-Alexandre Bilodeau, Gilles Pesant

(参考訳) 本稿では,都市景観における道路利用者の追跡を改善することを目的とする。本稿では,マルチオブジェクトトラッキング(MOT)問題のトラッキング・バイ・検出パラダイムに見られるデータアソシエーションフェーズに対する制約プログラミング(CP)アプローチを提案する。このようなアプローチは、グラフベースの手法よりも効率的にデータ関連問題を解決でき、複数のフレームを分析した時に発生する組合せ爆発をよりうまく扱うことができる。データアソシエーションの問題に焦点が当てられているため、MOT法では各フレームの中心位置と色である単純な画像特徴のみを用いる。制約はこれらの2つの特徴と一般的なMOT問題に基づいて定義される。例えば、軌跡に対して色覚の保存を強制し、フレーム間の動きの程度を制限する。フィルタ層は、CPを使用する前に検出候補を除去し、CPソルバが生成するダミー軌道を除去するために用いられる。提案手法は車両追跡データを用いてテストし,UA-DETRACベンチマークの上位手法よりも優れた結果を得た。

In this paper, we aim at improving the tracking of road users in urban scenes. We present a constraint programming (CP) approach for the data association phase found in the tracking-by-detection paradigm of the multiple object tracking (MOT) problem. Such an approach can solve the data association problem more efficiently than graph-based methods and can handle better the combinatorial explosion occurring when multiple frames are analyzed. Because our focus is on the data association problem, our MOT method only uses simple image features, which are the center position and color of detections for each frame. Constraints are defined on these two features and on the general MOT problem. For example, we enforce color appearance preservation over trajectories and constrain the extent of motion between frames. Filtering layers are used in order to eliminate detection candidates before using CP and to remove dummy trajectories produced by the CP solver. Our proposed method was tested on a motorized vehicles tracking dataset and produces results that outperform the top methods of the UA-DETRAC benchmark.

翻訳日:2022-12-24 21:01:23 公開日:2020-03-10

# ディープリカレントオートエンコーダを用いた蜂の異常検出

Anomaly Detection in Beehives using Deep Recurrent Autoencoders ( http://arxiv.org/abs/2003.04576v1 )

ライセンス: Link先を確認

Padraig Davidson, Michael Steininger, Florian Lautenschlager, Konstantin Kobs, Anna Krause and Andreas Hotho

(参考訳) 精密ビーキーピングは、ハチにセンサーを装着することで、ハチの生活状態をモニタリングする。これらのハイブによって記録されたデータは、機械学習モデルによって分析され、ミツバチコロニーにおける異常な事象の行動パターンを学習または探索することができる。典型的なターゲットの1つは、経済的理由からミツバチの群れを早期発見することである。先進的な方法は、ハチの病気や技術上の理由、例えばセンサーの故障に起因する他の異常または異常な行動を検出することができる。本稿では,その起源に依存しないデータの任意の種類の異常を検出する深層学習モデルであるオートエンコーダを提案する。我々のモデルは、単純なルールベースのSwarm検出アルゴリズムと同一のSwarmを明らかにすることができるが、他の異常によっても引き起こされる。我々は,異なるヒブと異なるセンサーで収集した実世界のデータセットを用いて,我々のモデルを評価した。

Precision beekeeping allows to monitor bees' living conditions by equipping beehives with sensors. The data recorded by these hives can be analyzed by machine learning models to learn behavioral patterns of or search for unusual events in bee colonies. One typical target is the early detection of bee swarming as apiarists want to avoid this due to economical reasons. Advanced methods should be able to detect any other unusual or abnormal behavior arising from illness of bees or from technical reasons, e.g. sensor failure. In this position paper we present an autoencoder, a deep learning model, which detects any type of anomaly in data independent of its origin. Our model is able to reveal the same swarms as a simple rule-based swarm detection algorithm but is also triggered by any other anomaly. We evaluated our model on real world data sets that were collected on different hives and with different sensor setups.

翻訳日:2022-12-24 20:55:05 公開日:2020-03-10

# データ駆動意思決定におけるグループフェアネスの複数の指標に対処する

Addressing multiple metrics of group fairness in data-driven decision making ( http://arxiv.org/abs/2003.04794v1 )

ライセンス: Link先を確認

Marius Miron, Song\"ul Tolan, Emilia G\'omez, Carlos Castillo

(参考訳) 機械学習(fat-ml)文献における公平性、説明可能性、透明性は、性別や人種などの保護された特徴によって特徴付けられる社会デミックグループに対する差別を測定するために、様々な集団フェアネス指標を提案する。私たちは、これらのメトリクスのいくつかが、同じグループと機械学習の方法のために、2、3つのメインクラスタにまとめられるのを観察し、経験的にそうしている。さらに,グループフェアネス尺度の主成分分析(PCA)を用いて,2次元の多次元フェアネスを可視化する頑健な手法を提案する。複数のデータセットに対する実験結果から,PCA分解では測定値のばらつきを1～3成分で説明できることがわかった。

The Fairness, Accountability, and Transparency in Machine Learning (FAT-ML) literature proposes a varied set of group fairness metrics to measure discrimination against socio-demographic groups that are characterized by a protected feature, such as gender or race.Such a system can be deemed as either fair or unfair depending on the choice of the metric. Several metrics have been proposed, some of them incompatible with each other.We do so empirically, by observing that several of these metrics cluster together in two or three main clusters for the same groups and machine learning methods. In addition, we propose a robust way to visualize multidimensional fairness in two dimensions through a Principal Component Analysis (PCA) of the group fairness metrics. Experimental results on multiple datasets show that the PCA decomposition explains the variance between the metrics with one to three components.

翻訳日:2022-12-24 20:54:49 公開日:2020-03-10

# ブートストラップによるSketched SVDの誤差推定

Error Estimation for Sketched SVD via the Bootstrap ( http://arxiv.org/abs/2003.04937v1 )

ライセンス: Link先を確認

Miles E. Lopes and N. Benjamin Erichson and Michael W. Mahoney

(参考訳) 非常に大きな行列の特異値分解(SVD)に対する高速な近似を計算するために、ランダム化されたスケッチアルゴリズムが主要なアプローチとなっている。しかし,SVDをスケッチする上で重要な難しさは,スケッチした特異ベクトル/値と正確な値との距離がわからない点である。実際、ユーザは与えられた問題のユニークな構造を考慮しない分析的な最悪のエラー境界に頼らざるを得ない。結果として、エラー推定ツールの欠如は、本当に必要なものよりもはるかに多くの計算につながることが多い。これらの課題を克服するため,本稿では,スケッチした特異ベクトル/値の実際の誤差を数値的に推定する,データ駆動ブートストラップ法を開発した。特にこれは、ユーザが粗い初期スケッチされたsvdの品質を検査し、所定のエラー許容度に達するのに必要な余分な作業量を適応的に予測することを可能にする。さらに、この方法は、スケッチされたオブジェクトのみで動作し、分解される全行列をパスする必要がなくなるため、計算量的に安価である。最後に、この手法は理論的な保証と非常に奨励的な実験結果によって支持される。

In order to compute fast approximations to the singular value decompositions (SVD) of very large matrices, randomized sketching algorithms have become a leading approach. However, a key practical difficulty of sketching an SVD is that the user does not know how far the sketched singular vectors/values are from the exact ones. Indeed, the user may be forced to rely on analytical worst-case error bounds, which do not account for the unique structure of a given problem. As a result, the lack of tools for error estimation often leads to much more computation than is really necessary. To overcome these challenges, this paper develops a fully data-driven bootstrap method that numerically estimates the actual error of sketched singular vectors/values. In particular, this allows the user to inspect the quality of a rough initial sketched SVD, and then adaptively predict how much extra work is needed to reach a given error tolerance. Furthermore, the method is computationally inexpensive, because it operates only on sketched objects, and it requires no passes over the full matrix being factored. Lastly, the method is supported by theoretical guarantees and a very encouraging set of experimental results.

翻訳日:2022-12-24 20:54:10 公開日:2020-03-10

# 熱帯低気圧に対するベイズ間隔の予測

Prediction of Bayesian Intervals for Tropical Storms ( http://arxiv.org/abs/2003.05024v1 )

ライセンス: Link先を確認

Max Chiswick and Sam Ganzfried

(参考訳) リカレントニューラルネットワーク(RNN)を用いたハリケーンの軌道予測に関する最近の研究に基づいて,改良手法を開発し,単純な点推定に加えてベイズ区間の予測手法を一般化した。熱帯の嵐は深刻な被害を引き起こす可能性があるため、その軌道を正確に予測することは、特に気候変動の影響により、都市や生活に大きな利益をもたらす可能性がある。 RNNにおける降雨量を用いたベイズ区間の実施により, 降水地域を推定するなど, 予測の動作性の向上が図られる。我々は嵐の軌跡を6時間間隔で予測するためにRNNを使用した。大西洋で約500の熱帯嵐の統計ハリケーン強度予測スキーム(SHIPS)データから,緯度,経度,風速,気圧の特徴を抽出した。結果は,ニューラルネットワークのドロップアウト値が予測と間隔にどのように影響するかを示す。

Building on recent research for prediction of hurricane trajectories using recurrent neural networks (RNNs), we have developed improved methods and generalized the approach to predict Bayesian intervals in addition to simple point estimates. Tropical storms are capable of causing severe damage, so accurately predicting their trajectories can bring significant benefits to cities and lives, especially as they grow more intense due to climate change effects. By implementing the Bayesian interval using dropout in an RNN, we improve the actionability of the predictions, for example by estimating the areas to evacuate in the landfall region. We used an RNN to predict the trajectory of the storms at 6-hour intervals. We used latitude, longitude, windspeed, and pressure features from a Statistical Hurricane Intensity Prediction Scheme (SHIPS) dataset of about 500 tropical storms in the Atlantic Ocean. Our results show how neural network dropout values affect predictions and intervals.

翻訳日:2022-12-24 20:53:51 公開日:2020-03-10

# 頂点時間自己回帰モデルを用いたグラフ上の適応信号処理法

Methods of Adaptive Signal Processing on Graphs Using Vertex-Time Autoregressive Models ( http://arxiv.org/abs/2003.05729v1 )

ライセンス: Link先を確認

Thiernithi Variddhisai, Danilo Mandic

(参考訳) ランダムプロセスの概念は、最近グラフ信号に拡張され、ランダムグラフプロセスは、係数が \textit{graph-topological} 構造を持つ行列である多変量確率過程のクラスである。したがって、ランダムグラフプロセスのシステム同定問題は、その基礎となるトポロジーを決定すること、または数学的にグラフシフト演算子(gsos)、すなわち隣接行列やラプラシアン行列を決定することで解決される。ランダムグラフ処理を導入したのと同じ研究で、gso の解法である \textit{batch} の最適化手法が \textit{causal} 頂点時間自己回帰モデルに基づくランダムグラフプロセスに対して提案されている。この目的のために,適応フィルタリングの枠組みを用いて,最適化問題のオンライン版を提案した。修正確率勾配投影法は, 正規化最小二乗の目的に応用し, フィルタを試作した。再帰は3つの正規化サブプロブレムに分けられ、多重凸性、疎性、可換性、バイアスといった問題に対処する。収束分析に関する議論も含んでいる。最後に,提案アルゴリズムの性能を,従来のMSE測度から,正しい値に拘わらず良好な回復率まで,その可能性,限界,および本研究の可能性に光を当てる実験を行った。

The concept of a random process has been recently extended to graph signals, whereby random graph processes are a class of multivariate stochastic processes whose coefficients are matrices with a \textit{graph-topological} structure. The system identification problem of a random graph process therefore revolves around determining its underlying topology, or mathematically, the graph shift operators (GSOs) i.e. an adjacency matrix or a Laplacian matrix. In the same work that introduced random graph processes, a \textit{batch} optimization method to solve for the GSO was also proposed for the random graph process based on a \textit{causal} vertex-time autoregressive model. To this end, the online version of this optimization problem was proposed via the framework of adaptive filtering. The modified stochastic gradient projection method was employed on the regularized least squares objective to create the filter. The recursion is divided into 3 regularized sub-problems to address issues like multi-convexity, sparsity, commutativity and bias. A discussion on convergence analysis is also included. Finally, experiments are conducted to illustrate the performance of the proposed algorithm, from traditional MSE measure to successful recovery rate regardless correct values, all of which to shed light on the potential, the limit and the possible research attempt of this work.

翻訳日:2022-12-24 20:53:37 公開日:2020-03-10

# 機械学習による電力グリッド内のCO2排出強度の短期予測

Short-Term Forecasting of CO2 Emission Intensity in Power Grids by Machine Learning ( http://arxiv.org/abs/2003.05740v1 )

ライセンス: Link先を確認

Kenneth Leerbeck and Peder Bacher and Rune Junker and Goran Goranovi\'c and Olivier Corradi and Razgar Ebrahimy and Anna Tveit and Henrik Madsen

(参考訳) デンマークの入札ゾーンDK2における電力グリッドのCO2排出強度を予測し、平均と限界の排出量を区別する機械学習アルゴリズムを開発した。この分析は、電力生産、需要、輸入、気象条件など、選択された近隣地域から収集された膨大な数(473)の説明変数からなるデータセット上で行われた。この数は、lasso (penalized linear regression analysis) と前方特徴選択アルゴリズムの両方を用いて50未満に削減された。データの異なる側面(非線形性や変数の結合など)を捉えた3つの線形回帰モデルを作成し,ソフトマックス重み付き平均を用いて最終モデルに組み合わせた。残差を補正するために実装された脱バイアスおよび自己回帰移動平均モデル(ARIMA)に対してクロスバリデーションを行い、最終モデルは外因性入力(ARIMAX)を持つ変種とする。対応する不確実性の予測は6時間以下と2つの時間軸で与えられる。限界放射はdk2ゾーンのあらゆる条件とは独立に発生し、限界発生器は近隣のゾーンにあることを示唆している。開発手法は欧州電力網の入札ゾーンに適用でき、このゾーンに関する詳細な知識を必要とせずに適用できる。

A machine learning algorithm is developed to forecast the CO2 emission intensities in electrical power grids in the Danish bidding zone DK2, distinguishing between average and marginal emissions. The analysis was done on data set comprised of a large number (473) of explanatory variables such as power production, demand, import, weather conditions etc. collected from selected neighboring zones. The number was reduced to less than 50 using both LASSO (a penalized linear regression analysis) and a forward feature selection algorithm. Three linear regression models that capture different aspects of the data (non-linearities and coupling of variables etc.) were created and combined into a final model using Softmax weighted average. Cross-validation is performed for debiasing and autoregressive moving average model (ARIMA) implemented to correct the residuals, making the final model the variant with exogenous inputs (ARIMAX). The forecasts with the corresponding uncertainties are given for two time horizons, below and above six hours. Marginal emissions came up independent of any conditions in the DK2 zone, suggesting that the marginal generators are located in the neighbouring zones. The developed methodology can be applied to any bidding zone in the European electricity network without requiring detailed knowledge about the zone.

翻訳日:2022-12-24 20:53:13 公開日:2020-03-10

# 創始者理論の解明に向けて

Towards Clarifying the Theory of the Deconfounder ( http://arxiv.org/abs/2003.04948v1 )

ライセンス: Link先を確認

Yixin Wang, David M. Blei

(参考訳) Wang and Blei (2019) は複数の因果推論を研究し、デコンファウンデーションアルゴリズムを提案する。理論的要件を論じ,実証的研究を行う。創始者理論に関するいくつかの改良が提案されている。このうち、今井と江は「観測されていない単一原因の共同設立者」という仮定を明確にした。それらの仮定を用いて、本論文は理論を明確にする。さらに、ogburn et al. (2020) はこの理論の反例を提案する。しかし、提案された反例は要求された仮定を満たさない。

Wang and Blei (2019) studies multiple causal inference and proposes the deconfounder algorithm. The paper discusses theoretical requirements and presents empirical studies. Several refinements have been suggested around the theory of the deconfounder. Among these, Imai and Jiang clarified the assumption of "no unobserved single-cause confounders." Using their assumption, this paper clarifies the theory. Furthermore, Ogburn et al. (2020) proposes counterexamples to the theory. But the proposed counterexamples do not satisfy the required assumptions.

翻訳日:2022-12-24 20:45:57 公開日:2020-03-10

# ステッカーで応答する学習:マルチターンダイアログにおけるマルチモーダルの統合フレームワーク

Learning to Respond with Stickers: A Framework of Unifying Multi-Modality in Multi-Turn Dialog ( http://arxiv.org/abs/2003.04679v1 )

ライセンス: Link先を確認

Shen Gao, Xiuying Chen, Chang Liu, Li Liu, Dongyan Zhao and Rui Yan

(参考訳) オンラインメッセージングアプリでは、鮮やかで魅力的な表情のステッカーが人気を集めており、ステッカーのテキストラベルと以前の発話をマッチさせることで、ステッカー応答を自動的に選択する作業もある。しかし、その量が多いため、すべてのステッカーにテキストラベルを必要とするのは現実的ではない。そこで本稿では,外部ラベルを使わずに,複数ターンのダイアログコンテキスト履歴に基づいて適切なステッカーをユーザに推奨する。この課題には2つの大きな課題がある。 1つは、対応するテキストラベルなしでステッカーの意味を学ぶことである。もう一つの課題は、マルチターンダイアログコンテキストで候補ステッカーを共同でモデル化することである。これらの課題に対処するために、ステッカー応答セレクタ(SRS)モデルを提案する。具体的には、まず、畳み込みベースのステッカー画像エンコーダとセルフアテンションベースのマルチターンダイアログエンコーダを使用して、ステッカーと発話の表現を得る。次に,対話履歴内の各発話とステッカー間の深いマッチングを行うために,ディープインタラクションネットワークを提案する。次にsrsは、フュージョンネットワークによってすべてのインタラクション結果間の短期的および長期的な依存関係を学び、最終マッチングスコアを出力する。提案手法を評価するために,最も人気のあるオンラインチャットプラットフォームの1つであるステッカーを用いた大規模実世界のダイアログデータセットを収集した。このデータセットで行った大規模な実験により、我々のモデルは、一般的に使用されるすべてのメトリクスに対して最先端のパフォーマンスを達成することが示された。実験はまた、SRSの各コンポーネントの有効性を検証する。ステッカー選択フィールドのさらなる研究を容易にするため,このデータセットを340Kマルチターンダイアログとステッカーペアでリリースする。

Stickers with vivid and engaging expressions are becoming increasingly popular in online messaging apps, and some works are dedicated to automatically select sticker response by matching text labels of stickers with previous utterances. However, due to their large quantities, it is impractical to require text labels for the all stickers. Hence, in this paper, we propose to recommend an appropriate sticker to user based on multi-turn dialog context history without any external labels. Two main challenges are confronted in this task. One is to learn semantic meaning of stickers without corresponding text labels. Another challenge is to jointly model the candidate sticker with the multi-turn dialog context. To tackle these challenges, we propose a sticker response selector (SRS) model. Specifically, SRS first employs a convolutional based sticker image encoder and a self-attention based multi-turn dialog encoder to obtain the representation of stickers and utterances. Next, deep interaction network is proposed to conduct deep matching between the sticker with each utterance in the dialog history. SRS then learns the short-term and long-term dependency between all interaction results by a fusion network to output the the final matching score. To evaluate our proposed method, we collect a large-scale real-world dialog dataset with stickers from one of the most popular online chatting platform. Extensive experiments conducted on this dataset show that our model achieves the state-of-the-art performance for all commonly-used metrics. Experiments also verify the effectiveness of each component of SRS. To facilitate further research in sticker selection field, we release this dataset of 340K multi-turn dialog and sticker pairs.

翻訳日:2022-12-24 20:45:11 公開日:2020-03-10

# PnP-Net: ハイブリッドなパースペクティブnポイントネットワーク

PnP-Net: A hybrid Perspective-n-Point Network ( http://arxiv.org/abs/2003.04626v1 )

ライセンス: Link先を確認

Roy Sheffer, Ami Wiesel

(参考訳) 我々は,ディープラーニングとモデルベースアルゴリズムを組み合わせたハイブリッド手法を用いて,pnp問題を考える。 PnPは、世界の3Dポイントのセットと、画像中の対応する2Dプロジェクションが与えられたキャリブレーションカメラのポーズを推定する問題である。より困難なロバストなバージョンでは、いくつかの対応がミスマッチし、効率的に破棄されなければならない。古典的解法は、問題の幾何を利用するが不正確なか計算集約的な反復的頑健な非線形最小二乗法を介してPnPに対処する。対照的に、深層学習の初期フェーズとモデルに基づく微調整フェーズを組み合わせることを提案する。 pnp-netで表されるこのハイブリッドアプローチは、応答誤差と雑音の下で未知のポーズパラメータを、計算複雑性の低さと固定の要件で推定することに成功している。合成データと実世界のデータの両方にその利点を示す。

We consider the robust Perspective-n-Point (PnP) problem using a hybrid approach that combines deep learning with model based algorithms. PnP is the problem of estimating the pose of a calibrated camera given a set of 3D points in the world and their corresponding 2D projections in the image. In its more challenging robust version, some of the correspondences may be mismatched and must be efficiently discarded. Classical solutions address PnP via iterative robust non-linear least squares method that exploit the problem's geometry but are either inaccurate or computationally intensive. In contrast, we propose to combine a deep learning initial phase followed by a model-based fine tuning phase. This hybrid approach, denoted by PnP-Net, succeeds in estimating the unknown pose parameters under correspondence errors and noise, with low and fixed computational complexity requirements. We demonstrate its advantages on both synthetic data and real world data.

翻訳日:2022-12-24 20:44:43 公開日:2020-03-10

# $\beta$-VAEの潜在空間を用いたマルチラベルデータセットの分布外検出

Out-of-Distribution Detection in Multi-Label Datasets using Latent Space of $\beta$-VAE ( http://arxiv.org/abs/2003.08740v1 )

ライセンス: Link先を確認

Vijaya Kumar Sundar, Shreyas Ramakrishna, Zahra Rahiminasab, Arvind Easwaran, Abhishek Dubey

(参考訳) 学習可能コンポーネント(LEC)は、イメージセグメンテーション、オブジェクト検出、エンドツーエンドの駆動など、さまざまな認識に基づく自律的なタスクで広く使用されている。これらのコンポーネントは、気象条件や日時、トラフィック密度といったマルチモーダルな要素を持つ大規模なイメージデータセットでトレーニングされる。 LECはトレーニング中にこれらの要因から学習し、これらの要因にばらつきがあるかどうかをテストする間、コンポーネントは混乱し、信頼性が低い。トレーニング中に見えない要因のイメージは、一般的にout-of-Distribution (OOD)と呼ばれる。安全な自律のためには、OOD画像の識別が重要であり、適切な緩和戦略が実行可能である。 SVMやSVDDのような古典的な一級分類器はOOD検出に使用される。しかし、これらのデータセットのイメージにアタッチされた複数のラベルは、これらのテクニックの直接適用を制限する。我々は、$\beta$-variational autoencoder ($\beta$-vae) の潜在空間を用いてこの問題に対処する。適切に選択された$\beta$-vae によって生成されるコンパクトな潜在空間は、これらの因子に関する情報をいくつかの潜在変数にエンコードし、計算的に安価な検出に使うことができる。我々はnuScenesデータセットに対するアプローチを評価し,この結果から生成因子の値の変化をエンコードするために$\beta$-VAEの潜伏空間が敏感であることが示唆された。

Learning Enabled Components (LECs) are widely being used in a variety of perception based autonomy tasks like image segmentation, object detection, end-to-end driving, etc. These components are trained with large image datasets with multimodal factors like weather conditions, time-of-day, traffic-density, etc. The LECs learn from these factors during training, and while testing if there is variation in any of these factors, the components get confused resulting in low confidence predictions. The images with factors not seen during training is commonly referred to as Out-of-Distribution (OOD). For safe autonomy it is important to identify the OOD images, so that a suitable mitigation strategy can be performed. Classical one-class classifiers like SVM and SVDD are used to perform OOD detection. However, the multiple labels attached to the images in these datasets, restricts the direct application of these techniques. We address this problem using the latent space of the $\beta$-Variational Autoencoder ($\beta$-VAE). We use the fact that compact latent space generated by an appropriately selected $\beta$-VAE will encode the information about these factors in a few latent variables, and that can be used for computationally inexpensive detection. We evaluate our approach on the nuScenes dataset, and our results shows the latent space of $\beta$-VAE is sensitive to encode changes in the values of the generative factor.

翻訳日:2022-12-24 20:44:27 公開日:2020-03-10

# 単語埋め込み規則化とソフト類似度尺度を用いたテキスト分類

Text classification with word embedding regularization and soft similarity measure ( http://arxiv.org/abs/2003.05019v1 )

ライセンス: Link先を確認

V\'it Novotn\'y, Eniafe Festus Ayetiran, Michal \v{S}tef\'anik, and Petr Sojka

(参考訳) Mikolovらの独創的な作品以来、単語の埋め込みは多くの自然言語処理タスクにおいて好まれる単語表現となっている。 SCM(Soft Cosine measure)やWord Mover's Distance(Word Mover's Distance)などの単語埋め込みから抽出した文書類似度尺度を報告し,意味的テキスト類似度とテキスト分類の最先端性能を実現する。テキスト分類と意味的テキスト類似性においてWMDの強い性能にもかかわらず、その超キュービック平均時間複雑性は実用的ではない。 SCMは2次最悪の時間複雑性を持つが、テキスト分類における性能はWMDと比較されることはなかった。近年, 2つの単語埋め込み正規化手法が, 記憶コストと記憶コストの低減, 学習速度の向上, 文書処理速度の向上, 単語アナロジー, 単語類似性, 意味テキスト類似性の向上に寄与した。しかし,これらの手法がテキスト分類に与える影響についてはまだ研究されていない。本研究では,文書処理速度と文書分類におけるscmとwmdのタスク性能に対する2つの単語埋め込み正規化手法の個人および共同効果について検討した。評価には、$k$NN分類器と、BBCSport、TWITTER、OHSUMED、REUTERS-21578、AMAZON、20NEWSの6つの標準データセットを使用する。正規化単語埋め込みによる平均$k$NNテスト誤差の39%を非正規化単語埋め込みと比較した。本稿では,コレスキー分解による正規化埋め込みの導出について述べる。また、正規化語埋め込みによるSCMはテキスト分類においてWMDよりも優れ、1万倍以上高速であることを示す。

Since the seminal work of Mikolov et al., word embeddings have become the preferred word representations for many natural language processing tasks. Document similarity measures extracted from word embeddings, such as the soft cosine measure (SCM) and the Word Mover's Distance (WMD), were reported to achieve state-of-the-art performance on semantic text similarity and text classification. Despite the strong performance of the WMD on text classification and semantic text similarity, its super-cubic average time complexity is impractical. The SCM has quadratic worst-case time complexity, but its performance on text classification has never been compared with the WMD. Recently, two word embedding regularization techniques were shown to reduce storage and memory costs, and to improve training speed, document processing speed, and task performance on word analogy, word similarity, and semantic text similarity. However, the effect of these techniques on text classification has not yet been studied. In our work, we investigate the individual and joint effect of the two word embedding regularization techniques on the document processing speed and the task performance of the SCM and the WMD on text classification. For evaluation, we use the $k$NN classifier and six standard datasets: BBCSPORT, TWITTER, OHSUMED, REUTERS-21578, AMAZON, and 20NEWS. We show 39% average $k$NN test error reduction with regularized word embeddings compared to non-regularized word embeddings. We describe a practical procedure for deriving such regularized embeddings through Cholesky factorization. We also show that the SCM with regularized word embeddings significantly outperforms the WMD on text classification and is over 10,000 times faster.

翻訳日:2022-12-24 20:44:01 公開日:2020-03-10

# 車線維持車両のエンド・ツー・エンド制御のためのスパイクニューラルネットワークの間接的および直接訓練

Indirect and Direct Training of Spiking Neural Networks for End-to-End Control of a Lane-Keeping Vehicle ( http://arxiv.org/abs/2003.04603v1 )

ライセンス: Link先を確認

Zhenshan Bing, Claus Meschede, Guang Chen, Alois Knoll, Kai Huang

(参考訳) 生物学的シナプス可塑性に基づくスパイクニューラルネットワーク(snn)の構築は、高速でエネルギー効率の良いコンピューティングを実現する有望な可能性を秘めている。しかし,ロボット分野におけるSNNの実装は,実践的な訓練方法の欠如により制限されている。そこで本稿では,車線維持車両におけるSNNの間接的および直接的エンドツーエンドのトレーニング手法を紹介する。まず,<textcolor{black}{Deep Q-Learning} (DQN) アルゴリズムを用いて学習し,その後,教師あり学習を用いてSNNに転送する。第二に, 強化学習の利点とstdp(spike-timing-dependent plasticity)の利点を併せ持つため, 直接sns訓練にr-stdp(reward-modulated spike-timing-dependent plasticity)を採用する。イベントベースニューロモルフィック視覚センサを用いて,ロボットが車線標識内に留まるように制御される3つのシナリオにおいて提案手法を検討する。本稿では,R-STDP手法の利点を,他の3つのアルゴリズムと比較することにより,横方向のローカライゼーション精度とトレーニング時間ステップの観点から明らかにする。

Building spiking neural networks (SNNs) based on biological synaptic plasticities holds a promising potential for accomplishing fast and energy-efficient computing, which is beneficial to mobile robotic applications. However, the implementations of SNNs in robotic fields are limited due to the lack of practical training methods. In this paper, we therefore introduce both indirect and direct end-to-end training methods of SNNs for a lane-keeping vehicle. First, we adopt a policy learned using the \textcolor{black}{Deep Q-Learning} (DQN) algorithm and then subsequently transfer it to an SNN using supervised learning. Second, we adopt the reward-modulated spike-timing-dependent plasticity (R-STDP) for training SNNs directly, since it combines the advantages of both reinforcement learning and the well-known spike-timing-dependent plasticity (STDP). We examine the proposed approaches in three scenarios in which a robot is controlled to keep within lane markings by using an event-based neuromorphic vision sensor. We further demonstrate the advantages of the R-STDP approach in terms of the lateral localization accuracy and training time steps by comparing them with other three algorithms presented in this paper.

翻訳日:2022-12-24 20:43:18 公開日:2020-03-10

# マルチウェイデータのためのオンラインテンソル学習

Online Tensor-Based Learning for Multi-Way Data ( http://arxiv.org/abs/2003.04497v1 )

ライセンス: Link先を確認

Ali Anaissi, Basem Suleiman, Seid Miad Zandavi

(参考訳) テンソル $\mathcal{X} \in \mathbb{R} ^{I_1 \times \dots \times I_N} $ に格納されたマルチウェイデータのオンライン解析は、基礎となる構造を捕捉し、予測モデルを学ぶのに使用できるセンシティブな特徴を抽出するための重要なツールとなっている。しかし、データ分布はしばしば時間とともに進化し、現在の予測モデルは将来十分に代表されないかもしれない。したがって、このような状況ではテンソルベースの特徴とモデル係数を段階的に更新する必要がある。オンラインの$CANDECOMP/PARAFAC$ (CP)分解において, テンソルを用いた新しい効率的な特徴抽出法NeSGDを提案する。 nesgdの結果から得られた新しい特徴によると、オンライン予測モデルの更新プロセスのために新しい基準がトリガーされる。実験室ベースおよび実生活構造データセットを用いた構造健康モニタリングの分野での実験的な評価は,既存のオンラインテンソル解析やモデル学習と比較して,より正確な結果が得られることを示している。その結果,提案手法は分類誤り率を大幅に改善し,時間とともに正のデータ分布の変化を同化することができ,全てのケーススタディにおいて高い予測精度を維持した。

The online analysis of multi-way data stored in a tensor $\mathcal{X} \in \mathbb{R} ^{I_1 \times \dots \times I_N} $ has become an essential tool for capturing the underlying structures and extracting the sensitive features which can be used to learn a predictive model. However, data distributions often evolve with time and a current predictive model may not be sufficiently representative in the future. Therefore, incrementally updating the tensor-based features and model coefficients are required in such situations. A new efficient tensor-based feature extraction, named NeSGD, is proposed for online $CANDECOMP/PARAFAC$ (CP) decomposition. According to the new features obtained from the resultant matrices of NeSGD, a new criteria is triggered for the updated process of the online predictive model. Experimental evaluation in the field of structural health monitoring using laboratory-based and real-life structural datasets show that our methods provide more accurate results compared with existing online tensor analysis and model learning. The results showed that the proposed methods significantly improved the classification error rates, were able to assimilate the changes in the positive data distribution over time, and maintained a high predictive accuracy in all case studies.

翻訳日:2022-12-24 20:36:56 公開日:2020-03-10

# 非均一密度入力のためのニューラルネットワークの周波数バイアス

Frequency Bias in Neural Networks for Input of Non-Uniform Density ( http://arxiv.org/abs/2003.04560v1 )

ライセンス: Link先を確認

Ronen Basri, Meirav Galun, Amnon Geifman, David Jacobs, Yoni Kasten, Shira Kritchman

(参考訳) 最近の研究は、過パラメータニューラルネットの一般化能力を周波数バイアスに帰している。一様分布から引き出されたデータに勾配降下を訓練したネットワークは、高周波のニューラルネットワークよりも低い周波数に適合する。現実的なトレーニングセットは均一な分布から引き出されないため、我々はニューラルネットワーク・タンジェント・カーネル(NTK)モデルを用いて、学習力学における変動密度の影響を探索する。その結果、周波数の純調和関数である$\kappa$ を学習すると、点 $\x \in \sphere^{d-1}$ での収束は時刻 $o(\kappa^d/p(\x))$ ここで $p(\x)$ は局所密度 $\x$ を表す。具体的には、$\Sphere^1$のデータに対して、2層ネットワークのNTKに関連するカーネルの固有関数を解析的に導出する。さらに、NTKのスペクトル分解に関して、深い完全連結ネットワークに対する収束結果を証明した。実験では,このモデルにおける深層ネットワークと浅層ネットワークの類似性と差異に注目した。

Recent works have partly attributed the generalization ability of over-parameterized neural networks to frequency bias -- networks trained with gradient descent on data drawn from a uniform distribution find a low frequency fit before high frequency ones. As realistic training sets are not drawn from a uniform distribution, we here use the Neural Tangent Kernel (NTK) model to explore the effect of variable density on training dynamics. Our results, which combine analytic and empirical observations, show that when learning a pure harmonic function of frequency $\kappa$, convergence at a point $\x \in \Sphere^{d-1}$ occurs in time $O(\kappa^d/p(\x))$ where $p(\x)$ denotes the local density at $\x$. Specifically, for data in $\Sphere^1$ we analytically derive the eigenfunctions of the kernel associated with the NTK for two-layer networks. We further prove convergence results for deep, fully connected networks with respect to the spectral decomposition of the NTK. Our empirical study highlights similarities and differences between deep and shallow networks in this model.

翻訳日:2022-12-24 20:35:09 公開日:2020-03-10

# 時系列データの曖昧性:連続モデルによる不確実な未来予測

Ambiguity in Sequential Data: Predicting Uncertain Futures with Recurrent Models ( http://arxiv.org/abs/2003.10381v1 )

ライセンス: Link先を確認

Alessandro Berlati, Oliver Scheel, Luigi Di Stefano, Federico Tombari

(参考訳) あいまいさは多くの機械学習タスクに本質的に存在するが、特にシーケンシャルモデルでは、ほとんどの場合単一の予測しか出力しないため、ほとんど考慮されない。本研究では,逐次データを用いた曖昧な予測を扱うために,多重仮説予測(multiple hypothesis prediction,mhp)モデルの拡張を提案する。我々のアプローチは最も一般的な繰り返しアーキテクチャに適用でき、損失関数で使用できます。さらに,不確かさを考慮し,複数のラベルが存在する場合の正確さの直感的な理解と一致した,あいまいな問題に対する新しい尺度を提案する。提案手法は, 軌道予測や操作予測などの時系列データを扱う様々なタスクにおいて, 有望な結果を達成するために, 実験を行った。

Ambiguity is inherently present in many machine learning tasks, but especially for sequential models seldom accounted for, as most only output a single prediction. In this work we propose an extension of the Multiple Hypothesis Prediction (MHP) model to handle ambiguous predictions with sequential data, which is of special importance, as often multiple futures are equally likely. Our approach can be applied to the most common recurrent architectures and can be used with any loss function. Additionally, we introduce a novel metric for ambiguous problems, which is better suited to account for uncertainties and coincides with our intuitive understanding of correctness in the presence of multiple labels. We test our method on several experiments and across diverse tasks dealing with time series data, such as trajectory forecasting and maneuver prediction, achieving promising results.

翻訳日:2022-12-24 20:26:22 公開日:2020-03-10

# 日本人における人間の行動記述のためのビデオ字幕データセット

Video Caption Dataset for Describing Human Actions in Japanese ( http://arxiv.org/abs/2003.04865v1 )

ライセンス: Link先を確認

Yutaro Shigeto, Yuya Yoshikawa, Jiaqing Lin, Akikazu Takeuchi

(参考訳) 近年,自動字幕生成が注目されている。本稿では,人間の行動を記述するための日本語字幕の生成に焦点をあてる。現在利用可能なほとんどのビデオキャプションデータセットは英語で構築されているが、同等の日本語データセットはない。そこで我々は,79,822本,399,233本からなる大規模日本ビデオキャプションデータセットを構築した。データセットの各キャプションは、"誰がどこで何をするのか"という形式でビデオを記述する。人間の行動を説明するには、人、場所、行動の詳細を特定することが重要である。実際、人間の行動を説明するとき、通常、場面、人物、行動について言及する。本実験では,2つのキャプション生成手法を評価し,ベンチマーク結果を得た。さらに,これらの生成手法が「何をどこで行うか」を特定できるかどうかを検討した。

In recent years, automatic video caption generation has attracted considerable attention. This paper focuses on the generation of Japanese captions for describing human actions. While most currently available video caption datasets have been constructed for English, there is no equivalent Japanese dataset. To address this, we constructed a large-scale Japanese video caption dataset consisting of 79,822 videos and 399,233 captions. Each caption in our dataset describes a video in the form of "who does what and where." To describe human actions, it is important to identify the details of a person, place, and action. Indeed, when we describe human actions, we usually mention the scene, person, and action. In our experiments, we evaluated two caption generation methods to obtain benchmark results. Further, we investigated whether those generation methods could specify "who does what and where."

翻訳日:2022-12-24 20:25:23 公開日:2020-03-10

# TyDi QA: タイポロジー多言語における情報探索質問回答のベンチマーク

TyDi QA: A Benchmark for Information-Seeking Question Answering in Typologically Diverse Languages ( http://arxiv.org/abs/2003.05002v1 )

ライセンス: Link先を確認

Jonathan H. Clark, Eunsol Choi, Michael Collins, Dan Garrette, Tom Kwiatkowski, Vitaly Nikolaev, and Jennimaria Palomaki

(参考訳) 多言語モデリングを確実に進めるためには、挑戦的で信頼できる評価が必要である。我々はTyDi QA--204Kの問合せ対を持つ11の類型的多様言語を対象とした質問応答データセットを提案する。 tydi qaの言語は、それぞれの言語が表現する言語的特徴のセットという、そのタイポロジーに関して多種多様であり、このセットでうまく機能するモデルが、世界の多くの言語にまたがって一般化することを期待しています。本稿では、英語のみのコーパスでは見つからない観察された言語現象のデータ品質と例レベルの質的言語分析について定量的に分析する。リアルな情報検索タスクであって、プライミング効果を回避し、回答を知りたいがまだ答えがわからない人々によって質問が書かれ、翻訳を使わずに各言語でデータを直接収集する。

Confidently making progress on multilingual modeling requires challenging, trustworthy evaluations. We present TyDi QA---a question answering dataset covering 11 typologically diverse languages with 204K question-answer pairs. The languages of TyDi QA are diverse with regard to their typology---the set of linguistic features each language expresses---such that we expect models performing well on this set to generalize across a large number of the world's languages. We present a quantitative analysis of the data quality and example-level qualitative linguistic analyses of observed language phenomena that would not be found in English-only corpora. To provide a realistic information-seeking task and avoid priming effects, questions are written by people who want to know the answer, but don't know the answer yet, and the data is collected directly in each language without the use of translation.

翻訳日:2022-12-24 20:25:12 公開日:2020-03-10

# 生成モデルを用いた大規模自然言語逆例の生成

Generating Natural Language Adversarial Examples on a Large Scale with Generative Models ( http://arxiv.org/abs/2003.10388v1 )

ライセンス: Link先を確認

Yankun Ren and Jianbin Lin and Siliang Tang and Jun Zhou and Shuang Yang and Yuan Qi and Xiang Ren

(参考訳) 現在、テキスト分類モデルは広く使われている。しかし、これらの分類器は逆例によって容易に騙される。幸いなことに、標準的な攻撃方法は、対向テキストを生成する。つまり、逆テキストは、いくつかの単語を置き換えることで、現実世界のテキストからのみ生成することができる。多くのアプリケーションでは、これらのテキストは数に制限があるため、その逆の例はしばしば多様ではなく、時には読みにくいため、人間が容易に検出でき、大規模にカオスを起こすことができない。本稿では,テキストの摂動に制限されない生成モデルを用いて,テキストをスクラッチから効率的に生成するエンド・ツー・エンドのソリューションを提案する。これを非制限逆テキスト生成と呼ぶ。具体的には,条件付き変分オートエンコーダ(VAE)を学習し,さらに逆転損失を加えて,逆転例の生成を誘導する。さらに,敵対的テキストの妥当性を向上させるために,実データと一致するように,識別器とGAN(Generative Adversarial Network)のトレーニングフレームワークを利用する。感情分析実験により,本手法のスケーラビリティと効率性を示す。既存の手法よりも高い成功率でテキスト分類モデルを攻撃することができ、一方で人間には許容できる品質を提供する。

Today text classification models have been widely used. However, these classifiers are found to be easily fooled by adversarial examples. Fortunately, standard attacking methods generate adversarial texts in a pair-wise way, that is, an adversarial text can only be created from a real-world text by replacing a few words. In many applications, these texts are limited in numbers, therefore their corresponding adversarial examples are often not diverse enough and sometimes hard to read, thus can be easily detected by humans and cannot create chaos at a large scale. In this paper, we propose an end to end solution to efficiently generate adversarial texts from scratch using generative models, which are not restricted to perturbing the given texts. We call it unrestricted adversarial text generation. Specifically, we train a conditional variational autoencoder (VAE) with an additional adversarial loss to guide the generation of adversarial examples. Moreover, to improve the validity of adversarial texts, we utilize discrimators and the training framework of generative adversarial networks (GANs) to make adversarial texts consistent with real data. Experimental results on sentiment analysis demonstrate the scalability and efficiency of our method. It can attack text classification models with a higher success rate than existing methods, and provide acceptable quality for humans in the meantime.

翻訳日:2022-12-24 20:17:52 公開日:2020-03-10

# 高度不均衡データに基づく適応的名前認識

Adaptive Name Entity Recognition under Highly Unbalanced Data ( http://arxiv.org/abs/2003.10296v1 )

ライセンス: Link先を確認

Thong Nguyen, Duy Nguyen, Pramod Rao

(参考訳) 自然言語処理(nlp)において、情報抽出、感情分析、チャットボットなどいくつかの目的において、名前付きエンティティ認識(ner)は、テキスト中のエンティティを人名、場所、量、組織、パーセンテージなどの予め定義されたグループに分類し分類する上で重要な役割を担っている。本稿では,両方向LSTM(BI-LSTM)層上に積み重ねた条件付きランダムフィールド(CRF)層からなるニューラルアーキテクチャの実験を行った。さらに、巨大なコーパス上で事前学習された埋め込みベクトル(glove, bert)の融合入力を用いて、モデルの一般化能力を高める。残念ながら、重いアンバランスな分散クロストレーニングデータのために、両方のアプローチはトレーニングの少ないサンプルクラスで悪いパフォーマンスを達成した。この課題を克服するために、文を弱クラスと強クラスに分割し、各セットのパフォーマンスを最適化するために2つのBi-LSTM-CRFモデルを適切に設計するアドオン分類モデルを導入する。テストセット上でのモデル評価を行い,他のクラスと比較して非常に小さなデータセット(約 0.45 %)を使用することで,Weakクラスの性能を著しく向上できることを確認した。

For several purposes in Natural Language Processing (NLP), such as Information Extraction, Sentiment Analysis or Chatbot, Named Entity Recognition (NER) holds an important role as it helps to determine and categorize entities in text into predefined groups such as the names of persons, locations, quantities, organizations or percentages, etc. In this report, we present our experiments on a neural architecture composed of a Conditional Random Field (CRF) layer stacked on top of a Bi-directional LSTM (BI-LSTM) layer for solving NER tasks. Besides, we also employ a fusion input of embedding vectors (Glove, BERT), which are pre-trained on the huge corpus to boost the generalization capacity of the model. Unfortunately, due to the heavy unbalanced distribution cross-training data, both approaches just attained a bad performance on less training samples classes. To overcome this challenge, we introduce an add-on classification model to split sentences into two different sets: Weak and Strong classes and then designing a couple of Bi-LSTM-CRF models properly to optimize performance on each set. We evaluated our models on the test set and discovered that our method can improve performance for Weak classes significantly by using a very small data set (approximately 0.45\%) compared to the rest classes.

翻訳日:2022-12-24 20:17:26 公開日:2020-03-10

# グローバルオプティマイザになるための学習

Learning to be Global Optimizer ( http://arxiv.org/abs/2003.04521v1 )

ライセンス: Link先を確認

Haotian Zhang, Jianyong Sun and Zongben Xu

(参考訳) 人工知能の進歩は、最適化アルゴリズムの開発に新たな光を当てている。本稿では,スムーズな非凸関数に対する2相(最小化フェーズとエスケープフェーズを含む)グローバル最適化アルゴリズムについて述べる。最小化フェーズにおいて、凸関数に対する履歴情報の非線形結合として形式化された降下方向の更新規則を学習するモデル駆動深層学習法を開発した。提案する適応方向のアルゴリズムによって凸関数の収束が保証されることを示す。実験的な研究により、学習アルゴリズムは勾配降下、共役降下、BFGSなどの古典最適化アルゴリズムを著しく上回り、不適切な関数に対してよく機能することが示された。局所最適からの脱出フェーズは、固定避難ポリシーを持つマルコフ決定プロセスとしてモデル化される。さらに,強化学習による最適避難政策の学習も提案する。合成関数を最適化し、CIFAR画像分類のためのディープニューラルネットワークを訓練することにより、エスケープポリシーの有効性を検証する。学習した2相大域最適化アルゴリズムは、いくつかのベンチマーク関数と機械学習タスクで有望な大域探索能力を示す。

The advancement of artificial intelligence has cast a new light on the development of optimization algorithm. This paper proposes to learn a two-phase (including a minimization phase and an escaping phase) global optimization algorithm for smooth non-convex functions. For the minimization phase, a model-driven deep learning method is developed to learn the update rule of descent direction, which is formalized as a nonlinear combination of historical information, for convex functions. We prove that the resultant algorithm with the proposed adaptive direction guarantees convergence for convex functions. Empirical study shows that the learned algorithm significantly outperforms some well-known classical optimization algorithms, such as gradient descent, conjugate descent and BFGS, and performs well on ill-posed functions. The escaping phase from local optimum is modeled as a Markov decision process with a fixed escaping policy. We further propose to learn an optimal escaping policy by reinforcement learning. The effectiveness of the escaping policies is verified by optimizing synthesized functions and training a deep neural network for CIFAR image classification. The learned two-phase global optimization algorithm demonstrates a promising global search capability on some benchmark functions and machine learning tasks.

翻訳日:2022-12-24 20:16:44 公開日:2020-03-10

PDF登録状況（公開日: 20200310）