Fugu-MT: arxivの論文翻訳

このサイトではarxivの論文のうち、30ページ以下でCreative Commonsライセンス（CC 0, CC BY, CC BY-SA）の論文を日本語訳しています。本文がCCでない論文、長すぎる論文はメタデータのみを翻訳しています。（arxivのメタデータは CC 0です。）翻訳文のライセンスはCC BY-SA 4.0です。翻訳にはFugu-Machine Translatorを利用しています。

本サイトの運営者は本サイト（すべての情報・翻訳含む）の品質を保証せず、本サイト（すべての情報・翻訳含む）を使用して発生したあらゆる結果について一切の責任を負いません。

公開日が20210710となっている論文です。

Title	Authors	Abstract	論文公表日・翻訳日
# 量子信号処理における効率的な位相要素評価 Efficient phase-factor evaluation in quantum signal processing ( http://arxiv.org/abs/2002.11649v2 ) ライセンス: Link先を確認	Yulong Dong, Xiang Meng, K. Birgitta Whaley, Lin Lin	(参考訳) 量子信号処理(QSP)は、量子コンピュータ上で行列多項式を正確に実装する強力な量子アルゴリズムである。 qspに基づく量子アルゴリズムの漸近解析は、原理上、ハミルトンシミュレーションや量子線形系問題のような様々なタスクに対して漸近的に最適な結果を得ることができることを示した。 QSPのさらなる利点は、最小数のアンシラ量子ビットを使い、準中間項量子アーキテクチャの実装を容易にすることである。しかし、QSP回路構築に必要な位相係数を計算できる古典的に安定したアルゴリズムは今のところ存在しない。既存の手法では可変精度演算が必要であり、比較的低い次数の多項式にしか適用できない。本稿では,標準倍精度演算を用いて位相係数を正確に計算する最適化手法を提案する。本手法の性能をハミルトンシミュレーション,固有値フィルタリング,量子線形系問題への応用により実証する。数値計算の結果, 最適化アルゴリズムは, 誤差が 10^{-12}$ 以下で 1 万ドル以上の次数の多項式を正確に近似する位相係数を求めることができることがわかった。 Quantum signal processing (QSP) is a powerful quantum algorithm to exactly implement matrix polynomials on quantum computers. Asymptotic analysis of quantum algorithms based on QSP has shown that asymptotically optimal results can in principle be obtained for a range of tasks, such as Hamiltonian simulation and the quantum linear system problem. A further benefit of QSP is that it uses a minimal number of ancilla qubits, which facilitates its implementation on near-to-intermediate term quantum architectures. However, there is so far no classically stable algorithm allowing computation of the phase factors that are needed to build QSP circuits. Existing methods require the usage of variable precision arithmetic and can only be applied to polynomials of relatively low degree. We present here an optimization based method that can accurately compute the phase factors using standard double precision arithmetic operations. We demonstrate the performance of this approach with applications to Hamiltonian simulation, eigenvalue filtering, and the quantum linear system problems. Our numerical results show that the optimization algorithm can find phase factors to accurately approximate polynomials of degree larger than $10,000$ with error below $10^{-12}$.	翻訳日:2023-06-01 21:03:46 公開日:2021-07-10
# ランダム回路ブロック符号化行列と量子LINPACKベンチマークの提案 Random circuit block-encoded matrix and a proposal of quantum LINPACK benchmark ( http://arxiv.org/abs/2006.04010v2 ) ライセンス: Link先を確認	Yulong Dong, Lin Lin	(参考訳) LINPACKベンチマークは、密度ランダム行列を持つ線形方程式系を解くコンピュータの性能を報告する。このタスクは実際のアプリケーションを直接念頭に置いて設計されたものではないが、1993年のリストの登場以来、LINPACKベンチマークによってTOP500スーパーコンピュータのリストが定義されている。我々は、量子LINPACKベンチマークと呼ばれる類似のベンチマークを用いて、量子コンピュータ全体のマシン性能を測定することを提案する。量子LINPACKベンチマークの成功は、方程式の線形系のような線形代数問題を解くための有用なタスクを実行するための量子コンピュータの最小限の要件と見なされるべきである。本稿では,Random Circuit Block-Encoded Matrix (RACBEM) と呼ばれる入力モデルを提案する。 RACBEMモデルは量子コンピュータ上での実装が効率的であり、ブラックボックスの量子コンパイラを頼りに、任意の量子アーキテクチャに最適に適応するように設計することができる。線形システムの解法以外にも、RACBEMモデルは、計算スペクトル測度、ハミルトンシミュレーションによって生成される時系列、エネルギーの熱平均など、多くの物理応用に関連する様々な線形代数タスクの実行に使用できる。我々は、これらの線形代数演算をibm q量子デバイスおよび量子仮想マシンに実装し、科学計算問題を解決する上での性能を実証する。 The LINPACK benchmark reports the performance of a computer for solving a system of linear equations with dense random matrices. Although this task was not designed with a real application directly in mind, the LINPACK benchmark has been used to define the list of TOP500 supercomputers since the debut of the list in 1993. We propose that a similar benchmark, called the quantum LINPACK benchmark, could be used to measure the whole machine performance of quantum computers. The success of the quantum LINPACK benchmark should be viewed as the minimal requirement for a quantum computer to perform a useful task of solving linear algebra problems, such as linear systems of equations. We propose an input model called the RAndom Circuit Block-Encoded Matrix (RACBEM), which is a proper generalization of a dense random matrix in the quantum setting. The RACBEM model is efficient to be implemented on a quantum computer, and can be designed to optimally adapt to any given quantum architecture, with relying on a black-box quantum compiler. Besides solving linear systems, the RACBEM model can be used to perform a variety of linear algebra tasks relevant to many physical applications, such as computing spectral measures, time series generated by a Hamiltonian simulation, and thermal averages of the energy. We implement these linear algebra operations on IBM Q quantum devices as well as quantum virtual machines, and demonstrate their performance in solving scientific computing problems.	翻訳日:2023-05-16 09:12:15 公開日:2021-07-10
# 量子アニールのマルチビット補正 Multi-Qubit Correction for Quantum Annealers ( http://arxiv.org/abs/2010.00115v4 ) ライセンス: Link先を確認	Ramin Ayanzadeh, John Dorband, Milton Halem and Tim Finin	(参考訳) 我々は,オープンシステムの進展をギブズ・サンプラーとして捉え,励起状態のセットをエネルギー値の低い新しい合成状態に還元する量子アニーラーのための新しい後処理法として, \emph{multi-qubit correction} (mqc)を提案する。所定の(Ising)ハミルトニアンの基底状態からサンプリングした後、MQCは励起状態のペアを比較して仮想トンネルを認識する。 D-Wave 2000Q量子アニールを用いた実験の結果、MQCは、スピン反転変換、古典的後処理技術、連続測定間のサンプル間遅延の増加など、近年の量子アニール領域におけるハードウェア/ソフトウェア進歩と比較して、顕著に低いエネルギー値のサンプルを見つけ、結果の再現性を向上させることを示した。 We present \emph{multi-qubit correction} (MQC) as a novel postprocessing method for quantum annealers that views the evolution in an open-system as a Gibbs sampler and reduces a set of excited states to a new synthetic state with lower energy value. After sampling from the ground state of a given (Ising) Hamiltonian, MQC compares pairs of excited states to recognize virtual tunnels--i.e., a group of qubits that changing their states simultaneously can result in a new state with lower energy value--and successively converges to the ground state. Experimental results using D-Wave 2000Q quantum annealers demonstrate that MQC finds samples with notably lower energy values and improves the reproducibility of results when compared to recent hardware/software advances in the realm of quantum annealing, such as spin-reversal transforms, classical postprocessing techniques, and increased inter-sample delay between successive measurements.	翻訳日:2023-04-30 13:59:35 公開日:2021-07-10
# 多体局在系における空間一粒子密度行列エントロピーのスケーリング特性 Scaling properties of a spatial one-particle density-matrix entropy in many-body localized systems ( http://arxiv.org/abs/2011.02200v2 ) ライセンス: Link先を確認	Miroslav Hopjan, Fabian Heidrich-Meisner, Vincenzo Alba	(参考訳) 本研究では,多体局在化(MBL)相を呈する1次元不規則相互作用フェルミオンにおいて,一粒子密度行列(OPDM)から抽出した空間サブシステムエントロピーについて検討した。このOPDMエントロピーは, 適切な絡み合いの指標ではないにもかかわらず, 局所化の健全な特徴を示す。固有状態のOPDMエントロピーが地域法則に従うことを数値的に示す。フォン・ノイマンのエントロピーと同様に、opdmエントロピーは量子クエンチの時間とともに対数的に成長する。これら2つの特徴は、中程度の大きな相互作用で生き残り、エルゴード相への遷移に向けて良好である。 OPDMエントロピーの計算コストはシステムサイズと多項式的にしか一致せず,シミュレーションや実験でMBLの診断ツールを開発する上で,OPDMが有望な出発点となることを示唆している。 We investigate a spatial subsystem entropy extracted from the one-particle density matrix (OPDM) in one-dimensional disordered interacting fermions that host a many-body localized (MBL) phase. Deep in the putative MBL regime, this OPDM entropy exhibits the salient features of localization, despite not being a proper entanglement measure. We numerically show that the OPDM entropy of the eigenstates obeys an area law. Similar to the von-Neumann entropy, the OPDM entropy grows logarithmically with time after a quantum quench, albeit with a different prefactor. Both these features survive at moderately large interactions and well towards the transition into the ergodic phase. The computational cost to calculate the OPDM entropy scales only polynomially with the system size, suggesting that the OPDM provides a promising starting point for developing diagnostic tools for MBL in simulations and experiments.	翻訳日:2023-04-25 07:32:02 公開日:2021-07-10
# 量子ラビ三角形におけるキラルコヒーレント相の量子三臨界 Quantum tricriticality of chiral-coherent phase in quantum Rabi triangle ( http://arxiv.org/abs/2011.11171v2 ) ライセンス: Link先を確認	Yu-Yu Zhang, Zi-Xiang Hu, Libin Fu, Hong-Gang Luo, Han Pu, Xue-Feng Zhang	(参考訳) 相互作用、対称性、ゲージ場の相互作用は通常、興味深い量子多体相をもたらす。出現相の性質を探るため,人工磁場を合成するための基本構造ブロックとして量子ラビ三角形系について検討した。我々は、リッチ位相図と関連する量子臨界性を研究する分析手法を開発した。特に興味深いのはキラルコヒーレント位相の出現であり、これは$\mathbb{z}_2$ とキラル対称性の両方を破る。このキラル相では、光子は一方向的に流れ、キラリティーは人工ゲージ場によって調整され、時間反転対称性の破れを示す。有限周波スケーリング解析により、関連する相転移がディックモデルの普遍性クラスにさらに確認される。このモデルは、光-物質結合系の幅広い物理現象をシミュレートすることができ、様々な量子情報技術の将来の発展に応用することができる。 The interplay of interactions, symmetries and gauge fields usually leads to intriguing quantum many-body phases. To explore the nature of emerging phases, we study a quantum Rabi triangle system as an elementary building block for synthesizing an artificial magnetic field. We develop an analytical approach to study the rich phase diagram and the associated quantum criticality. Of particular interest is the emergence of a chiral-coherent phase, which breaks both the $\mathbb{Z}_2$ and the chiral symmetry. In this chiral phase, photons flow unidirectionally and the chirality can be tuned by the artificial gauge field, exhibiting a signature of broken time-reversal symmetry. The finite-frequency scaling analysis further confirms the associated phase transition to be in the universality class of the Dicke model. This model can simulate a broad range of physical phenomena of light-matter coupling systems, and may have an application in future developments of various quantum information technologies.	翻訳日:2023-04-23 09:14:35 公開日:2021-07-10
# 円柱状光格子における空間時間対称性、軌道磁気、動的ベリー曲率 Intertwined Space-Time Symmetry, Orbital Magnetism and Dynamical Berry Curvature in a Circularly Shaken Optical Lattice ( http://arxiv.org/abs/2012.01822v2 ) ライセンス: Link先を確認	Hua Chen and W. Vincent Liu	(参考訳) 空間的および時間的次元において周期性を示す(2+1)次元の時空格子である2次元光学格子の円形揺動について検討する。ここで考慮された近共振光揺らぎは、振動周波数の光子を転送することで、低次の$s$バンドと最初の$p$バンドを動的に結合する。交叉型時空対称性はさらに発見され、一般化されたブロッホ・フロケの定理で解決されたスペクトルの縮退を解明する。円揺動のキラリティの設定は、時間反転対称性を明示的に破壊し、 $p_\pm = p_x \pm ip_y$ 軌道の縮退を持ち上げ、軌道磁気の局所循環、すなわち $p_\pm$ 軌道の非平衡占有をもたらす。さらに, ベリー接続のダイナミクスは, ベリー曲率の時間発展と, 実験において物理的に観測可能な効果を持つ分極によって明らかにされる。興味深いことに、動力学は、時間の分数変換を伴う時間ねじ回転対称性によって制御される普遍的な位相シフトによって特徴づけられる。これらの結果は、現在の格子シェイキングスキームが軌道物理学と対称性に保護されたダイナミクスを研究するための多用途なプラットフォームであることを示唆している。 We study the circular shaking of a two dimensional optical lattice, which is essentially a (2+1) dimensional space-time lattice exhibiting periodicities in both spatial and temporal dimensions. The near-resonant optical shaking considered here dynamically couples the low-lying $s$ band and the first excited $p$ bands by transferring a photon of shaking frequency. The intertwined space-time symmetries are further uncovered to elucidate the degeneracy in the spectrum solved with the generalized Bloch-Floquet theorem. Setting the chirality of circular shaking explicitly breaks time reversal symmetry and lifts the degeneracy of $p_\pm = p_x \pm ip_y$ orbitals, leading to the local circulation of orbital magnetism, i.e the imbalanced occupation in $p_\pm$ orbitals. Moreover, the dynamics of Berry connection is revealed by the time evolution of the Berry curvature and the polarization, which have physical observable effects in experiments. Interestingly, the dynamics is found characterized by a universal phase shift, governed by the time screw rotational symmetry involving a fractional translation of time. These findings suggest that the present lattice-shaking scheme provides a versatile platform for the investigation of the orbital physics and the symmetry-protected dynamics.	翻訳日:2023-04-22 05:36:10 公開日:2021-07-10
# 量子論のオントロジモデルに対するノーゴー定理 A no-go theorem for Quantum theory ontological models ( http://arxiv.org/abs/2012.05712v2 ) ライセンス: Link先を確認	Tung Ten Yong	(参考訳) 本稿では,系の量子状態が系の独立現実を表す物理状態の集合に対応できないという意味で,量子力学は存在論的モデルを認めないことを示す。ウィグナーの友人のシナリオに基づく2つの思考実験を通じて、実験室における物理系のオンティック状態がウィグナーと彼の友人にとって同じである場合、pbr定理、量子論的予測、そして「超決定論」の仮定の1つに違反することを示した。 In this paper, we show that Quantum Mechanics does not admit ontological models, in the sense that the quantum state of a system cannot correspond to a set of physical states representing the independent reality of the system. We show, via two thought experiments based on the Wigner's friend scenario, that if the ontic state of physical systems in the lab is the same for Wigner and for his friend, one of the following will be violated: PBR theorem, Quantum-theoretic predictions, and the "No-superdeterminism" assumption.	翻訳日:2023-04-21 07:49:59 公開日:2021-07-10
# 遺伝的プログラミングにおけるモジュールのタグベースの制御は、文脈依存問題解決を改善する Tag-based regulation of modules in genetic programming improves context-dependent problem solving ( http://arxiv.org/abs/2012.09229v3 ) ライセンス: Link先を確認	Alexander Lalejini, Matthew Andres Moreno, and Charles Ofria	(参考訳) 我々は、プログラムがどのコードモジュールを表現するかを動的に調整できる新しい遺伝的プログラミング(gp)技術であるtag-based genetic regulationの有用性を紹介、実証する。タグは進化可能なラベルであり、コードモジュールを参照するための柔軟なメカニズムを提供する。タグベースの遺伝的規制は、既存のタグベースの命名スキームを拡張し、プログラムが表現パターンを変更するためにコードモジュールを「宣伝」したり「抑圧」したりできる。この拡張により、モジュールが命令実行に基づいて制御される遺伝子制御ネットワークとしてプログラムを構築することができる。本稿では,プログラム合成問題に対するタグベースの制御の機能を実証する。タグベースの規制は,事前入力に基づいてプログラムが現在の入力にどのように反応するかを調整しなければならないという,文脈依存の問題に対する問題解決性能を向上させることが判明した。実際、システムは規制を加えるまで、文脈依存の問題に対する解決策を進化させることができなかった。しかし、タグに基づく遺伝子制御の実装は、普遍的に有益ではない。特定の入力に対する正しい応答が決して変化しないシナリオを特定し、タグベースの規制を不要な機能として、時には適応的な進化を妨げる可能性がある。タグベースの遺伝的レギュレーションは、よりダイナミックな遺伝的プログラムを進化させるための技術のレパートリーを広げ、既存のタグ対応GPシステムに容易に組み込むことができる。 We introduce and experimentally demonstrate the utility of tag-based genetic regulation, a new genetic programming (GP) technique that allows programs to dynamically adjust which code modules to express. Tags are evolvable labels that provide a flexible mechanism for referencing code modules. Tag-based genetic regulation extends existing tag-based naming schemes to allow programs to "promote" and "repress" code modules in order to alter expression patterns. This extension allows evolution to structure a program as a gene regulatory network where modules are regulated based on instruction executions. We demonstrate the functionality of tag-based regulation on a range of program synthesis problems. We find that tag-based regulation improves problem-solving performance on context-dependent problems; that is, problems where programs must adjust how they respond to current inputs based on prior inputs. Indeed, the system could not evolve solutions to some context-dependent problems until regulation was added. Our implementation of tag-based genetic regulation is not universally beneficial, however. We identify scenarios where the correct response to a particular input never changes, rendering tag-based regulation an unneeded functionality that can sometimes impede adaptive evolution. Tag-based genetic regulation broadens our repertoire of techniques for evolving more dynamic genetic programs and can easily be incorporated into existing tag-enabled GP systems.	翻訳日:2023-04-20 11:04:21 公開日:2021-07-10
# ボソニック$p$帯三角格子における軌道次数 Orbital order in a bosonic $p$-band triangular lattice ( http://arxiv.org/abs/2102.05319v2 ) ライセンス: Link先を確認	Hua Chen and X. C. Xie	(参考訳) 超流体-モット絶縁体遷移における軌道秩序の進化に着目して,Bose-Hubbardモデルの詳細を$p$バンド三角形格子で示す。 2つの異なる相が超流動状態にある。これらの位相の1つが弱い相互作用限界を断定的に結合する。この位相は、軸軸の$p_\pm=p_x \pm ip_y$ と、平面の $p_\theta=\cos\theta p_x+\sin\theta p_y$ の軌道秩序の相互結合によって特徴づけられる。さらに、計算されたボゴリューボフ励起スペクトルは、元のディラック点を単一粒子スペクトルで差すが、創発的なディラック点を示す。もう1つの超流動相はmott絶縁体に近接しており、単位ボソン充填は平面内強磁性軌道秩序の崩壊を示している。最後に、mott絶縁体相のために軌道交換モデルを構築する。その古典的な基底状態は、面内軌道空間における創発的なSO$(2)$回転対称性を持ち、従って無限の縮退を楽しみ、最終的には障害機構の順序によって軌道変動によって持ち上げられる。系統解析から,Mott絶縁体相の面内フェロ軌道秩序は後者の超流動相と一致し,進化する可能性が示唆された。 We present a detailed study of the Bose-Hubbard model in a $p$-band triangular lattice by focusing on the evolution of orbital order across the superfluid-Mott insulator transition. Two distinct phases are found in the superfluid regime. One of these phases adiabatically connects the weak interacting limit. This phase is characterized by the intertwining of axial $p_\pm=p_x \pm ip_y$ and in-plane $p_\theta=\cos\theta p_x+\sin\theta p_y$ orbital orders, which break the time-reversal symmetry and lattice symmetries simultaneously. In addition, the calculated Bogoliubov excitation spectrum gaps the original Dirac points in the single-particle spectrum but exhibits emergent Dirac points. The other superfluid phase in close proximity to the Mott insulator with unit boson filling shows a detwined in-plane ferro-orbital order. Finally, an orbital exchange model is constructed for the Mott insulator phase. Its classical ground state has an emergent SO$(2)$ rotational symmetry in the in-plane orbital space and therefore enjoys an infinite degeneracy, which is ultimately lifted by the orbital fluctuation via the order by disorder mechanism. Our systematic analysis suggests that the in-plane ferro-orbital order in the Mott insulator phase agrees with and likely evolves from the latter superfluid phase.	翻訳日:2023-04-12 00:58:36 公開日:2021-07-10
# 量子貯留層計算とエクストリームラーニングマシンの可能性 Opportunities in Quantum Reservoir Computing and Extreme Learning Machines ( http://arxiv.org/abs/2102.11831v2 ) ライセンス: Link先を確認	Pere Mujal, Rodrigo Mart\'inez-Pe\~na, Johannes Nokkala, Jorge Garc\'ia-Beni, Gian Luca Giorgi, Miguel C. Soriano, Roberta Zambrini	(参考訳) 量子貯水池コンピューティング (QRC) と量子極端学習マシン (QELM) は、古典的および量子機械学習タスクにおいてその可能性を実証する2つの新しいアプローチである。物理システムの量子性と簡単なトレーニング戦略を組み合わせることで、優れたパフォーマンスを実現している。これらの非従来型コンピューティングアプローチへの関心の高まりは、実装に適した多様な量子プラットフォームと、複雑な量子システムの研究における理論的進歩によって加速される。本稿では, 量子入力, 量子物理基板, 量子タスクを考慮した場合, 様々な可能性を示す最近の提案と最初の実験について述べる。主な焦点はこれらのアプローチのパフォーマンスであり、古典的なアプローチや機会に対するアドバンテージである。 Quantum reservoir computing (QRC) and quantum extreme learning machines (QELM) are two emerging approaches that have demonstrated their potential both in classical and quantum machine learning tasks. They exploit the quantumness of physical systems combined with an easy training strategy, achieving an excellent performance. The increasing interest in these unconventional computing approaches is fueled by the availability of diverse quantum platforms suitable for implementation and the theoretical progresses in the study of complex quantum systems. In this review article, recent proposals and first experiments displaying a broad range of possibilities are reviewed when quantum inputs, quantum physical substrates and quantum tasks are considered. The main focus is the performance of these approaches, on the advantages with respect to classical counterparts and opportunities.	翻訳日:2023-04-10 03:14:43 公開日:2021-07-10
# 過去と未来情報を用いた量子状態推定の統一理論 Unifying theory of quantum state estimation using past and future information ( http://arxiv.org/abs/2104.02911v2 ) ライセンス: Link先を確認	Areeya Chantasri, Ivonne Guevara, Kiarn T. Laverick, and Howard M. Wiseman	(参考訳) 連続監視された力学系に対する量子状態推定は、連続的な観測結果に基づいて、ある時点で量子状態を個々の系に割り当てることを含む。推定の質は、観測情報の使用量と、推定値に対する最適性の定義に依存する。本研究では, 測定記録の一部が入手できない量子状態推定の問題を考えるが, 利用可能な記録が前(ペースト)と後(将来の)の両方から得られる場合, 過去の情報のみを用いた場合よりもよい推定が可能となる。量子系の過去の情報は、特に量子状態の滑らか化、最も類似した経路、二状態ベクトルおよび関連する形式論において様々な方法で使われている。このような一見無関係なアプローチを統一するために、連続的な監視を伴う部分観測量子システムのためのフレームワークを提案し、いくつかの一般化により、最初の2つの既定形式を許容できる。統一フレームワークは、期待されるコスト最小化による状態推定に基づいており、コストは未知のレコードの空間または未知の真の状態の空間で定義することができる。さらに,新しいコスト関数を5つ定義することで,既存の3つのアプローチを概念的に結合し,それらのギャップを埋める新たなタイプの推定器を提案する。本手法は, ボソニック浴槽に散逸的に結合した駆動型2レベルシステムの例として, 提案する7つの推定値すべてを計算することで, 適用性を示す。我々の理論はまた、古典状態推定への接続を可能にし、量子状態推定器間のさらなる概念的リンクを生み出します。 Quantum state estimation for continuously monitored dynamical systems involves assigning a quantum state to an individual system at some time, conditioned on the results of continuous observations. The quality of the estimation depends on how much observed information is used and on how optimality is defined for the estimate. In this work, we consider problems of quantum state estimation where some of the measurement records are not available, but where the available records come from both before (past) and after (future) the estimation time, enabling better estimates than is possible using the past information alone. Past-future information for quantum systems has been used in various ways in the literature, in particular, the quantum state smoothing, the most-likely path, and the two-state vector and related formalisms. To unify these seemingly unrelated approaches, we propose a framework for partially-observed quantum system with continuous monitoring, wherein the first two existing formalisms can be accommodated, with some generalization. The unifying framework is based on state estimation with expected cost minimization, where the cost can be defined either in the space of the unknown record or in the space of the unknown true state. Moreover, we connect all three existing approaches conceptually by defining five new cost functions, and thus new types of estimators, which bridge the gaps between them. We illustrate the applicability of our method by calculating all seven estimators we consider for the example of a driven two-level system dissipatively coupled to bosonic baths. Our theory also allows connections to classical state estimation, which create further conceptual links between our quantum state estimators.	翻訳日:2023-04-05 02:28:25 公開日:2021-07-10
# WiFiMod:パッシブセンシングを用いたトランスフォーマーを用いた室内移動モデリング WiFiMod: Transformer-based Indoor Human Mobility Modeling using Passive Sensing ( http://arxiv.org/abs/2104.09835v3 ) ライセンス: Link先を確認	Amee Trivedi, Kate Silverstein, Emma Strubell, Mohit Iyyer, Prashant Shenoy	(参考訳) 人体移動のモデル化は、都市計画から病気拡散のシミュレーションまで幅広い応用がある。人間は屋内で80%の時間を過ごすことはよく知られているが、室内での移動のモデル化は3つの主な理由から困難である。 (i)簡単に入手でき、信頼性があり、低コストな屋内移動度データセットがないこと。 (ii)頻繁な屋内移動のモデル化における高い予測空間 (iii)移動度におけるマルチスカラー周期性と相関これらの課題に対処するために,WiFi システムログを用いて屋内の人体移動を複数の空間スケールでモデル化する Transformer ベースのデータ駆動型アプローチである WiFiMod を提案する。 WiFiModは、入力されたエンタープライズWiFiシステムログとして、スマートフォンのデジタルトレースから人間の移動軌跡を抽出する。次に,複数の空間的スケール,マクロ,マイクロの移動性特徴を特定し,複数の空間的粒度にわたってユーザの移動性を数時間から1日にわたって予測するマルチモーダル組込みトランスを設計した。マルチモーダル埋め込みは様々なスケールの移動周期と相関を捉え、トランスフォーマーは長期移動依存を捉え、モデル予測性能を高める。このアプローチは、まずマクロモビリティを予測し、次に推定マクロモビリティ分布に基づく屋内モビリティ、マイクロモビリティをモデル化し、マクロモビリティのトポロジカル制約を用いて予測空間を大幅に削減する。実験の結果、WiFiModは現在の最先端モデルよりも少なくとも10%高い精度で予測できることがわかった。さらにWiFiModの3つの実環境アプリケーションについても紹介する。 (i)covid-19またはiliの政策決定のための高密度ホットポケットの予測 (ii)室内移動の現実的なシミュレーションを生成する。 (iii)パーソナルアシスタントのデザイン。 Modeling human mobility has a wide range of applications from urban planning to simulations of disease spread. It is well known that humans spend 80% of their time indoors but modeling indoor human mobility is challenging due to three main reasons: (i) the absence of easily acquirable, reliable, low-cost indoor mobility datasets, (ii) high prediction space in modeling the frequent indoor mobility, and (iii) multi-scalar periodicity and correlations in mobility. To deal with all these challenges, we propose WiFiMod, a Transformer-based, data-driven approach that models indoor human mobility at multiple spatial scales using WiFi system logs. WiFiMod takes as input enterprise WiFi system logs to extract human mobility trajectories from smartphone digital traces. Next, for each extracted trajectory, we identify the mobility features at multiple spatial scales, macro, and micro, to design a multi-modal embedding Transformer that predicts user mobility for several hours to an entire day across multiple spatial granularities. Multi-modal embedding captures the mobility periodicity and correlations across various scales while Transformers capture long-term mobility dependencies boosting model prediction performance. This approach significantly reduces the prediction space by first predicting macro mobility, then modeling indoor scale mobility, micro-mobility, conditioned on the estimated macro mobility distribution, thereby using the topological constraint of the macro-scale. Experimental results show that WiFiMod achieves a prediction accuracy of at least 10% points higher than the current state-of-art models. Additionally, we present 3 real-world applications of WiFiMod - (i) predict high-density hot pockets for policy-making decisions for COVID19 or ILI, (ii) generate a realistic simulation of indoor mobility, (iii) design personal assistants.	翻訳日:2023-04-03 02:39:09 公開日:2021-07-10
# MIMOテラヘルツ量子鍵分布 MIMO Terahertz Quantum Key Distribution ( http://arxiv.org/abs/2105.03642v3 ) ライセンス: Link先を確認	Neel Kanth Kundu, Soumya P. Dash, Matthew R. McKay, and Ranjan K. Mallik	(参考訳) 室温で動作するテラヘルツ(THz)周波数アプリケーションのための多重出力多重出力(MIMO)量子鍵分布(QKD)方式を提案する。古典的MIMO通信により、AliceとBobの間のランク$r$MIMOチャネルを$r$並列損失量子チャネルに変換する送信受信ビームフォーミングスキームが提案されている。既存の単一アンテナqkd方式と比較すると,mimo qkd方式は秘密鍵レートを増加させ,伝送距離を延ばすことで性能向上をもたらすことを示す。シミュレーションの結果,THz周波数での高自由空間パス損失を克服するためには,複数のアンテナが必要であることがわかった。性能と周波数の非単調な関係を実証し,10-30$THzの周波数範囲で正の鍵レートが達成可能であることを示す。提案手法は,第5世代超セキュア無線通信システムにおける屋内および屋外のQKD応用に適用可能である。 We propose a multiple-input multiple-output (MIMO) quantum key distribution (QKD) scheme for terahertz (THz) frequency applications operating at room temperature. Motivated by classical MIMO communications, a transmit-receive beamforming scheme is proposed that converts the rank-$r$ MIMO channel between Alice and Bob into $r$ parallel lossy quantum channels. Compared with existing single-antenna QKD schemes, we demonstrate that the MIMO QKD scheme leads to performance improvements by increasing the secret key rate and extending the transmission distance. Our simulation results show that multiple antennas are necessary to overcome the high free-space path loss at THz frequencies. We demonstrate a non-monotonic relation between performance and frequency, and reveal that positive key rates are achievable in the $10-30$ THz frequency range. The proposed scheme can be used for both indoor and outdoor QKD applications for beyond fifth-generation ultra-secure wireless communications systems.	翻訳日:2023-04-01 03:28:10 公開日:2021-07-10
# 5G eHealth Systems, Technologies, ユースケース, 今後の課題 Design and Implementation of 5G eHealth Systems, Technologies, Use Cases and Future Challenges ( http://arxiv.org/abs/2106.05086v2 ) ライセンス: Link先を確認	Di Zhang, Joel J. P. C. Rodrigues, Yunkai Zhai, Takuro Sato	(参考訳) 第5世代(5G)は、高信頼性、低レイテンシ、さらに高速な送信速度で巨大なデバイスを接続することを目的としている。しかし、現在の5G E-Healthシステムへの取り組みは、その完全な青写真を達成するには不十分だ。本稿では,まず,5g e-healthシステムの設計における物理層,上層層,クロス層の観点からの関連技術について論じる。その後、遠隔医療用5G e-healthシステムとコビッドウイルス感染防止用5G e-healthシステムという2つのユースケースを実施。 5G e-health システムの今後の研究動向と課題について検討する。 Fifth generation (5G) aims to connect massive devices with even higher reliability, lower latency and even faster transmission speed, which are vital for implementing the e-health systems. However, the current efforts on 5G e-health systems are still not enough to accomplish its full blueprint. In this article, we first discuss the related technologies from physical layer, upper layer and cross layer perspectives on designing the 5G e-health systems. We afterwards elaborate two use cases according to our implementations, i.e., 5G e-health systems for remote health and 5G e-health systems for Covid-19 pandemic containment. We finally envision the future research trends and challenges of 5G e-health systems.	翻訳日:2023-03-27 04:19:12 公開日:2021-07-10
# 共形場理論の復号化-教師なしから教師なし学習へ Decoding conformal field theories: from supervised to unsupervised learning ( http://arxiv.org/abs/2106.13485v2 ) ライセンス: Link先を確認	En-Jui Kuo, Alireza Seif, Rex Lundgren, Seth Whitsitt, Mohammad Hafezi	(参考訳) 我々は機械学習を用いて有理2次元等角場理論を分類する。まず,これらの最小モデルのエネルギースペクトルを用いて教師あり学習アルゴリズムを学習する。機械はエネルギースペクトルのみを用いて、いくつかの強い相関を持つスピンモデルの臨界点の性質と値を正確に予測できることがわかった。これは、機械学習を用いて物質の異なる相を分類する以前の研究とは対照的であるが、相間の臨界点の性質を明らかにしない。ある種のトポロジカル位相の基底状態のハミルトニアンも共形場理論によって記述されるので、R\'{e}yniエントロピーの教師付き学習を用いて、機械が高い精度でR\'{e}yniエントロピーが最小のR\'{e}yniエントロピーのみを持つハミルトニアンをどの共形場理論で記述するかを特定できることを示した。最後に、教師なし学習アルゴリズムであるオートエンコーダを用いて、中央電荷と直接相関する隠れ変数を発見し、機械学習を用いて高次元の物体を含む他の共形場理論を研究するための展望を議論する。その結果、機械学習は臨界点の発見と特徴付けに利用でき、さらに機械学習を用いてより複雑な共形場の理論を学習する興味深い可能性を示唆している。 We use machine learning to classify rational two-dimensional conformal field theories. We first use the energy spectra of these minimal models to train a supervised learning algorithm. We find that the machine is able to correctly predict the nature and the value of critical points of several strongly correlated spin models using only their energy spectra. This is in contrast to previous works that use machine learning to classify different phases of matter, but do not reveal the nature of the critical point between phases. Given that the ground-state entanglement Hamiltonian of certain topological phases of matter is also described by conformal field theories, we use supervised learning on R\'{e}yni entropies and find that the machine is able to identify which conformal field theory describes the entanglement Hamiltonian with only the lowest few R\'{e}yni entropies to a high degree of accuracy. Finally, using autoencoders, an unsupervised learning algorithm, we find a hidden variable that has a direct correlation with the central charge and discuss prospects for using machine learning to investigate other conformal field theories, including higher-dimensional ones. Our results highlight that machine learning can be used to find and characterize critical points and also hint at the intriguing possibility to use machine learning to learn about more complex conformal field theories.	翻訳日:2023-03-25 14:07:20 公開日:2021-07-10
# フレーム重ね合わせクラスタ : 遷移行列を高精度に導出する方法 Frame Superposition Cluster: The method to derive the transition matrices in high accuracy ( http://arxiv.org/abs/2107.02979v2 ) ライセンス: Link先を確認	Hikaru Wakaura and Takao Tomono	(参考訳) 変分量子固有ソルバ(vqe)法は、量子コンピュータを用いて固有エネルギーから波動関数を導出する方法である。しかし、それらの重畳状態を利用する方法はまだ開発されていない。 VQE法により導かれる2つの状態間の可観測物のオフ対角要素を計算するには、ノイズ中間スケール量子(NISQ)デバイスに対する多数のゲート演算が必要となる。 VQE法により導出された状態間の重ね合わせ状態を駆動する新しい手法を提案した。 Frame superposition cluster (FSC) と呼ぶ。本手法を用いて, 双極子遷移モーメントを他の方法と比較して高い精度で計算できることを確認した。 Variational Quantum Eigensolver (VQE) method is the way to derive the wave functions from their eigen energies using quantum computers. But, the methods to utilize the superposition states between them haven't been developed yet. The method to calculate the off diagonal element of observables between two states derived by VQE method requires large numbers of gate operations for Noisy Intermediate Scale Quantum (NISQ) devices. We proposed the novel method to drive the superposition state between the states the derived by VQE method. We call it Frame superposition cluster (FSC) method. Using the method, we confirmed that dipole transition moment could be calculated with the highest accuracy compared to other methods.	翻訳日:2023-03-23 04:32:31 公開日:2021-07-10
# 好中球のメタ・アプレンジザド・パラオシミザカオ・デ・パラメトロス Meta-aprendizado para otimizacao de parametros de redes neurais ( http://arxiv.org/abs/2109.13745v1 ) ライセンス: Link先を確認	Tarsicio Lucas, Teresa Ludermir, Ricardo Prudencio, Carlos Soares	(参考訳) ANN(Artificial Neural Networks)の最適化は,これらのモデルを現実のアプリケーションで使用する上で重要な課題である。このタスクに採用されるソリューションは一般的に高価であり、試行錯誤手順や専門家の知識が常に利用できるわけではない。本研究では,ANNの最適化にメタラーニングを用いることを検討した。メタ学習は,学習問題の特徴と学習アルゴリズムの性能を関連付ける知識の自動獲得を目的とした研究分野である。メタラーニング手法は,元来アルゴリズム選択問題に対して提案され,その後に支援ベクトルマシンのパラメータの最適化を行った。しかし、メタラーニングは、ANNパラメータを最適化するためのより一般的な戦略として採用され、この研究の方向性における新たな取り組みの動機となっている。本研究では,mlpネットワークにおける隠れノード数をメタラーニングを用いて選択するケーススタディを行った。本研究では,93の回帰問題に関連するメタサンプルを作成した。それぞれのメタサンプルは回帰問題から生成され、格納される: 問題を記述する16の特徴(例えば、問題属性の属性の数と相関)と、可能な範囲から実験的に選択されたこの問題のノードの最大数。この一連のメタサンプルはmeta-learnerへの入力として与えられ、その特徴に基づいて新しい問題に対して最適なノード数を予測することができた。実験の結果, 良好な結果が得られた。 The optimization of Artificial Neural Networks (ANNs) is an important task to the success of using these models in real-world applications. The solutions adopted to this task are expensive in general, involving trial-and-error procedures or expert knowledge which are not always available. In this work, we investigated the use of meta-learning to the optimization of ANNs. Meta-learning is a research field aiming to automatically acquiring knowledge which relates features of the learning problems to the performance of the learning algorithms. The meta-learning techniques were originally proposed and evaluated to the algorithm selection problem and after to the optimization of parameters for Support Vector Machines. However, meta-learning can be adopted as a more general strategy to optimize ANN parameters, which motivates new efforts in this research direction. In the current work, we performed a case study using meta-learning to choose the number of hidden nodes for MLP networks, which is an important parameter to be defined aiming a good networks performance. In our work, we generated a base of meta-examples associated to 93 regression problems. Each meta-example was generated from a regression problem and stored: 16 features describing the problem (e.g., number of attributes and correlation among the problem attributes) and the best number of nodes for this problem, empirically chosen from a range of possible values. This set of meta-examples was given as input to a meta-learner which was able to predict the best number of nodes for new problems based on their features. The experiments performed in this case study revealed satisfactory results.	翻訳日:2023-03-22 21:58:24 公開日:2021-07-10
# 高等教育機関ウェブサイトの訪問者データの解析 Analysis of the Visitor Data of a Higher Education Institution Website ( http://arxiv.org/abs/2107.14107v1 ) ライセンス: Link先を確認	Omer Aydin	(参考訳) 今日の世界では、インターネットは人間の生活のあらゆる側面に影響を与えており、企業ウェブサイトや他の多くの分野にも変化をもたらしている。企業のWebサイトは、よりダイナミックでインタラクティブで、新しい技術との互換性が高まるべきだ。 webサイトとユーザ、検索エンジン、その他のデバイスとのインタラクションは、専門家によって検証され、このインタラクションのために改善と変更が行われるべきである。本研究では,高等教育機関のウェブサイトを調査した。分析には2013年から2019年にかけて収集された訪問者データを用いた。幅広い調査・データを含む本研究では,交通分析から開発提案までの重要な知見が盛り込まれている。特に,モバイル端末との互換性,画像と動画の最適化,ユーザの地理的特徴,言語オプション,時間とともにアクセスされるコンテンツの密度分析などを通じて有用な情報を得た。 In todays world, the internet affects every aspect of human life; it has caused changes in corporate websites as well as in many other areas. Corporate websites should be more dynamic, more interactive, and more compatible with new technologies. The interaction of the website with users, search engines, and other devices has to be examined by experts, and improvements and changes should be made for this interaction. In this study, a higher education institution website was examined. Visitor data collected between 2013 and 2019 were used for the analysis. In the study, which includes a wide range of examinations and data, important findings from traffic analysis to development suggestions were included. In particular, useful information has been obtained through the compatibility of the site with mobile devices, optimization of pictures and videos, geographical features of users, language options, and density analysis of the content accessed over time.	翻訳日:2023-03-22 21:57:59 公開日:2021-07-10
# ワイルディスクにおけるホロノミック量子操作 Holonomic quantum manipulation in the Weyl Disk ( http://arxiv.org/abs/2107.04814v1 ) ライセンス: Link先を確認	Victor Boogers, Janis Erdmanis, Yuli Nazarov	(参考訳) 超伝導ナノ構造のワイル点が、2つの量子状態がパラメトリック空間の2次元多様体においてほぼ退化するワイル円板を生じる可能性があることが示されている。これによりホロノミック量子操作の可能性が開かれ、縮退多様体内のパラメータの断熱的変化による波動関数の変換が実現される。本稿では,ワイルディスクにおけるホロノミック操作の機会について詳細に検討する。準古典近似で多様体の接続を計算し、アベリアンであることを示し、位相ゲートとして使うことができる。状態の準備と読み出しを含む量子操作のクローズドな例を提供するため、縮退した部分空間からシステムを引き出すパラメータの変更によりホロノミックゲートを補強する。数値図解には、準古典的パラメータの有限値と正確な量子力学を用いる。異なる実行時間に対するサンプルゲートの忠実度について検討する。 It has been shown that a Weyl point in a superconducting nanostructure may give rise to a Weyl disk where two quantum states are almost degenerate in a 2D manifold in the parametric space. This opens up the possibility of a holonomic quantum manipulation: a transformation of the wave function upon adiabatic change of the parameters within the degenerate manifold. In this paper, we investigate in detail the opportunities for holonomic manipulation in Weyl disks. We compute the connection at the manifold in quasiclassical approximation to show it is Abelian and can be used for a phase gate. To provide a closed example of quantum manipulation that includes a state preparation and read-out, we augment the holonomic gate with a change of parameters that brings the system out of the degenerate subspace. For numerical illustrations, we use a finite value of quasiclassical parameter and exact quantum dynamics. We investigate the fidelity of an example gate for different execution times.	翻訳日:2023-03-22 21:56:43 公開日:2021-07-10
# フラクタル量子力学によるブラックホール熱力学の展望 Prospecting Black Hole Thermodynamics with Fractional Quantum Mechanics ( http://arxiv.org/abs/2107.04789v1 ) ライセンス: Link先を確認	S. Jalalzadeh, F. Rodrigues da Silva and P. V. Moniz	(参考訳) 本稿では、分数量子力学の枠組みがブラックホール熱力学の視点を広げるかどうかを考察する。具体的には、主ツールとして {\it space-fractional} 微分 \cite{Rie} を用いる。さらに、解析はシュワルツシルト構成の場合に限定される。その後修正されたウィーラー・ドウィット方程式から、対応する特定の観測可能な式を取得する。つまり、ブラックホール質量スペクトルは$M$、温度は$T$、エントロピーは$S$である。これらの熊の連続的な変化は、分数パラメータ($\alpha$)を通して伝達される。特に、標準結果は特定の制限である$\alpha=2$で回収される。さらに、Tsallis と Cirto \cite{Tsallis} と Barrow \cite{Barrow} が提案するエントロピー-面積関係の一般化が、分数的な観点で補完的な解釈を得る方法について詳しく述べる。結果について徹底的に議論する。 This paper investigates whether the framework of fractional quantum mechanics can broaden our perspective of black hole thermodynamics. Concretely, we employ a {\it space-fractional} derivative \cite{Rie} as our main tool. Moreover, we restrict our analysis to the case of a Schwarzschild configuration. From a subsequently modified Wheeler-DeWitt equation, we retrieve the corresponding expressions for specific observables. Namely, the black hole mass spectrum, $M$, its temperature $T$, and entropy, $S$. We find that these bear consequential alterations conveyed through a fractional parameter, $\alpha$. In particular, the standard results are recovered in the specific limit $\alpha=2$. Furthermore, we elaborate how generalizations of the entropy-area relation suggested by Tsallis and Cirto \cite{Tsallis} and Barrow \cite{Barrow} acquire a complementary interpretation in terms of a fractional point of view. A thorough discussion of our results is presented.	翻訳日:2023-03-22 21:56:07 公開日:2021-07-10
# 改良チャプリギンガスによる暗黒物質ボース・アインシュタイン凝縮の粘性相互作用と安定性 Viscous interacting and stability on dark matter Bose-Einstein condensation with modified Chaplygin gas ( http://arxiv.org/abs/2107.04780v1 ) ライセンス: Link先を確認	E. Mahichi, Alireza Amani, and M. A. Ramzanpour	(参考訳) 本稿では,暗黒物質ボース・アインシュタイン凝縮(BEC)の存在下での粘性宇宙力学を曲面-FRW背景により研究する。この目的のために、我々は通常の暗黒物質(低温暗黒物質またはバロトロピック暗黒物質)ではなく、暗黒物質の状態方程式(eos)を重力形式から生じる$p_{dm} \propto \rho_{dm}^2$とするbecレジームを用いる。したがって、修正チャプリギンガスとの相互作用モデルを考えることにより、宇宙成分の存在を考慮した対応する連続性方程式を得る。その後、赤方偏移パラメータの観点から、エネルギー密度とダークエネルギーの圧力を導出する。そして、パラメトリゼーション関数を導入し、51の超新星データと確率解析を適合させることで、宇宙論的パラメータと赤方偏移パラメータを見いだす。以下に示すように、対応する動的グラフを赤方偏移に比例してプロットし、宇宙が現在加速展開段階にあることを示す。最後に, 本モデルの安定性と安定性について, 音速パラメータを用いて検討する。 In this paper, the viscous cosmological dynamics are studied in the presence of dark matter Bose-Einstein Condensation (BEC) by curved-FRW background. For this purpose, we use the BEC regime rather than the normal dark matter (the cold dark matter or the barotropic dark matter) with the dark matter Equation of State (EoS) as $p_{dm} \propto \rho_{dm}^2$, which arises from the gravitational form. Therefore, we obtain the corresponding continuity equations with the existence of the universe components by considering an interacting model with modified Chaplygin gas. Afterward, we derive the energy density and the pressure of dark energy in terms of the redshift parameter. And then, by introducing a parametrization function and fitting it with 51 supernova data with the likelihood analysis, we find the cosmological parameters versus redshift parameter. In what follows, we plot the corresponding dynamic graphs proportional to redshift, and then we represent the universe is currently undergoing an accelerated expansion phase. Finally, we explore the stability and the instability of the present model with the sound speed parameter.	翻訳日:2023-03-22 21:55:51 公開日:2021-07-10
# 標準模型物理学とデジタル量子革命:インターフェースを考える Standard Model Physics and the Digital Quantum Revolution: Thoughts about the Interface ( http://arxiv.org/abs/2107.04769v1 ) ライセンス: Link先を確認	Natalie Klco, Alessandro Roggero, Martin J. Savage	(参考訳) 量子システムの分離・制御・絡み合いの進歩は、かつての量子力学の興味深い特徴を、破壊的な科学的・技術的進歩のための乗り物へと変えつつある。 feynman氏が提唱したビジョンを追求するべく、多くの研究と開発分野にわたる協力的な取り組みが、ドメイン科学者が利用可能なコンピューティングエコシステムにプロトタイプ的デジタル量子デバイスを導入する。これらの初期の量子デバイスとの相互作用を通じて、古典的に難解な量子系を探索する抽象的なビジョンは、現実の現実へと進化している。これらの技術的進歩を触媒する以外に、絡み合いは量子相関の診断や組織ツールとして平行進行を可能にし、量子多体系と標準モデルから定義および出現する量子場理論の理解の改善を導く。 3つの領域科学理論家の視点からは、核・高エネルギー物理学の科学的目的により、最近のNISQ時代の進歩を文脈化するために、絡み合い、複雑性、量子シミュレーションのインターフェースについての考察をまとめる。 Advances in isolating, controlling and entangling quantum systems are transforming what was once a curious feature of quantum mechanics into a vehicle for disruptive scientific and technological progress. Pursuing the vision articulated by Feynman, a concerted effort across many areas of research and development is introducing prototypical digital quantum devices into the computing ecosystem available to domain scientists. Through interactions with these early quantum devices, the abstract vision of exploring classically-intractable quantum systems is evolving toward becoming a tangible reality. Beyond catalyzing these technological advances, entanglement is enabling parallel progress as a diagnostic for quantum correlations and as an organizational tool, both guiding improved understanding of quantum many-body systems and quantum field theories defining and emerging from the Standard Model. From the perspective of three domain science theorists, this article compiles thoughts about the interface on entanglement, complexity, and quantum simulation in an effort to contextualize recent NISQ-era progress with the scientific objectives of nuclear and high-energy physics.	翻訳日:2023-03-22 21:55:28 公開日:2021-07-10
# アンサンブル学習による行列補完問題としての非線形交通予測 Nonlinear Traffic Prediction as a Matrix Completion Problem with Ensemble Learning ( http://arxiv.org/abs/2001.02492v4 ) ライセンス: Link先を確認	Wenqing Li, Chuhan Yang, and Saif Eddin Jabari	(参考訳) 本稿では,信号化トラフィック運用管理における短期交通予測の問題に対処する。具体的には,高分解能(秒間)でのセンサ状態の予測に着目する。これは、通常5分未満の間隔で集約されたトラフィック変数を予測することに焦点を当てた従来のトラフィック予測問題とは対照的である。まず,予測問題を行列補完問題としてモデル化する方法を示す。第2に、ブロック座標降下アルゴリズムを用い、そのアルゴリズムがサブ線形時間でブロック座標最適化器に収束することを実証する。これにより,高分解能データの「大さ」を計算可能な方法で活用することができる。第3に,任意の誤差閾値内でトレーニングエラーを低減させるアンサンブル学習(適応ブースティング)手法を開発した。後者は過去数日間を利用して、データ内の周期的なパターンを捉えることができる。提案手法の性能を理論的に解析し,uaeのアブダビのシミュレーションデータと実世界の高分解能トラヒックデータセットを用いて実証実験を行った。実験の結果,提案手法は他の最先端アルゴリズムよりも優れていることがわかった。 This paper addresses the problem of short-term traffic prediction for signalized traffic operations management. Specifically, we focus on predicting sensor states in high-resolution (second-by-second). This contrasts with traditional traffic forecasting problems, which have focused on predicting aggregated traffic variables, typically over intervals that are no shorter than 5 minutes. Our contributions can be summarized as offering three insights: first, we show how the prediction problem can be modeled as a matrix completion problem. Second, we employ a block-coordinate descent algorithm and demonstrate that the algorithm converges in sub-linear time to a block coordinate-wise optimizer. This allows us to capitalize on the "bigness" of high-resolution data in a computationally feasible way. Third, we develop an ensemble learning (or adaptive boosting) approach to reduce the training error to within any arbitrary error threshold. The latter utilizes past days so that the boosting can be interpreted as capturing periodic patterns in the data. The performance of the proposed method is analyzed theoretically and tested empirically using both simulated data and a real-world high-resolution traffic dataset from Abu Dhabi, UAE. Our experimental results show that the proposed method outperforms other state-of-the-art algorithms.	翻訳日:2023-01-13 13:07:57 公開日:2021-07-10
# Monsterをバイパスする: 実現可能性下でのコンテキスト帯域の高速かつ簡易な最適アルゴリズム Bypassing the Monster: A Faster and Simpler Optimal Algorithm for Contextual Bandits under Realizability ( http://arxiv.org/abs/2003.12699v5 ) ライセンス: Link先を確認	David Simchi-Levi, Yunzong Xu	(参考訳) 一般化可能性仮定の下での一般(確率的な)文脈的包帯問題、すなわち、期待される報酬は、文脈と行動の函数として、一般函数クラス $\mathcal{F}$ に属する。我々は,すべての$T$ラウンドにわたるオフライン回帰オラクルへの${O}(\log T)$呼び出しだけで,統計的に最適な後悔を実現する,高速で単純なアルゴリズムを設計する。 oracleの呼び出し数はさらに$t$が事前に分かっている場合は$o(\log\log t)$に減らすことができる。本研究は,コンテキストバンディットからオフライン回帰への最初の普遍的かつ最適な還元を行い,コンテキストバンディット文学における重要なオープン問題を解決した。我々の結果の直接的な結果として、オフライン回帰の進展は直ちに統計的・計算的に文脈的帯域に変換される。これにより、より高速なアルゴリズムと、より広い階層の文脈的バンディット問題の後悔の保証がもたらされる。 We consider the general (stochastic) contextual bandit problem under the realizability assumption, i.e., the expected reward, as a function of contexts and actions, belongs to a general function class $\mathcal{F}$. We design a fast and simple algorithm that achieves the statistically optimal regret with only ${O}(\log T)$ calls to an offline regression oracle across all $T$ rounds. The number of oracle calls can be further reduced to $O(\log\log T)$ if $T$ is known in advance. Our results provide the first universal and optimal reduction from contextual bandits to offline regression, solving an important open problem in the contextual bandit literature. A direct consequence of our results is that any advances in offline regression immediately translate to contextual bandits, statistically and computationally. This leads to faster algorithms and improved regret guarantees for broader classes of contextual bandit problems.	翻訳日:2022-12-18 23:37:48 公開日:2021-07-10
# 深部異常検出における再考 Rethinking Assumptions in Deep Anomaly Detection ( http://arxiv.org/abs/2006.00339v2 ) ライセンス: Link先を確認	Lukas Ruff, Robert A. Vandermeulen, Billy Joe Franks, Klaus-Robert M\"uller, and Marius Kloft	(参考訳) 異常検出(英: anomaly detection, ad)は、分類問題(固有対異常)と見なすことができるが、通常は「異常」の意味を十分に特徴付けるデータセットであるため、教師なしの方法で扱われる。本稿では,この直感が画像上での深部ADにまで拡張されないことを示す。 ImageNetの最近のADベンチマークでは、通常のサンプルと数個の(64)ランダムな自然画像の区別を訓練された分類器が、ADの最先端技術よりも優れている。画像データの多スケール構造は,異常を例示的に有益であることを示す。 Though anomaly detection (AD) can be viewed as a classification problem (nominal vs. anomalous) it is usually treated in an unsupervised manner since one typically does not have access to, or it is infeasible to utilize, a dataset that sufficiently characterizes what it means to be "anomalous." In this paper we present results demonstrating that this intuition surprisingly seems not to extend to deep AD on images. For a recent AD benchmark on ImageNet, classifiers trained to discern between normal samples and just a few (64) random natural images are able to outperform the current state of the art in deep AD. Experimentally we discover that the multiscale structure of image data makes example anomalies exceptionally informative.	翻訳日:2022-11-26 17:42:47 公開日:2021-07-10
# 凸ニューラルネットワークの奇妙なケース The Curious Case of Convex Neural Networks ( http://arxiv.org/abs/2006.05103v3 ) ライセンス: Link先を確認	Sarath Sivaprasad, Ankur Singh, Naresh Manwani, Vineet Gandhi	(参考訳) 本稿では,入力の凸関数を出力とするニューラルネットワークの制約付き定式化について検討する。完全連結層と畳み込み層の両方で凸性制約を強制でき、ほとんどのアーキテクチャに適用できることを示した。凸性制約は、(第一層を除いて)重みを非負に制限し、非減少凸活性化関数を使用することを含む。単純ではあるが、これらの制約はネットワークの一般化能力に深い影響を及ぼす。 3つの貴重な洞察を導きます (a)入力出力凸ニューラルネットワーク(IOC-NN)の自己正規化と過度適合問題の低減 (b)厳しい制約があるにもかかわらず、ベース多層パーセプトロンよりも優れ、ベース畳み込みアーキテクチャと同じような性能を達成する。 (c)IOC-NNは,列車ラベルの騒音に対する堅牢性を示す。 3種類のニューラルネットワークアーキテクチャを用いた標準画像分類データセットの徹底的な実験とアブレーション実験を用いて,提案手法の有効性を実証する。 In this paper, we investigate a constrained formulation of neural networks where the output is a convex function of the input. We show that the convexity constraints can be enforced on both fully connected and convolutional layers, making them applicable to most architectures. The convexity constraints include restricting the weights (for all but the first layer) to be non-negative and using a non-decreasing convex activation function. Albeit simple, these constraints have profound implications on the generalization abilities of the network. We draw three valuable insights: (a) Input Output Convex Neural Networks (IOC-NNs) self regularize and reduce the problem of overfitting; (b) Although heavily constrained, they outperform the base multi layer perceptrons and achieve similar performance as compared to base convolutional architectures and (c) IOC-NNs show robustness to noise in train labels. We demonstrate the efficacy of the proposed idea using thorough experiments and ablation studies on standard image classification datasets with three different neural network architectures.	翻訳日:2022-11-23 13:41:39 公開日:2021-07-10
# 制約ミスマッチポリシーによる安全強化学習の促進 Accelerating Safe Reinforcement Learning with Constraint-mismatched Policies ( http://arxiv.org/abs/2006.11645v3 ) ライセンス: Link先を確認	Tsung-Yen Yang and Justinian Rosca and Karthik Narasimhan and Peter J. Ramadge	(参考訳) 本研究では,(1)ベースライン制御ポリシと(2)学習者が満たさなければならない制約のセットを備える場合の強化学習の問題を考える。基本方針は、デモンストレーションデータや教師エージェントから生じて、学習に有用な手がかりを提供することもあるが、手元にあるタスクには準最適であり、安全、公正、その他のアプリケーション固有の要件を符号化する特定の制約を満たすことが保証されていない。基本方針から安全に学習するために,タスクに対する期待収益の最大化,基本方針への距離の最小化,制約満足セットへのポリシーの投影を交互に行う反復的ポリシー最適化アルゴリズムを提案する。アルゴリズムを理論的に解析し,有限時間収束保証を提供する。 5つの異なる制御タスクに関する実験では、アルゴリズムが最先端のベースラインを一貫して上回っており、10倍の制約違反と平均40%の報酬を達成しています。 We consider the problem of reinforcement learning when provided with (1) a baseline control policy and (2) a set of constraints that the learner must satisfy. The baseline policy can arise from demonstration data or a teacher agent and may provide useful cues for learning, but it might also be sub-optimal for the task at hand, and is not guaranteed to satisfy the specified constraints, which might encode safety, fairness or other application-specific requirements. In order to safely learn from baseline policies, we propose an iterative policy optimization algorithm that alternates between maximizing expected return on the task, minimizing distance to the baseline policy, and projecting the policy onto the constraint-satisfying set. We analyze our algorithm theoretically and provide a finite-time convergence guarantee. In our experiments on five different control tasks, our algorithm consistently outperforms several state-of-the-art baselines, achieving 10 times fewer constraint violations and 40% higher reward on average.	翻訳日:2022-11-18 21:52:57 公開日:2021-07-10
# 大規模シンボリック回帰のための高速ニューラルネットワークモデル Fast Neural Models for Symbolic Regression at Scale ( http://arxiv.org/abs/2007.10784v2 ) ライセンス: Link先を確認	Allan Costa and Rumen Dangovski and Owen Dugan and Samuel Kim and Pawan Goyal and Marin Solja\v{c}i\'c and Joseph Jacobson	(参考訳) ディープラーニングの成功は、ニューラルネットワークの驚くべき表現力に起因している。しかし、これは、科学、エンジニアリング、および現実世界のデータを記述する分析式を見つけるという目標と矛盾する、トレーニングデータセットのドメインをはるかに超過する複雑なブラックボックスモデルのコストが伴う。このような法則の階層的モジュラリティは、ニューラルネットワークのトレーニングによって捉えることができるという仮説のもとに、解釈可能でコンパクトでスパースな解を見つけるニューラルネットワークモデルであるoccamnet(英語版)を紹介し、データ適合のための解である \`{a} la occam's razor(英語版)を紹介する。我々のモデルは微分不可能関数空間上の確率分布を定義する。進化戦略におけるクロスエントロピーマッチングに基づいて,関数をサンプリングし,重みを逆プロパゲーションで更新する2段階最適化手法を提案する。 OccamNetは、単純な分析関数、再帰的プログラム、暗黙的な関数、単純な画像分類など、さまざまな記号法則に適合し、実世界の回帰データセット上で、顕著に最先端のシンボル回帰手法を上回ります。我々の手法はメモリフットプリントを最小限に抑え、AIアクセラレーターを必要とせず、単一のCPU上でのトレーニングに数分で複雑な関数を適合させ、GPU上でスケールした場合の大幅なパフォーマンス向上を示す。実験を再現するための実装、デモ、インストラクションはhttps://github.com/druidowm/occamnet_public.comで利用可能です。 Deep learning owes much of its success to the astonishing expressiveness of neural networks. However, this comes at the cost of complex, black-boxed models that extrapolate poorly beyond the domain of the training dataset, conflicting with goals of finding analytic expressions to describe science, engineering and real world data. Under the hypothesis that the hierarchical modularity of such laws can be captured by training a neural network, we introduce OccamNet, a neural network model that finds interpretable, compact, and sparse solutions for fitting data, \`{a} la Occam's razor. Our model defines a probability distribution over a non-differentiable function space. We introduce a two-step optimization method that samples functions and updates the weights with backpropagation based on cross-entropy matching in an evolutionary strategy: we train by biasing the probability mass toward better fitting solutions. OccamNet is able to fit a variety of symbolic laws including simple analytic functions, recursive programs, implicit functions, simple image classification, and can outperform noticeably state-of-the-art symbolic regression methods on real world regression datasets. Our method requires minimal memory footprint, does not require AI accelerators for efficient training, fits complicated functions in minutes of training on a single CPU, and demonstrates significant performance gains when scaled on a GPU. Our implementation, demonstrations and instructions for reproducing the experiments are available at https://github.com/druidowm/OccamNet_Public.	翻訳日:2022-11-09 21:47:53 公開日:2021-07-10
# 小さな逆バイアスを持つスパーススケッチ Sparse sketches with small inversion bias ( http://arxiv.org/abs/2011.10695v2 ) ライセンス: Link先を確認	Micha{\l} Derezi\'nski, Zhenyu Liao, Edgar Dobriban and Michael W. Mahoney	(参考訳) 高い$n\times d$ matrix $A$ とランダムな$m\times n$ スケッチ行列 $S$ に対して、逆共分散行列 $(A^\top A)^{-1}$ のスケッチされた推定は、一般的にバイアスされる: $E[(\tilde A^\top\tilde A)^{-1}]\ne(A^\top A)^{-1}$, $\tilde A=SA$。逆バイアスと呼ばれるこの現象は、統計学や分散最適化において、逆共分散に依存する複数の独立に構築された量の推定を平均化するときに生じる。我々は、ランダム行列に対する$(\epsilon,\delta)$-unbiased estimatorという概念に基づいて、逆バイアスを分析するフレームワークを開発した。スケッチマトリクス $s$ が密度が高く i.i.d. サブガウシアンエントリを持つ場合、単純な再スケーリングの後に、推定値 $(\frac m{m-d}\tilde a^\top\tilde a)^{-1}$ は $(\epsilon,\delta)$-unbiased for $(a^\top a)^{-1}$ で、サイズは $m=o(d+\sqrt d/\epsilon)$ である。これは、$m=O(d)$の場合、この推定子の逆バイアスは$O(1/\sqrt d)$であり、サブガウススケッチの埋め込み保証の結果得られる$\Theta(1)$近似誤差よりもはるかに小さいことを意味する。次に, LEverage Score Sparsified (LESS) Embeddingsという新しいスケッチ手法を提案する。この手法は, 疎結合とデータ認識のレバレッジベースの行サンプリング手法の両方のアイデアを用いて, スケッチサイズ$m=O(d\log d+\sqrt d/\epsilon)$ in time $O(\text{nnz}(A)\log n+md^2)$を得る。この解析を可能にする重要な手法は、制限されたbai-silverstein不等式(英語版)と呼ばれるランダム二次形式に対するbaiとsilversteinの古典的な不等式の拡張と、paley-zygmund不等式による二項分布の非集中化であり、スコアサンプリングスケッチを利用する下限を示す証明に使われる。 For a tall $n\times d$ matrix $A$ and a random $m\times n$ sketching matrix $S$, the sketched estimate of the inverse covariance matrix $(A^\top A)^{-1}$ is typically biased: $E[(\tilde A^\top\tilde A)^{-1}]\ne(A^\top A)^{-1}$, where $\tilde A=SA$. This phenomenon, which we call inversion bias, arises, e.g., in statistics and distributed optimization, when averaging multiple independently constructed estimates of quantities that depend on the inverse covariance. We develop a framework for analyzing inversion bias, based on our proposed concept of an $(\epsilon,\delta)$-unbiased estimator for random matrices. We show that when the sketching matrix $S$ is dense and has i.i.d. sub-gaussian entries, then after simple rescaling, the estimator $(\frac m{m-d}\tilde A^\top\tilde A)^{-1}$ is $(\epsilon,\delta)$-unbiased for $(A^\top A)^{-1}$ with a sketch of size $m=O(d+\sqrt d/\epsilon)$. This implies that for $m=O(d)$, the inversion bias of this estimator is $O(1/\sqrt d)$, which is much smaller than the $\Theta(1)$ approximation error obtained as a consequence of the subspace embedding guarantee for sub-gaussian sketches. We then propose a new sketching technique, called LEverage Score Sparsified (LESS) embeddings, which uses ideas from both data-oblivious sparse embeddings as well as data-aware leverage-based row sampling methods, to get $\epsilon$ inversion bias for sketch size $m=O(d\log d+\sqrt d/\epsilon)$ in time $O(\text{nnz}(A)\log n+md^2)$, where nnz is the number of non-zeros. The key techniques enabling our analysis include an extension of a classical inequality of Bai and Silverstein for random quadratic forms, which we call the Restricted Bai-Silverstein inequality; and anti-concentration of the Binomial distribution via the Paley-Zygmund inequality, which we use to prove a lower bound showing that leverage score sampling sketches generally do not achieve small inversion bias.	翻訳日:2022-09-22 23:17:45 公開日:2021-07-10
# (参考訳) Poissonレグレッションを伴うプレミアリーグのゴールスコア Goal scoring in Premier League with Poisson regression ( http://arxiv.org/abs/2108.05796v1 ) ライセンス: CC BY 4.0	Cuong Pham, Tung Le	(参考訳) プレミアリーグは世界で最も競争力のあるサッカーリーグの1つとして知られており、試合ごとに多くのゴールが与えられる。各試合で得点したゴール数に影響を与える要因は何か? 私たちは、ポアソン回帰を用いて、ターゲット、コーナー、レッドカードといった多くの要因と、ホームチームが試合で得点できるゴールとの関係を見つけます。 Premier League is known as one of the most competitive football league in the world, hence there are many goals are scored here every match. Which are the factors that affect to the number of goal scored in each match? We use Poisson regression to find out the relation between many factors as shots on target, corners, red cards, to the goals home team can score in their match.	翻訳日:2021-08-15 18:34:34 公開日:2021-07-10
# (参考訳) ビデオパニックセグメンテーションのためのマージタスク Merging Tasks for Video Panoptic Segmentation ( http://arxiv.org/abs/2108.04223v1 ) ライセンス: CC BY-SA 4.0	Jake Rap, Panagiotis Meletis	(参考訳) 本稿では,ビデオパノプティカルセグメンテーションの課題について検討し,その課題を解決するための2つの方法を提案する。ビデオパノプティカルセグメンテーション(VPS)は、最近導入されたコンピュータビジョンタスクであり、ビデオ内のすべてのピクセルを分類し、追跡する必要がある。このタスクの性質は、禁止されるデータセットに注釈を付けるコストを発生させる。ビデオのパンオプティカルセグメンテーションを理解するために、最初に、セマンティクスとトラッキングを別々に重視する構成タスクを導入した。その後、適切なVPSデータセットのトレーニングを必要としない2つのデータ駆動アプローチが選択される。最初のアプローチでは、事前訓練されたセマンティックセグメンテーションモデルと事前訓練されたマルチオブジェクト追跡モデルの出力をヒューリスティックに融合することにより、ビデオパノタイプセグメンテーションのモデルを構築する方法を示す。どちらのモデルの能力も容易に拡張したい場合、これは望まれる。第2のアプローチは、タスク固有の頭を持つ共有ニューラルネットワークバックボーン上に構築することで、最初のアプローチの欠点を克服する。このネットワークはパンオプティカルセグメンテーション用に設計されており、マスク伝搬モジュールによって時間にわたってインスタンスマスクをリンクするように拡張され、ビデオパンオプティカルセグメンテーションフォーマットとなる。 In this paper, the task of video panoptic segmentation is studied and two different methods to solve the task will be proposed. Video panoptic segmentation (VPS) is a recently introduced computer vision task that requires classifying and tracking every pixel in a given video. The nature of this task makes the cost of annotating datasets for it prohibiting. To understand video panoptic segmentation, first, earlier introduced constituent tasks that focus on semantics and tracking separately will be researched. Thereafter, two data-driven approaches which do not require training on a tailored VPS dataset will be selected to solve it. The first approach will show how a model for video panoptic segmentation can be built by heuristically fusing the outputs of a pre-trained semantic segmentation model and a pre-trained multi-object tracking model. This can be desired if one wants to easily extend the capabilities of either model. The second approach will counter some of the shortcomings of the first approach by building on top of a shared neural network backbone with task-specific heads. This network is designed for panoptic segmentation and will be extended by a mask propagation module to link instance masks across time, yielding the video panoptic segmentation format.	翻訳日:2021-08-15 16:44:40 公開日:2021-07-10
# 読み, 注意, コード: 機械による臨床ノートから医療用コード予測の限界を推し進める Read, Attend, and Code: Pushing the Limits of Medical Codes Prediction from Clinical Notes by Machines ( http://arxiv.org/abs/2107.10650v1 ) ライセンス: Link先を確認	Byung-Hak Kim and Varun Ganapathi	(参考訳) 臨床ノートから医療コードを予測することは、現在の医療システム内のすべての医療提供組織にとって実用的かつ本質的な必要性である。アノテーションの自動化は、今日の人間のコーダーによる多大な時間と過剰な労力を節約します。しかし、最大の課題は、構造化されていないフリーテキスト臨床ノートから数千の高次元コードの中から適切な医療コードを直接識別することである。過去3年間で、CNN(Convolutional Neural Networks)とLTSM(Long Short-Term Memory)ネットワークによって、MIMIC-III-full-label in patient Clinical Noteデータセットの最も難しいベンチマークに対処する上で、大幅な改善が行われた。この進歩は、自動機械学習(ML)システムが人間のプログラマの作業パフォーマンスからどこまで遠いのかという根本的な疑問を提起する。同じサブサンプルテストセット上で,人間のコーダのパフォーマンスのベースラインを評価した。また、医療コードの割り当てマッピングを学習するための、読み取り、アテント、コード(RAC)モデルも提示します。結合した埋め込みと自己注意とコードタイトル誘導注意モジュールを結合し、文置換に基づくデータ拡張と確率的ウェイト平均化トレーニングを組み合わせることで、RACは新たな技術状態(SOTA)を確立し、現在の最高のマクロF1を18.7%上回り、人間レベルのコーディングベースラインを通り過ぎている。この新たなマイルストーンは、医療コード予測における人間のコーダのパフォーマンスと同等に達するマシンにおける完全自律型医療コーディング(AMC)への重要なステップである。 Prediction of medical codes from clinical notes is both a practical and essential need for every healthcare delivery organization within current medical systems. Automating annotation will save significant time and excessive effort spent by human coders today. However, the biggest challenge is directly identifying appropriate medical codes out of several thousands of high-dimensional codes from unstructured free-text clinical notes. In the past three years, with Convolutional Neural Networks (CNN) and Long Short-Term Memory (LTSM) networks, there have been vast improvements in tackling the most challenging benchmark of the MIMIC-III-full-label inpatient clinical notes dataset. This progress raises the fundamental question of how far automated machine learning (ML) systems are from human coders' working performance. We assessed the baseline of human coders' performance on the same subsampled testing set. We also present our Read, Attend, and Code (RAC) model for learning the medical code assignment mappings. By connecting convolved embeddings with self-attention and code-title guided attention modules, combined with sentence permutation-based data augmentations and stochastic weight averaging training, RAC establishes a new state of the art (SOTA), considerably outperforming the current best Macro-F1 by 18.7%, and reaches past the human-level coding baseline. This new milestone marks a meaningful step toward fully autonomous medical coding (AMC) in machines reaching parity with human coders' performance in medical code prediction.	翻訳日:2021-07-25 11:57:36 公開日:2021-07-10
# 急速学習と体系的一般化の根底にあるもの What underlies rapid learning and systematic generalization in humans ( http://arxiv.org/abs/2107.06994v1 ) ライセンス: Link先を確認	Andrew Joohun Nam and James L. McClelland (Stanford University)	(参考訳) ニューラルネットワークの画期的な成功にもかかわらず、現代モデルは大量のデータセットによる広範なトレーニングを必要とし、サンプル外一般化の貧弱さを示す。提案された解決策の1つは、モデルに体系性とドメイン固有の制約を構築することである。本稿では,このアプローチの限界について,簡単な指導チュートリアルから抽象的推論タスクを学習する成人の能力と,不正確な回答に対する説明的フィードバックを検討することで考察し,トレーニング例の範囲外での人間学習のダイナミクスと一般化能力が,代表的なニューラルネットワークモデルとは大きく異なること,そして,モデルが著者が期待しない特徴の変化に対して脆弱であることを示す。このパズルを一貫して解く能力は, 教育, 特に基礎数学教育に関連し, 使用戦略の確実に識別可能かつ有効な説明を提供する能力である, という人間データからさらに証拠を提示する。本研究では,人間における素早い学習と体系的な一般化は,学習から学習までの段階的,経験に依存したプロセスに依存して,一般化可能な推論を支援する明示的な抽象ルールの構築を導くことを提案する。 Despite the groundbreaking successes of neural networks, contemporary models require extensive training with massive datasets and exhibit poor out-of-sample generalization. One proposed solution is to build systematicity and domain-specific constraints into the model, echoing the tenets of classical, symbolic cognitive architectures. In this paper, we consider the limitations of this approach by examining human adults' ability to learn an abstract reasoning task from a brief instructional tutorial and explanatory feedback for incorrect responses, demonstrating that human learning dynamics and ability to generalize outside the range of the training examples differ drastically from those of a representative neural network model, and that the model is brittle to changes in features not anticipated by its authors. We present further evidence from human data that the ability to consistently solve the puzzles was associated with education, particularly basic mathematics education, and with the ability to provide a reliably identifiable, valid description of the strategy used. We propose that rapid learning and systematic generalization in humans may depend on a gradual, experience-dependent process of learning-to-learn using instructions and explanations to guide the construction of explicit abstract rules that support generalizable inferences.	翻訳日:2021-07-18 12:36:26 公開日:2021-07-10
# アクティブニューラルネットワークによるバックプロップフリー強化学習 Backprop-Free Reinforcement Learning with Active Neural Generative Coding ( http://arxiv.org/abs/2107.07046v1 ) ライセンス: Link先を確認	Alexander Ororbia, Ankur Mali	(参考訳) ヒトでは知覚認知は感覚入力から情報の迅速な認識と抽出を促進する。この認識は、人間のエージェントが環境とどのように相互作用するかに大きく依存する。本研究では,動的環境における誤りのバックプロパゲーション(backprop)を伴わない動作駆動生成モデル学習のための計算フレームワークであるactive neural generative codingを提案する。具体的には,計画の認知理論からヒントを得て,少ない報酬でも操作できるインテリジェントエージェントを開発した。オンライン学習環境では,提案するモデリングフレームワークが深層Q-ラーニングモデルと競合する,いくつかの制御問題を実証する。我々のエージェントの堅牢な性能は、神経推論と学習のためのバックプロップフリーアプローチがゴール指向の行動を促進するという有望な証拠を提供する。 In humans, perceptual awareness facilitates the fast recognition and extraction of information from sensory input. This awareness largely depends on how the human agent interacts with the environment. In this work, we propose active neural generative coding, a computational framework for learning action-driven generative models without backpropagation of errors (backprop) in dynamic environments. Specifically, we develop an intelligent agent that operates even with sparse rewards, drawing inspiration from the cognitive theory of planning as inference. We demonstrate on several control problems, in the online learning setting, that our proposed modeling framework performs competitively with deep Q-learning models. The robust performance of our agent offers promising evidence that a backprop-free approach for neural inference and learning can drive goal-directed behavior.	翻訳日:2021-07-18 12:36:03 公開日:2021-07-10
# (参考訳) グラフ注意ネットワークを用いた薬物・標的相互作用予測 Drug-Target Interaction Prediction with Graph Attention networks ( http://arxiv.org/abs/2107.06099v1 ) ライセンス: CC BY 4.0	Haiyang Wang, Guangyu Zhou, Siqi Liu, Jyun-Yu Jiang and Wei Wang	(参考訳) モチベーション: 薬物と標的の相互作用を予測する(dti)は、プロテオミクスと医薬品研究の分野における関連性から、バイオインフォマティクスにおいてよく研究されているトピックである。このタスクには多くの機械学習手法がうまく適用されているが、DTIネットワークに固有の異種グラフ構造を活用することを目的としているものはほとんどない。 DTIのトポロジ構造と類似性をよりよく学習し,解釈するためには,グラフ構造から相互作用を予測する方法が望ましい。結果: DTI予測のためのエンドツーエンドフレームワークであるDTI-GAT(Drug-Target Interaction Prediction with Graph Attention Network)を提案する。 dti-gatは、相互作用パターンと薬物およびタンパク質配列の特徴の両方を利用する注意機構を備えたグラフ構造化データで動作するディープニューラルネットワークアーキテクチャを組み込んでいる。 DTI-GATは、DTIのトポロジ構造を自己注意機構で各ノードに異なる注意重みを割り当てることで解釈しやすくする。実験により、DTI-GATはバイナリDTI予測問題において、様々な最先端システムより優れていることが示された。さらに, 独立研究により, 従来の手法よりもモデルをより一般化できることが実証された。可用性: ソースコードとすべてのデータセットはhttps://github.com/Haiyang-W/DTI-GRAPHで公開されている。 Motivation: Predicting Drug-Target Interaction (DTI) is a well-studied topic in bioinformatics due to its relevance in the fields of proteomics and pharmaceutical research. Although many machine learning methods have been successfully applied in this task, few of them aim at leveraging the inherent heterogeneous graph structure in the DTI network to address the challenge. For better learning and interpreting the DTI topological structure and the similarity, it is desirable to have methods specifically for predicting interactions from the graph structure. Results: We present an end-to-end framework, DTI-GAT (Drug-Target Interaction prediction with Graph Attention networks) for DTI predictions. DTI-GAT incorporates a deep neural network architecture that operates on graph-structured data with the attention mechanism, which leverages both the interaction patterns and the features of drug and protein sequences. DTI-GAT facilitates the interpretation of the DTI topological structure by assigning different attention weights to each node with the self-attention mechanism. Experimental evaluations show that DTI-GAT outperforms various state-of-the-art systems on the binary DTI prediction problem. Moreover, the independent study results further demonstrate that our model can be generalized better than other conventional methods. Availability: The source code and all datasets are available at https://github.com/Haiyang-W/DTI-GRAPH	翻訳日:2021-07-15 05:53:38 公開日:2021-07-10
# (参考訳) 確率的機械学習を用いたネットワークトラフィックの実用的・構成的分類 Practical and Configurable Network Traffic Classification Using Probabilistic Machine Learning ( http://arxiv.org/abs/2107.06080v1 ) ライセンス: CC BY 4.0	Jiahui Chen, Joe Breen, Jeff M. Phillips, Jacobus Van der Merwe	(参考訳) 広く適用可能で高精度なネットワークトラフィック分類は、多くのネットワークセキュリティおよび管理タスクに有用である。フレキシブルで構成が容易な分類フレームワークは理想的であり、さまざまなネットワークで使用するためにカスタマイズすることができる。本稿では,未知のトラフィックから既知の,あるいは承認されたトラフィックを識別するために,パケットのシーケンスの統計のみに依存する,高度に構成可能で柔軟な機械学習トラフィック分類手法を提案する。提案手法は,確率推定に基づいて,分類決定のための確実性尺度を提供し,トラフィックを調整可能な確実性レベルに分類することができる。分類方法は, 異なる分類目標を優先して, 異なる分類シナリオにも適用できる。高性能コンピューティングネットワーク環境における実世界のトラフィックに対して,当社の分類手法とその構成がどのように機能するかを実証する。 Network traffic classification that is widely applicable and highly accurate is valuable for many network security and management tasks. A flexible and easily configurable classification framework is ideal, as it can be customized for use in a wide variety of networks. In this paper, we propose a highly configurable and flexible machine learning traffic classification method that relies only on statistics of sequences of packets to distinguish known, or approved, traffic from unknown traffic. Our method is based on likelihood estimation, provides a measure of certainty for classification decisions, and can classify traffic at adjustable certainty levels. Our classification method can also be applied in different classification scenarios, each prioritizing a different classification goal. We demonstrate how our classification scheme and all its configurations perform well on real-world traffic from a high performance computing network environment.	翻訳日:2021-07-15 05:36:16 公開日:2021-07-10
# (参考訳) 衛星システムのテレメトリとテレコマンドに対する間欠的妨害と学習駆動検出戦略 Intermittent Jamming against Telemetry and Telecommand of Satellite Systems and A Learning-driven Detection Strategy ( http://arxiv.org/abs/2107.06181v1 ) ライセンス: CC BY-SA 4.0	Selen Gecgel and Gunes Karabulut Kurt	(参考訳) 第6世代ネットワーク (6g) に向けて、衛星通信システム、特に低軌道 (leo) ネットワークは、そのユニークで包括的な能力のために期待されている。これらのアドバンテージには,セキュリティ脆弱性やハイブリッドシステムの管理,モビリティ向上など,さまざまな課題が伴っている。本稿では,まず,衛星システムのサイバー物理特性を考慮し,物理層におけるセキュリティの欠如を概念的枠組みで解決し,攻撃の可能性を明らかにする。次に、学習駆動型検出方式を提案し、軽量畳み込みニューラルネットワーク(CNN)を設計する。設計したCNNアーキテクチャの性能は、一般的な機械学習アルゴリズムであるサポートベクターマシン(SVM)と比較される。その結果,提案手法を用いて衛星システムに対する欠陥攻撃を検出することができた。 Towards sixth-generation networks (6G), satellite communication systems, especially based on Low Earth Orbit (LEO) networks, become promising due to their unique and comprehensive capabilities. These advantages are accompanied by a variety of challenges such as security vulnerabilities, management of hybrid systems, and high mobility. In this paper, firstly, a security deficiency in the physical layer is addressed with a conceptual framework, considering the cyber-physical nature of the satellite systems, highlighting the potential attacks. Secondly, a learning-driven detection scheme is proposed, and the lightweight convolutional neural network (CNN) is designed. The performance of the designed CNN architecture is compared with a prevalent machine learning algorithm, support vector machine (SVM). The results show that deficiency attacks against the satellite systems can be detected by employing the proposed scheme.	翻訳日:2021-07-15 05:20:03 公開日:2021-07-10
# 因果分析を用いた概念的深層学習説明 Using Causal Analysis for Conceptual Deep Learning Explanation ( http://arxiv.org/abs/2107.06098v1 ) ライセンス: Link先を確認	Sumedha Singla, Stephen Wallace, Sofia Triantafillou, Kayhan Batmanghelich	(参考訳) モデル説明責任は、医療における信頼できる機械学習モデルの作成に不可欠である。理想的な説明はドメインエキスパートの意思決定プロセスに似ており、臨床医にとって意味のある概念や用語を用いて表現される。このような説明を提供するため、まず分類器の隠れた単位を臨床的に関連する概念に関連付ける。胸部X線画像に付随する放射線学報告を利用して概念を定義した。線形スパースロジスティック回帰法を用いて,概念と隠れ単位の疎結合を発見する。同定された単位が分類器の結果に真に影響を及ぼすようにするために、因果推論文献およびより具体的には、反事実的介入による仲介分析のツールを採用する。最後に, 放射線学者に表現されたすべての概念を簡単な決定規則に変換するために, 低深度決定木を構築した。臨床知識と整合した世界的説明が得られた胸部X線データセットを用いて,我々のアプローチを評価した。 Model explainability is essential for the creation of trustworthy Machine Learning models in healthcare. An ideal explanation resembles the decision-making process of a domain expert and is expressed using concepts or terminology that is meaningful to the clinicians. To provide such an explanation, we first associate the hidden units of the classifier to clinically relevant concepts. We take advantage of radiology reports accompanying the chest X-ray images to define concepts. We discover sparse associations between concepts and hidden units using a linear sparse logistic regression. To ensure that the identified units truly influence the classifier's outcome, we adopt tools from Causal Inference literature and, more specifically, mediation analysis through counterfactual interventions. Finally, we construct a low-depth decision tree to translate all the discovered concepts into a straightforward decision rule, expressed to the radiologist. We evaluated our approach on a large chest x-ray dataset, where our model produces a global explanation consistent with clinical knowledge.	翻訳日:2021-07-14 14:51:58 公開日:2021-07-10
# (参考訳) 雑音下での表情認識に基づくコンセンサス協調学習と知識蒸留 Consensual Collaborative Training And Knowledge Distillation Based Facial Expression Recognition Under Noisy Annotations ( http://arxiv.org/abs/2107.04746v1 ) ライセンス: CC BY 4.0	Darshan Gera, S. Balasubramanian	(参考訳) 大規模表情データセットのラベルにおけるノイズの存在は、野生における顔表情認識(FER)にとって重要な課題である。学習の初期段階では、ディープネットワークはクリーンデータに適合する。そして最終的に、FER性能を制限する記憶能力のために、ノイズの多いラベルに過度に適合し始める。本研究は,CCT(Consensual Collaborative Training)フレームワークと呼ばれる,ノイズラベルの存在下での効果的なトレーニング戦略を提案する。 CCTは、騒音分布を仮定することなく、監督損失と整合損失の凸結合を用いて3つのネットワークを共同で訓練する。動的遷移機構は、早期学習における監督損失から、後期のネットワーク間の予測のコンセンサスに対する一貫性損失への移行に使用される。単純な知識蒸留スキームに基づいた単一のネットワークを用いて推論を行う。提案手法の有効性は,合成および実雑音FERデータセット上で実証される。さらに、約5K画像の大規模なテストサブセットを、16の異なるアノテータの群衆知恵を使ってFECデータセットからアノテートし、信頼できるラベルを推測する。 cctもその上で検証される。 FERDB (90.84%) FERPlus (89.99%) および AffectNet (66%) のベンチマークでは、最先端のパフォーマンスが報告されている。私たちのコードはhttps://github.com/1980x/CCTで公開されています。 Presence of noise in the labels of large scale facial expression datasets has been a key challenge towards Facial Expression Recognition (FER) in the wild. During early learning stage, deep networks fit on clean data. Then, eventually, they start overfitting on noisy labels due to their memorization ability, which limits FER performance. This work proposes an effective training strategy in the presence of noisy labels, called as Consensual Collaborative Training (CCT) framework. CCT co-trains three networks jointly using a convex combination of supervision loss and consistency loss, without making any assumption about the noise distribution. A dynamic transition mechanism is used to move from supervision loss in early learning to consistency loss for consensus of predictions among networks in the later stage. Inference is done using a single network based on a simple knowledge distillation scheme. Effectiveness of the proposed framework is demonstrated on synthetic as well as real noisy FER datasets. In addition, a large test subset of around 5K images is annotated from the FEC dataset using crowd wisdom of 16 different annotators and reliable labels are inferred. CCT is also validated on it. State-of-the-art performance is reported on the benchmark FER datasets RAFDB (90.84%) FERPlus (89.99%) and AffectNet (66%). Our codes are available at https://github.com/1980x/CCT.	翻訳日:2021-07-14 09:33:10 公開日:2021-07-10
# (参考訳) 対向的摂動に対する自律走行物体カテゴリー検出のレジリエンス Resilience of Autonomous Vehicle Object Category Detection to Universal Adversarial Perturbations ( http://arxiv.org/abs/2107.04749v1 ) ライセンス: CC BY 4.0	Mohammad Nayeem Teli and Seungwon Oh	(参考訳) 敵対的な事例に対するディープニューラルネットワークの脆弱性のため、過去数年間、敵の攻撃と防御に関する多くの研究が急増している。しかし、ほとんどの研究者が当然と捉えている敵の攻撃や物体検出のアプローチについては、従来の見方があるようだ。本研究では,クラスレベルでの物体検出に対する普遍摂動の影響を評価することによって,これらの手順に対する新たな視点を提供する。自律運転に関する注意深く計算されたデータセットに適用する。我々は、人、車、トラック、停止標識、COCOデータセットからの交通信号の5つのカテゴリの画像に対して、Faster-RCNNオブジェクト検出器を使用し、Universal Dense Object Suppressionアルゴリズムを用いて画像を注意深く摂動する。その結果、人、車、信号機、トラック、停止標識は、その順序で(少なくとも)普遍的な摂動に対して回復力があることが示された。私たちの知る限りでは、このようなランキングが確立されたのはこれが初めてで、自動運転車に関するデータセットのセキュリティとオブジェクト検出全般において重要な意味を持つ。 Due to the vulnerability of deep neural networks to adversarial examples, numerous works on adversarial attacks and defenses have been burgeoning over the past several years. However, there seem to be some conventional views regarding adversarial attacks and object detection approaches that most researchers take for granted. In this work, we bring a fresh perspective on those procedures by evaluating the impact of universal perturbations on object detection at a class-level. We apply it to a carefully curated data set related to autonomous driving. We use Faster-RCNN object detector on images of five different categories: person, car, truck, stop sign and traffic light from the COCO data set, while carefully perturbing the images using Universal Dense Object Suppression algorithm. Our results indicate that person, car, traffic light, truck and stop sign are resilient in that order (most to least) to universal perturbations. To the best of our knowledge, this is the first time such a ranking has been established which is significant for the security of the data sets pertaining to autonomous vehicles and object detection in general.	翻訳日:2021-07-14 09:15:08 公開日:2021-07-10
# (参考訳) 箱をハックする: ディープラーニングの抽象化に基づくモニタ Hack The Box: Fooling Deep Learning Abstraction-Based Monitors ( http://arxiv.org/abs/2107.04764v1 ) ライセンス: CC BY 4.0	Sara Hajj Ibrahim and Mohamed Nassar	(参考訳) ディープラーニングは、概念の深い階層に適応する機械学習の一種である。ディープラーニング分類器は、入力層における概念の最も基本的なバージョンと出力層における概念の最も抽象的なバージョン(クラスまたはラベルとしても知られる)をリンクする。しかし、一度有限個のクラスで訓練されたとき、深層学習モデルは与えられた入力がどのクラスにも属さず、単純にリンクできないと言う力を持っていない。非関連クラスの予測を正しく無効にすることは、文学において多くの点で取り組まれてきた難しい問題である。新規性検出は、新しい/見えないクラスに対して「知らない」出力を深層学習に与えます。それでも、新規性検出のセキュリティ面には注意が向けられていない。本稿では,抽象に基づく新奇性検出のケーススタディを考察し,敵のサンプルに対して頑健ではないことを示す。さらに,深層学習分類器を騙し,新奇な検出監視をバイパスする,逆行的なサンプル作成の可能性を示す。言い換えれば、これらの監視ボックスはハック可能である。新規検出自体が攻撃面となることを実証する。 Deep learning is a type of machine learning that adapts a deep hierarchy of concepts. Deep learning classifiers link the most basic version of concepts at the input layer to the most abstract version of concepts at the output layer, also known as a class or label. However, once trained over a finite set of classes, a deep learning model does not have the power to say that a given input does not belong to any of the classes and simply cannot be linked. Correctly invalidating the prediction of unrelated classes is a challenging problem that has been tackled in many ways in the literature. Novelty detection gives deep learning the ability to output "do not know" for novel/unseen classes. Still, no attention has been given to the security aspects of novelty detection. In this paper, we consider the case study of abstraction-based novelty detection and show that it is not robust against adversarial samples. Moreover, we show the feasibility of crafting adversarial samples that fool the deep learning classifier and bypass the novelty detection monitoring at the same time. In other words, these monitoring boxes are hackable. We demonstrate that novelty detection itself ends up as an attack surface.	翻訳日:2021-07-14 09:06:10 公開日:2021-07-10
# (参考訳) ls3: 反復タスクのロングホリゾン・バイスモータ制御のための潜在空間セーフセット LS3: Latent Space Safe Sets for Long-Horizon Visuomotor Control of Iterative Tasks ( http://arxiv.org/abs/2107.04775v1 ) ライセンス: CC BY 4.0	Albert Wilcox and Ashwin Balakrishna and Brijen Thananjeyan and Joseph E. Gonzalez and Ken Goldberg	(参考訳) 強化学習(rl)アルゴリズムは、複雑な長時間ホリゾンタスクを学習するために高次元環境を探索することに成功したが、しばしば安全でない振る舞いを示し、探索が制限されていない場合に広範な環境相互作用を必要とする。動的に不確実な環境での安全な学習のための有望な戦略は、エージェントが確実にタスク成功(したがって安全)を保証できる状態に戻ることを要求することである。このアプローチは低次元環境では成功したが、画像などの高次元状態空間を持つ環境ではこの制約を強制することは困難である。そこで我々は,この手法を拡張した潜在空間セーフセット(ls3)を,準最適実演と学習力学モデルを用いて画像観察を伴う反復的・長期ホリゾンタスクに拡張し,タスク完了の可能性のある学習されたセーフセットの近傍への探索を制限する。シミュレーションにおける逐次プッシュタスクや物理的ケーブルルーティングタスクを含む4つの領域におけるLS3の評価を行った。 LS3は事前のタスク成功を利用して探索を制限し、制約を満たしながら事前のアルゴリズムよりも効率的に学習できることが判明した。コードと補足材料については https://tinyurl.com/latent-ss をご覧ください。 Reinforcement learning (RL) algorithms have shown impressive success in exploring high-dimensional environments to learn complex, long-horizon tasks, but can often exhibit unsafe behaviors and require extensive environment interaction when exploration is unconstrained. A promising strategy for safe learning in dynamically uncertain environments is requiring that the agent can robustly return to states where task success (and therefore safety) can be guaranteed. While this approach has been successful in low-dimensions, enforcing this constraint in environments with high-dimensional state spaces, such as images, is challenging. We present Latent Space Safe Sets (LS3), which extends this strategy to iterative, long-horizon tasks with image observations by using suboptimal demonstrations and a learned dynamics model to restrict exploration to the neighborhood of a learned Safe Set where task completion is likely. We evaluate LS3 on 4 domains, including a challenging sequential pushing task in simulation and a physical cable routing task. We find that LS3 can use prior task successes to restrict exploration and learn more efficiently than prior algorithms while satisfying constraints. See https://tinyurl.com/latent-ss for code and supplementary material.	翻訳日:2021-07-14 08:58:25 公開日:2021-07-10
# (参考訳) Speech2Video:ビデオ生成のためのクロスモーダル蒸留 Speech2Video: Cross-Modal Distillation for Speech to Video Generation ( http://arxiv.org/abs/2107.04806v1 ) ライセンス: CC BY 4.0	Shijing Si, Jianzong Wang, Xiaoyang Qu, Ning Cheng, Wenqi Wei, Xinghua Zhu and Jing Xiao	(参考訳) 本稿では,音声のみから発声顔映像生成の新たな課題について検討する。音声対ビデオ生成技術は、エンターテイメント、カスタマーサービス、人間とコンピュータの相互作用産業に興味深い応用をもたらす可能性がある。実際、音声の音色、アクセント、速度は、話者の外観に関連する豊富な情報を含んでいる。この課題は主に、異なる視覚特性を音声信号から切り離すことである。本稿では,不規則なビデオ入力から絡み合った感情やアイデンティティ情報を抽出する軽量なクロスモーダル蒸留法を提案する。抽出した特徴は、生成的対向ネットワークによって音声合成ビデオクリップに統合される。慎重に考案された識別器を用いて、提案するフレームワークは現実的な生成結果を達成する。観察された個人による実験では、提案手法が発話のみから感情表現を捉え、映像出力に自発的な顔の動きを生じさせることが示されている。話者の静的画像と音声を結合したベースライン法と比較すると,提案手法の結果はほぼ区別がつかない。また,提案手法は,映像中の感情表現の面で既存のアルゴリズムを上回っていることを示す。 This paper investigates a novel task of talking face video generation solely from speeches. The speech-to-video generation technique can spark interesting applications in entertainment, customer service, and human-computer-interaction industries. Indeed, the timbre, accent and speed in speeches could contain rich information relevant to speakers' appearance. The challenge mainly lies in disentangling the distinct visual attributes from audio signals. In this article, we propose a light-weight, cross-modal distillation method to extract disentangled emotional and identity information from unlabelled video inputs. The extracted features are then integrated by a generative adversarial network into talking face video clips. With carefully crafted discriminators, the proposed framework achieves realistic generation results. Experiments with observed individuals demonstrated that the proposed framework captures the emotional expressions solely from speeches, and produces spontaneous facial motion in the video output. Compared to the baseline method where speeches are combined with a static image of the speaker, the results of the proposed framework is almost indistinguishable. User studies also show that the proposed method outperforms the existing algorithms in terms of emotion expression in the generated videos.	翻訳日:2021-07-14 08:38:34 公開日:2021-07-10
# (参考訳) 胸部ctにおけるcov19-ct-dbベースラインの改善 COVID Detection in Chest CTs: Improving the Baseline on COV19-CT-DB ( http://arxiv.org/abs/2107.04808v1 ) ライセンス: CC BY 4.0	Radu Miron, Cosmin Moisii, Sergiu Dinu, Mihaela Breaban	(参考訳) 胸部CTにおける深層学習に基づく3つの異なるアプローチの比較検討を行った。最初のアプローチは3次元畳み込みを伴うボリュームトリクティックなアプローチで、他の2つのアプローチは最初はスライスワイズ分類を行い、その後ボリュームレベルで結果を集約する。実験はCOV19-CT-DBデータセット上で実施され、ICCV 2021内のMIA-COV19Dコンペティションによって提起された課題に対処することを目的としている。検証サブセットの最良の結果はマクロF1スコアの0.92に達し、オーガナイザが設定したベースラインスコアの0.70を大幅に改善する。 The paper presents a comparative analysis of three distinct approaches based on deep learning for COVID-19 detection in chest CTs. The first approach is a volumetric one, involving 3D convolutions, while the other two approaches perform at first slice-wise classification and then aggregate the results at the volume level. The experiments are carried on the COV19-CT-DB dataset, with the aim of addressing the challenge raised by the MIA-COV19D Competition within ICCV 2021. Our best results on the validation subset reach a macro-F1 score of 0.92, which improves considerably the baseline score of 0.70 set by the organizers.	翻訳日:2021-07-14 08:27:51 公開日:2021-07-10
# (参考訳) BSDA-Net:OCTA画像のセグメンテーションと分類のための境界形状と距離を考慮した共同学習フレームワーク BSDA-Net: A Boundary Shape and Distance Aware Joint Learning Framework for Segmenting and Classifying OCTA Images ( http://arxiv.org/abs/2107.04823v1 ) ライセンス: CC BY 4.0	Li Lin, Zhonghua Wang, Jiewei Wu, Yijin Huang, Junyan Lyu, Pujin Cheng, Jiong Wu, Xiaoying Tang	(参考訳) 光コヒーレンストモグラフィアンギオグラフィー(OCTA)は、新しい非侵襲的イメージング技術であり、網膜層にまたがる血管と胎児の血管ゾーン(FAZ)の可視化を可能にする。臨床研究は、fazの形態と輪郭の不規則性が様々な眼疾患の重要なバイオマーカーであることを示唆している。したがって、FAZの正確なセグメンテーションは、非常に興味深い。また、FAZの特徴が深層診断分類網の性能を向上させるという研究報告はない。本稿では,OCTA画像からのFAZセグメンテーションと診断のためのマルチレベル境界形状と距離認識型共同学習フレームワークBSDA-Netを提案する。 2つの補助枝、すなわち境界熱マップ回帰と符号付き距離マップ再構成枝がセグメンテーション部に加えて構築され、セグメンテーション性能が向上し、より正確なFAZ輪郭とより少ないアウトリーが生じる。さらに、上記の3つの枝(形状、大きさ、境界、FAZの符号付き方向距離マップ)の低レベル特徴と高レベル特徴は、診断分類器の特徴と階層的に融合する。大規模な実験により、提案したBSDA-NetはOCTA-500、OCTAGON、FAZIDデータセットの最先端のセグメンテーションと分類結果が得られることがわかった。 Optical coherence tomography angiography (OCTA) is a novel non-invasive imaging technique that allows visualizations of vasculature and foveal avascular zone (FAZ) across retinal layers. Clinical researches suggest that the morphology and contour irregularity of FAZ are important biomarkers of various ocular pathologies. Therefore, precise segmentation of FAZ has great clinical interest. Also, there is no existing research reporting that FAZ features can improve the performance of deep diagnostic classification networks. In this paper, we propose a novel multi-level boundary shape and distance aware joint learning framework, named BSDA-Net, for FAZ segmentation and diagnostic classification from OCTA images. Two auxiliary branches, namely boundary heatmap regression and signed distance map reconstruction branches, are constructed in addition to the segmentation branch to improve the segmentation performance, resulting in more accurate FAZ contours and fewer outliers. Moreover, both low-level and high-level features from the aforementioned three branches, including shape, size, boundary, and signed directional distance map of FAZ, are fused hierarchically with features from the diagnostic classifier. Through extensive experiments, the proposed BSDA-Net is found to yield state-of-the-art segmentation and classification results on the OCTA-500, OCTAGON, and FAZID datasets.	翻訳日:2021-07-14 08:20:26 公開日:2021-07-10
# (参考訳) CSL-YOLO:エッジコンピューティングのための新しい軽量物体検出システム CSL-YOLO: A New Lightweight Object Detection System for Edge Computing ( http://arxiv.org/abs/2107.04829v1 ) ライセンス: CC BY 4.0	Yu-Ming Zhang, Chun-Chieh Lee, Jun-Wei Hsieh, Kuo-Chin Fan	(参考訳) 軽量な物体検出器の開発は計算資源が限られているため不可欠である。計算コストを削減するために、冗長な特徴の生成方法が重要な役割を果たす。本稿では,安価な操作から冗長な特徴を生成するために,新しい軽量な畳み込み方式であるクロスステージ軽量モジュールを提案する。中間展開段階では, ポイントワイズ畳み込みを深さ方向畳み込みに置き換え, 候補特徴量を生成する。提案するcslモジュールは計算コストを大幅に削減できる。 MS-COCOで行われた実験により、提案されたCSLモジュールはConvolution-3x3の適合能力を近似できることが示された。最後に、このモジュールを用いて軽量検出器CSL-YOLOを構築し、Tiny-YOLOv4よりも43%のFLOPと52%のパラメータで検出性能を向上させる。 The development of lightweight object detectors is essential due to the limited computation resources. To reduce the computation cost, how to generate redundant features plays a significant role. This paper proposes a new lightweight Convolution method Cross-Stage Lightweight (CSL) Module, to generate redundant features from cheap operations. In the intermediate expansion stage, we replaced Pointwise Convolution with Depthwise Convolution to produce candidate features. The proposed CSL-Module can reduce the computation cost significantly. Experiments conducted at MS-COCO show that the proposed CSL-Module can approximate the fitting ability of Convolution-3x3. Finally, we use the module to construct a lightweight detector CSL-YOLO, achieving better detection performance with only 43% FLOPs and 52% parameters than Tiny-YOLOv4.	翻訳日:2021-07-14 08:08:55 公開日:2021-07-10
# (参考訳) BERTファインチューニング改善のための雑音安定化規則化 Noise Stability Regularization for Improving BERT Fine-tuning ( http://arxiv.org/abs/2107.04835v1 ) ライセンス: CC BY 4.0	Hang Hua, Xingjian Li, Dejing Dou, Cheng-Zhong Xu, Jiebo Luo	(参考訳) BERTのような微調整済みの言語モデルは、様々なNLPタスクでリーダーボードを支配する一般的なプラクティスとなっている。近年の成功と広く採用されているにもかかわらず、このプロセスは少数のトレーニングサンプルしか入手できない場合、不安定である。この過程の脆さは、しばしばランダムな種子に対する感受性によって反映される。本稿では,近年の文献(Arora et al., 2018, Sanyal et al., 2020)で研究されているディープネットの雑音安定性特性に基づいて,この問題に取り組むことを提案する。具体的には,LNSR(Layer-wise Noise Stability Regularization)と呼ばれるNLPタスクの微調整を改善するための,新しい効果的な正規化手法を提案する。入力に雑音を加える理論を拡張し、この手法がより安定した正規化効果を与えることを示す。良好な性能のモデルではノイズに対する感度が低く,LNSRによる微調整では明らかに一般化性と安定性が向上することが実験的に確認された。さらに,L2-SP (Li et al., 2018), Mixout (Lee et al., 2020), SMART (Jiang et al., 2020) など,最先端のアルゴリズムに対する利点も示す。 Fine-tuning pre-trained language models such as BERT has become a common practice dominating leaderboards across various NLP tasks. Despite its recent success and wide adoption, this process is unstable when there are only a small number of training samples available. The brittleness of this process is often reflected by the sensitivity to random seeds. In this paper, we propose to tackle this problem based on the noise stability property of deep nets, which is investigated in recent literature (Arora et al., 2018; Sanyal et al., 2020). Specifically, we introduce a novel and effective regularization method to improve fine-tuning on NLP tasks, referred to as Layer-wise Noise Stability Regularization (LNSR). We extend the theories about adding noise to the input and prove that our method gives a stabler regularization effect. We provide supportive evidence by experimentally confirming that well-performing models show a low sensitivity to noise and fine-tuning with LNSR exhibits clearly higher generalizability and stability. Furthermore, our method also demonstrates advantages over other state-of-the-art algorithms including L2-SP (Li et al., 2018), Mixout (Lee et al., 2020) and SMART (Jiang et al., 2020).	翻訳日:2021-07-14 07:59:53 公開日:2021-07-10
# (参考訳) 伝達学習による伝播認識型ソーシャルレコメンデーション Propagation-aware Social Recommendation by Transfer Learning ( http://arxiv.org/abs/2107.04846v1 ) ライセンス: CC BY 4.0	Haodong Chang and Yabo Chu	(参考訳) ソーシャル・アウェア・レコメンデーションのアプローチは、従来のレコメンデーションシステムのデータスパーシティ問題を解決する効果的な方法として認識されてきた。背景にある前提は、ソーシャルユーザ-ユーザ接続の知識を共有して、ユーザ-テーマインタラクションのドメインに転送することで、ユーザの好みの学習を支援する、というものだ。しかし、既存のアプローチのほとんどは、転送学習中にユーザ間の1次接続を採用するだけで、それらの接続をより高い順序で無視する。より優れたレコメンデーションパフォーマンスは、高次の社会関係の恩恵を受けることができると我々は主張する。本稿では,社会関係の伝播に基づくPTLN(Propagation-Aware Transfer Learning Network)を提案する。我々は、ソーシャルネットワークに隠された共有知識をよりよく掘り下げ、レコメンデーションパフォーマンスをさらに向上させることを目指している。特に、社会的影響について2つの側面から検討する: (a) 上位の友人は秩序バイアスによって考慮されている; (b) 同じ順番の異なる友人は、注意のメカニズムによる推薦に対して明らかに重要である。さらに,ソーシャルリレーションとユーザ・テーマ間インタラクションのギャップを埋めるために,新たな正規化をデザインする。 2つの実世界のデータセットについて広範な実験を行い、特に歴史的な相互作用が少ないコールドスタートユーザーに対して、ランキングの正確性という点で他のデータセットを上回ります。 Social-aware recommendation approaches have been recognized as an effective way to solve the data sparsity issue of traditional recommender systems. The assumption behind is that the knowledge in social user-user connections can be shared and transferred to the domain of user-item interactions, whereby to help learn user preferences. However, most existing approaches merely adopt the first-order connections among users during transfer learning, ignoring those connections in higher orders. We argue that better recommendation performance can also benefit from high-order social relations. In this paper, we propose a novel Propagation-aware Transfer Learning Network (PTLN) based on the propagation of social relations. We aim to better mine the sharing knowledge hidden in social networks and thus further improve recommendation performance. Specifically, we explore social influence in two aspects: (a) higher-order friends have been taken into consideration by order bias; (b) different friends in the same order will have distinct importance for recommendation by an attention mechanism. Besides, we design a novel regularization to bridge the gap between social relations and user-item interactions. We conduct extensive experiments on two real-world datasets and beat other counterparts in terms of ranking accuracy, especially for the cold-start users with few historical interactions.	翻訳日:2021-07-14 07:44:24 公開日:2021-07-10
# (参考訳) SynPick: 動的ビンピッキングシーン理解のためのデータセット SynPick: A Dataset for Dynamic Bin Picking Scene Understanding ( http://arxiv.org/abs/2107.04852v1 ) ライセンス: CC BY 4.0	Arul Selvam Periyasamy, Max Schwarz, and Sven Behnke	(参考訳) ビンピッキングシナリオにおける動的シーン理解のための合成データセットであるSynPickを提案する。既存のデータセットとは対照的に、私たちのデータセットは、よく知られたAmazon Robotics Challenge(ARC)にインスパイアされた、現実的な産業用アプリケーションドメインにあり、ARC 2017のために開発されたピッキングヒューリスティックによって選択された、真のピッキングアクションを備えた動的シーンを備えています。データセットは人気のあるBOPデータセットフォーマットと互換性がある。本稿では、NVIDIA PhysX物理エンジンを用いたオブジェクト配置生成と操作シミュレーションを含むデータセット生成プロセスについて詳述する。大きなアクションスペースをカバーするために、ターゲットとターゲットを絞ったピッキングアクションとランダムな移動アクションを実行します。オブジェクト認識のためのベースラインを確立するために、データセット上で最先端のポーズ推定手法を評価する。単純なフィルタリング手法であっても、単発推定ではなく、操作中のトラッキングポーズの有用性を実証する。ジェネレータのソースコードとデータセットが公開されている。 We present SynPick, a synthetic dataset for dynamic scene understanding in bin-picking scenarios. In contrast to existing datasets, our dataset is both situated in a realistic industrial application domain -- inspired by the well-known Amazon Robotics Challenge (ARC) -- and features dynamic scenes with authentic picking actions as chosen by our picking heuristic developed for the ARC 2017. The dataset is compatible with the popular BOP dataset format. We describe the dataset generation process in detail, including object arrangement generation and manipulation simulation using the NVIDIA PhysX physics engine. To cover a large action space, we perform untargeted and targeted picking actions, as well as random moving actions. To establish a baseline for object perception, a state-of-the-art pose estimation approach is evaluated on the dataset. We demonstrate the usefulness of tracking poses during manipulation instead of single-shot estimation even with a naive filtering approach. The generator source code and dataset are publicly available.	翻訳日:2021-07-14 07:35:45 公開日:2021-07-10
# (参考訳) Marginalized Corrupted Distributions によるカーネル平均推定 Kernel Mean Estimation by Marginalized Corrupted Distributions ( http://arxiv.org/abs/2107.04855v1 ) ライセンス: CC0 1.0	Xiaobo Xia, Shuo Shan, Mingming Gong, Nannan Wang, Fei Gao, Haikun Wei, Tongliang Liu	(参考訳) 再生カーネルヒルベルト空間におけるカーネル平均の推定は、多くのカーネル学習アルゴリズムにおいて重要な要素である。有限サンプルが与えられた場合、ターゲットカーネル平均の標準推定値は経験平均である。以前の研究では、より良い推定器は縮小法によって構築できることを示した。そこで本研究では,既知の分布からのノイズを伴うデータサンプルを腐敗させ,カーネル平均推定器と呼ばれる新しいカーネル平均推定器を提案する。理論的には、限界化されたカーネル平均推定器は、カーネル平均推定に暗黙の正規化をもたらす。実験により,カーネル平均推定器が既存の推定器よりもはるかに低い推定誤差が得られることを示す。 Estimating the kernel mean in a reproducing kernel Hilbert space is a critical component in many kernel learning algorithms. Given a finite sample, the standard estimate of the target kernel mean is the empirical average. Previous works have shown that better estimators can be constructed by shrinkage methods. In this work, we propose to corrupt data examples with noise from known distributions and present a new kernel mean estimator, called the marginalized kernel mean estimator, which estimates kernel mean under the corrupted distribution. Theoretically, we show that the marginalized kernel mean estimator introduces implicit regularization in kernel mean estimation. Empirically, we show on a variety of datasets that the marginalized kernel mean estimator obtains much lower estimation error than the existing estimators.	翻訳日:2021-07-14 07:24:11 公開日:2021-07-10
# (参考訳) Dense-Sparse Deep CNN Training for Image Denoising Dense-Sparse Deep CNN Training for Image Denoising ( http://arxiv.org/abs/2107.04857v1 ) ライセンス: CC BY 4.0	Basit O. Alawode, Mudassir Masood, Tarig Ballal, and Tareq Al-Naffouri	(参考訳) 近年,畳み込みニューラルネットワーク(cnns)などの深層学習(dl)手法が画像デノイジングの分野で注目されている。これは、bm3dのような最先端の古典的な画像デノイジングアルゴリズムを超える能力が証明されたためである。 deep denoising cnns (dncnns) は、多くのフィードフォワード畳み込み層を使用し、バッチ正規化と残差学習の正規化法を追加し、デニューズ性能を大幅に改善する。しかし、これは膨大な数のトレーニング可能なパラメータを犠牲にしている。本稿では,パラメータ数を削減しつつ,同等の性能を実現することで,この問題に対処した。本研究では,DSDトレーニング手法を用いて,トレーニングネットワークによって得られる性能向上からモチベーションを導出する。我々はこのトレーニングアプローチをDnCNN(RDnCNN)ネットワークに拡張し、パラメータが大幅に減少し、DnCNNに匹敵する性能を持つ高速な復調ネットワークを実現する。 Recently, deep learning (DL) methods such as convolutional neural networks (CNNs) have gained prominence in the area of image denoising. This is owing to their proven ability to surpass state-of-the-art classical image denoising algorithms such as BM3D. Deep denoising CNNs (DnCNNs) use many feedforward convolution layers with added regularization methods of batch normalization and residual learning to improve denoising performance significantly. However, this comes at the expense of a huge number of trainable parameters. In this paper, we address this issue by reducing the number of parameters while achieving a comparable level of performance. We derive motivation from the improved performance obtained by training networks using the dense-sparse-dense (DSD) training approach. We extend this training approach to a reduced DnCNN (RDnCNN) network resulting in a faster denoising network with significantly reduced parameters and comparable performance to the DnCNN.	翻訳日:2021-07-14 07:05:43 公開日:2021-07-10
# (参考訳) 標準点オートエンコーダを用いた3次元距離対応学習 Learning 3D Dense Correspondence via Canonical Point Autoencoder ( http://arxiv.org/abs/2107.04867v1 ) ライセンス: CC BY 4.0	An-Chieh Cheng, Xueting Li, Min Sun, Ming-Hsuan Yang, Sifei Liu	(参考訳) 同一カテゴリの3次元形状間の密接な対応を予測できる標準点オートエンコーダ(CPAE)を提案する。オートエンコーダは、2つの重要な機能を実行する: (a) 任意に順序付けられた点クラウドを標準的なプリミティブ、例えば球体に符号化し、(b) プリミティブを元の入力インスタンス形状に復号する。ボトルネックに置かれているように、このプリミティブは、すべての無秩序点雲を正準面上にマッピングし、順序付けされた方法で再構築する重要な役割を果たす。一度訓練すると、原始曲面上の同じ位置にマッピングされた異なる形状のインスタンスからのポイントは、対応のペアであると決定される。本手法ではアノテーションや自己管理部分分割ネットワークを一切必要とせず,不整合入力点雲を処理できる。 3次元セマンティクスキーポイント転送と部分セグメンテーション伝達の実験結果は,本モデルが最先端対応学習法に対して有利に機能することを示す。 We propose a canonical point autoencoder (CPAE) that predicts dense correspondences between 3D shapes of the same category. The autoencoder performs two key functions: (a) encoding an arbitrarily ordered point cloud to a canonical primitive, e.g., a sphere, and (b) decoding the primitive back to the original input instance shape. As being placed in the bottleneck, this primitive plays a key role to map all the unordered point clouds on the canonical surface and to be reconstructed in an ordered fashion. Once trained, points from different shape instances that are mapped to the same locations on the primitive surface are determined to be a pair of correspondence. Our method does not require any form of annotation or self-supervised part segmentation network and can handle unaligned input point clouds. Experimental results on 3D semantic keypoint transfer and part segmentation transfer show that our model performs favorably against state-of-the-art correspondence learning methods.	翻訳日:2021-07-14 06:57:43 公開日:2021-07-10
# (参考訳) patentminer: コンテキスト強調と知識誘導グラフによる特許空白マイニング PatentMiner: Patent Vacancy Mining via Context-enhanced and Knowledge-guided Graph Attention ( http://arxiv.org/abs/2107.04880v1 ) ライセンス: CC BY 4.0	Gaochen Wu, Bin Xu, Yuxin Qin, Fei Kong, Bangchang Liu, Hongwen Zhao, Dejie Chang	(参考訳) 知識グラフを構築することで特許研究を行う作業は少ないが、特許文書を用いて特許知識グラフを構築したり、最新の自然言語処理手法を組み合わせて既存の特許に隠されたリッチなセマンティックな関係を掘り下げたり、新たな特許を予測したりしない。本稿では,知識グラフ(KG)とグラフアテンション機構に基づいて,リッチなセマンティック知識をマイニングし,新たな潜在的な特許を予測するために,PatentMinerという新しい特許空白予測手法を提案する。まず、特許に関する知識グラフ(例)。年) 特許書類から名称の実体の認識及び関係の延長を行うことにより構成される。第2に、構築した知識グラフにおいてリンク予測を行い、潜在的な三重項を探索するコモンネバー法(CNM)、グラフ注意ネットワーク(GAT)、コンテキスト強化グラフ注意ネットワーク(CGAT)を提案する。最後に,特許は知識グラフ上で,共起関係(co-occurrence relationship)により定義される。すなわち,各特許は,知識グラフ内のすべての実体と共起関係を含む完全連結部分グラフとして表現される。さらに,新たなパテントとして新たに追加された予測リンクを備えた完全連結部分グラフを予測する新しい特許予測タスクを提案する。実験の結果,提案手法は,新たな特許を正しく予測でき,文脈対応グラフアテンションネットワークはベースラインよりもはるかに優れていることがわかった。一方、我々の提案する特許空白予測タスクには、まだ未解決の余地がある。 Although there are a small number of work to conduct patent research by building knowledge graph, but without constructing patent knowledge graph using patent documents and combining latest natural language processing methods to mine hidden rich semantic relationships in existing patents and predict new possible patents. In this paper, we propose a new patent vacancy prediction approach named PatentMiner to mine rich semantic knowledge and predict new potential patents based on knowledge graph (KG) and graph attention mechanism. Firstly, patent knowledge graph over time (e.g. year) is constructed by carrying out named entity recognition and relation extrac-tion from patent documents. Secondly, Common Neighbor Method (CNM), Graph Attention Networks (GAT) and Context-enhanced Graph Attention Networks (CGAT) are proposed to perform link prediction in the constructed knowledge graph to dig out the potential triples. Finally, patents are defined on the knowledge graph by means of co-occurrence relationship, that is, each patent is represented as a fully connected subgraph containing all its entities and co-occurrence relationships of the patent in the knowledge graph; Furthermore, we propose a new patent prediction task which predicts a fully connected subgraph with newly added prediction links as a new pa-tent. The experimental results demonstrate that our proposed patent predic-tion approach can correctly predict new patents and Context-enhanced Graph Attention Networks is much better than the baseline. Meanwhile, our proposed patent vacancy prediction task still has significant room to im-prove.	翻訳日:2021-07-14 06:41:34 公開日:2021-07-10
# (参考訳) ロバストな医用画像解析のためのディープニューラルネットワークにおける分布検出と敵攻撃の概要 Out of Distribution Detection and Adversarial Attacks on Deep Neural Networks for Robust Medical Image Analysis ( http://arxiv.org/abs/2107.04882v1 ) ライセンス: CC BY 4.0	Anisie Uwimana1, Ransalu Senanayake	(参考訳) 深層学習モデルは、医用画像解析において一般的な選択肢となっている。しかし、深層学習モデルの一般化性能の低さは、医学的応用にとって堅牢性が不可欠であるため、実世界での展開を妨げている。例えば、最先端の畳み込みニューラルネットワーク(CNN)は、トレーニング分布から統計的に離れた敵のサンプルやサンプルを検出することができない。本研究は, マラリア寄生細胞と非感染細胞の分類において, マハラノビス距離に基づく信頼性スコア, および異常サンプルの検出方法の信頼性を実験的に評価した。その結果,mahalanobis confidence score detectorはディープラーニングモデルの性能と頑健性が向上し,out-of-distribution (ood) とadversarial sampleの両方において最先端のパフォーマンスが得られた。 Deep learning models have become a popular choice for medical image analysis. However, the poor generalization performance of deep learning models limits them from being deployed in the real world as robustness is critical for medical applications. For instance, the state-of-the-art Convolutional Neural Networks (CNNs) fail to detect adversarial samples or samples drawn statistically far away from the training distribution. In this work, we experimentally evaluate the robustness of a Mahalanobis distance-based confidence score, a simple yet effective method for detecting abnormal input samples, in classifying malaria parasitized cells and uninfected cells. Results indicated that the Mahalanobis confidence score detector exhibits improved performance and robustness of deep learning models, and achieves stateof-the-art performance on both out-of-distribution (OOD) and adversarial samples.	翻訳日:2021-07-14 06:31:21 公開日:2021-07-10
# (参考訳) ハイパーリレーショナルファクトを用いたインダクティブリンク予測の改善 Improving Inductive Link Prediction Using Hyper-Relational Facts ( http://arxiv.org/abs/2107.04894v1 ) ライセンス: CC0 1.0	Mehdi Ali, Max Berrendorf, Mikhail Galkin, Veronika Thost, Tengfei Ma, Volker Tresp, Jens Lehmann	(参考訳) 長年、知識グラフ(KG)上のリンク予測は純粋にトランスダクティブなタスクであり、目に見えないエンティティの推論を許さなかった。近年,半帰納的シナリオと完全帰納的シナリオを探求する取り組みが活発化しており,未確認および新興エンティティに対する推論が可能になっている。しかしながら、これらのアプローチはすべてトリプルベースの \glspl{kg} しか考慮していないが、よりリッチなKG(例えばWikidata)は十分に研究されていない。本研究では,様々な帰納的設定を分類し,グラフニューラルネットワークの最近の進歩を生かした,幅広い半帰納的および完全帰納的リンク予測タスクにハイパーリレーショナルKGを用いることの利点について検討する。新たなベンチマークによる実験結果から, 有意な利得の6%(hits@10メートル法の場合)は, 3値のみのベースラインに比べて性能が向上することが示された。我々のコードは \url{https://github.com/mali-git/hyper_relational_ilp} で利用可能です。 For many years, link prediction on knowledge graphs (KGs) has been a purely transductive task, not allowing for reasoning on unseen entities. Recently, increasing efforts are put into exploring semi- and fully inductive scenarios, enabling inference over unseen and emerging entities. Still, all these approaches only consider triple-based \glspl{kg}, whereas their richer counterparts, hyper-relational KGs (e.g., Wikidata), have not yet been properly studied. In this work, we classify different inductive settings and study the benefits of employing hyper-relational KGs on a wide range of semi- and fully inductive link prediction tasks powered by recent advancements in graph neural networks. Our experiments on a novel set of benchmarks show that qualifiers over typed edges can lead to performance improvements of 6% of absolute gains (for the Hits@10 metric) compared to triple-only baselines. Our code is available at \url{https://github.com/mali-git/hyper_relational_ilp}.	翻訳日:2021-07-14 06:21:43 公開日:2021-07-10
# (参考訳) MRIセグメンテーションにおけるドメインシフトの影響の解剖 Anatomy of Domain Shift Impact on U-Net Layers in MRI Segmentation ( http://arxiv.org/abs/2107.04914v1 ) ライセンス: CC BY 4.0	Ivan Zakazov, Boris Shirokikh, Alexey Chernyavskiy and Mikhail Belyaev	(参考訳) ドメイン適応(da)法は、異なる分散トレイン(ソース)とテスト(ターゲット)データの問題に取り組むために、医療画像分割タスクで広く使われている。対象ドメインからの注釈付きサンプルの数が限られている教師付きDAタスクについて検討する。最小限の注釈付きデータの量で十分な正確なモデルを構築することです。既存の手法のほとんどは、事前訓練された畳み込みニューラルネットワーク(CNN)の微調整固有の層である。しかし、どの層が微調整に優れているのか、コンセンサスはない。低レベルなドメインシフトを持つイメージの最初のレイヤや、高レベルなドメインシフトを持つイメージのより深いレイヤ。この目的のために,最適な微調整を行うレイヤを自動的に選択するCNNアーキテクチャであるSpotTUnetを提案する。より具体的には、対象ドメイン上で、トレーニング済みネットワークから特定の層を微調整するか再利用すべきかを示すポリシーも学習する。本手法は,アノテートデータの極端な不足下においても,非フレキシブル微調整法と同等のレベルで動作することを示す。第二に、SpotTUnetポリシーは、ネットワーク上でのドメインシフトの影響を階層的に可視化し、堅牢なドメイン一般化手法の開発にさらに使用できることを示す。 SpotTUnetの性能を広範囲に評価するために、明示的なドメインシフトを特徴とする脳MR画像の公開データセット(CC359)を用いる。再現可能な実験パイプラインをリリースする。 Domain Adaptation (DA) methods are widely used in medical image segmentation tasks to tackle the problem of differently distributed train (source) and test (target) data. We consider the supervised DA task with a limited number of annotated samples from the target domain. It corresponds to one of the most relevant clinical setups: building a sufficiently accurate model on the minimum possible amount of annotated data. Existing methods mostly fine-tune specific layers of the pretrained Convolutional Neural Network (CNN). However, there is no consensus on which layers are better to fine-tune, e.g. the first layers for images with low-level domain shift or the deeper layers for images with high-level domain shift. To this end, we propose SpotTUnet - a CNN architecture that automatically chooses the layers which should be optimally fine-tuned. More specifically, on the target domain, our method additionally learns the policy that indicates whether a specific layer should be fine-tuned or reused from the pretrained network. We show that our method performs at the same level as the best of the nonflexible fine-tuning methods even under the extreme scarcity of annotated data. Secondly, we show that SpotTUnet policy provides a layer-wise visualization of the domain shift impact on the network, which could be further used to develop robust domain generalization methods. In order to extensively evaluate SpotTUnet performance, we use a publicly available dataset of brain MR images (CC359), characterized by explicit domain shift. We release a reproducible experimental pipeline.	翻訳日:2021-07-14 06:02:18 公開日:2021-07-10
# (参考訳) 特徴量に基づくイベントステレオビジュアルオドメトリー Feature-based Event Stereo Visual Odometry ( http://arxiv.org/abs/2107.04921v1 ) ライセンス: CC BY 4.0	Antea Hadviger, Igor Cvi\v{s}i\'c, Ivan Markovi\'c, Sacha Vra\v{z}i\'c, Ivan Petrovi\'c	(参考訳) イベントベースのカメラは生物学的にインスパイアされたセンサーであり、シーン内の非同期画素の明るさ変化を出力する。ハイダイナミックレンジとマイクロ秒の時間分解能は、照明や高速シナリオに挑戦する環境では標準カメラよりも信頼性が高く、イベントカメラのみに基づいたオドメトリーアルゴリズムの開発は、自律システムやロボットにとってエキサイティングな新しい可能性をもたらす。本稿では,特徴量検出と注意的特徴管理によるマッチングに基づくイベントカメラのステレオ・ビジュアル・オドメトリ法を提案する。提案手法は,屋内飛行ドローンが取得したMVSECシーケンスとDSEC屋外運転シーケンスの2つの公開データセット上での性能を評価する。 mvsecはモーションキャプチャによる正確な地中真実を提供するが、dsecは地中真実を示さないが、標準カメラフレームの基準軌道を得るために、キッティスコアボードの最高ランキングアルゴリズムの一つであるソフト・ビジュアル・オドメトリ(soft visual odometry)を使用した。本手法とESVO法を比較した。この手法はMVSECシークエンスで同等の性能を示すが,DSECデータセットのESVOではデフォルトパラメータで屋外走行シナリオを処理できなかった。さらに,esvoに対する2つの重要な利点は,追跡周波数を非同期イベントレートに適応させ,初期化を必要としない点である。 Event-based cameras are biologically inspired sensors that output events, i.e., asynchronous pixel-wise brightness changes in the scene. Their high dynamic range and temporal resolution of a microsecond makes them more reliable than standard cameras in environments of challenging illumination and in high-speed scenarios, thus developing odometry algorithms based solely on event cameras offers exciting new possibilities for autonomous systems and robots. In this paper, we propose a novel stereo visual odometry method for event cameras based on feature detection and matching with careful feature management, while pose estimation is done by reprojection error minimization. We evaluate the performance of the proposed method on two publicly available datasets: MVSEC sequences captured by an indoor flying drone and DSEC outdoor driving sequences. MVSEC offers accurate ground truth from motion capture, while for DSEC, which does not offer ground truth, in order to obtain a reference trajectory on the standard camera frames we used our SOFT visual odometry, one of the highest ranking algorithms on the KITTI scoreboards. We compared our method to the ESVO method, which is the first and still the only stereo event odometry method, showing on par performance on the MVSEC sequences, while on the DSEC dataset ESVO, unlike our method, was unable to handle outdoor driving scenario with default parameters. Furthermore, two important advantages of our method over ESVO are that it adapts tracking frequency to the asynchronous event rate and does not require initialization.	翻訳日:2021-07-14 05:52:01 公開日:2021-07-10
# (参考訳) telinet - 単純かつ浅い畳み込みニューラルネットワーク(cnn)によるcovid-19患者のctスキャンの分類 TeliNet, a simple and shallow Convolution Neural Network (CNN) to Classify CT Scans of COVID-19 patients ( http://arxiv.org/abs/2107.04930v1 ) ライセンス: CC BY 4.0	Mohammad Nayeem Teli	(参考訳) 新型コロナウイルス(COVID-19)により世界中で数十万人が死亡し、数百万人が負傷した。このパンデミックに対する戦いは、複数の方面で進行中だ。ワクチン接種はスピードを上げているが、まだ何十億もの予防接種を受けていない人々がいる。この戦いでは、感染予防のために病気の診断と患者の隔離が大きな役割を果たす。機械学習は、患者の胸部X線とCTスキャン画像を分析し、新型コロナウイルスの診断を支援する。本研究では,単純で浅い畳み込み型ニューラルネットワークであるtelinetを用いて,新型コロナウイルスのctスキャン画像の分類を行う。この結果は,VGGNetのF1スコアとベンチマーク手法より優れていた。提案手法は他の手法と比較してより軽量である。 Hundreds of millions of cases and millions of deaths have occurred worldwide due to COVID-19. The fight against this pandemic is on-going on multiple fronts. While vaccinations are picking up speed, there are still billions of unvaccinated people. In this fight diagnosis of the disease and isolation of the patients to prevent any spreads play a huge role. Machine Learning approaches have assisted the diagnosis of COVID-19 cases by analyzing chest X-ray and CT-scan images of patients. In this research we present a simple and shallow Convolutional Neural Network based approach, TeliNet, to classify CT-scan images of COVID-19 patients. Our results outperform the F1 score of VGGNet and the benchmark approaches. Our proposed solution is also more lightweight in comparison to the other methods.	翻訳日:2021-07-14 05:40:55 公開日:2021-07-10
# 敵攻撃の影響を受けやすい層同定 Identifying Layers Susceptible to Adversarial Attacks ( http://arxiv.org/abs/2107.04827v1 ) ライセンス: Link先を確認	Shoaib Ahmed Siddiqui, Thomas Breuel	(参考訳) 一般的なニューラルネットワークアーキテクチャは、敵のサンプルによる攻撃を受けやすい。ニューラルネットワークアーキテクチャは、一般的に低レベル特徴抽出層と高レベル分類層に分けられるが、逆さまなサンプルへのネットワークの感受性は、特徴抽出よりも分類に関する問題と見なされることが多い。 CIFAR-10, Imagenette, ImageNet 上の VGG と ResNet アーキテクチャの異なる部分を,非逆データと逆データを用いて選択的に再学習することで,このアイデアを検証した。実験の結果, 対立サンプルに対する感受性は低レベル特徴抽出層と関連していることがわかった。したがって、高層層の再訓練は堅牢性を達成するには不十分である。この現象には2つの説明がある: 敵の攻撃は、攻撃クラスに見られる特徴と区別できない初期層からの出力を生じるか、または、敵でないサンプルの特徴と統計的に異なる初期層からの出力を、後続の層で一貫した分類を許さないかである。隠れ層における特徴ベクトルの分布に関する大規模非線形次元減少と密度モデルを用いてこの問題を検証し,非対角的および対角的標本間の特徴分布が著しく異なることを示す。本研究は,敵のサンプルの統計的起源と防御可能性に関する新たな知見を提供する。 Common neural network architectures are susceptible to attack by adversarial samples. Neural network architectures are commonly thought of as divided into low-level feature extraction layers and high-level classification layers; susceptibility of networks to adversarial samples is often thought of as a problem related to classification rather than feature extraction. We test this idea by selectively retraining different portions of VGG and ResNet architectures on CIFAR-10, Imagenette and ImageNet using non-adversarial and adversarial data. Our experimental results show that susceptibility to adversarial samples is associated with low-level feature extraction layers. Therefore, retraining high-level layers is insufficient for achieving robustness. This phenomenon could have two explanations: either, adversarial attacks yield outputs from early layers that are indistinguishable from features found in the attack classes, or adversarial attacks yield outputs from early layers that differ statistically from features for non-adversarial samples and do not permit consistent classification by subsequent layers. We test this question by large-scale non-linear dimensionality reduction and density modeling on distributions of feature vectors in hidden layers and find that the feature distributions between non-adversarial and adversarial samples differ substantially. Our results provide new insights into the statistical origins of adversarial samples and possible defenses.	翻訳日:2021-07-13 16:21:50 公開日:2021-07-10
# 適応型進化クラスタリングアルゴリズムstarを用いたコーパスから導出する概念階層の形式的コンテキスト削減 Formal context reduction in deriving concept hierarchies from corpora using adaptive evolutionary clustering algorithm star ( http://arxiv.org/abs/2107.04781v1 ) ライセンス: Link先を確認	Bryar A. Hassan, Tarik A. Rashid and Seyedali Mirjalili	(参考訳) 概念階層の手動構築は通常、時間を要するリソース集約的なプロセスであるため、コーパスから概念階層を導出するプロセスを自動化することは有益である。このように、コーパスから概念階層を学習する全体的なプロセスは、テキストを文にパースし、文章を分割し、トークン化する一連のステップを含んでいる。補間ステップの後、fcaを用いてペアを抽出する。しかし、形式的な文脈では、面白くない、誤ったペアがいくつか存在するかもしれない。形式的コンテキストの生成は時間のかかるプロセスにつながる可能性があるため、形式的コンテキストサイズ削減は、興味のない、誤ったペアを取り除くために必要であり、それに従って概念格子と概念階層を抽出する時間を削減する。本研究の目的は,(1)FCAを利用するコーパスから概念階層を導出するフレームワーク,(2)ECAの適応版を用いた第1フレームワークの形式的文脈あいまいさを低減させるフレームワーク,の2つの枠組みを提案することである。 wikipediaのサンプル385コーパスを2つのフレームワークに適用して、形式的コンテキストのサイズを削減し、概念格子と概念階層を生成する実験を行った。その結果得られる形式的文脈の格子は、概念格子不変量を用いて標準の格子に評価される。したがって、2つの格子間の準同型は、基本格子とは対照的に、結果として得られる概念階層の質を89%維持し、縮小された概念格子は標準格子の構造的関係を継承する。適応ECAは,異なる密度(充填比)のランダムデータセット上での実行時間を測定するために,対応する4つのベースラインアルゴリズムに対して検討される。その結果,適応ECAは,異なるフィリング比で,他の競合技術よりも高速に概念格子を実行することがわかった。 It is beneficial to automate the process of deriving concept hierarchies from corpora since a manual construction of concept hierarchies is typically a time-consuming and resource-intensive process. As such, the overall process of learning concept hierarchies from corpora encompasses a set of steps: parsing the text into sentences, splitting the sentences and then tokenising it. After the lemmatisation step, the pairs are extracted using FCA. However, there might be some uninteresting and erroneous pairs in the formal context. Generating formal context may lead to a time-consuming process, so formal context size reduction is required to remove uninterested and erroneous pairs, taking less time to extract the concept lattice and concept hierarchies accordingly. In this premise, this study aims to propose two frameworks: (1) A framework to review the current process of deriving concept hierarchies from corpus utilising FCA; (2) A framework to decrease the formal contexts ambiguity of the first framework using an adaptive version of ECA. Experiments are conducted by applying 385 sample corpora from Wikipedia on the two frameworks to examine the reducing size of formal context, which leads to yield concept lattice and concept hierarchy. The resulting lattice of formal context is evaluated to the standard one using concept lattice-invariants. Accordingly, the homomorphic between the two lattices preserves the quality of resulting concept hierarchies by 89% in contrast to the basic ones, and the reduced concept lattice inherits the structural relation of the standard one. The adaptive ECA* is examined against its four counterpart baseline algorithms to measure the execution time on random datasets with different densities (fill ratios). The results show that adaptive ECA* performs concept lattice faster than other mentioned competitive techniques in different fill ratios.	翻訳日:2021-07-13 16:20:49 公開日:2021-07-10
# IoTフレームワークにおけるエッジデバイス上の家庭用ビデオサーベイランスの異常検出 Anomaly Detection in Residential Video Surveillance on Edge Devices in IoT Framework ( http://arxiv.org/abs/2107.04767v1 ) ライセンス: Link先を確認	Mayur R. Parate, Kishor M. Bhurchandi, Ashwin G. Kothari	(参考訳) インテリジェントな居住者監視は、最も重要なスマートコミュニティサービスの1つだ。セキュリティに対する需要が高まる中、監視システムは監視シーンの異常を検出する必要がある。住宅社会における知的監視のための高容量計算装置の利用は費用がかかり、実現不可能である。そこで我々は,CPUのみのエッジデバイスを用いたインテリジェント監視のための異常検出を提案する。オブジェクトレベルの推論とトラッキングをキャプチャするモジュールフレームワークを開発した。部分閉塞,姿勢変形,複雑な場面に対処するために,特徴符号化と軌跡関連を用いた。 anomaly detection frameworkの要素は、十分なfpsでcpuのみのエッジデバイスで動作するように最適化されている。実験の結果,提案手法は実現可能であり,実生活シナリオにおいて良好な結果が得られた。 Intelligent resident surveillance is one of the most essential smart community services. The increasing demand for security needs surveillance systems to be able to detect anomalies in surveillance scenes. Employing high-capacity computational devices for intelligent surveillance in residential societies is costly and not feasible. Therefore, we propose anomaly detection for intelligent surveillance using CPU-only edge devices. A modular framework to capture object-level inferences and tracking is developed. To cope with partial occlusions, posture deformations, and complex scenes we employed feature encoding and trajectory associations. Elements of the anomaly detection framework are optimized to run on CPU-only edge devices with sufficient FPS. The experimental results indicate the proposed method is feasible and achieves satisfactory results in real-life scenarios.	翻訳日:2021-07-13 16:20:04 公開日:2021-07-10
# Not-to-End:オンライン外科的位相認識のためのマルチステージアーキテクチャの探索 Not End-to-End: Explore Multi-Stage Architecture for Online Surgical Phase Recognition ( http://arxiv.org/abs/2107.04810v1 ) ライセンス: Link先を確認	Fangqiu Yi and Tingting Jiang	(参考訳) 手術相認識はコンピュータ支援手術システムにおいて特に関心があり、手術ビデオのフレーム毎にどの位相が起こっているかを予測することが目的である。マルチステージアーキテクチャを持つネットワークは、多くのコンピュータビジョンタスクにおいてリッチパターンで広く適用されており、予測器が最初に初期予測を出力し、追加の改良段階が初期予測を実行してさらなる改良を行う。既存の研究では,手術用ビデオコンテンツは順調であり,時間的パターンが豊富であることを示し,手術用位相認識タスクに適している。しかし, 手術段階認識タスクに多段階アーキテクチャを単純に適用すれば, エンドツーエンドの訓練方法が洗練能力の低下を招きかねないことが観察された。この問題に対処するため,外科的位相認識タスクのための多段階アーキテクチャの異なる設計を探索し,新たなエンドツーエンドトレーニング戦略を提案する。非エンドツーエンドのトレーニング戦略では、改良段階は2種類の乱れたシーケンスを別々に訓練する。一方,リファインメントモデルの3つの異なる選択を評価し,解析と解が特定の多段階モデルの選択にロバストであることを示す。 M2CAI16 Workflow ChallengeとCholec80データセットの2つの公開ベンチマークで実験を行います。その結果,当社の戦略でトレーニングされたマルチステージアーキテクチャは,現在の最先端のシングルステージモデルのパフォーマンスを大きく向上させることがわかった。コードは \url{https://github.com/chinayi/casual_tcn} で入手できる。 Surgical phase recognition is of particular interest to computer assisted surgery systems, in which the goal is to predict what phase is occurring at each frame for a surgery video. Networks with multi-stage architecture have been widely applied in many computer vision tasks with rich patterns, where a predictor stage first outputs initial predictions and an additional refinement stage operates on the initial predictions to perform further refinement. Existing works show that surgical video contents are well ordered and contain rich temporal patterns, making the multi-stage architecture well suited for the surgical phase recognition task. However, we observe that when simply applying the multi-stage architecture to the surgical phase recognition task, the end-to-end training manner will make the refinement ability fall short of its wishes. To address the problem, we propose a new non end-to-end training strategy and explore different designs of multi-stage architecture for surgical phase recognition task. For the non end-to-end training strategy, the refinement stage is trained separately with proposed two types of disturbed sequences. Meanwhile, we evaluate three different choices of refinement models to show that our analysis and solution are robust to the choices of specific multi-stage models. We conduct experiments on two public benchmarks, the M2CAI16 Workflow Challenge, and the Cholec80 dataset. Results show that multi-stage architecture trained with our strategy largely boosts the performance of the current state-of-the-art single-stage model. Code is available at \url{https://github.com/ChinaYi/casual_tcn}.	翻訳日:2021-07-13 16:19:54 公開日:2021-07-10
# 低域フィルタを超える: 自動フィルタリングによるグラフ畳み込みネットワーク Beyond Low-pass Filtering: Graph Convolutional Networks with Automatic Filtering ( http://arxiv.org/abs/2107.04755v1 ) ライセンス: Link先を確認	Zonghan Wu, Shirui Pan, Guodong Long, Jing Jiang, Chengqi Zhang	(参考訳) グラフ構造データからの深層学習には,グラフ畳み込みネットワークが不可欠になりつつある。既存のグラフ畳み込みネットワークのほとんどは、2つの大きな欠点を共有している。第一に、それらは本質的に低パスフィルタであるため、グラフ信号の潜在的に有用な中・高周波帯域は無視される。次に、既存のグラフ畳み込みフィルタの帯域幅を固定する。グラフ畳み込みフィルタのパラメータは、グラフ畳み込みフィルタ関数の曲率を変更することなく、グラフ入力を変換する。実際、専門家のドメイン知識がなければ、ある時点で周波数を維持または遮断すべきかどうかは不明です。本稿では,グラフ信号の全スペクトルを捕捉し,グラフ畳み込みフィルタの帯域幅を自動的に更新する自動グラフ畳み込みネットワーク(AutoGCN)を提案する。グラフスペクトル理論に基づいているが、私たちのAutoGCNも空間に局在しており、空間形式を持っている。実験の結果,AutoGCNは低域通過フィルタとしてのみ動作するベースライン法よりも大幅に改善されていることがわかった。 Graph convolutional networks are becoming indispensable for deep learning from graph-structured data. Most of the existing graph convolutional networks share two big shortcomings. First, they are essentially low-pass filters, thus the potentially useful middle and high frequency band of graph signals are ignored. Second, the bandwidth of existing graph convolutional filters is fixed. Parameters of a graph convolutional filter only transform the graph inputs without changing the curvature of a graph convolutional filter function. In reality, we are uncertain about whether we should retain or cut off the frequency at a certain point unless we have expert domain knowledge. In this paper, we propose Automatic Graph Convolutional Networks (AutoGCN) to capture the full spectrum of graph signals and automatically update the bandwidth of graph convolutional filters. While it is based on graph spectral theory, our AutoGCN is also localized in space and has a spatial form. Experimental results show that AutoGCN achieves significant improvement over baseline methods which only work as low-pass filters.	翻訳日:2021-07-13 16:17:19 公開日:2021-07-10
# 階層的特徴回帰によるクラスタ正規化 Cluster Regularization via a Hierarchical Feature Regression ( http://arxiv.org/abs/2107.04831v1 ) ライセンス: Link先を確認	Johann Pfitzinger	(参考訳) 高次元非直交予測器セットを用いた予測タスクは、最小二乗ベースの適合手順に挑戦する。大規模で生産的な文献が存在し、パラメータ推定の外部ロバスト性を改善するための様々な正規化アプローチについて議論している。本稿では,機械学習およびグラフ理論の領域からの洞察を動員し,予測子集合の教師付き階層表現に沿ってパラメータを推定し,パラメータをグループターゲットへ縮小する新しいクラスタ型正規化法である階層的特徴回帰(hfr)を提案する。この方法は、予測群の最適組成を推定する能力と、グループ目標を不均一に推定する能力において革新的である。 HFRは調整因子の回帰と見なすことができ、フィッティングプロセスで捕獲された慣性変動の程度でペナルティによって支配される収縮の強さが支配される。この手法は,高密度,スパース,グループ化されたデータ生成プロセスを含む,多種多様な回帰タスクに対して,ベンチマーク正規化推定器のパネルよりも優れた予測精度と汎用性を示す。経済成長予測への応用は、HFRの有効性を実証的な環境で示し、いくつかの頻繁な選択肢やベイズ的な選択肢と好意的に比較するために用いられる。 Prediction tasks with high-dimensional nonorthogonal predictor sets pose a challenge for least squares based fitting procedures. A large and productive literature exists, discussing various regularized approaches to improving the out-of-sample robustness of parameter estimates. This paper proposes a novel cluster-based regularization - the hierarchical feature regression (HFR) -, which mobilizes insights from the domains of machine learning and graph theory to estimate parameters along a supervised hierarchical representation of the predictor set, shrinking parameters towards group targets. The method is innovative in its ability to estimate optimal compositions of predictor groups, as well as the group targets endogenously. The HFR can be viewed as a supervised factor regression, with the strength of shrinkage governed by a penalty on the extent of idiosyncratic variation captured in the fitting process. The method demonstrates good predictive accuracy and versatility, outperforming a panel of benchmark regularized estimators across a diverse set of simulated regression tasks, including dense, sparse and grouped data generating processes. An application to the prediction of economic growth is used to illustrate the HFR's effectiveness in an empirical setting, with favorable comparisons to several frequentist and Bayesian alternatives.	翻訳日:2021-07-13 16:13:24 公開日:2021-07-10
# マルチヘッドコトレーニングによる半教師付き学習 Semi-Supervised Learning with Multi-Head Co-Training ( http://arxiv.org/abs/2107.04795v1 ) ライセンス: Link先を確認	Mingcai Chen, Yuntao Du, Yi Zhang, Shuwei Qian, Chongjun Wang	(参考訳) 自己学習から拡張されたコトレーニングは、半教師付き学習のフレームワークの1つである。これは、個別の分類器が互いに衝突しないように、アルゴリズムを微妙に設計する余分な分類器の訓練に要する。本稿では,半教師付き画像分類のための簡易かつ効率的なコトレーニングアルゴリズムであるmulti-head co-trainingを提案する。ベースラーナーをマルチヘッド構造に統合することにより、モデルは最小限の余分なパラメータに収まる。統一モデルにおける全ての分類ヘッドは「弱く強い強化」戦略を通じて仲間と相互作用し、多様性を明示的に促進することなく単一視点のコトレーニングを達成する。マルチヘッド協調学習の有効性は,標準半教師付き学習ベンチマークを用いた実証研究で実証された。 Co-training, extended from self-training, is one of the frameworks for semi-supervised learning. It works at the cost of training extra classifiers, where the algorithm should be delicately designed to prevent individual classifiers from collapsing into each other. In this paper, we present a simple and efficient co-training algorithm, named Multi-Head Co-Training, for semi-supervised image classification. By integrating base learners into a multi-head structure, the model is in a minimal amount of extra parameters. Every classification head in the unified model interacts with its peers through a "Weak and Strong Augmentation" strategy, achieving single-view co-training without promoting diversity explicitly. The effectiveness of Multi-Head Co-Training is demonstrated in an empirical study on standard semi-supervised learning benchmarks.	翻訳日:2021-07-13 16:11:24 公開日:2021-07-10
# DualVGR:ビデオ質問応答のためのデュアルビジュアルグラフ推論ユニット DualVGR: A Dual-Visual Graph Reasoning Unit for Video Question Answering ( http://arxiv.org/abs/2107.04768v1 ) ライセンス: Link先を確認	Jianyu Wang, Bing-Kun Bao, Changsheng Xu	(参考訳) ビデオ質問応答は難しい作業であり、エージェントはリッチなビデオコンテンツを理解し、空間的時間的推論を行う必要がある。しかし、既存のグラフベースの手法では、ビデオQAの2つの特性を無視して、多段階の推論をうまく行えない。(1)同じビデオであっても、異なる質問は、関係推論で答えを推測するために異なる量のビデオクリップやオブジェクトを必要とする可能性がある。これらの観察に基づいて,ビデオ上でエンドツーエンドに推論を行うデュアルビジュアルグラフ推論ユニット(dualvgr)を提案する。 DualVGRの最初のコントリビューションは、説明可能なQuery Punishment Moduleの設計です。 2つめの貢献は、ビデオベースのマルチビューグラフアテンションネットワークであり、外観と動きの特徴の関係をキャプチャする。我々のDualVGRネットワークは、ベンチマークMSVD-QAおよびSVQAデータセットの最先端性能を実現し、ベンチマークMSRVTT-QAデータセットの競合結果を示す。私たちのコードはhttps://github.com/MMIR/DualVGR-VideoQA.comで公開されています。 Video question answering is a challenging task, which requires agents to be able to understand rich video contents and perform spatial-temporal reasoning. However, existing graph-based methods fail to perform multi-step reasoning well, neglecting two properties of VideoQA: (1) Even for the same video, different questions may require different amount of video clips or objects to infer the answer with relational reasoning; (2) During reasoning, appearance and motion features have complicated interdependence which are correlated and complementary to each other. Based on these observations, we propose a Dual-Visual Graph Reasoning Unit (DualVGR) which reasons over videos in an end-to-end fashion. The first contribution of our DualVGR is the design of an explainable Query Punishment Module, which can filter out irrelevant visual features through multiple cycles of reasoning. The second contribution is the proposed Video-based Multi-view Graph Attention Network, which captures the relations between appearance and motion features. Our DualVGR network achieves state-of-the-art performance on the benchmark MSVD-QA and SVQA datasets, and demonstrates competitive results on benchmark MSRVTT-QA datasets. Our code is available at https://github.com/MMIR/DualVGR-VideoQA.	翻訳日:2021-07-13 16:09:17 公開日:2021-07-10
# 自己教師型音声表現モデルの階層的解析 Layer-wise Analysis of a Self-supervised Speech Representation Model ( http://arxiv.org/abs/2107.04734v1 ) ライセンス: Link先を確認	Ankita Pasad, Ju-Chieh Chou, Karen Livescu	(参考訳) 近年,音声表現モデルの事前学習において,自己教師付き学習手法が成功している。これらの学習表現の有用性は実証的に観察されているが、事前訓練された表現自身で符号化された情報の種類や範囲についてはあまり研究されていない。このような洞察の開発は、これらのモデルの能力と限界を理解し、研究コミュニティがより効率的に下流アプリケーションに利用できるようにするのに役立つ。本研究では,その中間表現ベクトルを用いて,最近かつ成功した事前学習モデル(wav2vec 2.0)を解析ツールを用いて検討することにより,このギャップを埋める。非パラメトリックプローブを用いた単純な下流作業における標準相関,相互情報,および性能の測定値を用いて, (i) 音響的および言語的情報内容の問い合わせ, (ii) モデル層間の情報の進化を特徴付けるとともに, (iii) 自動音声認識(ASR) モデルがこれらの観測に与える影響を理解する。その結果,asrの微調整プロトコルの修正が動機となり,低リソース環境での単語誤り率の向上が図られた。 Recently proposed self-supervised learning approaches have been successful for pre-training speech representation models. The utility of these learned representations has been observed empirically, but not much has been studied about the type or extent of information encoded in the pre-trained representations themselves. Developing such insights can help understand the capabilities and limits of these models and enable the research community to more efficiently develop their usage for downstream applications. In this work, we begin to fill this gap by examining one recent and successful pre-trained model (wav2vec 2.0), via its intermediate representation vectors, using a suite of analysis tools. We use the metrics of canonical correlation, mutual information, and performance on simple downstream tasks with non-parametric probes, in order to (i) query for acoustic and linguistic information content, (ii) characterize the evolution of information across model layers, and (iii) understand how fine-tuning the model for automatic speech recognition (ASR) affects these observations. Our findings motivate modifying the fine-tuning protocol for ASR, which produces improved word error rates in a low-resource setting.	翻訳日:2021-07-13 16:07:44 公開日:2021-07-10
# 伝達学習法を用いたJPEG圧縮領域における植物葉病の直接検出 Detection of Plant Leaf Disease Directly in the JPEG Compressed Domain using Transfer Learning Technique ( http://arxiv.org/abs/2107.04813v1 ) ライセンス: Link先を確認	Atul Sharma, Bulla Rajesh and Mohammed Javed	(参考訳) 植物の葉病は食品の安全性に重大な危険をもたらし、品質と生産量の低下を引き起こす。したがって、葉病の正確かつタイムリーな検出は、作物の損失を確認し、人々の食料需要の増加に対応するために非常に重要である。従来の手法は、一般的に費用がかかり、アクセス不能な検査と人間のスキルに依存している。近年,Deep Neural Networksは画像分類において極めて有益である。本研究では, JPEG圧縮領域において, 転写学習を用いた植物葉病検出について検討した。ここでは、DCT係数からなるJPEG圧縮ストリームをニューラルネットワークに直接供給し、分類効率を向上させる。 JPEG圧縮葉データに対する実験結果から,提案モデルの有効性が示された。 Plant leaf diseases pose a significant danger to food security and they cause depletion in quality and volume of production. Therefore accurate and timely detection of leaf disease is very important to check the loss of the crops and meet the growing food demand of the people. Conventional techniques depend on lab investigation and human skills which are generally costly and inaccessible. Recently, Deep Neural Networks have been exceptionally fruitful in image classification. In this research paper, plant leaf disease detection employing transfer learning is explored in the JPEG compressed domain. Here, the JPEG compressed stream consisting of DCT coefficients is, directly fed into the Neural Network to improve the efficiency of classification. The experimental results on JPEG compressed leaf dataset demonstrate the efficacy of the proposed model.	翻訳日:2021-07-13 16:06:05 公開日:2021-07-10
# タスク指向意味解析におけるデータ効率の評価 Assessing Data Efficiency in Task-Oriented Semantic Parsing ( http://arxiv.org/abs/2107.04736v1 ) ライセンス: Link先を確認	Shrey Desai, Akshat Shrivastava, Justin Rill, Brian Moran, Safiyyah Saleem, Alexander Zotov, Ahmed Aly	(参考訳) データ効率は魅力的な特徴であるにもかかわらず、タスク指向のセマンティックパーシングで測定し最適化することはしばしば困難である。本研究は,データ効率に関する質問に対する統一的な解決策を提供するためのステップとして,パーサが特定の品質バーを達成するのに必要なドメイン内データ量を近似的に測定する4段階プロトコルを提案する。具体的には,(1)異なる濃度のターゲット部分集合をサンプリングする,(2)各部分集合上の微調整パーサ,(3)ターゲット部分集合 (%) と正確な一致 (%) に関する滑らかな曲線を得る,(4) 曲線をマイニングアドホック(ターゲット部分集合,完全一致)点に参照する。当社のプロトコルは,2つの実世界のケーススタディ – モデル一般化可能性と意図複雑性 – に適用されている。 Data efficiency, despite being an attractive characteristic, is often challenging to measure and optimize for in task-oriented semantic parsing; unlike exact match, it can require both model- and domain-specific setups, which have, historically, varied widely across experiments. In our work, as a step towards providing a unified solution to data-efficiency-related questions, we introduce a four-stage protocol which gives an approximate measure of how much in-domain, "target" data a parser requires to achieve a certain quality bar. Specifically, our protocol consists of (1) sampling target subsets of different cardinalities, (2) fine-tuning parsers on each subset, (3) obtaining a smooth curve relating target subset (%) vs. exact match (%), and (4) referencing the curve to mine ad-hoc (target subset, exact match) points. We apply our protocol in two real-world case studies -- model generalizability and intent complexity -- illustrating its flexibility and applicability to practitioners in task-oriented semantic parsing.	翻訳日:2021-07-13 16:04:39 公開日:2021-07-10
# 計算疫学:書籍、ニュース記事、ツイートにおける実証的利用の時間的、生態学的ダイナミクスのチャート化 Computational Paremiology: Charting the temporal, ecological dynamics of proverb use in books, news articles, and tweets ( http://arxiv.org/abs/2107.04929v1 ) ライセンス: Link先を確認	E. Davis, C. M. Danforth, W. Mieder, and P. S. Dodds	(参考訳) 弁証器は言語と文化の重要な要素であり、その歴史と通貨に多くの注意が払われているが、時間とともに使用される頻度の変化について、比較的定量的な研究は行われていない。文書の様々なジャンルを反映した大規模なコーパスが広く利用可能になったことにより、この証明の重要性を広くダイナミックに見ることが可能になった。ここでは、3つのコーパス内での証明の時間的変化、種類、規模、時間による違い、何世紀にもわたって何百万もの書籍、20年間で何億ものニュース記事、そして10年間で何十億ものツイートを測定します。調査の結果,各会場において,使用頻度が重く,時代文化の動態を反映した傾向がみられ,ソーシャルメディア上の現代的形態へと進化してきた。 Proverbs are an essential component of language and culture, and though much attention has been paid to their history and currency, there has been comparatively little quantitative work on changes in the frequency with which they are used over time. With wider availability of large corpora reflecting many diverse genres of documents, it is now possible to take a broad and dynamic view of the importance of the proverb. Here, we measure temporal changes in the relevance of proverbs within three corpora, differing in kind, scale, and time frame: Millions of books over centuries; hundreds of millions of news articles over twenty years; and billions of tweets over a decade. We find that proverbs present heavy-tailed frequency-of-usage rank distributions in each venue; exhibit trends reflecting the cultural dynamics of the eras covered; and have evolved into contemporary forms on social media.	翻訳日:2021-07-13 16:04:16 公開日:2021-07-10
# 法的知識グラフを用いた類似事例推薦 Similar Cases Recommendation using Legal Knowledge Graphs ( http://arxiv.org/abs/2107.04771v1 ) ライセンス: Link先を確認	Jaspreet Singh Dhani, Ruchika Bhatt, Balaji Ganesan, Parikshet Sirohi, Vasudha Bhatnagar	(参考訳) 裁判、判決、法律、その他の法的文書から構築された法的な知識グラフは、質問応答、文書の類似性、検索などの多くのアプリケーションを可能にする。 nlpタスクの遠隔監視にナレッジグラフを使用することはよく研究されているが、ノード類似性のようなダウンストリームグラフタスクにナレッジグラフを使用することは、ノードタイプとその機能の選択に困難をもたらす。本稿では,法律知識グラフから導出したケースグラフにおける類似ノードの予測手法について述べる。 A legal knowledge graph constructed from court cases, judgments, laws and other legal documents can enable a number of applications like question answering, document similarity, and search. While the use of knowledge graphs for distant supervision in NLP tasks is well researched, using knowledge graphs for downstream graph tasks like node similarity presents challenges in selecting node types and their features. In this demo, we describe our solution for predicting similar nodes in a case graph derived from our legal knowledge graph.	翻訳日:2021-07-13 16:02:47 公開日:2021-07-10
# 常識推論から複数の選好を通してのニューラルネットワークモデルへ:概要 From Common Sense Reasoning to Neural Network Models through Multiple Preferences: an overview ( http://arxiv.org/abs/2107.04870v1 ) ライセンス: Link先を確認	Laura Giordano, Valentina Gliozzi, Daniele Theseider Dupr\'e	(参考訳) 本稿では,条件論理と優先論理とニューラルネットワークモデルの関係について,多項意味論に基づく考察を行う。本稿では,ニューラルネットワークモデルに意味的解釈を提供するツールとして,異なる概念に対する嗜好を考慮に入れるために,最近導入された概念的多元参照セマンティクスを提案する。このアプローチは、教師なしニューラルネットワークモデル(自己組織化マップ)と教師なしニューラルネットワークモデル(マルチレイヤーパーセプトロン)の両方で検討されており、同じアプローチが他のニューラルネットワークモデルにも拡張されることを期待している。これにより、ネットワークの入出力動作をキャプチャする解釈を通じて、ネットワークの論理特性を(モデルチェックによって)チェックすることができる。多層パーセプトロンでは、ディープネットワーク自体を条件付き知識ベースと見なすことができ、シナプス接続は重み付き条件付き接続に対応する。本稿では, 自己組織化マップと多層パーセプトロンの事例を通して, 一般的なアプローチを説明し, オープンな課題と展望について考察する。 In this paper we discuss the relationships between conditional and preferential logics and neural network models, based on a multi-preferential semantics. We propose a concept-wise multipreference semantics, recently introduced for defeasible description logics to take into account preferences with respect to different concepts, as a tool for providing a semantic interpretation to neural network models. This approach has been explored both for unsupervised neural network models (Self-Organising Maps) and for supervised ones (Multilayer Perceptrons), and we expect that the same approach might be extended to other neural network models. It allows for logical properties of the network to be checked (by model checking) over an interpretation capturing the input-output behavior of the network. For Multilayer Perceptrons, the deep network itself can be regarded as a conditional knowledge base, in which synaptic connections correspond to weighted conditionals. The paper describes the general approach, through the cases of Self-Organising Maps and Multilayer Perceptrons, and discusses some open issues and perspectives.	翻訳日:2021-07-13 16:02:37 公開日:2021-07-10
# 視覚トランスフォーマーにおける局所からグローバルへの自己着脱 Local-to-Global Self-Attention in Vision Transformers ( http://arxiv.org/abs/2107.04735v1 ) ライセンス: Link先を確認	Jinpeng Li, Yichao Yan, Shengcai Liao, Xiaokang Yang, Ling Shao	(参考訳) トランスフォーマーはコンピュータビジョンタスクに大きな可能性を示した。高解像度の視覚データにおける自己注意の密度計算を避けるため、最近のTransformerモデルは階層設計を採用しており、ローカルウィンドウ内でのみ自己注意が計算される。この設計は効率を大幅に改善するが、早い段階ではグローバルな特徴推論を欠いている。本研究では,各ステージの複数の粒度で局所からグローバルへの推論を可能にする変圧器のマルチパス構造を設計する。提案するフレームワークは計算効率が高く,有効である。計算オーバーヘッドが極端に増加し,画像分類とセマンティックセグメンテーションの両方において顕著な改善が得られた。コードはhttps://github.com/ljpadam/LG-Transformerで入手できる。 Transformers have demonstrated great potential in computer vision tasks. To avoid dense computations of self-attentions in high-resolution visual data, some recent Transformer models adopt a hierarchical design, where self-attentions are only computed within local windows. This design significantly improves the efficiency but lacks global feature reasoning in early stages. In this work, we design a multi-path structure of the Transformer, which enables local-to-global reasoning at multiple granularities in each stage. The proposed framework is computationally efficient and highly effective. With a marginal increasement in computational overhead, our model achieves notable improvements in both image classification and semantic segmentation. Code is available at https://github.com/ljpadam/LG-Transformer	翻訳日:2021-07-13 16:01:02 公開日:2021-07-10
# TTAN:Few-shot行動認識のための2段階時間アライメントネットワーク TTAN: Two-Stage Temporal Alignment Network for Few-shot Action Recognition ( http://arxiv.org/abs/2107.04782v1 ) ライセンス: Link先を確認	Shuyuan Li, Huabin Liu, Rui Qian, Yuxi Li, John See, Mengjuan Fei, Xiaoyuan Yu, Weiyao Lin	(参考訳) 数少ないアクション認識は、少数のサンプル(サポート)を使用して、新しいアクションクラス(クエリ)を認識することを目的としている。現在のアプローチの大半は、ビデオ間の類似性を比較するために学習するメトリック学習パラダイムに従っている。近年,このような類似性を直接測定することは理想的ではないことが観測されている。本稿では,動作継続時間の誤認と動作進化の誤認の2つの側面からこの問題を逮捕する。我々は2段階の時間アライメントネットワーク(TTAN)を通してそれらを逐次処理する。第1段階は予測されたアフィンワープパラメータで時間変換を行い、第2段階はクロスアテンション機構を使用してサポートとクエリの特徴を一貫した進化に調整する。さらに,サポートサンプル間の不一致を考慮した,新しいマルチショット融合戦略を考案する。アブレーション研究と可視化は、両方の段階が誤認識に対処する役割を実証している。ベンチマークデータセットに関する広範囲な実験により, 提案手法が, 最先端の動作認識性能を実現する可能性を示した。 Few-shot action recognition aims to recognize novel action classes (query) using just a few samples (support). The majority of current approaches follow the metric learning paradigm, which learns to compare the similarity between videos. Recently, it has been observed that directly measuring this similarity is not ideal since different action instances may show distinctive temporal distribution, resulting in severe misalignment issues across query and support videos. In this paper, we arrest this problem from two distinct aspects -- action duration misalignment and motion evolution misalignment. We address them sequentially through a Two-stage Temporal Alignment Network (TTAN). The first stage performs temporal transformation with the predicted affine warp parameters, while the second stage utilizes a cross-attention mechanism to coordinate the features of the support and query to a consistent evolution. Besides, we devise a novel multi-shot fusion strategy, which takes the misalignment among support samples into consideration. Ablation studies and visualizations demonstrate the role played by both stages in addressing the misalignment. Extensive experiments on benchmark datasets show the potential of the proposed method in achieving state-of-the-art performance for few-shot action recognition.	翻訳日:2021-07-13 16:00:51 公開日:2021-07-10
# ポリモルフィックトランスフォーマによるマイノショット領域適応 Few-Shot Domain Adaptation with Polymorphic Transformers ( http://arxiv.org/abs/2107.04805v1 ) ライセンス: Link先を確認	Shaohua Li, Xiuchao Sui, Jie Fu, Huazhu Fu, Xiangde Luo, Yangqin Feng, Xinxing Xu, Yong Liu, Daniel Ting, Rick Siow Mong Goh	(参考訳) ある医療画像に対してトレーニングされたディープニューラルネットワーク(DNN)は、トレーニング画像(ソースドメイン)とテスト画像(ターゲットドメイン)とのさまざまなドメインの相違により、目に見えないテスト画像に深刻なパフォーマンス低下を経験することが多い。臨床環境では、十分な注記対象領域データを短時間で収集することは困難である。少数のアノテーションで訓練されたモデルを適用するようなドメイン適応は、この場合非常に実用的で有用である。本稿では,任意のdnnバックボーンに組み込むことができるポリモーフィックトランス(polyformer)を提案する。具体的には、ポリフォーマ層をソースドメインでトレーニングされたモデルに挿入した後、プロトタイプの埋め込みを抽出し、ソースドメインの特徴の"基底"と見ることができる。対象領域では、ポリフォーム層は、画像特徴とプロトタイプ埋め込み間の相互作用を制御する投影層を更新するだけで適応する。他のモデル重み(バッチノルムパラメータを除く)は適応中に凍結される。これにより、アノテーションをオーバーフィットする可能性が大幅に減少し、いくつかの注釈付き画像でトレーニングした後、ターゲットドメインでロバストに実行することが可能となる。本稿では,2つの医療的セグメンテーション課題(光ディスク/カップセグメンテーション,ポリープセグメンテーション)におけるPolyformerの有効性を示す。 Polyformerのソースコードはhttps://github.com/askerlee/segtranで公開されている。 Deep neural networks (DNNs) trained on one set of medical images often experience severe performance drop on unseen test images, due to various domain discrepancy between the training images (source domain) and the test images (target domain), which raises a domain adaptation issue. In clinical settings, it is difficult to collect enough annotated target domain data in a short period. Few-shot domain adaptation, i.e., adapting a trained model with a handful of annotations, is highly practical and useful in this case. In this paper, we propose a Polymorphic Transformer (Polyformer), which can be incorporated into any DNN backbones for few-shot domain adaptation. Specifically, after the polyformer layer is inserted into a model trained on the source domain, it extracts a set of prototype embeddings, which can be viewed as a "basis" of the source-domain features. On the target domain, the polyformer layer adapts by only updating a projection layer which controls the interactions between image features and the prototype embeddings. All other model weights (except BatchNorm parameters) are frozen during adaptation. Thus, the chance of overfitting the annotations is greatly reduced, and the model can perform robustly on the target domain after being trained on a few annotated images. We demonstrate the effectiveness of Polyformer on two medical segmentation tasks (i.e., optic disc/cup segmentation, and polyp segmentation). The source code of Polyformer is released at https://github.com/askerlee/segtran.	翻訳日:2021-07-13 16:00:32 公開日:2021-07-10
# 注意機構を用いた弱修正深度推定ネットワーク A Weakly-Supervised Depth Estimation Network Using Attention Mechanism ( http://arxiv.org/abs/2107.04819v1 ) ライセンス: Link先を確認	Fang Gao, Jiabao Wang, Jun Yu, Yaoxiong Wang, Feng Shuang	(参考訳) 単眼深度推定(MDE)はシーン理解や再構成といった多くのアプリケーションにおいて基本的な課題である。しかし、既存のメソッドのほとんどは正確なラベル付きデータセットに依存している。 ANUWという名前の注目ネスト付きU-net(ANU)に基づく弱監督型フレームワークを,ラベルの誤用に対して導入した。 ANUWは、入力された単一のRGB画像を深度画像に変換するためにエンドツーエンドに訓練される。これは、高密度残留ネットワーク構造、適応重みチャネルアテンション(AWCA)モジュール、パッチ第2非ローカル(PSNL)モジュール、ソフトラベル生成方法からなる。高密度残留ネットワークは、入力をエンコードしてデコードするネットワークの本体である。 awcaモジュールはチャネル重みを適応的に調整して重要な特徴を抽出することができる。 PSNLモジュールは2階非局所法により空間的注意機構を実装している。提案するソフトラベル生成手法は,データセットの事前知識を用いて,偽のラベルを置き換えるソフトラベルを生成する。提案したANUWは、欠陥のある単分子深度データセットに基づいてトレーニングされ、トレーニングされたモデルは3つの公開データセット上でテストされ、その結果、最先端のMDE手法と比較してANUWの優位性を示す。 Monocular depth estimation (MDE) is a fundamental task in many applications such as scene understanding and reconstruction. However, most of the existing methods rely on accurately labeled datasets. A weakly-supervised framework based on attention nested U-net (ANU) named as ANUW is introduced in this paper for cases with wrong labels. The ANUW is trained end-to-end to convert an input single RGB image into a depth image. It consists of a dense residual network structure, an adaptive weight channel attention (AWCA) module, a patch second non-local (PSNL) module and a soft label generation method. The dense residual network is the main body of the network to encode and decode the input. The AWCA module can adaptively adjust the channel weights to extract important features. The PSNL module implements the spatial attention mechanism through a second-order non-local method. The proposed soft label generation method uses the prior knowledge of the dataset to produce soft labels to replace false ones. The proposed ANUW is trained on a defective monocular depth dataset and the trained model is tested on three public datasets, and the results demonstrate the superiority of ANUW in comparison with the state-of-the-art MDE methods.	翻訳日:2021-07-13 16:00:05 公開日:2021-07-10
# 7つの基本表情分類のためのベイズ畳み込みニューラルネットワーク Bayesian Convolutional Neural Networks for Seven Basic Facial Expression Classifications ( http://arxiv.org/abs/2107.04834v1 ) ライセンス: Link先を確認	Wei Gong, Hailan Huang	(参考訳) 7つの基本的な表情分類は、複雑な人間の感情を表現する基本的な方法であり、人工知能研究の重要な部分である。従来のベイズニューラルネットワークの枠組みに基づき,本論文で構築したresnet-18_bnnネットワークは,(1)不確定パラメータのkl損失と特定のパラメータの交叉からなる,新たな目的関数を提案する。エントロピー損失組成物。 2) 特殊目的関数を対象として, これら2つのパラメータを交互に更新するトレーニングスキームを提案する。 (3) 最後の畳み込み群のパラメータのみをモデル化する。実験解析により,本手法はaf-wild2データベースの評価セットにおいて98.28%の精度を達成した。従来のベイズ型ニューラルネットワークと比較すると,本手法は分類精度が最も高い。 The seven basic facial expression classifications are a basic way to express complex human emotions and are an important part of artificial intelligence research. Based on the traditional Bayesian neural network framework, the ResNet-18_BNN network constructed in this paper has been improved in the following three aspects: (1) A new objective function is proposed, which is composed of the KL loss of uncertain parameters and the intersection of specific parameters. Entropy loss composition. (2) Aiming at a special objective function, a training scheme for alternately updating these two parameters is proposed. (3) Only model the parameters of the last convolution group. According to experimental analysis, our method achieves an accuracy of 98.28% on the evaluation set of the Aff-Wild2 database. Compared with the traditional Bayesian Neural Network, our method brings the highest classification accuracy gain.	翻訳日:2021-07-13 15:59:44 公開日:2021-07-10
# マルチドメインデータアグリゲーションに基づく医用画像セグメンテーションのための階層的自己監督学習 Hierarchical Self-Supervised Learning for Medical Image Segmentation Based on Multi-Domain Data Aggregation ( http://arxiv.org/abs/2107.04886v1 ) ライセンス: Link先を確認	Hao Zheng, Jun Han, Hongxiao Wang, Lin Yang, Zhuo Zhao, Chaoli Wang, Danny Z. Chen	(参考訳) 大規模ラベル付きデータセットは、教師付きディープラーニングの成功の鍵であるが、医用画像セグメンテーションでは、モデルトレーニングに十分な注釈付き画像を得ることは非常に困難である。多くのシナリオでは、注釈のない画像は豊富で容易に取得できる。自己教師付き学習(SSL)は、生のデータ情報と表現学習を利用する大きな可能性を示している。本稿では,無記名データを利用して医用画像セグメンテーションを促進する新しい自己教師付きフレームワークである階層型自己教師付き学習(hssl)を提案する。タスク固有の自己教師付き事前訓練とそれに続く教師付き微調整に関する現在の文献とは異なり、さまざまな医用画像セグメンテーションタスクのための異種データからタスク非依存の知識をSSLを用いて学習する。具体的には、まずいくつかの医学的課題からデータセットを集約し、自己教師付きでネットワークを事前訓練し、最後にラベル付きデータに微調整する。コントラスト損失と分類損失を組み合わせた新しい損失関数を開発し,セグメンテーションタスクのためのエンコーダ・デコーダアーキテクチャを事前学習する。広範な実験により,マルチドメイン合同事前学習は,ダウンストリームセグメンテーションタスクに有益であり,単ドメイン事前学習を大きく上回ることが示された。スクラッチから学ぶことに比べ、新しい手法は様々なタスク(例:+0.69%から+18.60%、注釈付きデータの5%)でパフォーマンスが向上する。限られたトレーニングデータで、我々の手法は性能ギャップw.r.tを著しく橋渡しすることができる。より密なアノテーション(例えば、注釈付きデータの10%対~100%)。 A large labeled dataset is a key to the success of supervised deep learning, but for medical image segmentation, it is highly challenging to obtain sufficient annotated images for model training. In many scenarios, unannotated images are abundant and easy to acquire. Self-supervised learning (SSL) has shown great potentials in exploiting raw data information and representation learning. In this paper, we propose Hierarchical Self-Supervised Learning (HSSL), a new self-supervised framework that boosts medical image segmentation by making good use of unannotated data. Unlike the current literature on task-specific self-supervised pretraining followed by supervised fine-tuning, we utilize SSL to learn task-agnostic knowledge from heterogeneous data for various medical image segmentation tasks. Specifically, we first aggregate a dataset from several medical challenges, then pre-train the network in a self-supervised manner, and finally fine-tune on labeled data. We develop a new loss function by combining contrastive loss and classification loss and pretrain an encoder-decoder architecture for segmentation tasks. Our extensive experiments show that multi-domain joint pre-training benefits downstream segmentation tasks and outperforms single-domain pre-training significantly. Compared to learning from scratch, our new method yields better performance on various tasks (e.g., +0.69% to +18.60% in Dice scores with 5% of annotated data). With limited amounts of training data, our method can substantially bridge the performance gap w.r.t. denser annotations (e.g., 10% vs.~100% of annotated data).	翻訳日:2021-07-13 15:59:30 公開日:2021-07-10
# コンピュータビジョンにおける産業と学術研究 Industry and Academic Research in Computer Vision ( http://arxiv.org/abs/2107.04902v1 ) ライセンス: Link先を確認	Iuliia Kotseruba	(参考訳) 本研究は,コンピュータビジョンにおける産学研究と学界のダイナミクスを研究することを目的とする。結果は、この分野を代表するトップ5ビジョンカンファレンスのセットで実証される。このような分析データの入手は容易ではなかったため、原版からのメタデータの収集と処理に多大な労力が費やされた。第一に,本研究は産業支援研究のシェアを定量化する。具体的には,産業界の研究者が発行する論文の割合が増加しており,より多くの学者が企業に参加したり協力したりしていることを示している。次に、研究トピックや引用パターンの分布など、業界におけるプレゼンスの影響について検討する。その結果,研究トピックの分布は産業論文や学術論文に類似していることが示唆された。しかし、業界論文の引用には強い好みがある。最後に,コードの可利用性や影響などの引用バイアスの原因について検討した。 This work aims to study the dynamic between research in the industry and academia in computer vision. The results are demonstrated on a set of top-5 vision conferences that are representative of the field. Since data for such analysis was not readily available, significant effort was spent on gathering and processing meta-data from the original publications. First, this study quantifies the share of industry-sponsored research. Specifically, it shows that the proportion of papers published by industry-affiliated researchers is increasing and that more academics join companies or collaborate with them. Next, the possible impact of industry presence is further explored, namely in the distribution of research topics and citation patterns. The results indicate that the distribution of the research topics is similar in industry and academic papers. However, there is a strong preference towards citing industry papers. Finally, possible reasons for citation bias, such as code availability and influence, are investigated.	翻訳日:2021-07-13 15:59:02 公開日:2021-07-10
# マルチモーダル脳のデコードにおける経時的相関解析 Longitudinal Correlation Analysis for Decoding Multi-Modal Brain Development ( http://arxiv.org/abs/2107.04724v1 ) ライセンス: Link先を確認	Qingyu Zhao, Ehsan Adeli, Kilian M. Pohl	(参考訳) 幼少期から、人間の脳は生涯にわたって構造を再構築し、リワイヤリングする。このような複雑な脳の発達を特徴付けるには、縦型およびマルチモーダルの神経画像データの効果的な分析が必要である。本稿では,縦相関解析 (LCA) と呼ばれる解析手法を提案する。 LCAは、まず各モーダルからの入力をオートエンコーダに基づく潜在表現に還元することで、2つのモーダルのデータを結合する。自己教師付き戦略は、各空間内の2つの方向を互いに分離し、それらの方向に沿った潜在表現の縦方向の変化がモダリティ間で最大に相関するようにすることで、2つの潜在空間を関連付ける。若年者におけるアルコール・神経発達に関する全国コンソーシアム679名の縦断的T1強調および拡散強調MRI解析にLCAを適用した。横断的あるいは単一モーダルモデリングに焦点を当てた既存のアプローチとは異なり、lcaはデータから抽出された形態的および拡散的特徴から、マクロ構造およびミクロ構造的脳の発達を解き放つことに成功した。対象者の生の3次元画像量に対するLCAの再検査は,特徴に基づく解析の結果を再現することに成功した。最後に、LCAが明らかにした発達効果は、青年期の脳の成熟パターンの現在の理解と一致した。 Starting from childhood, the human brain restructures and rewires throughout life. Characterizing such complex brain development requires effective analysis of longitudinal and multi-modal neuroimaging data. Here, we propose such an analysis approach named Longitudinal Correlation Analysis (LCA). LCA couples the data of two modalities by first reducing the input from each modality to a latent representation based on autoencoders. A self-supervised strategy then relates the two latent spaces by jointly disentangling two directions, one in each space, such that the longitudinal changes in latent representations along those directions are maximally correlated between modalities. We applied LCA to analyze the longitudinal T1-weighted and diffusion-weighted MRIs of 679 youths from the National Consortium on Alcohol and Neurodevelopment in Adolescence. Unlike existing approaches that focus on either cross-sectional or single-modal modeling, LCA successfully unraveled coupled macrostructural and microstructural brain development from morphological and diffusivity features extracted from the data. A retesting of LCA on raw 3D image volumes of those subjects successfully replicated the findings from the feature-based analysis. Lastly, the developmental effects revealed by LCA were inline with the current understanding of maturational patterns of the adolescent brain.	翻訳日:2021-07-13 15:55:07 公開日:2021-07-10
# copulasを用いたマルチエージェント模倣学習 Multi-Agent Imitation Learning with Copulas ( http://arxiv.org/abs/2107.04750v1 ) ライセンス: Link先を確認	Hongwei Wang, Lantao Yu, Zhangjie Cao, Stefano Ermon	(参考訳) マルチエージェント模倣学習は、物理的、社会的、チームプレイシステムを理解するのに不可欠な観察と行動のマッピングを学習することで、デモからタスクを実行するために複数のエージェントを訓練することを目的としている。しかしながら、マルチエージェント相互作用をモデル化する既存の研究の多くは、エージェントが観察に基づいて独立した決定をし、エージェント間の複雑な依存を無視していると仮定している。本稿では,確率変数間の依存を捉える強力な統計ツールである copula を用いて,マルチエージェントシステムにおける相関と協調を明示的にモデル化する。提案するモデルでは,個々のエージェントの局所的行動パターンを捉えた限界を個別に学習できるだけでなく,エージェント間の依存構造を単独かつ完全に捉えたcopula関数を学習することができる。合成および実世界のデータセットに対する大規模な実験により、我々のモデルはアクション予測タスクにおける様々なシナリオにおいて最先端のベースラインよりも優れており、専門家によるデモンストレーションに近い新しい軌道を生成することができる。 Multi-agent imitation learning aims to train multiple agents to perform tasks from demonstrations by learning a mapping between observations and actions, which is essential for understanding physical, social, and team-play systems. However, most existing works on modeling multi-agent interactions typically assume that agents make independent decisions based on their observations, ignoring the complex dependence among agents. In this paper, we propose to use copula, a powerful statistical tool for capturing dependence among random variables, to explicitly model the correlation and coordination in multi-agent systems. Our proposed model is able to separately learn marginals that capture the local behavioral patterns of each individual agent, as well as a copula function that solely and fully captures the dependence structure among agents. Extensive experiments on synthetic and real-world datasets show that our model outperforms state-of-the-art baselines across various scenarios in the action prediction task, and is able to generate new trajectories close to expert demonstrations.	翻訳日:2021-07-13 15:54:48 公開日:2021-07-10
# IoTと機械学習を用いた精密農業のためのマルチモーダルシステムを目指して Towards a Multimodal System for Precision Agriculture using IoT and Machine Learning ( http://arxiv.org/abs/2107.04895v1 ) ライセンス: Link先を確認	Satvik Garg, Pradyumn Pundir, Himanshu Jindal, Hemraj Saini, Somya Garg	(参考訳) 精密農業制度は、現在の情報・通信技術を利用した農業を監督し、人的作業を進めながら収穫量や品質を向上させることを指す。自動化には、土壌、水、光、湿度、温度などのセンサーが与える情報の組み合わせが必要であり、操作者に正確なデータを提供し、農家に優れた収量を得る。本研究は, 精密農業利用における最先端のアプローチをすべて取り入れた研究である。データ収集のためのIoT(Internet of Things)や、作物被害予測のための機械学習、作物病検出のためのディープラーニングといった技術が使用されている。 IoTを用いたデータ収集は、スマート灌水のための水分レベルの測定、n, p, kによる最適な収量開発のための肥料の推定に責任がある。作物被害予測には、ランダムフォレスト(rf)、光勾配ブースティングマシン(lgbm)、xgboost(xgb)、決定木(dt)、k極近傍(knn)といった様々なアルゴリズムが使用される。その後、vgg16、resnet50、drknet121などの事前訓練された畳み込みニューラルネットワーク(cnn)モデルも、作物が何らかの病気で汚染されたかどうかを確認するために訓練される。 Precision agriculture system is an arising idea that refers to overseeing farms utilizing current information and communication technologies to improve the quantity and quality of yields while advancing the human work required. The automation requires the assortment of information given by the sensors such as soil, water, light, humidity, temperature for additional information to furnish the operator with exact data to acquire excellent yield to farmers. In this work, a study is proposed that incorporates all common state-of-the-art approaches for precision agriculture use. Technologies like the Internet of Things (IoT) for data collection, machine Learning for crop damage prediction, and deep learning for crop disease detection is used. The data collection using IoT is responsible for the measure of moisture levels for smart irrigation, n, p, k estimations of fertilizers for best yield development. For crop damage prediction, various algorithms like Random Forest (RF), Light gradient boosting machine (LGBM), XGBoost (XGB), Decision Tree (DT) and K Nearest Neighbor (KNN) are used. Subsequently, Pre-Trained Convolutional Neural Network (CNN) models such as VGG16, Resnet50, and DenseNet121 are also trained to check if the crop was tainted with some illness or not.	翻訳日:2021-07-13 15:54:29 公開日:2021-07-10
# 記述論理における高速概念学習のための概念長予測 Prediction of concept lengths for fast concept learning in description logics ( http://arxiv.org/abs/2107.04911v1 ) ライセンス: Link先を確認	N'Dah Jean Kouagou, Stefan Heindorf, Caglar Demir, Axel-Cyrille Ngonga Ngomo	(参考訳) 洗練された演算子に基づく概念学習アプローチは、概念を計算するために部分的に順序付けられた解空間を探索する。しかし、これらのアプローチによって区切られた改良木は、複雑な学習問題に対して容易に数百万のノードに成長できる。これにより、リファインメントベースのアプローチは、しばしば最適な概念を効率的に検出できない。本稿では,対象概念の長さを予測し,概念学習における探索空間の削減を容易にする,概念長学習のための教師付き機械学習アプローチを提案する。この目的を達成するために、我々は4つのニューラルネットワークを比較し、それらを4つのベンチマーク知識グラフで評価する。評価結果から,再帰的ニューラルネットワークアーキテクチャは,f-測定値が最大92%で,概念長予測に最適であることが示唆された。概念長予測器をCELOE(Class Expression Learner for Ontology Engineering)アルゴリズムに統合することで,CELOEのランタイムを最大13.4倍改善し,結果の質に大きな変化を生じさせないことを示す。再現性については、https://github.com/ConceptLengthLearner/ReproducibilityRepoで公開GitHubリポジトリに実装を提供しています。 Concept learning approaches based on refinement operators explore partially ordered solution spaces to compute concepts, which are used as binary classification models for individuals. However, the refinement trees spanned by these approaches can easily grow to millions of nodes for complex learning problems. This leads to refinement-based approaches often failing to detect optimal concepts efficiently. In this paper, we propose a supervised machine learning approach for learning concept lengths, which allows predicting the length of the target concept and therefore facilitates the reduction of the search space during concept learning. To achieve this goal, we compare four neural architectures and evaluate them on four benchmark knowledge graphs--Carcinogenesis, Mutagenesis, Semantic Bible, Family Benchmark. Our evaluation results suggest that recurrent neural network architectures perform best at concept length prediction with an F-measure of up to 92%. We show that integrating our concept length predictor into the CELOE (Class Expression Learner for Ontology Engineering) algorithm improves CELOE's runtime by a factor of up to 13.4 without any significant changes to the quality of the results it generates. For reproducibility, we provide our implementation in the public GitHub repository at https://github.com/ConceptLengthLearner/ReproducibilityRepo	翻訳日:2021-07-13 15:54:04 公開日:2021-07-10
# 金融予測・計画・分析のための機械学習:最近の発展と落とし穴 Machine Learning for Financial Forecasting, Planning and Analysis: Recent Developments and Pitfalls ( http://arxiv.org/abs/2107.04851v1 ) ライセンス: Link先を確認	Helmut Wasserbacher and Martin Spindler	(参考訳) この記事では、財務予測、計画、分析(FP\&A)のための機械学習を紹介します。機械学習は、大量のデータから高度に自動化された情報抽出によってFP\&Aをサポートするのに適しているように見える。しかしながら,従来の機械学習手法の多くは予測(予測)に重点を置いているため,計画やリソース割り当て(因果推論)に使用する場合の落とし穴を回避するために必要となる注意事項について議論する。機械学習の単純な適用は通常この文脈で失敗するが、最近開発されたダブル機械学習フレームワークは興味のある因果問題に対処できる。我々は、FP\&Aにおける機械学習に関する現在の文献をレビューし、予測と計画の両方に機械学習をどのように使用できるかをシミュレーション研究で示す。また,データポイント数の増加に伴う予測と計画の改善についても検討する。 This article is an introduction to machine learning for financial forecasting, planning and analysis (FP\&A). Machine learning appears well suited to support FP\&A with the highly automated extraction of information from large amounts of data. However, because most traditional machine learning techniques focus on forecasting (prediction), we discuss the particular care that must be taken to avoid the pitfalls of using them for planning and resource allocation (causal inference). While the naive application of machine learning usually fails in this context, the recently developed double machine learning framework can address causal questions of interest. We review the current literature on machine learning in FP\&A and illustrate in a simulation study how machine learning can be used for both forecasting and planning. We also investigate how forecasting and planning improve as the number of data points increases.	翻訳日:2021-07-13 15:51:07 公開日:2021-07-10
# Weaving Attention U-net:新しいハイブリッドCNNとAttention-based Method for Organs-at-risk Segmentation in Head and Neck CT Images Weaving Attention U-net: A Novel Hybrid CNN and Attention-based Method for Organs-at-risk Segmentation in Head and Neck CT Images ( http://arxiv.org/abs/2107.04847v1 ) ライセンス: Link先を確認	Zhuangzhuang Zhang, Tianyu Zhao, Hiram Gay, Weixiong Zhang, Baozhou Sun	(参考訳) 放射線療法の計画では、手動コントゥーリングは労働集約的で時間を要する。正確で堅牢な自動セグメンテーションモデルは、効率と治療結果を改善する。本稿では,畳み込みニューラルネットワーク(CNN)と自己注意機構を組み合わせた新しいハイブリッドディープラーニング手法を開発し,頭頸部CT画像の高速かつ正確な多臓器分割を実現することを目的とする。 115例の頭頸部CT像を回顧的に収集,使用した。トレーニング/検証/テストの比率を81/9/25に設定し,10倍のクロスバリデーション戦略を用いて最適なモデルパラメータを選択した。提案するハイブリッドモデルでは,各症例に対して10個の臓器・リスク (oar) を分割した。モデルの性能はDice similarity Coefficient (DSC)、Hausdorff distance 95% (HD95)、平均表面距離 (MSD) の3つの指標で評価された。私たちは、Head and Neck 2015チャレンジデータセットでモデルのパフォーマンスをテストし、最先端の自動セグメンテーションアルゴリズムと比較しました。提案手法は、10個のOARの基底真実によく似た輪郭を生成する。新しいウィービング注意U-netは頭頸部CT画像のセグメンテーションに優れているか類似した性能を示した。 In radiotherapy planning, manual contouring is labor-intensive and time-consuming. Accurate and robust automated segmentation models improve the efficiency and treatment outcome. We aim to develop a novel hybrid deep learning approach, combining convolutional neural networks (CNNs) and the self-attention mechanism, for rapid and accurate multi-organ segmentation on head and neck computed tomography (CT) images. Head and neck CT images with manual contours of 115 patients were retrospectively collected and used. We set the training/validation/testing ratio to 81/9/25 and used the 10-fold cross-validation strategy to select the best model parameters. The proposed hybrid model segmented ten organs-at-risk (OARs) altogether for each case. The performance of the model was evaluated by three metrics, i.e., the Dice Similarity Coefficient (DSC), Hausdorff distance 95% (HD95), and mean surface distance (MSD). We also tested the performance of the model on the Head and Neck 2015 challenge dataset and compared it against several state-of-the-art automated segmentation algorithms. The proposed method generated contours that closely resemble the ground truth for ten OARs. Our results of the new Weaving Attention U-net demonstrate superior or similar performance on the segmentation of head and neck CT images.	翻訳日:2021-07-13 15:48:31 公開日:2021-07-10
# 航空ロボットチームによるインテリジェントトラヒックモニタリングのための分散深層強化学習 Distributed Deep Reinforcement Learning for Intelligent Traffic Monitoring with a Team of Aerial Robots ( http://arxiv.org/abs/2107.04924v1 ) ライセンス: Link先を確認	Behzad Khamidehi and Elvino S. Sousa	(参考訳) 本稿では,航空ロボットを用いた道路網における交通監視問題について検討する。問題は2つの主な理由から難しい。まず、交通イベントは時間的にも空間的にも確率的です。第二に、交通イベントが異なる速度で道路網の異なる場所に到着すると、この問題は非均質な構造となる。そのため、場所によっては、ロボットが他の場所よりも多くの訪問を必要とする。これらの問題に対処するために,道路網の各位置に対する不確実性指標を定義し,ネットワークの平均不確実性を最小限に抑えるための航空ロボットの経路計画問題を定式化する。本稿では,この問題を部分可観測マルコフ決定プロセス(pomdp)として表現し,深層強化学習に基づく分散スケーラブルなアルゴリズムを提案する。エージェント(aerial robot)と交通管理センター(traffic management center, tmc)の通信モードによって異なる2つのシナリオを検討する。最初のシナリオでは、エージェントがTMCと継続的に通信して、トラフィックイベントに関するリアルタイム情報を送受信していると仮定する。したがって、エージェントは環境のグローバルかつリアルタイムな知識を持っている。しかし,第2のシナリオでは,空中ロボットの観測が部分的かつセンシング範囲に限定された,困難な設定を考える。さらに、第1のシナリオとは対照的に、空中ロボットとTMCとの間の情報交換は特定の時間インスタンスに限定される。本研究では,実際の道路ネットワークトポロジーにおける両シナリオにおける提案アルゴリズムの性能を評価し,その性能を交通監視システムで実証する。 This paper studies the traffic monitoring problem in a road network using a team of aerial robots. The problem is challenging due to two main reasons. First, the traffic events are stochastic, both temporally and spatially. Second, the problem has a non-homogeneous structure as the traffic events arrive at different locations of the road network at different rates. Accordingly, some locations require more visits by the robots compared to other locations. To address these issues, we define an uncertainty metric for each location of the road network and formulate a path planning problem for the aerial robots to minimize the network's average uncertainty. We express this problem as a partially observable Markov decision process (POMDP) and propose a distributed and scalable algorithm based on deep reinforcement learning to solve it. We consider two different scenarios depending on the communication mode between the agents (aerial robots) and the traffic management center (TMC). The first scenario assumes that the agents continuously communicate with the TMC to send/receive real-time information about the traffic events. Hence, the agents have global and real-time knowledge of the environment. However, in the second scenario, we consider a challenging setting where the observation of the aerial robots is partial and limited to their sensing ranges. Moreover, in contrast to the first scenario, the information exchange between the aerial robots and the TMC is restricted to specific time instances. We evaluate the performance of our proposed algorithm in both scenarios for a real road network topology and demonstrate its functionality in a traffic monitoring system.	翻訳日:2021-07-13 15:44:38 公開日:2021-07-10
# 凸性のないSchr{\"o}dinger-F{\"o}llmerサンプリングの収束解析 Convergence Analysis of Schr{\"o}dinger-F{\"o}llmer Sampler without Convexity ( http://arxiv.org/abs/2107.04766v1 ) ライセンス: Link先を確認	Yuling Jiao and Lican Kang and Yanyan Liu and Youzhou Zhou	(参考訳) Schr\"{o}dinger-F\"{o}llmer sampler (SFS) は、エルゴード性のない非正規分布からサンプリングするための、新しく効率的なアプローチである。 SFS は、Schr\"{o}dinger-F\"{o}llmerfusion process $$\mathrm{d} X_{t}=-\nabla U\left(X_t, t\right) \mathrm{d} t+\mathrm{d} B_{t}, \quad t \in[0,1],\quad X_0=0$$ の単位区間上のオイラー・マルヤマの離散化に基づいており、これは時間にゼロの縮退分布を目標分布へ輸送する。 \cite{sfs21} において、SFS の整合性は、ドリフト項 $b(x,t)$ % $U(x,t)$ が一様 (on $t$) %concave convex (on $x$) であるという制限された仮定の下で確立される。本稿では,標準正規分布上の目標分布の密度比について,滑らかで有界な条件下でのwasserstein距離におけるsfsの非漸近的誤差境界を与えるが,そのポテンシャルの強い凸性は必要としない。 Schr\"{o}dinger-F\"{o}llmer sampler (SFS) is a novel and efficient approach for sampling from possibly unnormalized distributions without ergodicity. SFS is based on the Euler-Maruyama discretization of Schr\"{o}dinger-F\"{o}llmer diffusion process $$\mathrm{d} X_{t}=-\nabla U\left(X_t, t\right) \mathrm{d} t+\mathrm{d} B_{t}, \quad t \in[0,1],\quad X_0=0$$ on the unit interval, which transports the degenerate distribution at time zero to the target distribution at time one. In \cite{sfs21}, the consistency of SFS is established under a restricted assumption that %the drift term $b(x,t)$ the potential $U(x,t)$ is uniformly (on $t$) strongly %concave convex (on $x$). In this paper we provide a nonasymptotic error bound of SFS in Wasserstein distance under some smooth and bounded conditions on the density ratio of the target distribution over the standard normal distribution, but without requiring the strongly convexity of the potential.	翻訳日:2021-07-13 15:42:29 公開日:2021-07-10
# 単一モデルだけで十分か? MuCoS: セマンティックコード検索のためのマルチモデルアンサンブル学習 Is a Single Model Enough? MuCoS: A Multi-Model Ensemble Learning for Semantic Code Search ( http://arxiv.org/abs/2107.04773v1 ) ライセンス: Link先を確認	Lun Du, Xiaozhou Shi, Yanlin Wang, Ensheng Shi, Shi Han and Dongmei Zhang	(参考訳) 近年,コードスニペットと検索クエリ間のセマンティックな相関がより良くなり,有望な性能を持つため,深層学習がコード検索の主流となっている。しかし、コードスニペットはビジネスロジック、特定のアルゴリズム、ハードウェア通信など、さまざまな次元の様々な情報を持っているため、単一のコード表現モジュールがすべての視点をカバーすることは困難である。一方、特定のクエリは1つまたは複数の視点にフォーカスする可能性があるため、単一のクエリ表現モジュールが異なるユーザ意図を表現することは困難である。本稿では,意味コード検索のためのマルチモデルアンサンブル学習アーキテクチャであるMuCoSを提案する。複数の個別の学習者が組み合わさり、それぞれがコードスニペットの特定の視点を強調する。私たちは、コード情報の異なる視点を含む異なるデータセットで個々の学習者を訓練し、これらの異なるデータセットを取得するためにデータ拡張戦略を使用します。次に、学習者をアンサンブルして、コードスニペットの包括的な特徴を捉えます。 Recently, deep learning methods have become mainstream in code search since they do better at capturing semantic correlations between code snippets and search queries and have promising performance. However, code snippets have diverse information from different dimensions, such as business logic, specific algorithm, and hardware communication, so it is hard for a single code representation module to cover all the perspectives. On the other hand, as a specific query may focus on one or several perspectives, it is difficult for a single query representation module to represent different user intents. In this paper, we propose MuCoS, a multi-model ensemble learning architecture for semantic code search. It combines several individual learners, each of which emphasizes a specific perspective of code snippets. We train the individual learners on different datasets which contain different perspectives of code information, and we use a data augmentation strategy to get these different datasets. Then we ensemble the learners to capture comprehensive features of code snippets.	翻訳日:2021-07-13 15:41:52 公開日:2021-07-10
# HOMRS:ディープニューラルネットワークのための高次準同型関係セレクタ HOMRS: High Order Metamorphic Relations Selector for Deep Neural Networks ( http://arxiv.org/abs/2107.04863v1 ) ライセンス: Link先を確認	Florian Tambon, Giulio Antoniol and Foutse Khomh	(参考訳) ディープニューラルネットワーク(DNN)アプリケーションは、医療アプリケーションから自動運転車まで、私たちの日常生活の一部になりつつある。従来のDNNの検証は精度測定に頼っているが、敵の例の存在はこれらの精度測定の限界を強調しており、特にDNNが安全クリティカルシステムに統合された場合の懸念を高めている。本稿では,基本変成関係の最初の集合から高次変成関係の小さな最適化セットを自動構築することにより,変成テストを促進する方法であるHOMRSを提案する。 HOMRSのバックボーンは多目的検索であり、コードカバレッジ、テストケース、パスの多様性といった従来のシステムテストから引き出されたアイデアを利用する。 HOMRS を MNIST データセットで LeNet5 DNN に適用し,95% のキル比を達成できる小型だが効果的な高次変換セットを構築した証拠を報告する。 5つのラッカーは、高次変換前後に画像のプールを手動でラベル付けし、フライスのカッパと統計検査により、それらが変成特性であることを確認した。 HOMRSは、ランダムにサンプリングされたアウト・オブ・ディストリビューション画像の92%を検出した。 HOMRS変換はオンラインリアルタイム利用にも適している。 Deep Neural Networks (DNN) applications are increasingly becoming a part of our everyday life, from medical applications to autonomous cars. Traditional validation of DNN relies on accuracy measures, however, the existence of adversarial examples has highlighted the limitations of these accuracy measures, raising concerns especially when DNN are integrated into safety-critical systems. In this paper, we present HOMRS, an approach to boost metamorphic testing by automatically building a small optimized set of high order metamorphic relations from an initial set of elementary metamorphic relations. HOMRS' backbone is a multi-objective search; it exploits ideas drawn from traditional systems testing such as code coverage, test case, and path diversity. We applied HOMRS to LeNet5 DNN with MNIST dataset and we report evidence that it builds a small but effective set of high order transformations achieving a 95% kill ratio. Five raters manually labeled a pool of images before and after high order transformation; Fleiss' Kappa and statistical tests confirmed that they are metamorphic properties. HOMRS built-in relations are also effective to confront adversarial or out-of-distribution examples; HOMRS detected 92% of randomly sampled out-of-distribution images. HOMRS transformations are also suitable for online real-time use.	翻訳日:2021-07-13 15:41:37 公開日:2021-07-10

Title

Authors

Abstract

論文公表日・翻訳日

# 量子信号処理における効率的な位相要素評価

Efficient phase-factor evaluation in quantum signal processing ( http://arxiv.org/abs/2002.11649v2 )

ライセンス: Link先を確認

Yulong Dong, Xiang Meng, K. Birgitta Whaley, Lin Lin

(参考訳) 量子信号処理(QSP)は、量子コンピュータ上で行列多項式を正確に実装する強力な量子アルゴリズムである。 qspに基づく量子アルゴリズムの漸近解析は、原理上、ハミルトンシミュレーションや量子線形系問題のような様々なタスクに対して漸近的に最適な結果を得ることができることを示した。 QSPのさらなる利点は、最小数のアンシラ量子ビットを使い、準中間項量子アーキテクチャの実装を容易にすることである。しかし、QSP回路構築に必要な位相係数を計算できる古典的に安定したアルゴリズムは今のところ存在しない。既存の手法では可変精度演算が必要であり、比較的低い次数の多項式にしか適用できない。本稿では,標準倍精度演算を用いて位相係数を正確に計算する最適化手法を提案する。本手法の性能をハミルトンシミュレーション,固有値フィルタリング,量子線形系問題への応用により実証する。数値計算の結果, 最適化アルゴリズムは, 誤差が 10^{-12}$ 以下で 1 万ドル以上の次数の多項式を正確に近似する位相係数を求めることができることがわかった。

Quantum signal processing (QSP) is a powerful quantum algorithm to exactly implement matrix polynomials on quantum computers. Asymptotic analysis of quantum algorithms based on QSP has shown that asymptotically optimal results can in principle be obtained for a range of tasks, such as Hamiltonian simulation and the quantum linear system problem. A further benefit of QSP is that it uses a minimal number of ancilla qubits, which facilitates its implementation on near-to-intermediate term quantum architectures. However, there is so far no classically stable algorithm allowing computation of the phase factors that are needed to build QSP circuits. Existing methods require the usage of variable precision arithmetic and can only be applied to polynomials of relatively low degree. We present here an optimization based method that can accurately compute the phase factors using standard double precision arithmetic operations. We demonstrate the performance of this approach with applications to Hamiltonian simulation, eigenvalue filtering, and the quantum linear system problems. Our numerical results show that the optimization algorithm can find phase factors to accurately approximate polynomials of degree larger than $10,000$ with error below $10^{-12}$.

翻訳日:2023-06-01 21:03:46 公開日:2021-07-10

# ランダム回路ブロック符号化行列と量子LINPACKベンチマークの提案

Random circuit block-encoded matrix and a proposal of quantum LINPACK benchmark ( http://arxiv.org/abs/2006.04010v2 )

ライセンス: Link先を確認

Yulong Dong, Lin Lin

(参考訳) LINPACKベンチマークは、密度ランダム行列を持つ線形方程式系を解くコンピュータの性能を報告する。このタスクは実際のアプリケーションを直接念頭に置いて設計されたものではないが、1993年のリストの登場以来、LINPACKベンチマークによってTOP500スーパーコンピュータのリストが定義されている。我々は、量子LINPACKベンチマークと呼ばれる類似のベンチマークを用いて、量子コンピュータ全体のマシン性能を測定することを提案する。量子LINPACKベンチマークの成功は、方程式の線形系のような線形代数問題を解くための有用なタスクを実行するための量子コンピュータの最小限の要件と見なされるべきである。本稿では,Random Circuit Block-Encoded Matrix (RACBEM) と呼ばれる入力モデルを提案する。 RACBEMモデルは量子コンピュータ上での実装が効率的であり、ブラックボックスの量子コンパイラを頼りに、任意の量子アーキテクチャに最適に適応するように設計することができる。線形システムの解法以外にも、RACBEMモデルは、計算スペクトル測度、ハミルトンシミュレーションによって生成される時系列、エネルギーの熱平均など、多くの物理応用に関連する様々な線形代数タスクの実行に使用できる。我々は、これらの線形代数演算をibm q量子デバイスおよび量子仮想マシンに実装し、科学計算問題を解決する上での性能を実証する。

The LINPACK benchmark reports the performance of a computer for solving a system of linear equations with dense random matrices. Although this task was not designed with a real application directly in mind, the LINPACK benchmark has been used to define the list of TOP500 supercomputers since the debut of the list in 1993. We propose that a similar benchmark, called the quantum LINPACK benchmark, could be used to measure the whole machine performance of quantum computers. The success of the quantum LINPACK benchmark should be viewed as the minimal requirement for a quantum computer to perform a useful task of solving linear algebra problems, such as linear systems of equations. We propose an input model called the RAndom Circuit Block-Encoded Matrix (RACBEM), which is a proper generalization of a dense random matrix in the quantum setting. The RACBEM model is efficient to be implemented on a quantum computer, and can be designed to optimally adapt to any given quantum architecture, with relying on a black-box quantum compiler. Besides solving linear systems, the RACBEM model can be used to perform a variety of linear algebra tasks relevant to many physical applications, such as computing spectral measures, time series generated by a Hamiltonian simulation, and thermal averages of the energy. We implement these linear algebra operations on IBM Q quantum devices as well as quantum virtual machines, and demonstrate their performance in solving scientific computing problems.

翻訳日:2023-05-16 09:12:15 公開日:2021-07-10

# 量子アニールのマルチビット補正

Multi-Qubit Correction for Quantum Annealers ( http://arxiv.org/abs/2010.00115v4 )

ライセンス: Link先を確認

Ramin Ayanzadeh, John Dorband, Milton Halem and Tim Finin

(参考訳) 我々は,オープンシステムの進展をギブズ・サンプラーとして捉え,励起状態のセットをエネルギー値の低い新しい合成状態に還元する量子アニーラーのための新しい後処理法として, \emph{multi-qubit correction} (mqc)を提案する。所定の(Ising)ハミルトニアンの基底状態からサンプリングした後、MQCは励起状態のペアを比較して仮想トンネルを認識する。 D-Wave 2000Q量子アニールを用いた実験の結果、MQCは、スピン反転変換、古典的後処理技術、連続測定間のサンプル間遅延の増加など、近年の量子アニール領域におけるハードウェア/ソフトウェア進歩と比較して、顕著に低いエネルギー値のサンプルを見つけ、結果の再現性を向上させることを示した。

We present \emph{multi-qubit correction} (MQC) as a novel postprocessing method for quantum annealers that views the evolution in an open-system as a Gibbs sampler and reduces a set of excited states to a new synthetic state with lower energy value. After sampling from the ground state of a given (Ising) Hamiltonian, MQC compares pairs of excited states to recognize virtual tunnels--i.e., a group of qubits that changing their states simultaneously can result in a new state with lower energy value--and successively converges to the ground state. Experimental results using D-Wave 2000Q quantum annealers demonstrate that MQC finds samples with notably lower energy values and improves the reproducibility of results when compared to recent hardware/software advances in the realm of quantum annealing, such as spin-reversal transforms, classical postprocessing techniques, and increased inter-sample delay between successive measurements.

翻訳日:2023-04-30 13:59:35 公開日:2021-07-10

# 多体局在系における空間一粒子密度行列エントロピーのスケーリング特性

Scaling properties of a spatial one-particle density-matrix entropy in many-body localized systems ( http://arxiv.org/abs/2011.02200v2 )

ライセンス: Link先を確認

Miroslav Hopjan, Fabian Heidrich-Meisner, Vincenzo Alba

(参考訳) 本研究では,多体局在化(MBL)相を呈する1次元不規則相互作用フェルミオンにおいて,一粒子密度行列(OPDM)から抽出した空間サブシステムエントロピーについて検討した。このOPDMエントロピーは, 適切な絡み合いの指標ではないにもかかわらず, 局所化の健全な特徴を示す。固有状態のOPDMエントロピーが地域法則に従うことを数値的に示す。フォン・ノイマンのエントロピーと同様に、opdmエントロピーは量子クエンチの時間とともに対数的に成長する。これら2つの特徴は、中程度の大きな相互作用で生き残り、エルゴード相への遷移に向けて良好である。 OPDMエントロピーの計算コストはシステムサイズと多項式的にしか一致せず,シミュレーションや実験でMBLの診断ツールを開発する上で,OPDMが有望な出発点となることを示唆している。

We investigate a spatial subsystem entropy extracted from the one-particle density matrix (OPDM) in one-dimensional disordered interacting fermions that host a many-body localized (MBL) phase. Deep in the putative MBL regime, this OPDM entropy exhibits the salient features of localization, despite not being a proper entanglement measure. We numerically show that the OPDM entropy of the eigenstates obeys an area law. Similar to the von-Neumann entropy, the OPDM entropy grows logarithmically with time after a quantum quench, albeit with a different prefactor. Both these features survive at moderately large interactions and well towards the transition into the ergodic phase. The computational cost to calculate the OPDM entropy scales only polynomially with the system size, suggesting that the OPDM provides a promising starting point for developing diagnostic tools for MBL in simulations and experiments.

翻訳日:2023-04-25 07:32:02 公開日:2021-07-10

# 量子ラビ三角形におけるキラルコヒーレント相の量子三臨界

Quantum tricriticality of chiral-coherent phase in quantum Rabi triangle ( http://arxiv.org/abs/2011.11171v2 )

ライセンス: Link先を確認

Yu-Yu Zhang, Zi-Xiang Hu, Libin Fu, Hong-Gang Luo, Han Pu, Xue-Feng Zhang

(参考訳) 相互作用、対称性、ゲージ場の相互作用は通常、興味深い量子多体相をもたらす。出現相の性質を探るため,人工磁場を合成するための基本構造ブロックとして量子ラビ三角形系について検討した。我々は、リッチ位相図と関連する量子臨界性を研究する分析手法を開発した。特に興味深いのはキラルコヒーレント位相の出現であり、これは$\mathbb{z}_2$ とキラル対称性の両方を破る。このキラル相では、光子は一方向的に流れ、キラリティーは人工ゲージ場によって調整され、時間反転対称性の破れを示す。有限周波スケーリング解析により、関連する相転移がディックモデルの普遍性クラスにさらに確認される。このモデルは、光-物質結合系の幅広い物理現象をシミュレートすることができ、様々な量子情報技術の将来の発展に応用することができる。

The interplay of interactions, symmetries and gauge fields usually leads to intriguing quantum many-body phases. To explore the nature of emerging phases, we study a quantum Rabi triangle system as an elementary building block for synthesizing an artificial magnetic field. We develop an analytical approach to study the rich phase diagram and the associated quantum criticality. Of particular interest is the emergence of a chiral-coherent phase, which breaks both the $\mathbb{Z}_2$ and the chiral symmetry. In this chiral phase, photons flow unidirectionally and the chirality can be tuned by the artificial gauge field, exhibiting a signature of broken time-reversal symmetry. The finite-frequency scaling analysis further confirms the associated phase transition to be in the universality class of the Dicke model. This model can simulate a broad range of physical phenomena of light-matter coupling systems, and may have an application in future developments of various quantum information technologies.

翻訳日:2023-04-23 09:14:35 公開日:2021-07-10

# 円柱状光格子における空間時間対称性、軌道磁気、動的ベリー曲率

Intertwined Space-Time Symmetry, Orbital Magnetism and Dynamical Berry Curvature in a Circularly Shaken Optical Lattice ( http://arxiv.org/abs/2012.01822v2 )

ライセンス: Link先を確認

Hua Chen and W. Vincent Liu

(参考訳) 空間的および時間的次元において周期性を示す(2+1)次元の時空格子である2次元光学格子の円形揺動について検討する。ここで考慮された近共振光揺らぎは、振動周波数の光子を転送することで、低次の$s$バンドと最初の$p$バンドを動的に結合する。交叉型時空対称性はさらに発見され、一般化されたブロッホ・フロケの定理で解決されたスペクトルの縮退を解明する。円揺動のキラリティの設定は、時間反転対称性を明示的に破壊し、 $p_\pm = p_x \pm ip_y$ 軌道の縮退を持ち上げ、軌道磁気の局所循環、すなわち $p_\pm$ 軌道の非平衡占有をもたらす。さらに, ベリー接続のダイナミクスは, ベリー曲率の時間発展と, 実験において物理的に観測可能な効果を持つ分極によって明らかにされる。興味深いことに、動力学は、時間の分数変換を伴う時間ねじ回転対称性によって制御される普遍的な位相シフトによって特徴づけられる。これらの結果は、現在の格子シェイキングスキームが軌道物理学と対称性に保護されたダイナミクスを研究するための多用途なプラットフォームであることを示唆している。

We study the circular shaking of a two dimensional optical lattice, which is essentially a (2+1) dimensional space-time lattice exhibiting periodicities in both spatial and temporal dimensions. The near-resonant optical shaking considered here dynamically couples the low-lying $s$ band and the first excited $p$ bands by transferring a photon of shaking frequency. The intertwined space-time symmetries are further uncovered to elucidate the degeneracy in the spectrum solved with the generalized Bloch-Floquet theorem. Setting the chirality of circular shaking explicitly breaks time reversal symmetry and lifts the degeneracy of $p_\pm = p_x \pm ip_y$ orbitals, leading to the local circulation of orbital magnetism, i.e the imbalanced occupation in $p_\pm$ orbitals. Moreover, the dynamics of Berry connection is revealed by the time evolution of the Berry curvature and the polarization, which have physical observable effects in experiments. Interestingly, the dynamics is found characterized by a universal phase shift, governed by the time screw rotational symmetry involving a fractional translation of time. These findings suggest that the present lattice-shaking scheme provides a versatile platform for the investigation of the orbital physics and the symmetry-protected dynamics.

翻訳日:2023-04-22 05:36:10 公開日:2021-07-10

# 量子論のオントロジモデルに対するノーゴー定理

A no-go theorem for Quantum theory ontological models ( http://arxiv.org/abs/2012.05712v2 )

ライセンス: Link先を確認

Tung Ten Yong

(参考訳) 本稿では,系の量子状態が系の独立現実を表す物理状態の集合に対応できないという意味で,量子力学は存在論的モデルを認めないことを示す。ウィグナーの友人のシナリオに基づく2つの思考実験を通じて、実験室における物理系のオンティック状態がウィグナーと彼の友人にとって同じである場合、pbr定理、量子論的予測、そして「超決定論」の仮定の1つに違反することを示した。

In this paper, we show that Quantum Mechanics does not admit ontological models, in the sense that the quantum state of a system cannot correspond to a set of physical states representing the independent reality of the system. We show, via two thought experiments based on the Wigner's friend scenario, that if the ontic state of physical systems in the lab is the same for Wigner and for his friend, one of the following will be violated: PBR theorem, Quantum-theoretic predictions, and the "No-superdeterminism" assumption.

翻訳日:2023-04-21 07:49:59 公開日:2021-07-10

# 遺伝的プログラミングにおけるモジュールのタグベースの制御は、文脈依存問題解決を改善する

Tag-based regulation of modules in genetic programming improves context-dependent problem solving ( http://arxiv.org/abs/2012.09229v3 )

ライセンス: Link先を確認

Alexander Lalejini, Matthew Andres Moreno, and Charles Ofria

(参考訳) 我々は、プログラムがどのコードモジュールを表現するかを動的に調整できる新しい遺伝的プログラミング(gp)技術であるtag-based genetic regulationの有用性を紹介、実証する。タグは進化可能なラベルであり、コードモジュールを参照するための柔軟なメカニズムを提供する。タグベースの遺伝的規制は、既存のタグベースの命名スキームを拡張し、プログラムが表現パターンを変更するためにコードモジュールを「宣伝」したり「抑圧」したりできる。この拡張により、モジュールが命令実行に基づいて制御される遺伝子制御ネットワークとしてプログラムを構築することができる。本稿では,プログラム合成問題に対するタグベースの制御の機能を実証する。タグベースの規制は,事前入力に基づいてプログラムが現在の入力にどのように反応するかを調整しなければならないという,文脈依存の問題に対する問題解決性能を向上させることが判明した。実際、システムは規制を加えるまで、文脈依存の問題に対する解決策を進化させることができなかった。しかし、タグに基づく遺伝子制御の実装は、普遍的に有益ではない。特定の入力に対する正しい応答が決して変化しないシナリオを特定し、タグベースの規制を不要な機能として、時には適応的な進化を妨げる可能性がある。タグベースの遺伝的レギュレーションは、よりダイナミックな遺伝的プログラムを進化させるための技術のレパートリーを広げ、既存のタグ対応GPシステムに容易に組み込むことができる。

We introduce and experimentally demonstrate the utility of tag-based genetic regulation, a new genetic programming (GP) technique that allows programs to dynamically adjust which code modules to express. Tags are evolvable labels that provide a flexible mechanism for referencing code modules. Tag-based genetic regulation extends existing tag-based naming schemes to allow programs to "promote" and "repress" code modules in order to alter expression patterns. This extension allows evolution to structure a program as a gene regulatory network where modules are regulated based on instruction executions. We demonstrate the functionality of tag-based regulation on a range of program synthesis problems. We find that tag-based regulation improves problem-solving performance on context-dependent problems; that is, problems where programs must adjust how they respond to current inputs based on prior inputs. Indeed, the system could not evolve solutions to some context-dependent problems until regulation was added. Our implementation of tag-based genetic regulation is not universally beneficial, however. We identify scenarios where the correct response to a particular input never changes, rendering tag-based regulation an unneeded functionality that can sometimes impede adaptive evolution. Tag-based genetic regulation broadens our repertoire of techniques for evolving more dynamic genetic programs and can easily be incorporated into existing tag-enabled GP systems.

翻訳日:2023-04-20 11:04:21 公開日:2021-07-10

# ボソニック$p$帯三角格子における軌道次数

Orbital order in a bosonic $p$-band triangular lattice ( http://arxiv.org/abs/2102.05319v2 )

ライセンス: Link先を確認

Hua Chen and X. C. Xie

(参考訳) 超流体-モット絶縁体遷移における軌道秩序の進化に着目して,Bose-Hubbardモデルの詳細を$p$バンド三角形格子で示す。 2つの異なる相が超流動状態にある。これらの位相の1つが弱い相互作用限界を断定的に結合する。この位相は、軸軸の$p_\pm=p_x \pm ip_y$ と、平面の $p_\theta=\cos\theta p_x+\sin\theta p_y$ の軌道秩序の相互結合によって特徴づけられる。さらに、計算されたボゴリューボフ励起スペクトルは、元のディラック点を単一粒子スペクトルで差すが、創発的なディラック点を示す。もう1つの超流動相はmott絶縁体に近接しており、単位ボソン充填は平面内強磁性軌道秩序の崩壊を示している。最後に、mott絶縁体相のために軌道交換モデルを構築する。その古典的な基底状態は、面内軌道空間における創発的なSO$(2)$回転対称性を持ち、従って無限の縮退を楽しみ、最終的には障害機構の順序によって軌道変動によって持ち上げられる。系統解析から,Mott絶縁体相の面内フェロ軌道秩序は後者の超流動相と一致し,進化する可能性が示唆された。

We present a detailed study of the Bose-Hubbard model in a $p$-band triangular lattice by focusing on the evolution of orbital order across the superfluid-Mott insulator transition. Two distinct phases are found in the superfluid regime. One of these phases adiabatically connects the weak interacting limit. This phase is characterized by the intertwining of axial $p_\pm=p_x \pm ip_y$ and in-plane $p_\theta=\cos\theta p_x+\sin\theta p_y$ orbital orders, which break the time-reversal symmetry and lattice symmetries simultaneously. In addition, the calculated Bogoliubov excitation spectrum gaps the original Dirac points in the single-particle spectrum but exhibits emergent Dirac points. The other superfluid phase in close proximity to the Mott insulator with unit boson filling shows a detwined in-plane ferro-orbital order. Finally, an orbital exchange model is constructed for the Mott insulator phase. Its classical ground state has an emergent SO$(2)$ rotational symmetry in the in-plane orbital space and therefore enjoys an infinite degeneracy, which is ultimately lifted by the orbital fluctuation via the order by disorder mechanism. Our systematic analysis suggests that the in-plane ferro-orbital order in the Mott insulator phase agrees with and likely evolves from the latter superfluid phase.

翻訳日:2023-04-12 00:58:36 公開日:2021-07-10

# 量子貯留層計算とエクストリームラーニングマシンの可能性

Opportunities in Quantum Reservoir Computing and Extreme Learning Machines ( http://arxiv.org/abs/2102.11831v2 )

ライセンス: Link先を確認

Pere Mujal, Rodrigo Mart\'inez-Pe\~na, Johannes Nokkala, Jorge Garc\'ia-Beni, Gian Luca Giorgi, Miguel C. Soriano, Roberta Zambrini

(参考訳) 量子貯水池コンピューティング (QRC) と量子極端学習マシン (QELM) は、古典的および量子機械学習タスクにおいてその可能性を実証する2つの新しいアプローチである。物理システムの量子性と簡単なトレーニング戦略を組み合わせることで、優れたパフォーマンスを実現している。これらの非従来型コンピューティングアプローチへの関心の高まりは、実装に適した多様な量子プラットフォームと、複雑な量子システムの研究における理論的進歩によって加速される。本稿では, 量子入力, 量子物理基板, 量子タスクを考慮した場合, 様々な可能性を示す最近の提案と最初の実験について述べる。主な焦点はこれらのアプローチのパフォーマンスであり、古典的なアプローチや機会に対するアドバンテージである。

Quantum reservoir computing (QRC) and quantum extreme learning machines (QELM) are two emerging approaches that have demonstrated their potential both in classical and quantum machine learning tasks. They exploit the quantumness of physical systems combined with an easy training strategy, achieving an excellent performance. The increasing interest in these unconventional computing approaches is fueled by the availability of diverse quantum platforms suitable for implementation and the theoretical progresses in the study of complex quantum systems. In this review article, recent proposals and first experiments displaying a broad range of possibilities are reviewed when quantum inputs, quantum physical substrates and quantum tasks are considered. The main focus is the performance of these approaches, on the advantages with respect to classical counterparts and opportunities.

翻訳日:2023-04-10 03:14:43 公開日:2021-07-10

# 過去と未来情報を用いた量子状態推定の統一理論

Unifying theory of quantum state estimation using past and future information ( http://arxiv.org/abs/2104.02911v2 )

ライセンス: Link先を確認

Areeya Chantasri, Ivonne Guevara, Kiarn T. Laverick, and Howard M. Wiseman

(参考訳) 連続監視された力学系に対する量子状態推定は、連続的な観測結果に基づいて、ある時点で量子状態を個々の系に割り当てることを含む。推定の質は、観測情報の使用量と、推定値に対する最適性の定義に依存する。本研究では, 測定記録の一部が入手できない量子状態推定の問題を考えるが, 利用可能な記録が前(ペースト)と後(将来の)の両方から得られる場合, 過去の情報のみを用いた場合よりもよい推定が可能となる。量子系の過去の情報は、特に量子状態の滑らか化、最も類似した経路、二状態ベクトルおよび関連する形式論において様々な方法で使われている。このような一見無関係なアプローチを統一するために、連続的な監視を伴う部分観測量子システムのためのフレームワークを提案し、いくつかの一般化により、最初の2つの既定形式を許容できる。統一フレームワークは、期待されるコスト最小化による状態推定に基づいており、コストは未知のレコードの空間または未知の真の状態の空間で定義することができる。さらに,新しいコスト関数を5つ定義することで,既存の3つのアプローチを概念的に結合し,それらのギャップを埋める新たなタイプの推定器を提案する。本手法は, ボソニック浴槽に散逸的に結合した駆動型2レベルシステムの例として, 提案する7つの推定値すべてを計算することで, 適用性を示す。我々の理論はまた、古典状態推定への接続を可能にし、量子状態推定器間のさらなる概念的リンクを生み出します。

Quantum state estimation for continuously monitored dynamical systems involves assigning a quantum state to an individual system at some time, conditioned on the results of continuous observations. The quality of the estimation depends on how much observed information is used and on how optimality is defined for the estimate. In this work, we consider problems of quantum state estimation where some of the measurement records are not available, but where the available records come from both before (past) and after (future) the estimation time, enabling better estimates than is possible using the past information alone. Past-future information for quantum systems has been used in various ways in the literature, in particular, the quantum state smoothing, the most-likely path, and the two-state vector and related formalisms. To unify these seemingly unrelated approaches, we propose a framework for partially-observed quantum system with continuous monitoring, wherein the first two existing formalisms can be accommodated, with some generalization. The unifying framework is based on state estimation with expected cost minimization, where the cost can be defined either in the space of the unknown record or in the space of the unknown true state. Moreover, we connect all three existing approaches conceptually by defining five new cost functions, and thus new types of estimators, which bridge the gaps between them. We illustrate the applicability of our method by calculating all seven estimators we consider for the example of a driven two-level system dissipatively coupled to bosonic baths. Our theory also allows connections to classical state estimation, which create further conceptual links between our quantum state estimators.

翻訳日:2023-04-05 02:28:25 公開日:2021-07-10

# WiFiMod:パッシブセンシングを用いたトランスフォーマーを用いた室内移動モデリング

WiFiMod: Transformer-based Indoor Human Mobility Modeling using Passive Sensing ( http://arxiv.org/abs/2104.09835v3 )

ライセンス: Link先を確認

Amee Trivedi, Kate Silverstein, Emma Strubell, Mohit Iyyer, Prashant Shenoy

(参考訳) 人体移動のモデル化は、都市計画から病気拡散のシミュレーションまで幅広い応用がある。人間は屋内で80%の時間を過ごすことはよく知られているが、室内での移動のモデル化は3つの主な理由から困難である。 (i)簡単に入手でき、信頼性があり、低コストな屋内移動度データセットがないこと。 (ii)頻繁な屋内移動のモデル化における高い予測空間 (iii)移動度におけるマルチスカラー周期性と相関これらの課題に対処するために,WiFi システムログを用いて屋内の人体移動を複数の空間スケールでモデル化する Transformer ベースのデータ駆動型アプローチである WiFiMod を提案する。 WiFiModは、入力されたエンタープライズWiFiシステムログとして、スマートフォンのデジタルトレースから人間の移動軌跡を抽出する。次に,複数の空間的スケール,マクロ,マイクロの移動性特徴を特定し,複数の空間的粒度にわたってユーザの移動性を数時間から1日にわたって予測するマルチモーダル組込みトランスを設計した。マルチモーダル埋め込みは様々なスケールの移動周期と相関を捉え、トランスフォーマーは長期移動依存を捉え、モデル予測性能を高める。このアプローチは、まずマクロモビリティを予測し、次に推定マクロモビリティ分布に基づく屋内モビリティ、マイクロモビリティをモデル化し、マクロモビリティのトポロジカル制約を用いて予測空間を大幅に削減する。実験の結果、WiFiModは現在の最先端モデルよりも少なくとも10%高い精度で予測できることがわかった。さらにWiFiModの3つの実環境アプリケーションについても紹介する。 (i)covid-19またはiliの政策決定のための高密度ホットポケットの予測 (ii)室内移動の現実的なシミュレーションを生成する。 (iii)パーソナルアシスタントのデザイン。

Modeling human mobility has a wide range of applications from urban planning to simulations of disease spread. It is well known that humans spend 80% of their time indoors but modeling indoor human mobility is challenging due to three main reasons: (i) the absence of easily acquirable, reliable, low-cost indoor mobility datasets, (ii) high prediction space in modeling the frequent indoor mobility, and (iii) multi-scalar periodicity and correlations in mobility. To deal with all these challenges, we propose WiFiMod, a Transformer-based, data-driven approach that models indoor human mobility at multiple spatial scales using WiFi system logs. WiFiMod takes as input enterprise WiFi system logs to extract human mobility trajectories from smartphone digital traces. Next, for each extracted trajectory, we identify the mobility features at multiple spatial scales, macro, and micro, to design a multi-modal embedding Transformer that predicts user mobility for several hours to an entire day across multiple spatial granularities. Multi-modal embedding captures the mobility periodicity and correlations across various scales while Transformers capture long-term mobility dependencies boosting model prediction performance. This approach significantly reduces the prediction space by first predicting macro mobility, then modeling indoor scale mobility, micro-mobility, conditioned on the estimated macro mobility distribution, thereby using the topological constraint of the macro-scale. Experimental results show that WiFiMod achieves a prediction accuracy of at least 10% points higher than the current state-of-art models. Additionally, we present 3 real-world applications of WiFiMod - (i) predict high-density hot pockets for policy-making decisions for COVID19 or ILI, (ii) generate a realistic simulation of indoor mobility, (iii) design personal assistants.

翻訳日:2023-04-03 02:39:09 公開日:2021-07-10

# MIMOテラヘルツ量子鍵分布

MIMO Terahertz Quantum Key Distribution ( http://arxiv.org/abs/2105.03642v3 )

ライセンス: Link先を確認

Neel Kanth Kundu, Soumya P. Dash, Matthew R. McKay, and Ranjan K. Mallik

(参考訳) 室温で動作するテラヘルツ(THz)周波数アプリケーションのための多重出力多重出力(MIMO)量子鍵分布(QKD)方式を提案する。古典的MIMO通信により、AliceとBobの間のランク$r$MIMOチャネルを$r$並列損失量子チャネルに変換する送信受信ビームフォーミングスキームが提案されている。既存の単一アンテナqkd方式と比較すると,mimo qkd方式は秘密鍵レートを増加させ,伝送距離を延ばすことで性能向上をもたらすことを示す。シミュレーションの結果,THz周波数での高自由空間パス損失を克服するためには,複数のアンテナが必要であることがわかった。性能と周波数の非単調な関係を実証し,10-30$THzの周波数範囲で正の鍵レートが達成可能であることを示す。提案手法は,第5世代超セキュア無線通信システムにおける屋内および屋外のQKD応用に適用可能である。

We propose a multiple-input multiple-output (MIMO) quantum key distribution (QKD) scheme for terahertz (THz) frequency applications operating at room temperature. Motivated by classical MIMO communications, a transmit-receive beamforming scheme is proposed that converts the rank-$r$ MIMO channel between Alice and Bob into $r$ parallel lossy quantum channels. Compared with existing single-antenna QKD schemes, we demonstrate that the MIMO QKD scheme leads to performance improvements by increasing the secret key rate and extending the transmission distance. Our simulation results show that multiple antennas are necessary to overcome the high free-space path loss at THz frequencies. We demonstrate a non-monotonic relation between performance and frequency, and reveal that positive key rates are achievable in the $10-30$ THz frequency range. The proposed scheme can be used for both indoor and outdoor QKD applications for beyond fifth-generation ultra-secure wireless communications systems.

翻訳日:2023-04-01 03:28:10 公開日:2021-07-10

# 5G eHealth Systems, Technologies, ユースケース, 今後の課題

Design and Implementation of 5G eHealth Systems, Technologies, Use Cases and Future Challenges ( http://arxiv.org/abs/2106.05086v2 )

ライセンス: Link先を確認

Di Zhang, Joel J. P. C. Rodrigues, Yunkai Zhai, Takuro Sato

(参考訳) 第5世代(5G)は、高信頼性、低レイテンシ、さらに高速な送信速度で巨大なデバイスを接続することを目的としている。しかし、現在の5G E-Healthシステムへの取り組みは、その完全な青写真を達成するには不十分だ。本稿では,まず,5g e-healthシステムの設計における物理層,上層層,クロス層の観点からの関連技術について論じる。その後、遠隔医療用5G e-healthシステムとコビッドウイルス感染防止用5G e-healthシステムという2つのユースケースを実施。 5G e-health システムの今後の研究動向と課題について検討する。

Fifth generation (5G) aims to connect massive devices with even higher reliability, lower latency and even faster transmission speed, which are vital for implementing the e-health systems. However, the current efforts on 5G e-health systems are still not enough to accomplish its full blueprint. In this article, we first discuss the related technologies from physical layer, upper layer and cross layer perspectives on designing the 5G e-health systems. We afterwards elaborate two use cases according to our implementations, i.e., 5G e-health systems for remote health and 5G e-health systems for Covid-19 pandemic containment. We finally envision the future research trends and challenges of 5G e-health systems.

翻訳日:2023-03-27 04:19:12 公開日:2021-07-10

# 共形場理論の復号化-教師なしから教師なし学習へ

Decoding conformal field theories: from supervised to unsupervised learning ( http://arxiv.org/abs/2106.13485v2 )

ライセンス: Link先を確認

En-Jui Kuo, Alireza Seif, Rex Lundgren, Seth Whitsitt, Mohammad Hafezi

(参考訳) 我々は機械学習を用いて有理2次元等角場理論を分類する。まず,これらの最小モデルのエネルギースペクトルを用いて教師あり学習アルゴリズムを学習する。機械はエネルギースペクトルのみを用いて、いくつかの強い相関を持つスピンモデルの臨界点の性質と値を正確に予測できることがわかった。これは、機械学習を用いて物質の異なる相を分類する以前の研究とは対照的であるが、相間の臨界点の性質を明らかにしない。ある種のトポロジカル位相の基底状態のハミルトニアンも共形場理論によって記述されるので、R\'{e}yniエントロピーの教師付き学習を用いて、機械が高い精度でR\'{e}yniエントロピーが最小のR\'{e}yniエントロピーのみを持つハミルトニアンをどの共形場理論で記述するかを特定できることを示した。最後に、教師なし学習アルゴリズムであるオートエンコーダを用いて、中央電荷と直接相関する隠れ変数を発見し、機械学習を用いて高次元の物体を含む他の共形場理論を研究するための展望を議論する。その結果、機械学習は臨界点の発見と特徴付けに利用でき、さらに機械学習を用いてより複雑な共形場の理論を学習する興味深い可能性を示唆している。

We use machine learning to classify rational two-dimensional conformal field theories. We first use the energy spectra of these minimal models to train a supervised learning algorithm. We find that the machine is able to correctly predict the nature and the value of critical points of several strongly correlated spin models using only their energy spectra. This is in contrast to previous works that use machine learning to classify different phases of matter, but do not reveal the nature of the critical point between phases. Given that the ground-state entanglement Hamiltonian of certain topological phases of matter is also described by conformal field theories, we use supervised learning on R\'{e}yni entropies and find that the machine is able to identify which conformal field theory describes the entanglement Hamiltonian with only the lowest few R\'{e}yni entropies to a high degree of accuracy. Finally, using autoencoders, an unsupervised learning algorithm, we find a hidden variable that has a direct correlation with the central charge and discuss prospects for using machine learning to investigate other conformal field theories, including higher-dimensional ones. Our results highlight that machine learning can be used to find and characterize critical points and also hint at the intriguing possibility to use machine learning to learn about more complex conformal field theories.

翻訳日:2023-03-25 14:07:20 公開日:2021-07-10

# フレーム重ね合わせクラスタ : 遷移行列を高精度に導出する方法

Frame Superposition Cluster: The method to derive the transition matrices in high accuracy ( http://arxiv.org/abs/2107.02979v2 )

ライセンス: Link先を確認

Hikaru Wakaura and Takao Tomono

(参考訳) 変分量子固有ソルバ(vqe)法は、量子コンピュータを用いて固有エネルギーから波動関数を導出する方法である。しかし、それらの重畳状態を利用する方法はまだ開発されていない。 VQE法により導かれる2つの状態間の可観測物のオフ対角要素を計算するには、ノイズ中間スケール量子(NISQ)デバイスに対する多数のゲート演算が必要となる。 VQE法により導出された状態間の重ね合わせ状態を駆動する新しい手法を提案した。 Frame superposition cluster (FSC) と呼ぶ。本手法を用いて, 双極子遷移モーメントを他の方法と比較して高い精度で計算できることを確認した。

Variational Quantum Eigensolver (VQE) method is the way to derive the wave functions from their eigen energies using quantum computers. But, the methods to utilize the superposition states between them haven't been developed yet. The method to calculate the off diagonal element of observables between two states derived by VQE method requires large numbers of gate operations for Noisy Intermediate Scale Quantum (NISQ) devices. We proposed the novel method to drive the superposition state between the states the derived by VQE method. We call it Frame superposition cluster (FSC) method. Using the method, we confirmed that dipole transition moment could be calculated with the highest accuracy compared to other methods.

翻訳日:2023-03-23 04:32:31 公開日:2021-07-10

# 好中球のメタ・アプレンジザド・パラオシミザカオ・デ・パラメトロス

Meta-aprendizado para otimizacao de parametros de redes neurais ( http://arxiv.org/abs/2109.13745v1 )

ライセンス: Link先を確認

Tarsicio Lucas, Teresa Ludermir, Ricardo Prudencio, Carlos Soares

(参考訳) ANN(Artificial Neural Networks)の最適化は,これらのモデルを現実のアプリケーションで使用する上で重要な課題である。このタスクに採用されるソリューションは一般的に高価であり、試行錯誤手順や専門家の知識が常に利用できるわけではない。本研究では,ANNの最適化にメタラーニングを用いることを検討した。メタ学習は,学習問題の特徴と学習アルゴリズムの性能を関連付ける知識の自動獲得を目的とした研究分野である。メタラーニング手法は,元来アルゴリズム選択問題に対して提案され,その後に支援ベクトルマシンのパラメータの最適化を行った。しかし、メタラーニングは、ANNパラメータを最適化するためのより一般的な戦略として採用され、この研究の方向性における新たな取り組みの動機となっている。本研究では,mlpネットワークにおける隠れノード数をメタラーニングを用いて選択するケーススタディを行った。本研究では,93の回帰問題に関連するメタサンプルを作成した。それぞれのメタサンプルは回帰問題から生成され、格納される: 問題を記述する16の特徴(例えば、問題属性の属性の数と相関)と、可能な範囲から実験的に選択されたこの問題のノードの最大数。この一連のメタサンプルはmeta-learnerへの入力として与えられ、その特徴に基づいて新しい問題に対して最適なノード数を予測することができた。実験の結果, 良好な結果が得られた。

The optimization of Artificial Neural Networks (ANNs) is an important task to the success of using these models in real-world applications. The solutions adopted to this task are expensive in general, involving trial-and-error procedures or expert knowledge which are not always available. In this work, we investigated the use of meta-learning to the optimization of ANNs. Meta-learning is a research field aiming to automatically acquiring knowledge which relates features of the learning problems to the performance of the learning algorithms. The meta-learning techniques were originally proposed and evaluated to the algorithm selection problem and after to the optimization of parameters for Support Vector Machines. However, meta-learning can be adopted as a more general strategy to optimize ANN parameters, which motivates new efforts in this research direction. In the current work, we performed a case study using meta-learning to choose the number of hidden nodes for MLP networks, which is an important parameter to be defined aiming a good networks performance. In our work, we generated a base of meta-examples associated to 93 regression problems. Each meta-example was generated from a regression problem and stored: 16 features describing the problem (e.g., number of attributes and correlation among the problem attributes) and the best number of nodes for this problem, empirically chosen from a range of possible values. This set of meta-examples was given as input to a meta-learner which was able to predict the best number of nodes for new problems based on their features. The experiments performed in this case study revealed satisfactory results.

翻訳日:2023-03-22 21:58:24 公開日:2021-07-10

# 高等教育機関ウェブサイトの訪問者データの解析

Analysis of the Visitor Data of a Higher Education Institution Website ( http://arxiv.org/abs/2107.14107v1 )

ライセンス: Link先を確認

Omer Aydin

(参考訳) 今日の世界では、インターネットは人間の生活のあらゆる側面に影響を与えており、企業ウェブサイトや他の多くの分野にも変化をもたらしている。企業のWebサイトは、よりダイナミックでインタラクティブで、新しい技術との互換性が高まるべきだ。 webサイトとユーザ、検索エンジン、その他のデバイスとのインタラクションは、専門家によって検証され、このインタラクションのために改善と変更が行われるべきである。本研究では,高等教育機関のウェブサイトを調査した。分析には2013年から2019年にかけて収集された訪問者データを用いた。幅広い調査・データを含む本研究では,交通分析から開発提案までの重要な知見が盛り込まれている。特に,モバイル端末との互換性,画像と動画の最適化,ユーザの地理的特徴,言語オプション,時間とともにアクセスされるコンテンツの密度分析などを通じて有用な情報を得た。

In todays world, the internet affects every aspect of human life; it has caused changes in corporate websites as well as in many other areas. Corporate websites should be more dynamic, more interactive, and more compatible with new technologies. The interaction of the website with users, search engines, and other devices has to be examined by experts, and improvements and changes should be made for this interaction. In this study, a higher education institution website was examined. Visitor data collected between 2013 and 2019 were used for the analysis. In the study, which includes a wide range of examinations and data, important findings from traffic analysis to development suggestions were included. In particular, useful information has been obtained through the compatibility of the site with mobile devices, optimization of pictures and videos, geographical features of users, language options, and density analysis of the content accessed over time.

翻訳日:2023-03-22 21:57:59 公開日:2021-07-10

# ワイルディスクにおけるホロノミック量子操作

Holonomic quantum manipulation in the Weyl Disk ( http://arxiv.org/abs/2107.04814v1 )

ライセンス: Link先を確認

Victor Boogers, Janis Erdmanis, Yuli Nazarov

(参考訳) 超伝導ナノ構造のワイル点が、2つの量子状態がパラメトリック空間の2次元多様体においてほぼ退化するワイル円板を生じる可能性があることが示されている。これによりホロノミック量子操作の可能性が開かれ、縮退多様体内のパラメータの断熱的変化による波動関数の変換が実現される。本稿では,ワイルディスクにおけるホロノミック操作の機会について詳細に検討する。準古典近似で多様体の接続を計算し、アベリアンであることを示し、位相ゲートとして使うことができる。状態の準備と読み出しを含む量子操作のクローズドな例を提供するため、縮退した部分空間からシステムを引き出すパラメータの変更によりホロノミックゲートを補強する。数値図解には、準古典的パラメータの有限値と正確な量子力学を用いる。異なる実行時間に対するサンプルゲートの忠実度について検討する。

It has been shown that a Weyl point in a superconducting nanostructure may give rise to a Weyl disk where two quantum states are almost degenerate in a 2D manifold in the parametric space. This opens up the possibility of a holonomic quantum manipulation: a transformation of the wave function upon adiabatic change of the parameters within the degenerate manifold. In this paper, we investigate in detail the opportunities for holonomic manipulation in Weyl disks. We compute the connection at the manifold in quasiclassical approximation to show it is Abelian and can be used for a phase gate. To provide a closed example of quantum manipulation that includes a state preparation and read-out, we augment the holonomic gate with a change of parameters that brings the system out of the degenerate subspace. For numerical illustrations, we use a finite value of quasiclassical parameter and exact quantum dynamics. We investigate the fidelity of an example gate for different execution times.

翻訳日:2023-03-22 21:56:43 公開日:2021-07-10

# フラクタル量子力学によるブラックホール熱力学の展望

Prospecting Black Hole Thermodynamics with Fractional Quantum Mechanics ( http://arxiv.org/abs/2107.04789v1 )

ライセンス: Link先を確認

S. Jalalzadeh, F. Rodrigues da Silva and P. V. Moniz

(参考訳) 本稿では、分数量子力学の枠組みがブラックホール熱力学の視点を広げるかどうかを考察する。具体的には、主ツールとして {\it space-fractional} 微分 \cite{Rie} を用いる。さらに、解析はシュワルツシルト構成の場合に限定される。その後修正されたウィーラー・ドウィット方程式から、対応する特定の観測可能な式を取得する。つまり、ブラックホール質量スペクトルは$M$、温度は$T$、エントロピーは$S$である。これらの熊の連続的な変化は、分数パラメータ($\alpha$)を通して伝達される。特に、標準結果は特定の制限である$\alpha=2$で回収される。さらに、Tsallis と Cirto \cite{Tsallis} と Barrow \cite{Barrow} が提案するエントロピー-面積関係の一般化が、分数的な観点で補完的な解釈を得る方法について詳しく述べる。結果について徹底的に議論する。

This paper investigates whether the framework of fractional quantum mechanics can broaden our perspective of black hole thermodynamics. Concretely, we employ a {\it space-fractional} derivative \cite{Rie} as our main tool. Moreover, we restrict our analysis to the case of a Schwarzschild configuration. From a subsequently modified Wheeler-DeWitt equation, we retrieve the corresponding expressions for specific observables. Namely, the black hole mass spectrum, $M$, its temperature $T$, and entropy, $S$. We find that these bear consequential alterations conveyed through a fractional parameter, $\alpha$. In particular, the standard results are recovered in the specific limit $\alpha=2$. Furthermore, we elaborate how generalizations of the entropy-area relation suggested by Tsallis and Cirto \cite{Tsallis} and Barrow \cite{Barrow} acquire a complementary interpretation in terms of a fractional point of view. A thorough discussion of our results is presented.

翻訳日:2023-03-22 21:56:07 公開日:2021-07-10

# 改良チャプリギンガスによる暗黒物質ボース・アインシュタイン凝縮の粘性相互作用と安定性

Viscous interacting and stability on dark matter Bose-Einstein condensation with modified Chaplygin gas ( http://arxiv.org/abs/2107.04780v1 )

ライセンス: Link先を確認

E. Mahichi, Alireza Amani, and M. A. Ramzanpour

(参考訳) 本稿では,暗黒物質ボース・アインシュタイン凝縮(BEC)の存在下での粘性宇宙力学を曲面-FRW背景により研究する。この目的のために、我々は通常の暗黒物質(低温暗黒物質またはバロトロピック暗黒物質)ではなく、暗黒物質の状態方程式(eos)を重力形式から生じる$p_{dm} \propto \rho_{dm}^2$とするbecレジームを用いる。したがって、修正チャプリギンガスとの相互作用モデルを考えることにより、宇宙成分の存在を考慮した対応する連続性方程式を得る。その後、赤方偏移パラメータの観点から、エネルギー密度とダークエネルギーの圧力を導出する。そして、パラメトリゼーション関数を導入し、51の超新星データと確率解析を適合させることで、宇宙論的パラメータと赤方偏移パラメータを見いだす。以下に示すように、対応する動的グラフを赤方偏移に比例してプロットし、宇宙が現在加速展開段階にあることを示す。最後に, 本モデルの安定性と安定性について, 音速パラメータを用いて検討する。

In this paper, the viscous cosmological dynamics are studied in the presence of dark matter Bose-Einstein Condensation (BEC) by curved-FRW background. For this purpose, we use the BEC regime rather than the normal dark matter (the cold dark matter or the barotropic dark matter) with the dark matter Equation of State (EoS) as $p_{dm} \propto \rho_{dm}^2$, which arises from the gravitational form. Therefore, we obtain the corresponding continuity equations with the existence of the universe components by considering an interacting model with modified Chaplygin gas. Afterward, we derive the energy density and the pressure of dark energy in terms of the redshift parameter. And then, by introducing a parametrization function and fitting it with 51 supernova data with the likelihood analysis, we find the cosmological parameters versus redshift parameter. In what follows, we plot the corresponding dynamic graphs proportional to redshift, and then we represent the universe is currently undergoing an accelerated expansion phase. Finally, we explore the stability and the instability of the present model with the sound speed parameter.

翻訳日:2023-03-22 21:55:51 公開日:2021-07-10

# 標準模型物理学とデジタル量子革命:インターフェースを考える

Standard Model Physics and the Digital Quantum Revolution: Thoughts about the Interface ( http://arxiv.org/abs/2107.04769v1 )

ライセンス: Link先を確認

Natalie Klco, Alessandro Roggero, Martin J. Savage

(参考訳) 量子システムの分離・制御・絡み合いの進歩は、かつての量子力学の興味深い特徴を、破壊的な科学的・技術的進歩のための乗り物へと変えつつある。 feynman氏が提唱したビジョンを追求するべく、多くの研究と開発分野にわたる協力的な取り組みが、ドメイン科学者が利用可能なコンピューティングエコシステムにプロトタイプ的デジタル量子デバイスを導入する。これらの初期の量子デバイスとの相互作用を通じて、古典的に難解な量子系を探索する抽象的なビジョンは、現実の現実へと進化している。これらの技術的進歩を触媒する以外に、絡み合いは量子相関の診断や組織ツールとして平行進行を可能にし、量子多体系と標準モデルから定義および出現する量子場理論の理解の改善を導く。 3つの領域科学理論家の視点からは、核・高エネルギー物理学の科学的目的により、最近のNISQ時代の進歩を文脈化するために、絡み合い、複雑性、量子シミュレーションのインターフェースについての考察をまとめる。

Advances in isolating, controlling and entangling quantum systems are transforming what was once a curious feature of quantum mechanics into a vehicle for disruptive scientific and technological progress. Pursuing the vision articulated by Feynman, a concerted effort across many areas of research and development is introducing prototypical digital quantum devices into the computing ecosystem available to domain scientists. Through interactions with these early quantum devices, the abstract vision of exploring classically-intractable quantum systems is evolving toward becoming a tangible reality. Beyond catalyzing these technological advances, entanglement is enabling parallel progress as a diagnostic for quantum correlations and as an organizational tool, both guiding improved understanding of quantum many-body systems and quantum field theories defining and emerging from the Standard Model. From the perspective of three domain science theorists, this article compiles thoughts about the interface on entanglement, complexity, and quantum simulation in an effort to contextualize recent NISQ-era progress with the scientific objectives of nuclear and high-energy physics.

翻訳日:2023-03-22 21:55:28 公開日:2021-07-10

# アンサンブル学習による行列補完問題としての非線形交通予測

Nonlinear Traffic Prediction as a Matrix Completion Problem with Ensemble Learning ( http://arxiv.org/abs/2001.02492v4 )

ライセンス: Link先を確認

Wenqing Li, Chuhan Yang, and Saif Eddin Jabari

(参考訳) 本稿では,信号化トラフィック運用管理における短期交通予測の問題に対処する。具体的には,高分解能(秒間)でのセンサ状態の予測に着目する。これは、通常5分未満の間隔で集約されたトラフィック変数を予測することに焦点を当てた従来のトラフィック予測問題とは対照的である。まず,予測問題を行列補完問題としてモデル化する方法を示す。第2に、ブロック座標降下アルゴリズムを用い、そのアルゴリズムがサブ線形時間でブロック座標最適化器に収束することを実証する。これにより,高分解能データの「大さ」を計算可能な方法で活用することができる。第3に,任意の誤差閾値内でトレーニングエラーを低減させるアンサンブル学習(適応ブースティング)手法を開発した。後者は過去数日間を利用して、データ内の周期的なパターンを捉えることができる。提案手法の性能を理論的に解析し,uaeのアブダビのシミュレーションデータと実世界の高分解能トラヒックデータセットを用いて実証実験を行った。実験の結果,提案手法は他の最先端アルゴリズムよりも優れていることがわかった。

This paper addresses the problem of short-term traffic prediction for signalized traffic operations management. Specifically, we focus on predicting sensor states in high-resolution (second-by-second). This contrasts with traditional traffic forecasting problems, which have focused on predicting aggregated traffic variables, typically over intervals that are no shorter than 5 minutes. Our contributions can be summarized as offering three insights: first, we show how the prediction problem can be modeled as a matrix completion problem. Second, we employ a block-coordinate descent algorithm and demonstrate that the algorithm converges in sub-linear time to a block coordinate-wise optimizer. This allows us to capitalize on the "bigness" of high-resolution data in a computationally feasible way. Third, we develop an ensemble learning (or adaptive boosting) approach to reduce the training error to within any arbitrary error threshold. The latter utilizes past days so that the boosting can be interpreted as capturing periodic patterns in the data. The performance of the proposed method is analyzed theoretically and tested empirically using both simulated data and a real-world high-resolution traffic dataset from Abu Dhabi, UAE. Our experimental results show that the proposed method outperforms other state-of-the-art algorithms.

翻訳日:2023-01-13 13:07:57 公開日:2021-07-10

# Monsterをバイパスする: 実現可能性下でのコンテキスト帯域の高速かつ簡易な最適アルゴリズム

Bypassing the Monster: A Faster and Simpler Optimal Algorithm for Contextual Bandits under Realizability ( http://arxiv.org/abs/2003.12699v5 )

ライセンス: Link先を確認

David Simchi-Levi, Yunzong Xu

(参考訳) 一般化可能性仮定の下での一般(確率的な)文脈的包帯問題、すなわち、期待される報酬は、文脈と行動の函数として、一般函数クラス $\mathcal{F}$ に属する。我々は,すべての$T$ラウンドにわたるオフライン回帰オラクルへの${O}(\log T)$呼び出しだけで,統計的に最適な後悔を実現する,高速で単純なアルゴリズムを設計する。 oracleの呼び出し数はさらに$t$が事前に分かっている場合は$o(\log\log t)$に減らすことができる。本研究は,コンテキストバンディットからオフライン回帰への最初の普遍的かつ最適な還元を行い,コンテキストバンディット文学における重要なオープン問題を解決した。我々の結果の直接的な結果として、オフライン回帰の進展は直ちに統計的・計算的に文脈的帯域に変換される。これにより、より高速なアルゴリズムと、より広い階層の文脈的バンディット問題の後悔の保証がもたらされる。

We consider the general (stochastic) contextual bandit problem under the realizability assumption, i.e., the expected reward, as a function of contexts and actions, belongs to a general function class $\mathcal{F}$. We design a fast and simple algorithm that achieves the statistically optimal regret with only ${O}(\log T)$ calls to an offline regression oracle across all $T$ rounds. The number of oracle calls can be further reduced to $O(\log\log T)$ if $T$ is known in advance. Our results provide the first universal and optimal reduction from contextual bandits to offline regression, solving an important open problem in the contextual bandit literature. A direct consequence of our results is that any advances in offline regression immediately translate to contextual bandits, statistically and computationally. This leads to faster algorithms and improved regret guarantees for broader classes of contextual bandit problems.

翻訳日:2022-12-18 23:37:48 公開日:2021-07-10

# 深部異常検出における再考

Rethinking Assumptions in Deep Anomaly Detection ( http://arxiv.org/abs/2006.00339v2 )

ライセンス: Link先を確認

Lukas Ruff, Robert A. Vandermeulen, Billy Joe Franks, Klaus-Robert M\"uller, and Marius Kloft

(参考訳) 異常検出(英: anomaly detection, ad)は、分類問題(固有対異常)と見なすことができるが、通常は「異常」の意味を十分に特徴付けるデータセットであるため、教師なしの方法で扱われる。本稿では,この直感が画像上での深部ADにまで拡張されないことを示す。 ImageNetの最近のADベンチマークでは、通常のサンプルと数個の(64)ランダムな自然画像の区別を訓練された分類器が、ADの最先端技術よりも優れている。画像データの多スケール構造は,異常を例示的に有益であることを示す。

Though anomaly detection (AD) can be viewed as a classification problem (nominal vs. anomalous) it is usually treated in an unsupervised manner since one typically does not have access to, or it is infeasible to utilize, a dataset that sufficiently characterizes what it means to be "anomalous." In this paper we present results demonstrating that this intuition surprisingly seems not to extend to deep AD on images. For a recent AD benchmark on ImageNet, classifiers trained to discern between normal samples and just a few (64) random natural images are able to outperform the current state of the art in deep AD. Experimentally we discover that the multiscale structure of image data makes example anomalies exceptionally informative.

翻訳日:2022-11-26 17:42:47 公開日:2021-07-10

# 凸ニューラルネットワークの奇妙なケース

The Curious Case of Convex Neural Networks ( http://arxiv.org/abs/2006.05103v3 )

ライセンス: Link先を確認

Sarath Sivaprasad, Ankur Singh, Naresh Manwani, Vineet Gandhi

(参考訳) 本稿では,入力の凸関数を出力とするニューラルネットワークの制約付き定式化について検討する。完全連結層と畳み込み層の両方で凸性制約を強制でき、ほとんどのアーキテクチャに適用できることを示した。凸性制約は、(第一層を除いて)重みを非負に制限し、非減少凸活性化関数を使用することを含む。単純ではあるが、これらの制約はネットワークの一般化能力に深い影響を及ぼす。 3つの貴重な洞察を導きます (a)入力出力凸ニューラルネットワーク(IOC-NN)の自己正規化と過度適合問題の低減 (b)厳しい制約があるにもかかわらず、ベース多層パーセプトロンよりも優れ、ベース畳み込みアーキテクチャと同じような性能を達成する。 (c)IOC-NNは,列車ラベルの騒音に対する堅牢性を示す。 3種類のニューラルネットワークアーキテクチャを用いた標準画像分類データセットの徹底的な実験とアブレーション実験を用いて,提案手法の有効性を実証する。

In this paper, we investigate a constrained formulation of neural networks where the output is a convex function of the input. We show that the convexity constraints can be enforced on both fully connected and convolutional layers, making them applicable to most architectures. The convexity constraints include restricting the weights (for all but the first layer) to be non-negative and using a non-decreasing convex activation function. Albeit simple, these constraints have profound implications on the generalization abilities of the network. We draw three valuable insights: (a) Input Output Convex Neural Networks (IOC-NNs) self regularize and reduce the problem of overfitting; (b) Although heavily constrained, they outperform the base multi layer perceptrons and achieve similar performance as compared to base convolutional architectures and (c) IOC-NNs show robustness to noise in train labels. We demonstrate the efficacy of the proposed idea using thorough experiments and ablation studies on standard image classification datasets with three different neural network architectures.

翻訳日:2022-11-23 13:41:39 公開日:2021-07-10

# 制約ミスマッチポリシーによる安全強化学習の促進

Accelerating Safe Reinforcement Learning with Constraint-mismatched Policies ( http://arxiv.org/abs/2006.11645v3 )

ライセンス: Link先を確認

Tsung-Yen Yang and Justinian Rosca and Karthik Narasimhan and Peter J. Ramadge

(参考訳) 本研究では,(1)ベースライン制御ポリシと(2)学習者が満たさなければならない制約のセットを備える場合の強化学習の問題を考える。基本方針は、デモンストレーションデータや教師エージェントから生じて、学習に有用な手がかりを提供することもあるが、手元にあるタスクには準最適であり、安全、公正、その他のアプリケーション固有の要件を符号化する特定の制約を満たすことが保証されていない。基本方針から安全に学習するために,タスクに対する期待収益の最大化,基本方針への距離の最小化,制約満足セットへのポリシーの投影を交互に行う反復的ポリシー最適化アルゴリズムを提案する。アルゴリズムを理論的に解析し,有限時間収束保証を提供する。 5つの異なる制御タスクに関する実験では、アルゴリズムが最先端のベースラインを一貫して上回っており、10倍の制約違反と平均40%の報酬を達成しています。

We consider the problem of reinforcement learning when provided with (1) a baseline control policy and (2) a set of constraints that the learner must satisfy. The baseline policy can arise from demonstration data or a teacher agent and may provide useful cues for learning, but it might also be sub-optimal for the task at hand, and is not guaranteed to satisfy the specified constraints, which might encode safety, fairness or other application-specific requirements. In order to safely learn from baseline policies, we propose an iterative policy optimization algorithm that alternates between maximizing expected return on the task, minimizing distance to the baseline policy, and projecting the policy onto the constraint-satisfying set. We analyze our algorithm theoretically and provide a finite-time convergence guarantee. In our experiments on five different control tasks, our algorithm consistently outperforms several state-of-the-art baselines, achieving 10 times fewer constraint violations and 40% higher reward on average.

翻訳日:2022-11-18 21:52:57 公開日:2021-07-10

# 大規模シンボリック回帰のための高速ニューラルネットワークモデル

Fast Neural Models for Symbolic Regression at Scale ( http://arxiv.org/abs/2007.10784v2 )

ライセンス: Link先を確認

Allan Costa and Rumen Dangovski and Owen Dugan and Samuel Kim and Pawan Goyal and Marin Solja\v{c}i\'c and Joseph Jacobson

(参考訳) ディープラーニングの成功は、ニューラルネットワークの驚くべき表現力に起因している。しかし、これは、科学、エンジニアリング、および現実世界のデータを記述する分析式を見つけるという目標と矛盾する、トレーニングデータセットのドメインをはるかに超過する複雑なブラックボックスモデルのコストが伴う。このような法則の階層的モジュラリティは、ニューラルネットワークのトレーニングによって捉えることができるという仮説のもとに、解釈可能でコンパクトでスパースな解を見つけるニューラルネットワークモデルであるoccamnet(英語版)を紹介し、データ適合のための解である \`{a} la occam's razor(英語版)を紹介する。我々のモデルは微分不可能関数空間上の確率分布を定義する。進化戦略におけるクロスエントロピーマッチングに基づいて,関数をサンプリングし,重みを逆プロパゲーションで更新する2段階最適化手法を提案する。 OccamNetは、単純な分析関数、再帰的プログラム、暗黙的な関数、単純な画像分類など、さまざまな記号法則に適合し、実世界の回帰データセット上で、顕著に最先端のシンボル回帰手法を上回ります。我々の手法はメモリフットプリントを最小限に抑え、AIアクセラレーターを必要とせず、単一のCPU上でのトレーニングに数分で複雑な関数を適合させ、GPU上でスケールした場合の大幅なパフォーマンス向上を示す。実験を再現するための実装、デモ、インストラクションはhttps://github.com/druidowm/occamnet_public.comで利用可能です。

Deep learning owes much of its success to the astonishing expressiveness of neural networks. However, this comes at the cost of complex, black-boxed models that extrapolate poorly beyond the domain of the training dataset, conflicting with goals of finding analytic expressions to describe science, engineering and real world data. Under the hypothesis that the hierarchical modularity of such laws can be captured by training a neural network, we introduce OccamNet, a neural network model that finds interpretable, compact, and sparse solutions for fitting data, \`{a} la Occam's razor. Our model defines a probability distribution over a non-differentiable function space. We introduce a two-step optimization method that samples functions and updates the weights with backpropagation based on cross-entropy matching in an evolutionary strategy: we train by biasing the probability mass toward better fitting solutions. OccamNet is able to fit a variety of symbolic laws including simple analytic functions, recursive programs, implicit functions, simple image classification, and can outperform noticeably state-of-the-art symbolic regression methods on real world regression datasets. Our method requires minimal memory footprint, does not require AI accelerators for efficient training, fits complicated functions in minutes of training on a single CPU, and demonstrates significant performance gains when scaled on a GPU. Our implementation, demonstrations and instructions for reproducing the experiments are available at https://github.com/druidowm/OccamNet_Public.

翻訳日:2022-11-09 21:47:53 公開日:2021-07-10

# 小さな逆バイアスを持つスパーススケッチ

Sparse sketches with small inversion bias ( http://arxiv.org/abs/2011.10695v2 )

ライセンス: Link先を確認

Micha{\l} Derezi\'nski, Zhenyu Liao, Edgar Dobriban and Michael W. Mahoney

(参考訳) 高い$n\times d$ matrix $A$ とランダムな$m\times n$ スケッチ行列 $S$ に対して、逆共分散行列 $(A^\top A)^{-1}$ のスケッチされた推定は、一般的にバイアスされる: $E[(\tilde A^\top\tilde A)^{-1}]\ne(A^\top A)^{-1}$, $\tilde A=SA$。逆バイアスと呼ばれるこの現象は、統計学や分散最適化において、逆共分散に依存する複数の独立に構築された量の推定を平均化するときに生じる。我々は、ランダム行列に対する$(\epsilon,\delta)$-unbiased estimatorという概念に基づいて、逆バイアスを分析するフレームワークを開発した。スケッチマトリクス $s$ が密度が高く i.i.d. サブガウシアンエントリを持つ場合、単純な再スケーリングの後に、推定値 $(\frac m{m-d}\tilde a^\top\tilde a)^{-1}$ は $(\epsilon,\delta)$-unbiased for $(a^\top a)^{-1}$ で、サイズは $m=o(d+\sqrt d/\epsilon)$ である。これは、$m=O(d)$の場合、この推定子の逆バイアスは$O(1/\sqrt d)$であり、サブガウススケッチの埋め込み保証の結果得られる$\Theta(1)$近似誤差よりもはるかに小さいことを意味する。次に, LEverage Score Sparsified (LESS) Embeddingsという新しいスケッチ手法を提案する。この手法は, 疎結合とデータ認識のレバレッジベースの行サンプリング手法の両方のアイデアを用いて, スケッチサイズ$m=O(d\log d+\sqrt d/\epsilon)$ in time $O(\text{nnz}(A)\log n+md^2)$を得る。この解析を可能にする重要な手法は、制限されたbai-silverstein不等式(英語版)と呼ばれるランダム二次形式に対するbaiとsilversteinの古典的な不等式の拡張と、paley-zygmund不等式による二項分布の非集中化であり、スコアサンプリングスケッチを利用する下限を示す証明に使われる。

For a tall $n\times d$ matrix $A$ and a random $m\times n$ sketching matrix $S$, the sketched estimate of the inverse covariance matrix $(A^\top A)^{-1}$ is typically biased: $E[(\tilde A^\top\tilde A)^{-1}]\ne(A^\top A)^{-1}$, where $\tilde A=SA$. This phenomenon, which we call inversion bias, arises, e.g., in statistics and distributed optimization, when averaging multiple independently constructed estimates of quantities that depend on the inverse covariance. We develop a framework for analyzing inversion bias, based on our proposed concept of an $(\epsilon,\delta)$-unbiased estimator for random matrices. We show that when the sketching matrix $S$ is dense and has i.i.d. sub-gaussian entries, then after simple rescaling, the estimator $(\frac m{m-d}\tilde A^\top\tilde A)^{-1}$ is $(\epsilon,\delta)$-unbiased for $(A^\top A)^{-1}$ with a sketch of size $m=O(d+\sqrt d/\epsilon)$. This implies that for $m=O(d)$, the inversion bias of this estimator is $O(1/\sqrt d)$, which is much smaller than the $\Theta(1)$ approximation error obtained as a consequence of the subspace embedding guarantee for sub-gaussian sketches. We then propose a new sketching technique, called LEverage Score Sparsified (LESS) embeddings, which uses ideas from both data-oblivious sparse embeddings as well as data-aware leverage-based row sampling methods, to get $\epsilon$ inversion bias for sketch size $m=O(d\log d+\sqrt d/\epsilon)$ in time $O(\text{nnz}(A)\log n+md^2)$, where nnz is the number of non-zeros. The key techniques enabling our analysis include an extension of a classical inequality of Bai and Silverstein for random quadratic forms, which we call the Restricted Bai-Silverstein inequality; and anti-concentration of the Binomial distribution via the Paley-Zygmund inequality, which we use to prove a lower bound showing that leverage score sampling sketches generally do not achieve small inversion bias.

翻訳日:2022-09-22 23:17:45 公開日:2021-07-10

# (参考訳) Poissonレグレッションを伴うプレミアリーグのゴールスコア

Goal scoring in Premier League with Poisson regression ( http://arxiv.org/abs/2108.05796v1 )

ライセンス: CC BY 4.0

Cuong Pham, Tung Le

(参考訳) プレミアリーグは世界で最も競争力のあるサッカーリーグの1つとして知られており、試合ごとに多くのゴールが与えられる。各試合で得点したゴール数に影響を与える要因は何か? 私たちは、ポアソン回帰を用いて、ターゲット、コーナー、レッドカードといった多くの要因と、ホームチームが試合で得点できるゴールとの関係を見つけます。

Premier League is known as one of the most competitive football league in the world, hence there are many goals are scored here every match. Which are the factors that affect to the number of goal scored in each match? We use Poisson regression to find out the relation between many factors as shots on target, corners, red cards, to the goals home team can score in their match.

翻訳日:2021-08-15 18:34:34 公開日:2021-07-10

# (参考訳) ビデオパニックセグメンテーションのためのマージタスク

Merging Tasks for Video Panoptic Segmentation ( http://arxiv.org/abs/2108.04223v1 )

ライセンス: CC BY-SA 4.0

Jake Rap, Panagiotis Meletis

(参考訳) 本稿では,ビデオパノプティカルセグメンテーションの課題について検討し,その課題を解決するための2つの方法を提案する。ビデオパノプティカルセグメンテーション(VPS)は、最近導入されたコンピュータビジョンタスクであり、ビデオ内のすべてのピクセルを分類し、追跡する必要がある。このタスクの性質は、禁止されるデータセットに注釈を付けるコストを発生させる。ビデオのパンオプティカルセグメンテーションを理解するために、最初に、セマンティクスとトラッキングを別々に重視する構成タスクを導入した。その後、適切なVPSデータセットのトレーニングを必要としない2つのデータ駆動アプローチが選択される。最初のアプローチでは、事前訓練されたセマンティックセグメンテーションモデルと事前訓練されたマルチオブジェクト追跡モデルの出力をヒューリスティックに融合することにより、ビデオパノタイプセグメンテーションのモデルを構築する方法を示す。どちらのモデルの能力も容易に拡張したい場合、これは望まれる。第2のアプローチは、タスク固有の頭を持つ共有ニューラルネットワークバックボーン上に構築することで、最初のアプローチの欠点を克服する。このネットワークはパンオプティカルセグメンテーション用に設計されており、マスク伝搬モジュールによって時間にわたってインスタンスマスクをリンクするように拡張され、ビデオパンオプティカルセグメンテーションフォーマットとなる。

In this paper, the task of video panoptic segmentation is studied and two different methods to solve the task will be proposed. Video panoptic segmentation (VPS) is a recently introduced computer vision task that requires classifying and tracking every pixel in a given video. The nature of this task makes the cost of annotating datasets for it prohibiting. To understand video panoptic segmentation, first, earlier introduced constituent tasks that focus on semantics and tracking separately will be researched. Thereafter, two data-driven approaches which do not require training on a tailored VPS dataset will be selected to solve it. The first approach will show how a model for video panoptic segmentation can be built by heuristically fusing the outputs of a pre-trained semantic segmentation model and a pre-trained multi-object tracking model. This can be desired if one wants to easily extend the capabilities of either model. The second approach will counter some of the shortcomings of the first approach by building on top of a shared neural network backbone with task-specific heads. This network is designed for panoptic segmentation and will be extended by a mask propagation module to link instance masks across time, yielding the video panoptic segmentation format.

翻訳日:2021-08-15 16:44:40 公開日:2021-07-10

# 読み, 注意, コード: 機械による臨床ノートから医療用コード予測の限界を推し進める

Read, Attend, and Code: Pushing the Limits of Medical Codes Prediction from Clinical Notes by Machines ( http://arxiv.org/abs/2107.10650v1 )

ライセンス: Link先を確認

Byung-Hak Kim and Varun Ganapathi

(参考訳) 臨床ノートから医療コードを予測することは、現在の医療システム内のすべての医療提供組織にとって実用的かつ本質的な必要性である。アノテーションの自動化は、今日の人間のコーダーによる多大な時間と過剰な労力を節約します。しかし、最大の課題は、構造化されていないフリーテキスト臨床ノートから数千の高次元コードの中から適切な医療コードを直接識別することである。過去3年間で、CNN(Convolutional Neural Networks)とLTSM(Long Short-Term Memory)ネットワークによって、MIMIC-III-full-label in patient Clinical Noteデータセットの最も難しいベンチマークに対処する上で、大幅な改善が行われた。この進歩は、自動機械学習(ML)システムが人間のプログラマの作業パフォーマンスからどこまで遠いのかという根本的な疑問を提起する。同じサブサンプルテストセット上で,人間のコーダのパフォーマンスのベースラインを評価した。また、医療コードの割り当てマッピングを学習するための、読み取り、アテント、コード(RAC)モデルも提示します。結合した埋め込みと自己注意とコードタイトル誘導注意モジュールを結合し、文置換に基づくデータ拡張と確率的ウェイト平均化トレーニングを組み合わせることで、RACは新たな技術状態(SOTA)を確立し、現在の最高のマクロF1を18.7%上回り、人間レベルのコーディングベースラインを通り過ぎている。この新たなマイルストーンは、医療コード予測における人間のコーダのパフォーマンスと同等に達するマシンにおける完全自律型医療コーディング(AMC)への重要なステップである。

Prediction of medical codes from clinical notes is both a practical and essential need for every healthcare delivery organization within current medical systems. Automating annotation will save significant time and excessive effort spent by human coders today. However, the biggest challenge is directly identifying appropriate medical codes out of several thousands of high-dimensional codes from unstructured free-text clinical notes. In the past three years, with Convolutional Neural Networks (CNN) and Long Short-Term Memory (LTSM) networks, there have been vast improvements in tackling the most challenging benchmark of the MIMIC-III-full-label inpatient clinical notes dataset. This progress raises the fundamental question of how far automated machine learning (ML) systems are from human coders' working performance. We assessed the baseline of human coders' performance on the same subsampled testing set. We also present our Read, Attend, and Code (RAC) model for learning the medical code assignment mappings. By connecting convolved embeddings with self-attention and code-title guided attention modules, combined with sentence permutation-based data augmentations and stochastic weight averaging training, RAC establishes a new state of the art (SOTA), considerably outperforming the current best Macro-F1 by 18.7%, and reaches past the human-level coding baseline. This new milestone marks a meaningful step toward fully autonomous medical coding (AMC) in machines reaching parity with human coders' performance in medical code prediction.

翻訳日:2021-07-25 11:57:36 公開日:2021-07-10

# 急速学習と体系的一般化の根底にあるもの

What underlies rapid learning and systematic generalization in humans ( http://arxiv.org/abs/2107.06994v1 )

ライセンス: Link先を確認

Andrew Joohun Nam and James L. McClelland (Stanford University)

(参考訳) ニューラルネットワークの画期的な成功にもかかわらず、現代モデルは大量のデータセットによる広範なトレーニングを必要とし、サンプル外一般化の貧弱さを示す。提案された解決策の1つは、モデルに体系性とドメイン固有の制約を構築することである。本稿では,このアプローチの限界について,簡単な指導チュートリアルから抽象的推論タスクを学習する成人の能力と,不正確な回答に対する説明的フィードバックを検討することで考察し,トレーニング例の範囲外での人間学習のダイナミクスと一般化能力が,代表的なニューラルネットワークモデルとは大きく異なること,そして,モデルが著者が期待しない特徴の変化に対して脆弱であることを示す。このパズルを一貫して解く能力は, 教育, 特に基礎数学教育に関連し, 使用戦略の確実に識別可能かつ有効な説明を提供する能力である, という人間データからさらに証拠を提示する。本研究では,人間における素早い学習と体系的な一般化は,学習から学習までの段階的,経験に依存したプロセスに依存して,一般化可能な推論を支援する明示的な抽象ルールの構築を導くことを提案する。

Despite the groundbreaking successes of neural networks, contemporary models require extensive training with massive datasets and exhibit poor out-of-sample generalization. One proposed solution is to build systematicity and domain-specific constraints into the model, echoing the tenets of classical, symbolic cognitive architectures. In this paper, we consider the limitations of this approach by examining human adults' ability to learn an abstract reasoning task from a brief instructional tutorial and explanatory feedback for incorrect responses, demonstrating that human learning dynamics and ability to generalize outside the range of the training examples differ drastically from those of a representative neural network model, and that the model is brittle to changes in features not anticipated by its authors. We present further evidence from human data that the ability to consistently solve the puzzles was associated with education, particularly basic mathematics education, and with the ability to provide a reliably identifiable, valid description of the strategy used. We propose that rapid learning and systematic generalization in humans may depend on a gradual, experience-dependent process of learning-to-learn using instructions and explanations to guide the construction of explicit abstract rules that support generalizable inferences.

翻訳日:2021-07-18 12:36:26 公開日:2021-07-10

# アクティブニューラルネットワークによるバックプロップフリー強化学習

Backprop-Free Reinforcement Learning with Active Neural Generative Coding ( http://arxiv.org/abs/2107.07046v1 )

ライセンス: Link先を確認

Alexander Ororbia, Ankur Mali

(参考訳) ヒトでは知覚認知は感覚入力から情報の迅速な認識と抽出を促進する。この認識は、人間のエージェントが環境とどのように相互作用するかに大きく依存する。本研究では,動的環境における誤りのバックプロパゲーション(backprop)を伴わない動作駆動生成モデル学習のための計算フレームワークであるactive neural generative codingを提案する。具体的には,計画の認知理論からヒントを得て,少ない報酬でも操作できるインテリジェントエージェントを開発した。オンライン学習環境では,提案するモデリングフレームワークが深層Q-ラーニングモデルと競合する,いくつかの制御問題を実証する。我々のエージェントの堅牢な性能は、神経推論と学習のためのバックプロップフリーアプローチがゴール指向の行動を促進するという有望な証拠を提供する。

In humans, perceptual awareness facilitates the fast recognition and extraction of information from sensory input. This awareness largely depends on how the human agent interacts with the environment. In this work, we propose active neural generative coding, a computational framework for learning action-driven generative models without backpropagation of errors (backprop) in dynamic environments. Specifically, we develop an intelligent agent that operates even with sparse rewards, drawing inspiration from the cognitive theory of planning as inference. We demonstrate on several control problems, in the online learning setting, that our proposed modeling framework performs competitively with deep Q-learning models. The robust performance of our agent offers promising evidence that a backprop-free approach for neural inference and learning can drive goal-directed behavior.

翻訳日:2021-07-18 12:36:03 公開日:2021-07-10

# (参考訳) グラフ注意ネットワークを用いた薬物・標的相互作用予測

Drug-Target Interaction Prediction with Graph Attention networks ( http://arxiv.org/abs/2107.06099v1 )

ライセンス: CC BY 4.0

Haiyang Wang, Guangyu Zhou, Siqi Liu, Jyun-Yu Jiang and Wei Wang

(参考訳) モチベーション: 薬物と標的の相互作用を予測する(dti)は、プロテオミクスと医薬品研究の分野における関連性から、バイオインフォマティクスにおいてよく研究されているトピックである。このタスクには多くの機械学習手法がうまく適用されているが、DTIネットワークに固有の異種グラフ構造を活用することを目的としているものはほとんどない。 DTIのトポロジ構造と類似性をよりよく学習し,解釈するためには,グラフ構造から相互作用を予測する方法が望ましい。結果: DTI予測のためのエンドツーエンドフレームワークであるDTI-GAT(Drug-Target Interaction Prediction with Graph Attention Network)を提案する。 dti-gatは、相互作用パターンと薬物およびタンパク質配列の特徴の両方を利用する注意機構を備えたグラフ構造化データで動作するディープニューラルネットワークアーキテクチャを組み込んでいる。 DTI-GATは、DTIのトポロジ構造を自己注意機構で各ノードに異なる注意重みを割り当てることで解釈しやすくする。実験により、DTI-GATはバイナリDTI予測問題において、様々な最先端システムより優れていることが示された。さらに, 独立研究により, 従来の手法よりもモデルをより一般化できることが実証された。可用性: ソースコードとすべてのデータセットはhttps://github.com/Haiyang-W/DTI-GRAPHで公開されている。

Motivation: Predicting Drug-Target Interaction (DTI) is a well-studied topic in bioinformatics due to its relevance in the fields of proteomics and pharmaceutical research. Although many machine learning methods have been successfully applied in this task, few of them aim at leveraging the inherent heterogeneous graph structure in the DTI network to address the challenge. For better learning and interpreting the DTI topological structure and the similarity, it is desirable to have methods specifically for predicting interactions from the graph structure. Results: We present an end-to-end framework, DTI-GAT (Drug-Target Interaction prediction with Graph Attention networks) for DTI predictions. DTI-GAT incorporates a deep neural network architecture that operates on graph-structured data with the attention mechanism, which leverages both the interaction patterns and the features of drug and protein sequences. DTI-GAT facilitates the interpretation of the DTI topological structure by assigning different attention weights to each node with the self-attention mechanism. Experimental evaluations show that DTI-GAT outperforms various state-of-the-art systems on the binary DTI prediction problem. Moreover, the independent study results further demonstrate that our model can be generalized better than other conventional methods. Availability: The source code and all datasets are available at https://github.com/Haiyang-W/DTI-GRAPH

翻訳日:2021-07-15 05:53:38 公開日:2021-07-10

# (参考訳) 確率的機械学習を用いたネットワークトラフィックの実用的・構成的分類

Practical and Configurable Network Traffic Classification Using Probabilistic Machine Learning ( http://arxiv.org/abs/2107.06080v1 )

ライセンス: CC BY 4.0

Jiahui Chen, Joe Breen, Jeff M. Phillips, Jacobus Van der Merwe

(参考訳) 広く適用可能で高精度なネットワークトラフィック分類は、多くのネットワークセキュリティおよび管理タスクに有用である。フレキシブルで構成が容易な分類フレームワークは理想的であり、さまざまなネットワークで使用するためにカスタマイズすることができる。本稿では,未知のトラフィックから既知の,あるいは承認されたトラフィックを識別するために,パケットのシーケンスの統計のみに依存する,高度に構成可能で柔軟な機械学習トラフィック分類手法を提案する。提案手法は,確率推定に基づいて,分類決定のための確実性尺度を提供し,トラフィックを調整可能な確実性レベルに分類することができる。分類方法は, 異なる分類目標を優先して, 異なる分類シナリオにも適用できる。高性能コンピューティングネットワーク環境における実世界のトラフィックに対して,当社の分類手法とその構成がどのように機能するかを実証する。

Network traffic classification that is widely applicable and highly accurate is valuable for many network security and management tasks. A flexible and easily configurable classification framework is ideal, as it can be customized for use in a wide variety of networks. In this paper, we propose a highly configurable and flexible machine learning traffic classification method that relies only on statistics of sequences of packets to distinguish known, or approved, traffic from unknown traffic. Our method is based on likelihood estimation, provides a measure of certainty for classification decisions, and can classify traffic at adjustable certainty levels. Our classification method can also be applied in different classification scenarios, each prioritizing a different classification goal. We demonstrate how our classification scheme and all its configurations perform well on real-world traffic from a high performance computing network environment.

翻訳日:2021-07-15 05:36:16 公開日:2021-07-10

# (参考訳) 衛星システムのテレメトリとテレコマンドに対する間欠的妨害と学習駆動検出戦略

Intermittent Jamming against Telemetry and Telecommand of Satellite Systems and A Learning-driven Detection Strategy ( http://arxiv.org/abs/2107.06181v1 )

ライセンス: CC BY-SA 4.0

Selen Gecgel and Gunes Karabulut Kurt

(参考訳) 第6世代ネットワーク (6g) に向けて、衛星通信システム、特に低軌道 (leo) ネットワークは、そのユニークで包括的な能力のために期待されている。これらのアドバンテージには,セキュリティ脆弱性やハイブリッドシステムの管理,モビリティ向上など,さまざまな課題が伴っている。本稿では,まず,衛星システムのサイバー物理特性を考慮し,物理層におけるセキュリティの欠如を概念的枠組みで解決し,攻撃の可能性を明らかにする。次に、学習駆動型検出方式を提案し、軽量畳み込みニューラルネットワーク(CNN)を設計する。設計したCNNアーキテクチャの性能は、一般的な機械学習アルゴリズムであるサポートベクターマシン(SVM)と比較される。その結果,提案手法を用いて衛星システムに対する欠陥攻撃を検出することができた。

Towards sixth-generation networks (6G), satellite communication systems, especially based on Low Earth Orbit (LEO) networks, become promising due to their unique and comprehensive capabilities. These advantages are accompanied by a variety of challenges such as security vulnerabilities, management of hybrid systems, and high mobility. In this paper, firstly, a security deficiency in the physical layer is addressed with a conceptual framework, considering the cyber-physical nature of the satellite systems, highlighting the potential attacks. Secondly, a learning-driven detection scheme is proposed, and the lightweight convolutional neural network (CNN) is designed. The performance of the designed CNN architecture is compared with a prevalent machine learning algorithm, support vector machine (SVM). The results show that deficiency attacks against the satellite systems can be detected by employing the proposed scheme.

翻訳日:2021-07-15 05:20:03 公開日:2021-07-10

# 因果分析を用いた概念的深層学習説明

Using Causal Analysis for Conceptual Deep Learning Explanation ( http://arxiv.org/abs/2107.06098v1 )

ライセンス: Link先を確認

Sumedha Singla, Stephen Wallace, Sofia Triantafillou, Kayhan Batmanghelich

(参考訳) モデル説明責任は、医療における信頼できる機械学習モデルの作成に不可欠である。理想的な説明はドメインエキスパートの意思決定プロセスに似ており、臨床医にとって意味のある概念や用語を用いて表現される。このような説明を提供するため、まず分類器の隠れた単位を臨床的に関連する概念に関連付ける。胸部X線画像に付随する放射線学報告を利用して概念を定義した。線形スパースロジスティック回帰法を用いて,概念と隠れ単位の疎結合を発見する。同定された単位が分類器の結果に真に影響を及ぼすようにするために、因果推論文献およびより具体的には、反事実的介入による仲介分析のツールを採用する。最後に, 放射線学者に表現されたすべての概念を簡単な決定規則に変換するために, 低深度決定木を構築した。臨床知識と整合した世界的説明が得られた胸部X線データセットを用いて,我々のアプローチを評価した。

Model explainability is essential for the creation of trustworthy Machine Learning models in healthcare. An ideal explanation resembles the decision-making process of a domain expert and is expressed using concepts or terminology that is meaningful to the clinicians. To provide such an explanation, we first associate the hidden units of the classifier to clinically relevant concepts. We take advantage of radiology reports accompanying the chest X-ray images to define concepts. We discover sparse associations between concepts and hidden units using a linear sparse logistic regression. To ensure that the identified units truly influence the classifier's outcome, we adopt tools from Causal Inference literature and, more specifically, mediation analysis through counterfactual interventions. Finally, we construct a low-depth decision tree to translate all the discovered concepts into a straightforward decision rule, expressed to the radiologist. We evaluated our approach on a large chest x-ray dataset, where our model produces a global explanation consistent with clinical knowledge.

翻訳日:2021-07-14 14:51:58 公開日:2021-07-10

# (参考訳) 雑音下での表情認識に基づくコンセンサス協調学習と知識蒸留

Consensual Collaborative Training And Knowledge Distillation Based Facial Expression Recognition Under Noisy Annotations ( http://arxiv.org/abs/2107.04746v1 )

ライセンス: CC BY 4.0

Darshan Gera, S. Balasubramanian

(参考訳) 大規模表情データセットのラベルにおけるノイズの存在は、野生における顔表情認識(FER)にとって重要な課題である。学習の初期段階では、ディープネットワークはクリーンデータに適合する。そして最終的に、FER性能を制限する記憶能力のために、ノイズの多いラベルに過度に適合し始める。本研究は,CCT(Consensual Collaborative Training)フレームワークと呼ばれる,ノイズラベルの存在下での効果的なトレーニング戦略を提案する。 CCTは、騒音分布を仮定することなく、監督損失と整合損失の凸結合を用いて3つのネットワークを共同で訓練する。動的遷移機構は、早期学習における監督損失から、後期のネットワーク間の予測のコンセンサスに対する一貫性損失への移行に使用される。単純な知識蒸留スキームに基づいた単一のネットワークを用いて推論を行う。提案手法の有効性は,合成および実雑音FERデータセット上で実証される。さらに、約5K画像の大規模なテストサブセットを、16の異なるアノテータの群衆知恵を使ってFECデータセットからアノテートし、信頼できるラベルを推測する。 cctもその上で検証される。 FERDB (90.84%) FERPlus (89.99%) および AffectNet (66%) のベンチマークでは、最先端のパフォーマンスが報告されている。私たちのコードはhttps://github.com/1980x/CCTで公開されています。

Presence of noise in the labels of large scale facial expression datasets has been a key challenge towards Facial Expression Recognition (FER) in the wild. During early learning stage, deep networks fit on clean data. Then, eventually, they start overfitting on noisy labels due to their memorization ability, which limits FER performance. This work proposes an effective training strategy in the presence of noisy labels, called as Consensual Collaborative Training (CCT) framework. CCT co-trains three networks jointly using a convex combination of supervision loss and consistency loss, without making any assumption about the noise distribution. A dynamic transition mechanism is used to move from supervision loss in early learning to consistency loss for consensus of predictions among networks in the later stage. Inference is done using a single network based on a simple knowledge distillation scheme. Effectiveness of the proposed framework is demonstrated on synthetic as well as real noisy FER datasets. In addition, a large test subset of around 5K images is annotated from the FEC dataset using crowd wisdom of 16 different annotators and reliable labels are inferred. CCT is also validated on it. State-of-the-art performance is reported on the benchmark FER datasets RAFDB (90.84%) FERPlus (89.99%) and AffectNet (66%). Our codes are available at https://github.com/1980x/CCT.

翻訳日:2021-07-14 09:33:10 公開日:2021-07-10

# (参考訳) 対向的摂動に対する自律走行物体カテゴリー検出のレジリエンス

Resilience of Autonomous Vehicle Object Category Detection to Universal Adversarial Perturbations ( http://arxiv.org/abs/2107.04749v1 )

ライセンス: CC BY 4.0

Mohammad Nayeem Teli and Seungwon Oh

(参考訳) 敵対的な事例に対するディープニューラルネットワークの脆弱性のため、過去数年間、敵の攻撃と防御に関する多くの研究が急増している。しかし、ほとんどの研究者が当然と捉えている敵の攻撃や物体検出のアプローチについては、従来の見方があるようだ。本研究では,クラスレベルでの物体検出に対する普遍摂動の影響を評価することによって,これらの手順に対する新たな視点を提供する。自律運転に関する注意深く計算されたデータセットに適用する。我々は、人、車、トラック、停止標識、COCOデータセットからの交通信号の5つのカテゴリの画像に対して、Faster-RCNNオブジェクト検出器を使用し、Universal Dense Object Suppressionアルゴリズムを用いて画像を注意深く摂動する。その結果、人、車、信号機、トラック、停止標識は、その順序で(少なくとも)普遍的な摂動に対して回復力があることが示された。私たちの知る限りでは、このようなランキングが確立されたのはこれが初めてで、自動運転車に関するデータセットのセキュリティとオブジェクト検出全般において重要な意味を持つ。

Due to the vulnerability of deep neural networks to adversarial examples, numerous works on adversarial attacks and defenses have been burgeoning over the past several years. However, there seem to be some conventional views regarding adversarial attacks and object detection approaches that most researchers take for granted. In this work, we bring a fresh perspective on those procedures by evaluating the impact of universal perturbations on object detection at a class-level. We apply it to a carefully curated data set related to autonomous driving. We use Faster-RCNN object detector on images of five different categories: person, car, truck, stop sign and traffic light from the COCO data set, while carefully perturbing the images using Universal Dense Object Suppression algorithm. Our results indicate that person, car, traffic light, truck and stop sign are resilient in that order (most to least) to universal perturbations. To the best of our knowledge, this is the first time such a ranking has been established which is significant for the security of the data sets pertaining to autonomous vehicles and object detection in general.

翻訳日:2021-07-14 09:15:08 公開日:2021-07-10

# (参考訳) 箱をハックする: ディープラーニングの抽象化に基づくモニタ

Hack The Box: Fooling Deep Learning Abstraction-Based Monitors ( http://arxiv.org/abs/2107.04764v1 )

ライセンス: CC BY 4.0

Sara Hajj Ibrahim and Mohamed Nassar

(参考訳) ディープラーニングは、概念の深い階層に適応する機械学習の一種である。ディープラーニング分類器は、入力層における概念の最も基本的なバージョンと出力層における概念の最も抽象的なバージョン(クラスまたはラベルとしても知られる)をリンクする。しかし、一度有限個のクラスで訓練されたとき、深層学習モデルは与えられた入力がどのクラスにも属さず、単純にリンクできないと言う力を持っていない。非関連クラスの予測を正しく無効にすることは、文学において多くの点で取り組まれてきた難しい問題である。新規性検出は、新しい/見えないクラスに対して「知らない」出力を深層学習に与えます。それでも、新規性検出のセキュリティ面には注意が向けられていない。本稿では,抽象に基づく新奇性検出のケーススタディを考察し,敵のサンプルに対して頑健ではないことを示す。さらに,深層学習分類器を騙し,新奇な検出監視をバイパスする,逆行的なサンプル作成の可能性を示す。言い換えれば、これらの監視ボックスはハック可能である。新規検出自体が攻撃面となることを実証する。

Deep learning is a type of machine learning that adapts a deep hierarchy of concepts. Deep learning classifiers link the most basic version of concepts at the input layer to the most abstract version of concepts at the output layer, also known as a class or label. However, once trained over a finite set of classes, a deep learning model does not have the power to say that a given input does not belong to any of the classes and simply cannot be linked. Correctly invalidating the prediction of unrelated classes is a challenging problem that has been tackled in many ways in the literature. Novelty detection gives deep learning the ability to output "do not know" for novel/unseen classes. Still, no attention has been given to the security aspects of novelty detection. In this paper, we consider the case study of abstraction-based novelty detection and show that it is not robust against adversarial samples. Moreover, we show the feasibility of crafting adversarial samples that fool the deep learning classifier and bypass the novelty detection monitoring at the same time. In other words, these monitoring boxes are hackable. We demonstrate that novelty detection itself ends up as an attack surface.

翻訳日:2021-07-14 09:06:10 公開日:2021-07-10

# (参考訳) ls3: 反復タスクのロングホリゾン・バイスモータ制御のための潜在空間セーフセット

LS3: Latent Space Safe Sets for Long-Horizon Visuomotor Control of Iterative Tasks ( http://arxiv.org/abs/2107.04775v1 )

ライセンス: CC BY 4.0

Albert Wilcox and Ashwin Balakrishna and Brijen Thananjeyan and Joseph E. Gonzalez and Ken Goldberg

(参考訳) 強化学習(rl)アルゴリズムは、複雑な長時間ホリゾンタスクを学習するために高次元環境を探索することに成功したが、しばしば安全でない振る舞いを示し、探索が制限されていない場合に広範な環境相互作用を必要とする。動的に不確実な環境での安全な学習のための有望な戦略は、エージェントが確実にタスク成功(したがって安全)を保証できる状態に戻ることを要求することである。このアプローチは低次元環境では成功したが、画像などの高次元状態空間を持つ環境ではこの制約を強制することは困難である。そこで我々は,この手法を拡張した潜在空間セーフセット(ls3)を,準最適実演と学習力学モデルを用いて画像観察を伴う反復的・長期ホリゾンタスクに拡張し,タスク完了の可能性のある学習されたセーフセットの近傍への探索を制限する。シミュレーションにおける逐次プッシュタスクや物理的ケーブルルーティングタスクを含む4つの領域におけるLS3の評価を行った。 LS3は事前のタスク成功を利用して探索を制限し、制約を満たしながら事前のアルゴリズムよりも効率的に学習できることが判明した。コードと補足材料については https://tinyurl.com/latent-ss をご覧ください。

Reinforcement learning (RL) algorithms have shown impressive success in exploring high-dimensional environments to learn complex, long-horizon tasks, but can often exhibit unsafe behaviors and require extensive environment interaction when exploration is unconstrained. A promising strategy for safe learning in dynamically uncertain environments is requiring that the agent can robustly return to states where task success (and therefore safety) can be guaranteed. While this approach has been successful in low-dimensions, enforcing this constraint in environments with high-dimensional state spaces, such as images, is challenging. We present Latent Space Safe Sets (LS3), which extends this strategy to iterative, long-horizon tasks with image observations by using suboptimal demonstrations and a learned dynamics model to restrict exploration to the neighborhood of a learned Safe Set where task completion is likely. We evaluate LS3 on 4 domains, including a challenging sequential pushing task in simulation and a physical cable routing task. We find that LS3 can use prior task successes to restrict exploration and learn more efficiently than prior algorithms while satisfying constraints. See https://tinyurl.com/latent-ss for code and supplementary material.

翻訳日:2021-07-14 08:58:25 公開日:2021-07-10

# (参考訳) Speech2Video:ビデオ生成のためのクロスモーダル蒸留

Speech2Video: Cross-Modal Distillation for Speech to Video Generation ( http://arxiv.org/abs/2107.04806v1 )

ライセンス: CC BY 4.0

Shijing Si, Jianzong Wang, Xiaoyang Qu, Ning Cheng, Wenqi Wei, Xinghua Zhu and Jing Xiao

(参考訳) 本稿では,音声のみから発声顔映像生成の新たな課題について検討する。音声対ビデオ生成技術は、エンターテイメント、カスタマーサービス、人間とコンピュータの相互作用産業に興味深い応用をもたらす可能性がある。実際、音声の音色、アクセント、速度は、話者の外観に関連する豊富な情報を含んでいる。この課題は主に、異なる視覚特性を音声信号から切り離すことである。本稿では,不規則なビデオ入力から絡み合った感情やアイデンティティ情報を抽出する軽量なクロスモーダル蒸留法を提案する。抽出した特徴は、生成的対向ネットワークによって音声合成ビデオクリップに統合される。慎重に考案された識別器を用いて、提案するフレームワークは現実的な生成結果を達成する。観察された個人による実験では、提案手法が発話のみから感情表現を捉え、映像出力に自発的な顔の動きを生じさせることが示されている。話者の静的画像と音声を結合したベースライン法と比較すると,提案手法の結果はほぼ区別がつかない。また,提案手法は,映像中の感情表現の面で既存のアルゴリズムを上回っていることを示す。

This paper investigates a novel task of talking face video generation solely from speeches. The speech-to-video generation technique can spark interesting applications in entertainment, customer service, and human-computer-interaction industries. Indeed, the timbre, accent and speed in speeches could contain rich information relevant to speakers' appearance. The challenge mainly lies in disentangling the distinct visual attributes from audio signals. In this article, we propose a light-weight, cross-modal distillation method to extract disentangled emotional and identity information from unlabelled video inputs. The extracted features are then integrated by a generative adversarial network into talking face video clips. With carefully crafted discriminators, the proposed framework achieves realistic generation results. Experiments with observed individuals demonstrated that the proposed framework captures the emotional expressions solely from speeches, and produces spontaneous facial motion in the video output. Compared to the baseline method where speeches are combined with a static image of the speaker, the results of the proposed framework is almost indistinguishable. User studies also show that the proposed method outperforms the existing algorithms in terms of emotion expression in the generated videos.

翻訳日:2021-07-14 08:38:34 公開日:2021-07-10

# (参考訳) 胸部ctにおけるcov19-ct-dbベースラインの改善

COVID Detection in Chest CTs: Improving the Baseline on COV19-CT-DB ( http://arxiv.org/abs/2107.04808v1 )

ライセンス: CC BY 4.0

Radu Miron, Cosmin Moisii, Sergiu Dinu, Mihaela Breaban

(参考訳) 胸部CTにおける深層学習に基づく3つの異なるアプローチの比較検討を行った。最初のアプローチは3次元畳み込みを伴うボリュームトリクティックなアプローチで、他の2つのアプローチは最初はスライスワイズ分類を行い、その後ボリュームレベルで結果を集約する。実験はCOV19-CT-DBデータセット上で実施され、ICCV 2021内のMIA-COV19Dコンペティションによって提起された課題に対処することを目的としている。検証サブセットの最良の結果はマクロF1スコアの0.92に達し、オーガナイザが設定したベースラインスコアの0.70を大幅に改善する。

The paper presents a comparative analysis of three distinct approaches based on deep learning for COVID-19 detection in chest CTs. The first approach is a volumetric one, involving 3D convolutions, while the other two approaches perform at first slice-wise classification and then aggregate the results at the volume level. The experiments are carried on the COV19-CT-DB dataset, with the aim of addressing the challenge raised by the MIA-COV19D Competition within ICCV 2021. Our best results on the validation subset reach a macro-F1 score of 0.92, which improves considerably the baseline score of 0.70 set by the organizers.

翻訳日:2021-07-14 08:27:51 公開日:2021-07-10

# (参考訳) BSDA-Net:OCTA画像のセグメンテーションと分類のための境界形状と距離を考慮した共同学習フレームワーク

BSDA-Net: A Boundary Shape and Distance Aware Joint Learning Framework for Segmenting and Classifying OCTA Images ( http://arxiv.org/abs/2107.04823v1 )

ライセンス: CC BY 4.0

Li Lin, Zhonghua Wang, Jiewei Wu, Yijin Huang, Junyan Lyu, Pujin Cheng, Jiong Wu, Xiaoying Tang

(参考訳) 光コヒーレンストモグラフィアンギオグラフィー(OCTA)は、新しい非侵襲的イメージング技術であり、網膜層にまたがる血管と胎児の血管ゾーン(FAZ)の可視化を可能にする。臨床研究は、fazの形態と輪郭の不規則性が様々な眼疾患の重要なバイオマーカーであることを示唆している。したがって、FAZの正確なセグメンテーションは、非常に興味深い。また、FAZの特徴が深層診断分類網の性能を向上させるという研究報告はない。本稿では,OCTA画像からのFAZセグメンテーションと診断のためのマルチレベル境界形状と距離認識型共同学習フレームワークBSDA-Netを提案する。 2つの補助枝、すなわち境界熱マップ回帰と符号付き距離マップ再構成枝がセグメンテーション部に加えて構築され、セグメンテーション性能が向上し、より正確なFAZ輪郭とより少ないアウトリーが生じる。さらに、上記の3つの枝(形状、大きさ、境界、FAZの符号付き方向距離マップ)の低レベル特徴と高レベル特徴は、診断分類器の特徴と階層的に融合する。大規模な実験により、提案したBSDA-NetはOCTA-500、OCTAGON、FAZIDデータセットの最先端のセグメンテーションと分類結果が得られることがわかった。

Optical coherence tomography angiography (OCTA) is a novel non-invasive imaging technique that allows visualizations of vasculature and foveal avascular zone (FAZ) across retinal layers. Clinical researches suggest that the morphology and contour irregularity of FAZ are important biomarkers of various ocular pathologies. Therefore, precise segmentation of FAZ has great clinical interest. Also, there is no existing research reporting that FAZ features can improve the performance of deep diagnostic classification networks. In this paper, we propose a novel multi-level boundary shape and distance aware joint learning framework, named BSDA-Net, for FAZ segmentation and diagnostic classification from OCTA images. Two auxiliary branches, namely boundary heatmap regression and signed distance map reconstruction branches, are constructed in addition to the segmentation branch to improve the segmentation performance, resulting in more accurate FAZ contours and fewer outliers. Moreover, both low-level and high-level features from the aforementioned three branches, including shape, size, boundary, and signed directional distance map of FAZ, are fused hierarchically with features from the diagnostic classifier. Through extensive experiments, the proposed BSDA-Net is found to yield state-of-the-art segmentation and classification results on the OCTA-500, OCTAGON, and FAZID datasets.

翻訳日:2021-07-14 08:20:26 公開日:2021-07-10

# (参考訳) CSL-YOLO:エッジコンピューティングのための新しい軽量物体検出システム

CSL-YOLO: A New Lightweight Object Detection System for Edge Computing ( http://arxiv.org/abs/2107.04829v1 )

ライセンス: CC BY 4.0

Yu-Ming Zhang, Chun-Chieh Lee, Jun-Wei Hsieh, Kuo-Chin Fan

(参考訳) 軽量な物体検出器の開発は計算資源が限られているため不可欠である。計算コストを削減するために、冗長な特徴の生成方法が重要な役割を果たす。本稿では,安価な操作から冗長な特徴を生成するために,新しい軽量な畳み込み方式であるクロスステージ軽量モジュールを提案する。中間展開段階では, ポイントワイズ畳み込みを深さ方向畳み込みに置き換え, 候補特徴量を生成する。提案するcslモジュールは計算コストを大幅に削減できる。 MS-COCOで行われた実験により、提案されたCSLモジュールはConvolution-3x3の適合能力を近似できることが示された。最後に、このモジュールを用いて軽量検出器CSL-YOLOを構築し、Tiny-YOLOv4よりも43%のFLOPと52%のパラメータで検出性能を向上させる。

The development of lightweight object detectors is essential due to the limited computation resources. To reduce the computation cost, how to generate redundant features plays a significant role. This paper proposes a new lightweight Convolution method Cross-Stage Lightweight (CSL) Module, to generate redundant features from cheap operations. In the intermediate expansion stage, we replaced Pointwise Convolution with Depthwise Convolution to produce candidate features. The proposed CSL-Module can reduce the computation cost significantly. Experiments conducted at MS-COCO show that the proposed CSL-Module can approximate the fitting ability of Convolution-3x3. Finally, we use the module to construct a lightweight detector CSL-YOLO, achieving better detection performance with only 43% FLOPs and 52% parameters than Tiny-YOLOv4.

翻訳日:2021-07-14 08:08:55 公開日:2021-07-10

# (参考訳) BERTファインチューニング改善のための雑音安定化規則化

Noise Stability Regularization for Improving BERT Fine-tuning ( http://arxiv.org/abs/2107.04835v1 )

ライセンス: CC BY 4.0

Hang Hua, Xingjian Li, Dejing Dou, Cheng-Zhong Xu, Jiebo Luo

(参考訳) BERTのような微調整済みの言語モデルは、様々なNLPタスクでリーダーボードを支配する一般的なプラクティスとなっている。近年の成功と広く採用されているにもかかわらず、このプロセスは少数のトレーニングサンプルしか入手できない場合、不安定である。この過程の脆さは、しばしばランダムな種子に対する感受性によって反映される。本稿では,近年の文献(Arora et al., 2018, Sanyal et al., 2020)で研究されているディープネットの雑音安定性特性に基づいて,この問題に取り組むことを提案する。具体的には,LNSR(Layer-wise Noise Stability Regularization)と呼ばれるNLPタスクの微調整を改善するための,新しい効果的な正規化手法を提案する。入力に雑音を加える理論を拡張し、この手法がより安定した正規化効果を与えることを示す。良好な性能のモデルではノイズに対する感度が低く,LNSRによる微調整では明らかに一般化性と安定性が向上することが実験的に確認された。さらに,L2-SP (Li et al., 2018), Mixout (Lee et al., 2020), SMART (Jiang et al., 2020) など,最先端のアルゴリズムに対する利点も示す。

Fine-tuning pre-trained language models such as BERT has become a common practice dominating leaderboards across various NLP tasks. Despite its recent success and wide adoption, this process is unstable when there are only a small number of training samples available. The brittleness of this process is often reflected by the sensitivity to random seeds. In this paper, we propose to tackle this problem based on the noise stability property of deep nets, which is investigated in recent literature (Arora et al., 2018; Sanyal et al., 2020). Specifically, we introduce a novel and effective regularization method to improve fine-tuning on NLP tasks, referred to as Layer-wise Noise Stability Regularization (LNSR). We extend the theories about adding noise to the input and prove that our method gives a stabler regularization effect. We provide supportive evidence by experimentally confirming that well-performing models show a low sensitivity to noise and fine-tuning with LNSR exhibits clearly higher generalizability and stability. Furthermore, our method also demonstrates advantages over other state-of-the-art algorithms including L2-SP (Li et al., 2018), Mixout (Lee et al., 2020) and SMART (Jiang et al., 2020).

翻訳日:2021-07-14 07:59:53 公開日:2021-07-10

# (参考訳) 伝達学習による伝播認識型ソーシャルレコメンデーション

Propagation-aware Social Recommendation by Transfer Learning ( http://arxiv.org/abs/2107.04846v1 )

ライセンス: CC BY 4.0

Haodong Chang and Yabo Chu

(参考訳) ソーシャル・アウェア・レコメンデーションのアプローチは、従来のレコメンデーションシステムのデータスパーシティ問題を解決する効果的な方法として認識されてきた。背景にある前提は、ソーシャルユーザ-ユーザ接続の知識を共有して、ユーザ-テーマインタラクションのドメインに転送することで、ユーザの好みの学習を支援する、というものだ。しかし、既存のアプローチのほとんどは、転送学習中にユーザ間の1次接続を採用するだけで、それらの接続をより高い順序で無視する。より優れたレコメンデーションパフォーマンスは、高次の社会関係の恩恵を受けることができると我々は主張する。本稿では,社会関係の伝播に基づくPTLN(Propagation-Aware Transfer Learning Network)を提案する。我々は、ソーシャルネットワークに隠された共有知識をよりよく掘り下げ、レコメンデーションパフォーマンスをさらに向上させることを目指している。特に、社会的影響について2つの側面から検討する: (a) 上位の友人は秩序バイアスによって考慮されている; (b) 同じ順番の異なる友人は、注意のメカニズムによる推薦に対して明らかに重要である。さらに,ソーシャルリレーションとユーザ・テーマ間インタラクションのギャップを埋めるために,新たな正規化をデザインする。 2つの実世界のデータセットについて広範な実験を行い、特に歴史的な相互作用が少ないコールドスタートユーザーに対して、ランキングの正確性という点で他のデータセットを上回ります。

Social-aware recommendation approaches have been recognized as an effective way to solve the data sparsity issue of traditional recommender systems. The assumption behind is that the knowledge in social user-user connections can be shared and transferred to the domain of user-item interactions, whereby to help learn user preferences. However, most existing approaches merely adopt the first-order connections among users during transfer learning, ignoring those connections in higher orders. We argue that better recommendation performance can also benefit from high-order social relations. In this paper, we propose a novel Propagation-aware Transfer Learning Network (PTLN) based on the propagation of social relations. We aim to better mine the sharing knowledge hidden in social networks and thus further improve recommendation performance. Specifically, we explore social influence in two aspects: (a) higher-order friends have been taken into consideration by order bias; (b) different friends in the same order will have distinct importance for recommendation by an attention mechanism. Besides, we design a novel regularization to bridge the gap between social relations and user-item interactions. We conduct extensive experiments on two real-world datasets and beat other counterparts in terms of ranking accuracy, especially for the cold-start users with few historical interactions.

翻訳日:2021-07-14 07:44:24 公開日:2021-07-10

# (参考訳) SynPick: 動的ビンピッキングシーン理解のためのデータセット

SynPick: A Dataset for Dynamic Bin Picking Scene Understanding ( http://arxiv.org/abs/2107.04852v1 )

ライセンス: CC BY 4.0

Arul Selvam Periyasamy, Max Schwarz, and Sven Behnke

(参考訳) ビンピッキングシナリオにおける動的シーン理解のための合成データセットであるSynPickを提案する。既存のデータセットとは対照的に、私たちのデータセットは、よく知られたAmazon Robotics Challenge(ARC)にインスパイアされた、現実的な産業用アプリケーションドメインにあり、ARC 2017のために開発されたピッキングヒューリスティックによって選択された、真のピッキングアクションを備えた動的シーンを備えています。データセットは人気のあるBOPデータセットフォーマットと互換性がある。本稿では、NVIDIA PhysX物理エンジンを用いたオブジェクト配置生成と操作シミュレーションを含むデータセット生成プロセスについて詳述する。大きなアクションスペースをカバーするために、ターゲットとターゲットを絞ったピッキングアクションとランダムな移動アクションを実行します。オブジェクト認識のためのベースラインを確立するために、データセット上で最先端のポーズ推定手法を評価する。単純なフィルタリング手法であっても、単発推定ではなく、操作中のトラッキングポーズの有用性を実証する。ジェネレータのソースコードとデータセットが公開されている。

We present SynPick, a synthetic dataset for dynamic scene understanding in bin-picking scenarios. In contrast to existing datasets, our dataset is both situated in a realistic industrial application domain -- inspired by the well-known Amazon Robotics Challenge (ARC) -- and features dynamic scenes with authentic picking actions as chosen by our picking heuristic developed for the ARC 2017. The dataset is compatible with the popular BOP dataset format. We describe the dataset generation process in detail, including object arrangement generation and manipulation simulation using the NVIDIA PhysX physics engine. To cover a large action space, we perform untargeted and targeted picking actions, as well as random moving actions. To establish a baseline for object perception, a state-of-the-art pose estimation approach is evaluated on the dataset. We demonstrate the usefulness of tracking poses during manipulation instead of single-shot estimation even with a naive filtering approach. The generator source code and dataset are publicly available.

翻訳日:2021-07-14 07:35:45 公開日:2021-07-10

# (参考訳) Marginalized Corrupted Distributions によるカーネル平均推定

Kernel Mean Estimation by Marginalized Corrupted Distributions ( http://arxiv.org/abs/2107.04855v1 )

ライセンス: CC0 1.0

Xiaobo Xia, Shuo Shan, Mingming Gong, Nannan Wang, Fei Gao, Haikun Wei, Tongliang Liu

(参考訳) 再生カーネルヒルベルト空間におけるカーネル平均の推定は、多くのカーネル学習アルゴリズムにおいて重要な要素である。有限サンプルが与えられた場合、ターゲットカーネル平均の標準推定値は経験平均である。以前の研究では、より良い推定器は縮小法によって構築できることを示した。そこで本研究では,既知の分布からのノイズを伴うデータサンプルを腐敗させ,カーネル平均推定器と呼ばれる新しいカーネル平均推定器を提案する。理論的には、限界化されたカーネル平均推定器は、カーネル平均推定に暗黙の正規化をもたらす。実験により,カーネル平均推定器が既存の推定器よりもはるかに低い推定誤差が得られることを示す。

Estimating the kernel mean in a reproducing kernel Hilbert space is a critical component in many kernel learning algorithms. Given a finite sample, the standard estimate of the target kernel mean is the empirical average. Previous works have shown that better estimators can be constructed by shrinkage methods. In this work, we propose to corrupt data examples with noise from known distributions and present a new kernel mean estimator, called the marginalized kernel mean estimator, which estimates kernel mean under the corrupted distribution. Theoretically, we show that the marginalized kernel mean estimator introduces implicit regularization in kernel mean estimation. Empirically, we show on a variety of datasets that the marginalized kernel mean estimator obtains much lower estimation error than the existing estimators.

翻訳日:2021-07-14 07:24:11 公開日:2021-07-10

# (参考訳) Dense-Sparse Deep CNN Training for Image Denoising

Dense-Sparse Deep CNN Training for Image Denoising ( http://arxiv.org/abs/2107.04857v1 )

ライセンス: CC BY 4.0

Basit O. Alawode, Mudassir Masood, Tarig Ballal, and Tareq Al-Naffouri

(参考訳) 近年,畳み込みニューラルネットワーク(cnns)などの深層学習(dl)手法が画像デノイジングの分野で注目されている。これは、bm3dのような最先端の古典的な画像デノイジングアルゴリズムを超える能力が証明されたためである。 deep denoising cnns (dncnns) は、多くのフィードフォワード畳み込み層を使用し、バッチ正規化と残差学習の正規化法を追加し、デニューズ性能を大幅に改善する。しかし、これは膨大な数のトレーニング可能なパラメータを犠牲にしている。本稿では,パラメータ数を削減しつつ,同等の性能を実現することで,この問題に対処した。本研究では,DSDトレーニング手法を用いて,トレーニングネットワークによって得られる性能向上からモチベーションを導出する。我々はこのトレーニングアプローチをDnCNN(RDnCNN)ネットワークに拡張し、パラメータが大幅に減少し、DnCNNに匹敵する性能を持つ高速な復調ネットワークを実現する。

Recently, deep learning (DL) methods such as convolutional neural networks (CNNs) have gained prominence in the area of image denoising. This is owing to their proven ability to surpass state-of-the-art classical image denoising algorithms such as BM3D. Deep denoising CNNs (DnCNNs) use many feedforward convolution layers with added regularization methods of batch normalization and residual learning to improve denoising performance significantly. However, this comes at the expense of a huge number of trainable parameters. In this paper, we address this issue by reducing the number of parameters while achieving a comparable level of performance. We derive motivation from the improved performance obtained by training networks using the dense-sparse-dense (DSD) training approach. We extend this training approach to a reduced DnCNN (RDnCNN) network resulting in a faster denoising network with significantly reduced parameters and comparable performance to the DnCNN.

翻訳日:2021-07-14 07:05:43 公開日:2021-07-10

# (参考訳) 標準点オートエンコーダを用いた3次元距離対応学習

Learning 3D Dense Correspondence via Canonical Point Autoencoder ( http://arxiv.org/abs/2107.04867v1 )

ライセンス: CC BY 4.0

An-Chieh Cheng, Xueting Li, Min Sun, Ming-Hsuan Yang, Sifei Liu

(参考訳) 同一カテゴリの3次元形状間の密接な対応を予測できる標準点オートエンコーダ(CPAE)を提案する。オートエンコーダは、2つの重要な機能を実行する: (a) 任意に順序付けられた点クラウドを標準的なプリミティブ、例えば球体に符号化し、(b) プリミティブを元の入力インスタンス形状に復号する。ボトルネックに置かれているように、このプリミティブは、すべての無秩序点雲を正準面上にマッピングし、順序付けされた方法で再構築する重要な役割を果たす。一度訓練すると、原始曲面上の同じ位置にマッピングされた異なる形状のインスタンスからのポイントは、対応のペアであると決定される。本手法ではアノテーションや自己管理部分分割ネットワークを一切必要とせず,不整合入力点雲を処理できる。 3次元セマンティクスキーポイント転送と部分セグメンテーション伝達の実験結果は,本モデルが最先端対応学習法に対して有利に機能することを示す。

We propose a canonical point autoencoder (CPAE) that predicts dense correspondences between 3D shapes of the same category. The autoencoder performs two key functions: (a) encoding an arbitrarily ordered point cloud to a canonical primitive, e.g., a sphere, and (b) decoding the primitive back to the original input instance shape. As being placed in the bottleneck, this primitive plays a key role to map all the unordered point clouds on the canonical surface and to be reconstructed in an ordered fashion. Once trained, points from different shape instances that are mapped to the same locations on the primitive surface are determined to be a pair of correspondence. Our method does not require any form of annotation or self-supervised part segmentation network and can handle unaligned input point clouds. Experimental results on 3D semantic keypoint transfer and part segmentation transfer show that our model performs favorably against state-of-the-art correspondence learning methods.

翻訳日:2021-07-14 06:57:43 公開日:2021-07-10

# (参考訳) patentminer: コンテキスト強調と知識誘導グラフによる特許空白マイニング

PatentMiner: Patent Vacancy Mining via Context-enhanced and Knowledge-guided Graph Attention ( http://arxiv.org/abs/2107.04880v1 )

ライセンス: CC BY 4.0

Gaochen Wu, Bin Xu, Yuxin Qin, Fei Kong, Bangchang Liu, Hongwen Zhao, Dejie Chang

(参考訳) 知識グラフを構築することで特許研究を行う作業は少ないが、特許文書を用いて特許知識グラフを構築したり、最新の自然言語処理手法を組み合わせて既存の特許に隠されたリッチなセマンティックな関係を掘り下げたり、新たな特許を予測したりしない。本稿では,知識グラフ(KG)とグラフアテンション機構に基づいて,リッチなセマンティック知識をマイニングし,新たな潜在的な特許を予測するために,PatentMinerという新しい特許空白予測手法を提案する。まず、特許に関する知識グラフ(例)。年) 特許書類から名称の実体の認識及び関係の延長を行うことにより構成される。第2に、構築した知識グラフにおいてリンク予測を行い、潜在的な三重項を探索するコモンネバー法(CNM)、グラフ注意ネットワーク(GAT)、コンテキスト強化グラフ注意ネットワーク(CGAT)を提案する。最後に,特許は知識グラフ上で,共起関係(co-occurrence relationship)により定義される。すなわち,各特許は,知識グラフ内のすべての実体と共起関係を含む完全連結部分グラフとして表現される。さらに,新たなパテントとして新たに追加された予測リンクを備えた完全連結部分グラフを予測する新しい特許予測タスクを提案する。実験の結果,提案手法は,新たな特許を正しく予測でき,文脈対応グラフアテンションネットワークはベースラインよりもはるかに優れていることがわかった。一方、我々の提案する特許空白予測タスクには、まだ未解決の余地がある。

Although there are a small number of work to conduct patent research by building knowledge graph, but without constructing patent knowledge graph using patent documents and combining latest natural language processing methods to mine hidden rich semantic relationships in existing patents and predict new possible patents. In this paper, we propose a new patent vacancy prediction approach named PatentMiner to mine rich semantic knowledge and predict new potential patents based on knowledge graph (KG) and graph attention mechanism. Firstly, patent knowledge graph over time (e.g. year) is constructed by carrying out named entity recognition and relation extrac-tion from patent documents. Secondly, Common Neighbor Method (CNM), Graph Attention Networks (GAT) and Context-enhanced Graph Attention Networks (CGAT) are proposed to perform link prediction in the constructed knowledge graph to dig out the potential triples. Finally, patents are defined on the knowledge graph by means of co-occurrence relationship, that is, each patent is represented as a fully connected subgraph containing all its entities and co-occurrence relationships of the patent in the knowledge graph; Furthermore, we propose a new patent prediction task which predicts a fully connected subgraph with newly added prediction links as a new pa-tent. The experimental results demonstrate that our proposed patent predic-tion approach can correctly predict new patents and Context-enhanced Graph Attention Networks is much better than the baseline. Meanwhile, our proposed patent vacancy prediction task still has significant room to im-prove.

翻訳日:2021-07-14 06:41:34 公開日:2021-07-10

# (参考訳) ロバストな医用画像解析のためのディープニューラルネットワークにおける分布検出と敵攻撃の概要

Out of Distribution Detection and Adversarial Attacks on Deep Neural Networks for Robust Medical Image Analysis ( http://arxiv.org/abs/2107.04882v1 )

ライセンス: CC BY 4.0

Anisie Uwimana1, Ransalu Senanayake

(参考訳) 深層学習モデルは、医用画像解析において一般的な選択肢となっている。しかし、深層学習モデルの一般化性能の低さは、医学的応用にとって堅牢性が不可欠であるため、実世界での展開を妨げている。例えば、最先端の畳み込みニューラルネットワーク(CNN)は、トレーニング分布から統計的に離れた敵のサンプルやサンプルを検出することができない。本研究は, マラリア寄生細胞と非感染細胞の分類において, マハラノビス距離に基づく信頼性スコア, および異常サンプルの検出方法の信頼性を実験的に評価した。その結果,mahalanobis confidence score detectorはディープラーニングモデルの性能と頑健性が向上し,out-of-distribution (ood) とadversarial sampleの両方において最先端のパフォーマンスが得られた。

Deep learning models have become a popular choice for medical image analysis. However, the poor generalization performance of deep learning models limits them from being deployed in the real world as robustness is critical for medical applications. For instance, the state-of-the-art Convolutional Neural Networks (CNNs) fail to detect adversarial samples or samples drawn statistically far away from the training distribution. In this work, we experimentally evaluate the robustness of a Mahalanobis distance-based confidence score, a simple yet effective method for detecting abnormal input samples, in classifying malaria parasitized cells and uninfected cells. Results indicated that the Mahalanobis confidence score detector exhibits improved performance and robustness of deep learning models, and achieves stateof-the-art performance on both out-of-distribution (OOD) and adversarial samples.

翻訳日:2021-07-14 06:31:21 公開日:2021-07-10

# (参考訳) ハイパーリレーショナルファクトを用いたインダクティブリンク予測の改善

Improving Inductive Link Prediction Using Hyper-Relational Facts ( http://arxiv.org/abs/2107.04894v1 )

ライセンス: CC0 1.0

Mehdi Ali, Max Berrendorf, Mikhail Galkin, Veronika Thost, Tengfei Ma, Volker Tresp, Jens Lehmann

(参考訳) 長年、知識グラフ(KG)上のリンク予測は純粋にトランスダクティブなタスクであり、目に見えないエンティティの推論を許さなかった。近年,半帰納的シナリオと完全帰納的シナリオを探求する取り組みが活発化しており,未確認および新興エンティティに対する推論が可能になっている。しかしながら、これらのアプローチはすべてトリプルベースの \glspl{kg} しか考慮していないが、よりリッチなKG(例えばWikidata)は十分に研究されていない。本研究では,様々な帰納的設定を分類し,グラフニューラルネットワークの最近の進歩を生かした,幅広い半帰納的および完全帰納的リンク予測タスクにハイパーリレーショナルKGを用いることの利点について検討する。新たなベンチマークによる実験結果から, 有意な利得の6%(hits@10メートル法の場合)は, 3値のみのベースラインに比べて性能が向上することが示された。我々のコードは \url{https://github.com/mali-git/hyper_relational_ilp} で利用可能です。

For many years, link prediction on knowledge graphs (KGs) has been a purely transductive task, not allowing for reasoning on unseen entities. Recently, increasing efforts are put into exploring semi- and fully inductive scenarios, enabling inference over unseen and emerging entities. Still, all these approaches only consider triple-based \glspl{kg}, whereas their richer counterparts, hyper-relational KGs (e.g., Wikidata), have not yet been properly studied. In this work, we classify different inductive settings and study the benefits of employing hyper-relational KGs on a wide range of semi- and fully inductive link prediction tasks powered by recent advancements in graph neural networks. Our experiments on a novel set of benchmarks show that qualifiers over typed edges can lead to performance improvements of 6% of absolute gains (for the Hits@10 metric) compared to triple-only baselines. Our code is available at \url{https://github.com/mali-git/hyper_relational_ilp}.

翻訳日:2021-07-14 06:21:43 公開日:2021-07-10

# (参考訳) MRIセグメンテーションにおけるドメインシフトの影響の解剖

Anatomy of Domain Shift Impact on U-Net Layers in MRI Segmentation ( http://arxiv.org/abs/2107.04914v1 )

ライセンス: CC BY 4.0

Ivan Zakazov, Boris Shirokikh, Alexey Chernyavskiy and Mikhail Belyaev

(参考訳) ドメイン適応(da)法は、異なる分散トレイン(ソース)とテスト(ターゲット)データの問題に取り組むために、医療画像分割タスクで広く使われている。対象ドメインからの注釈付きサンプルの数が限られている教師付きDAタスクについて検討する。最小限の注釈付きデータの量で十分な正確なモデルを構築することです。既存の手法のほとんどは、事前訓練された畳み込みニューラルネットワーク(CNN)の微調整固有の層である。しかし、どの層が微調整に優れているのか、コンセンサスはない。低レベルなドメインシフトを持つイメージの最初のレイヤや、高レベルなドメインシフトを持つイメージのより深いレイヤ。この目的のために,最適な微調整を行うレイヤを自動的に選択するCNNアーキテクチャであるSpotTUnetを提案する。より具体的には、対象ドメイン上で、トレーニング済みネットワークから特定の層を微調整するか再利用すべきかを示すポリシーも学習する。本手法は,アノテートデータの極端な不足下においても,非フレキシブル微調整法と同等のレベルで動作することを示す。第二に、SpotTUnetポリシーは、ネットワーク上でのドメインシフトの影響を階層的に可視化し、堅牢なドメイン一般化手法の開発にさらに使用できることを示す。 SpotTUnetの性能を広範囲に評価するために、明示的なドメインシフトを特徴とする脳MR画像の公開データセット(CC359)を用いる。再現可能な実験パイプラインをリリースする。

Domain Adaptation (DA) methods are widely used in medical image segmentation tasks to tackle the problem of differently distributed train (source) and test (target) data. We consider the supervised DA task with a limited number of annotated samples from the target domain. It corresponds to one of the most relevant clinical setups: building a sufficiently accurate model on the minimum possible amount of annotated data. Existing methods mostly fine-tune specific layers of the pretrained Convolutional Neural Network (CNN). However, there is no consensus on which layers are better to fine-tune, e.g. the first layers for images with low-level domain shift or the deeper layers for images with high-level domain shift. To this end, we propose SpotTUnet - a CNN architecture that automatically chooses the layers which should be optimally fine-tuned. More specifically, on the target domain, our method additionally learns the policy that indicates whether a specific layer should be fine-tuned or reused from the pretrained network. We show that our method performs at the same level as the best of the nonflexible fine-tuning methods even under the extreme scarcity of annotated data. Secondly, we show that SpotTUnet policy provides a layer-wise visualization of the domain shift impact on the network, which could be further used to develop robust domain generalization methods. In order to extensively evaluate SpotTUnet performance, we use a publicly available dataset of brain MR images (CC359), characterized by explicit domain shift. We release a reproducible experimental pipeline.

翻訳日:2021-07-14 06:02:18 公開日:2021-07-10

# (参考訳) 特徴量に基づくイベントステレオビジュアルオドメトリー

Feature-based Event Stereo Visual Odometry ( http://arxiv.org/abs/2107.04921v1 )

ライセンス: CC BY 4.0

Antea Hadviger, Igor Cvi\v{s}i\'c, Ivan Markovi\'c, Sacha Vra\v{z}i\'c, Ivan Petrovi\'c

(参考訳) イベントベースのカメラは生物学的にインスパイアされたセンサーであり、シーン内の非同期画素の明るさ変化を出力する。ハイダイナミックレンジとマイクロ秒の時間分解能は、照明や高速シナリオに挑戦する環境では標準カメラよりも信頼性が高く、イベントカメラのみに基づいたオドメトリーアルゴリズムの開発は、自律システムやロボットにとってエキサイティングな新しい可能性をもたらす。本稿では,特徴量検出と注意的特徴管理によるマッチングに基づくイベントカメラのステレオ・ビジュアル・オドメトリ法を提案する。提案手法は,屋内飛行ドローンが取得したMVSECシーケンスとDSEC屋外運転シーケンスの2つの公開データセット上での性能を評価する。 mvsecはモーションキャプチャによる正確な地中真実を提供するが、dsecは地中真実を示さないが、標準カメラフレームの基準軌道を得るために、キッティスコアボードの最高ランキングアルゴリズムの一つであるソフト・ビジュアル・オドメトリ(soft visual odometry)を使用した。本手法とESVO法を比較した。この手法はMVSECシークエンスで同等の性能を示すが,DSECデータセットのESVOではデフォルトパラメータで屋外走行シナリオを処理できなかった。さらに,esvoに対する2つの重要な利点は,追跡周波数を非同期イベントレートに適応させ,初期化を必要としない点である。

Event-based cameras are biologically inspired sensors that output events, i.e., asynchronous pixel-wise brightness changes in the scene. Their high dynamic range and temporal resolution of a microsecond makes them more reliable than standard cameras in environments of challenging illumination and in high-speed scenarios, thus developing odometry algorithms based solely on event cameras offers exciting new possibilities for autonomous systems and robots. In this paper, we propose a novel stereo visual odometry method for event cameras based on feature detection and matching with careful feature management, while pose estimation is done by reprojection error minimization. We evaluate the performance of the proposed method on two publicly available datasets: MVSEC sequences captured by an indoor flying drone and DSEC outdoor driving sequences. MVSEC offers accurate ground truth from motion capture, while for DSEC, which does not offer ground truth, in order to obtain a reference trajectory on the standard camera frames we used our SOFT visual odometry, one of the highest ranking algorithms on the KITTI scoreboards. We compared our method to the ESVO method, which is the first and still the only stereo event odometry method, showing on par performance on the MVSEC sequences, while on the DSEC dataset ESVO, unlike our method, was unable to handle outdoor driving scenario with default parameters. Furthermore, two important advantages of our method over ESVO are that it adapts tracking frequency to the asynchronous event rate and does not require initialization.

翻訳日:2021-07-14 05:52:01 公開日:2021-07-10

# (参考訳) telinet - 単純かつ浅い畳み込みニューラルネットワーク(cnn)によるcovid-19患者のctスキャンの分類

TeliNet, a simple and shallow Convolution Neural Network (CNN) to Classify CT Scans of COVID-19 patients ( http://arxiv.org/abs/2107.04930v1 )

ライセンス: CC BY 4.0

Mohammad Nayeem Teli

(参考訳) 新型コロナウイルス(COVID-19)により世界中で数十万人が死亡し、数百万人が負傷した。このパンデミックに対する戦いは、複数の方面で進行中だ。ワクチン接種はスピードを上げているが、まだ何十億もの予防接種を受けていない人々がいる。この戦いでは、感染予防のために病気の診断と患者の隔離が大きな役割を果たす。機械学習は、患者の胸部X線とCTスキャン画像を分析し、新型コロナウイルスの診断を支援する。本研究では,単純で浅い畳み込み型ニューラルネットワークであるtelinetを用いて,新型コロナウイルスのctスキャン画像の分類を行う。この結果は,VGGNetのF1スコアとベンチマーク手法より優れていた。提案手法は他の手法と比較してより軽量である。

Hundreds of millions of cases and millions of deaths have occurred worldwide due to COVID-19. The fight against this pandemic is on-going on multiple fronts. While vaccinations are picking up speed, there are still billions of unvaccinated people. In this fight diagnosis of the disease and isolation of the patients to prevent any spreads play a huge role. Machine Learning approaches have assisted the diagnosis of COVID-19 cases by analyzing chest X-ray and CT-scan images of patients. In this research we present a simple and shallow Convolutional Neural Network based approach, TeliNet, to classify CT-scan images of COVID-19 patients. Our results outperform the F1 score of VGGNet and the benchmark approaches. Our proposed solution is also more lightweight in comparison to the other methods.

翻訳日:2021-07-14 05:40:55 公開日:2021-07-10

# 敵攻撃の影響を受けやすい層同定

Identifying Layers Susceptible to Adversarial Attacks ( http://arxiv.org/abs/2107.04827v1 )

ライセンス: Link先を確認

Shoaib Ahmed Siddiqui, Thomas Breuel

(参考訳) 一般的なニューラルネットワークアーキテクチャは、敵のサンプルによる攻撃を受けやすい。ニューラルネットワークアーキテクチャは、一般的に低レベル特徴抽出層と高レベル分類層に分けられるが、逆さまなサンプルへのネットワークの感受性は、特徴抽出よりも分類に関する問題と見なされることが多い。 CIFAR-10, Imagenette, ImageNet 上の VGG と ResNet アーキテクチャの異なる部分を,非逆データと逆データを用いて選択的に再学習することで,このアイデアを検証した。実験の結果, 対立サンプルに対する感受性は低レベル特徴抽出層と関連していることがわかった。したがって、高層層の再訓練は堅牢性を達成するには不十分である。この現象には2つの説明がある: 敵の攻撃は、攻撃クラスに見られる特徴と区別できない初期層からの出力を生じるか、または、敵でないサンプルの特徴と統計的に異なる初期層からの出力を、後続の層で一貫した分類を許さないかである。隠れ層における特徴ベクトルの分布に関する大規模非線形次元減少と密度モデルを用いてこの問題を検証し,非対角的および対角的標本間の特徴分布が著しく異なることを示す。本研究は,敵のサンプルの統計的起源と防御可能性に関する新たな知見を提供する。

Common neural network architectures are susceptible to attack by adversarial samples. Neural network architectures are commonly thought of as divided into low-level feature extraction layers and high-level classification layers; susceptibility of networks to adversarial samples is often thought of as a problem related to classification rather than feature extraction. We test this idea by selectively retraining different portions of VGG and ResNet architectures on CIFAR-10, Imagenette and ImageNet using non-adversarial and adversarial data. Our experimental results show that susceptibility to adversarial samples is associated with low-level feature extraction layers. Therefore, retraining high-level layers is insufficient for achieving robustness. This phenomenon could have two explanations: either, adversarial attacks yield outputs from early layers that are indistinguishable from features found in the attack classes, or adversarial attacks yield outputs from early layers that differ statistically from features for non-adversarial samples and do not permit consistent classification by subsequent layers. We test this question by large-scale non-linear dimensionality reduction and density modeling on distributions of feature vectors in hidden layers and find that the feature distributions between non-adversarial and adversarial samples differ substantially. Our results provide new insights into the statistical origins of adversarial samples and possible defenses.

翻訳日:2021-07-13 16:21:50 公開日:2021-07-10

# 適応型進化クラスタリングアルゴリズムstarを用いたコーパスから導出する概念階層の形式的コンテキスト削減

Formal context reduction in deriving concept hierarchies from corpora using adaptive evolutionary clustering algorithm star ( http://arxiv.org/abs/2107.04781v1 )

ライセンス: Link先を確認

Bryar A. Hassan, Tarik A. Rashid and Seyedali Mirjalili

(参考訳) 概念階層の手動構築は通常、時間を要するリソース集約的なプロセスであるため、コーパスから概念階層を導出するプロセスを自動化することは有益である。このように、コーパスから概念階層を学習する全体的なプロセスは、テキストを文にパースし、文章を分割し、トークン化する一連のステップを含んでいる。補間ステップの後、fcaを用いてペアを抽出する。しかし、形式的な文脈では、面白くない、誤ったペアがいくつか存在するかもしれない。形式的コンテキストの生成は時間のかかるプロセスにつながる可能性があるため、形式的コンテキストサイズ削減は、興味のない、誤ったペアを取り除くために必要であり、それに従って概念格子と概念階層を抽出する時間を削減する。本研究の目的は,(1)FCAを利用するコーパスから概念階層を導出するフレームワーク,(2)ECA*の適応版を用いた第1フレームワークの形式的文脈あいまいさを低減させるフレームワーク,の2つの枠組みを提案することである。 wikipediaのサンプル385コーパスを2つのフレームワークに適用して、形式的コンテキストのサイズを削減し、概念格子と概念階層を生成する実験を行った。その結果得られる形式的文脈の格子は、概念格子不変量を用いて標準の格子に評価される。したがって、2つの格子間の準同型は、基本格子とは対照的に、結果として得られる概念階層の質を89%維持し、縮小された概念格子は標準格子の構造的関係を継承する。適応ECA*は,異なる密度(充填比)のランダムデータセット上での実行時間を測定するために,対応する4つのベースラインアルゴリズムに対して検討される。その結果,適応ECA*は,異なるフィリング比で,他の競合技術よりも高速に概念格子を実行することがわかった。

It is beneficial to automate the process of deriving concept hierarchies from corpora since a manual construction of concept hierarchies is typically a time-consuming and resource-intensive process. As such, the overall process of learning concept hierarchies from corpora encompasses a set of steps: parsing the text into sentences, splitting the sentences and then tokenising it. After the lemmatisation step, the pairs are extracted using FCA. However, there might be some uninteresting and erroneous pairs in the formal context. Generating formal context may lead to a time-consuming process, so formal context size reduction is required to remove uninterested and erroneous pairs, taking less time to extract the concept lattice and concept hierarchies accordingly. In this premise, this study aims to propose two frameworks: (1) A framework to review the current process of deriving concept hierarchies from corpus utilising FCA; (2) A framework to decrease the formal contexts ambiguity of the first framework using an adaptive version of ECA*. Experiments are conducted by applying 385 sample corpora from Wikipedia on the two frameworks to examine the reducing size of formal context, which leads to yield concept lattice and concept hierarchy. The resulting lattice of formal context is evaluated to the standard one using concept lattice-invariants. Accordingly, the homomorphic between the two lattices preserves the quality of resulting concept hierarchies by 89% in contrast to the basic ones, and the reduced concept lattice inherits the structural relation of the standard one. The adaptive ECA* is examined against its four counterpart baseline algorithms to measure the execution time on random datasets with different densities (fill ratios). The results show that adaptive ECA* performs concept lattice faster than other mentioned competitive techniques in different fill ratios.

翻訳日:2021-07-13 16:20:49 公開日:2021-07-10

# IoTフレームワークにおけるエッジデバイス上の家庭用ビデオサーベイランスの異常検出

Anomaly Detection in Residential Video Surveillance on Edge Devices in IoT Framework ( http://arxiv.org/abs/2107.04767v1 )

ライセンス: Link先を確認

Mayur R. Parate, Kishor M. Bhurchandi, Ashwin G. Kothari

(参考訳) インテリジェントな居住者監視は、最も重要なスマートコミュニティサービスの1つだ。セキュリティに対する需要が高まる中、監視システムは監視シーンの異常を検出する必要がある。住宅社会における知的監視のための高容量計算装置の利用は費用がかかり、実現不可能である。そこで我々は,CPUのみのエッジデバイスを用いたインテリジェント監視のための異常検出を提案する。オブジェクトレベルの推論とトラッキングをキャプチャするモジュールフレームワークを開発した。部分閉塞,姿勢変形,複雑な場面に対処するために,特徴符号化と軌跡関連を用いた。 anomaly detection frameworkの要素は、十分なfpsでcpuのみのエッジデバイスで動作するように最適化されている。実験の結果,提案手法は実現可能であり,実生活シナリオにおいて良好な結果が得られた。

Intelligent resident surveillance is one of the most essential smart community services. The increasing demand for security needs surveillance systems to be able to detect anomalies in surveillance scenes. Employing high-capacity computational devices for intelligent surveillance in residential societies is costly and not feasible. Therefore, we propose anomaly detection for intelligent surveillance using CPU-only edge devices. A modular framework to capture object-level inferences and tracking is developed. To cope with partial occlusions, posture deformations, and complex scenes we employed feature encoding and trajectory associations. Elements of the anomaly detection framework are optimized to run on CPU-only edge devices with sufficient FPS. The experimental results indicate the proposed method is feasible and achieves satisfactory results in real-life scenarios.

翻訳日:2021-07-13 16:20:04 公開日:2021-07-10

# Not-to-End:オンライン外科的位相認識のためのマルチステージアーキテクチャの探索

Not End-to-End: Explore Multi-Stage Architecture for Online Surgical Phase Recognition ( http://arxiv.org/abs/2107.04810v1 )

ライセンス: Link先を確認

Fangqiu Yi and Tingting Jiang

(参考訳) 手術相認識はコンピュータ支援手術システムにおいて特に関心があり、手術ビデオのフレーム毎にどの位相が起こっているかを予測することが目的である。マルチステージアーキテクチャを持つネットワークは、多くのコンピュータビジョンタスクにおいてリッチパターンで広く適用されており、予測器が最初に初期予測を出力し、追加の改良段階が初期予測を実行してさらなる改良を行う。既存の研究では,手術用ビデオコンテンツは順調であり,時間的パターンが豊富であることを示し,手術用位相認識タスクに適している。しかし, 手術段階認識タスクに多段階アーキテクチャを単純に適用すれば, エンドツーエンドの訓練方法が洗練能力の低下を招きかねないことが観察された。この問題に対処するため,外科的位相認識タスクのための多段階アーキテクチャの異なる設計を探索し,新たなエンドツーエンドトレーニング戦略を提案する。非エンドツーエンドのトレーニング戦略では、改良段階は2種類の乱れたシーケンスを別々に訓練する。一方,リファインメントモデルの3つの異なる選択を評価し,解析と解が特定の多段階モデルの選択にロバストであることを示す。 M2CAI16 Workflow ChallengeとCholec80データセットの2つの公開ベンチマークで実験を行います。その結果,当社の戦略でトレーニングされたマルチステージアーキテクチャは,現在の最先端のシングルステージモデルのパフォーマンスを大きく向上させることがわかった。コードは \url{https://github.com/chinayi/casual_tcn} で入手できる。

Surgical phase recognition is of particular interest to computer assisted surgery systems, in which the goal is to predict what phase is occurring at each frame for a surgery video. Networks with multi-stage architecture have been widely applied in many computer vision tasks with rich patterns, where a predictor stage first outputs initial predictions and an additional refinement stage operates on the initial predictions to perform further refinement. Existing works show that surgical video contents are well ordered and contain rich temporal patterns, making the multi-stage architecture well suited for the surgical phase recognition task. However, we observe that when simply applying the multi-stage architecture to the surgical phase recognition task, the end-to-end training manner will make the refinement ability fall short of its wishes. To address the problem, we propose a new non end-to-end training strategy and explore different designs of multi-stage architecture for surgical phase recognition task. For the non end-to-end training strategy, the refinement stage is trained separately with proposed two types of disturbed sequences. Meanwhile, we evaluate three different choices of refinement models to show that our analysis and solution are robust to the choices of specific multi-stage models. We conduct experiments on two public benchmarks, the M2CAI16 Workflow Challenge, and the Cholec80 dataset. Results show that multi-stage architecture trained with our strategy largely boosts the performance of the current state-of-the-art single-stage model. Code is available at \url{https://github.com/ChinaYi/casual_tcn}.

翻訳日:2021-07-13 16:19:54 公開日:2021-07-10

# 低域フィルタを超える: 自動フィルタリングによるグラフ畳み込みネットワーク

Beyond Low-pass Filtering: Graph Convolutional Networks with Automatic Filtering ( http://arxiv.org/abs/2107.04755v1 )

ライセンス: Link先を確認

Zonghan Wu, Shirui Pan, Guodong Long, Jing Jiang, Chengqi Zhang

(参考訳) グラフ構造データからの深層学習には,グラフ畳み込みネットワークが不可欠になりつつある。既存のグラフ畳み込みネットワークのほとんどは、2つの大きな欠点を共有している。第一に、それらは本質的に低パスフィルタであるため、グラフ信号の潜在的に有用な中・高周波帯域は無視される。次に、既存のグラフ畳み込みフィルタの帯域幅を固定する。グラフ畳み込みフィルタのパラメータは、グラフ畳み込みフィルタ関数の曲率を変更することなく、グラフ入力を変換する。実際、専門家のドメイン知識がなければ、ある時点で周波数を維持または遮断すべきかどうかは不明です。本稿では,グラフ信号の全スペクトルを捕捉し,グラフ畳み込みフィルタの帯域幅を自動的に更新する自動グラフ畳み込みネットワーク(AutoGCN)を提案する。グラフスペクトル理論に基づいているが、私たちのAutoGCNも空間に局在しており、空間形式を持っている。実験の結果,AutoGCNは低域通過フィルタとしてのみ動作するベースライン法よりも大幅に改善されていることがわかった。

Graph convolutional networks are becoming indispensable for deep learning from graph-structured data. Most of the existing graph convolutional networks share two big shortcomings. First, they are essentially low-pass filters, thus the potentially useful middle and high frequency band of graph signals are ignored. Second, the bandwidth of existing graph convolutional filters is fixed. Parameters of a graph convolutional filter only transform the graph inputs without changing the curvature of a graph convolutional filter function. In reality, we are uncertain about whether we should retain or cut off the frequency at a certain point unless we have expert domain knowledge. In this paper, we propose Automatic Graph Convolutional Networks (AutoGCN) to capture the full spectrum of graph signals and automatically update the bandwidth of graph convolutional filters. While it is based on graph spectral theory, our AutoGCN is also localized in space and has a spatial form. Experimental results show that AutoGCN achieves significant improvement over baseline methods which only work as low-pass filters.

翻訳日:2021-07-13 16:17:19 公開日:2021-07-10

# 階層的特徴回帰によるクラスタ正規化

Cluster Regularization via a Hierarchical Feature Regression ( http://arxiv.org/abs/2107.04831v1 )

ライセンス: Link先を確認

Johann Pfitzinger

(参考訳) 高次元非直交予測器セットを用いた予測タスクは、最小二乗ベースの適合手順に挑戦する。大規模で生産的な文献が存在し、パラメータ推定の外部ロバスト性を改善するための様々な正規化アプローチについて議論している。本稿では,機械学習およびグラフ理論の領域からの洞察を動員し,予測子集合の教師付き階層表現に沿ってパラメータを推定し,パラメータをグループターゲットへ縮小する新しいクラスタ型正規化法である階層的特徴回帰(hfr)を提案する。この方法は、予測群の最適組成を推定する能力と、グループ目標を不均一に推定する能力において革新的である。 HFRは調整因子の回帰と見なすことができ、フィッティングプロセスで捕獲された慣性変動の程度でペナルティによって支配される収縮の強さが支配される。この手法は,高密度,スパース,グループ化されたデータ生成プロセスを含む,多種多様な回帰タスクに対して,ベンチマーク正規化推定器のパネルよりも優れた予測精度と汎用性を示す。経済成長予測への応用は、HFRの有効性を実証的な環境で示し、いくつかの頻繁な選択肢やベイズ的な選択肢と好意的に比較するために用いられる。

Prediction tasks with high-dimensional nonorthogonal predictor sets pose a challenge for least squares based fitting procedures. A large and productive literature exists, discussing various regularized approaches to improving the out-of-sample robustness of parameter estimates. This paper proposes a novel cluster-based regularization - the hierarchical feature regression (HFR) -, which mobilizes insights from the domains of machine learning and graph theory to estimate parameters along a supervised hierarchical representation of the predictor set, shrinking parameters towards group targets. The method is innovative in its ability to estimate optimal compositions of predictor groups, as well as the group targets endogenously. The HFR can be viewed as a supervised factor regression, with the strength of shrinkage governed by a penalty on the extent of idiosyncratic variation captured in the fitting process. The method demonstrates good predictive accuracy and versatility, outperforming a panel of benchmark regularized estimators across a diverse set of simulated regression tasks, including dense, sparse and grouped data generating processes. An application to the prediction of economic growth is used to illustrate the HFR's effectiveness in an empirical setting, with favorable comparisons to several frequentist and Bayesian alternatives.

翻訳日:2021-07-13 16:13:24 公開日:2021-07-10

# マルチヘッドコトレーニングによる半教師付き学習

Semi-Supervised Learning with Multi-Head Co-Training ( http://arxiv.org/abs/2107.04795v1 )

ライセンス: Link先を確認

Mingcai Chen, Yuntao Du, Yi Zhang, Shuwei Qian, Chongjun Wang

(参考訳) 自己学習から拡張されたコトレーニングは、半教師付き学習のフレームワークの1つである。これは、個別の分類器が互いに衝突しないように、アルゴリズムを微妙に設計する余分な分類器の訓練に要する。本稿では,半教師付き画像分類のための簡易かつ効率的なコトレーニングアルゴリズムであるmulti-head co-trainingを提案する。ベースラーナーをマルチヘッド構造に統合することにより、モデルは最小限の余分なパラメータに収まる。統一モデルにおける全ての分類ヘッドは「弱く強い強化」戦略を通じて仲間と相互作用し、多様性を明示的に促進することなく単一視点のコトレーニングを達成する。マルチヘッド協調学習の有効性は,標準半教師付き学習ベンチマークを用いた実証研究で実証された。

Co-training, extended from self-training, is one of the frameworks for semi-supervised learning. It works at the cost of training extra classifiers, where the algorithm should be delicately designed to prevent individual classifiers from collapsing into each other. In this paper, we present a simple and efficient co-training algorithm, named Multi-Head Co-Training, for semi-supervised image classification. By integrating base learners into a multi-head structure, the model is in a minimal amount of extra parameters. Every classification head in the unified model interacts with its peers through a "Weak and Strong Augmentation" strategy, achieving single-view co-training without promoting diversity explicitly. The effectiveness of Multi-Head Co-Training is demonstrated in an empirical study on standard semi-supervised learning benchmarks.

翻訳日:2021-07-13 16:11:24 公開日:2021-07-10

# DualVGR:ビデオ質問応答のためのデュアルビジュアルグラフ推論ユニット

DualVGR: A Dual-Visual Graph Reasoning Unit for Video Question Answering ( http://arxiv.org/abs/2107.04768v1 )

ライセンス: Link先を確認

Jianyu Wang, Bing-Kun Bao, Changsheng Xu

(参考訳) ビデオ質問応答は難しい作業であり、エージェントはリッチなビデオコンテンツを理解し、空間的時間的推論を行う必要がある。しかし、既存のグラフベースの手法では、ビデオQAの2つの特性を無視して、多段階の推論をうまく行えない。(1)同じビデオであっても、異なる質問は、関係推論で答えを推測するために異なる量のビデオクリップやオブジェクトを必要とする可能性がある。これらの観察に基づいて,ビデオ上でエンドツーエンドに推論を行うデュアルビジュアルグラフ推論ユニット(dualvgr)を提案する。 DualVGRの最初のコントリビューションは、説明可能なQuery Punishment Moduleの設計です。 2つめの貢献は、ビデオベースのマルチビューグラフアテンションネットワークであり、外観と動きの特徴の関係をキャプチャする。我々のDualVGRネットワークは、ベンチマークMSVD-QAおよびSVQAデータセットの最先端性能を実現し、ベンチマークMSRVTT-QAデータセットの競合結果を示す。私たちのコードはhttps://github.com/MMIR/DualVGR-VideoQA.comで公開されています。

Video question answering is a challenging task, which requires agents to be able to understand rich video contents and perform spatial-temporal reasoning. However, existing graph-based methods fail to perform multi-step reasoning well, neglecting two properties of VideoQA: (1) Even for the same video, different questions may require different amount of video clips or objects to infer the answer with relational reasoning; (2) During reasoning, appearance and motion features have complicated interdependence which are correlated and complementary to each other. Based on these observations, we propose a Dual-Visual Graph Reasoning Unit (DualVGR) which reasons over videos in an end-to-end fashion. The first contribution of our DualVGR is the design of an explainable Query Punishment Module, which can filter out irrelevant visual features through multiple cycles of reasoning. The second contribution is the proposed Video-based Multi-view Graph Attention Network, which captures the relations between appearance and motion features. Our DualVGR network achieves state-of-the-art performance on the benchmark MSVD-QA and SVQA datasets, and demonstrates competitive results on benchmark MSRVTT-QA datasets. Our code is available at https://github.com/MMIR/DualVGR-VideoQA.

翻訳日:2021-07-13 16:09:17 公開日:2021-07-10

# 自己教師型音声表現モデルの階層的解析

Layer-wise Analysis of a Self-supervised Speech Representation Model ( http://arxiv.org/abs/2107.04734v1 )

ライセンス: Link先を確認

Ankita Pasad, Ju-Chieh Chou, Karen Livescu

(参考訳) 近年,音声表現モデルの事前学習において,自己教師付き学習手法が成功している。これらの学習表現の有用性は実証的に観察されているが、事前訓練された表現自身で符号化された情報の種類や範囲についてはあまり研究されていない。このような洞察の開発は、これらのモデルの能力と限界を理解し、研究コミュニティがより効率的に下流アプリケーションに利用できるようにするのに役立つ。本研究では,その中間表現ベクトルを用いて,最近かつ成功した事前学習モデル(wav2vec 2.0)を解析ツールを用いて検討することにより,このギャップを埋める。非パラメトリックプローブを用いた単純な下流作業における標準相関,相互情報,および性能の測定値を用いて, (i) 音響的および言語的情報内容の問い合わせ, (ii) モデル層間の情報の進化を特徴付けるとともに, (iii) 自動音声認識(ASR) モデルがこれらの観測に与える影響を理解する。その結果,asrの微調整プロトコルの修正が動機となり,低リソース環境での単語誤り率の向上が図られた。

Recently proposed self-supervised learning approaches have been successful for pre-training speech representation models. The utility of these learned representations has been observed empirically, but not much has been studied about the type or extent of information encoded in the pre-trained representations themselves. Developing such insights can help understand the capabilities and limits of these models and enable the research community to more efficiently develop their usage for downstream applications. In this work, we begin to fill this gap by examining one recent and successful pre-trained model (wav2vec 2.0), via its intermediate representation vectors, using a suite of analysis tools. We use the metrics of canonical correlation, mutual information, and performance on simple downstream tasks with non-parametric probes, in order to (i) query for acoustic and linguistic information content, (ii) characterize the evolution of information across model layers, and (iii) understand how fine-tuning the model for automatic speech recognition (ASR) affects these observations. Our findings motivate modifying the fine-tuning protocol for ASR, which produces improved word error rates in a low-resource setting.

翻訳日:2021-07-13 16:07:44 公開日:2021-07-10

# 伝達学習法を用いたJPEG圧縮領域における植物葉病の直接検出

Detection of Plant Leaf Disease Directly in the JPEG Compressed Domain using Transfer Learning Technique ( http://arxiv.org/abs/2107.04813v1 )

ライセンス: Link先を確認

Atul Sharma, Bulla Rajesh and Mohammed Javed

(参考訳) 植物の葉病は食品の安全性に重大な危険をもたらし、品質と生産量の低下を引き起こす。したがって、葉病の正確かつタイムリーな検出は、作物の損失を確認し、人々の食料需要の増加に対応するために非常に重要である。従来の手法は、一般的に費用がかかり、アクセス不能な検査と人間のスキルに依存している。近年,Deep Neural Networksは画像分類において極めて有益である。本研究では, JPEG圧縮領域において, 転写学習を用いた植物葉病検出について検討した。ここでは、DCT係数からなるJPEG圧縮ストリームをニューラルネットワークに直接供給し、分類効率を向上させる。 JPEG圧縮葉データに対する実験結果から,提案モデルの有効性が示された。

Plant leaf diseases pose a significant danger to food security and they cause depletion in quality and volume of production. Therefore accurate and timely detection of leaf disease is very important to check the loss of the crops and meet the growing food demand of the people. Conventional techniques depend on lab investigation and human skills which are generally costly and inaccessible. Recently, Deep Neural Networks have been exceptionally fruitful in image classification. In this research paper, plant leaf disease detection employing transfer learning is explored in the JPEG compressed domain. Here, the JPEG compressed stream consisting of DCT coefficients is, directly fed into the Neural Network to improve the efficiency of classification. The experimental results on JPEG compressed leaf dataset demonstrate the efficacy of the proposed model.

翻訳日:2021-07-13 16:06:05 公開日:2021-07-10

# タスク指向意味解析におけるデータ効率の評価

Assessing Data Efficiency in Task-Oriented Semantic Parsing ( http://arxiv.org/abs/2107.04736v1 )

ライセンス: Link先を確認

Shrey Desai, Akshat Shrivastava, Justin Rill, Brian Moran, Safiyyah Saleem, Alexander Zotov, Ahmed Aly

(参考訳) データ効率は魅力的な特徴であるにもかかわらず、タスク指向のセマンティックパーシングで測定し最適化することはしばしば困難である。本研究は,データ効率に関する質問に対する統一的な解決策を提供するためのステップとして,パーサが特定の品質バーを達成するのに必要なドメイン内データ量を近似的に測定する4段階プロトコルを提案する。具体的には,(1)異なる濃度のターゲット部分集合をサンプリングする,(2)各部分集合上の微調整パーサ,(3)ターゲット部分集合 (%) と正確な一致 (%) に関する滑らかな曲線を得る,(4) 曲線をマイニングアドホック(ターゲット部分集合,完全一致)点に参照する。当社のプロトコルは,2つの実世界のケーススタディ – モデル一般化可能性と意図複雑性 – に適用されている。

Data efficiency, despite being an attractive characteristic, is often challenging to measure and optimize for in task-oriented semantic parsing; unlike exact match, it can require both model- and domain-specific setups, which have, historically, varied widely across experiments. In our work, as a step towards providing a unified solution to data-efficiency-related questions, we introduce a four-stage protocol which gives an approximate measure of how much in-domain, "target" data a parser requires to achieve a certain quality bar. Specifically, our protocol consists of (1) sampling target subsets of different cardinalities, (2) fine-tuning parsers on each subset, (3) obtaining a smooth curve relating target subset (%) vs. exact match (%), and (4) referencing the curve to mine ad-hoc (target subset, exact match) points. We apply our protocol in two real-world case studies -- model generalizability and intent complexity -- illustrating its flexibility and applicability to practitioners in task-oriented semantic parsing.

翻訳日:2021-07-13 16:04:39 公開日:2021-07-10

# 計算疫学:書籍、ニュース記事、ツイートにおける実証的利用の時間的、生態学的ダイナミクスのチャート化

Computational Paremiology: Charting the temporal, ecological dynamics of proverb use in books, news articles, and tweets ( http://arxiv.org/abs/2107.04929v1 )

ライセンス: Link先を確認

E. Davis, C. M. Danforth, W. Mieder, and P. S. Dodds

(参考訳) 弁証器は言語と文化の重要な要素であり、その歴史と通貨に多くの注意が払われているが、時間とともに使用される頻度の変化について、比較的定量的な研究は行われていない。文書の様々なジャンルを反映した大規模なコーパスが広く利用可能になったことにより、この証明の重要性を広くダイナミックに見ることが可能になった。ここでは、3つのコーパス内での証明の時間的変化、種類、規模、時間による違い、何世紀にもわたって何百万もの書籍、20年間で何億ものニュース記事、そして10年間で何十億ものツイートを測定します。調査の結果,各会場において,使用頻度が重く,時代文化の動態を反映した傾向がみられ,ソーシャルメディア上の現代的形態へと進化してきた。

Proverbs are an essential component of language and culture, and though much attention has been paid to their history and currency, there has been comparatively little quantitative work on changes in the frequency with which they are used over time. With wider availability of large corpora reflecting many diverse genres of documents, it is now possible to take a broad and dynamic view of the importance of the proverb. Here, we measure temporal changes in the relevance of proverbs within three corpora, differing in kind, scale, and time frame: Millions of books over centuries; hundreds of millions of news articles over twenty years; and billions of tweets over a decade. We find that proverbs present heavy-tailed frequency-of-usage rank distributions in each venue; exhibit trends reflecting the cultural dynamics of the eras covered; and have evolved into contemporary forms on social media.

翻訳日:2021-07-13 16:04:16 公開日:2021-07-10

# 法的知識グラフを用いた類似事例推薦

Similar Cases Recommendation using Legal Knowledge Graphs ( http://arxiv.org/abs/2107.04771v1 )

ライセンス: Link先を確認

Jaspreet Singh Dhani, Ruchika Bhatt, Balaji Ganesan, Parikshet Sirohi, Vasudha Bhatnagar

(参考訳) 裁判、判決、法律、その他の法的文書から構築された法的な知識グラフは、質問応答、文書の類似性、検索などの多くのアプリケーションを可能にする。 nlpタスクの遠隔監視にナレッジグラフを使用することはよく研究されているが、ノード類似性のようなダウンストリームグラフタスクにナレッジグラフを使用することは、ノードタイプとその機能の選択に困難をもたらす。本稿では,法律知識グラフから導出したケースグラフにおける類似ノードの予測手法について述べる。

A legal knowledge graph constructed from court cases, judgments, laws and other legal documents can enable a number of applications like question answering, document similarity, and search. While the use of knowledge graphs for distant supervision in NLP tasks is well researched, using knowledge graphs for downstream graph tasks like node similarity presents challenges in selecting node types and their features. In this demo, we describe our solution for predicting similar nodes in a case graph derived from our legal knowledge graph.

翻訳日:2021-07-13 16:02:47 公開日:2021-07-10

# 常識推論から複数の選好を通してのニューラルネットワークモデルへ:概要

From Common Sense Reasoning to Neural Network Models through Multiple Preferences: an overview ( http://arxiv.org/abs/2107.04870v1 )

ライセンス: Link先を確認

Laura Giordano, Valentina Gliozzi, Daniele Theseider Dupr\'e

(参考訳) 本稿では,条件論理と優先論理とニューラルネットワークモデルの関係について,多項意味論に基づく考察を行う。本稿では,ニューラルネットワークモデルに意味的解釈を提供するツールとして,異なる概念に対する嗜好を考慮に入れるために,最近導入された概念的多元参照セマンティクスを提案する。このアプローチは、教師なしニューラルネットワークモデル(自己組織化マップ)と教師なしニューラルネットワークモデル(マルチレイヤーパーセプトロン)の両方で検討されており、同じアプローチが他のニューラルネットワークモデルにも拡張されることを期待している。これにより、ネットワークの入出力動作をキャプチャする解釈を通じて、ネットワークの論理特性を(モデルチェックによって)チェックすることができる。多層パーセプトロンでは、ディープネットワーク自体を条件付き知識ベースと見なすことができ、シナプス接続は重み付き条件付き接続に対応する。本稿では, 自己組織化マップと多層パーセプトロンの事例を通して, 一般的なアプローチを説明し, オープンな課題と展望について考察する。

In this paper we discuss the relationships between conditional and preferential logics and neural network models, based on a multi-preferential semantics. We propose a concept-wise multipreference semantics, recently introduced for defeasible description logics to take into account preferences with respect to different concepts, as a tool for providing a semantic interpretation to neural network models. This approach has been explored both for unsupervised neural network models (Self-Organising Maps) and for supervised ones (Multilayer Perceptrons), and we expect that the same approach might be extended to other neural network models. It allows for logical properties of the network to be checked (by model checking) over an interpretation capturing the input-output behavior of the network. For Multilayer Perceptrons, the deep network itself can be regarded as a conditional knowledge base, in which synaptic connections correspond to weighted conditionals. The paper describes the general approach, through the cases of Self-Organising Maps and Multilayer Perceptrons, and discusses some open issues and perspectives.

翻訳日:2021-07-13 16:02:37 公開日:2021-07-10

# 視覚トランスフォーマーにおける局所からグローバルへの自己着脱

Local-to-Global Self-Attention in Vision Transformers ( http://arxiv.org/abs/2107.04735v1 )

ライセンス: Link先を確認

Jinpeng Li, Yichao Yan, Shengcai Liao, Xiaokang Yang, Ling Shao

(参考訳) トランスフォーマーはコンピュータビジョンタスクに大きな可能性を示した。高解像度の視覚データにおける自己注意の密度計算を避けるため、最近のTransformerモデルは階層設計を採用しており、ローカルウィンドウ内でのみ自己注意が計算される。この設計は効率を大幅に改善するが、早い段階ではグローバルな特徴推論を欠いている。本研究では,各ステージの複数の粒度で局所からグローバルへの推論を可能にする変圧器のマルチパス構造を設計する。提案するフレームワークは計算効率が高く,有効である。計算オーバーヘッドが極端に増加し,画像分類とセマンティックセグメンテーションの両方において顕著な改善が得られた。コードはhttps://github.com/ljpadam/LG-Transformerで入手できる。

Transformers have demonstrated great potential in computer vision tasks. To avoid dense computations of self-attentions in high-resolution visual data, some recent Transformer models adopt a hierarchical design, where self-attentions are only computed within local windows. This design significantly improves the efficiency but lacks global feature reasoning in early stages. In this work, we design a multi-path structure of the Transformer, which enables local-to-global reasoning at multiple granularities in each stage. The proposed framework is computationally efficient and highly effective. With a marginal increasement in computational overhead, our model achieves notable improvements in both image classification and semantic segmentation. Code is available at https://github.com/ljpadam/LG-Transformer

翻訳日:2021-07-13 16:01:02 公開日:2021-07-10

# TTAN:Few-shot行動認識のための2段階時間アライメントネットワーク

TTAN: Two-Stage Temporal Alignment Network for Few-shot Action Recognition ( http://arxiv.org/abs/2107.04782v1 )

ライセンス: Link先を確認

Shuyuan Li, Huabin Liu, Rui Qian, Yuxi Li, John See, Mengjuan Fei, Xiaoyuan Yu, Weiyao Lin

(参考訳) 数少ないアクション認識は、少数のサンプル(サポート)を使用して、新しいアクションクラス(クエリ)を認識することを目的としている。現在のアプローチの大半は、ビデオ間の類似性を比較するために学習するメトリック学習パラダイムに従っている。近年,このような類似性を直接測定することは理想的ではないことが観測されている。本稿では,動作継続時間の誤認と動作進化の誤認の2つの側面からこの問題を逮捕する。我々は2段階の時間アライメントネットワーク(TTAN)を通してそれらを逐次処理する。第1段階は予測されたアフィンワープパラメータで時間変換を行い、第2段階はクロスアテンション機構を使用してサポートとクエリの特徴を一貫した進化に調整する。さらに,サポートサンプル間の不一致を考慮した,新しいマルチショット融合戦略を考案する。アブレーション研究と可視化は、両方の段階が誤認識に対処する役割を実証している。ベンチマークデータセットに関する広範囲な実験により, 提案手法が, 最先端の動作認識性能を実現する可能性を示した。

Few-shot action recognition aims to recognize novel action classes (query) using just a few samples (support). The majority of current approaches follow the metric learning paradigm, which learns to compare the similarity between videos. Recently, it has been observed that directly measuring this similarity is not ideal since different action instances may show distinctive temporal distribution, resulting in severe misalignment issues across query and support videos. In this paper, we arrest this problem from two distinct aspects -- action duration misalignment and motion evolution misalignment. We address them sequentially through a Two-stage Temporal Alignment Network (TTAN). The first stage performs temporal transformation with the predicted affine warp parameters, while the second stage utilizes a cross-attention mechanism to coordinate the features of the support and query to a consistent evolution. Besides, we devise a novel multi-shot fusion strategy, which takes the misalignment among support samples into consideration. Ablation studies and visualizations demonstrate the role played by both stages in addressing the misalignment. Extensive experiments on benchmark datasets show the potential of the proposed method in achieving state-of-the-art performance for few-shot action recognition.

翻訳日:2021-07-13 16:00:51 公開日:2021-07-10

# ポリモルフィックトランスフォーマによるマイノショット領域適応

Few-Shot Domain Adaptation with Polymorphic Transformers ( http://arxiv.org/abs/2107.04805v1 )

ライセンス: Link先を確認

Shaohua Li, Xiuchao Sui, Jie Fu, Huazhu Fu, Xiangde Luo, Yangqin Feng, Xinxing Xu, Yong Liu, Daniel Ting, Rick Siow Mong Goh

(参考訳) ある医療画像に対してトレーニングされたディープニューラルネットワーク(DNN)は、トレーニング画像(ソースドメイン)とテスト画像(ターゲットドメイン)とのさまざまなドメインの相違により、目に見えないテスト画像に深刻なパフォーマンス低下を経験することが多い。臨床環境では、十分な注記対象領域データを短時間で収集することは困難である。少数のアノテーションで訓練されたモデルを適用するようなドメイン適応は、この場合非常に実用的で有用である。本稿では,任意のdnnバックボーンに組み込むことができるポリモーフィックトランス(polyformer)を提案する。具体的には、ポリフォーマ層をソースドメインでトレーニングされたモデルに挿入した後、プロトタイプの埋め込みを抽出し、ソースドメインの特徴の"基底"と見ることができる。対象領域では、ポリフォーム層は、画像特徴とプロトタイプ埋め込み間の相互作用を制御する投影層を更新するだけで適応する。他のモデル重み(バッチノルムパラメータを除く)は適応中に凍結される。これにより、アノテーションをオーバーフィットする可能性が大幅に減少し、いくつかの注釈付き画像でトレーニングした後、ターゲットドメインでロバストに実行することが可能となる。本稿では,2つの医療的セグメンテーション課題(光ディスク/カップセグメンテーション,ポリープセグメンテーション)におけるPolyformerの有効性を示す。 Polyformerのソースコードはhttps://github.com/askerlee/segtranで公開されている。

Deep neural networks (DNNs) trained on one set of medical images often experience severe performance drop on unseen test images, due to various domain discrepancy between the training images (source domain) and the test images (target domain), which raises a domain adaptation issue. In clinical settings, it is difficult to collect enough annotated target domain data in a short period. Few-shot domain adaptation, i.e., adapting a trained model with a handful of annotations, is highly practical and useful in this case. In this paper, we propose a Polymorphic Transformer (Polyformer), which can be incorporated into any DNN backbones for few-shot domain adaptation. Specifically, after the polyformer layer is inserted into a model trained on the source domain, it extracts a set of prototype embeddings, which can be viewed as a "basis" of the source-domain features. On the target domain, the polyformer layer adapts by only updating a projection layer which controls the interactions between image features and the prototype embeddings. All other model weights (except BatchNorm parameters) are frozen during adaptation. Thus, the chance of overfitting the annotations is greatly reduced, and the model can perform robustly on the target domain after being trained on a few annotated images. We demonstrate the effectiveness of Polyformer on two medical segmentation tasks (i.e., optic disc/cup segmentation, and polyp segmentation). The source code of Polyformer is released at https://github.com/askerlee/segtran.

翻訳日:2021-07-13 16:00:32 公開日:2021-07-10

# 注意機構を用いた弱修正深度推定ネットワーク

A Weakly-Supervised Depth Estimation Network Using Attention Mechanism ( http://arxiv.org/abs/2107.04819v1 )

ライセンス: Link先を確認

Fang Gao, Jiabao Wang, Jun Yu, Yaoxiong Wang, Feng Shuang

(参考訳) 単眼深度推定(MDE)はシーン理解や再構成といった多くのアプリケーションにおいて基本的な課題である。しかし、既存のメソッドのほとんどは正確なラベル付きデータセットに依存している。 ANUWという名前の注目ネスト付きU-net(ANU)に基づく弱監督型フレームワークを,ラベルの誤用に対して導入した。 ANUWは、入力された単一のRGB画像を深度画像に変換するためにエンドツーエンドに訓練される。これは、高密度残留ネットワーク構造、適応重みチャネルアテンション(AWCA)モジュール、パッチ第2非ローカル(PSNL)モジュール、ソフトラベル生成方法からなる。高密度残留ネットワークは、入力をエンコードしてデコードするネットワークの本体である。 awcaモジュールはチャネル重みを適応的に調整して重要な特徴を抽出することができる。 PSNLモジュールは2階非局所法により空間的注意機構を実装している。提案するソフトラベル生成手法は,データセットの事前知識を用いて,偽のラベルを置き換えるソフトラベルを生成する。提案したANUWは、欠陥のある単分子深度データセットに基づいてトレーニングされ、トレーニングされたモデルは3つの公開データセット上でテストされ、その結果、最先端のMDE手法と比較してANUWの優位性を示す。

Monocular depth estimation (MDE) is a fundamental task in many applications such as scene understanding and reconstruction. However, most of the existing methods rely on accurately labeled datasets. A weakly-supervised framework based on attention nested U-net (ANU) named as ANUW is introduced in this paper for cases with wrong labels. The ANUW is trained end-to-end to convert an input single RGB image into a depth image. It consists of a dense residual network structure, an adaptive weight channel attention (AWCA) module, a patch second non-local (PSNL) module and a soft label generation method. The dense residual network is the main body of the network to encode and decode the input. The AWCA module can adaptively adjust the channel weights to extract important features. The PSNL module implements the spatial attention mechanism through a second-order non-local method. The proposed soft label generation method uses the prior knowledge of the dataset to produce soft labels to replace false ones. The proposed ANUW is trained on a defective monocular depth dataset and the trained model is tested on three public datasets, and the results demonstrate the superiority of ANUW in comparison with the state-of-the-art MDE methods.

翻訳日:2021-07-13 16:00:05 公開日:2021-07-10

# 7つの基本表情分類のためのベイズ畳み込みニューラルネットワーク

Bayesian Convolutional Neural Networks for Seven Basic Facial Expression Classifications ( http://arxiv.org/abs/2107.04834v1 )

ライセンス: Link先を確認

Wei Gong, Hailan Huang

(参考訳) 7つの基本的な表情分類は、複雑な人間の感情を表現する基本的な方法であり、人工知能研究の重要な部分である。従来のベイズニューラルネットワークの枠組みに基づき,本論文で構築したresnet-18_bnnネットワークは,(1)不確定パラメータのkl損失と特定のパラメータの交叉からなる,新たな目的関数を提案する。エントロピー損失組成物。 2) 特殊目的関数を対象として, これら2つのパラメータを交互に更新するトレーニングスキームを提案する。 (3) 最後の畳み込み群のパラメータのみをモデル化する。実験解析により,本手法はaf-wild2データベースの評価セットにおいて98.28%の精度を達成した。従来のベイズ型ニューラルネットワークと比較すると,本手法は分類精度が最も高い。

The seven basic facial expression classifications are a basic way to express complex human emotions and are an important part of artificial intelligence research. Based on the traditional Bayesian neural network framework, the ResNet-18_BNN network constructed in this paper has been improved in the following three aspects: (1) A new objective function is proposed, which is composed of the KL loss of uncertain parameters and the intersection of specific parameters. Entropy loss composition. (2) Aiming at a special objective function, a training scheme for alternately updating these two parameters is proposed. (3) Only model the parameters of the last convolution group. According to experimental analysis, our method achieves an accuracy of 98.28% on the evaluation set of the Aff-Wild2 database. Compared with the traditional Bayesian Neural Network, our method brings the highest classification accuracy gain.

翻訳日:2021-07-13 15:59:44 公開日:2021-07-10

# マルチドメインデータアグリゲーションに基づく医用画像セグメンテーションのための階層的自己監督学習

Hierarchical Self-Supervised Learning for Medical Image Segmentation Based on Multi-Domain Data Aggregation ( http://arxiv.org/abs/2107.04886v1 )

ライセンス: Link先を確認

Hao Zheng, Jun Han, Hongxiao Wang, Lin Yang, Zhuo Zhao, Chaoli Wang, Danny Z. Chen

(参考訳) 大規模ラベル付きデータセットは、教師付きディープラーニングの成功の鍵であるが、医用画像セグメンテーションでは、モデルトレーニングに十分な注釈付き画像を得ることは非常に困難である。多くのシナリオでは、注釈のない画像は豊富で容易に取得できる。自己教師付き学習(SSL)は、生のデータ情報と表現学習を利用する大きな可能性を示している。本稿では,無記名データを利用して医用画像セグメンテーションを促進する新しい自己教師付きフレームワークである階層型自己教師付き学習(hssl)を提案する。タスク固有の自己教師付き事前訓練とそれに続く教師付き微調整に関する現在の文献とは異なり、さまざまな医用画像セグメンテーションタスクのための異種データからタスク非依存の知識をSSLを用いて学習する。具体的には、まずいくつかの医学的課題からデータセットを集約し、自己教師付きでネットワークを事前訓練し、最後にラベル付きデータに微調整する。コントラスト損失と分類損失を組み合わせた新しい損失関数を開発し,セグメンテーションタスクのためのエンコーダ・デコーダアーキテクチャを事前学習する。広範な実験により,マルチドメイン合同事前学習は,ダウンストリームセグメンテーションタスクに有益であり,単ドメイン事前学習を大きく上回ることが示された。スクラッチから学ぶことに比べ、新しい手法は様々なタスク(例:+0.69%から+18.60%、注釈付きデータの5%)でパフォーマンスが向上する。限られたトレーニングデータで、我々の手法は性能ギャップw.r.tを著しく橋渡しすることができる。より密なアノテーション(例えば、注釈付きデータの10%対~100%)。

A large labeled dataset is a key to the success of supervised deep learning, but for medical image segmentation, it is highly challenging to obtain sufficient annotated images for model training. In many scenarios, unannotated images are abundant and easy to acquire. Self-supervised learning (SSL) has shown great potentials in exploiting raw data information and representation learning. In this paper, we propose Hierarchical Self-Supervised Learning (HSSL), a new self-supervised framework that boosts medical image segmentation by making good use of unannotated data. Unlike the current literature on task-specific self-supervised pretraining followed by supervised fine-tuning, we utilize SSL to learn task-agnostic knowledge from heterogeneous data for various medical image segmentation tasks. Specifically, we first aggregate a dataset from several medical challenges, then pre-train the network in a self-supervised manner, and finally fine-tune on labeled data. We develop a new loss function by combining contrastive loss and classification loss and pretrain an encoder-decoder architecture for segmentation tasks. Our extensive experiments show that multi-domain joint pre-training benefits downstream segmentation tasks and outperforms single-domain pre-training significantly. Compared to learning from scratch, our new method yields better performance on various tasks (e.g., +0.69% to +18.60% in Dice scores with 5% of annotated data). With limited amounts of training data, our method can substantially bridge the performance gap w.r.t. denser annotations (e.g., 10% vs.~100% of annotated data).

翻訳日:2021-07-13 15:59:30 公開日:2021-07-10

# コンピュータビジョンにおける産業と学術研究

Industry and Academic Research in Computer Vision ( http://arxiv.org/abs/2107.04902v1 )

ライセンス: Link先を確認

Iuliia Kotseruba

(参考訳) 本研究は,コンピュータビジョンにおける産学研究と学界のダイナミクスを研究することを目的とする。結果は、この分野を代表するトップ5ビジョンカンファレンスのセットで実証される。このような分析データの入手は容易ではなかったため、原版からのメタデータの収集と処理に多大な労力が費やされた。第一に,本研究は産業支援研究のシェアを定量化する。具体的には,産業界の研究者が発行する論文の割合が増加しており,より多くの学者が企業に参加したり協力したりしていることを示している。次に、研究トピックや引用パターンの分布など、業界におけるプレゼンスの影響について検討する。その結果,研究トピックの分布は産業論文や学術論文に類似していることが示唆された。しかし、業界論文の引用には強い好みがある。最後に,コードの可利用性や影響などの引用バイアスの原因について検討した。

This work aims to study the dynamic between research in the industry and academia in computer vision. The results are demonstrated on a set of top-5 vision conferences that are representative of the field. Since data for such analysis was not readily available, significant effort was spent on gathering and processing meta-data from the original publications. First, this study quantifies the share of industry-sponsored research. Specifically, it shows that the proportion of papers published by industry-affiliated researchers is increasing and that more academics join companies or collaborate with them. Next, the possible impact of industry presence is further explored, namely in the distribution of research topics and citation patterns. The results indicate that the distribution of the research topics is similar in industry and academic papers. However, there is a strong preference towards citing industry papers. Finally, possible reasons for citation bias, such as code availability and influence, are investigated.

翻訳日:2021-07-13 15:59:02 公開日:2021-07-10

# マルチモーダル脳のデコードにおける経時的相関解析

Longitudinal Correlation Analysis for Decoding Multi-Modal Brain Development ( http://arxiv.org/abs/2107.04724v1 )

ライセンス: Link先を確認

Qingyu Zhao, Ehsan Adeli, Kilian M. Pohl

(参考訳) 幼少期から、人間の脳は生涯にわたって構造を再構築し、リワイヤリングする。このような複雑な脳の発達を特徴付けるには、縦型およびマルチモーダルの神経画像データの効果的な分析が必要である。本稿では,縦相関解析 (LCA) と呼ばれる解析手法を提案する。 LCAは、まず各モーダルからの入力をオートエンコーダに基づく潜在表現に還元することで、2つのモーダルのデータを結合する。自己教師付き戦略は、各空間内の2つの方向を互いに分離し、それらの方向に沿った潜在表現の縦方向の変化がモダリティ間で最大に相関するようにすることで、2つの潜在空間を関連付ける。若年者におけるアルコール・神経発達に関する全国コンソーシアム679名の縦断的T1強調および拡散強調MRI解析にLCAを適用した。横断的あるいは単一モーダルモデリングに焦点を当てた既存のアプローチとは異なり、lcaはデータから抽出された形態的および拡散的特徴から、マクロ構造およびミクロ構造的脳の発達を解き放つことに成功した。対象者の生の3次元画像量に対するLCAの再検査は,特徴に基づく解析の結果を再現することに成功した。最後に、LCAが明らかにした発達効果は、青年期の脳の成熟パターンの現在の理解と一致した。

Starting from childhood, the human brain restructures and rewires throughout life. Characterizing such complex brain development requires effective analysis of longitudinal and multi-modal neuroimaging data. Here, we propose such an analysis approach named Longitudinal Correlation Analysis (LCA). LCA couples the data of two modalities by first reducing the input from each modality to a latent representation based on autoencoders. A self-supervised strategy then relates the two latent spaces by jointly disentangling two directions, one in each space, such that the longitudinal changes in latent representations along those directions are maximally correlated between modalities. We applied LCA to analyze the longitudinal T1-weighted and diffusion-weighted MRIs of 679 youths from the National Consortium on Alcohol and Neurodevelopment in Adolescence. Unlike existing approaches that focus on either cross-sectional or single-modal modeling, LCA successfully unraveled coupled macrostructural and microstructural brain development from morphological and diffusivity features extracted from the data. A retesting of LCA on raw 3D image volumes of those subjects successfully replicated the findings from the feature-based analysis. Lastly, the developmental effects revealed by LCA were inline with the current understanding of maturational patterns of the adolescent brain.

翻訳日:2021-07-13 15:55:07 公開日:2021-07-10

# copulasを用いたマルチエージェント模倣学習

Multi-Agent Imitation Learning with Copulas ( http://arxiv.org/abs/2107.04750v1 )

ライセンス: Link先を確認

Hongwei Wang, Lantao Yu, Zhangjie Cao, Stefano Ermon

(参考訳) マルチエージェント模倣学習は、物理的、社会的、チームプレイシステムを理解するのに不可欠な観察と行動のマッピングを学習することで、デモからタスクを実行するために複数のエージェントを訓練することを目的としている。しかしながら、マルチエージェント相互作用をモデル化する既存の研究の多くは、エージェントが観察に基づいて独立した決定をし、エージェント間の複雑な依存を無視していると仮定している。本稿では,確率変数間の依存を捉える強力な統計ツールである copula を用いて,マルチエージェントシステムにおける相関と協調を明示的にモデル化する。提案するモデルでは,個々のエージェントの局所的行動パターンを捉えた限界を個別に学習できるだけでなく,エージェント間の依存構造を単独かつ完全に捉えたcopula関数を学習することができる。合成および実世界のデータセットに対する大規模な実験により、我々のモデルはアクション予測タスクにおける様々なシナリオにおいて最先端のベースラインよりも優れており、専門家によるデモンストレーションに近い新しい軌道を生成することができる。

Multi-agent imitation learning aims to train multiple agents to perform tasks from demonstrations by learning a mapping between observations and actions, which is essential for understanding physical, social, and team-play systems. However, most existing works on modeling multi-agent interactions typically assume that agents make independent decisions based on their observations, ignoring the complex dependence among agents. In this paper, we propose to use copula, a powerful statistical tool for capturing dependence among random variables, to explicitly model the correlation and coordination in multi-agent systems. Our proposed model is able to separately learn marginals that capture the local behavioral patterns of each individual agent, as well as a copula function that solely and fully captures the dependence structure among agents. Extensive experiments on synthetic and real-world datasets show that our model outperforms state-of-the-art baselines across various scenarios in the action prediction task, and is able to generate new trajectories close to expert demonstrations.

翻訳日:2021-07-13 15:54:48 公開日:2021-07-10

# IoTと機械学習を用いた精密農業のためのマルチモーダルシステムを目指して

Towards a Multimodal System for Precision Agriculture using IoT and Machine Learning ( http://arxiv.org/abs/2107.04895v1 )

ライセンス: Link先を確認

Satvik Garg, Pradyumn Pundir, Himanshu Jindal, Hemraj Saini, Somya Garg

(参考訳) 精密農業制度は、現在の情報・通信技術を利用した農業を監督し、人的作業を進めながら収穫量や品質を向上させることを指す。自動化には、土壌、水、光、湿度、温度などのセンサーが与える情報の組み合わせが必要であり、操作者に正確なデータを提供し、農家に優れた収量を得る。本研究は, 精密農業利用における最先端のアプローチをすべて取り入れた研究である。データ収集のためのIoT(Internet of Things)や、作物被害予測のための機械学習、作物病検出のためのディープラーニングといった技術が使用されている。 IoTを用いたデータ収集は、スマート灌水のための水分レベルの測定、n, p, kによる最適な収量開発のための肥料の推定に責任がある。作物被害予測には、ランダムフォレスト(rf)、光勾配ブースティングマシン(lgbm)、xgboost(xgb)、決定木(dt)、k極近傍(knn)といった様々なアルゴリズムが使用される。その後、vgg16、resnet50、drknet121などの事前訓練された畳み込みニューラルネットワーク(cnn)モデルも、作物が何らかの病気で汚染されたかどうかを確認するために訓練される。

Precision agriculture system is an arising idea that refers to overseeing farms utilizing current information and communication technologies to improve the quantity and quality of yields while advancing the human work required. The automation requires the assortment of information given by the sensors such as soil, water, light, humidity, temperature for additional information to furnish the operator with exact data to acquire excellent yield to farmers. In this work, a study is proposed that incorporates all common state-of-the-art approaches for precision agriculture use. Technologies like the Internet of Things (IoT) for data collection, machine Learning for crop damage prediction, and deep learning for crop disease detection is used. The data collection using IoT is responsible for the measure of moisture levels for smart irrigation, n, p, k estimations of fertilizers for best yield development. For crop damage prediction, various algorithms like Random Forest (RF), Light gradient boosting machine (LGBM), XGBoost (XGB), Decision Tree (DT) and K Nearest Neighbor (KNN) are used. Subsequently, Pre-Trained Convolutional Neural Network (CNN) models such as VGG16, Resnet50, and DenseNet121 are also trained to check if the crop was tainted with some illness or not.

翻訳日:2021-07-13 15:54:29 公開日:2021-07-10

# 記述論理における高速概念学習のための概念長予測

Prediction of concept lengths for fast concept learning in description logics ( http://arxiv.org/abs/2107.04911v1 )

ライセンス: Link先を確認

N'Dah Jean Kouagou, Stefan Heindorf, Caglar Demir, Axel-Cyrille Ngonga Ngomo

(参考訳) 洗練された演算子に基づく概念学習アプローチは、概念を計算するために部分的に順序付けられた解空間を探索する。しかし、これらのアプローチによって区切られた改良木は、複雑な学習問題に対して容易に数百万のノードに成長できる。これにより、リファインメントベースのアプローチは、しばしば最適な概念を効率的に検出できない。本稿では,対象概念の長さを予測し,概念学習における探索空間の削減を容易にする,概念長学習のための教師付き機械学習アプローチを提案する。この目的を達成するために、我々は4つのニューラルネットワークを比較し、それらを4つのベンチマーク知識グラフで評価する。評価結果から,再帰的ニューラルネットワークアーキテクチャは,f-測定値が最大92%で,概念長予測に最適であることが示唆された。概念長予測器をCELOE(Class Expression Learner for Ontology Engineering)アルゴリズムに統合することで,CELOEのランタイムを最大13.4倍改善し,結果の質に大きな変化を生じさせないことを示す。再現性については、https://github.com/ConceptLengthLearner/ReproducibilityRepoで公開GitHubリポジトリに実装を提供しています。

Concept learning approaches based on refinement operators explore partially ordered solution spaces to compute concepts, which are used as binary classification models for individuals. However, the refinement trees spanned by these approaches can easily grow to millions of nodes for complex learning problems. This leads to refinement-based approaches often failing to detect optimal concepts efficiently. In this paper, we propose a supervised machine learning approach for learning concept lengths, which allows predicting the length of the target concept and therefore facilitates the reduction of the search space during concept learning. To achieve this goal, we compare four neural architectures and evaluate them on four benchmark knowledge graphs--Carcinogenesis, Mutagenesis, Semantic Bible, Family Benchmark. Our evaluation results suggest that recurrent neural network architectures perform best at concept length prediction with an F-measure of up to 92%. We show that integrating our concept length predictor into the CELOE (Class Expression Learner for Ontology Engineering) algorithm improves CELOE's runtime by a factor of up to 13.4 without any significant changes to the quality of the results it generates. For reproducibility, we provide our implementation in the public GitHub repository at https://github.com/ConceptLengthLearner/ReproducibilityRepo

翻訳日:2021-07-13 15:54:04 公開日:2021-07-10

# 金融予測・計画・分析のための機械学習:最近の発展と落とし穴

Machine Learning for Financial Forecasting, Planning and Analysis: Recent Developments and Pitfalls ( http://arxiv.org/abs/2107.04851v1 )

ライセンス: Link先を確認

Helmut Wasserbacher and Martin Spindler

(参考訳) この記事では、財務予測、計画、分析(FP\&A)のための機械学習を紹介します。機械学習は、大量のデータから高度に自動化された情報抽出によってFP\&Aをサポートするのに適しているように見える。しかしながら,従来の機械学習手法の多くは予測(予測)に重点を置いているため,計画やリソース割り当て(因果推論)に使用する場合の落とし穴を回避するために必要となる注意事項について議論する。機械学習の単純な適用は通常この文脈で失敗するが、最近開発されたダブル機械学習フレームワークは興味のある因果問題に対処できる。我々は、FP\&Aにおける機械学習に関する現在の文献をレビューし、予測と計画の両方に機械学習をどのように使用できるかをシミュレーション研究で示す。また,データポイント数の増加に伴う予測と計画の改善についても検討する。

This article is an introduction to machine learning for financial forecasting, planning and analysis (FP\&A). Machine learning appears well suited to support FP\&A with the highly automated extraction of information from large amounts of data. However, because most traditional machine learning techniques focus on forecasting (prediction), we discuss the particular care that must be taken to avoid the pitfalls of using them for planning and resource allocation (causal inference). While the naive application of machine learning usually fails in this context, the recently developed double machine learning framework can address causal questions of interest. We review the current literature on machine learning in FP\&A and illustrate in a simulation study how machine learning can be used for both forecasting and planning. We also investigate how forecasting and planning improve as the number of data points increases.

翻訳日:2021-07-13 15:51:07 公開日:2021-07-10

# Weaving Attention U-net:新しいハイブリッドCNNとAttention-based Method for Organs-at-risk Segmentation in Head and Neck CT Images

Weaving Attention U-net: A Novel Hybrid CNN and Attention-based Method for Organs-at-risk Segmentation in Head and Neck CT Images ( http://arxiv.org/abs/2107.04847v1 )

ライセンス: Link先を確認

Zhuangzhuang Zhang, Tianyu Zhao, Hiram Gay, Weixiong Zhang, Baozhou Sun

(参考訳) 放射線療法の計画では、手動コントゥーリングは労働集約的で時間を要する。正確で堅牢な自動セグメンテーションモデルは、効率と治療結果を改善する。本稿では,畳み込みニューラルネットワーク(CNN)と自己注意機構を組み合わせた新しいハイブリッドディープラーニング手法を開発し,頭頸部CT画像の高速かつ正確な多臓器分割を実現することを目的とする。 115例の頭頸部CT像を回顧的に収集,使用した。トレーニング/検証/テストの比率を81/9/25に設定し,10倍のクロスバリデーション戦略を用いて最適なモデルパラメータを選択した。提案するハイブリッドモデルでは,各症例に対して10個の臓器・リスク (oar) を分割した。モデルの性能はDice similarity Coefficient (DSC)、Hausdorff distance 95% (HD95)、平均表面距離 (MSD) の3つの指標で評価された。私たちは、Head and Neck 2015チャレンジデータセットでモデルのパフォーマンスをテストし、最先端の自動セグメンテーションアルゴリズムと比較しました。提案手法は、10個のOARの基底真実によく似た輪郭を生成する。新しいウィービング注意U-netは頭頸部CT画像のセグメンテーションに優れているか類似した性能を示した。

In radiotherapy planning, manual contouring is labor-intensive and time-consuming. Accurate and robust automated segmentation models improve the efficiency and treatment outcome. We aim to develop a novel hybrid deep learning approach, combining convolutional neural networks (CNNs) and the self-attention mechanism, for rapid and accurate multi-organ segmentation on head and neck computed tomography (CT) images. Head and neck CT images with manual contours of 115 patients were retrospectively collected and used. We set the training/validation/testing ratio to 81/9/25 and used the 10-fold cross-validation strategy to select the best model parameters. The proposed hybrid model segmented ten organs-at-risk (OARs) altogether for each case. The performance of the model was evaluated by three metrics, i.e., the Dice Similarity Coefficient (DSC), Hausdorff distance 95% (HD95), and mean surface distance (MSD). We also tested the performance of the model on the Head and Neck 2015 challenge dataset and compared it against several state-of-the-art automated segmentation algorithms. The proposed method generated contours that closely resemble the ground truth for ten OARs. Our results of the new Weaving Attention U-net demonstrate superior or similar performance on the segmentation of head and neck CT images.

翻訳日:2021-07-13 15:48:31 公開日:2021-07-10

# 航空ロボットチームによるインテリジェントトラヒックモニタリングのための分散深層強化学習

Distributed Deep Reinforcement Learning for Intelligent Traffic Monitoring with a Team of Aerial Robots ( http://arxiv.org/abs/2107.04924v1 )

ライセンス: Link先を確認

Behzad Khamidehi and Elvino S. Sousa

(参考訳) 本稿では,航空ロボットを用いた道路網における交通監視問題について検討する。問題は2つの主な理由から難しい。まず、交通イベントは時間的にも空間的にも確率的です。第二に、交通イベントが異なる速度で道路網の異なる場所に到着すると、この問題は非均質な構造となる。そのため、場所によっては、ロボットが他の場所よりも多くの訪問を必要とする。これらの問題に対処するために,道路網の各位置に対する不確実性指標を定義し,ネットワークの平均不確実性を最小限に抑えるための航空ロボットの経路計画問題を定式化する。本稿では,この問題を部分可観測マルコフ決定プロセス(pomdp)として表現し,深層強化学習に基づく分散スケーラブルなアルゴリズムを提案する。エージェント(aerial robot)と交通管理センター(traffic management center, tmc)の通信モードによって異なる2つのシナリオを検討する。最初のシナリオでは、エージェントがTMCと継続的に通信して、トラフィックイベントに関するリアルタイム情報を送受信していると仮定する。したがって、エージェントは環境のグローバルかつリアルタイムな知識を持っている。しかし,第2のシナリオでは,空中ロボットの観測が部分的かつセンシング範囲に限定された,困難な設定を考える。さらに、第1のシナリオとは対照的に、空中ロボットとTMCとの間の情報交換は特定の時間インスタンスに限定される。本研究では,実際の道路ネットワークトポロジーにおける両シナリオにおける提案アルゴリズムの性能を評価し,その性能を交通監視システムで実証する。

This paper studies the traffic monitoring problem in a road network using a team of aerial robots. The problem is challenging due to two main reasons. First, the traffic events are stochastic, both temporally and spatially. Second, the problem has a non-homogeneous structure as the traffic events arrive at different locations of the road network at different rates. Accordingly, some locations require more visits by the robots compared to other locations. To address these issues, we define an uncertainty metric for each location of the road network and formulate a path planning problem for the aerial robots to minimize the network's average uncertainty. We express this problem as a partially observable Markov decision process (POMDP) and propose a distributed and scalable algorithm based on deep reinforcement learning to solve it. We consider two different scenarios depending on the communication mode between the agents (aerial robots) and the traffic management center (TMC). The first scenario assumes that the agents continuously communicate with the TMC to send/receive real-time information about the traffic events. Hence, the agents have global and real-time knowledge of the environment. However, in the second scenario, we consider a challenging setting where the observation of the aerial robots is partial and limited to their sensing ranges. Moreover, in contrast to the first scenario, the information exchange between the aerial robots and the TMC is restricted to specific time instances. We evaluate the performance of our proposed algorithm in both scenarios for a real road network topology and demonstrate its functionality in a traffic monitoring system.

翻訳日:2021-07-13 15:44:38 公開日:2021-07-10

# 凸性のないSchr{\"o}dinger-F{\"o}llmerサンプリングの収束解析

Convergence Analysis of Schr{\"o}dinger-F{\"o}llmer Sampler without Convexity ( http://arxiv.org/abs/2107.04766v1 )

ライセンス: Link先を確認

Yuling Jiao and Lican Kang and Yanyan Liu and Youzhou Zhou

(参考訳) Schr\"{o}dinger-F\"{o}llmer sampler (SFS) は、エルゴード性のない非正規分布からサンプリングするための、新しく効率的なアプローチである。 SFS は、Schr\"{o}dinger-F\"{o}llmerfusion process $$\mathrm{d} X_{t}=-\nabla U\left(X_t, t\right) \mathrm{d} t+\mathrm{d} B_{t}, \quad t \in[0,1],\quad X_0=0$$ の単位区間上のオイラー・マルヤマの離散化に基づいており、これは時間にゼロの縮退分布を目標分布へ輸送する。 \cite{sfs21} において、SFS の整合性は、ドリフト項 $b(x,t)$ % $U(x,t)$ が一様 (on $t$) %concave convex (on $x$) であるという制限された仮定の下で確立される。本稿では,標準正規分布上の目標分布の密度比について,滑らかで有界な条件下でのwasserstein距離におけるsfsの非漸近的誤差境界を与えるが,そのポテンシャルの強い凸性は必要としない。

Schr\"{o}dinger-F\"{o}llmer sampler (SFS) is a novel and efficient approach for sampling from possibly unnormalized distributions without ergodicity. SFS is based on the Euler-Maruyama discretization of Schr\"{o}dinger-F\"{o}llmer diffusion process $$\mathrm{d} X_{t}=-\nabla U\left(X_t, t\right) \mathrm{d} t+\mathrm{d} B_{t}, \quad t \in[0,1],\quad X_0=0$$ on the unit interval, which transports the degenerate distribution at time zero to the target distribution at time one. In \cite{sfs21}, the consistency of SFS is established under a restricted assumption that %the drift term $b(x,t)$ the potential $U(x,t)$ is uniformly (on $t$) strongly %concave convex (on $x$). In this paper we provide a nonasymptotic error bound of SFS in Wasserstein distance under some smooth and bounded conditions on the density ratio of the target distribution over the standard normal distribution, but without requiring the strongly convexity of the potential.

翻訳日:2021-07-13 15:42:29 公開日:2021-07-10

# 単一モデルだけで十分か? MuCoS: セマンティックコード検索のためのマルチモデルアンサンブル学習

Is a Single Model Enough? MuCoS: A Multi-Model Ensemble Learning for Semantic Code Search ( http://arxiv.org/abs/2107.04773v1 )

ライセンス: Link先を確認

Lun Du, Xiaozhou Shi, Yanlin Wang, Ensheng Shi, Shi Han and Dongmei Zhang

(参考訳) 近年,コードスニペットと検索クエリ間のセマンティックな相関がより良くなり,有望な性能を持つため,深層学習がコード検索の主流となっている。しかし、コードスニペットはビジネスロジック、特定のアルゴリズム、ハードウェア通信など、さまざまな次元の様々な情報を持っているため、単一のコード表現モジュールがすべての視点をカバーすることは困難である。一方、特定のクエリは1つまたは複数の視点にフォーカスする可能性があるため、単一のクエリ表現モジュールが異なるユーザ意図を表現することは困難である。本稿では,意味コード検索のためのマルチモデルアンサンブル学習アーキテクチャであるMuCoSを提案する。複数の個別の学習者が組み合わさり、それぞれがコードスニペットの特定の視点を強調する。私たちは、コード情報の異なる視点を含む異なるデータセットで個々の学習者を訓練し、これらの異なるデータセットを取得するためにデータ拡張戦略を使用します。次に、学習者をアンサンブルして、コードスニペットの包括的な特徴を捉えます。

Recently, deep learning methods have become mainstream in code search since they do better at capturing semantic correlations between code snippets and search queries and have promising performance. However, code snippets have diverse information from different dimensions, such as business logic, specific algorithm, and hardware communication, so it is hard for a single code representation module to cover all the perspectives. On the other hand, as a specific query may focus on one or several perspectives, it is difficult for a single query representation module to represent different user intents. In this paper, we propose MuCoS, a multi-model ensemble learning architecture for semantic code search. It combines several individual learners, each of which emphasizes a specific perspective of code snippets. We train the individual learners on different datasets which contain different perspectives of code information, and we use a data augmentation strategy to get these different datasets. Then we ensemble the learners to capture comprehensive features of code snippets.

翻訳日:2021-07-13 15:41:52 公開日:2021-07-10

# HOMRS:ディープニューラルネットワークのための高次準同型関係セレクタ

HOMRS: High Order Metamorphic Relations Selector for Deep Neural Networks ( http://arxiv.org/abs/2107.04863v1 )

ライセンス: Link先を確認

Florian Tambon, Giulio Antoniol and Foutse Khomh

(参考訳) ディープニューラルネットワーク(DNN)アプリケーションは、医療アプリケーションから自動運転車まで、私たちの日常生活の一部になりつつある。従来のDNNの検証は精度測定に頼っているが、敵の例の存在はこれらの精度測定の限界を強調しており、特にDNNが安全クリティカルシステムに統合された場合の懸念を高めている。本稿では,基本変成関係の最初の集合から高次変成関係の小さな最適化セットを自動構築することにより,変成テストを促進する方法であるHOMRSを提案する。 HOMRSのバックボーンは多目的検索であり、コードカバレッジ、テストケース、パスの多様性といった従来のシステムテストから引き出されたアイデアを利用する。 HOMRS を MNIST データセットで LeNet5 DNN に適用し,95% のキル比を達成できる小型だが効果的な高次変換セットを構築した証拠を報告する。 5つのラッカーは、高次変換前後に画像のプールを手動でラベル付けし、フライスのカッパと統計検査により、それらが変成特性であることを確認した。 HOMRSは、ランダムにサンプリングされたアウト・オブ・ディストリビューション画像の92%を検出した。 HOMRS変換はオンラインリアルタイム利用にも適している。

Deep Neural Networks (DNN) applications are increasingly becoming a part of our everyday life, from medical applications to autonomous cars. Traditional validation of DNN relies on accuracy measures, however, the existence of adversarial examples has highlighted the limitations of these accuracy measures, raising concerns especially when DNN are integrated into safety-critical systems. In this paper, we present HOMRS, an approach to boost metamorphic testing by automatically building a small optimized set of high order metamorphic relations from an initial set of elementary metamorphic relations. HOMRS' backbone is a multi-objective search; it exploits ideas drawn from traditional systems testing such as code coverage, test case, and path diversity. We applied HOMRS to LeNet5 DNN with MNIST dataset and we report evidence that it builds a small but effective set of high order transformations achieving a 95% kill ratio. Five raters manually labeled a pool of images before and after high order transformation; Fleiss' Kappa and statistical tests confirmed that they are metamorphic properties. HOMRS built-in relations are also effective to confront adversarial or out-of-distribution examples; HOMRS detected 92% of randomly sampled out-of-distribution images. HOMRS transformations are also suitable for online real-time use.

翻訳日:2021-07-13 15:41:37 公開日:2021-07-10

PDF登録状況（公開日: 20210710）