Fugu-MT: arxivの論文翻訳

このサイトではarxivの論文のうち、30ページ以下でCreative Commonsライセンス（CC 0, CC BY, CC BY-SA）の論文を日本語訳しています。本文がCCでない論文、長すぎる論文はメタデータのみを翻訳しています。（arxivのメタデータは CC 0です。）翻訳文のライセンスはCC BY-SA 4.0です。翻訳にはFugu-Machine Translatorを利用しています。

本サイトの運営者は本サイト（すべての情報・翻訳含む）の品質を保証せず、本サイト（すべての情報・翻訳含む）を使用して発生したあらゆる結果について一切の責任を負いません。

公開日が20201226となっている論文です。

Title	Authors	Abstract	論文公表日・翻訳日
# 短期量子コンピュータにおけるリサイクル量子ビット Recycling qubits in near-term quantum computers ( http://arxiv.org/abs/2012.01676v2 ) ライセンス: Link先を確認	Galit Anikeeva, Isaac H. Kim, Patrick Hayden	(参考訳) 量子コンピュータは単体テンソルネットワークを効率的に収縮させることができる。例えば、行列積状態に基づくネットワークや、MERA(Multi-scale entanglement renormalization ansatz)は、小さな量子コンピュータ上で収縮し、大きな量子システムのシミュレーションを支援することができる。しかし、選択的に量子ビットをリセットする能力がなければ、関連する空間コストは無視できる。本稿では,回路が共通の畳み込み形式を持つ場合,量子ビットを一元的にリセット可能なプロトコルを提案する。このプロトコルは、使用されていないキュービットに時間反転量子回路を部分的に適用することにより、使用中のキュービットから新しいキュービットを生成する。ノイズがなければ、これらの量子ビットのサブセットの状態が$\|0\ldots 0\rangle$となり、適用されるゲートの数で指数関数的に小さな誤差となる。また,ノイズの存在下でプロトコルが機能することを示す数値的な証拠も提示する。また,このプロトコルが雑音の存在下で機能することを示す数値的な証拠を提供し,ノイズ耐性が厳密に従う条件を定式化する。 Quantum computers are capable of efficiently contracting unitary tensor networks, a task that is likely to remain difficult for classical computers. For instance, networks based on matrix product states or the multi-scale entanglement renormalization ansatz (MERA) can be contracted on a small quantum computer to aid the simulation of a large quantum system. However, without the ability to selectively reset qubits, the associated spatial cost can be exorbitant. In this paper, we propose a protocol that can unitarily reset qubits when the circuit has a common convolutional form, thus dramatically reducing the spatial cost for implementing the contraction algorithm on general near-term quantum computers. This protocol generates fresh qubits from used ones by partially applying the time-reversed quantum circuit over qubits that are no longer in use. In the absence of noise, we prove that the state of a subset of these qubits becomes $\|0\ldots 0\rangle$, up to an error exponentially small in the number of gates applied. We also provide a numerical evidence that the protocol works in the presence of noise. We also provide a numerical evidence that the protocol works in the presence of noise, and formulate a condition under which the noise-resilience follows rigorously.	翻訳日:2023-04-22 05:43:40 公開日:2020-12-26
# prove-it: 一般的な数学的知識の整理と検証のための証明アシスタント Prove-It: A Proof Assistant for Organizing and Verifying General Mathematical Knowledge ( http://arxiv.org/abs/2012.10987v2 ) ライセンス: Link先を確認	Wayne M. Witzel, Warren D. Craft, Robert D. Carr and Joaqu\'in E. Madrid Larra\~naga	(参考訳) 形式的定理証明を(適度な訓練で)非公式な定理証明と同じくらい簡単かつ自然なものにすることを目的として設計された,pythonベースの汎用対話型定理証明アシスタントであるpromise-itを紹介する。 Prove-Itは、高フレキシブルなJupyterノートブックベースのユーザインターフェースを使用して、LaTeXを使用してインタラクションと証明ステップを文書化する。本稿では,表現,判断,定理,証明の表現を高度に表現し,$\sqrt{2}\notin\mathbb{q}$という従来の証明バイコントラディションを構築し,ラッセルやカリーのパラドックスのような矛盾を避ける方法について論じる。システムの中核要素に関する付録には、広範なドキュメントが記載されている。現在の開発と今後の研究は、量子回路操作と量子アルゴリズム検証への有望な応用を含んでいる。 We introduce Prove-It, a Python-based general-purpose interactive theorem-proving assistant designed with the goal of making formal theorem proving as easy and natural as informal theorem proving (with moderate training). Prove-It uses a highly-flexible Jupyter notebook-based user interface that documents interactions and proof steps using LaTeX. We review Prove-It's highly expressive representation of expressions, judgments, theorems, and proofs; demonstrate the system by constructing a traditional proof-by-contradiction that $\sqrt{2}\notin\mathbb{Q}$; and discuss how the system avoids inconsistencies such as Russell's and Curry's paradoxes. Extensive documentation is provided in the appendices about core elements of the system. Current development and future work includes promising applications to quantum circuit manipulation and quantum algorithm verification.	翻訳日:2023-04-20 02:27:20 公開日:2020-12-26
# 双対性特性解-情報キャリング(SIC)ユニタリプロパゲータ The duality-character Solution-Information-Carrying (SIC) unitary propagators ( http://arxiv.org/abs/2012.13250v2 ) ライセンス: Link先を確認	Xijia Miao	(参考訳) HSSS量子探索プロセスは、ユニタリ量子力学と非構造探索問題の数学的原理の両方に従う二重文字を所有している。基本的には従来の量子探索アルゴリズムとは異なる。非構造探索問題の双対特性オラクル演算を用いて構築する。 1)探索空間の動的還元と(2)量子状態差分増幅(QUANSDAM)という2つの連続的なステップから構成される。 QUINSDAMプロセスはSICユニタリプロパゲータで直接構築され、後者は基本SICユニタリ演算子で準備される。ここでは、単原子系のSICユニタリプロパゲータの準備を、基本SICユニタリ演算子から開始して具体的に行う。量子系のSICユニタリプロパゲータは量子系の量子対称性を反映するが、基本的なSICユニタリ作用素は反映しない。量子対称性は、量子計算スピードアップ理論における基本的な量子計算スピードアップ資源と見なされる。この準備の目的は、量子対称性を利用してクエンダム過程を高速化することである。製剤は、溶液情報伝達プロセスである。単体的かつ決定論的である。情報保護法に従っている。方法論では、エネルギー固有関数展開と多重量子作用素代数空間に基づいている。さらに、ファインマン経路積分法とエネルギー固有関数展開法を主とする一般理論が確立され、任意の量子系の座標表現におけるSICユニタリプロパゲータを理論的に扱い、計算し、将来指数的quaNSDAM過程を理論的に構築することができる。 The HSSS quantum search process owns the dual character that it obeys both the unitary quantum dynamics and the mathematical-logical principle of the unstructured search problem. It is essentially different from a conventional quantum search algorithm. It is constructed with the duality-character oracle operations of unstructured search problem. It consists of the two consecutive steps: (1) the search-space dynamical reduction and (2) the dynamical quantum-state-difference amplification (QUANSDAM). The QUANSDAM process is directly constructed with the SIC unitary propagators, while the latter each are prepared with the basic SIC unitary operators. Here the preparation for the SIC unitary propagators of a single-atom system is concretely carried out by starting from the basic SIC unitary operators. The SIC unitary propagator of a quantum system may reflect the quantum symmetry of the quantum system, while the basic SIC unitary operators may not. The quantum symmetry is considered as the fundamental quantum-computing-speedup resource in the quantum-computing speedup theory. The purpose for the preparation is ultimately to employ the quantum symmetry to speed up the QUANSDAM process. The preparation is a solution-information transfer process. It is unitary and deterministic. It obeys the information conservation law. In methodology it is based on the energy eigenfunction expansion and the multiple-quantum operator algebra space. Furthermore, a general theory mainly based on the Feynman path integration technique and also the energy eigenfunction expansion method is established to treat theoretically and calculate a SIC unitary propagator of any quantum system in the coordinate representation, which may be further used to construct theoretically an exponential QUANSDAM process in future.	翻訳日:2023-04-19 11:53:13 公開日:2020-12-26
# 量子クレジットローン Quantum credit loans ( http://arxiv.org/abs/2101.03231v1 ) ライセンス: Link先を確認	Ardenghi Juan Sebastian	(参考訳) 量子力学の数学(qm)に基づく量子モデルは認知科学、ゲーム理論、生態物理学で開発されてきた。この研究では、QM のベクトル空間形式を用いて、クレジットローンの一般化を導入する。負債、償却、利子、定期的な設定の演算子が定義され、ベクトル空間の任意の正規直交基底における平均値は、ローンの各期間において対応する値を与える。 M がローン期間である次元 M のベクトル空間を SO(M) 対称性で回転させることで、固有基底を回転させ、SO(M) 変換の回転角を用いて借主のスケジュール周期的な支払いを改善することができる。回転がベクトルの長さを保存することを考えると、総償却、負債、周期的な配置は変化しない。導入された形式論の一般的な説明として、ローン作用素関係は有限次元表現を考慮し、特定のローンタイプに対して可換作用素を定義する一般化ハイゼンベルク代数(英語版)によって与えられる。その結果、借主が定期的な設置を調整できるように、貸し手が稼ぐものを変えることなく、より良い利益を得るために対応する交換業者の重ね合わせ状態を選択できるため、回転角を通じて自由度が数度導入されるため、通常のクレジットの金融機器の改善が図られる。 Quantum models based on the mathematics of quantum mechanics (QM) have been developed in cognitive sciences, game theory and econophysics. In this work a generalization of credit loans is introduced by using the vector space formalism of QM. Operators for the debt, amortization, interest and periodic installments are defined and its mean values in an arbitrary orthonormal basis of the vectorial space give the corresponding values at each period of the loan. Endowing the vector space of dimension M, where M is the loan duration, with a SO(M) symmetry, it is possible to rotate the eigenbasis to obtain better schedule periodic payments for the borrower, by using the rotation angles of the SO(M) transformation. Given that a rotation preserves the length of the vectors, the total amortization, debt and periodic installments are not changed. For a general description of the formalism introduced, the loan operator relations are given in terms of a generalized Heisenberg algebra, where finite dimensional representations are considered and commutative operators are defined for the specific loan types. The results obtained are an improvement of the usual financial instrument of credit because introduce several degrees of freedom through the rotation angles, which allows to select superposition states of the corresponding commutative operators that enables the borrower to tune the periodic installments in order to obtain better benefits without changing what the lender earns.	翻訳日:2023-04-19 05:56:48 公開日:2020-12-26
# シリセンの局所磁気モーメントに及ぼす外部電場の影響 Effect of an external electric field on local magnetic moments in silicene ( http://arxiv.org/abs/2101.00952v1 ) ライセンス: Link先を確認	Villarreal Julian, Escudero Federico, Ardenghi Juan Sebastian and Jasen Paula	(参考訳) 本研究では,シリセンにおける局所磁気モーメントの形成における外部電場の適用の影響を分析する。ホスト格子の上部に不純物を加えて、不純物エネルギーレベルの自己エネルギーの現実的および想像的部分を計算することにより、平均場近似を考慮した不純物における上下スピン形成の占有数を得るために、状態の偏極密度を用いる。不等占領数は局所的な磁気モーメントの形成の前駆体であり、これはハバードパラメータ、不純物のオンサイトエネルギー、シリセン中のスピン軌道相互作用、および電場に大きく依存する。特に、電場がない場合、磁気相と非磁性相の境界は、トップサイト不純物を持つグラフェンに対するスピン軌道相互作用によって増大し、電場がオンになると収縮して狭くなることが示されている。グラフェンの文献で得られた結果を一般化した負および正のオンサイト不純物エネルギーに対する電界効果について検討した。 In this work we analyze the effects of the application of an external electric field in the formation of a local magnetic moment in silicene. By adding an impurity in a top site in the host lattice and computing the real and imaginary part of the self-energy of the impurity energy level, the polarized density of states is used in order to obtain the occupation number of the up and down spin formation in the impurity considering the mean field approximation. Unequal occupation numbers is the precursor of a formation of a local magnetic moment and this depends critically on the Hubbard parameter, the on-site energy of the impurity, the spin-orbit interaction in silicene and the electric field applied. In particular, it is shown that in the absence of electric field, the boundary between the magnetic and non-magnetic phases increases with the spin-orbit interaction with respect to graphene with a top site impurity and shrinks and narrows it when the electric field is turned on. The electric field effect is studied for negative and positive on-site impurity energies generalizing the results obtained in the literature for graphene.	翻訳日:2023-04-19 05:56:25 公開日:2020-12-26
# 非線形ゲージ結合量子流体の流体力学 Hydrodynamics of nonlinear gauge-coupled quantum fluids ( http://arxiv.org/abs/2012.13834v1 ) ライセンス: Link先を確認	Y. Buggy, L.G. Phillips and P. \"Ohberg	(参考訳) 流体力学の正準形式を構築することにより,ボース凝縮流体の平均場ハミルトニアンにおける任意の密度依存ゲージポテンシャルの発生は,流体密度に対する力学的流れの明示的な依存性によって生じる位相の波動方程式における非線形フロー依存項をもたらすことを示した。さらに、この種類の非線形流体に対して標準運動量輸送方程式を導出し、応力テンソルの式を得る。さらに, 非線形流体中の流体力学式について検討し, 光学配向二層原子の超低温希薄ボースガス中の弱い接触相互作用の導入による有効ゲージポテンシャルについて検討した。超流体の機械運動量輸送のコーシー方程式では、密度依存ベクトルポテンシャルにより2つの非自明な項が現れる。希釈の体力はゲージポテンシャルと流体の希釈率の積として現れるが、応力テンソルはゲージポテンシャルの内積と正準電流密度によって与えられる正準流れ圧力項を特徴とする。数値シミュレーションにより, 外部不純物の存在下での超流体の基底状態波動関数に対する非線形ゲージポテンシャルの興味深い影響を示す。基底状態は非自明な局所位相を採用しており、ゲージポテンシャルの反転の下では非対称である。相プロファイルは、不純物に関する正準流または相流双極子につながり、機械的な流れを引き起こす。その結果、圧力は物体に対して非対称になり、凝縮物は変形する。 By constructing a hydrodynamic canonical formalism, we show that the occurrence of an arbitrary density-dependent gauge potential in the meanfield Hamiltonian of a Bose-condensed fluid invariably leads to nonlinear flow-dependent terms in the wave equation for the phase, where such terms arise due to the explicit dependence of the mechanical flow on the fluid density. In addition, we derive a canonical momentum transport equation for this class of nonlinear fluid and obtain an expression for the stress tensor. Further, we study the hydrodynamic equations in a particular nonlinear fluid, where the effective gauge potential results from the introduction of weak contact interactions in an ultracold dilute Bose gas of optically-addressed two-level atoms. In the Cauchy equation of mechanical momentum transport of the superfluid, two non-trivial terms emerge due to the density-dependent vector potential. A body-force of dilation appears as a product of the gauge potential and the dilation rate of the fluid, while the stress tensor features a canonical flow pressure term given by the inner-product of the gauge potential and the canonical current density. By numerical simulation, we illustrate an interesting effect of the nonlinear gauge potential on the groundstate wavefunction of a superfluid in the presence of a foreign impurity. We find that the groundstate adopts a non-trivial local phase, which is antisymmetric under reversal of the gauge potential. The phase profile leads to a canonical-flow or phase-flow dipole about the impurity, resulting in a skirting mechanical flow. As a result, the pressure becomes asymmetric about the object and the condensate undergoes a deformation.	翻訳日:2023-04-19 05:56:07 公開日:2020-12-26
# Varshni-Hulth\'enポテンシャルモデルと相互作用するN-次元シュリンガー方程式の固有解 Eigensolutions of the N-dimensional Schr\"odinger equation interacting with Varshni-Hulth\'en potential model ( http://arxiv.org/abs/2012.13826v1 ) ライセンス: Link先を確認	E. P. Inyang, E. S. William and J. A. Obu	(参考訳) 新しく提案されたヴァルシュニ・ハルトポテンシャルに対するN次元シュリンガー方程式の解析解は、遠心障壁に対するグリーン・アルドリッチ近似スキームを用いてニキフォロフ・ウバロフ法の枠組みの中で得られる。数値エネルギー固有値と対応する正規化固有関数はヤコビ多項式の項で得られる。ポテンシャルの特別なケースは等しく研究され、それらの数値エネルギー固有値は他の手法で得られたものと一致している。しかし、基底状態といくつかの励起状態に対するエネルギーの挙動は図式的に示される。 Analytical solutions of the N-dimensional Schr\"odinger equation for the newly proposed Varshni-Hulth\'en potential are obtained within the framework of Nikiforov-Uvarov method by using Greene-Aldrich approximation scheme to the centrifugal barrier. The numerical energy eigenvalues and the corresponding normalized eigenfunctions are obtained in terms of Jacobi polynomials. Special cases of the potential are equally studied and their numerical energy eigenvalues are in agreement with those obtained previously with other methods. However, the behavior of the energy for the ground state and several excited states is illustrated graphically.	翻訳日:2023-04-19 05:55:40 公開日:2020-12-26
# 集合非線形性を用いた近決定的弱値メトロロジー Near-Deterministic Weak-Value Metrology via Collective non-Linearity ( http://arxiv.org/abs/2012.13749v1 ) ライセンス: Link先を確認	Muthumanimaran Vetrivelan and Sai Vinjanampathy	(参考訳) 弱値増幅は、関心の小さなパラメータの測定を強化するためにポストセレクションを用いる。増幅は成功確率の低下を犠牲にして行われ、実用的メトロロジーのツールとしてこの技術の有用性を阻害する。量子アドバンテージを示す他の量子技術に従い、成功確率の量子アドバンテージを定式化し、成功確率の超拡大を示す非線形集団ハミルトニアンに基づくスキームを提示し、同時に弱値の広範な成長を示す。提案手法の実験的実装を提案する。 Weak-value amplification employs postselection to enhance the measurement of small parameters of interest. The amplification comes at the expense of reduced success probability, hindering the utility of this technique as a tool for practical metrology. Following other quantum technologies that display a quantum advantage, we formalize a quantum advantage in the success probability and present a scheme based on non-linear collective Hamiltonians that shows a super-extensive growth in success probability while simultaneously displaying an extensive growth in the weak value. We propose an experimental implementation of our scheme.	翻訳日:2023-04-19 05:54:34 公開日:2020-12-26
# 2モードスクイーズ光を用いた量子増強2光子分光 Quantum-Enhanced Two-Photon Spectroscopy Using Two-mode Squeezed Light ( http://arxiv.org/abs/2012.13745v1 ) ライセンス: Link先を確認	Nikunjkumar Prajapati, Ziqi Niu, and Irina Novikova	(参考訳) 本研究では,Rb蒸気中で発生する2モード強度強化ツインビームを用いて,2光子ラマン転移による分光測定の感度向上を図る。原理実証の実証として,ラマンポンプレーザーパワーとrb蒸気数密度の要件を低減した超微細構造rb 5.d_{3/2}の量子エンハンス測定を実証した。 We investigate the prospects of using two-mode intensity squeezed twin-beams, generated in Rb vapor, to improve the sensitivity of spectroscopic measurements by engaging two-photon Raman transitions. As a proof of principle demonstration, we demonstrated the quantum-enhanced measurements of the Rb $5D_{3/2}$ hyperfine structure with reduced requirements for the Raman pump laser power and Rb vapor number density.	翻訳日:2023-04-19 05:54:24 公開日:2020-12-26
# 深いシグマ点過程 Deep Sigma Point Processes ( http://arxiv.org/abs/2002.09112v2 ) ライセンス: Link先を確認	Martin Jankowiak, Geoff Pleiss, Jacob R. Gardner	(参考訳) 本稿では,Deep Gaussian Processes (DGP) の構成構造から着想を得たパラメトリックモデルのクラスであるDeep Sigma Point Processesを紹介する。ディープシグマポイントプロセス(DSPP)は、カーネル基底関数によって制御されるミニバッチトレーニングや予測不確実性を含む、(可変)DGPの魅力的な特徴の多くを保持している。重要なことは、DSPPは単純な極大推定手順を許容しているため、結果として生じる予測分布は後部近似によって劣化しない。単変量および多変量回帰タスクに関する広範な実証的な比較では、結果の予測分布は、拡張性のある回帰のための他の確率的手法で得られたものよりも、はるかによく校正されている。 We introduce Deep Sigma Point Processes, a class of parametric models inspired by the compositional structure of Deep Gaussian Processes (DGPs). Deep Sigma Point Processes (DSPPs) retain many of the attractive features of (variational) DGPs, including mini-batch training and predictive uncertainty that is controlled by kernel basis functions. Importantly, since DSPPs admit a simple maximum likelihood inference procedure, the resulting predictive distributions are not degraded by any posterior approximations. In an extensive empirical comparison on univariate and multivariate regression tasks we find that the resulting predictive distributions are significantly better calibrated than those obtained with other probabilistic methods for scalable regression, including variational DGPs--often by as much as a nat per datapoint.	翻訳日:2022-12-30 00:33:18 公開日:2020-12-26
# 深層強化学習のための弱い人間選好監督 Weak Human Preference Supervision For Deep Reinforcement Learning ( http://arxiv.org/abs/2007.12904v2 ) ライセンス: Link先を確認	Zehong Cao, KaiChiu Wong, Chin-Teng Lin	(参考訳) 人間の好みからの現在の報酬学習は、一対の軌道セグメント間の単一の固定された嗜好を定義することで、報酬関数にアクセスせずに複雑な強化学習(RL)タスクを解決するために使用できる。しかし、軌道間の選好の判断は動的ではなく、何千回も繰り返して人間の入力を必要とする。本研究では,人選好の選好度を自然に反映した人選好スケーリングモデルを構築し,教師付き学習による人選好推定装置を構築し,人選好数を減らすための予測選好を生成するという,弱い人選好監視フレームワークを提案する。提案されている弱い人間の嗜好監視フレームワークは、複雑なRLタスクを効果的に解決し、シミュレーションされたロボットの移動 -- MuJoCoゲーム -- における累積的な報酬を達成することができる。さらに,本手法では,環境との相互作用の0.01 %未満の人的フィードバックしか必要とせず,既存の手法と比較して,人的入力のコストを最大30 %削減する。このアプローチの柔軟性を示すために、私たちは、異なるタイプの人間の入力に基づいて訓練されたエージェントの振る舞いの比較を示すビデオ(https://youtu.be/jQPe1OILT0M)をリリースした。我々は、弱い教師付き学習による自然にインスピレーションを受けた人間の嗜好が、正確な報酬学習に有用であり、人間と自律的なチームリングシステムのような最先端のRLシステムに適用できると考えている。 The current reward learning from human preferences could be used to resolve complex reinforcement learning (RL) tasks without access to a reward function by defining a single fixed preference between pairs of trajectory segments. However, the judgement of preferences between trajectories is not dynamic and still requires human input over thousands of iterations. In this study, we proposed a weak human preference supervision framework, for which we developed a human preference scaling model that naturally reflects the human perception of the degree of weak choices between trajectories and established a human-demonstration estimator via supervised learning to generate the predicted preferences for reducing the number of human inputs. The proposed weak human preference supervision framework can effectively solve complex RL tasks and achieve higher cumulative rewards in simulated robot locomotion -- MuJoCo games -- relative to the single fixed human preferences. Furthermore, our established human-demonstration estimator requires human feedback only for less than 0.01\% of the agent's interactions with the environment and significantly reduces the cost of human inputs by up to 30\% compared with the existing approaches. To present the flexibility of our approach, we released a video (https://youtu.be/jQPe1OILT0M) showing comparisons of the behaviours of agents trained on different types of human input. We believe that our naturally inspired human preferences with weakly supervised learning are beneficial for precise reward learning and can be applied to state-of-the-art RL systems, such as human-autonomy teaming systems.	翻訳日:2022-11-07 01:10:23 公開日:2020-12-26
# 意思決定アルゴリズム評価のためのマルチモーダル安全批判シナリオ生成 Multimodal Safety-Critical Scenarios Generation for Decision-Making Algorithms Evaluation ( http://arxiv.org/abs/2009.08311v3 ) ライセンス: Link先を確認	Wenhao Ding, Baiming Chen, Bo Li, Kim Ji Eun, Ding Zhao	(参考訳) 既存のニューラルネットワークベースの自律システムは敵攻撃に対して脆弱であるため、その堅牢性に関する高度な評価は非常に重要である。しかしながら、既知の攻撃に基づいて最悪のシナリオでのみロバスト性を評価することは包括的ではない。加えて、安全クリティカルなデータの分布は通常マルチモーダルであり、伝統的な攻撃や評価方法は単一のモダリティに焦点を当てている。上記の課題を解決するため,意思決定アルゴリズムを評価するためのフローベースマルチモーダル安全クリティカルシナリオジェネレータを提案する。提案する生成モデルは重み付き確率最大化により最適化され, 勾配に基づくサンプリング手法が統合され, サンプリング効率が向上する。セーフティクリティカルなシナリオはタスクアルゴリズムをクエリすることで生成され、生成されたシナリオのログライクな状態はリスクレベルに比例する。自動運転タスクの実験は、テスト効率とマルチモーダルモデリング能力の観点から、我々の利点を示しています。 6つの強化学習アルゴリズムを生成したトラヒックシナリオで評価し,その頑健性に関する実証的結論を与える。 Existing neural network-based autonomous systems are shown to be vulnerable against adversarial attacks, therefore sophisticated evaluation on their robustness is of great importance. However, evaluating the robustness only under the worst-case scenarios based on known attacks is not comprehensive, not to mention that some of them even rarely occur in the real world. In addition, the distribution of safety-critical data is usually multimodal, while most traditional attacks and evaluation methods focus on a single modality. To solve the above challenges, we propose a flow-based multimodal safety-critical scenario generator for evaluating decisionmaking algorithms. The proposed generative model is optimized with weighted likelihood maximization and a gradient-based sampling procedure is integrated to improve the sampling efficiency. The safety-critical scenarios are generated by querying the task algorithms and the log-likelihood of the generated scenarios is in proportion to the risk level. Experiments on a self-driving task demonstrate our advantages in terms of testing efficiency and multimodal modeling capability. We evaluate six Reinforcement Learning algorithms with our generated traffic scenarios and provide empirical conclusions about their robustness.	翻訳日:2022-10-17 23:47:22 公開日:2020-12-26
# コメントとソースコード間の深いジャストインタイム不整合検出 Deep Just-In-Time Inconsistency Detection Between Comments and Source Code ( http://arxiv.org/abs/2010.01625v2 ) ライセンス: Link先を確認	Sheena Panthaplackel, Junyi Jessy Li, Milos Gligoric, Raymond J. Mooney	(参考訳) 自然言語コメントは、実装、使用法、プリ・ポスト・コンディションといったソースコードの重要な側面を伝える。対応するコードが変更されたときにコメントを更新するのに失敗すると、矛盾が生じ、混乱とソフトウェアのバグが引き起こされる。本稿では,コードベースにコミットする前に,コメントが対応するコード本体の変更によって一貫性のないものになるかどうかを検知し,潜在的な不整合性,すなわち,コードベースにコミットする前に検出することを目的とする。これを実現するために,コメントとコードの変更を関連付けるディープラーニングアプローチを開発した。様々なコメントタイプにまたがるコメント/コードペアの大規模なコーパスを評価することで,本モデルが複数のベースラインを著しく上回ることを示す。外部評価において,コード変更に基づく不整合コメントの検出と解決が可能な,より包括的な自動コメント保守システムを構築するために,コメント更新モデルと組み合わせて提案手法の有用性を示す。 Natural language comments convey key aspects of source code such as implementation, usage, and pre- and post-conditions. Failure to update comments accordingly when the corresponding code is modified introduces inconsistencies, which is known to lead to confusion and software bugs. In this paper, we aim to detect whether a comment becomes inconsistent as a result of changes to the corresponding body of code, in order to catch potential inconsistencies just-in-time, i.e., before they are committed to a code base. To achieve this, we develop a deep-learning approach that learns to correlate a comment with code changes. By evaluating on a large corpus of comment/code pairs spanning various comment types, we show that our model outperforms multiple baselines by significant margins. For extrinsic evaluation, we show the usefulness of our approach by combining it with a comment update model to build a more comprehensive automatic comment maintenance system which can both detect and resolve inconsistent comments based on code changes.	翻訳日:2022-10-11 02:58:00 公開日:2020-12-26
# 重み付き不均一グラフに基づく対話システム A Weighted Heterogeneous Graph Based Dialogue System ( http://arxiv.org/abs/2010.10699v2 ) ライセンス: Link先を確認	Xinyan Zhao, Liangwei Chen, Huanhuan Chen	(参考訳) 知識に基づく対話システムは、多様なアプリケーションに対する研究の関心を惹きつけている。しかし, 疾患診断においては, 従来の知識グラフのエッジが重み付けされていないため, 症状-症状関係と症状-症状関係を表現することは困難である。疾患診断対話システムに関するほとんどの研究は、データ駆動型手法と統計的特徴に強く依存しており、症状-症状関係と症状-交感神経関係の深い理解を欠いている。そこで本研究では,重み付きヘテロジニアスグラフを用いた疾患診断のための対話システムを提案する。具体的には、症状共起に基づく重み付きヘテロジニアスグラフと、症状周波数逆病頻度を提案する。次に,対話管理のためのグラフベースのディープqネットワーク(graph-dqn)を提案する。グラフ畳み込みネットワーク(GCN)とDQNを組み合わせることで、重み付きヘテロジニアスグラフの構造情報と属性情報の両方から疾患や症状の埋め込みを学習することで、Graph-DQNは症状・症状・症状の関係をよりよく捉えることができる。実験の結果,提案する対話システムは最先端のモデルに匹敵することがわかった。さらに重要なことは、対話システムは対話のターンを減らしてタスクを完了し、類似の症状を持つ疾患に対するより良い識別能力を有することである。 Knowledge based dialogue systems have attracted increasing research interest in diverse applications. However, for disease diagnosis, the widely used knowledge graph is hard to represent the symptom-symptom relations and symptom-disease relations since the edges of traditional knowledge graph are unweighted. Most research on disease diagnosis dialogue systems highly rely on data-driven methods and statistical features, lacking profound comprehension of symptom-disease relations and symptom-symptom relations. To tackle this issue, this work presents a weighted heterogeneous graph based dialogue system for disease diagnosis. Specifically, we build a weighted heterogeneous graph based on symptom co-occurrence and a proposed symptom frequency-inverse disease frequency. Then this work proposes a graph based deep Q-network (Graph-DQN) for dialogue management. By combining Graph Convolutional Network (GCN) with DQN to learn the embeddings of diseases and symptoms from both the structural and attribute information in the weighted heterogeneous graph, Graph-DQN could capture the symptom-disease relations and symptom-symptom relations better. Experimental results show that the proposed dialogue system rivals the state-of-the-art models. More importantly, the proposed dialogue system can complete the task with less dialogue turns and possess a better distinguishing capability on diseases with similar symptoms.	翻訳日:2022-10-04 22:39:13 公開日:2020-12-26
# ロシア語の科学・技術文献からのエンティティ認識と関係抽出 Entity Recognition and Relation Extraction from Scientific and Technical Texts in Russian ( http://arxiv.org/abs/2011.09817v3 ) ライセンス: Link先を確認	Elena Bruches, Alexey Pauls, Tatiana Batura, Vladimir Isachenko	(参考訳) 本稿では,情報技術に関する学術文献から情報抽出(エンティティ認識と関係分類)の手法について考察する。科学出版物は最先端の科学的進歩に貴重な情報を提供するが、データ量の増加の効率的な処理は時間のかかる作業である。本稿では、ロシア語の方法のいくつかの修正を提案する。また、キーワード抽出法、語彙法、およびニューラルネットワークに基づくいくつかの方法の比較実験結果を含む。これらのタスクのためのテキストコレクションは英語に存在し、科学コミュニティが積極的に使用しているが、現在、ロシア語のデータセットは公開されていない。本稿では,ロシアにおける学術文献のコーパス,RuSERRCについて述べる。このデータセットは1600の未ラベル文書と80のエンティティとセマンティックリレーションでラベル付けされている(6つの関係型が考慮された)。データセットとモデルはhttps://github.com/iis-research-teamで入手できる。情報抽出システムの研究や開発に活用できることを願っている。 This paper is devoted to the study of methods for information extraction (entity recognition and relation classification) from scientific texts on information technology. Scientific publications provide valuable information into cutting-edge scientific advances, but efficient processing of increasing amounts of data is a time-consuming task. In this paper, several modifications of methods for the Russian language are proposed. It also includes the results of experiments comparing a keyword extraction method, vocabulary method, and some methods based on neural networks. Text collections for these tasks exist for the English language and are actively used by the scientific community, but at present, such datasets in Russian are not publicly available. In this paper, we present a corpus of scientific texts in Russian, RuSERRC. This dataset consists of 1600 unlabeled documents and 80 labeled with entities and semantic relations (6 relation types were considered). The dataset and models are available at https://github.com/iis-research-team. We hope they can be useful for research purposes and development of information extraction systems.	翻訳日:2022-09-23 21:00:23 公開日:2020-12-26
# 多発性硬化性病変の分節化 : CNN法の検討 Multiple Sclerosis Lesion Segmentation -- A Survey of Supervised CNN-Based Methods ( http://arxiv.org/abs/2012.08317v2 ) ライセンス: Link先を確認	Huahong Zhang and Ipek Oguz	(参考訳) 病変分割は多発性硬化症患者のmriスキャンを定量的に解析するための重要な課題である。近年,様々な医療画像解析アプリケーションにおける深層学習技術の成功により,この課題に対するコミュニティの関心が高まり,新たなアルゴリズム開発に向けた活動が活発化している。そこで本研究では,CNNを用いたMS病変分類法について検討した。レビューした作品をアルゴリズムコンポーネントに分離し,それぞれを別々に議論する。公開ベンチマークデータセットの評価を行う手法については,結果の比較を報告する。 Lesion segmentation is a core task for quantitative analysis of MRI scans of Multiple Sclerosis patients. The recent success of deep learning techniques in a variety of medical image analysis applications has renewed community interest in this challenging problem and led to a burst of activity for new algorithm development. In this survey, we investigate the supervised CNN-based methods for MS lesion segmentation. We decouple these reviewed works into their algorithmic components and discuss each separately. For methods that provide evaluations on public benchmark datasets, we report comparisons between their results.	翻訳日:2021-05-10 05:20:51 公開日:2020-12-26
# 分散検索によるクエリ応答 Query Answering via Decentralized Search ( http://arxiv.org/abs/2012.12192v2 ) ライセンス: Link先を確認	Liang Ma	(参考訳) エキスパートネットワークは、ネットワークに投稿された特定のクエリを協調的に解決するために、異なる専門性を持つ専門家専門家のグループによって形成される。このようなネットワークでは、十分な専門知識を持たない専門家に問い合わせが届くと、このクエリを他の専門家にルーティングして、完全に解決するまで処理する必要があるため、クエリ応答効率は、使用されているクエリルーティングメカニズムに敏感である。可能なすべてのクエリルーティング機構のうち、ネットワークのグローバル構造を知らずに各専門家のローカル情報に対して純粋に動作する分散検索は、最も基本的でスケーラブルなルーティング機構であり、動的ネットワークにおいても任意のネットワークシナリオに適用できる。しかし、専門家ネットワークにおける分散検索の効率性に関する基本的な理解が不足している。本稿では,様々なネットワーク環境下での性能を定量化し,分散検索について検討する。我々の重要な発見はネットワーク条件の存在を示し、その下にある分散検索は、非常に短いクエリルーティングパスを実現することができる(すなわち、$O(\log n)$と$O(\log^2 n)$ホップ、$n$:ネットワークの専門家の総数)。このような理論的基礎に基づき、専門家ネットワークにおける分散探索のユニークな性質が、逸話的小世界現象とどのように関連しているかをさらに研究する。さらに,必要な専門知識レベルを誤解釈することによって生じる推定誤差に対して,分散検索が堅牢であることを示す。我々の知る限りでは、これはエキスパートネットワークにおける分散検索の基本的な振る舞いを研究する最初の研究である。開発したパフォーマンス境界は、実際のデータセットによって確認され、ネットワークパフォーマンスの予測と複雑なエキスパートネットワークの設計を支援することができる。 Expert networks are formed by a group of expert-professionals with different specialties to collaboratively resolve specific queries posted to the network. In such networks, when a query reaches an expert who does not have sufficient expertise, this query needs to be routed to other experts for further processing until it is completely solved; therefore, query answering efficiency is sensitive to the underlying query routing mechanism being used. Among all possible query routing mechanisms, decentralized search, operating purely on each expert's local information without any knowledge of network global structure, represents the most basic and scalable routing mechanism, which is applicable to any network scenarios even in dynamic networks. However, there is still a lack of fundamental understanding of the efficiency of decentralized search in expert networks. In this regard, we investigate decentralized search by quantifying its performance under a variety of network settings. Our key findings reveal the existence of network conditions, under which decentralized search can achieve significantly short query routing paths (i.e., between $O(\log n)$ and $O(\log^2 n)$ hops, $n$: total number of experts in the network). Based on such theoretical foundation, we further study how the unique properties of decentralized search in expert networks is related to the anecdotal small-world phenomenon. In addition, we demonstrate that decentralized search is robust against estimation errors introduced by misinterpreting the required expertise levels. To the best of our knowledge, this is the first work studying fundamental behaviors of decentralized search in expert networks. The developed performance bounds, confirmed by real datasets, are able to assist in predicting network performance and designing complex expert networks.	翻訳日:2021-05-01 18:15:42 公開日:2020-12-26
# (参考訳) 不確実性定量化による糖尿病網膜症分類の能動的学習法 An Active Learning Method for Diabetic Retinopathy Classification with Uncertainty Quantification ( http://arxiv.org/abs/2012.13325v2 ) ライセンス: CC BY 4.0	Muhammad Ahtazaz Ahsan, Adnan Qayyum, Junaid Qadir and Adeel Razi	(参考訳) 近年、深層学習(DL)技術は様々な医療画像のタスクに最先端のパフォーマンスを提供している。しかし、時間的制約や専門的なアノテータ、例えば放射線技師が利用できるため、良質なアノテート医療データの入手は非常に困難である。加えて、DLはデータハングリーであり、そのトレーニングには広範な計算資源が必要である。 DLのもう1つの問題は、そのブラックボックスの性質と、因果的理解と推論を妨げる内部動作への透明性の欠如である。本稿では,不確実性定量化のためのベイズ畳み込みニューラルネットワーク(BCNN)を用いたハイブリッドモデルと,未ラベルデータの注釈付けのためのアクティブラーニングアプローチを提案する。 BCNNは機能記述子として使用され、これらの機能は、アクティブな学習環境でモデルのトレーニングに使用される。糖尿病網膜症分類の枠組みについて検討し,様々な指標で最先端のパフォーマンスを達成した。 In recent years, deep learning (DL) techniques have provided state-of-the-art performance on different medical imaging tasks. However, the availability of good quality annotated medical data is very challenging due to involved time constraints and the availability of expert annotators, e.g., radiologists. In addition, DL is data-hungry and their training requires extensive computational resources. Another problem with DL is their black-box nature and lack of transparency on its inner working which inhibits causal understanding and reasoning. In this paper, we jointly address these challenges by proposing a hybrid model, which uses a Bayesian convolutional neural network (BCNN) for uncertainty quantification, and an active learning approach for annotating the unlabelled data. The BCNN is used as a feature descriptor and these features are then used for training a model, in an active learning setting. We evaluate the proposed framework for diabetic retinopathy classification problem and have achieved state-of-the-art performance in terms of different metrics.	翻訳日:2021-04-25 13:13:08 公開日:2020-12-26
# (参考訳) 組織的サイバーセキュリティリスクの予測 - ディープラーニングアプローチ Predicting Organizational Cybersecurity Risk: A Deep Learning Approach ( http://arxiv.org/abs/2012.14425v1 ) ライセンス: CC BY 4.0	Benjamin M. Ampel	(参考訳) 悪意あるハッカーによるサイバー攻撃は、組織、政府、個人に毎年不可分なダメージを与える。ハッカーはハッカーフォーラムで見つかったエクスプロイトを使って複雑なサイバー攻撃を実行し、これらのフォーラムの探索を不可欠にする。本稿では,攻撃対象と攻撃対象を識別するためのハッカーフォーラムエンティティ認識フレームワーク(HackER)を提案する。 hackerは双方向のlong short-term memory model(bilstm)を使用して、企業がエクスプロイト対象とする予測モデルを作成する。アルゴリズムの結果は、精度、精度、リコール、F1スコアを指標として、手動でゴールドスタンダードテストデータセットを使用して評価される。このモデルと最先端の古典的機械学習とディープラーニングのベンチマークモデルを比較します。その結果,提案したHacker BiLSTMモデルはF1スコア(79.71%)の古典的機械学習モデルやディープラーニングモデルよりも優れていた。これらの結果はLSTMを除く全てのベンチマークで0.05以下で統計的に有意である。予備研究の結果から,サイバーセキュリティの重要なステークホルダー(アナリスト,研究者,教育者など)が,エクスプロイトがターゲットとするビジネスの種類を特定するのに役立つことが示唆された。 Cyberattacks conducted by malicious hackers cause irreparable damage to organizations, governments, and individuals every year. Hackers use exploits found on hacker forums to carry out complex cyberattacks, making exploration of these forums vital. We propose a hacker forum entity recognition framework (HackER) to identify exploits and the entities that the exploits target. HackER then uses a bidirectional long short-term memory model (BiLSTM) to create a predictive model for what companies will be targeted by exploits. The results of the algorithm will be evaluated using a manually labeled gold-standard test dataset, using accuracy, precision, recall, and F1-score as metrics. We choose to compare our model against state of the art classical machine learning and deep learning benchmark models. Results show that our proposed HackER BiLSTM model outperforms all classical machine learning and deep learning models in F1-score (79.71%). These results are statistically significant at 0.05 or lower for all benchmarks except LSTM. The results of preliminary work suggest our model can help key cybersecurity stakeholders (e.g., analysts, researchers, educators) identify what type of business an exploit is targeting.	翻訳日:2021-04-25 04:15:46 公開日:2020-12-26
# (参考訳) rough to fine: global/local attentionによるマルチラベル画像分類 Coarse to Fine: Multi-label Image Classification with Global/Local Attention ( http://arxiv.org/abs/2012.13662v1 ) ライセンス: CC BY 4.0	Fan Lyu, Fuyuan Hu, Victor S. Sheng, Zhengtian Wu, Qiming Fu and Baochuan Fu	(参考訳) 私たちの日常生活では、周囲のシーンは常に複数のラベルがあり、特にスマートシティ、すなわち、応答と制御に対する都市操作の情報を認識する。ディープニューラルネットワークを使ってマルチラベル画像を認識することで、大きな努力がなされている。マルチラベル画像分類は非常に複雑であるため、注意機構を用いて分類プロセスを導こうとしている。しかし,従来の注意法は画像を直接的かつ積極的に分析する。複雑な場面をよく理解することは困難である。本稿では,人間による画像観察を模倣することで,粗い画像から細かい画像まで認識できるグローバル/ローカルアテンション手法を提案する。具体的には、まず、グローバル/ローカルアテンション手法が画像全体に集中し、次に画像内の局所的なオブジェクトに注目します。また,正のラベルの最小スコアが負のラベルの最大スコアよりも水平および垂直に大きいことを強制する統合的マックスマージン客観的関数を提案する。この機能は、マルチラベル画像分類法をさらに改善することができる。提案手法の有効性を2つの一般的なマルチラベル画像データセット(Pascal VOCとMS-COCO)で評価した。実験の結果,本手法は最先端手法よりも優れていた。 In our daily life, the scenes around us are always with multiple labels especially in a smart city, i.e., recognizing the information of city operation to response and control. Great efforts have been made by using Deep Neural Networks to recognize multi-label images. Since multi-label image classification is very complicated, people seek to use the attention mechanism to guide the classification process. However, conventional attention-based methods always analyzed images directly and aggressively. It is difficult for them to well understand complicated scenes. In this paper, we propose a global/local attention method that can recognize an image from coarse to fine by mimicking how human-beings observe images. Specifically, our global/local attention method first concentrates on the whole image, and then focuses on local specific objects in the image. We also propose a joint max-margin objective function, which enforces that the minimum score of positive labels should be larger than the maximum score of negative labels horizontally and vertically. This function can further improve our multi-label image classification method. We evaluate the effectiveness of our method on two popular multi-label image datasets (i.e., Pascal VOC and MS-COCO). Our experimental results show that our method outperforms state-of-the-art methods.	翻訳日:2021-04-25 04:06:56 公開日:2020-12-26
# (参考訳) ミリ波センシング:アプリケーションパイプラインとビルディングブロックのレビュー Millimeter Wave Sensing: A Review of Application Pipelines and Building Blocks ( http://arxiv.org/abs/2012.13664v1 ) ライセンス: CC BY 4.0	Bram van Berlo, Amany Elkelany, Tanir Ozcelebi, Nirvana Meratnia	(参考訳) 新しい無線アプリケーションの帯域幅が増加すると、高速無線通信のためのミリ波スペクトルの標準化に繋がる。ミリ波スペクトルは5Gの一部であり、10mmから1mmの波長に対応する周波数は30〜300GHzである。ミリ波は、しばしば通信媒体と見なされるが、狭いビーム、広帯域での動作、大気成分との相互作用などにより、優れた「センサー」であることが証明されている。本稿では,ミリ波センシングアプリケーションパイプラインを網羅する最初のレビューにおいて,ハードウェア,アルゴリズム,解析モデル,モデル評価技術など,さまざまな基本アプリケーションパイプライン構築ブロックの概要と解析について述べる。レビューはまた、異なるミリ波センシングアプリケーションドメインを強調する分類も提供している。総合的な分析を行い、体系的な文献レビューの方法論に従い、165の論文をレビューすることで、ミリ波技術の通信面のみに焦点をあて、アクティブイメージングにミリ波技術を用い、科学的・技術的課題と動向を強調し、ミリ波をセンシング技術として応用するための今後の展望を提供する。 The increasing bandwidth requirement of new wireless applications has lead to standardization of the millimeter wave spectrum for high-speed wireless communication. The millimeter wave spectrum is part of 5G and covers frequencies between 30 and 300 GHz corresponding to wavelengths ranging from 10 to 1 mm. Although millimeter wave is often considered as a communication medium, it has also proved to be an excellent 'sensor', thanks to its narrow beams, operation across a wide bandwidth, and interaction with atmospheric constituents. In this paper, which is to the best of our knowledge the first review that completely covers millimeter wave sensing application pipelines, we provide a comprehensive overview and analysis of different basic application pipeline building blocks, including hardware, algorithms, analytical models, and model evaluation techniques. The review also provides a taxonomy that highlights different millimeter wave sensing application domains. By performing a thorough analysis, complying with the systematic literature review methodology and reviewing 165 papers, we not only extend previous investigations focused only on communication aspects of the millimeter wave technology and using millimeter wave technology for active imaging, but also highlight scientific and technological challenges and trends, and provide a future perspective for applications of millimeter wave as a sensing technology.	翻訳日:2021-04-25 03:55:02 公開日:2020-12-26
# (参考訳) 雑音ラベルを用いたロバスト協調学習 Robust Collaborative Learning with Noisy Labels ( http://arxiv.org/abs/2012.13670v1 ) ライセンス: CC BY 4.0	Mengying Sun, Jing Xing, Bin Chen, Jiayu Zhou	(参考訳) カリキュラムによる学習は、適切な設計でノイズの多いサンプルを再重み付けしたりフィルタリングしたりできるため、データがノイズラベルを含むタスクにおいて大きな効果を示してきた。しかし、追加の監督やフィードバックなしに学習者自身からカリキュラムを得ることは、サンプル選択バイアスによる効果を低下させる。そのため、2つ以上のネットワークを含む手法が近年提案されている。それにもかかわらず、これらの研究はネットワーク間の協調を利用して、意見の相違を強調したり、合意に焦点を合わせながら他方を無視したりしている。本稿では,ネットワーク間の不一致と合意が勾配の雑音を減らし,ネットワーク間の不一致と合意の両面を活用したロバスト協調学習(RCL)と呼ばれる新しいフレームワークを開発するための基盤機構について検討する。実世界の大規模バイオインフォマティクスデータとベンチマーク画像データの両方において,RCLの有効性を示す。 Learning with curriculum has shown great effectiveness in tasks where the data contains noisy (corrupted) labels, since the curriculum can be used to re-weight or filter out noisy samples via proper design. However, obtaining curriculum from a learner itself without additional supervision or feedback deteriorates the effectiveness due to sample selection bias. Therefore, methods that involve two or more networks have been recently proposed to mitigate such bias. Nevertheless, these studies utilize the collaboration between networks in a way that either emphasizes the disagreement or focuses on the agreement while ignores the other. In this paper, we study the underlying mechanism of how disagreement and agreement between networks can help reduce the noise in gradients and develop a novel framework called Robust Collaborative Learning (RCL) that leverages both disagreement and agreement among networks. We demonstrate the effectiveness of RCL on both synthetic benchmark image data and real-world large-scale bioinformatics data.	翻訳日:2021-04-25 03:53:50 公開日:2020-12-26
# (参考訳) グラフニューラルネットワークを用いた複雑なネットワークにおけるノードレジリエンスの近似 Graph neural network based approximation of Node Resiliency in complex networks ( http://arxiv.org/abs/2012.15725v1 ) ライセンス: CC BY 4.0	Sai Munikoti, Laya Das and Balasubramaniam Natarajan	(参考訳) 最適操作と効率の重視により、エンジニアリングシステムの複雑さが増大した。これによりシステムの脆弱性が増大する。しかし、極端な事象の発生頻度の増加に伴い、レジリエンスは重要な考慮事項となっている。レジリエンスは、極端な条件から吸収および回復するシステムの能力を定量化する。グラフ理論は、攻撃に対するレジリエンスを評価するために複雑なエンジニアリングシステムのモデリングに広く使われているフレームワークである。レジリエンス解析の既存の手法のほとんどは、グラフの各ノード/リンクを探索する反復的アプローチに基づいている。これらの手法は計算量が高く、解析結果はネットワーク固有である。これらの課題に対処するために,大規模複雑ネットワークにおけるノードレジリエンスを近似するためのグラフニューラルネットワーク(GNN)ベースのフレームワークを提案する。提案フレームワークは、ノードの小さな代表部分集合上のノードランクを学習するGNNモデルを定義する。次に、トレーニングされたモデルを用いて、類似したグラフの型における見えないノードのランクを予測する。このフレームワークのスケーラビリティは,実世界のグラフにおけるノードランクの予測を通じて実証される。提案手法は, ノードのレジリエンススコアを近似する精度が高く, 従来の手法よりも計算能力に優れる。 The emphasis on optimal operations and efficiency has led to increased complexity in engineered systems. This in turn increases the vulnerability of the system. However, with the increasing frequency of extreme events, resilience has now become an important consideration. Resilience quantifies the ability of the system to absorb and recover from extreme conditions. Graph theory is a widely used framework for modeling complex engineered systems to evaluate their resilience to attacks. Most existing methods in resilience analysis are based on an iterative approach that explores each node/link of a graph. These methods suffer from high computational complexity and the resulting analysis is network specific. To address these challenges, we propose a graph neural network (GNN) based framework for approximating node resilience in large complex networks. The proposed framework defines a GNN model that learns the node rank on a small representative subset of nodes. Then, the trained model can be employed to predict the ranks of unseen nodes in similar types of graphs. The scalability of the framework is demonstrated through the prediction of node ranks in real-world graphs. The proposed approach is accurate in approximating the node resilience scores and offers a significant computational advantage over conventional approaches.	翻訳日:2021-04-25 03:39:24 公開日:2020-12-26
# (参考訳) 不確実性下におけるグラフレジリエンスのためのベイズ誘導学習 Bayesian Inductive Learner for Graph Resiliency under uncertainty ( http://arxiv.org/abs/2012.15733v1 ) ライセンス: CC BY 4.0	Sai Munikoti and Balasubramaniam Natarajan	(参考訳) 効率性向上を追求する中で、相互依存と複雑性は現代のエンジニアリングシステムの特性を定義している。障害のカスケードに対する脆弱性の増加に伴い、そのようなエンジニアシステムに関連するリスクと不確実性を理解し、管理することが不可欠である。グラフ理論は相互依存系をモデル化し、破壊に対する回復力を評価するために広く使われているフレームワークである。レジリエンス解析の既存の手法のほとんどは、グラフの各ノード/リンクを探索する反復的アプローチに基づいている。これらの手法は計算量が高く、解析結果はネットワーク固有である。さらに、基礎となるグラフィカルモデルに関連する不確実性は、これらの従来のアプローチの潜在的な価値をさらに制限します。これらの課題を克服するために,大規模グラフ内の臨界ノードを迅速に識別するベイズグラフニューラルネットワークベースのフレームワークを提案する。体系的に不確実性を取り入れていますモデルをトレーニングするために観測グラフを利用する代わりに、観測されたトポロジーとノードターゲットラベルに基づいてグラフの地図推定を算出する。さらに、認識の不確実性を考慮したモンテカルロ(mc)ドロップアウトアルゴリズムが組み込まれている。ベイズフレームワークが提供する計算複雑性の忠実性と利得をシミュレーション結果を用いて示す。 In the quest to improve efficiency, interdependence and complexity are becoming defining characteristics of modern engineered systems. With increasing vulnerability to cascading failures, it is imperative to understand and manage the risk and uncertainty associated with such engineered systems. Graph theory is a widely used framework for modeling interdependent systems and to evaluate their resilience to disruptions. Most existing methods in resilience analysis are based on an iterative approach that explores each node/link of a graph. These methods suffer from high computational complexity and the resulting analysis is network specific. Additionally, uncertainty associated with the underlying graphical model further limits the potential value of these traditional approaches. To overcome these challenges, we propose a Bayesian graph neural network-based framework for quickly identifying critical nodes in a large graph. while systematically incorporating uncertainties. Instead of utilizing the observed graph for training the model, a MAP estimate of the graph is computed based on the observed topology, and node target labels. Further, a Monte-Carlo (MC) dropout algorithm is incorporated to account for the epistemic uncertainty. The fidelity and the gain in computational complexity offered by the Bayesian framework are illustrated using simulation results.	翻訳日:2021-04-25 03:18:00 公開日:2020-12-26
# (参考訳) TSGCNet:2ストリームグラフ畳み込みネットワークを用いた3次元歯科モデルセグメンテーションのための識別幾何学的特徴学習 TSGCNet: Discriminative Geometric Feature Learning with Two-Stream GraphConvolutional Network for 3D Dental Model Segmentation ( http://arxiv.org/abs/2012.13697v1 ) ライセンス: CC BY 4.0	Lingming Zhang, Yue Zhao, Deyu Meng, Zhiming Cui, Chenqiang Gao, Xinbo Gao, Chunfeng Lian, Dinggang Shen	(参考訳) デジタル化された3次元歯科モデルから歯を正確に切り離す能力は,コンピュータ支援歯科矯正計画において必須の課題である。これまで、ディープラーニングに基づく手法は、このタスクの処理に広く用いられてきた。最先端の手法は、メッシュセルの座標と通常のベクトルである3d入力の生属性を直接結合し、完全に自動化された歯のセグメンテーションのための単一ストリームネットワークを訓練する。しかし、これはこれらの原属性によって提供される異なる幾何学的意味を無視する欠点がある。この問題は、識別幾何学的特徴を学ぶ上でネットワークを混乱させ、歯科モデルの多くの孤立した誤った予測をもたらす可能性がある。本稿では,異なる幾何学的属性から多視点幾何学情報を学習するための2ストリームグラフ畳み込みネットワーク(tsgcnet)を提案する。我々のTSGCNetは2つのグラフ学習ストリームを入力認識方式で設計し、座標と正規ベクトルからより識別性の高い高次幾何表現を抽出する。設計した2つの異なるストリームから得られたこれらの特徴表現はさらに融合し、セルワイドな予測タスクのための多視点補完情報を統合する。 3次元口腔内スキャナーで取得した歯科モデルの実患者データセット上でのtsgcnetの評価を行い,本手法が3次元形状セグメンテーションの最先端法を大幅に上回っていることを実験的に示す。 The ability to segment teeth precisely from digitized 3D dental models is an essential task in computer-aided orthodontic surgical planning. To date, deep learning based methods have been popularly used to handle this task. State-of-the-art methods directly concatenate the raw attributes of 3D inputs, namely coordinates and normal vectors of mesh cells, to train a single-stream network for fully-automated tooth segmentation. This, however, has the drawback of ignoring the different geometric meanings provided by those raw attributes. This issue might possibly confuse the network in learning discriminative geometric features and result in many isolated false predictions on the dental model. Against this issue, we propose a two-stream graph convolutional network (TSGCNet) to learn multi-view geometric information from different geometric attributes. Our TSGCNet adopts two graph-learning streams, designed in an input-aware fashion, to extract more discriminative high-level geometric representations from coordinates and normal vectors, respectively. These feature representations learned from the designed two different streams are further fused to integrate the multi-view complementary information for the cell-wise dense prediction task. We evaluate our proposed TSGCNet on a real-patient dataset of dental models acquired by 3D intraoral scanners, and experimental results demonstrate that our method significantly outperforms state-of-the-art methods for 3D shape segmentation.	翻訳日:2021-04-25 03:08:12 公開日:2020-12-26
# (参考訳) 効率的推論のためのレトロ合成データを用いたハイブリッドおよび非一様量子化法 Hybrid and Non-Uniform quantization methods using retro synthesis data for efficient inference ( http://arxiv.org/abs/2012.13716v1 ) ライセンス: CC BY 4.0	Tej pratap GVSL, Raja Kumar	(参考訳) 既存の量子化対応トレーニング手法は、トレーニング後の量子化方法のほとんどと同様に、トレーニングデータを活用することで、量子化損失を補おうとする。これらの方法は、トレーニングデータと密結合しているため、プライバシ制約アプリケーションには有効ではない。対照的に,本稿では,トレーニングデータの必要性をなくすデータ非依存なトレーニング後量子化手法を提案する。これは、FP32モデル層統計からフェローデータセット(以下、Retro-Synthesis Dataと呼ぶ)を生成し、さらに量子化に使用することで達成される。このアプローチは、imagenetとcifar-10データセットのバッチ正規化層8,6,4ビット精度のモデルにおいて、zeroqとdfqを含む最先端の手法よりも優れていた。また,2種類のポストトレーニング量子化手法,すなわちハイブリッド量子化と非均一量子化を導入した。 Existing quantization aware training methods attempt to compensate for the quantization loss by leveraging on training data, like most of the post-training quantization methods, and are also time consuming. Both these methods are not effective for privacy constraint applications as they are tightly coupled with training data. In contrast, this paper proposes a data-independent post-training quantization scheme that eliminates the need for training data. This is achieved by generating a faux dataset, hereafter referred to as Retro-Synthesis Data, from the FP32 model layer statistics and further using it for quantization. This approach outperformed state-of-the-art methods including, but not limited to, ZeroQ and DFQ on models with and without Batch-Normalization layers for 8, 6, and 4 bit precisions on ImageNet and CIFAR-10 datasets. We also introduced two futuristic variants of post-training quantization methods namely Hybrid Quantization and Non-Uniform Quantization	翻訳日:2021-04-25 02:52:19 公開日:2020-12-26
# (参考訳) 分離指数に基づく伝達学習における事前学習深層ニューラルネットワークのランク付けと排除 Ranking and Rejecting of Pre-Trained Deep Neural Networks in Transfer Learning based on Separation Index ( http://arxiv.org/abs/2012.13717v1 ) ライセンス: CC BY 4.0	Mostafa Kalhor, Ahmad Kalhor, and Mehdi Rahmani	(参考訳) 事前学習型ディープニューラルネットワーク(DNN)の自動ランキングは、最適な事前学習型DNNを選択するために必要な時間を短縮し、転送学習における分類性能を高める。本稿では,対象データセットに分離指数(SI)という簡単な距離に基づく複雑性尺度を適用し,事前学習したDNNをランク付けするアルゴリズムを提案する。この目的のために、まず、SIに関する背景が与えられ、その後、自動ランキングアルゴリズムが説明される。このアルゴリズムでは、事前訓練されたDNNの特徴抽出部分から通過するターゲットデータセットに対してSIを演算する。そして、計算されたSIを下位にすることで、事前訓練されたDNNを簡単にランク付けする。このランキング法では、最高のDNNがターゲットデータセット上で最大SIを出力し、十分に低計算のSIの場合、いくつかの事前訓練されたDNNを拒否することができる。提案アルゴリズムの効率は、Linnaeus 5, Breast Cancer Images, COVID-CTの3つの挑戦的データセットを用いて評価される。第3のケーススタディでは,対象データに対する前処理が異なっていたにもかかわらず,アルゴリズムのランク付けは分類精度から得られたランキングと高い相関性を有する。 Automated ranking of pre-trained Deep Neural Networks (DNNs) reduces the required time for selecting optimal pre-trained DNN and boost the classification performance in transfer learning. In this paper, we introduce a novel algorithm to rank pre-trained DNNs by applying a straightforward distance-based complexity measure named Separation Index (SI) to the target dataset. For this purpose, at first, a background about the SI is given and then the automated ranking algorithm is explained. In this algorithm, the SI is computed for the target dataset which passes from the feature extracting parts of pre-trained DNNs. Then, by descending sort of the computed SIs, the pre-trained DNNs are ranked, easily. In this ranking method, the best DNN makes maximum SI on the target dataset and a few pre-trained DNNs may be rejected in the case of their sufficiently low computed SIs. The efficiency of the proposed algorithm is evaluated by using three challenging datasets including Linnaeus 5, Breast Cancer Images, and COVID-CT. For the two first case studies, the results of the proposed algorithm exactly match with the ranking of the trained DNNs by the accuracy on the target dataset. For the third case study, despite using different preprocessing on the target data, the ranking of the algorithm has a high correlation with the ranking resulted from classification accuracy.	翻訳日:2021-04-25 02:39:13 公開日:2020-12-26
# (参考訳) ラベルのないショット学習はほとんどありません Few Shot Learning With No Labels ( http://arxiv.org/abs/2012.13751v1 ) ライセンス: CC BY-SA 4.0	Aditya Bharti, N.B. Vineeth, C.V. Jawahar	(参考訳) 少数派の学習者は,少数の学習サンプルに限って,新たなカテゴリの認識を目指す。主な課題は、新しいクラスへの適切な一般化を確保しながら、限られたデータへの過度な適合を避けることである。既存の文献は、ラベル要件を新しいクラスからベースクラスに単純にシフトすることで、大量の注釈付きデータを利用する。データアノテーションは時間とコストがかかるため、ラベル要件の削減がさらに重要な目標である。そこで,本稿では,トレーニングやテスト中にラベルアクセスを許可しない,より難易度の高い少数ショット設定を提案する。自己スーパービジョンを利用して画像表現と画像類似性をテスト時に学習することにより,最先端のラベルよりも少ないラベルである \textbf{zero}ラベルを用いて,競合ベースラインを実現する。この研究は、注釈付きデータにまったく依存しない、少数の学習方法を開発するための一歩になることを願っている。私たちのコードは公開されます。 Few-shot learners aim to recognize new categories given only a small number of training samples. The core challenge is to avoid overfitting to the limited data while ensuring good generalization to novel classes. Existing literature makes use of vast amounts of annotated data by simply shifting the label requirement from novel classes to base classes. Since data annotation is time-consuming and costly, reducing the label requirement even further is an important goal. To that end, our paper presents a more challenging few-shot setting where no label access is allowed during training or testing. By leveraging self-supervision for learning image representations and image similarity for classification at test time, we achieve competitive baselines while using \textbf{zero} labels, which is at least fewer labels than state-of-the-art. We hope that this work is a step towards developing few-shot learning methods which do not depend on annotated data at all. Our code will be publicly released.	翻訳日:2021-04-25 02:38:07 公開日:2020-12-26
# (参考訳) 高精度低ビット幅深層ニューラルネットワークの直接量子化 Direct Quantization for Training Highly Accurate Low Bit-width Deep Neural Networks ( http://arxiv.org/abs/2012.13762v1 ) ライセンス: CC BY 4.0	Tuan Hoang and Thanh-Toan Do and Tam V. Nguyen and Ngai-Man Cheung	(参考訳) 本稿では,低ビット幅重みとアクティベーションで深部畳み込みニューラルネットワークを訓練する2つの新しい手法を提案する。まず、ビット幅の少ない重みを得るため、既存の方法の多くは、全精度ネットワーク重みで量子化することにより量子化重みを得る。しかし、このアプローチはいくつかのミスマッチをもたらす:勾配降下は全精度重みを更新するが、量子化された重みは更新しない。この問題に対処するために,学習可能な量子化レベルを持つ量子化重みの{direct}更新を可能にし,勾配降下を用いたコスト関数を最小化する新しい手法を提案する。第二に、ビット幅の低いアクティベーションを得るために、既存の研究は全てのチャネルを等しく考慮している。しかし、活性化量子化器は高分散のいくつかのチャネルに偏りがある。この問題に対処するために,個別チャネルの量子化誤差を考慮した手法を提案する。このアプローチでは、多くのチャネルで量子化エラーを最小化するアクティベーション量子化子を学習できる。実験により,提案手法は,CIFAR-100およびImageNetデータセット上のAlexNet,ResNet,MobileNetV2アーキテクチャを用いて,画像分類タスクにおける最先端性能を実現することを示す。 This paper proposes two novel techniques to train deep convolutional neural networks with low bit-width weights and activations. First, to obtain low bit-width weights, most existing methods obtain the quantized weights by performing quantization on the full-precision network weights. However, this approach would result in some mismatch: the gradient descent updates full-precision weights, but it does not update the quantized weights. To address this issue, we propose a novel method that enables {direct} updating of quantized weights {with learnable quantization levels} to minimize the cost function using gradient descent. Second, to obtain low bit-width activations, existing works consider all channels equally. However, the activation quantizers could be biased toward a few channels with high-variance. To address this issue, we propose a method to take into account the quantization errors of individual channels. With this approach, we can learn activation quantizers that minimize the quantization errors in the majority of channels. Experimental results demonstrate that our proposed method achieves state-of-the-art performance on the image classification task, using AlexNet, ResNet and MobileNetV2 architectures on CIFAR-100 and ImageNet datasets.	翻訳日:2021-04-25 02:24:39 公開日:2020-12-26
# (参考訳) 多成分形状に対するアフィンモーメント不変量 An Affine moment invariant for multi-component shapes ( http://arxiv.org/abs/2012.13774v1 ) ライセンス: CC BY 4.0	Jovisa Zunic, Milos Stojmenovic	(参考訳) 本稿では,多成分形状解析のための画像ベースアルゴリズムツールを提案する。多成分形状の一般的な概念のため、本手法は実物体をその形状に基づいて解析する広い範囲のアプリケーションの解析に適用することができる。対応する白黒の画像ですこの方法は、多成分形状測定(multi-component shapes measure)と呼ばれる数値を形に割り当てる。この数/測度はアフィン変換に対して不変であり、本論文で開発された理論的枠組みに基づいて確立される。加えて、この方法は実装が容易で、堅牢である(例えば、)。騒音については)。航空画像解析と銀河画像解析に関する2つの小さめながら図示的な例を示す。また,測定値の挙動をよりよく理解するための合成例も提示する。 We introduce an image based algorithmic tool for analyzing multi-component shapes here. Due to the generic concept of multi-component shapes, our method can be applied to the analysis of a wide spectrum of applications where real objects are analyzed based on their shapes - i.e. on their corresponded black and white images. The method allocates a number to a shape, herein called a multi-component shapes measure. This number/measure is invariant with respect to affine transformations and is established based on the theoretical frame developed in this paper. In addition, the method is easy to implement and is robust (e.g. with respect to noise). We provide two small but illustrative examples related to aerial image analysis and galaxy image analysis. Also, we provide some synthetic examples for a better understanding of the measure behavior.	翻訳日:2021-04-25 01:50:52 公開日:2020-12-26
# (参考訳) DAC-MLを用いた試料効率制御に向けて Towards sample-efficient episodic control with DAC-ML ( http://arxiv.org/abs/2012.13779v1 ) ライセンス: CC BY 4.0	Ismael T. Freire, Adri\'an F. Amil, Vasiliki Vouloutsi, Paul F.M.J. Verschure	(参考訳) 人工知能におけるサンプル効率問題は、少数のエピソードでアクションポリシーを最適化する現在のDeep Reinforcement Learningモデルが存在しないことを指す。近年の研究では、エピソード強化学習のような学習速度を改善するために、メモリシステムやアーキテクチャバイアスを追加することで、この制限を克服しようとしている。しかし、漸進的な改善を達成しても、そのパフォーマンスは人間の行動方針の学習方法に匹敵するものではない。本稿では、脳と心の分散適応制御(DAC)理論の設計原理を活かして、海馬にインスパイアされたシーケンシャルメモリシステムを導入することで、挑戦的な捕食作業における報酬獲得を最大化する効果的なアクションポリシーに迅速に収束できる新しい認知アーキテクチャ(DAC-ML)を構築する。 The sample-inefficiency problem in Artificial Intelligence refers to the inability of current Deep Reinforcement Learning models to optimize action policies within a small number of episodes. Recent studies have tried to overcome this limitation by adding memory systems and architectural biases to improve learning speed, such as in Episodic Reinforcement Learning. However, despite achieving incremental improvements, their performance is still not comparable to how humans learn behavioral policies. In this paper, we capitalize on the design principles of the Distributed Adaptive Control (DAC) theory of mind and brain to build a novel cognitive architecture (DAC-ML) that, by incorporating a hippocampus-inspired sequential memory system, can rapidly converge to effective action policies that maximize reward acquisition in a challenging foraging task.	翻訳日:2021-04-25 01:43:39 公開日:2020-12-26
# (参考訳) 説明可能な医療データの多クラス分類 Explainable Multi-class Classification of Medical Data ( http://arxiv.org/abs/2012.13796v1 ) ライセンス: CC BY 4.0	YuanZheng Hu, Marina Sokolova	(参考訳) 機械学習アプリケーションは、医療データの二次分析に新たな洞察をもたらした。機械学習は、新しい薬物の開発を支援し、特定の病気に罹患する集団を定義し、多くの共通疾患の予測因子を特定する。同時に、機械学習の結果は、特徴の選択、クラス(im)バランス、アルゴリズムの選好、パフォーマンスメトリクスなど、多くの要因の畳み込みに依存する。本稿では,大規模医療データセットのマルチクラス分類について説明する。本稿では,知識に基づく機能工学,データセットのバランス,最良のモデル選択,パラメータチューニングについて述べる。この研究では、SVM(Support Vector Machine)、Na\"ive Bayes、Gradient Boosting、Decision Trees、Random Forest、Logistic Regressionの6つのアルゴリズムが使用されている。 UCI 糖尿病130-US 病院の1999-2008 年データセットにおける経験的評価を行い,患者病院の再入院期間を0日,<30日,>30日という3つのクラスに分類した。その結果,23種類の薬品を学習実験で使用することにより,6種類の学習アルゴリズムのうち5つをリコールできることがわかった。これは、同じデータで行った以前の研究を拡大する新しい結果である。勾配ブースティングとランダムフォレストは他のアルゴリズムよりも3つの分類精度で優れていた。 Machine Learning applications have brought new insights into a secondary analysis of medical data. Machine Learning helps to develop new drugs, define populations susceptible to certain illnesses, identify predictors of many common diseases. At the same time, Machine Learning results depend on convolution of many factors, including feature selection, class (im)balance, algorithm preference, and performance metrics. In this paper, we present explainable multi-class classification of a large medical data set. We in details discuss knowledge-based feature engineering, data set balancing, best model selection, and parameter tuning. Six algorithms are used in this study: Support Vector Machine (SVM), Na\"ive Bayes, Gradient Boosting, Decision Trees, Random Forest, and Logistic Regression. Our empirical evaluation is done on the UCI Diabetes 130-US hospitals for years 1999-2008 dataset, with the task to classify patient hospital re-admission stay into three classes: 0 days, <30 days, or > 30 days. Our results show that using 23 medication features in learning experiments improves Recall of five out of the six applied learning algorithms. This is a new result that expands the previous studies conducted on the same data. Gradient Boosting and Random Forest outperformed other algorithms in terms of the three-class classification Accuracy.	翻訳日:2021-04-25 01:36:41 公開日:2020-12-26
# (参考訳) 段木モデルに基づく新しい生成型分類器のクラス A new class of generative classifiers based on staged tree models ( http://arxiv.org/abs/2012.13798v1 ) ライセンス: CC BY 4.0	Federico Carli, Manuele Leonelli, Gherardo Varando	(参考訳) 分類のための生成モデルは、クラス変数と特徴の合同確率分布を使用して決定規則を構成する。生成モデルのうち、ベイズネットワークとナイーブベイズ分類器が最も一般的に使われ、すべての変数間の関係を明確に表現している。しかしこれらは、文脈固有の独立を許さないことで、存在可能な関係のタイプを高度に制限する欠点がある。ここでは,ステージ付き木分類器と呼ばれる,コンテキスト固有の独立性を考慮した新しい生成型分類器を導入する。これらは、条件付き独立性が形式的に読み取れるイベントツリーの頂点の分割によって構成される。 naive staged tree分類器も定義されており、同じ複雑さを維持しながら、古典的なnaive bayes分類器を拡張する。大規模シミュレーションにより,段階木分類器の分類精度は最先端の分類器と競合することが示された。タイタニック号の乗客の運命を予測するための応用分析は、新しい世代分類器が与えうる洞察を強調している。 Generative models for classification use the joint probability distribution of the class variable and the features to construct a decision rule. Among generative models, Bayesian networks and naive Bayes classifiers are the most commonly used and provide a clear graphical representation of the relationship among all variables. However, these have the disadvantage of highly restricting the type of relationships that could exist, by not allowing for context-specific independences. Here we introduce a new class of generative classifiers, called staged tree classifiers, which formally account for context-specific independence. They are constructed by a partitioning of the vertices of an event tree from which conditional independence can be formally read. The naive staged tree classifier is also defined, which extends the classic naive Bayes classifier whilst retaining the same complexity. An extensive simulation study shows that the classification accuracy of staged tree classifiers is competitive with those of state-of-the-art classifiers. An applied analysis to predict the fate of the passengers of the Titanic highlights the insights that the new class of generative classifiers can give.	翻訳日:2021-04-25 01:24:11 公開日:2020-12-26
# 多次元不確実性認識ニューラルネットワーク Multidimensional Uncertainty-Aware Evidential Neural Networks ( http://arxiv.org/abs/2012.13676v1 ) ライセンス: Link先を確認	Yibo Hu, Yuzhe Ou, Xujiang Zhao, Jin-Hee Cho, Feng Chen	(参考訳) 従来のディープニューラルネットワーク(NN)は、さまざまなアプリケーションドメインの分類タスクにおける最先端のパフォーマンスに大きく貢献している。しかし、NNは、不確実性の下での誤分類が現実世界の文脈で意思決定のリスクを高くする(例えば、道路における物体の誤分類が深刻な事故を引き起こす)クラス確率に関連するデータに固有の不確実性は考慮していない。重みの不確実性を通じて間接的に不確実性を推定するベイズNNとは異なり、顕在的NN(ENN)は近年、クラス確率の不確かさを明示的にモデル化し、分類タスクに使用するために提案されている。 ENNは、NNの予測を主観的意見として定式化し、データから決定論的NNによって主観的意見を形成することができる量の証拠を収集して機能を学ぶ。しかし、ENNは、空白(証拠の欠如による不確実性)や不協和(証拠の矛盾による不確実性)など、異なる根本原因を持つデータに固有の不確かさを明示的に考慮することなく、ブラックボックスとして訓練されている。多次元不確かさを考慮し, オフ・オブ・ディストリビューション(OOD)検出問題の解法として, WGAN-ENN (WENN) と呼ばれる新しい不確実性検出NNを提案する。 We take a hybrid approach which with Wasserstein Generative Adversarial Network (WGAN) with ENNs to jointly training a model with a prior knowledge of a class, which has high vacuity for OOD sample。人工と実世界の両方のデータセットに基づく広範な実験実験により、WENNによる不確実性の推定は、OODサンプルと境界サンプルの区別に大きく役立つことを示した。 WENNは、他の競合相手と比較してOOD検出に優れていた。 Traditional deep neural networks (NNs) have significantly contributed to the state-of-the-art performance in the task of classification under various application domains. However, NNs have not considered inherent uncertainty in data associated with the class probabilities where misclassification under uncertainty may easily introduce high risk in decision making in real-world contexts (e.g., misclassification of objects in roads leads to serious accidents). Unlike Bayesian NN that indirectly infer uncertainty through weight uncertainties, evidential NNs (ENNs) have been recently proposed to explicitly model the uncertainty of class probabilities and use them for classification tasks. An ENN offers the formulation of the predictions of NNs as subjective opinions and learns the function by collecting an amount of evidence that can form the subjective opinions by a deterministic NN from data. However, the ENN is trained as a black box without explicitly considering inherent uncertainty in data with their different root causes, such as vacuity (i.e., uncertainty due to a lack of evidence) or dissonance (i.e., uncertainty due to conflicting evidence). By considering the multidimensional uncertainty, we proposed a novel uncertainty-aware evidential NN called WGAN-ENN (WENN) for solving an out-of-distribution (OOD) detection problem. We took a hybrid approach that combines Wasserstein Generative Adversarial Network (WGAN) with ENNs to jointly train a model with prior knowledge of a certain class, which has high vacuity for OOD samples. Via extensive empirical experiments based on both synthetic and real-world datasets, we demonstrated that the estimation of uncertainty by WENN can significantly help distinguish OOD samples from boundary samples. WENN outperformed in OOD detection when compared with other competitive counterparts.	翻訳日:2021-04-25 01:15:41 公開日:2020-12-26
# PaXNet:Ensemble Transfer LearningとCapsule Classifierを用いたパノラマX線歯列検出 PaXNet: Dental Caries Detection in Panoramic X-ray using Ensemble Transfer Learning and Capsule Classifier ( http://arxiv.org/abs/2012.13666v1 ) ライセンス: Link先を確認	Arman Haghanifar, Mahdiyar Molahasani Majdabadi, Seok-Bum Ko	(参考訳) 歯列骨は、生後最も慢性的な疾患の1つであり、人口の大半を包含している。喉頭病変は通常、歯科用X線による視力検査のみに依存する放射線医によって診断される。多くの場合、歯列はX線で識別することは困難であり、低画質などの異なる理由から影と誤解されることがある。したがって,近年,ケーリー検出のための意思決定支援システムの開発が注目されている。そこで本研究では,パノラマ画像中のデンタルカリーを初めて検出し,著者の知識を最大限に活用する自動診断システムを提案する。提案モデルは、X線から関連する特徴を抽出し、カプセルネットワークを用いて予測結果を描画するトランスファーラーニングにより、事前訓練された様々な深層学習モデルの利点を享受する。 240個のラベル付き画像を含む特徴抽出に使用される470個のパノラマ画像のデータセットにおいて,本モデルが精度86.05\%を達成した。得られたスコアは、実際の患者のパノラマX線を使用する際の課題を考慮し、許容する検出性能とキャリー検出速度の増大を示す。本モデルでは, 軽度および重度者に対して69.44\%, 90.52\%のリコールスコアを取得し, 軽度カリー検出がより容易で, 効果的でロバストで大きなデータセットが必要であることを確認した。パノラマ画像を用いた最近の研究の目新しさを考えると、この研究はドメインエキスパートを支援する完全自動化された効率的な意思決定支援システムを開発するための一歩である。 Dental caries is one of the most chronic diseases involving the majority of the population during their lifetime. Caries lesions are typically diagnosed by radiologists relying only on their visual inspection to detect via dental x-rays. In many cases, dental caries is hard to identify using x-rays and can be misinterpreted as shadows due to different reasons such as low image quality. Hence, developing a decision support system for caries detection has been a topic of interest in recent years. Here, we propose an automatic diagnosis system to detect dental caries in Panoramic images for the first time, to the best of authors' knowledge. The proposed model benefits from various pretrained deep learning models through transfer learning to extract relevant features from x-rays and uses a capsule network to draw prediction results. On a dataset of 470 Panoramic images used for features extraction, including 240 labeled images for classification, our model achieved an accuracy score of 86.05\% on the test set. The obtained score demonstrates acceptable detection performance and an increase in caries detection speed, as long as the challenges of using Panoramic x-rays of real patients are taken into account. Among images with caries lesions in the test set, our model acquired recall scores of 69.44\% and 90.52\% for mild and severe ones, confirming the fact that severe caries spots are more straightforward to detect and efficient mild caries detection needs a more robust and larger dataset. Considering the novelty of current research study as using Panoramic images, this work is a step towards developing a fully automated efficient decision support system to assist domain experts.	翻訳日:2021-04-25 01:15:06 公開日:2020-12-26
# ソーシャルメディアが消費者の認識のシグナルを公表 Social media data reveals signal for public consumer perceptions ( http://arxiv.org/abs/2012.13675v1 ) ライセンス: Link先を確認	Neeti Pokhriyal, Abenezer Dara, Benjamin Valentino, Soroush Vosoughi	(参考訳) 研究者たちはソーシャルメディアのデータを使って、公共の行動に関する様々なマクロ経済指標を推定してきた。最も広く引用されている経済指標の1つは消費者信頼指数(CCI)である。これまで多くの研究がソーシャルメディア、特にTwitterデータを使ってCCIを予測することに重点を置いてきた。しかし、最近の包括的調査によると、これらのモデルが新しいデータでテストされると、強い相関関係は消失した。本稿では,ガウス過程の回帰(推定とそれに関連する不確実性の両方を提供する)を基礎としたロバストな非パラメトリックベイズモデリングフレームワークを提案することにより,ソーシャルメディアデータを用いたcci測定の真の可能性を評価する問題を再考する。我々のフレームワークと一体化することは、調査頻度を減らすためにデジタルデータをいかに活用できるかを実証する原理的な実験手法であり、定期的なポーリングは我々のモデルを校正するためにのみ必要である。広範囲な実験により、スムーズな間隔や様々な種類のラグなど、異なるマイクロ決定の選択方法が示される。結果に重要な影響を与えます Redditのdecadal data (2008-2019) を用いて、CCIの月次推定と日次推定の両方が、少なくとも数ヶ月前に確実に予測可能であること、我々のモデル推定が既存の方法よりもはるかに優れていることを示します。 Researchers have used social media data to estimate various macroeconomic indicators about public behaviors, mostly as a way to reduce surveying costs. One of the most widely cited economic indicator is consumer confidence index (CCI). Numerous studies in the past have focused on using social media, especially Twitter data, to predict CCI. However, the strong correlations disappeared when those models were tested with newer data according to a recent comprehensive survey. In this work, we revisit this problem of assessing the true potential of using social media data to measure CCI, by proposing a robust non-parametric Bayesian modeling framework grounded in Gaussian Process Regression (which provides both an estimate and an uncertainty associated with it). Integral to our framework is a principled experimentation methodology that demonstrates how digital data can be employed to reduce the frequency of surveys, and thus periodic polling would be needed only to calibrate our model. Via extensive experimentation we show how the choice of different micro-decisions, such as the smoothing interval, various types of lags etc. have an important bearing on the results. By using decadal data (2008-2019) from Reddit, we show that both monthly and daily estimates of CCI can, indeed, be reliably estimated at least several months in advance, and that our model estimates are far superior to those generated by the existing methods.	翻訳日:2021-04-25 01:13:51 公開日:2020-12-26
# Few-Shot分類のための空間コントラスト学習 Spatial Contrastive Learning for Few-Shot Classification ( http://arxiv.org/abs/2012.13831v1 ) ライセンス: Link先を確認	Yassine Ouali, C\'eline Hudelot, Myriam Tami	(参考訳) 既存の数ショットの分類法は、限られたデータを持つ未確認クラスへのテスト時間適応を容易にするトランスファー可能な表現を学習するために、クロスエントロピー(CE)損失にある程度依存している。しかし、CE損失にはいくつかの欠点があり、例えば、目に見えないクラスに対する過度な差別を伴う表現の誘導は、見つからないクラスへの転送可能性を抑制し、その結果、準最適一般化をもたらす。本研究では,データ依存正規化器として機能する補助的な学習目標として,コントラスト学習を考察する。局所的な識別特徴を抑圧する標準的な対照目的ではなく、局所的な識別とクラス非依存の特徴を学習するための新しい注意に基づく空間比較目的を提案する。広範な実験により,提案手法が最先端のアプローチに勝ることを示し,数発学習における良質な組込みの学習の重要性を確認した。 Existing few-shot classification methods rely to some degree on the cross-entropy (CE) loss to learn transferable representations that facilitate the test time adaptation to unseen classes with limited data. However, the CE loss has several shortcomings, e.g., inducing representations with excessive discrimination towards seen classes, which reduces their transferability to unseen classes and results in sub-optimal generalization. In this work, we explore contrastive learning as an additional auxiliary training objective, acting as a data-dependent regularizer to promote more general and transferable features. Instead of using the standard contrastive objective, which suppresses local discriminative features, we propose a novel attention-based spatial contrastive objective to learn locally discriminative and class-agnostic features. With extensive experiments, we show that the proposed method outperforms state-of-the-art approaches, confirming the importance of learning good and transferable embeddings for few-shot learning.	翻訳日:2021-04-25 01:13:27 公開日:2020-12-26
# 拡張現実のためのシーンテキスト検出 --文字bigramによる偽陽性率の低減 Scene Text Detection for Augmented Reality -- Character Bigram Approach to reduce False Positive Rate ( http://arxiv.org/abs/2101.01054v1 ) ライセンス: Link先を確認	Sagar Gubbi and Bharadwaj Amrutur	(参考訳) 自然シーンのテキスト検出はシーン理解の重要な側面であり、拡張現実アプリケーションを構築する上で有用なツールである。本研究では,テキストスポッティングにおける偽陽性の問題に対処する。単文字ではなく文字ペア(ビグラム)を探すことにより,スライディングウィンドウテキストスポッターの性能向上を提案する。効率的な畳み込みニューラルネットワークを設計し、ビッグラムを検出するように訓練する。提案された検出器は、ICDAR 2015データセットにおいて偽陽性率を28.16%削減する。我々は,スライディングウィンドウのテキストスポッターを改善するために,bigramsの検出が計算的に安価な方法であることを実証する。 Natural scene text detection is an important aspect of scene understanding and could be a useful tool in building engaging augmented reality applications. In this work, we address the problem of false positives in text spotting. We propose improving the performace of sliding window text spotters by looking for character pairs (bigrams) rather than single characters. An efficient convolutional neural network is designed and trained to detect bigrams. The proposed detector reduces false positive rate by 28.16% on the ICDAR 2015 dataset. We demonstrate that detecting bigrams is a computationally inexpensive way to improve sliding window text spotters.	翻訳日:2021-04-25 01:13:09 公開日:2020-12-26
# アラビア語引用規則のSmartajweed自動認識 Smartajweed Automatic Recognition of Arabic Quranic Recitation Rules ( http://arxiv.org/abs/2101.04200v1 ) ライセンス: Link先を確認	Ali M. Alagrami, Maged M. Eljazzar	(参考訳) タジウェド(Tajweed)は、クァラン語を正しい発音で読むための一連の規則であり、クァラン語を暗唱している。つまり、クァーランのすべての文字に特徴の特質を付与し、読みながらこの特定の状況においてこの特定の文字にそれを適用しなければならない。これらの特徴はメロディックな規則、例えば、どこで停止するか、どのくらいの期間、発音で2文字をマージするか、あるいは何文字を伸ばすか、あるいは他の文字に力を加えるかなどである。論文のほとんどが主な朗読規則と発音に焦点を合わせているが(ahkam al tajweed)、異なるリズムと異なるメロディを発音に与えている(tajweed)。それはまた、クァランを読む上で非常に重要で不可欠であると考えられており、語に異なる意味を与えることができる。本稿では,サポートベクタマシンとしきい値スコアリングシステムを用いて,Quran Recitation Rules(Tajweed)の自動認識のための詳細なシステムについて論じる。 Tajweed is a set of rules to read the Quran in a correct Pronunciation of the letters with all its Qualities, while Reciting the Quran. which means you have to give every letter in the Quran its due of characteristics and apply it to this particular letter in this specific situation while reading, which may differ in other times. These characteristics include melodic rules, like where to stop and for how long, when to merge two letters in pronunciation or when to stretch some, or even when to put more strength on some letters over other. Most of the papers focus mainly on the main recitation rules and the pronunciation but not (Ahkam AL Tajweed) which give different rhythm and different melody to the pronunciation with every different rule of (Tajweed). Which is also considered very important and essential in Reading the Quran as it can give different meanings to the words. In this paper we discuss in detail full system for automatic recognition of Quran Recitation Rules (Tajweed) by using support vector machine and threshold scoring system	翻訳日:2021-04-25 01:12:18 公開日:2020-12-26
# ビッグデータからのコンパクトデータに向けて Toward Compact Data from Big Data ( http://arxiv.org/abs/2012.13677v1 ) ライセンス: Link先を確認	Song-Kyoo (Amang) Kim	(参考訳) bigdataは、価値ある原材料を扱う能力以上の大きさのデータセットで、特定の洞察に洗練され、蒸留される。 compact dataは、複雑なbigdataを扱うことなく、最高のアセットを提供するbig datasetを最適化するメソッドである。このコンパクトデータセットは、ビッグデータのないビッグデータシステムの有効かつパーソナライズされた利用のために、きめ細かいレベルの最大知識パターンを含む。コンパクトデータ手法は,問題状況に依存したテーラーメイドの設計である。論文の様々なデータ駆動研究領域において、様々なコンパクトデータ技術が実証されている。 Bigdata is a dataset of which size is beyond the ability of handling a valuable raw material that can be refined and distilled into valuable specific insights. Compact data is a method that optimizes the big dataset that gives best assets without handling complex bigdata. The compact dataset contains the maximum knowledge patterns at fine grained level for effective and personalized utilization of bigdata systems without bigdata. The compact data method is a tailor-made design which depends on problem situations. Various compact data techniques have been demonstrated into various data-driven research area in the paper.	翻訳日:2021-04-25 01:11:57 公開日:2020-12-26
# スペクトル正規化による安定性確認強化学習 Stability-Certified Reinforcement Learning via Spectral Normalization ( http://arxiv.org/abs/2012.13744v1 ) ライセンス: Link先を確認	Ryoichi Takase, Nobuyuki Yoshikawa, Toshisada Mariyama, and Takeshi Tsuchiya	(参考訳) 本稿では、ニューラルネットワークが制御するシステムの安定性を確保するために、スペクトル正規化に基づく異なる視点からの2つの方法について述べる。 1つ目は、フィードバックシステムのL2ゲインが1未満の有界であり、小利得定理から導かれる安定性条件を満たすことである。第1の方法は、安定性条件を明示的に含むが、厳密な安定性条件のため、ニューラルネットワークコントローラの性能が不十分である可能性がある。この難しさを克服するため,第2の課題が提案され,より広いアトラクション領域での局所安定性を確保しつつ,性能の向上が図られた。第2の方法は、ニューラルネットワークコントローラのトレーニング後に線形行列の不等式を解くことにより安定性を確保する。本稿で提案するスペクトル正規化は, より厳密な局所セクターを構築することにより, a-posteriori 安定性試験の実現可能性を向上させる。数値実験により,第2法は第1法と比較して十分な性能を示し,既存の強化学習アルゴリズムと比較して十分な安定性が得られた。 In this article, two types of methods from different perspectives based on spectral normalization are described for ensuring the stability of the system controlled by a neural network. The first one is that the L2 gain of the feedback system is bounded less than 1 to satisfy the stability condition derived from the small-gain theorem. While explicitly including the stability condition, the first method may provide an insufficient performance on the neural network controller due to its strict stability condition. To overcome this difficulty, the second one is proposed, which improves the performance while ensuring the local stability with a larger region of attraction. In the second method, the stability is ensured by solving linear matrix inequalities after training the neural network controller. The spectral normalization proposed in this article improves the feasibility of the a-posteriori stability test by constructing tighter local sectors. The numerical experiments show that the second method provides enough performance compared with the first one while ensuring enough stability compared with the existing reinforcement learning algorithms.	翻訳日:2021-04-25 01:11:49 公開日:2020-12-26
# 呼吸音の異常予測のための深層学習フレームワーク Deep Learning Framework Applied for Predicting Anomaly of Respiratory Sounds ( http://arxiv.org/abs/2012.13668v1 ) ライセンス: Link先を確認	Dat Ngo, Lam Pham, Anh Nguyen, Ben Phan, Khoa Tran, Truong Nguyen	(参考訳) 本稿では,呼吸サイクルの異常を分類するためのロバストなディープラーニングフレームワークを提案する。まず、フレームワークはフロントエンドの機能抽出ステップから始まります。このステップは、呼吸入力音をスペクトルと時間的特徴をよく表現した2次元スペクトログラムに変換することを目的としている。次に、C-DNNとオートエンコーダネットワークのアンサンブルを用いて、呼吸異常サイクルの4つのカテゴリに分類する。本研究は2017年にICBHI(Institutal Conference on Biomedical Health Informatics)ベンチマークデータセットを用いて実施した。その結果,ICBHI平均スコア0.49,IABHI高調波スコア0.42の競争性能が得られた。 This paper proposes a robust deep learning framework used for classifying anomaly of respiratory cycles. Initially, our framework starts with front-end feature extraction step. This step aims to transform the respiratory input sound into a two-dimensional spectrogram where both spectral and temporal features are well presented. Next, an ensemble of C- DNN and Autoencoder networks is then applied to classify into four categories of respiratory anomaly cycles. In this work, we conducted experiments over 2017 Internal Conference on Biomedical Health Informatics (ICBHI) benchmark dataset. As a result, we achieve competitive performances with ICBHI average score of 0.49, ICBHI harmonic score of 0.42.	翻訳日:2021-04-25 01:11:34 公開日:2020-12-26
# siameseネットワークを用いた学習視覚手がかりを用いたワンショット物体定位 One-Shot Object Localization Using Learnt Visual Cues via Siamese Networks ( http://arxiv.org/abs/2012.13690v1 ) ライセンス: Link先を確認	Sagar Gubbi Venkatesh and Bharadwaj Amrutur	(参考訳) 新規で非構造的な環境で動作可能なロボットは、これまで見えなかった新しい物体を認識する能力を持つ必要がある。本研究では,新しい環境にローカライズされなければならない新規な関心対象を特定するために視覚的な手がかりを用いる。 siameseネットワークを備えたエンドツーエンドニューラルネットワークを使用して、キューを学習し、関心のあるオブジェクトを推論し、新たな環境にローカライズする。シミュレーションロボットはレーザーポインターが指している新しい物体をピックアップ・アンド・プレースできることを示す。また,オムニグロット手書き文字データセットと玩具の小さなデータセットから得られたデータセットに対する提案手法の性能評価を行った。 A robot that can operate in novel and unstructured environments must be capable of recognizing new, previously unseen, objects. In this work, a visual cue is used to specify a novel object of interest which must be localized in new environments. An end-to-end neural network equipped with a Siamese network is used to learn the cue, infer the object of interest, and then to localize it in new environments. We show that a simulated robot can pick-and-place novel objects pointed to by a laser pointer. We also evaluate the performance of the proposed approach on a dataset derived from the Omniglot handwritten character dataset and on a small dataset of toys.	翻訳日:2021-04-25 01:11:05 公開日:2020-12-26
# オブジェクト検出に対するsparse adversarial attack Sparse Adversarial Attack to Object Detection ( http://arxiv.org/abs/2012.13692v1 ) ライセンス: Link先を確認	Jiayu Bao	(参考訳) 敵対的な例は近年多くの注目を集めている。画像分類器を攻撃するために多くの敵攻撃が提案されているが、対象検出器に注意を向ける作業はほとんどない。本稿では,Sparse Adversarial Attack (SAA)を提案する。画像の脆弱な位置を選択し,タスクの回避損失関数を設計した。 YOLOv4とFasterRCNNの実験結果から,本手法の有効性が明らかになった。さらに、我々のSAAはブラックボックス攻撃設定で異なる検出器間で大きな伝達性を示す。コードは \emph{https://github.com/thurssq/tianchi04} で入手できる。 Adversarial examples have gained tons of attention in recent years. Many adversarial attacks have been proposed to attack image classifiers, but few work shift attention to object detectors. In this paper, we propose Sparse Adversarial Attack (SAA) which enables adversaries to perform effective evasion attack on detectors with bounded \emph{l$_{0}$} norm perturbation. We select the fragile position of the image and designed evasion loss function for the task. Experiment results on YOLOv4 and FasterRCNN reveal the effectiveness of our method. In addition, our SAA shows great transferability across different detectors in the black-box attack setting. Codes are available at \emph{https://github.com/THUrssq/Tianchi04}.	翻訳日:2021-04-25 01:10:29 公開日:2020-12-26
# 3次元連続心筋mriのための2次元呼吸ナビゲーションフレームワーク 2-D Respiration Navigation Framework for 3-D Continuous Cardiac Magnetic Resonance Imaging ( http://arxiv.org/abs/2012.13700v1 ) ライセンス: Link先を確認	Elisabeth Hoppe, Jens Wetzl, Philipp Roser, Lina Felsner, Alexander Preuhs, Andreas Maier	(参考訳) 心臓磁気共鳴イメージングのための連続的プロトコルは、同時に心筋相に分解された心臓解剖のサンプリングを可能にする。呼吸アーチファクトを避けるために、スキャン中の関連する動きを再建時に補償する必要がある。本稿では,連続スキャン中に2次元呼吸情報を取得するためのサンプリング適応を提案する。さらに、取得した信号から異なる呼吸状態を抽出するパイプラインを開発し、1つの呼吸相からデータを再構成する。以上の結果から,従来の1次元呼吸ナビゲーション手法と同様に,呼吸補償の不要な画像品質に対するワークフローの有用性が示された。 Continuous protocols for cardiac magnetic resonance imaging enable sampling of the cardiac anatomy simultaneously resolved into cardiac phases. To avoid respiration artifacts, associated motion during the scan has to be compensated for during reconstruction. In this paper, we propose a sampling adaption to acquire 2-D respiration information during a continuous scan. Further, we develop a pipeline to extract the different respiration states from the acquired signals, which are used to reconstruct data from one respiration phase. Our results show the benefit of the proposed workflow on the image quality compared to no respiration compensation, as well as a previous 1-D respiration navigation approach.	翻訳日:2021-04-25 01:10:17 公開日:2020-12-26
# 3次元カラーポイント雲を用いた高密度果樹樹樹へのリンゴの割り当て Assigning Apples to Individual Trees in Dense Orchards using 3D Color Point Clouds ( http://arxiv.org/abs/2012.13721v1 ) ライセンス: Link先を確認	Mouad Zine-El-Abidine, Helin Dutagaci, Gilles Galopin, David Rousseau	(参考訳) 本稿では,trellis構造化果樹園の個々のリンゴのリンゴを数える3dカラーポイントクラウド処理パイプラインを提案する。木レベルでの果実の計数には、密集した果樹園では難しい木を切り離す必要がある。枝構造が見える冬期に葉樹園から取得した点雲を用いて樹冠の樹冠を画定する。我々は収穫期に獲得した点雲にリンゴをローカライズする。 2つの点のクラウドをアライメントすることで、appleのロケーションを線引きされた冬のクラウドにマッピングし、それぞれのリンゴをベアリングツリーに割り当てることができる。我々のリンゴ割当法は95%以上の精度を達成する。実現可能性の最初の証明を示すことに加えて、リンゴの割り当てパイプラインにさらなる改善を提案する。 We propose a 3D color point cloud processing pipeline to count apples on individual apple trees in trellis structured orchards. Fruit counting at the tree level requires separating trees, which is challenging in dense orchards. We employ point clouds acquired from the leaf-off orchard in winter period, where the branch structure is visible, to delineate tree crowns. We localize apples in point clouds acquired in harvest period. Alignment of the two point clouds enables mapping apple locations to the delineated winter cloud and assigning each apple to its bearing tree. Our apple assignment method achieves an accuracy rate higher than 95%. In addition to presenting a first proof of feasibility, we also provide suggestions for further improvement on our apple assignment pipeline.	翻訳日:2021-04-25 01:09:47 公開日:2020-12-26
# 周波数領域からの高速かつ高精度な圧縮映像行動認識 Faster and Accurate Compressed Video Action Recognition Straight from the Frequency Domain ( http://arxiv.org/abs/2012.13726v1 ) ライセンス: Link先を確認	Samuel Felipe dos Santos and Jurandy Almeida	(参考訳) 人間の行動認識は、監視、医療、産業環境、スマートホームなど幅広い応用のために、コンピュータビジョンにおける最も活発な研究分野の1つになっている。近年,ビデオ中の人間の行動を認識するための強力で解釈可能な特徴の習得にディープラーニングが成功している。既存のディープラーニングアプローチのほとんどは、RGB画像シーケンスとしてビデオ情報を処理するために設計されている。そのため、ビデオデータは圧縮フォーマットに格納されることが多いため、プリミティブな復号処理が必要となる。しかし、ビデオのデコードには高い計算負荷とメモリ使用量が必要である。そこで本研究では,圧縮映像から直接学習可能な深層ニューラルネットワークを提案する。提案手法は,UCF-101およびHMDB-51データセットの2つの公開ベンチマークで評価され,予測速度の最大2倍の高速化が期待できる。 Human action recognition has become one of the most active field of research in computer vision due to its wide range of applications, like surveillance, medical, industrial environments, smart homes, among others. Recently, deep learning has been successfully used to learn powerful and interpretable features for recognizing human actions in videos. Most of the existing deep learning approaches have been designed for processing video information as RGB image sequences. For this reason, a preliminary decoding process is required, since video data are often stored in a compressed format. However, a high computational load and memory usage is demanded for decoding a video. To overcome this problem, we propose a deep neural network capable of learning straight from compressed video. Our approach was evaluated on two public benchmarks, the UCF-101 and HMDB-51 datasets, demonstrating comparable recognition performance to the state-of-the-art methods, with the advantage of running up to 2 times faster in terms of inference speed.	翻訳日:2021-04-25 01:09:34 公開日:2020-12-26
# アンカー自由物体検出のための線形スケジューリングによるバランス指向焦点損失 Balance-Oriented Focal Loss with Linear Scheduling for Anchor Free Object Detection ( http://arxiv.org/abs/2012.13763v1 ) ライセンス: Link先を確認	Hopyong Gil, Sangwoo Park, Yusang Park, Wongoo Han, Juyean Hong, Juneyoung Jung	(参考訳) 既存のオブジェクト検出器の多くは、パフォーマンスのバランスを阻害するクラス不均衡の問題に苦しんでいる。特にアンカーフリーオブジェクト検出器は、画素毎の予測方法での検出とフォアグラウンドのアンバランス問題を同時に解決する必要がある。本研究では,背景バランスと前景バランスを総合的に考慮し,バランス学習を促すバランス指向焦点損失を提案する。本研究は,アンカーフリー物体検出器のショット数や焦点損失を含む非極端分布の一般不均衡データを用いた場合の非平衡問題に対処することを目的とする。我々は、この不均衡問題に精巧に対処するために、焦点損失のバッチワイズアルファバランスの変種を用いる。一般的な不均衡データに対して再重み付けのみを使用する、シンプルで実用的なソリューションである。推論やグルーピングクラスにおいて、追加の学習コストも構造的な変更も不要である。広範にわたる実験により,各部品の性能改善を示し,損失に対する再重み付けを用いた線形スケジューリングの効果を解析した。前景階級のバランスの点で焦点損失を改善することにより、アンカーフリーリアルタイム検出器のためのMS-COCOにおけるAP利得+1.2を達成する。 Most existing object detectors suffer from class imbalance problems that hinder balanced performance. In particular, anchor free object detectors have to solve the background imbalance problem due to detection in a per-pixel prediction fashion as well as foreground imbalance problem simultaneously. In this work, we propose Balance-oriented focal loss that can induce balanced learning by considering both background and foreground balance comprehensively. This work aims to address imbalance problem in the situation of using a general unbalanced data of non-extreme distribution not including few shot and the focal loss for anchor free object detector. We use a batch-wise alpha-balanced variant of the focal loss to deal with this imbalance problem elaborately. It is a simple and practical solution using only re-weighting for general unbalanced data. It does require neither additional learning cost nor structural change during inference and grouping classes is also unnecessary. Through extensive experiments, we show the performance improvement for each component and analyze the effect of linear scheduling when using re-weighting for the loss. By improving the focal loss in terms of balancing foreground classes, our method achieves AP gains of +1.2 in MS-COCO for the anchor free real-time detector.	翻訳日:2021-04-25 01:09:18 公開日:2020-12-26
# エッジコンピューティングに向けたディープラーニング - 圧縮データからニューラルネットワークへ Deep Learning Towards Edge Computing: Neural Networks Straight from Compressed Data ( http://arxiv.org/abs/2012.14426v1 ) ライセンス: Link先を確認	Samuel Felipe dos Santos and Jurandy Almeida	(参考訳) 携帯電話の普及と計算能力の増大、人工知能の進歩により、多くのインテリジェントなアプリケーションが開発され、有意義に人々の生活を豊かにしている。そのため、エッジインテリジェンス(エッジインテリジェンス)の分野への関心が高まっており、これらのアプリケーションをより効率的かつセキュアにするために、データの計算をネットワークのエッジにプッシュすることを目指している。多くのインテリジェントアプリケーションは、畳み込みニューラルネットワーク(CNN)のようなディープラーニングモデルに依存している。過去10年間で、多くのコンピュータビジョンタスクで最先端のパフォーマンスを達成した。これらの手法の性能を高めるために、より深いアーキテクチャとより多くのパラメータを使用する傾向があり、計算コストが高くなる。実際、これはディープアーキテクチャが直面する主な問題の一つであり、エッジデバイスのような限られた計算リソースを持つドメインでの適用性を制限する。計算複雑性を軽減するために,画像とビデオの記憶と伝送に使用される圧縮表現で容易に利用できる視覚コンテンツに関連する情報から直接学習できるディープニューラルネットワークを提案する。提案手法の新規性は,RGB 画素ではなく DCT 係数で学習することで,周波数領域データを直接操作するように設計されている。これにより、データストリームの完全復号化において高い計算負荷を節約し、処理時間を大幅に短縮することが可能になる。 1)ImageNetデータセット上の画像分類と,(2)UCF-101データセットとHMDB-51データセット上の映像分類の2つの課題について,ネットワークの評価を行った。その結果, 計算効率が向上し, 精度の面では最先端手法に匹敵する効果を示した。 Due to the popularization and grow in computational power of mobile phones, as well as advances in artificial intelligence, many intelligent applications have been developed, meaningfully enriching people's life. For this reason, there is a growing interest in the area of edge intelligence, that aims to push the computation of data to the edges of the network, in order to make those applications more efficient and secure. Many intelligent applications rely on deep learning models, like convolutional neural networks (CNNs). Over the past decade, they have achieved state-of-the-art performance in many computer vision tasks. To increase the performance of these methods, the trend has been to use increasingly deeper architectures and with more parameters, leading to a high computational cost. Indeed, this is one of the main problems faced by deep architectures, limiting their applicability in domains with limited computational resources, like edge devices. To alleviate the computational complexity, we propose a deep neural network capable of learning straight from the relevant information pertaining to visual content readily available in the compressed representation used for image and video storage and transmission. The novelty of our approach is that it was designed to operate directly on frequency domain data, learning with DCT coefficients rather than RGB pixels. This enables to save high computational load in full decoding the data stream and therefore greatly speed up the processing time, which has become a big bottleneck of deep learning. We evaluated our network on two challenging tasks: (1) image classification on the ImageNet dataset and (2) video classification on the UCF-101 and HMDB-51 datasets. Our results demonstrate comparable effectiveness to the state-of-the-art methods in terms of accuracy, with the advantage of being more computationally efficient.	翻訳日:2021-04-25 01:09:00 公開日:2020-12-26
# スパース報酬を伴う連続制御タスクの局所的持続的探索 Locally Persistent Exploration in Continuous Control Tasks with Sparse Rewards ( http://arxiv.org/abs/2012.13658v1 ) ライセンス: Link先を確認	Susan Amin (1 and 2), Maziar Gomrokchi (1 and 2), Hossein Aboutalebi (3), Harsh Satija (1 and 2) and Doina Precup (1 and 2) ((1) McGill University, (2) Mila- Quebec Artificial Intelligence Institute, (3) University of Waterloo)	(参考訳) 強化学習における大きな課題は、特に粗末な報酬構造と連続状態と行動空間を持つ環境において、探索戦略の設計である。直感的には、補強信号が非常に少ない場合、エージェントは環境を効率的にカバーするために何らかの短期記憶に頼るべきである。我々は,(1)次の探索行動の選択は環境の(マルコフ)状態だけでなく,エージェントの軌道にも依存すべきであり,(2)エージェントは,状態空間における拡散の指標を利用して,小さな領域で立ち往生することを避ける必要がある,という2つの直観に基づく新たな探索法を提案する。本手法は,統計物理学でよく用いられる概念を応用し,状態空間における持続的(局所的に自己回避する)軌道を生成するために,単純化された(ポリマー)鎖の挙動を説明する。本稿では,局所自己回避歩行の理論的特性と,軌道内における時間的相関による短期記憶の提供能力について論じる。シミュレーションによる2次元ナビゲーションタスクや,高次元のムジョコ連続制御ロコモーションタスクにおいて,そのアプローチを経験的に評価した。 A major challenge in reinforcement learning is the design of exploration strategies, especially for environments with sparse reward structures and continuous state and action spaces. Intuitively, if the reinforcement signal is very scarce, the agent should rely on some form of short-term memory in order to cover its environment efficiently. We propose a new exploration method, based on two intuitions: (1) the choice of the next exploratory action should depend not only on the (Markovian) state of the environment, but also on the agent's trajectory so far, and (2) the agent should utilize a measure of spread in the state space to avoid getting stuck in a small region. Our method leverages concepts often used in statistical physics to provide explanations for the behavior of simplified (polymer) chains, in order to generate persistent (locally self-avoiding) trajectories in state space. We discuss the theoretical properties of locally self-avoiding walks, and their ability to provide a kind of short-term memory, through a decaying temporal correlation within the trajectory. We provide empirical evaluations of our approach in a simulated 2D navigation task, as well as higher-dimensional MuJoCo continuous control locomotion tasks with sparse rewards.	翻訳日:2021-04-25 01:08:33 公開日:2020-12-26
# 深層学習に基づく6G協調運転のためのインテリジェント車間距離制御 Deep Learning Based Intelligent Inter-Vehicle Distance Control for 6G Enabled Cooperative Autonomous Driving ( http://arxiv.org/abs/2012.13817v1 ) ライセンス: Link先を確認	Xiaosha Chen, Supeng Leng, Jianhua He, and Longyu Zhou	(参考訳) 第6世代セルネットワーク(6G)の研究は、ユビキタス無線接続を実現するために大きな勢いを増している。コネクテッド・自律運転(CAV)は、6Gにとって重要な垂直方向であり、道路の安全性、道路、エネルギー効率を改善する大きな可能性を秘めている。しかし、信頼性、レイテンシ、高速通信に関するCAVアプリケーションの厳しいサービス要件は、6Gネットワークに大きな課題をもたらすだろう。 6g対応cavには,新たなチャネルアクセスアルゴリズムとコネクテッドカーのインテリジェント制御スキームが必要である。本稿では,情報共有と運転協調による高度な運転モードである6G支援協調運転について検討した。まず,ハイブリッド通信とチャネルアクセス技術を用いたV2V通信における6G車両の遅延上限を定量化する。リアルタイム操作における遅延境界の高速計算のために,ディープラーニングニューラルネットワークを開発し,学習する。そして、協調自動運転のための車間距離を制御するインテリジェントな戦略を設計する。さらに,システム状態のパラメータを予測するマルコフ連鎖に基づくアルゴリズムと,スムーズな車速変化を可能にする安全な距離マッピング手法を提案する。提案アルゴリズムはAirSim自動運転プラットフォームで実装されている。シミュレーションの結果,提案手法は安全で安定な協調運転により有効で頑健であり,道路の安全性,容量,効率が大幅に向上することがわかった。 Research on the sixth generation cellular networks (6G) is gaining huge momentum to achieve ubiquitous wireless connectivity. Connected autonomous driving (CAV) is a critical vertical envisioned for 6G, holding great potentials of improving road safety, road and energy efficiency. However the stringent service requirements of CAV applications on reliability, latency and high speed communications will present big challenges to 6G networks. New channel access algorithms and intelligent control schemes for connected vehicles are needed for 6G supported CAV. In this paper, we investigated 6G supported cooperative driving, which is an advanced driving mode through information sharing and driving coordination. Firstly we quantify the delay upper bounds of 6G vehicle to vehicle (V2V) communications with hybrid communication and channel access technologies. A deep learning neural network is developed and trained for fast computation of the delay bounds in real time operations. Then, an intelligent strategy is designed to control the inter-vehicle distance for cooperative autonomous driving. Furthermore, we propose a Markov Chain based algorithm to predict the parameters of the system states, and also a safe distance mapping method to enable smooth vehicular speed changes. The proposed algorithms are implemented in the AirSim autonomous driving platform. Simulation results show that the proposed algorithms are effective and robust with safe and stable cooperative autonomous driving, which greatly improve the road safety, capacity and efficiency.	翻訳日:2021-04-25 01:08:12 公開日:2020-12-26
# 逆ネットワークを用いた画像合成:包括的調査とケーススタディ Image Synthesis with Adversarial Networks: a Comprehensive Survey and Case Studies ( http://arxiv.org/abs/2012.13736v1 ) ライセンス: Link先を確認	Pourya Shamsolmoali, Masoumeh Zareapoor, Eric Granger, Huiyu Zhou, Ruili Wang, M. Emre Celebi and Jie Yang	(参考訳) generative adversarial networks (gans) はコンピュータビジョン、医学、自然言語処理といった様々なアプリケーションドメインで非常に成功している。さらに、オブジェクトや人物を望ましい形に変えることは、GANにおいてよく研究される研究になる。 GANは複雑な分布を学習して意味のあるサンプルを合成する強力なモデルである。しかし、この分野には包括的なレビューの欠如、特にgans損失変動のコレクションの欠如、評価メトリクス、多様な画像生成のための修正、安定したトレーニングがある。本調査では,現時点の高速なGANの開発状況を踏まえ,画像合成の敵モデルに関する総合的なレビューを行う。合成画像生成法を要約し,画像間変換,融合画像生成,ラベル間マッピング,テキスト間変換などのカテゴリについて考察する。基礎モデルに基づいて文献を整理し,アーキテクチャ,制約,損失関数,評価指標,データセットの学習に関するアイデアを考案した。本稿では, 敵モデルのマイルストーン, 様々なカテゴリにおける先行作品の広範な選択, モデルベースからデータ駆動手法への開発経路に関する知見について述べる。さらに,今後の研究の方向性についても注目する。このレビューのユニークな特徴の1つは、これらのGANメソッドとデータセットのすべてのソフトウェア実装が収集され、https://github.com/pshams55/GAN-Case-Studyで一箇所で利用可能になったことである。 Generative Adversarial Networks (GANs) have been extremely successful in various application domains such as computer vision, medicine, and natural language processing. Moreover, transforming an object or person to a desired shape become a well-studied research in the GANs. GANs are powerful models for learning complex distributions to synthesize semantically meaningful samples. However, there is a lack of comprehensive review in this field, especially lack of a collection of GANs loss-variant, evaluation metrics, remedies for diverse image generation, and stable training. Given the current fast GANs development, in this survey, we provide a comprehensive review of adversarial models for image synthesis. We summarize the synthetic image generation methods, and discuss the categories including image-to-image translation, fusion image generation, label-to-image mapping, and text-to-image translation. We organize the literature based on their base models, developed ideas related to architectures, constraints, loss functions, evaluation metrics, and training datasets. We present milestones of adversarial models, review an extensive selection of previous works in various categories, and present insights on the development route from the model-based to data-driven methods. Further, we highlight a range of potential future research directions. One of the unique features of this review is that all software implementations of these GAN methods and datasets have been collected and made available in one place at https://github.com/pshams55/GAN-Case-Study.	翻訳日:2021-04-25 01:07:53 公開日:2020-12-26
# 自律走行のための確率的3次元マルチモーダルマルチオブジェクトトラッキング Probabilistic 3D Multi-Modal, Multi-Object Tracking for Autonomous Driving ( http://arxiv.org/abs/2012.13755v1 ) ライセンス: Link先を確認	Hsu-kuang Chiu, Jie Li, Rares Ambrus, Jeannette Bohg	(参考訳) マルチオブジェクトトラッキングは、自動運転車が交通シーンを安全にナビゲートする重要な機能である。現在の最先端は、ある距離メトリックを通じて検出対象と既存のトラックが関連付けられるトラッキング・バイ・検出パラダイムに従っている。追跡精度を高めるための重要な課題は、データアソシエーションとライフサイクル管理の追跡にある。本稿では,複数のトレーニング可能なモジュールからなる確率的マルチモーダル・マルチオブジェクトトラッキングシステムを提案し,ロバストかつデータ駆動的なトラッキング結果を提供する。まず、2D画像と3D LiDAR点雲から特徴を融合して、オブジェクトの外観と幾何学的情報をキャプチャする方法を学ぶ。第2に,マハラノビスと特徴距離を組み合わせた距離を,トラックとデータアソシエーションにおける新たな検出とを比較して学習することを提案する。そして第3に、未整合物体検出からトラックをいつ初期化するかを学ぶことを提案する。そこで本手法は,NuScenes Trackingデータセットにおける最先端の手法よりも優れていることを示す。 Multi-object tracking is an important ability for an autonomous vehicle to safely navigate a traffic scene. Current state-of-the-art follows the tracking-by-detection paradigm where existing tracks are associated with detected objects through some distance metric. The key challenges to increase tracking accuracy lie in data association and track life cycle management. We propose a probabilistic, multi-modal, multi-object tracking system consisting of different trainable modules to provide robust and data-driven tracking results. First, we learn how to fuse features from 2D images and 3D LiDAR point clouds to capture the appearance and geometric information of an object. Second, we propose to learn a metric that combines the Mahalanobis and feature distances when comparing a track and a new detection in data association. And third, we propose to learn when to initialize a track from an unmatched object detection. Through extensive quantitative and qualitative results, we show that our method outperforms current state-of-the-art on the NuScenes Tracking dataset.	翻訳日:2021-04-25 01:07:31 公開日:2020-12-26
# エッジ保存フィルタの評価と比較 Evaluation and Comparison of Edge-Preserving Filters ( http://arxiv.org/abs/2012.13778v1 ) ライセンス: Link先を確認	Sarah Gingichashvili and Dani Lischinski	(参考訳) エッジ保存フィルタは、抽象化、トーンマップ、細部の拡張、テクスチャの除去など、計算写真の最も基本的なタスクにおいて重要な役割を果たす。スムーズな演算子の多様さと多様性は、出力品質を評価したり、それらの間の非バイアス比較を行う方法論の欠如と共に、そのような方法の誤解や潜在的な誤用につながる可能性がある。本稿では,そのような演算子を評価・比較するための体系的手法を導入し,多種多様なエッジ保存フィルタ上で実証する。さらに,異なる演算子の比較が可能な共通ベースラインを提案し,それを用いてメソッド間の等価パラメータマッピングを決定する。最後に,エッジ保存フィルタの客観的比較と評価のためのガイドラインを提案する。 Edge-preserving filters play an essential role in some of the most basic tasks of computational photography, such as abstraction, tonemapping, detail enhancement and texture removal, to name a few. The abundance and diversity of smoothing operators, accompanied by a lack of methodology to evaluate output quality and/or perform an unbiased comparison between them, could lead to misunderstanding and potential misuse of such methods. This paper introduces a systematic methodology for evaluating and comparing such operators and demonstrates it on a diverse set of published edge-preserving filters. Additionally, we present a common baseline along which a comparison of different operators can be achieved and use it to determine equivalent parameter mappings between methods. Finally, we suggest some guidelines for objective comparison and evaluation of edge-preserving filters.	翻訳日:2021-04-25 01:07:13 公開日:2020-12-26
# IDSのためのLSTMの異なるハイパーパラメータの相対的重要性の評価 Assessment of the Relative Importance of different hyper-parameters of LSTM for an IDS ( http://arxiv.org/abs/2012.14427v1 ) ライセンス: Link先を確認	Mohit Sewak, Sanjay K. Sahay and Hemant Rathore	(参考訳) LSTMのような反復的なディープラーニング言語モデルは、しばしば高価値資産のための高度なサイバー防御を提供するために使用される。 LSTMネットワークをマルウェア検出に使用する基本的な前提は、マルウェアのオプトコードシーケンスを(偽)言語表現として扱うことができることである。音声言語(単語/単語の系列)と機械語(オペ符号の系列)には違いがある。本稿では,これら固有の違いから,ネットワークの必須ハイパーパラメータが適切に調整されない限り,音声言語用に調整されたデフォルト構成のLSTMモデルでは,マルウェアを検出するのに有効ではないことを示す。その過程では,lstmネットワークのすべての異なるハイパーパラメータの相対的重要性を,そのオペコードシーケンス表現を用いてマルウェア検出に適用する。 LSTMネットワークの異なる構成を実験し、埋め込みサイズ、隠蔽層数、隠蔽層数、入力ベクトルのプルーニング/パディング長、アクティベーション-ファンクション、バッチサイズなどのハイパーパラメータを変更した。侵入検知システム用に構成されたLSTMネットワークの性能は,マルウェア/機械言語の複雑さの増大により,隠れ層数,入力シーケンス長,アクティベーション-ファンクションの選択に非常に敏感であることが判明した。また、言語モデリングでは、リカレントアーキテクチャは非リカレントアーキテクチャよりも優れています。したがって、LSTMのような連続的なDLアーキテクチャが、マルウェア検出のために、MLP-DNNのようなシーケンシャルでないアーキテクチャとどのように比較するかを評価する。 Recurrent deep learning language models like the LSTM are often used to provide advanced cyber-defense for high-value assets. The underlying assumption for using LSTM networks for malware-detection is that the op-code sequence of malware could be treated as a (spoken) language representation. There are differences between any spoken-language (sequence of words/sentences) and the machine-language (sequence of op-codes). In this paper, we demonstrate that due to these inherent differences, an LSTM model with its default configuration as tuned for a spoken-language, may not work well to detect malware (using its op-code sequence) unless the network's essential hyper-parameters are tuned appropriately. In the process, we also determine the relative importance of all the different hyper-parameters of an LSTM network as applied to malware detection using their op-code sequence representations. We experimented with different configurations of LSTM networks, and altered hyper-parameters like the embedding-size, number of hidden layers, number of LSTM-units in a hidden layer, pruning/padding-length of the input-vector, activation-function, and batch-size. We discovered that owing to the enhanced complexity of the malware/machine-language, the performance of an LSTM network configured for an Intrusion Detection System, is very sensitive towards the number-of-hidden-layers, input sequence-length, and the choice of the activation-function. Also, for (spoken) language-modeling, the recurrent architectures by-far outperform their non-recurrent counterparts. Therefore, we also assess how sequential DL architectures like the LSTM compare against their non-sequential counterparts like the MLP-DNN for the purpose of malware-detection.	翻訳日:2021-04-25 01:07:00 公開日:2020-12-26
# 高精度ペグ・イン・ホールタスクの模倣学習 Imitation Learning for High Precision Peg-in-Hole Tasks ( http://arxiv.org/abs/2101.01052v1 ) ライセンス: Link先を確認	Sagar Gubbi and Shishir Kolathaya and Bharadwaj Amrutur	(参考訳) 産業用ロボットマニピュレータは、人間が現在に至るまで、コンタクトリッチなタスクを実行できる精度とスピードとを一致させることができない。そこで, このギャップを克服する手段として, 6-DOFロボットマニピュレータにおいて, 穴内挿入タスクを模倣する生成方法を示す。特に, GAIL(Generative Adversarial mimicion Learning, GAIL)は, 八川GP8産業用ロボットの10mと6mのペグホールクリアランスを用いて, このタスクを成功させる。実験の結果,ロボット上での少数の人間専門家によるデモンストレーション(遠隔操作ロボット10例)から20エピソード以内の学習が得られた。挿入時間は > 20 秒(挿入失敗を含む)から < 15 秒に改善され、このアプローチの有効性が検証される。 Industrial robot manipulators are not able to match the precision and speed with which humans are able to execute contact rich tasks even to this day. Therefore, as a means overcome this gap, we demonstrate generative methods for imitating a peg-in-hole insertion task in a 6-DOF robot manipulator. In particular, generative adversarial imitation learning (GAIL) is used to successfully achieve this task with a 10 um, and a 6 um peg-hole clearance on the Yaskawa GP8 industrial robot. Experimental results show that the policy successfully learns within 20 episodes from a handful of human expert demonstrations on the robot (i.e., < 10 tele-operated robot demonstrations). The insertion time improves from > 20 seconds (which also includes failed insertions) to < 15 seconds, thereby validating the effectiveness of this approach.	翻訳日:2021-04-25 01:06:35 公開日:2020-12-26
# エンド・ツー・エンドの模倣学習のためのマルチインスタンスアウェアローカライゼーション Multi-Instance Aware Localization for End-to-End Imitation Learning ( http://arxiv.org/abs/2101.01053v1 ) ライセンス: Link先を確認	Sagar Gubbi Venkatesh and Raviteja Upadrashta and Shishir Kolathaya and Bharadwaj Amrutur	(参考訳) イメージ・ツー・アクション・ポリシー・ネットワークを用いた模倣学習の既存のアーキテクチャは、興味のある対象の複数のインスタンスを含む入力画像が提示された場合、特に訓練に利用可能な専門家のデモ数が限られている場合、性能が低下する。 a) 視覚層の特徴マップ出力に、例の好みを示すような埋め込みや、専門家のデモに存在する暗黙の嗜好を活用できる埋め込みを付加し、(b) 制御層に自己回帰行動生成ネットワークを用いることで、エンドツーエンドのポリシーネットワークを効率的にトレーニングできることが示される。ローカライゼーションのためのアーキテクチャは精度とサンプル効率を向上し、トレーニング中に見るよりも多くのオブジェクトの存在を一般化することができる。エンド・ツー・エンドの模倣学習で実際のロボットでリーチ、プッシュ、ピック・アンド・プレイスのタスクを実行する場合、トレーニングは15のエキスパート・デモで達成される。 Existing architectures for imitation learning using image-to-action policy networks perform poorly when presented with an input image containing multiple instances of the object of interest, especially when the number of expert demonstrations available for training are limited. We show that end-to-end policy networks can be trained in a sample efficient manner by (a) appending the feature map output of the vision layers with an embedding that can indicate instance preference or take advantage of an implicit preference present in the expert demonstrations, and (b) employing an autoregressive action generator network for the control layers. The proposed architecture for localization has improved accuracy and sample efficiency and can generalize to the presence of more instances of objects than seen during training. When used for end-to-end imitation learning to perform reach, push, and pick-and-place tasks on a real robot, training is achieved with as few as 15 expert demonstrations.	翻訳日:2021-04-25 01:06:18 公開日:2020-12-26
# 模倣学習のための確率的行動予測 Stochastic Action Prediction for Imitation Learning ( http://arxiv.org/abs/2101.01055v1 ) ライセンス: Link先を確認	Sagar Gubbi Venkatesh and Nihesh Rathod and Shishir Kolathaya and Bharadwaj Amrutur	(参考訳) 模倣学習(imitation learning)は、専門家によるデモンストレーションに頼って、観察を行動にマッピングするポリシーを学ぶための、データ駆動の手法である。デモを行う場合、専門家は常に一貫性があり、わずかに異なる方法で同じタスクを達成する可能性がある。本稿では,遠隔操作車に追従するラインや,物体の到達,押圧,ピック,配置などの操作タスクを含む,実演における固有確率性を示す。自己回帰的行動生成,生成的逆ネット,変動予測を用いてデータ分布の確率性をモデル化し,これらの手法の性能を比較する。専門家データにおける確率性の説明は,タスク完了の成功率を大幅に向上させることがわかった。 Imitation learning is a data-driven approach to acquiring skills that relies on expert demonstrations to learn a policy that maps observations to actions. When performing demonstrations, experts are not always consistent and might accomplish the same task in slightly different ways. In this paper, we demonstrate inherent stochasticity in demonstrations collected for tasks including line following with a remote-controlled car and manipulation tasks including reaching, pushing, and picking and placing an object. We model stochasticity in the data distribution using autoregressive action generation, generative adversarial nets, and variational prediction and compare the performance of these approaches. We find that accounting for stochasticity in the expert data leads to substantial improvement in the success rate of task completion.	翻訳日:2021-04-25 01:05:57 公開日:2020-12-26
# 肺疾患と呼吸器疾患の予知に応用したインセプションベースネットワークとマルチスペクトログラム Inception-Based Network and Multi-Spectrogram Ensemble Applied For Predicting Respiratory Anomalies and Lung Diseases ( http://arxiv.org/abs/2012.13699v1 ) ライセンス: Link先を確認	Lam Pham, Huy Phan, Ross King, Alfred Mertins, Ian McLoughlin	(参考訳) 本稿では,呼吸音入力を用いた肺疾患検出のためのインセプションベースディープニューラルネットワークを提案する。患者から収集された呼吸音の記録は、まずスペクトル情報と時間情報の両方がよく提示される分光器に変換される。これらのスペクトログラムは、肺関連疾患に罹患する患者を検出するために、バックエンド分類と呼ばれる提案されたネットワークに供給される。呼吸音のicbhiベンチマークメタデータセットを用いて, 呼吸異常と疾患検出に関して, それぞれ0.53/0.45 と 0.87/0.85 の競合 icbhiスコアを達成する実験を行った。 This paper presents an inception-based deep neural network for detecting lung diseases using respiratory sound input. Recordings of respiratory sound collected from patients are firstly transformed into spectrograms where both spectral and temporal information are well presented, referred to as front-end feature extraction. These spectrograms are then fed into the proposed network, referred to as back-end classification, for detecting whether patients suffer from lung-relevant diseases. Our experiments, conducted over the ICBHI benchmark meta-dataset of respiratory sound, achieve competitive ICBHI scores of 0.53/0.45 and 0.87/0.85 regarding respiratory anomaly and disease detection, respectively.	翻訳日:2021-04-25 01:05:46 公開日:2020-12-26

Title

Authors

Abstract

論文公表日・翻訳日

# 短期量子コンピュータにおけるリサイクル量子ビット

Recycling qubits in near-term quantum computers ( http://arxiv.org/abs/2012.01676v2 )

ライセンス: Link先を確認

Galit Anikeeva, Isaac H. Kim, Patrick Hayden

(参考訳) 量子コンピュータは単体テンソルネットワークを効率的に収縮させることができる。例えば、行列積状態に基づくネットワークや、MERA(Multi-scale entanglement renormalization ansatz)は、小さな量子コンピュータ上で収縮し、大きな量子システムのシミュレーションを支援することができる。しかし、選択的に量子ビットをリセットする能力がなければ、関連する空間コストは無視できる。本稿では,回路が共通の畳み込み形式を持つ場合,量子ビットを一元的にリセット可能なプロトコルを提案する。このプロトコルは、使用されていないキュービットに時間反転量子回路を部分的に適用することにより、使用中のキュービットから新しいキュービットを生成する。ノイズがなければ、これらの量子ビットのサブセットの状態が$|0\ldots 0\rangle$となり、適用されるゲートの数で指数関数的に小さな誤差となる。また,ノイズの存在下でプロトコルが機能することを示す数値的な証拠も提示する。また,このプロトコルが雑音の存在下で機能することを示す数値的な証拠を提供し,ノイズ耐性が厳密に従う条件を定式化する。

Quantum computers are capable of efficiently contracting unitary tensor networks, a task that is likely to remain difficult for classical computers. For instance, networks based on matrix product states or the multi-scale entanglement renormalization ansatz (MERA) can be contracted on a small quantum computer to aid the simulation of a large quantum system. However, without the ability to selectively reset qubits, the associated spatial cost can be exorbitant. In this paper, we propose a protocol that can unitarily reset qubits when the circuit has a common convolutional form, thus dramatically reducing the spatial cost for implementing the contraction algorithm on general near-term quantum computers. This protocol generates fresh qubits from used ones by partially applying the time-reversed quantum circuit over qubits that are no longer in use. In the absence of noise, we prove that the state of a subset of these qubits becomes $|0\ldots 0\rangle$, up to an error exponentially small in the number of gates applied. We also provide a numerical evidence that the protocol works in the presence of noise. We also provide a numerical evidence that the protocol works in the presence of noise, and formulate a condition under which the noise-resilience follows rigorously.

翻訳日:2023-04-22 05:43:40 公開日:2020-12-26

# prove-it: 一般的な数学的知識の整理と検証のための証明アシスタント

Prove-It: A Proof Assistant for Organizing and Verifying General Mathematical Knowledge ( http://arxiv.org/abs/2012.10987v2 )

ライセンス: Link先を確認

Wayne M. Witzel, Warren D. Craft, Robert D. Carr and Joaqu\'in E. Madrid Larra\~naga

(参考訳) 形式的定理証明を(適度な訓練で)非公式な定理証明と同じくらい簡単かつ自然なものにすることを目的として設計された,pythonベースの汎用対話型定理証明アシスタントであるpromise-itを紹介する。 Prove-Itは、高フレキシブルなJupyterノートブックベースのユーザインターフェースを使用して、LaTeXを使用してインタラクションと証明ステップを文書化する。本稿では,表現,判断,定理,証明の表現を高度に表現し,$\sqrt{2}\notin\mathbb{q}$という従来の証明バイコントラディションを構築し,ラッセルやカリーのパラドックスのような矛盾を避ける方法について論じる。システムの中核要素に関する付録には、広範なドキュメントが記載されている。現在の開発と今後の研究は、量子回路操作と量子アルゴリズム検証への有望な応用を含んでいる。

We introduce Prove-It, a Python-based general-purpose interactive theorem-proving assistant designed with the goal of making formal theorem proving as easy and natural as informal theorem proving (with moderate training). Prove-It uses a highly-flexible Jupyter notebook-based user interface that documents interactions and proof steps using LaTeX. We review Prove-It's highly expressive representation of expressions, judgments, theorems, and proofs; demonstrate the system by constructing a traditional proof-by-contradiction that $\sqrt{2}\notin\mathbb{Q}$; and discuss how the system avoids inconsistencies such as Russell's and Curry's paradoxes. Extensive documentation is provided in the appendices about core elements of the system. Current development and future work includes promising applications to quantum circuit manipulation and quantum algorithm verification.

翻訳日:2023-04-20 02:27:20 公開日:2020-12-26

# 双対性特性解-情報キャリング(SIC)ユニタリプロパゲータ

The duality-character Solution-Information-Carrying (SIC) unitary propagators ( http://arxiv.org/abs/2012.13250v2 )

ライセンス: Link先を確認

Xijia Miao

(参考訳) HSSS量子探索プロセスは、ユニタリ量子力学と非構造探索問題の数学的原理の両方に従う二重文字を所有している。基本的には従来の量子探索アルゴリズムとは異なる。非構造探索問題の双対特性オラクル演算を用いて構築する。 1)探索空間の動的還元と(2)量子状態差分増幅(QUANSDAM)という2つの連続的なステップから構成される。 QUINSDAMプロセスはSICユニタリプロパゲータで直接構築され、後者は基本SICユニタリ演算子で準備される。ここでは、単原子系のSICユニタリプロパゲータの準備を、基本SICユニタリ演算子から開始して具体的に行う。量子系のSICユニタリプロパゲータは量子系の量子対称性を反映するが、基本的なSICユニタリ作用素は反映しない。量子対称性は、量子計算スピードアップ理論における基本的な量子計算スピードアップ資源と見なされる。この準備の目的は、量子対称性を利用してクエンダム過程を高速化することである。製剤は、溶液情報伝達プロセスである。単体的かつ決定論的である。情報保護法に従っている。方法論では、エネルギー固有関数展開と多重量子作用素代数空間に基づいている。さらに、ファインマン経路積分法とエネルギー固有関数展開法を主とする一般理論が確立され、任意の量子系の座標表現におけるSICユニタリプロパゲータを理論的に扱い、計算し、将来指数的quaNSDAM過程を理論的に構築することができる。

The HSSS quantum search process owns the dual character that it obeys both the unitary quantum dynamics and the mathematical-logical principle of the unstructured search problem. It is essentially different from a conventional quantum search algorithm. It is constructed with the duality-character oracle operations of unstructured search problem. It consists of the two consecutive steps: (1) the search-space dynamical reduction and (2) the dynamical quantum-state-difference amplification (QUANSDAM). The QUANSDAM process is directly constructed with the SIC unitary propagators, while the latter each are prepared with the basic SIC unitary operators. Here the preparation for the SIC unitary propagators of a single-atom system is concretely carried out by starting from the basic SIC unitary operators. The SIC unitary propagator of a quantum system may reflect the quantum symmetry of the quantum system, while the basic SIC unitary operators may not. The quantum symmetry is considered as the fundamental quantum-computing-speedup resource in the quantum-computing speedup theory. The purpose for the preparation is ultimately to employ the quantum symmetry to speed up the QUANSDAM process. The preparation is a solution-information transfer process. It is unitary and deterministic. It obeys the information conservation law. In methodology it is based on the energy eigenfunction expansion and the multiple-quantum operator algebra space. Furthermore, a general theory mainly based on the Feynman path integration technique and also the energy eigenfunction expansion method is established to treat theoretically and calculate a SIC unitary propagator of any quantum system in the coordinate representation, which may be further used to construct theoretically an exponential QUANSDAM process in future.

翻訳日:2023-04-19 11:53:13 公開日:2020-12-26

# 量子クレジットローン

Quantum credit loans ( http://arxiv.org/abs/2101.03231v1 )

ライセンス: Link先を確認

Ardenghi Juan Sebastian

(参考訳) 量子力学の数学(qm)に基づく量子モデルは認知科学、ゲーム理論、生態物理学で開発されてきた。この研究では、QM のベクトル空間形式を用いて、クレジットローンの一般化を導入する。負債、償却、利子、定期的な設定の演算子が定義され、ベクトル空間の任意の正規直交基底における平均値は、ローンの各期間において対応する値を与える。 M がローン期間である次元 M のベクトル空間を SO(M) 対称性で回転させることで、固有基底を回転させ、SO(M) 変換の回転角を用いて借主のスケジュール周期的な支払いを改善することができる。回転がベクトルの長さを保存することを考えると、総償却、負債、周期的な配置は変化しない。導入された形式論の一般的な説明として、ローン作用素関係は有限次元表現を考慮し、特定のローンタイプに対して可換作用素を定義する一般化ハイゼンベルク代数(英語版)によって与えられる。その結果、借主が定期的な設置を調整できるように、貸し手が稼ぐものを変えることなく、より良い利益を得るために対応する交換業者の重ね合わせ状態を選択できるため、回転角を通じて自由度が数度導入されるため、通常のクレジットの金融機器の改善が図られる。

Quantum models based on the mathematics of quantum mechanics (QM) have been developed in cognitive sciences, game theory and econophysics. In this work a generalization of credit loans is introduced by using the vector space formalism of QM. Operators for the debt, amortization, interest and periodic installments are defined and its mean values in an arbitrary orthonormal basis of the vectorial space give the corresponding values at each period of the loan. Endowing the vector space of dimension M, where M is the loan duration, with a SO(M) symmetry, it is possible to rotate the eigenbasis to obtain better schedule periodic payments for the borrower, by using the rotation angles of the SO(M) transformation. Given that a rotation preserves the length of the vectors, the total amortization, debt and periodic installments are not changed. For a general description of the formalism introduced, the loan operator relations are given in terms of a generalized Heisenberg algebra, where finite dimensional representations are considered and commutative operators are defined for the specific loan types. The results obtained are an improvement of the usual financial instrument of credit because introduce several degrees of freedom through the rotation angles, which allows to select superposition states of the corresponding commutative operators that enables the borrower to tune the periodic installments in order to obtain better benefits without changing what the lender earns.

翻訳日:2023-04-19 05:56:48 公開日:2020-12-26

# シリセンの局所磁気モーメントに及ぼす外部電場の影響

Effect of an external electric field on local magnetic moments in silicene ( http://arxiv.org/abs/2101.00952v1 )

ライセンス: Link先を確認

Villarreal Julian, Escudero Federico, Ardenghi Juan Sebastian and Jasen Paula

(参考訳) 本研究では,シリセンにおける局所磁気モーメントの形成における外部電場の適用の影響を分析する。ホスト格子の上部に不純物を加えて、不純物エネルギーレベルの自己エネルギーの現実的および想像的部分を計算することにより、平均場近似を考慮した不純物における上下スピン形成の占有数を得るために、状態の偏極密度を用いる。不等占領数は局所的な磁気モーメントの形成の前駆体であり、これはハバードパラメータ、不純物のオンサイトエネルギー、シリセン中のスピン軌道相互作用、および電場に大きく依存する。特に、電場がない場合、磁気相と非磁性相の境界は、トップサイト不純物を持つグラフェンに対するスピン軌道相互作用によって増大し、電場がオンになると収縮して狭くなることが示されている。グラフェンの文献で得られた結果を一般化した負および正のオンサイト不純物エネルギーに対する電界効果について検討した。

In this work we analyze the effects of the application of an external electric field in the formation of a local magnetic moment in silicene. By adding an impurity in a top site in the host lattice and computing the real and imaginary part of the self-energy of the impurity energy level, the polarized density of states is used in order to obtain the occupation number of the up and down spin formation in the impurity considering the mean field approximation. Unequal occupation numbers is the precursor of a formation of a local magnetic moment and this depends critically on the Hubbard parameter, the on-site energy of the impurity, the spin-orbit interaction in silicene and the electric field applied. In particular, it is shown that in the absence of electric field, the boundary between the magnetic and non-magnetic phases increases with the spin-orbit interaction with respect to graphene with a top site impurity and shrinks and narrows it when the electric field is turned on. The electric field effect is studied for negative and positive on-site impurity energies generalizing the results obtained in the literature for graphene.

翻訳日:2023-04-19 05:56:25 公開日:2020-12-26

# 非線形ゲージ結合量子流体の流体力学

Hydrodynamics of nonlinear gauge-coupled quantum fluids ( http://arxiv.org/abs/2012.13834v1 )

ライセンス: Link先を確認

Y. Buggy, L.G. Phillips and P. \"Ohberg

(参考訳) 流体力学の正準形式を構築することにより,ボース凝縮流体の平均場ハミルトニアンにおける任意の密度依存ゲージポテンシャルの発生は,流体密度に対する力学的流れの明示的な依存性によって生じる位相の波動方程式における非線形フロー依存項をもたらすことを示した。さらに、この種類の非線形流体に対して標準運動量輸送方程式を導出し、応力テンソルの式を得る。さらに, 非線形流体中の流体力学式について検討し, 光学配向二層原子の超低温希薄ボースガス中の弱い接触相互作用の導入による有効ゲージポテンシャルについて検討した。超流体の機械運動量輸送のコーシー方程式では、密度依存ベクトルポテンシャルにより2つの非自明な項が現れる。希釈の体力はゲージポテンシャルと流体の希釈率の積として現れるが、応力テンソルはゲージポテンシャルの内積と正準電流密度によって与えられる正準流れ圧力項を特徴とする。数値シミュレーションにより, 外部不純物の存在下での超流体の基底状態波動関数に対する非線形ゲージポテンシャルの興味深い影響を示す。基底状態は非自明な局所位相を採用しており、ゲージポテンシャルの反転の下では非対称である。相プロファイルは、不純物に関する正準流または相流双極子につながり、機械的な流れを引き起こす。その結果、圧力は物体に対して非対称になり、凝縮物は変形する。

By constructing a hydrodynamic canonical formalism, we show that the occurrence of an arbitrary density-dependent gauge potential in the meanfield Hamiltonian of a Bose-condensed fluid invariably leads to nonlinear flow-dependent terms in the wave equation for the phase, where such terms arise due to the explicit dependence of the mechanical flow on the fluid density. In addition, we derive a canonical momentum transport equation for this class of nonlinear fluid and obtain an expression for the stress tensor. Further, we study the hydrodynamic equations in a particular nonlinear fluid, where the effective gauge potential results from the introduction of weak contact interactions in an ultracold dilute Bose gas of optically-addressed two-level atoms. In the Cauchy equation of mechanical momentum transport of the superfluid, two non-trivial terms emerge due to the density-dependent vector potential. A body-force of dilation appears as a product of the gauge potential and the dilation rate of the fluid, while the stress tensor features a canonical flow pressure term given by the inner-product of the gauge potential and the canonical current density. By numerical simulation, we illustrate an interesting effect of the nonlinear gauge potential on the groundstate wavefunction of a superfluid in the presence of a foreign impurity. We find that the groundstate adopts a non-trivial local phase, which is antisymmetric under reversal of the gauge potential. The phase profile leads to a canonical-flow or phase-flow dipole about the impurity, resulting in a skirting mechanical flow. As a result, the pressure becomes asymmetric about the object and the condensate undergoes a deformation.

翻訳日:2023-04-19 05:56:07 公開日:2020-12-26

# Varshni-Hulth\'enポテンシャルモデルと相互作用するN-次元シュリンガー方程式の固有解

Eigensolutions of the N-dimensional Schr\"odinger equation interacting with Varshni-Hulth\'en potential model ( http://arxiv.org/abs/2012.13826v1 )

ライセンス: Link先を確認

E. P. Inyang, E. S. William and J. A. Obu

(参考訳) 新しく提案されたヴァルシュニ・ハルトポテンシャルに対するN次元シュリンガー方程式の解析解は、遠心障壁に対するグリーン・アルドリッチ近似スキームを用いてニキフォロフ・ウバロフ法の枠組みの中で得られる。数値エネルギー固有値と対応する正規化固有関数はヤコビ多項式の項で得られる。ポテンシャルの特別なケースは等しく研究され、それらの数値エネルギー固有値は他の手法で得られたものと一致している。しかし、基底状態といくつかの励起状態に対するエネルギーの挙動は図式的に示される。

Analytical solutions of the N-dimensional Schr\"odinger equation for the newly proposed Varshni-Hulth\'en potential are obtained within the framework of Nikiforov-Uvarov method by using Greene-Aldrich approximation scheme to the centrifugal barrier. The numerical energy eigenvalues and the corresponding normalized eigenfunctions are obtained in terms of Jacobi polynomials. Special cases of the potential are equally studied and their numerical energy eigenvalues are in agreement with those obtained previously with other methods. However, the behavior of the energy for the ground state and several excited states is illustrated graphically.

翻訳日:2023-04-19 05:55:40 公開日:2020-12-26

# 集合非線形性を用いた近決定的弱値メトロロジー

Near-Deterministic Weak-Value Metrology via Collective non-Linearity ( http://arxiv.org/abs/2012.13749v1 )

ライセンス: Link先を確認

Muthumanimaran Vetrivelan and Sai Vinjanampathy

(参考訳) 弱値増幅は、関心の小さなパラメータの測定を強化するためにポストセレクションを用いる。増幅は成功確率の低下を犠牲にして行われ、実用的メトロロジーのツールとしてこの技術の有用性を阻害する。量子アドバンテージを示す他の量子技術に従い、成功確率の量子アドバンテージを定式化し、成功確率の超拡大を示す非線形集団ハミルトニアンに基づくスキームを提示し、同時に弱値の広範な成長を示す。提案手法の実験的実装を提案する。

Weak-value amplification employs postselection to enhance the measurement of small parameters of interest. The amplification comes at the expense of reduced success probability, hindering the utility of this technique as a tool for practical metrology. Following other quantum technologies that display a quantum advantage, we formalize a quantum advantage in the success probability and present a scheme based on non-linear collective Hamiltonians that shows a super-extensive growth in success probability while simultaneously displaying an extensive growth in the weak value. We propose an experimental implementation of our scheme.

翻訳日:2023-04-19 05:54:34 公開日:2020-12-26

# 2モードスクイーズ光を用いた量子増強2光子分光

Quantum-Enhanced Two-Photon Spectroscopy Using Two-mode Squeezed Light ( http://arxiv.org/abs/2012.13745v1 )

ライセンス: Link先を確認

Nikunjkumar Prajapati, Ziqi Niu, and Irina Novikova

(参考訳) 本研究では,Rb蒸気中で発生する2モード強度強化ツインビームを用いて,2光子ラマン転移による分光測定の感度向上を図る。原理実証の実証として,ラマンポンプレーザーパワーとrb蒸気数密度の要件を低減した超微細構造rb 5.d_{3/2}の量子エンハンス測定を実証した。

We investigate the prospects of using two-mode intensity squeezed twin-beams, generated in Rb vapor, to improve the sensitivity of spectroscopic measurements by engaging two-photon Raman transitions. As a proof of principle demonstration, we demonstrated the quantum-enhanced measurements of the Rb $5D_{3/2}$ hyperfine structure with reduced requirements for the Raman pump laser power and Rb vapor number density.

翻訳日:2023-04-19 05:54:24 公開日:2020-12-26

# 深いシグマ点過程

Deep Sigma Point Processes ( http://arxiv.org/abs/2002.09112v2 )

ライセンス: Link先を確認

Martin Jankowiak, Geoff Pleiss, Jacob R. Gardner

(参考訳) 本稿では,Deep Gaussian Processes (DGP) の構成構造から着想を得たパラメトリックモデルのクラスであるDeep Sigma Point Processesを紹介する。ディープシグマポイントプロセス(DSPP)は、カーネル基底関数によって制御されるミニバッチトレーニングや予測不確実性を含む、(可変)DGPの魅力的な特徴の多くを保持している。重要なことは、DSPPは単純な極大推定手順を許容しているため、結果として生じる予測分布は後部近似によって劣化しない。単変量および多変量回帰タスクに関する広範な実証的な比較では、結果の予測分布は、拡張性のある回帰のための他の確率的手法で得られたものよりも、はるかによく校正されている。

We introduce Deep Sigma Point Processes, a class of parametric models inspired by the compositional structure of Deep Gaussian Processes (DGPs). Deep Sigma Point Processes (DSPPs) retain many of the attractive features of (variational) DGPs, including mini-batch training and predictive uncertainty that is controlled by kernel basis functions. Importantly, since DSPPs admit a simple maximum likelihood inference procedure, the resulting predictive distributions are not degraded by any posterior approximations. In an extensive empirical comparison on univariate and multivariate regression tasks we find that the resulting predictive distributions are significantly better calibrated than those obtained with other probabilistic methods for scalable regression, including variational DGPs--often by as much as a nat per datapoint.

翻訳日:2022-12-30 00:33:18 公開日:2020-12-26

# 深層強化学習のための弱い人間選好監督

Weak Human Preference Supervision For Deep Reinforcement Learning ( http://arxiv.org/abs/2007.12904v2 )

ライセンス: Link先を確認

Zehong Cao, KaiChiu Wong, Chin-Teng Lin

(参考訳) 人間の好みからの現在の報酬学習は、一対の軌道セグメント間の単一の固定された嗜好を定義することで、報酬関数にアクセスせずに複雑な強化学習(RL)タスクを解決するために使用できる。しかし、軌道間の選好の判断は動的ではなく、何千回も繰り返して人間の入力を必要とする。本研究では,人選好の選好度を自然に反映した人選好スケーリングモデルを構築し,教師付き学習による人選好推定装置を構築し,人選好数を減らすための予測選好を生成するという,弱い人選好監視フレームワークを提案する。提案されている弱い人間の嗜好監視フレームワークは、複雑なRLタスクを効果的に解決し、シミュレーションされたロボットの移動 -- MuJoCoゲーム -- における累積的な報酬を達成することができる。さらに,本手法では,環境との相互作用の0.01 %未満の人的フィードバックしか必要とせず,既存の手法と比較して,人的入力のコストを最大30 %削減する。このアプローチの柔軟性を示すために、私たちは、異なるタイプの人間の入力に基づいて訓練されたエージェントの振る舞いの比較を示すビデオ(https://youtu.be/jQPe1OILT0M)をリリースした。我々は、弱い教師付き学習による自然にインスピレーションを受けた人間の嗜好が、正確な報酬学習に有用であり、人間と自律的なチームリングシステムのような最先端のRLシステムに適用できると考えている。

The current reward learning from human preferences could be used to resolve complex reinforcement learning (RL) tasks without access to a reward function by defining a single fixed preference between pairs of trajectory segments. However, the judgement of preferences between trajectories is not dynamic and still requires human input over thousands of iterations. In this study, we proposed a weak human preference supervision framework, for which we developed a human preference scaling model that naturally reflects the human perception of the degree of weak choices between trajectories and established a human-demonstration estimator via supervised learning to generate the predicted preferences for reducing the number of human inputs. The proposed weak human preference supervision framework can effectively solve complex RL tasks and achieve higher cumulative rewards in simulated robot locomotion -- MuJoCo games -- relative to the single fixed human preferences. Furthermore, our established human-demonstration estimator requires human feedback only for less than 0.01\% of the agent's interactions with the environment and significantly reduces the cost of human inputs by up to 30\% compared with the existing approaches. To present the flexibility of our approach, we released a video (https://youtu.be/jQPe1OILT0M) showing comparisons of the behaviours of agents trained on different types of human input. We believe that our naturally inspired human preferences with weakly supervised learning are beneficial for precise reward learning and can be applied to state-of-the-art RL systems, such as human-autonomy teaming systems.

翻訳日:2022-11-07 01:10:23 公開日:2020-12-26

# 意思決定アルゴリズム評価のためのマルチモーダル安全批判シナリオ生成

Multimodal Safety-Critical Scenarios Generation for Decision-Making Algorithms Evaluation ( http://arxiv.org/abs/2009.08311v3 )

ライセンス: Link先を確認

Wenhao Ding, Baiming Chen, Bo Li, Kim Ji Eun, Ding Zhao

(参考訳) 既存のニューラルネットワークベースの自律システムは敵攻撃に対して脆弱であるため、その堅牢性に関する高度な評価は非常に重要である。しかしながら、既知の攻撃に基づいて最悪のシナリオでのみロバスト性を評価することは包括的ではない。加えて、安全クリティカルなデータの分布は通常マルチモーダルであり、伝統的な攻撃や評価方法は単一のモダリティに焦点を当てている。上記の課題を解決するため,意思決定アルゴリズムを評価するためのフローベースマルチモーダル安全クリティカルシナリオジェネレータを提案する。提案する生成モデルは重み付き確率最大化により最適化され, 勾配に基づくサンプリング手法が統合され, サンプリング効率が向上する。セーフティクリティカルなシナリオはタスクアルゴリズムをクエリすることで生成され、生成されたシナリオのログライクな状態はリスクレベルに比例する。自動運転タスクの実験は、テスト効率とマルチモーダルモデリング能力の観点から、我々の利点を示しています。 6つの強化学習アルゴリズムを生成したトラヒックシナリオで評価し,その頑健性に関する実証的結論を与える。

Existing neural network-based autonomous systems are shown to be vulnerable against adversarial attacks, therefore sophisticated evaluation on their robustness is of great importance. However, evaluating the robustness only under the worst-case scenarios based on known attacks is not comprehensive, not to mention that some of them even rarely occur in the real world. In addition, the distribution of safety-critical data is usually multimodal, while most traditional attacks and evaluation methods focus on a single modality. To solve the above challenges, we propose a flow-based multimodal safety-critical scenario generator for evaluating decisionmaking algorithms. The proposed generative model is optimized with weighted likelihood maximization and a gradient-based sampling procedure is integrated to improve the sampling efficiency. The safety-critical scenarios are generated by querying the task algorithms and the log-likelihood of the generated scenarios is in proportion to the risk level. Experiments on a self-driving task demonstrate our advantages in terms of testing efficiency and multimodal modeling capability. We evaluate six Reinforcement Learning algorithms with our generated traffic scenarios and provide empirical conclusions about their robustness.

翻訳日:2022-10-17 23:47:22 公開日:2020-12-26

# コメントとソースコード間の深いジャストインタイム不整合検出

Deep Just-In-Time Inconsistency Detection Between Comments and Source Code ( http://arxiv.org/abs/2010.01625v2 )

ライセンス: Link先を確認

Sheena Panthaplackel, Junyi Jessy Li, Milos Gligoric, Raymond J. Mooney

(参考訳) 自然言語コメントは、実装、使用法、プリ・ポスト・コンディションといったソースコードの重要な側面を伝える。対応するコードが変更されたときにコメントを更新するのに失敗すると、矛盾が生じ、混乱とソフトウェアのバグが引き起こされる。本稿では,コードベースにコミットする前に,コメントが対応するコード本体の変更によって一貫性のないものになるかどうかを検知し,潜在的な不整合性,すなわち,コードベースにコミットする前に検出することを目的とする。これを実現するために,コメントとコードの変更を関連付けるディープラーニングアプローチを開発した。様々なコメントタイプにまたがるコメント/コードペアの大規模なコーパスを評価することで,本モデルが複数のベースラインを著しく上回ることを示す。外部評価において,コード変更に基づく不整合コメントの検出と解決が可能な,より包括的な自動コメント保守システムを構築するために,コメント更新モデルと組み合わせて提案手法の有用性を示す。

Natural language comments convey key aspects of source code such as implementation, usage, and pre- and post-conditions. Failure to update comments accordingly when the corresponding code is modified introduces inconsistencies, which is known to lead to confusion and software bugs. In this paper, we aim to detect whether a comment becomes inconsistent as a result of changes to the corresponding body of code, in order to catch potential inconsistencies just-in-time, i.e., before they are committed to a code base. To achieve this, we develop a deep-learning approach that learns to correlate a comment with code changes. By evaluating on a large corpus of comment/code pairs spanning various comment types, we show that our model outperforms multiple baselines by significant margins. For extrinsic evaluation, we show the usefulness of our approach by combining it with a comment update model to build a more comprehensive automatic comment maintenance system which can both detect and resolve inconsistent comments based on code changes.

翻訳日:2022-10-11 02:58:00 公開日:2020-12-26

# 重み付き不均一グラフに基づく対話システム

A Weighted Heterogeneous Graph Based Dialogue System ( http://arxiv.org/abs/2010.10699v2 )

ライセンス: Link先を確認

Xinyan Zhao, Liangwei Chen, Huanhuan Chen

(参考訳) 知識に基づく対話システムは、多様なアプリケーションに対する研究の関心を惹きつけている。しかし, 疾患診断においては, 従来の知識グラフのエッジが重み付けされていないため, 症状-症状関係と症状-症状関係を表現することは困難である。疾患診断対話システムに関するほとんどの研究は、データ駆動型手法と統計的特徴に強く依存しており、症状-症状関係と症状-交感神経関係の深い理解を欠いている。そこで本研究では,重み付きヘテロジニアスグラフを用いた疾患診断のための対話システムを提案する。具体的には、症状共起に基づく重み付きヘテロジニアスグラフと、症状周波数逆病頻度を提案する。次に,対話管理のためのグラフベースのディープqネットワーク(graph-dqn)を提案する。グラフ畳み込みネットワーク(GCN)とDQNを組み合わせることで、重み付きヘテロジニアスグラフの構造情報と属性情報の両方から疾患や症状の埋め込みを学習することで、Graph-DQNは症状・症状・症状の関係をよりよく捉えることができる。実験の結果,提案する対話システムは最先端のモデルに匹敵することがわかった。さらに重要なことは、対話システムは対話のターンを減らしてタスクを完了し、類似の症状を持つ疾患に対するより良い識別能力を有することである。

Knowledge based dialogue systems have attracted increasing research interest in diverse applications. However, for disease diagnosis, the widely used knowledge graph is hard to represent the symptom-symptom relations and symptom-disease relations since the edges of traditional knowledge graph are unweighted. Most research on disease diagnosis dialogue systems highly rely on data-driven methods and statistical features, lacking profound comprehension of symptom-disease relations and symptom-symptom relations. To tackle this issue, this work presents a weighted heterogeneous graph based dialogue system for disease diagnosis. Specifically, we build a weighted heterogeneous graph based on symptom co-occurrence and a proposed symptom frequency-inverse disease frequency. Then this work proposes a graph based deep Q-network (Graph-DQN) for dialogue management. By combining Graph Convolutional Network (GCN) with DQN to learn the embeddings of diseases and symptoms from both the structural and attribute information in the weighted heterogeneous graph, Graph-DQN could capture the symptom-disease relations and symptom-symptom relations better. Experimental results show that the proposed dialogue system rivals the state-of-the-art models. More importantly, the proposed dialogue system can complete the task with less dialogue turns and possess a better distinguishing capability on diseases with similar symptoms.

翻訳日:2022-10-04 22:39:13 公開日:2020-12-26

# ロシア語の科学・技術文献からのエンティティ認識と関係抽出

Entity Recognition and Relation Extraction from Scientific and Technical Texts in Russian ( http://arxiv.org/abs/2011.09817v3 )

ライセンス: Link先を確認

Elena Bruches, Alexey Pauls, Tatiana Batura, Vladimir Isachenko

(参考訳) 本稿では,情報技術に関する学術文献から情報抽出(エンティティ認識と関係分類)の手法について考察する。科学出版物は最先端の科学的進歩に貴重な情報を提供するが、データ量の増加の効率的な処理は時間のかかる作業である。本稿では、ロシア語の方法のいくつかの修正を提案する。また、キーワード抽出法、語彙法、およびニューラルネットワークに基づくいくつかの方法の比較実験結果を含む。これらのタスクのためのテキストコレクションは英語に存在し、科学コミュニティが積極的に使用しているが、現在、ロシア語のデータセットは公開されていない。本稿では,ロシアにおける学術文献のコーパス,RuSERRCについて述べる。このデータセットは1600の未ラベル文書と80のエンティティとセマンティックリレーションでラベル付けされている(6つの関係型が考慮された)。データセットとモデルはhttps://github.com/iis-research-teamで入手できる。情報抽出システムの研究や開発に活用できることを願っている。

This paper is devoted to the study of methods for information extraction (entity recognition and relation classification) from scientific texts on information technology. Scientific publications provide valuable information into cutting-edge scientific advances, but efficient processing of increasing amounts of data is a time-consuming task. In this paper, several modifications of methods for the Russian language are proposed. It also includes the results of experiments comparing a keyword extraction method, vocabulary method, and some methods based on neural networks. Text collections for these tasks exist for the English language and are actively used by the scientific community, but at present, such datasets in Russian are not publicly available. In this paper, we present a corpus of scientific texts in Russian, RuSERRC. This dataset consists of 1600 unlabeled documents and 80 labeled with entities and semantic relations (6 relation types were considered). The dataset and models are available at https://github.com/iis-research-team. We hope they can be useful for research purposes and development of information extraction systems.

翻訳日:2022-09-23 21:00:23 公開日:2020-12-26

# 多発性硬化性病変の分節化 : CNN法の検討

Multiple Sclerosis Lesion Segmentation -- A Survey of Supervised CNN-Based Methods ( http://arxiv.org/abs/2012.08317v2 )

ライセンス: Link先を確認

Huahong Zhang and Ipek Oguz

(参考訳) 病変分割は多発性硬化症患者のmriスキャンを定量的に解析するための重要な課題である。近年,様々な医療画像解析アプリケーションにおける深層学習技術の成功により,この課題に対するコミュニティの関心が高まり,新たなアルゴリズム開発に向けた活動が活発化している。そこで本研究では,CNNを用いたMS病変分類法について検討した。レビューした作品をアルゴリズムコンポーネントに分離し,それぞれを別々に議論する。公開ベンチマークデータセットの評価を行う手法については,結果の比較を報告する。

Lesion segmentation is a core task for quantitative analysis of MRI scans of Multiple Sclerosis patients. The recent success of deep learning techniques in a variety of medical image analysis applications has renewed community interest in this challenging problem and led to a burst of activity for new algorithm development. In this survey, we investigate the supervised CNN-based methods for MS lesion segmentation. We decouple these reviewed works into their algorithmic components and discuss each separately. For methods that provide evaluations on public benchmark datasets, we report comparisons between their results.

翻訳日:2021-05-10 05:20:51 公開日:2020-12-26

# 分散検索によるクエリ応答

Query Answering via Decentralized Search ( http://arxiv.org/abs/2012.12192v2 )

ライセンス: Link先を確認

Liang Ma

(参考訳) エキスパートネットワークは、ネットワークに投稿された特定のクエリを協調的に解決するために、異なる専門性を持つ専門家専門家のグループによって形成される。このようなネットワークでは、十分な専門知識を持たない専門家に問い合わせが届くと、このクエリを他の専門家にルーティングして、完全に解決するまで処理する必要があるため、クエリ応答効率は、使用されているクエリルーティングメカニズムに敏感である。可能なすべてのクエリルーティング機構のうち、ネットワークのグローバル構造を知らずに各専門家のローカル情報に対して純粋に動作する分散検索は、最も基本的でスケーラブルなルーティング機構であり、動的ネットワークにおいても任意のネットワークシナリオに適用できる。しかし、専門家ネットワークにおける分散検索の効率性に関する基本的な理解が不足している。本稿では,様々なネットワーク環境下での性能を定量化し,分散検索について検討する。我々の重要な発見はネットワーク条件の存在を示し、その下にある分散検索は、非常に短いクエリルーティングパスを実現することができる(すなわち、$O(\log n)$と$O(\log^2 n)$ホップ、$n$:ネットワークの専門家の総数)。このような理論的基礎に基づき、専門家ネットワークにおける分散探索のユニークな性質が、逸話的小世界現象とどのように関連しているかをさらに研究する。さらに,必要な専門知識レベルを誤解釈することによって生じる推定誤差に対して,分散検索が堅牢であることを示す。我々の知る限りでは、これはエキスパートネットワークにおける分散検索の基本的な振る舞いを研究する最初の研究である。開発したパフォーマンス境界は、実際のデータセットによって確認され、ネットワークパフォーマンスの予測と複雑なエキスパートネットワークの設計を支援することができる。

Expert networks are formed by a group of expert-professionals with different specialties to collaboratively resolve specific queries posted to the network. In such networks, when a query reaches an expert who does not have sufficient expertise, this query needs to be routed to other experts for further processing until it is completely solved; therefore, query answering efficiency is sensitive to the underlying query routing mechanism being used. Among all possible query routing mechanisms, decentralized search, operating purely on each expert's local information without any knowledge of network global structure, represents the most basic and scalable routing mechanism, which is applicable to any network scenarios even in dynamic networks. However, there is still a lack of fundamental understanding of the efficiency of decentralized search in expert networks. In this regard, we investigate decentralized search by quantifying its performance under a variety of network settings. Our key findings reveal the existence of network conditions, under which decentralized search can achieve significantly short query routing paths (i.e., between $O(\log n)$ and $O(\log^2 n)$ hops, $n$: total number of experts in the network). Based on such theoretical foundation, we further study how the unique properties of decentralized search in expert networks is related to the anecdotal small-world phenomenon. In addition, we demonstrate that decentralized search is robust against estimation errors introduced by misinterpreting the required expertise levels. To the best of our knowledge, this is the first work studying fundamental behaviors of decentralized search in expert networks. The developed performance bounds, confirmed by real datasets, are able to assist in predicting network performance and designing complex expert networks.

翻訳日:2021-05-01 18:15:42 公開日:2020-12-26

# (参考訳) 不確実性定量化による糖尿病網膜症分類の能動的学習法

An Active Learning Method for Diabetic Retinopathy Classification with Uncertainty Quantification ( http://arxiv.org/abs/2012.13325v2 )

ライセンス: CC BY 4.0

Muhammad Ahtazaz Ahsan, Adnan Qayyum, Junaid Qadir and Adeel Razi

(参考訳) 近年、深層学習(DL)技術は様々な医療画像のタスクに最先端のパフォーマンスを提供している。しかし、時間的制約や専門的なアノテータ、例えば放射線技師が利用できるため、良質なアノテート医療データの入手は非常に困難である。加えて、DLはデータハングリーであり、そのトレーニングには広範な計算資源が必要である。 DLのもう1つの問題は、そのブラックボックスの性質と、因果的理解と推論を妨げる内部動作への透明性の欠如である。本稿では,不確実性定量化のためのベイズ畳み込みニューラルネットワーク(BCNN)を用いたハイブリッドモデルと,未ラベルデータの注釈付けのためのアクティブラーニングアプローチを提案する。 BCNNは機能記述子として使用され、これらの機能は、アクティブな学習環境でモデルのトレーニングに使用される。糖尿病網膜症分類の枠組みについて検討し,様々な指標で最先端のパフォーマンスを達成した。

In recent years, deep learning (DL) techniques have provided state-of-the-art performance on different medical imaging tasks. However, the availability of good quality annotated medical data is very challenging due to involved time constraints and the availability of expert annotators, e.g., radiologists. In addition, DL is data-hungry and their training requires extensive computational resources. Another problem with DL is their black-box nature and lack of transparency on its inner working which inhibits causal understanding and reasoning. In this paper, we jointly address these challenges by proposing a hybrid model, which uses a Bayesian convolutional neural network (BCNN) for uncertainty quantification, and an active learning approach for annotating the unlabelled data. The BCNN is used as a feature descriptor and these features are then used for training a model, in an active learning setting. We evaluate the proposed framework for diabetic retinopathy classification problem and have achieved state-of-the-art performance in terms of different metrics.

翻訳日:2021-04-25 13:13:08 公開日:2020-12-26

# (参考訳) 組織的サイバーセキュリティリスクの予測 - ディープラーニングアプローチ

Predicting Organizational Cybersecurity Risk: A Deep Learning Approach ( http://arxiv.org/abs/2012.14425v1 )

ライセンス: CC BY 4.0

Benjamin M. Ampel

(参考訳) 悪意あるハッカーによるサイバー攻撃は、組織、政府、個人に毎年不可分なダメージを与える。ハッカーはハッカーフォーラムで見つかったエクスプロイトを使って複雑なサイバー攻撃を実行し、これらのフォーラムの探索を不可欠にする。本稿では,攻撃対象と攻撃対象を識別するためのハッカーフォーラムエンティティ認識フレームワーク(HackER)を提案する。 hackerは双方向のlong short-term memory model(bilstm)を使用して、企業がエクスプロイト対象とする予測モデルを作成する。アルゴリズムの結果は、精度、精度、リコール、F1スコアを指標として、手動でゴールドスタンダードテストデータセットを使用して評価される。このモデルと最先端の古典的機械学習とディープラーニングのベンチマークモデルを比較します。その結果,提案したHacker BiLSTMモデルはF1スコア(79.71%)の古典的機械学習モデルやディープラーニングモデルよりも優れていた。これらの結果はLSTMを除く全てのベンチマークで0.05以下で統計的に有意である。予備研究の結果から,サイバーセキュリティの重要なステークホルダー(アナリスト,研究者,教育者など)が,エクスプロイトがターゲットとするビジネスの種類を特定するのに役立つことが示唆された。

Cyberattacks conducted by malicious hackers cause irreparable damage to organizations, governments, and individuals every year. Hackers use exploits found on hacker forums to carry out complex cyberattacks, making exploration of these forums vital. We propose a hacker forum entity recognition framework (HackER) to identify exploits and the entities that the exploits target. HackER then uses a bidirectional long short-term memory model (BiLSTM) to create a predictive model for what companies will be targeted by exploits. The results of the algorithm will be evaluated using a manually labeled gold-standard test dataset, using accuracy, precision, recall, and F1-score as metrics. We choose to compare our model against state of the art classical machine learning and deep learning benchmark models. Results show that our proposed HackER BiLSTM model outperforms all classical machine learning and deep learning models in F1-score (79.71%). These results are statistically significant at 0.05 or lower for all benchmarks except LSTM. The results of preliminary work suggest our model can help key cybersecurity stakeholders (e.g., analysts, researchers, educators) identify what type of business an exploit is targeting.

翻訳日:2021-04-25 04:15:46 公開日:2020-12-26

# (参考訳) rough to fine: global/local attentionによるマルチラベル画像分類

Coarse to Fine: Multi-label Image Classification with Global/Local Attention ( http://arxiv.org/abs/2012.13662v1 )

ライセンス: CC BY 4.0

Fan Lyu, Fuyuan Hu, Victor S. Sheng, Zhengtian Wu, Qiming Fu and Baochuan Fu

(参考訳) 私たちの日常生活では、周囲のシーンは常に複数のラベルがあり、特にスマートシティ、すなわち、応答と制御に対する都市操作の情報を認識する。ディープニューラルネットワークを使ってマルチラベル画像を認識することで、大きな努力がなされている。マルチラベル画像分類は非常に複雑であるため、注意機構を用いて分類プロセスを導こうとしている。しかし,従来の注意法は画像を直接的かつ積極的に分析する。複雑な場面をよく理解することは困難である。本稿では,人間による画像観察を模倣することで,粗い画像から細かい画像まで認識できるグローバル/ローカルアテンション手法を提案する。具体的には、まず、グローバル/ローカルアテンション手法が画像全体に集中し、次に画像内の局所的なオブジェクトに注目します。また,正のラベルの最小スコアが負のラベルの最大スコアよりも水平および垂直に大きいことを強制する統合的マックスマージン客観的関数を提案する。この機能は、マルチラベル画像分類法をさらに改善することができる。提案手法の有効性を2つの一般的なマルチラベル画像データセット(Pascal VOCとMS-COCO)で評価した。実験の結果,本手法は最先端手法よりも優れていた。

In our daily life, the scenes around us are always with multiple labels especially in a smart city, i.e., recognizing the information of city operation to response and control. Great efforts have been made by using Deep Neural Networks to recognize multi-label images. Since multi-label image classification is very complicated, people seek to use the attention mechanism to guide the classification process. However, conventional attention-based methods always analyzed images directly and aggressively. It is difficult for them to well understand complicated scenes. In this paper, we propose a global/local attention method that can recognize an image from coarse to fine by mimicking how human-beings observe images. Specifically, our global/local attention method first concentrates on the whole image, and then focuses on local specific objects in the image. We also propose a joint max-margin objective function, which enforces that the minimum score of positive labels should be larger than the maximum score of negative labels horizontally and vertically. This function can further improve our multi-label image classification method. We evaluate the effectiveness of our method on two popular multi-label image datasets (i.e., Pascal VOC and MS-COCO). Our experimental results show that our method outperforms state-of-the-art methods.

翻訳日:2021-04-25 04:06:56 公開日:2020-12-26

# (参考訳) ミリ波センシング:アプリケーションパイプラインとビルディングブロックのレビュー

Millimeter Wave Sensing: A Review of Application Pipelines and Building Blocks ( http://arxiv.org/abs/2012.13664v1 )

ライセンス: CC BY 4.0

Bram van Berlo, Amany Elkelany, Tanir Ozcelebi, Nirvana Meratnia

(参考訳) 新しい無線アプリケーションの帯域幅が増加すると、高速無線通信のためのミリ波スペクトルの標準化に繋がる。ミリ波スペクトルは5Gの一部であり、10mmから1mmの波長に対応する周波数は30〜300GHzである。ミリ波は、しばしば通信媒体と見なされるが、狭いビーム、広帯域での動作、大気成分との相互作用などにより、優れた「センサー」であることが証明されている。本稿では,ミリ波センシングアプリケーションパイプラインを網羅する最初のレビューにおいて,ハードウェア,アルゴリズム,解析モデル,モデル評価技術など,さまざまな基本アプリケーションパイプライン構築ブロックの概要と解析について述べる。レビューはまた、異なるミリ波センシングアプリケーションドメインを強調する分類も提供している。総合的な分析を行い、体系的な文献レビューの方法論に従い、165の論文をレビューすることで、ミリ波技術の通信面のみに焦点をあて、アクティブイメージングにミリ波技術を用い、科学的・技術的課題と動向を強調し、ミリ波をセンシング技術として応用するための今後の展望を提供する。

The increasing bandwidth requirement of new wireless applications has lead to standardization of the millimeter wave spectrum for high-speed wireless communication. The millimeter wave spectrum is part of 5G and covers frequencies between 30 and 300 GHz corresponding to wavelengths ranging from 10 to 1 mm. Although millimeter wave is often considered as a communication medium, it has also proved to be an excellent 'sensor', thanks to its narrow beams, operation across a wide bandwidth, and interaction with atmospheric constituents. In this paper, which is to the best of our knowledge the first review that completely covers millimeter wave sensing application pipelines, we provide a comprehensive overview and analysis of different basic application pipeline building blocks, including hardware, algorithms, analytical models, and model evaluation techniques. The review also provides a taxonomy that highlights different millimeter wave sensing application domains. By performing a thorough analysis, complying with the systematic literature review methodology and reviewing 165 papers, we not only extend previous investigations focused only on communication aspects of the millimeter wave technology and using millimeter wave technology for active imaging, but also highlight scientific and technological challenges and trends, and provide a future perspective for applications of millimeter wave as a sensing technology.

翻訳日:2021-04-25 03:55:02 公開日:2020-12-26

# (参考訳) 雑音ラベルを用いたロバスト協調学習

Robust Collaborative Learning with Noisy Labels ( http://arxiv.org/abs/2012.13670v1 )

ライセンス: CC BY 4.0

Mengying Sun, Jing Xing, Bin Chen, Jiayu Zhou

(参考訳) カリキュラムによる学習は、適切な設計でノイズの多いサンプルを再重み付けしたりフィルタリングしたりできるため、データがノイズラベルを含むタスクにおいて大きな効果を示してきた。しかし、追加の監督やフィードバックなしに学習者自身からカリキュラムを得ることは、サンプル選択バイアスによる効果を低下させる。そのため、2つ以上のネットワークを含む手法が近年提案されている。それにもかかわらず、これらの研究はネットワーク間の協調を利用して、意見の相違を強調したり、合意に焦点を合わせながら他方を無視したりしている。本稿では,ネットワーク間の不一致と合意が勾配の雑音を減らし,ネットワーク間の不一致と合意の両面を活用したロバスト協調学習(RCL)と呼ばれる新しいフレームワークを開発するための基盤機構について検討する。実世界の大規模バイオインフォマティクスデータとベンチマーク画像データの両方において,RCLの有効性を示す。

Learning with curriculum has shown great effectiveness in tasks where the data contains noisy (corrupted) labels, since the curriculum can be used to re-weight or filter out noisy samples via proper design. However, obtaining curriculum from a learner itself without additional supervision or feedback deteriorates the effectiveness due to sample selection bias. Therefore, methods that involve two or more networks have been recently proposed to mitigate such bias. Nevertheless, these studies utilize the collaboration between networks in a way that either emphasizes the disagreement or focuses on the agreement while ignores the other. In this paper, we study the underlying mechanism of how disagreement and agreement between networks can help reduce the noise in gradients and develop a novel framework called Robust Collaborative Learning (RCL) that leverages both disagreement and agreement among networks. We demonstrate the effectiveness of RCL on both synthetic benchmark image data and real-world large-scale bioinformatics data.

翻訳日:2021-04-25 03:53:50 公開日:2020-12-26

# (参考訳) グラフニューラルネットワークを用いた複雑なネットワークにおけるノードレジリエンスの近似

Graph neural network based approximation of Node Resiliency in complex networks ( http://arxiv.org/abs/2012.15725v1 )

ライセンス: CC BY 4.0

Sai Munikoti, Laya Das and Balasubramaniam Natarajan

(参考訳) 最適操作と効率の重視により、エンジニアリングシステムの複雑さが増大した。これによりシステムの脆弱性が増大する。しかし、極端な事象の発生頻度の増加に伴い、レジリエンスは重要な考慮事項となっている。レジリエンスは、極端な条件から吸収および回復するシステムの能力を定量化する。グラフ理論は、攻撃に対するレジリエンスを評価するために複雑なエンジニアリングシステムのモデリングに広く使われているフレームワークである。レジリエンス解析の既存の手法のほとんどは、グラフの各ノード/リンクを探索する反復的アプローチに基づいている。これらの手法は計算量が高く、解析結果はネットワーク固有である。これらの課題に対処するために,大規模複雑ネットワークにおけるノードレジリエンスを近似するためのグラフニューラルネットワーク(GNN)ベースのフレームワークを提案する。提案フレームワークは、ノードの小さな代表部分集合上のノードランクを学習するGNNモデルを定義する。次に、トレーニングされたモデルを用いて、類似したグラフの型における見えないノードのランクを予測する。このフレームワークのスケーラビリティは,実世界のグラフにおけるノードランクの予測を通じて実証される。提案手法は, ノードのレジリエンススコアを近似する精度が高く, 従来の手法よりも計算能力に優れる。

The emphasis on optimal operations and efficiency has led to increased complexity in engineered systems. This in turn increases the vulnerability of the system. However, with the increasing frequency of extreme events, resilience has now become an important consideration. Resilience quantifies the ability of the system to absorb and recover from extreme conditions. Graph theory is a widely used framework for modeling complex engineered systems to evaluate their resilience to attacks. Most existing methods in resilience analysis are based on an iterative approach that explores each node/link of a graph. These methods suffer from high computational complexity and the resulting analysis is network specific. To address these challenges, we propose a graph neural network (GNN) based framework for approximating node resilience in large complex networks. The proposed framework defines a GNN model that learns the node rank on a small representative subset of nodes. Then, the trained model can be employed to predict the ranks of unseen nodes in similar types of graphs. The scalability of the framework is demonstrated through the prediction of node ranks in real-world graphs. The proposed approach is accurate in approximating the node resilience scores and offers a significant computational advantage over conventional approaches.

翻訳日:2021-04-25 03:39:24 公開日:2020-12-26

# (参考訳) 不確実性下におけるグラフレジリエンスのためのベイズ誘導学習

Bayesian Inductive Learner for Graph Resiliency under uncertainty ( http://arxiv.org/abs/2012.15733v1 )

ライセンス: CC BY 4.0

Sai Munikoti and Balasubramaniam Natarajan

(参考訳) 効率性向上を追求する中で、相互依存と複雑性は現代のエンジニアリングシステムの特性を定義している。障害のカスケードに対する脆弱性の増加に伴い、そのようなエンジニアシステムに関連するリスクと不確実性を理解し、管理することが不可欠である。グラフ理論は相互依存系をモデル化し、破壊に対する回復力を評価するために広く使われているフレームワークである。レジリエンス解析の既存の手法のほとんどは、グラフの各ノード/リンクを探索する反復的アプローチに基づいている。これらの手法は計算量が高く、解析結果はネットワーク固有である。さらに、基礎となるグラフィカルモデルに関連する不確実性は、これらの従来のアプローチの潜在的な価値をさらに制限します。これらの課題を克服するために,大規模グラフ内の臨界ノードを迅速に識別するベイズグラフニューラルネットワークベースのフレームワークを提案する。体系的に不確実性を取り入れていますモデルをトレーニングするために観測グラフを利用する代わりに、観測されたトポロジーとノードターゲットラベルに基づいてグラフの地図推定を算出する。さらに、認識の不確実性を考慮したモンテカルロ(mc)ドロップアウトアルゴリズムが組み込まれている。ベイズフレームワークが提供する計算複雑性の忠実性と利得をシミュレーション結果を用いて示す。

In the quest to improve efficiency, interdependence and complexity are becoming defining characteristics of modern engineered systems. With increasing vulnerability to cascading failures, it is imperative to understand and manage the risk and uncertainty associated with such engineered systems. Graph theory is a widely used framework for modeling interdependent systems and to evaluate their resilience to disruptions. Most existing methods in resilience analysis are based on an iterative approach that explores each node/link of a graph. These methods suffer from high computational complexity and the resulting analysis is network specific. Additionally, uncertainty associated with the underlying graphical model further limits the potential value of these traditional approaches. To overcome these challenges, we propose a Bayesian graph neural network-based framework for quickly identifying critical nodes in a large graph. while systematically incorporating uncertainties. Instead of utilizing the observed graph for training the model, a MAP estimate of the graph is computed based on the observed topology, and node target labels. Further, a Monte-Carlo (MC) dropout algorithm is incorporated to account for the epistemic uncertainty. The fidelity and the gain in computational complexity offered by the Bayesian framework are illustrated using simulation results.

翻訳日:2021-04-25 03:18:00 公開日:2020-12-26

# (参考訳) TSGCNet:2ストリームグラフ畳み込みネットワークを用いた3次元歯科モデルセグメンテーションのための識別幾何学的特徴学習

TSGCNet: Discriminative Geometric Feature Learning with Two-Stream GraphConvolutional Network for 3D Dental Model Segmentation ( http://arxiv.org/abs/2012.13697v1 )

ライセンス: CC BY 4.0

Lingming Zhang, Yue Zhao, Deyu Meng, Zhiming Cui, Chenqiang Gao, Xinbo Gao, Chunfeng Lian, Dinggang Shen

(参考訳) デジタル化された3次元歯科モデルから歯を正確に切り離す能力は,コンピュータ支援歯科矯正計画において必須の課題である。これまで、ディープラーニングに基づく手法は、このタスクの処理に広く用いられてきた。最先端の手法は、メッシュセルの座標と通常のベクトルである3d入力の生属性を直接結合し、完全に自動化された歯のセグメンテーションのための単一ストリームネットワークを訓練する。しかし、これはこれらの原属性によって提供される異なる幾何学的意味を無視する欠点がある。この問題は、識別幾何学的特徴を学ぶ上でネットワークを混乱させ、歯科モデルの多くの孤立した誤った予測をもたらす可能性がある。本稿では,異なる幾何学的属性から多視点幾何学情報を学習するための2ストリームグラフ畳み込みネットワーク(tsgcnet)を提案する。我々のTSGCNetは2つのグラフ学習ストリームを入力認識方式で設計し、座標と正規ベクトルからより識別性の高い高次幾何表現を抽出する。設計した2つの異なるストリームから得られたこれらの特徴表現はさらに融合し、セルワイドな予測タスクのための多視点補完情報を統合する。 3次元口腔内スキャナーで取得した歯科モデルの実患者データセット上でのtsgcnetの評価を行い,本手法が3次元形状セグメンテーションの最先端法を大幅に上回っていることを実験的に示す。

The ability to segment teeth precisely from digitized 3D dental models is an essential task in computer-aided orthodontic surgical planning. To date, deep learning based methods have been popularly used to handle this task. State-of-the-art methods directly concatenate the raw attributes of 3D inputs, namely coordinates and normal vectors of mesh cells, to train a single-stream network for fully-automated tooth segmentation. This, however, has the drawback of ignoring the different geometric meanings provided by those raw attributes. This issue might possibly confuse the network in learning discriminative geometric features and result in many isolated false predictions on the dental model. Against this issue, we propose a two-stream graph convolutional network (TSGCNet) to learn multi-view geometric information from different geometric attributes. Our TSGCNet adopts two graph-learning streams, designed in an input-aware fashion, to extract more discriminative high-level geometric representations from coordinates and normal vectors, respectively. These feature representations learned from the designed two different streams are further fused to integrate the multi-view complementary information for the cell-wise dense prediction task. We evaluate our proposed TSGCNet on a real-patient dataset of dental models acquired by 3D intraoral scanners, and experimental results demonstrate that our method significantly outperforms state-of-the-art methods for 3D shape segmentation.

翻訳日:2021-04-25 03:08:12 公開日:2020-12-26

# (参考訳) 効率的推論のためのレトロ合成データを用いたハイブリッドおよび非一様量子化法

Hybrid and Non-Uniform quantization methods using retro synthesis data for efficient inference ( http://arxiv.org/abs/2012.13716v1 )

ライセンス: CC BY 4.0

Tej pratap GVSL, Raja Kumar

(参考訳) 既存の量子化対応トレーニング手法は、トレーニング後の量子化方法のほとんどと同様に、トレーニングデータを活用することで、量子化損失を補おうとする。これらの方法は、トレーニングデータと密結合しているため、プライバシ制約アプリケーションには有効ではない。対照的に,本稿では,トレーニングデータの必要性をなくすデータ非依存なトレーニング後量子化手法を提案する。これは、FP32モデル層統計からフェローデータセット(以下、Retro-Synthesis Dataと呼ぶ)を生成し、さらに量子化に使用することで達成される。このアプローチは、imagenetとcifar-10データセットのバッチ正規化層8,6,4ビット精度のモデルにおいて、zeroqとdfqを含む最先端の手法よりも優れていた。また,2種類のポストトレーニング量子化手法,すなわちハイブリッド量子化と非均一量子化を導入した。

Existing quantization aware training methods attempt to compensate for the quantization loss by leveraging on training data, like most of the post-training quantization methods, and are also time consuming. Both these methods are not effective for privacy constraint applications as they are tightly coupled with training data. In contrast, this paper proposes a data-independent post-training quantization scheme that eliminates the need for training data. This is achieved by generating a faux dataset, hereafter referred to as Retro-Synthesis Data, from the FP32 model layer statistics and further using it for quantization. This approach outperformed state-of-the-art methods including, but not limited to, ZeroQ and DFQ on models with and without Batch-Normalization layers for 8, 6, and 4 bit precisions on ImageNet and CIFAR-10 datasets. We also introduced two futuristic variants of post-training quantization methods namely Hybrid Quantization and Non-Uniform Quantization

翻訳日:2021-04-25 02:52:19 公開日:2020-12-26

# (参考訳) 分離指数に基づく伝達学習における事前学習深層ニューラルネットワークのランク付けと排除

Ranking and Rejecting of Pre-Trained Deep Neural Networks in Transfer Learning based on Separation Index ( http://arxiv.org/abs/2012.13717v1 )

ライセンス: CC BY 4.0

Mostafa Kalhor, Ahmad Kalhor, and Mehdi Rahmani

(参考訳) 事前学習型ディープニューラルネットワーク(DNN)の自動ランキングは、最適な事前学習型DNNを選択するために必要な時間を短縮し、転送学習における分類性能を高める。本稿では,対象データセットに分離指数(SI)という簡単な距離に基づく複雑性尺度を適用し,事前学習したDNNをランク付けするアルゴリズムを提案する。この目的のために、まず、SIに関する背景が与えられ、その後、自動ランキングアルゴリズムが説明される。このアルゴリズムでは、事前訓練されたDNNの特徴抽出部分から通過するターゲットデータセットに対してSIを演算する。そして、計算されたSIを下位にすることで、事前訓練されたDNNを簡単にランク付けする。このランキング法では、最高のDNNがターゲットデータセット上で最大SIを出力し、十分に低計算のSIの場合、いくつかの事前訓練されたDNNを拒否することができる。提案アルゴリズムの効率は、Linnaeus 5, Breast Cancer Images, COVID-CTの3つの挑戦的データセットを用いて評価される。第3のケーススタディでは,対象データに対する前処理が異なっていたにもかかわらず,アルゴリズムのランク付けは分類精度から得られたランキングと高い相関性を有する。

Automated ranking of pre-trained Deep Neural Networks (DNNs) reduces the required time for selecting optimal pre-trained DNN and boost the classification performance in transfer learning. In this paper, we introduce a novel algorithm to rank pre-trained DNNs by applying a straightforward distance-based complexity measure named Separation Index (SI) to the target dataset. For this purpose, at first, a background about the SI is given and then the automated ranking algorithm is explained. In this algorithm, the SI is computed for the target dataset which passes from the feature extracting parts of pre-trained DNNs. Then, by descending sort of the computed SIs, the pre-trained DNNs are ranked, easily. In this ranking method, the best DNN makes maximum SI on the target dataset and a few pre-trained DNNs may be rejected in the case of their sufficiently low computed SIs. The efficiency of the proposed algorithm is evaluated by using three challenging datasets including Linnaeus 5, Breast Cancer Images, and COVID-CT. For the two first case studies, the results of the proposed algorithm exactly match with the ranking of the trained DNNs by the accuracy on the target dataset. For the third case study, despite using different preprocessing on the target data, the ranking of the algorithm has a high correlation with the ranking resulted from classification accuracy.

翻訳日:2021-04-25 02:39:13 公開日:2020-12-26

# (参考訳) ラベルのないショット学習はほとんどありません

Few Shot Learning With No Labels ( http://arxiv.org/abs/2012.13751v1 )

ライセンス: CC BY-SA 4.0

Aditya Bharti, N.B. Vineeth, C.V. Jawahar

(参考訳) 少数派の学習者は,少数の学習サンプルに限って,新たなカテゴリの認識を目指す。主な課題は、新しいクラスへの適切な一般化を確保しながら、限られたデータへの過度な適合を避けることである。既存の文献は、ラベル要件を新しいクラスからベースクラスに単純にシフトすることで、大量の注釈付きデータを利用する。データアノテーションは時間とコストがかかるため、ラベル要件の削減がさらに重要な目標である。そこで,本稿では,トレーニングやテスト中にラベルアクセスを許可しない,より難易度の高い少数ショット設定を提案する。自己スーパービジョンを利用して画像表現と画像類似性をテスト時に学習することにより,最先端のラベルよりも少ないラベルである \textbf{zero}ラベルを用いて,競合ベースラインを実現する。この研究は、注釈付きデータにまったく依存しない、少数の学習方法を開発するための一歩になることを願っている。私たちのコードは公開されます。

Few-shot learners aim to recognize new categories given only a small number of training samples. The core challenge is to avoid overfitting to the limited data while ensuring good generalization to novel classes. Existing literature makes use of vast amounts of annotated data by simply shifting the label requirement from novel classes to base classes. Since data annotation is time-consuming and costly, reducing the label requirement even further is an important goal. To that end, our paper presents a more challenging few-shot setting where no label access is allowed during training or testing. By leveraging self-supervision for learning image representations and image similarity for classification at test time, we achieve competitive baselines while using \textbf{zero} labels, which is at least fewer labels than state-of-the-art. We hope that this work is a step towards developing few-shot learning methods which do not depend on annotated data at all. Our code will be publicly released.

翻訳日:2021-04-25 02:38:07 公開日:2020-12-26

# (参考訳) 高精度低ビット幅深層ニューラルネットワークの直接量子化

Direct Quantization for Training Highly Accurate Low Bit-width Deep Neural Networks ( http://arxiv.org/abs/2012.13762v1 )

ライセンス: CC BY 4.0

Tuan Hoang and Thanh-Toan Do and Tam V. Nguyen and Ngai-Man Cheung

(参考訳) 本稿では,低ビット幅重みとアクティベーションで深部畳み込みニューラルネットワークを訓練する2つの新しい手法を提案する。まず、ビット幅の少ない重みを得るため、既存の方法の多くは、全精度ネットワーク重みで量子化することにより量子化重みを得る。しかし、このアプローチはいくつかのミスマッチをもたらす:勾配降下は全精度重みを更新するが、量子化された重みは更新しない。この問題に対処するために,学習可能な量子化レベルを持つ量子化重みの{direct}更新を可能にし,勾配降下を用いたコスト関数を最小化する新しい手法を提案する。第二に、ビット幅の低いアクティベーションを得るために、既存の研究は全てのチャネルを等しく考慮している。しかし、活性化量子化器は高分散のいくつかのチャネルに偏りがある。この問題に対処するために,個別チャネルの量子化誤差を考慮した手法を提案する。このアプローチでは、多くのチャネルで量子化エラーを最小化するアクティベーション量子化子を学習できる。実験により,提案手法は,CIFAR-100およびImageNetデータセット上のAlexNet,ResNet,MobileNetV2アーキテクチャを用いて,画像分類タスクにおける最先端性能を実現することを示す。

This paper proposes two novel techniques to train deep convolutional neural networks with low bit-width weights and activations. First, to obtain low bit-width weights, most existing methods obtain the quantized weights by performing quantization on the full-precision network weights. However, this approach would result in some mismatch: the gradient descent updates full-precision weights, but it does not update the quantized weights. To address this issue, we propose a novel method that enables {direct} updating of quantized weights {with learnable quantization levels} to minimize the cost function using gradient descent. Second, to obtain low bit-width activations, existing works consider all channels equally. However, the activation quantizers could be biased toward a few channels with high-variance. To address this issue, we propose a method to take into account the quantization errors of individual channels. With this approach, we can learn activation quantizers that minimize the quantization errors in the majority of channels. Experimental results demonstrate that our proposed method achieves state-of-the-art performance on the image classification task, using AlexNet, ResNet and MobileNetV2 architectures on CIFAR-100 and ImageNet datasets.

翻訳日:2021-04-25 02:24:39 公開日:2020-12-26

# (参考訳) 多成分形状に対するアフィンモーメント不変量

An Affine moment invariant for multi-component shapes ( http://arxiv.org/abs/2012.13774v1 )

ライセンス: CC BY 4.0

Jovisa Zunic, Milos Stojmenovic

(参考訳) 本稿では,多成分形状解析のための画像ベースアルゴリズムツールを提案する。多成分形状の一般的な概念のため、本手法は実物体をその形状に基づいて解析する広い範囲のアプリケーションの解析に適用することができる。対応する白黒の画像ですこの方法は、多成分形状測定(multi-component shapes measure)と呼ばれる数値を形に割り当てる。この数/測度はアフィン変換に対して不変であり、本論文で開発された理論的枠組みに基づいて確立される。加えて、この方法は実装が容易で、堅牢である(例えば、)。騒音については)。航空画像解析と銀河画像解析に関する2つの小さめながら図示的な例を示す。また,測定値の挙動をよりよく理解するための合成例も提示する。

We introduce an image based algorithmic tool for analyzing multi-component shapes here. Due to the generic concept of multi-component shapes, our method can be applied to the analysis of a wide spectrum of applications where real objects are analyzed based on their shapes - i.e. on their corresponded black and white images. The method allocates a number to a shape, herein called a multi-component shapes measure. This number/measure is invariant with respect to affine transformations and is established based on the theoretical frame developed in this paper. In addition, the method is easy to implement and is robust (e.g. with respect to noise). We provide two small but illustrative examples related to aerial image analysis and galaxy image analysis. Also, we provide some synthetic examples for a better understanding of the measure behavior.

翻訳日:2021-04-25 01:50:52 公開日:2020-12-26

# (参考訳) DAC-MLを用いた試料効率制御に向けて

Towards sample-efficient episodic control with DAC-ML ( http://arxiv.org/abs/2012.13779v1 )

ライセンス: CC BY 4.0

Ismael T. Freire, Adri\'an F. Amil, Vasiliki Vouloutsi, Paul F.M.J. Verschure

(参考訳) 人工知能におけるサンプル効率問題は、少数のエピソードでアクションポリシーを最適化する現在のDeep Reinforcement Learningモデルが存在しないことを指す。近年の研究では、エピソード強化学習のような学習速度を改善するために、メモリシステムやアーキテクチャバイアスを追加することで、この制限を克服しようとしている。しかし、漸進的な改善を達成しても、そのパフォーマンスは人間の行動方針の学習方法に匹敵するものではない。本稿では、脳と心の分散適応制御(DAC)理論の設計原理を活かして、海馬にインスパイアされたシーケンシャルメモリシステムを導入することで、挑戦的な捕食作業における報酬獲得を最大化する効果的なアクションポリシーに迅速に収束できる新しい認知アーキテクチャ(DAC-ML)を構築する。

The sample-inefficiency problem in Artificial Intelligence refers to the inability of current Deep Reinforcement Learning models to optimize action policies within a small number of episodes. Recent studies have tried to overcome this limitation by adding memory systems and architectural biases to improve learning speed, such as in Episodic Reinforcement Learning. However, despite achieving incremental improvements, their performance is still not comparable to how humans learn behavioral policies. In this paper, we capitalize on the design principles of the Distributed Adaptive Control (DAC) theory of mind and brain to build a novel cognitive architecture (DAC-ML) that, by incorporating a hippocampus-inspired sequential memory system, can rapidly converge to effective action policies that maximize reward acquisition in a challenging foraging task.

翻訳日:2021-04-25 01:43:39 公開日:2020-12-26

# (参考訳) 説明可能な医療データの多クラス分類

Explainable Multi-class Classification of Medical Data ( http://arxiv.org/abs/2012.13796v1 )

ライセンス: CC BY 4.0

YuanZheng Hu, Marina Sokolova

(参考訳) 機械学習アプリケーションは、医療データの二次分析に新たな洞察をもたらした。機械学習は、新しい薬物の開発を支援し、特定の病気に罹患する集団を定義し、多くの共通疾患の予測因子を特定する。同時に、機械学習の結果は、特徴の選択、クラス(im)バランス、アルゴリズムの選好、パフォーマンスメトリクスなど、多くの要因の畳み込みに依存する。本稿では,大規模医療データセットのマルチクラス分類について説明する。本稿では,知識に基づく機能工学,データセットのバランス,最良のモデル選択,パラメータチューニングについて述べる。この研究では、SVM(Support Vector Machine)、Na\"ive Bayes、Gradient Boosting、Decision Trees、Random Forest、Logistic Regressionの6つのアルゴリズムが使用されている。 UCI 糖尿病130-US 病院の1999-2008 年データセットにおける経験的評価を行い,患者病院の再入院期間を0日,<30日,>30日という3つのクラスに分類した。その結果,23種類の薬品を学習実験で使用することにより,6種類の学習アルゴリズムのうち5つをリコールできることがわかった。これは、同じデータで行った以前の研究を拡大する新しい結果である。勾配ブースティングとランダムフォレストは他のアルゴリズムよりも3つの分類精度で優れていた。

Machine Learning applications have brought new insights into a secondary analysis of medical data. Machine Learning helps to develop new drugs, define populations susceptible to certain illnesses, identify predictors of many common diseases. At the same time, Machine Learning results depend on convolution of many factors, including feature selection, class (im)balance, algorithm preference, and performance metrics. In this paper, we present explainable multi-class classification of a large medical data set. We in details discuss knowledge-based feature engineering, data set balancing, best model selection, and parameter tuning. Six algorithms are used in this study: Support Vector Machine (SVM), Na\"ive Bayes, Gradient Boosting, Decision Trees, Random Forest, and Logistic Regression. Our empirical evaluation is done on the UCI Diabetes 130-US hospitals for years 1999-2008 dataset, with the task to classify patient hospital re-admission stay into three classes: 0 days, <30 days, or > 30 days. Our results show that using 23 medication features in learning experiments improves Recall of five out of the six applied learning algorithms. This is a new result that expands the previous studies conducted on the same data. Gradient Boosting and Random Forest outperformed other algorithms in terms of the three-class classification Accuracy.

翻訳日:2021-04-25 01:36:41 公開日:2020-12-26

# (参考訳) 段木モデルに基づく新しい生成型分類器のクラス

A new class of generative classifiers based on staged tree models ( http://arxiv.org/abs/2012.13798v1 )

ライセンス: CC BY 4.0

Federico Carli, Manuele Leonelli, Gherardo Varando

(参考訳) 分類のための生成モデルは、クラス変数と特徴の合同確率分布を使用して決定規則を構成する。生成モデルのうち、ベイズネットワークとナイーブベイズ分類器が最も一般的に使われ、すべての変数間の関係を明確に表現している。しかしこれらは、文脈固有の独立を許さないことで、存在可能な関係のタイプを高度に制限する欠点がある。ここでは,ステージ付き木分類器と呼ばれる,コンテキスト固有の独立性を考慮した新しい生成型分類器を導入する。これらは、条件付き独立性が形式的に読み取れるイベントツリーの頂点の分割によって構成される。 naive staged tree分類器も定義されており、同じ複雑さを維持しながら、古典的なnaive bayes分類器を拡張する。大規模シミュレーションにより,段階木分類器の分類精度は最先端の分類器と競合することが示された。タイタニック号の乗客の運命を予測するための応用分析は、新しい世代分類器が与えうる洞察を強調している。

Generative models for classification use the joint probability distribution of the class variable and the features to construct a decision rule. Among generative models, Bayesian networks and naive Bayes classifiers are the most commonly used and provide a clear graphical representation of the relationship among all variables. However, these have the disadvantage of highly restricting the type of relationships that could exist, by not allowing for context-specific independences. Here we introduce a new class of generative classifiers, called staged tree classifiers, which formally account for context-specific independence. They are constructed by a partitioning of the vertices of an event tree from which conditional independence can be formally read. The naive staged tree classifier is also defined, which extends the classic naive Bayes classifier whilst retaining the same complexity. An extensive simulation study shows that the classification accuracy of staged tree classifiers is competitive with those of state-of-the-art classifiers. An applied analysis to predict the fate of the passengers of the Titanic highlights the insights that the new class of generative classifiers can give.

翻訳日:2021-04-25 01:24:11 公開日:2020-12-26

# 多次元不確実性認識ニューラルネットワーク

Multidimensional Uncertainty-Aware Evidential Neural Networks ( http://arxiv.org/abs/2012.13676v1 )

ライセンス: Link先を確認

Yibo Hu, Yuzhe Ou, Xujiang Zhao, Jin-Hee Cho, Feng Chen

(参考訳) 従来のディープニューラルネットワーク(NN)は、さまざまなアプリケーションドメインの分類タスクにおける最先端のパフォーマンスに大きく貢献している。しかし、NNは、不確実性の下での誤分類が現実世界の文脈で意思決定のリスクを高くする(例えば、道路における物体の誤分類が深刻な事故を引き起こす)クラス確率に関連するデータに固有の不確実性は考慮していない。重みの不確実性を通じて間接的に不確実性を推定するベイズNNとは異なり、顕在的NN(ENN)は近年、クラス確率の不確かさを明示的にモデル化し、分類タスクに使用するために提案されている。 ENNは、NNの予測を主観的意見として定式化し、データから決定論的NNによって主観的意見を形成することができる量の証拠を収集して機能を学ぶ。しかし、ENNは、空白(証拠の欠如による不確実性)や不協和(証拠の矛盾による不確実性)など、異なる根本原因を持つデータに固有の不確かさを明示的に考慮することなく、ブラックボックスとして訓練されている。多次元不確かさを考慮し, オフ・オブ・ディストリビューション(OOD)検出問題の解法として, WGAN-ENN (WENN) と呼ばれる新しい不確実性検出NNを提案する。 We take a hybrid approach which with Wasserstein Generative Adversarial Network (WGAN) with ENNs to jointly training a model with a prior knowledge of a class, which has high vacuity for OOD sample。人工と実世界の両方のデータセットに基づく広範な実験実験により、WENNによる不確実性の推定は、OODサンプルと境界サンプルの区別に大きく役立つことを示した。 WENNは、他の競合相手と比較してOOD検出に優れていた。

Traditional deep neural networks (NNs) have significantly contributed to the state-of-the-art performance in the task of classification under various application domains. However, NNs have not considered inherent uncertainty in data associated with the class probabilities where misclassification under uncertainty may easily introduce high risk in decision making in real-world contexts (e.g., misclassification of objects in roads leads to serious accidents). Unlike Bayesian NN that indirectly infer uncertainty through weight uncertainties, evidential NNs (ENNs) have been recently proposed to explicitly model the uncertainty of class probabilities and use them for classification tasks. An ENN offers the formulation of the predictions of NNs as subjective opinions and learns the function by collecting an amount of evidence that can form the subjective opinions by a deterministic NN from data. However, the ENN is trained as a black box without explicitly considering inherent uncertainty in data with their different root causes, such as vacuity (i.e., uncertainty due to a lack of evidence) or dissonance (i.e., uncertainty due to conflicting evidence). By considering the multidimensional uncertainty, we proposed a novel uncertainty-aware evidential NN called WGAN-ENN (WENN) for solving an out-of-distribution (OOD) detection problem. We took a hybrid approach that combines Wasserstein Generative Adversarial Network (WGAN) with ENNs to jointly train a model with prior knowledge of a certain class, which has high vacuity for OOD samples. Via extensive empirical experiments based on both synthetic and real-world datasets, we demonstrated that the estimation of uncertainty by WENN can significantly help distinguish OOD samples from boundary samples. WENN outperformed in OOD detection when compared with other competitive counterparts.

翻訳日:2021-04-25 01:15:41 公開日:2020-12-26

# PaXNet:Ensemble Transfer LearningとCapsule Classifierを用いたパノラマX線歯列検出

PaXNet: Dental Caries Detection in Panoramic X-ray using Ensemble Transfer Learning and Capsule Classifier ( http://arxiv.org/abs/2012.13666v1 )

ライセンス: Link先を確認

Arman Haghanifar, Mahdiyar Molahasani Majdabadi, Seok-Bum Ko

(参考訳) 歯列骨は、生後最も慢性的な疾患の1つであり、人口の大半を包含している。喉頭病変は通常、歯科用X線による視力検査のみに依存する放射線医によって診断される。多くの場合、歯列はX線で識別することは困難であり、低画質などの異なる理由から影と誤解されることがある。したがって,近年,ケーリー検出のための意思決定支援システムの開発が注目されている。そこで本研究では,パノラマ画像中のデンタルカリーを初めて検出し,著者の知識を最大限に活用する自動診断システムを提案する。提案モデルは、X線から関連する特徴を抽出し、カプセルネットワークを用いて予測結果を描画するトランスファーラーニングにより、事前訓練された様々な深層学習モデルの利点を享受する。 240個のラベル付き画像を含む特徴抽出に使用される470個のパノラマ画像のデータセットにおいて,本モデルが精度86.05\%を達成した。得られたスコアは、実際の患者のパノラマX線を使用する際の課題を考慮し、許容する検出性能とキャリー検出速度の増大を示す。本モデルでは, 軽度および重度者に対して69.44\%, 90.52\%のリコールスコアを取得し, 軽度カリー検出がより容易で, 効果的でロバストで大きなデータセットが必要であることを確認した。パノラマ画像を用いた最近の研究の目新しさを考えると、この研究はドメインエキスパートを支援する完全自動化された効率的な意思決定支援システムを開発するための一歩である。

Dental caries is one of the most chronic diseases involving the majority of the population during their lifetime. Caries lesions are typically diagnosed by radiologists relying only on their visual inspection to detect via dental x-rays. In many cases, dental caries is hard to identify using x-rays and can be misinterpreted as shadows due to different reasons such as low image quality. Hence, developing a decision support system for caries detection has been a topic of interest in recent years. Here, we propose an automatic diagnosis system to detect dental caries in Panoramic images for the first time, to the best of authors' knowledge. The proposed model benefits from various pretrained deep learning models through transfer learning to extract relevant features from x-rays and uses a capsule network to draw prediction results. On a dataset of 470 Panoramic images used for features extraction, including 240 labeled images for classification, our model achieved an accuracy score of 86.05\% on the test set. The obtained score demonstrates acceptable detection performance and an increase in caries detection speed, as long as the challenges of using Panoramic x-rays of real patients are taken into account. Among images with caries lesions in the test set, our model acquired recall scores of 69.44\% and 90.52\% for mild and severe ones, confirming the fact that severe caries spots are more straightforward to detect and efficient mild caries detection needs a more robust and larger dataset. Considering the novelty of current research study as using Panoramic images, this work is a step towards developing a fully automated efficient decision support system to assist domain experts.

翻訳日:2021-04-25 01:15:06 公開日:2020-12-26

# ソーシャルメディアが消費者の認識のシグナルを公表

Social media data reveals signal for public consumer perceptions ( http://arxiv.org/abs/2012.13675v1 )

ライセンス: Link先を確認

Neeti Pokhriyal, Abenezer Dara, Benjamin Valentino, Soroush Vosoughi

(参考訳) 研究者たちはソーシャルメディアのデータを使って、公共の行動に関する様々なマクロ経済指標を推定してきた。最も広く引用されている経済指標の1つは消費者信頼指数(CCI)である。これまで多くの研究がソーシャルメディア、特にTwitterデータを使ってCCIを予測することに重点を置いてきた。しかし、最近の包括的調査によると、これらのモデルが新しいデータでテストされると、強い相関関係は消失した。本稿では,ガウス過程の回帰(推定とそれに関連する不確実性の両方を提供する)を基礎としたロバストな非パラメトリックベイズモデリングフレームワークを提案することにより,ソーシャルメディアデータを用いたcci測定の真の可能性を評価する問題を再考する。我々のフレームワークと一体化することは、調査頻度を減らすためにデジタルデータをいかに活用できるかを実証する原理的な実験手法であり、定期的なポーリングは我々のモデルを校正するためにのみ必要である。広範囲な実験により、スムーズな間隔や様々な種類のラグなど、異なるマイクロ決定の選択方法が示される。結果に重要な影響を与えます Redditのdecadal data (2008-2019) を用いて、CCIの月次推定と日次推定の両方が、少なくとも数ヶ月前に確実に予測可能であること、我々のモデル推定が既存の方法よりもはるかに優れていることを示します。

Researchers have used social media data to estimate various macroeconomic indicators about public behaviors, mostly as a way to reduce surveying costs. One of the most widely cited economic indicator is consumer confidence index (CCI). Numerous studies in the past have focused on using social media, especially Twitter data, to predict CCI. However, the strong correlations disappeared when those models were tested with newer data according to a recent comprehensive survey. In this work, we revisit this problem of assessing the true potential of using social media data to measure CCI, by proposing a robust non-parametric Bayesian modeling framework grounded in Gaussian Process Regression (which provides both an estimate and an uncertainty associated with it). Integral to our framework is a principled experimentation methodology that demonstrates how digital data can be employed to reduce the frequency of surveys, and thus periodic polling would be needed only to calibrate our model. Via extensive experimentation we show how the choice of different micro-decisions, such as the smoothing interval, various types of lags etc. have an important bearing on the results. By using decadal data (2008-2019) from Reddit, we show that both monthly and daily estimates of CCI can, indeed, be reliably estimated at least several months in advance, and that our model estimates are far superior to those generated by the existing methods.

翻訳日:2021-04-25 01:13:51 公開日:2020-12-26

# Few-Shot分類のための空間コントラスト学習

Spatial Contrastive Learning for Few-Shot Classification ( http://arxiv.org/abs/2012.13831v1 )

ライセンス: Link先を確認

Yassine Ouali, C\'eline Hudelot, Myriam Tami

(参考訳) 既存の数ショットの分類法は、限られたデータを持つ未確認クラスへのテスト時間適応を容易にするトランスファー可能な表現を学習するために、クロスエントロピー(CE)損失にある程度依存している。しかし、CE損失にはいくつかの欠点があり、例えば、目に見えないクラスに対する過度な差別を伴う表現の誘導は、見つからないクラスへの転送可能性を抑制し、その結果、準最適一般化をもたらす。本研究では,データ依存正規化器として機能する補助的な学習目標として,コントラスト学習を考察する。局所的な識別特徴を抑圧する標準的な対照目的ではなく、局所的な識別とクラス非依存の特徴を学習するための新しい注意に基づく空間比較目的を提案する。広範な実験により,提案手法が最先端のアプローチに勝ることを示し,数発学習における良質な組込みの学習の重要性を確認した。

Existing few-shot classification methods rely to some degree on the cross-entropy (CE) loss to learn transferable representations that facilitate the test time adaptation to unseen classes with limited data. However, the CE loss has several shortcomings, e.g., inducing representations with excessive discrimination towards seen classes, which reduces their transferability to unseen classes and results in sub-optimal generalization. In this work, we explore contrastive learning as an additional auxiliary training objective, acting as a data-dependent regularizer to promote more general and transferable features. Instead of using the standard contrastive objective, which suppresses local discriminative features, we propose a novel attention-based spatial contrastive objective to learn locally discriminative and class-agnostic features. With extensive experiments, we show that the proposed method outperforms state-of-the-art approaches, confirming the importance of learning good and transferable embeddings for few-shot learning.

翻訳日:2021-04-25 01:13:27 公開日:2020-12-26

# 拡張現実のためのシーンテキスト検出 --文字bigramによる偽陽性率の低減

Scene Text Detection for Augmented Reality -- Character Bigram Approach to reduce False Positive Rate ( http://arxiv.org/abs/2101.01054v1 )

ライセンス: Link先を確認

Sagar Gubbi and Bharadwaj Amrutur

(参考訳) 自然シーンのテキスト検出はシーン理解の重要な側面であり、拡張現実アプリケーションを構築する上で有用なツールである。本研究では,テキストスポッティングにおける偽陽性の問題に対処する。単文字ではなく文字ペア(ビグラム)を探すことにより,スライディングウィンドウテキストスポッターの性能向上を提案する。効率的な畳み込みニューラルネットワークを設計し、ビッグラムを検出するように訓練する。提案された検出器は、ICDAR 2015データセットにおいて偽陽性率を28.16%削減する。我々は,スライディングウィンドウのテキストスポッターを改善するために,bigramsの検出が計算的に安価な方法であることを実証する。

Natural scene text detection is an important aspect of scene understanding and could be a useful tool in building engaging augmented reality applications. In this work, we address the problem of false positives in text spotting. We propose improving the performace of sliding window text spotters by looking for character pairs (bigrams) rather than single characters. An efficient convolutional neural network is designed and trained to detect bigrams. The proposed detector reduces false positive rate by 28.16% on the ICDAR 2015 dataset. We demonstrate that detecting bigrams is a computationally inexpensive way to improve sliding window text spotters.

翻訳日:2021-04-25 01:13:09 公開日:2020-12-26

# アラビア語引用規則のSmartajweed自動認識

Smartajweed Automatic Recognition of Arabic Quranic Recitation Rules ( http://arxiv.org/abs/2101.04200v1 )

ライセンス: Link先を確認

Ali M. Alagrami, Maged M. Eljazzar

(参考訳) タジウェド(Tajweed)は、クァラン語を正しい発音で読むための一連の規則であり、クァラン語を暗唱している。つまり、クァーランのすべての文字に特徴の特質を付与し、読みながらこの特定の状況においてこの特定の文字にそれを適用しなければならない。これらの特徴はメロディックな規則、例えば、どこで停止するか、どのくらいの期間、発音で2文字をマージするか、あるいは何文字を伸ばすか、あるいは他の文字に力を加えるかなどである。論文のほとんどが主な朗読規則と発音に焦点を合わせているが(ahkam al tajweed)、異なるリズムと異なるメロディを発音に与えている(tajweed)。それはまた、クァランを読む上で非常に重要で不可欠であると考えられており、語に異なる意味を与えることができる。本稿では,サポートベクタマシンとしきい値スコアリングシステムを用いて,Quran Recitation Rules(Tajweed)の自動認識のための詳細なシステムについて論じる。

Tajweed is a set of rules to read the Quran in a correct Pronunciation of the letters with all its Qualities, while Reciting the Quran. which means you have to give every letter in the Quran its due of characteristics and apply it to this particular letter in this specific situation while reading, which may differ in other times. These characteristics include melodic rules, like where to stop and for how long, when to merge two letters in pronunciation or when to stretch some, or even when to put more strength on some letters over other. Most of the papers focus mainly on the main recitation rules and the pronunciation but not (Ahkam AL Tajweed) which give different rhythm and different melody to the pronunciation with every different rule of (Tajweed). Which is also considered very important and essential in Reading the Quran as it can give different meanings to the words. In this paper we discuss in detail full system for automatic recognition of Quran Recitation Rules (Tajweed) by using support vector machine and threshold scoring system

翻訳日:2021-04-25 01:12:18 公開日:2020-12-26

# ビッグデータからのコンパクトデータに向けて

Toward Compact Data from Big Data ( http://arxiv.org/abs/2012.13677v1 )

ライセンス: Link先を確認

Song-Kyoo (Amang) Kim

(参考訳) bigdataは、価値ある原材料を扱う能力以上の大きさのデータセットで、特定の洞察に洗練され、蒸留される。 compact dataは、複雑なbigdataを扱うことなく、最高のアセットを提供するbig datasetを最適化するメソッドである。このコンパクトデータセットは、ビッグデータのないビッグデータシステムの有効かつパーソナライズされた利用のために、きめ細かいレベルの最大知識パターンを含む。コンパクトデータ手法は,問題状況に依存したテーラーメイドの設計である。論文の様々なデータ駆動研究領域において、様々なコンパクトデータ技術が実証されている。

Bigdata is a dataset of which size is beyond the ability of handling a valuable raw material that can be refined and distilled into valuable specific insights. Compact data is a method that optimizes the big dataset that gives best assets without handling complex bigdata. The compact dataset contains the maximum knowledge patterns at fine grained level for effective and personalized utilization of bigdata systems without bigdata. The compact data method is a tailor-made design which depends on problem situations. Various compact data techniques have been demonstrated into various data-driven research area in the paper.

翻訳日:2021-04-25 01:11:57 公開日:2020-12-26

# スペクトル正規化による安定性確認強化学習

Stability-Certified Reinforcement Learning via Spectral Normalization ( http://arxiv.org/abs/2012.13744v1 )

ライセンス: Link先を確認

Ryoichi Takase, Nobuyuki Yoshikawa, Toshisada Mariyama, and Takeshi Tsuchiya

(参考訳) 本稿では、ニューラルネットワークが制御するシステムの安定性を確保するために、スペクトル正規化に基づく異なる視点からの2つの方法について述べる。 1つ目は、フィードバックシステムのL2ゲインが1未満の有界であり、小利得定理から導かれる安定性条件を満たすことである。第1の方法は、安定性条件を明示的に含むが、厳密な安定性条件のため、ニューラルネットワークコントローラの性能が不十分である可能性がある。この難しさを克服するため,第2の課題が提案され,より広いアトラクション領域での局所安定性を確保しつつ,性能の向上が図られた。第2の方法は、ニューラルネットワークコントローラのトレーニング後に線形行列の不等式を解くことにより安定性を確保する。本稿で提案するスペクトル正規化は, より厳密な局所セクターを構築することにより, a-posteriori 安定性試験の実現可能性を向上させる。数値実験により,第2法は第1法と比較して十分な性能を示し,既存の強化学習アルゴリズムと比較して十分な安定性が得られた。

In this article, two types of methods from different perspectives based on spectral normalization are described for ensuring the stability of the system controlled by a neural network. The first one is that the L2 gain of the feedback system is bounded less than 1 to satisfy the stability condition derived from the small-gain theorem. While explicitly including the stability condition, the first method may provide an insufficient performance on the neural network controller due to its strict stability condition. To overcome this difficulty, the second one is proposed, which improves the performance while ensuring the local stability with a larger region of attraction. In the second method, the stability is ensured by solving linear matrix inequalities after training the neural network controller. The spectral normalization proposed in this article improves the feasibility of the a-posteriori stability test by constructing tighter local sectors. The numerical experiments show that the second method provides enough performance compared with the first one while ensuring enough stability compared with the existing reinforcement learning algorithms.

翻訳日:2021-04-25 01:11:49 公開日:2020-12-26

# 呼吸音の異常予測のための深層学習フレームワーク

Deep Learning Framework Applied for Predicting Anomaly of Respiratory Sounds ( http://arxiv.org/abs/2012.13668v1 )

ライセンス: Link先を確認

Dat Ngo, Lam Pham, Anh Nguyen, Ben Phan, Khoa Tran, Truong Nguyen

(参考訳) 本稿では,呼吸サイクルの異常を分類するためのロバストなディープラーニングフレームワークを提案する。まず、フレームワークはフロントエンドの機能抽出ステップから始まります。このステップは、呼吸入力音をスペクトルと時間的特徴をよく表現した2次元スペクトログラムに変換することを目的としている。次に、C-DNNとオートエンコーダネットワークのアンサンブルを用いて、呼吸異常サイクルの4つのカテゴリに分類する。本研究は2017年にICBHI(Institutal Conference on Biomedical Health Informatics)ベンチマークデータセットを用いて実施した。その結果,ICBHI平均スコア0.49,IABHI高調波スコア0.42の競争性能が得られた。

This paper proposes a robust deep learning framework used for classifying anomaly of respiratory cycles. Initially, our framework starts with front-end feature extraction step. This step aims to transform the respiratory input sound into a two-dimensional spectrogram where both spectral and temporal features are well presented. Next, an ensemble of C- DNN and Autoencoder networks is then applied to classify into four categories of respiratory anomaly cycles. In this work, we conducted experiments over 2017 Internal Conference on Biomedical Health Informatics (ICBHI) benchmark dataset. As a result, we achieve competitive performances with ICBHI average score of 0.49, ICBHI harmonic score of 0.42.

翻訳日:2021-04-25 01:11:34 公開日:2020-12-26

# siameseネットワークを用いた学習視覚手がかりを用いたワンショット物体定位

One-Shot Object Localization Using Learnt Visual Cues via Siamese Networks ( http://arxiv.org/abs/2012.13690v1 )

ライセンス: Link先を確認

Sagar Gubbi Venkatesh and Bharadwaj Amrutur

(参考訳) 新規で非構造的な環境で動作可能なロボットは、これまで見えなかった新しい物体を認識する能力を持つ必要がある。本研究では,新しい環境にローカライズされなければならない新規な関心対象を特定するために視覚的な手がかりを用いる。 siameseネットワークを備えたエンドツーエンドニューラルネットワークを使用して、キューを学習し、関心のあるオブジェクトを推論し、新たな環境にローカライズする。シミュレーションロボットはレーザーポインターが指している新しい物体をピックアップ・アンド・プレースできることを示す。また,オムニグロット手書き文字データセットと玩具の小さなデータセットから得られたデータセットに対する提案手法の性能評価を行った。

A robot that can operate in novel and unstructured environments must be capable of recognizing new, previously unseen, objects. In this work, a visual cue is used to specify a novel object of interest which must be localized in new environments. An end-to-end neural network equipped with a Siamese network is used to learn the cue, infer the object of interest, and then to localize it in new environments. We show that a simulated robot can pick-and-place novel objects pointed to by a laser pointer. We also evaluate the performance of the proposed approach on a dataset derived from the Omniglot handwritten character dataset and on a small dataset of toys.

翻訳日:2021-04-25 01:11:05 公開日:2020-12-26

# オブジェクト検出に対するsparse adversarial attack

Sparse Adversarial Attack to Object Detection ( http://arxiv.org/abs/2012.13692v1 )

ライセンス: Link先を確認

Jiayu Bao

(参考訳) 敵対的な例は近年多くの注目を集めている。画像分類器を攻撃するために多くの敵攻撃が提案されているが、対象検出器に注意を向ける作業はほとんどない。本稿では,Sparse Adversarial Attack (SAA)を提案する。画像の脆弱な位置を選択し,タスクの回避損失関数を設計した。 YOLOv4とFasterRCNNの実験結果から,本手法の有効性が明らかになった。さらに、我々のSAAはブラックボックス攻撃設定で異なる検出器間で大きな伝達性を示す。コードは \emph{https://github.com/thurssq/tianchi04} で入手できる。

Adversarial examples have gained tons of attention in recent years. Many adversarial attacks have been proposed to attack image classifiers, but few work shift attention to object detectors. In this paper, we propose Sparse Adversarial Attack (SAA) which enables adversaries to perform effective evasion attack on detectors with bounded \emph{l$_{0}$} norm perturbation. We select the fragile position of the image and designed evasion loss function for the task. Experiment results on YOLOv4 and FasterRCNN reveal the effectiveness of our method. In addition, our SAA shows great transferability across different detectors in the black-box attack setting. Codes are available at \emph{https://github.com/THUrssq/Tianchi04}.

翻訳日:2021-04-25 01:10:29 公開日:2020-12-26

# 3次元連続心筋mriのための2次元呼吸ナビゲーションフレームワーク

2-D Respiration Navigation Framework for 3-D Continuous Cardiac Magnetic Resonance Imaging ( http://arxiv.org/abs/2012.13700v1 )

ライセンス: Link先を確認

Elisabeth Hoppe, Jens Wetzl, Philipp Roser, Lina Felsner, Alexander Preuhs, Andreas Maier

(参考訳) 心臓磁気共鳴イメージングのための連続的プロトコルは、同時に心筋相に分解された心臓解剖のサンプリングを可能にする。呼吸アーチファクトを避けるために、スキャン中の関連する動きを再建時に補償する必要がある。本稿では,連続スキャン中に2次元呼吸情報を取得するためのサンプリング適応を提案する。さらに、取得した信号から異なる呼吸状態を抽出するパイプラインを開発し、1つの呼吸相からデータを再構成する。以上の結果から,従来の1次元呼吸ナビゲーション手法と同様に,呼吸補償の不要な画像品質に対するワークフローの有用性が示された。

Continuous protocols for cardiac magnetic resonance imaging enable sampling of the cardiac anatomy simultaneously resolved into cardiac phases. To avoid respiration artifacts, associated motion during the scan has to be compensated for during reconstruction. In this paper, we propose a sampling adaption to acquire 2-D respiration information during a continuous scan. Further, we develop a pipeline to extract the different respiration states from the acquired signals, which are used to reconstruct data from one respiration phase. Our results show the benefit of the proposed workflow on the image quality compared to no respiration compensation, as well as a previous 1-D respiration navigation approach.

翻訳日:2021-04-25 01:10:17 公開日:2020-12-26

# 3次元カラーポイント雲を用いた高密度果樹樹樹へのリンゴの割り当て

Assigning Apples to Individual Trees in Dense Orchards using 3D Color Point Clouds ( http://arxiv.org/abs/2012.13721v1 )

ライセンス: Link先を確認

Mouad Zine-El-Abidine, Helin Dutagaci, Gilles Galopin, David Rousseau

(参考訳) 本稿では,trellis構造化果樹園の個々のリンゴのリンゴを数える3dカラーポイントクラウド処理パイプラインを提案する。木レベルでの果実の計数には、密集した果樹園では難しい木を切り離す必要がある。枝構造が見える冬期に葉樹園から取得した点雲を用いて樹冠の樹冠を画定する。我々は収穫期に獲得した点雲にリンゴをローカライズする。 2つの点のクラウドをアライメントすることで、appleのロケーションを線引きされた冬のクラウドにマッピングし、それぞれのリンゴをベアリングツリーに割り当てることができる。我々のリンゴ割当法は95%以上の精度を達成する。実現可能性の最初の証明を示すことに加えて、リンゴの割り当てパイプラインにさらなる改善を提案する。

We propose a 3D color point cloud processing pipeline to count apples on individual apple trees in trellis structured orchards. Fruit counting at the tree level requires separating trees, which is challenging in dense orchards. We employ point clouds acquired from the leaf-off orchard in winter period, where the branch structure is visible, to delineate tree crowns. We localize apples in point clouds acquired in harvest period. Alignment of the two point clouds enables mapping apple locations to the delineated winter cloud and assigning each apple to its bearing tree. Our apple assignment method achieves an accuracy rate higher than 95%. In addition to presenting a first proof of feasibility, we also provide suggestions for further improvement on our apple assignment pipeline.

翻訳日:2021-04-25 01:09:47 公開日:2020-12-26

# 周波数領域からの高速かつ高精度な圧縮映像行動認識

Faster and Accurate Compressed Video Action Recognition Straight from the Frequency Domain ( http://arxiv.org/abs/2012.13726v1 )

ライセンス: Link先を確認

Samuel Felipe dos Santos and Jurandy Almeida

(参考訳) 人間の行動認識は、監視、医療、産業環境、スマートホームなど幅広い応用のために、コンピュータビジョンにおける最も活発な研究分野の1つになっている。近年,ビデオ中の人間の行動を認識するための強力で解釈可能な特徴の習得にディープラーニングが成功している。既存のディープラーニングアプローチのほとんどは、RGB画像シーケンスとしてビデオ情報を処理するために設計されている。そのため、ビデオデータは圧縮フォーマットに格納されることが多いため、プリミティブな復号処理が必要となる。しかし、ビデオのデコードには高い計算負荷とメモリ使用量が必要である。そこで本研究では,圧縮映像から直接学習可能な深層ニューラルネットワークを提案する。提案手法は,UCF-101およびHMDB-51データセットの2つの公開ベンチマークで評価され,予測速度の最大2倍の高速化が期待できる。

Human action recognition has become one of the most active field of research in computer vision due to its wide range of applications, like surveillance, medical, industrial environments, smart homes, among others. Recently, deep learning has been successfully used to learn powerful and interpretable features for recognizing human actions in videos. Most of the existing deep learning approaches have been designed for processing video information as RGB image sequences. For this reason, a preliminary decoding process is required, since video data are often stored in a compressed format. However, a high computational load and memory usage is demanded for decoding a video. To overcome this problem, we propose a deep neural network capable of learning straight from compressed video. Our approach was evaluated on two public benchmarks, the UCF-101 and HMDB-51 datasets, demonstrating comparable recognition performance to the state-of-the-art methods, with the advantage of running up to 2 times faster in terms of inference speed.

翻訳日:2021-04-25 01:09:34 公開日:2020-12-26

# アンカー自由物体検出のための線形スケジューリングによるバランス指向焦点損失

Balance-Oriented Focal Loss with Linear Scheduling for Anchor Free Object Detection ( http://arxiv.org/abs/2012.13763v1 )

ライセンス: Link先を確認

Hopyong Gil, Sangwoo Park, Yusang Park, Wongoo Han, Juyean Hong, Juneyoung Jung

(参考訳) 既存のオブジェクト検出器の多くは、パフォーマンスのバランスを阻害するクラス不均衡の問題に苦しんでいる。特にアンカーフリーオブジェクト検出器は、画素毎の予測方法での検出とフォアグラウンドのアンバランス問題を同時に解決する必要がある。本研究では,背景バランスと前景バランスを総合的に考慮し,バランス学習を促すバランス指向焦点損失を提案する。本研究は,アンカーフリー物体検出器のショット数や焦点損失を含む非極端分布の一般不均衡データを用いた場合の非平衡問題に対処することを目的とする。我々は、この不均衡問題に精巧に対処するために、焦点損失のバッチワイズアルファバランスの変種を用いる。一般的な不均衡データに対して再重み付けのみを使用する、シンプルで実用的なソリューションである。推論やグルーピングクラスにおいて、追加の学習コストも構造的な変更も不要である。広範にわたる実験により,各部品の性能改善を示し,損失に対する再重み付けを用いた線形スケジューリングの効果を解析した。前景階級のバランスの点で焦点損失を改善することにより、アンカーフリーリアルタイム検出器のためのMS-COCOにおけるAP利得+1.2を達成する。

Most existing object detectors suffer from class imbalance problems that hinder balanced performance. In particular, anchor free object detectors have to solve the background imbalance problem due to detection in a per-pixel prediction fashion as well as foreground imbalance problem simultaneously. In this work, we propose Balance-oriented focal loss that can induce balanced learning by considering both background and foreground balance comprehensively. This work aims to address imbalance problem in the situation of using a general unbalanced data of non-extreme distribution not including few shot and the focal loss for anchor free object detector. We use a batch-wise alpha-balanced variant of the focal loss to deal with this imbalance problem elaborately. It is a simple and practical solution using only re-weighting for general unbalanced data. It does require neither additional learning cost nor structural change during inference and grouping classes is also unnecessary. Through extensive experiments, we show the performance improvement for each component and analyze the effect of linear scheduling when using re-weighting for the loss. By improving the focal loss in terms of balancing foreground classes, our method achieves AP gains of +1.2 in MS-COCO for the anchor free real-time detector.

翻訳日:2021-04-25 01:09:18 公開日:2020-12-26

# エッジコンピューティングに向けたディープラーニング - 圧縮データからニューラルネットワークへ

Deep Learning Towards Edge Computing: Neural Networks Straight from Compressed Data ( http://arxiv.org/abs/2012.14426v1 )

ライセンス: Link先を確認

Samuel Felipe dos Santos and Jurandy Almeida

(参考訳) 携帯電話の普及と計算能力の増大、人工知能の進歩により、多くのインテリジェントなアプリケーションが開発され、有意義に人々の生活を豊かにしている。そのため、エッジインテリジェンス(エッジインテリジェンス)の分野への関心が高まっており、これらのアプリケーションをより効率的かつセキュアにするために、データの計算をネットワークのエッジにプッシュすることを目指している。多くのインテリジェントアプリケーションは、畳み込みニューラルネットワーク(CNN)のようなディープラーニングモデルに依存している。過去10年間で、多くのコンピュータビジョンタスクで最先端のパフォーマンスを達成した。これらの手法の性能を高めるために、より深いアーキテクチャとより多くのパラメータを使用する傾向があり、計算コストが高くなる。実際、これはディープアーキテクチャが直面する主な問題の一つであり、エッジデバイスのような限られた計算リソースを持つドメインでの適用性を制限する。計算複雑性を軽減するために,画像とビデオの記憶と伝送に使用される圧縮表現で容易に利用できる視覚コンテンツに関連する情報から直接学習できるディープニューラルネットワークを提案する。提案手法の新規性は,RGB 画素ではなく DCT 係数で学習することで,周波数領域データを直接操作するように設計されている。これにより、データストリームの完全復号化において高い計算負荷を節約し、処理時間を大幅に短縮することが可能になる。 1)ImageNetデータセット上の画像分類と,(2)UCF-101データセットとHMDB-51データセット上の映像分類の2つの課題について,ネットワークの評価を行った。その結果, 計算効率が向上し, 精度の面では最先端手法に匹敵する効果を示した。

Due to the popularization and grow in computational power of mobile phones, as well as advances in artificial intelligence, many intelligent applications have been developed, meaningfully enriching people's life. For this reason, there is a growing interest in the area of edge intelligence, that aims to push the computation of data to the edges of the network, in order to make those applications more efficient and secure. Many intelligent applications rely on deep learning models, like convolutional neural networks (CNNs). Over the past decade, they have achieved state-of-the-art performance in many computer vision tasks. To increase the performance of these methods, the trend has been to use increasingly deeper architectures and with more parameters, leading to a high computational cost. Indeed, this is one of the main problems faced by deep architectures, limiting their applicability in domains with limited computational resources, like edge devices. To alleviate the computational complexity, we propose a deep neural network capable of learning straight from the relevant information pertaining to visual content readily available in the compressed representation used for image and video storage and transmission. The novelty of our approach is that it was designed to operate directly on frequency domain data, learning with DCT coefficients rather than RGB pixels. This enables to save high computational load in full decoding the data stream and therefore greatly speed up the processing time, which has become a big bottleneck of deep learning. We evaluated our network on two challenging tasks: (1) image classification on the ImageNet dataset and (2) video classification on the UCF-101 and HMDB-51 datasets. Our results demonstrate comparable effectiveness to the state-of-the-art methods in terms of accuracy, with the advantage of being more computationally efficient.

翻訳日:2021-04-25 01:09:00 公開日:2020-12-26

# スパース報酬を伴う連続制御タスクの局所的持続的探索

Locally Persistent Exploration in Continuous Control Tasks with Sparse Rewards ( http://arxiv.org/abs/2012.13658v1 )

ライセンス: Link先を確認

Susan Amin (1 and 2), Maziar Gomrokchi (1 and 2), Hossein Aboutalebi (3), Harsh Satija (1 and 2) and Doina Precup (1 and 2) ((1) McGill University, (2) Mila- Quebec Artificial Intelligence Institute, (3) University of Waterloo)

(参考訳) 強化学習における大きな課題は、特に粗末な報酬構造と連続状態と行動空間を持つ環境において、探索戦略の設計である。直感的には、補強信号が非常に少ない場合、エージェントは環境を効率的にカバーするために何らかの短期記憶に頼るべきである。我々は,(1)次の探索行動の選択は環境の(マルコフ)状態だけでなく,エージェントの軌道にも依存すべきであり,(2)エージェントは,状態空間における拡散の指標を利用して,小さな領域で立ち往生することを避ける必要がある,という2つの直観に基づく新たな探索法を提案する。本手法は,統計物理学でよく用いられる概念を応用し,状態空間における持続的(局所的に自己回避する)軌道を生成するために,単純化された(ポリマー)鎖の挙動を説明する。本稿では,局所自己回避歩行の理論的特性と,軌道内における時間的相関による短期記憶の提供能力について論じる。シミュレーションによる2次元ナビゲーションタスクや,高次元のムジョコ連続制御ロコモーションタスクにおいて,そのアプローチを経験的に評価した。

A major challenge in reinforcement learning is the design of exploration strategies, especially for environments with sparse reward structures and continuous state and action spaces. Intuitively, if the reinforcement signal is very scarce, the agent should rely on some form of short-term memory in order to cover its environment efficiently. We propose a new exploration method, based on two intuitions: (1) the choice of the next exploratory action should depend not only on the (Markovian) state of the environment, but also on the agent's trajectory so far, and (2) the agent should utilize a measure of spread in the state space to avoid getting stuck in a small region. Our method leverages concepts often used in statistical physics to provide explanations for the behavior of simplified (polymer) chains, in order to generate persistent (locally self-avoiding) trajectories in state space. We discuss the theoretical properties of locally self-avoiding walks, and their ability to provide a kind of short-term memory, through a decaying temporal correlation within the trajectory. We provide empirical evaluations of our approach in a simulated 2D navigation task, as well as higher-dimensional MuJoCo continuous control locomotion tasks with sparse rewards.

翻訳日:2021-04-25 01:08:33 公開日:2020-12-26

# 深層学習に基づく6G協調運転のためのインテリジェント車間距離制御

Deep Learning Based Intelligent Inter-Vehicle Distance Control for 6G Enabled Cooperative Autonomous Driving ( http://arxiv.org/abs/2012.13817v1 )

ライセンス: Link先を確認

Xiaosha Chen, Supeng Leng, Jianhua He, and Longyu Zhou

(参考訳) 第6世代セルネットワーク(6G)の研究は、ユビキタス無線接続を実現するために大きな勢いを増している。コネクテッド・自律運転(CAV)は、6Gにとって重要な垂直方向であり、道路の安全性、道路、エネルギー効率を改善する大きな可能性を秘めている。しかし、信頼性、レイテンシ、高速通信に関するCAVアプリケーションの厳しいサービス要件は、6Gネットワークに大きな課題をもたらすだろう。 6g対応cavには,新たなチャネルアクセスアルゴリズムとコネクテッドカーのインテリジェント制御スキームが必要である。本稿では,情報共有と運転協調による高度な運転モードである6G支援協調運転について検討した。まず,ハイブリッド通信とチャネルアクセス技術を用いたV2V通信における6G車両の遅延上限を定量化する。リアルタイム操作における遅延境界の高速計算のために,ディープラーニングニューラルネットワークを開発し,学習する。そして、協調自動運転のための車間距離を制御するインテリジェントな戦略を設計する。さらに,システム状態のパラメータを予測するマルコフ連鎖に基づくアルゴリズムと,スムーズな車速変化を可能にする安全な距離マッピング手法を提案する。提案アルゴリズムはAirSim自動運転プラットフォームで実装されている。シミュレーションの結果,提案手法は安全で安定な協調運転により有効で頑健であり,道路の安全性,容量,効率が大幅に向上することがわかった。

Research on the sixth generation cellular networks (6G) is gaining huge momentum to achieve ubiquitous wireless connectivity. Connected autonomous driving (CAV) is a critical vertical envisioned for 6G, holding great potentials of improving road safety, road and energy efficiency. However the stringent service requirements of CAV applications on reliability, latency and high speed communications will present big challenges to 6G networks. New channel access algorithms and intelligent control schemes for connected vehicles are needed for 6G supported CAV. In this paper, we investigated 6G supported cooperative driving, which is an advanced driving mode through information sharing and driving coordination. Firstly we quantify the delay upper bounds of 6G vehicle to vehicle (V2V) communications with hybrid communication and channel access technologies. A deep learning neural network is developed and trained for fast computation of the delay bounds in real time operations. Then, an intelligent strategy is designed to control the inter-vehicle distance for cooperative autonomous driving. Furthermore, we propose a Markov Chain based algorithm to predict the parameters of the system states, and also a safe distance mapping method to enable smooth vehicular speed changes. The proposed algorithms are implemented in the AirSim autonomous driving platform. Simulation results show that the proposed algorithms are effective and robust with safe and stable cooperative autonomous driving, which greatly improve the road safety, capacity and efficiency.

翻訳日:2021-04-25 01:08:12 公開日:2020-12-26

# 逆ネットワークを用いた画像合成:包括的調査とケーススタディ

Image Synthesis with Adversarial Networks: a Comprehensive Survey and Case Studies ( http://arxiv.org/abs/2012.13736v1 )

ライセンス: Link先を確認

Pourya Shamsolmoali, Masoumeh Zareapoor, Eric Granger, Huiyu Zhou, Ruili Wang, M. Emre Celebi and Jie Yang

(参考訳) generative adversarial networks (gans) はコンピュータビジョン、医学、自然言語処理といった様々なアプリケーションドメインで非常に成功している。さらに、オブジェクトや人物を望ましい形に変えることは、GANにおいてよく研究される研究になる。 GANは複雑な分布を学習して意味のあるサンプルを合成する強力なモデルである。しかし、この分野には包括的なレビューの欠如、特にgans損失変動のコレクションの欠如、評価メトリクス、多様な画像生成のための修正、安定したトレーニングがある。本調査では,現時点の高速なGANの開発状況を踏まえ,画像合成の敵モデルに関する総合的なレビューを行う。合成画像生成法を要約し,画像間変換,融合画像生成,ラベル間マッピング,テキスト間変換などのカテゴリについて考察する。基礎モデルに基づいて文献を整理し,アーキテクチャ,制約,損失関数,評価指標,データセットの学習に関するアイデアを考案した。本稿では, 敵モデルのマイルストーン, 様々なカテゴリにおける先行作品の広範な選択, モデルベースからデータ駆動手法への開発経路に関する知見について述べる。さらに,今後の研究の方向性についても注目する。このレビューのユニークな特徴の1つは、これらのGANメソッドとデータセットのすべてのソフトウェア実装が収集され、https://github.com/pshams55/GAN-Case-Studyで一箇所で利用可能になったことである。

Generative Adversarial Networks (GANs) have been extremely successful in various application domains such as computer vision, medicine, and natural language processing. Moreover, transforming an object or person to a desired shape become a well-studied research in the GANs. GANs are powerful models for learning complex distributions to synthesize semantically meaningful samples. However, there is a lack of comprehensive review in this field, especially lack of a collection of GANs loss-variant, evaluation metrics, remedies for diverse image generation, and stable training. Given the current fast GANs development, in this survey, we provide a comprehensive review of adversarial models for image synthesis. We summarize the synthetic image generation methods, and discuss the categories including image-to-image translation, fusion image generation, label-to-image mapping, and text-to-image translation. We organize the literature based on their base models, developed ideas related to architectures, constraints, loss functions, evaluation metrics, and training datasets. We present milestones of adversarial models, review an extensive selection of previous works in various categories, and present insights on the development route from the model-based to data-driven methods. Further, we highlight a range of potential future research directions. One of the unique features of this review is that all software implementations of these GAN methods and datasets have been collected and made available in one place at https://github.com/pshams55/GAN-Case-Study.

翻訳日:2021-04-25 01:07:53 公開日:2020-12-26

# 自律走行のための確率的3次元マルチモーダルマルチオブジェクトトラッキング

Probabilistic 3D Multi-Modal, Multi-Object Tracking for Autonomous Driving ( http://arxiv.org/abs/2012.13755v1 )

ライセンス: Link先を確認

Hsu-kuang Chiu, Jie Li, Rares Ambrus, Jeannette Bohg

(参考訳) マルチオブジェクトトラッキングは、自動運転車が交通シーンを安全にナビゲートする重要な機能である。現在の最先端は、ある距離メトリックを通じて検出対象と既存のトラックが関連付けられるトラッキング・バイ・検出パラダイムに従っている。追跡精度を高めるための重要な課題は、データアソシエーションとライフサイクル管理の追跡にある。本稿では,複数のトレーニング可能なモジュールからなる確率的マルチモーダル・マルチオブジェクトトラッキングシステムを提案し,ロバストかつデータ駆動的なトラッキング結果を提供する。まず、2D画像と3D LiDAR点雲から特徴を融合して、オブジェクトの外観と幾何学的情報をキャプチャする方法を学ぶ。第2に,マハラノビスと特徴距離を組み合わせた距離を,トラックとデータアソシエーションにおける新たな検出とを比較して学習することを提案する。そして第3に、未整合物体検出からトラックをいつ初期化するかを学ぶことを提案する。そこで本手法は,NuScenes Trackingデータセットにおける最先端の手法よりも優れていることを示す。

Multi-object tracking is an important ability for an autonomous vehicle to safely navigate a traffic scene. Current state-of-the-art follows the tracking-by-detection paradigm where existing tracks are associated with detected objects through some distance metric. The key challenges to increase tracking accuracy lie in data association and track life cycle management. We propose a probabilistic, multi-modal, multi-object tracking system consisting of different trainable modules to provide robust and data-driven tracking results. First, we learn how to fuse features from 2D images and 3D LiDAR point clouds to capture the appearance and geometric information of an object. Second, we propose to learn a metric that combines the Mahalanobis and feature distances when comparing a track and a new detection in data association. And third, we propose to learn when to initialize a track from an unmatched object detection. Through extensive quantitative and qualitative results, we show that our method outperforms current state-of-the-art on the NuScenes Tracking dataset.

翻訳日:2021-04-25 01:07:31 公開日:2020-12-26

# エッジ保存フィルタの評価と比較

Evaluation and Comparison of Edge-Preserving Filters ( http://arxiv.org/abs/2012.13778v1 )

ライセンス: Link先を確認

Sarah Gingichashvili and Dani Lischinski

(参考訳) エッジ保存フィルタは、抽象化、トーンマップ、細部の拡張、テクスチャの除去など、計算写真の最も基本的なタスクにおいて重要な役割を果たす。スムーズな演算子の多様さと多様性は、出力品質を評価したり、それらの間の非バイアス比較を行う方法論の欠如と共に、そのような方法の誤解や潜在的な誤用につながる可能性がある。本稿では,そのような演算子を評価・比較するための体系的手法を導入し,多種多様なエッジ保存フィルタ上で実証する。さらに,異なる演算子の比較が可能な共通ベースラインを提案し,それを用いてメソッド間の等価パラメータマッピングを決定する。最後に,エッジ保存フィルタの客観的比較と評価のためのガイドラインを提案する。

Edge-preserving filters play an essential role in some of the most basic tasks of computational photography, such as abstraction, tonemapping, detail enhancement and texture removal, to name a few. The abundance and diversity of smoothing operators, accompanied by a lack of methodology to evaluate output quality and/or perform an unbiased comparison between them, could lead to misunderstanding and potential misuse of such methods. This paper introduces a systematic methodology for evaluating and comparing such operators and demonstrates it on a diverse set of published edge-preserving filters. Additionally, we present a common baseline along which a comparison of different operators can be achieved and use it to determine equivalent parameter mappings between methods. Finally, we suggest some guidelines for objective comparison and evaluation of edge-preserving filters.

翻訳日:2021-04-25 01:07:13 公開日:2020-12-26

# IDSのためのLSTMの異なるハイパーパラメータの相対的重要性の評価

Assessment of the Relative Importance of different hyper-parameters of LSTM for an IDS ( http://arxiv.org/abs/2012.14427v1 )

ライセンス: Link先を確認

Mohit Sewak, Sanjay K. Sahay and Hemant Rathore

(参考訳) LSTMのような反復的なディープラーニング言語モデルは、しばしば高価値資産のための高度なサイバー防御を提供するために使用される。 LSTMネットワークをマルウェア検出に使用する基本的な前提は、マルウェアのオプトコードシーケンスを(偽)言語表現として扱うことができることである。音声言語(単語/単語の系列)と機械語(オペ符号の系列)には違いがある。本稿では,これら固有の違いから,ネットワークの必須ハイパーパラメータが適切に調整されない限り,音声言語用に調整されたデフォルト構成のLSTMモデルでは,マルウェアを検出するのに有効ではないことを示す。その過程では,lstmネットワークのすべての異なるハイパーパラメータの相対的重要性を,そのオペコードシーケンス表現を用いてマルウェア検出に適用する。 LSTMネットワークの異なる構成を実験し、埋め込みサイズ、隠蔽層数、隠蔽層数、入力ベクトルのプルーニング/パディング長、アクティベーション-ファンクション、バッチサイズなどのハイパーパラメータを変更した。侵入検知システム用に構成されたLSTMネットワークの性能は,マルウェア/機械言語の複雑さの増大により,隠れ層数,入力シーケンス長,アクティベーション-ファンクションの選択に非常に敏感であることが判明した。また、言語モデリングでは、リカレントアーキテクチャは非リカレントアーキテクチャよりも優れています。したがって、LSTMのような連続的なDLアーキテクチャが、マルウェア検出のために、MLP-DNNのようなシーケンシャルでないアーキテクチャとどのように比較するかを評価する。

Recurrent deep learning language models like the LSTM are often used to provide advanced cyber-defense for high-value assets. The underlying assumption for using LSTM networks for malware-detection is that the op-code sequence of malware could be treated as a (spoken) language representation. There are differences between any spoken-language (sequence of words/sentences) and the machine-language (sequence of op-codes). In this paper, we demonstrate that due to these inherent differences, an LSTM model with its default configuration as tuned for a spoken-language, may not work well to detect malware (using its op-code sequence) unless the network's essential hyper-parameters are tuned appropriately. In the process, we also determine the relative importance of all the different hyper-parameters of an LSTM network as applied to malware detection using their op-code sequence representations. We experimented with different configurations of LSTM networks, and altered hyper-parameters like the embedding-size, number of hidden layers, number of LSTM-units in a hidden layer, pruning/padding-length of the input-vector, activation-function, and batch-size. We discovered that owing to the enhanced complexity of the malware/machine-language, the performance of an LSTM network configured for an Intrusion Detection System, is very sensitive towards the number-of-hidden-layers, input sequence-length, and the choice of the activation-function. Also, for (spoken) language-modeling, the recurrent architectures by-far outperform their non-recurrent counterparts. Therefore, we also assess how sequential DL architectures like the LSTM compare against their non-sequential counterparts like the MLP-DNN for the purpose of malware-detection.

翻訳日:2021-04-25 01:07:00 公開日:2020-12-26

# 高精度ペグ・イン・ホールタスクの模倣学習

Imitation Learning for High Precision Peg-in-Hole Tasks ( http://arxiv.org/abs/2101.01052v1 )

ライセンス: Link先を確認

Sagar Gubbi and Shishir Kolathaya and Bharadwaj Amrutur

(参考訳) 産業用ロボットマニピュレータは、人間が現在に至るまで、コンタクトリッチなタスクを実行できる精度とスピードとを一致させることができない。そこで, このギャップを克服する手段として, 6-DOFロボットマニピュレータにおいて, 穴内挿入タスクを模倣する生成方法を示す。特に, GAIL(Generative Adversarial mimicion Learning, GAIL)は, 八川GP8産業用ロボットの10mと6mのペグホールクリアランスを用いて, このタスクを成功させる。実験の結果,ロボット上での少数の人間専門家によるデモンストレーション(遠隔操作ロボット10例)から20エピソード以内の学習が得られた。挿入時間は > 20 秒(挿入失敗を含む)から < 15 秒に改善され、このアプローチの有効性が検証される。

Industrial robot manipulators are not able to match the precision and speed with which humans are able to execute contact rich tasks even to this day. Therefore, as a means overcome this gap, we demonstrate generative methods for imitating a peg-in-hole insertion task in a 6-DOF robot manipulator. In particular, generative adversarial imitation learning (GAIL) is used to successfully achieve this task with a 10 um, and a 6 um peg-hole clearance on the Yaskawa GP8 industrial robot. Experimental results show that the policy successfully learns within 20 episodes from a handful of human expert demonstrations on the robot (i.e., < 10 tele-operated robot demonstrations). The insertion time improves from > 20 seconds (which also includes failed insertions) to < 15 seconds, thereby validating the effectiveness of this approach.

翻訳日:2021-04-25 01:06:35 公開日:2020-12-26

# エンド・ツー・エンドの模倣学習のためのマルチインスタンスアウェアローカライゼーション

Multi-Instance Aware Localization for End-to-End Imitation Learning ( http://arxiv.org/abs/2101.01053v1 )

ライセンス: Link先を確認

Sagar Gubbi Venkatesh and Raviteja Upadrashta and Shishir Kolathaya and Bharadwaj Amrutur

(参考訳) イメージ・ツー・アクション・ポリシー・ネットワークを用いた模倣学習の既存のアーキテクチャは、興味のある対象の複数のインスタンスを含む入力画像が提示された場合、特に訓練に利用可能な専門家のデモ数が限られている場合、性能が低下する。 a) 視覚層の特徴マップ出力に、例の好みを示すような埋め込みや、専門家のデモに存在する暗黙の嗜好を活用できる埋め込みを付加し、(b) 制御層に自己回帰行動生成ネットワークを用いることで、エンドツーエンドのポリシーネットワークを効率的にトレーニングできることが示される。ローカライゼーションのためのアーキテクチャは精度とサンプル効率を向上し、トレーニング中に見るよりも多くのオブジェクトの存在を一般化することができる。エンド・ツー・エンドの模倣学習で実際のロボットでリーチ、プッシュ、ピック・アンド・プレイスのタスクを実行する場合、トレーニングは15のエキスパート・デモで達成される。

Existing architectures for imitation learning using image-to-action policy networks perform poorly when presented with an input image containing multiple instances of the object of interest, especially when the number of expert demonstrations available for training are limited. We show that end-to-end policy networks can be trained in a sample efficient manner by (a) appending the feature map output of the vision layers with an embedding that can indicate instance preference or take advantage of an implicit preference present in the expert demonstrations, and (b) employing an autoregressive action generator network for the control layers. The proposed architecture for localization has improved accuracy and sample efficiency and can generalize to the presence of more instances of objects than seen during training. When used for end-to-end imitation learning to perform reach, push, and pick-and-place tasks on a real robot, training is achieved with as few as 15 expert demonstrations.

翻訳日:2021-04-25 01:06:18 公開日:2020-12-26

# 模倣学習のための確率的行動予測

Stochastic Action Prediction for Imitation Learning ( http://arxiv.org/abs/2101.01055v1 )

ライセンス: Link先を確認

Sagar Gubbi Venkatesh and Nihesh Rathod and Shishir Kolathaya and Bharadwaj Amrutur

(参考訳) 模倣学習(imitation learning)は、専門家によるデモンストレーションに頼って、観察を行動にマッピングするポリシーを学ぶための、データ駆動の手法である。デモを行う場合、専門家は常に一貫性があり、わずかに異なる方法で同じタスクを達成する可能性がある。本稿では,遠隔操作車に追従するラインや,物体の到達,押圧,ピック,配置などの操作タスクを含む,実演における固有確率性を示す。自己回帰的行動生成,生成的逆ネット,変動予測を用いてデータ分布の確率性をモデル化し,これらの手法の性能を比較する。専門家データにおける確率性の説明は,タスク完了の成功率を大幅に向上させることがわかった。

Imitation learning is a data-driven approach to acquiring skills that relies on expert demonstrations to learn a policy that maps observations to actions. When performing demonstrations, experts are not always consistent and might accomplish the same task in slightly different ways. In this paper, we demonstrate inherent stochasticity in demonstrations collected for tasks including line following with a remote-controlled car and manipulation tasks including reaching, pushing, and picking and placing an object. We model stochasticity in the data distribution using autoregressive action generation, generative adversarial nets, and variational prediction and compare the performance of these approaches. We find that accounting for stochasticity in the expert data leads to substantial improvement in the success rate of task completion.

翻訳日:2021-04-25 01:05:57 公開日:2020-12-26

# 肺疾患と呼吸器疾患の予知に応用したインセプションベースネットワークとマルチスペクトログラム

Inception-Based Network and Multi-Spectrogram Ensemble Applied For Predicting Respiratory Anomalies and Lung Diseases ( http://arxiv.org/abs/2012.13699v1 )

ライセンス: Link先を確認

Lam Pham, Huy Phan, Ross King, Alfred Mertins, Ian McLoughlin

(参考訳) 本稿では,呼吸音入力を用いた肺疾患検出のためのインセプションベースディープニューラルネットワークを提案する。患者から収集された呼吸音の記録は、まずスペクトル情報と時間情報の両方がよく提示される分光器に変換される。これらのスペクトログラムは、肺関連疾患に罹患する患者を検出するために、バックエンド分類と呼ばれる提案されたネットワークに供給される。呼吸音のicbhiベンチマークメタデータセットを用いて, 呼吸異常と疾患検出に関して, それぞれ0.53/0.45 と 0.87/0.85 の競合 icbhiスコアを達成する実験を行った。

This paper presents an inception-based deep neural network for detecting lung diseases using respiratory sound input. Recordings of respiratory sound collected from patients are firstly transformed into spectrograms where both spectral and temporal information are well presented, referred to as front-end feature extraction. These spectrograms are then fed into the proposed network, referred to as back-end classification, for detecting whether patients suffer from lung-relevant diseases. Our experiments, conducted over the ICBHI benchmark meta-dataset of respiratory sound, achieve competitive ICBHI scores of 0.53/0.45 and 0.87/0.85 regarding respiratory anomaly and disease detection, respectively.

翻訳日:2021-04-25 01:05:46 公開日:2020-12-26

PDF登録状況（公開日: 20201226）