Fugu-MT: arxivの論文翻訳

このサイトではarxivの論文のうち、30ページ以下でCreative Commonsライセンス（CC 0, CC BY, CC BY-SA）の論文を日本語訳しています。本文がCCでない論文、長すぎる論文はメタデータのみを翻訳しています。（arxivのメタデータは CC 0です。）翻訳文のライセンスはCC BY-SA 4.0です。翻訳にはFugu-Machine Translatorを利用しています。

本サイトの運営者は本サイト（すべての情報・翻訳含む）の品質を保証せず、本サイト（すべての情報・翻訳含む）を使用して発生したあらゆる結果について一切の責任を負いません。

公開日が20210307となっている論文です。

Title	Authors	Abstract	論文公表日・翻訳日
# 量子接触過程における連続的に変化する指数による吸収相転移:ニューラルネットワークアプローチ Absorbing phase transition with a continuously varying exponent in a quantum contact process: a neural network approach ( http://arxiv.org/abs/2004.02672v4 ) ライセンス: Link先を確認	Minjae Jo, Jongshin Lee, K. Choi, and B. Kahng	(参考訳) 散逸量子系における相転移は、コヒーレント量子と非コヒーレント古典ゆらぎの相互作用によって引き起こされるため興味深い。本稿では,量子接触過程(QCP)における量子から古典的吸収相転移への交叉について検討する。 lindblad方程式は、それぞれ量子効果と古典効果の寄与を調整する2つのパラメータ、$\omega$と$\kappa$を含んでいる。ある次元において、QCPがすべての活性部位を持つ同質状態から始まるとき、指数$\alpha$(活性部位の密度と関連している)が量子から古典的有向パーコレーション(DP)値に連続的に減少する領域に臨界線が存在することが分かる。この挙動は、量子コヒーレント効果がある程度は$\kappa=0$に近いことを示唆する。しかし、1次元のQCPが1つの活性部位を除いてすべての不活性部位を持つ不均一状態から始まるとき、全ての臨界指数は$\kappa \ge 0$の古典的なDP値を持つ。 2次元では、異常なクロスオーバー動作は発生せず、古典的なDPの挙動は初期構成にかかわらず$\kappa \ge 0$の全領域に現れる。ニューラルネットワーク機械学習を用いて臨界線を特定し、相関長指数を決定する。量子ジャンプモンテカルロ法とテンソルネットワーク法による数値シミュレーションを行い、QCPの他の臨界指数を全て決定する。 Phase transitions in dissipative quantum systems are intriguing because they are induced by the interplay between coherent quantum and incoherent classical fluctuations. Here, we investigate the crossover from a quantum to a classical absorbing phase transition arising in the quantum contact process (QCP). The Lindblad equation contains two parameters, $\omega$ and $\kappa$, which adjust the contributions of the quantum and classical effects, respectively. We find that in one dimension when the QCP starts from a homogeneous state with all active sites, there exists a critical line in the region $0 \le \kappa < \kappa_*$ along which the exponent $\alpha$ (which is associated with the density of active sites) decreases continuously from a quantum to the classical directed percolation (DP) value. This behavior suggests that the quantum coherent effect remains to some extent near $\kappa=0$. However, when the QCP in one dimension starts from a heterogeneous state with all inactive sites except for one active site, all the critical exponents have the classical DP values for $\kappa \ge 0$. In two dimensions, anomalous crossover behavior does not occur, and classical DP behavior appears in the entire region of $\kappa \ge 0$ regardless of the initial configuration. Neural network machine learning is used to identify the critical line and determine the correlation length exponent. Numerical simulations using the quantum jump Monte Carlo technique and tensor network method are performed to determine all the other critical exponents of the QCP.	翻訳日:2023-05-26 06:25:12 公開日:2021-03-07
# 原子スペクトルにおける量子時間拡張 Quantum time dilation in atomic spectra ( http://arxiv.org/abs/2006.10084v3 ) ライセンス: Link先を確認	Piotr T. Grochowski, Alexander R. H. Smith, Andrzej Dragan and Kacper D\k{e}bski	(参考訳) 量子時間拡張は、相対論的運動量波パケットの重ね合わせで時計が動くときに起こる。我々は、励起水素様原子の寿命を時計として利用し、自然放出過程において量子時間拡張がどのように現れるかを示す。結果として生じる放出速度は、運動量波パケットの混合で調製された原子の放出速度に対して、順に$v^2/c^2$である場合と異なる。この効果は、運動量波パケット間のコヒーレンスによるドップラーシフトに対する量子補正を伴う。この量子ドップラーシフトは、$v/c$のスペクトル線形状に影響を与える。しかし、その減衰速度への影響は、量子時間拡張の効果と比較して抑制される。我々は、分光実験が量子時間拡張の効果を研究するための技術的に実現可能なプラットフォームを提供すると主張する。 Quantum time dilation occurs when a clock moves in a superposition of relativistic momentum wave packets. We utilize the lifetime of an excited hydrogen-like atom as a clock to demonstrate how quantum time dilation manifests in a spontaneous emission process. The resulting emission rate differs when compared to the emission rate of an atom prepared in a mixture of momentum wave packets at order $v^2/c^2$. This effect is accompanied by a quantum correction to the Doppler shift due to the coherence between momentum wave packets. This quantum Doppler shift affects the spectral line shape at order $v/c$. However, its effect on the decay rate is suppressed when compared to the effect of quantum time dilation. We argue that spectroscopic experiments offer a technologically feasible platform to explore the effects of quantum time dilation.	翻訳日:2023-05-13 15:38:49 公開日:2021-03-07
# チップスケール検出器を用いた効率的・低反発量子計測 Efficient and Low-Backaction Quantum Measurement Using a Chip-Scale Detector ( http://arxiv.org/abs/2008.03805v2 ) ライセンス: Link先を確認	Eric I. Rosenthal, Christian M. F. Schneider, Maxime Malnou, Ziyi Zhao, Felix Leditzky, Benjamin J. Chapman, Waltraut Wustmann, Xizheng Ma, Daniel A. Palken, Maximilian F. Zanner, Leila R. Vale, Gene C. Hilton, Jiansong Gao, Graeme Smith, Gerhard Kirchmair, and K. W. Lehnert	(参考訳) 超伝導量子ビットはスケーラブルな量子コンピューティングと量子誤り訂正のための主要なプラットフォームである。このプラットフォームの1つの特徴は、qubitのデコヒーレンス時間よりも桁違いに射影的測定を行う能力である。このような測定は、量子制限パラメトリック増幅器とフェライト循環器(増幅器のバックアクションによるノイズやデコヒーレンスからの隔離を提供する磁気装置)を併用することで可能となる。これらの非相互要素は性能が限られており、チップ上で簡単に統合できないため、スケーラブルな代替品に置き換えるという長年の目標であった。本稿では,量子ビットと増幅器の結合を制御するための超伝導スイッチを用いて,この問題に対する解法を示す。これにより、1つのチップスケールのデバイスを用いてトランスモン量子ビットを計測し、パラメトリック増幅とアンプバック動作のバルクからの分離の両方を提供する。この測定は高速で信頼性が高く、70%の効率で、超伝導量子ビット測定で報告された最高の測定値に匹敵する。このように、この研究は超伝導量子ビットのスケーラブルな測定のための高品質なプラットフォームを構成する。 Superconducting qubits are a leading platform for scalable quantum computing and quantum error correction. One feature of this platform is the ability to perform projective measurements orders of magnitude more quickly than qubit decoherence times. Such measurements are enabled by the use of quantum-limited parametric amplifiers in conjunction with ferrite circulators - magnetic devices which provide isolation from noise and decoherence due to amplifier backaction. Because these non-reciprocal elements have limited performance and are not easily integrated on-chip, it has been a longstanding goal to replace them with a scalable alternative. Here, we demonstrate a solution to this problem by using a superconducting switch to control the coupling between a qubit and amplifier. Doing so, we measure a transmon qubit using a single, chip-scale device to provide both parametric amplification and isolation from the bulk of amplifier backaction. This measurement is also fast, high fidelity, and has 70% efficiency, comparable to the best that has been reported in any superconducting qubit measurement. As such, this work constitutes a high-quality platform for the scalable measurement of superconducting qubits.	翻訳日:2023-05-06 18:03:43 公開日:2021-03-07
# SYKモデルとJT重力における作用素の複雑性成長 Complexity growth of operators in the SYK model and in JT gravity ( http://arxiv.org/abs/2008.12274v3 ) ライセンス: Link先を確認	Shao-Kai Jian, Brian Swingle, and Zhuo-Yu Xian	(参考訳) 作用素の大きさと計算複雑性の概念は、時間発展するハイゼンベルク作用素の構造を特徴づけるのに役立つため、量子カオスとホログラフィック双対の研究において重要な役割を果たす。特に、これらの顕微鏡的に定義された複雑性の測度が、複雑性体積(CV)双対性のような双対ホログラフィック幾何学で定義される複雑性の概念とどのように関連しているかを理解することが重要である。本稿では, Sachdev-Ye-Kitaev(SYK)モデルにおける部分絡み合った熱状態と, ジャッキー・ティーテルボイム(JT)重力におけるブラックホールの内部に挿入された作用素の双対記述について述べる。我々は,K-複雑度として知られるSYKモデルにおける複雑性の顕微鏡的定義とJT重力におけるCV双対性を用いた計算との比較を行い,両量とも指数的-直線的成長挙動を示すことを示した。また,時間発展に伴うオペレータサイズの成長を計算し,サイズと複雑性の関連を見出す。演算子の大きさの概念はスクランブル時間に飽和するが、量子系と重力理論の両方でよく定義されている複雑性は、初期と後期の両方における演算子の進化の有用な尺度として有用であることが示唆される。 The concepts of operator size and computational complexity play important roles in the study of quantum chaos and holographic duality because they help characterize the structure of time-evolving Heisenberg operators. It is particularly important to understand how these microscopically defined measures of complexity are related to notions of complexity defined in terms of a dual holographic geometry, such as complexity-volume (CV) duality. Here we study partially entangled thermal states in the Sachdev-Ye-Kitaev (SYK) model and their dual description in terms of operators inserted in the interior of a black hole in Jackiw-Teitelboim (JT) gravity. We compare a microscopic definition of complexity in the SYK model known as K-complexity to calculations using CV duality in JT gravity and find that both quantities show an exponential-to-linear growth behavior. We also calculate the growth of operator size under time evolution and find connections between size and complexity. While the notion of operator size saturates at the scrambling time, our study suggests that complexity, which is well defined in both quantum systems and gravity theories, can serve as a useful measure of operator evolution at both early and late times.	翻訳日:2023-05-04 19:40:25 公開日:2021-03-07
# 非線形光パラメトリック増幅器における時空間エンタングルメント Spatiotemporal entanglement in a noncollinear optical parametric amplifier ( http://arxiv.org/abs/2009.10511v2 ) ライセンス: Link先を確認	L. La Volpe, S. De, M. I. Kolobov, V. Parigi, C. Fabre, N. Treps, D. B. Horoshko	(参考訳) 超短パルスポンプを用いた単パス型非線形周波数縮退パラメトリックダウンコンバージョンにおける2本の絡み合った光の発生を理論的に検討した。本研究では, 時空間的固有値とそれに対応する固有値とを数値的に, 解析的に検索する。解析解は、曲線座標中のガウス関数によって場の合同スペクトル振幅をモデル化することによって得られる。この方法は非常に効率的であり,数値解とよく一致していることを示す。また、生成したビームの総帯域幅が十分に高い場合には、モーダル関数を空間的および時間的部分に分解することはできないが、ポンプを短くすることで強度を増大させることができる時空間結合を示す。 We theoretically investigate the generation of two entangled beams of light in the process of single-pass type-I noncollinear frequency degenerate parametric downconversion with an ultrashort pulsed pump. We find the spatio-temporal squeezing eigenmodes and the corresponding squeezing eigenvalues of the generated field both numerically and analytically. The analytical solution is obtained by modeling the joint spectral amplitude of the field by a Gaussian function in curvilinear coordinates. We show that this method is highly efficient and is in a good agreement with the numerical solution. We also reveal that when the total bandwidth of the generated beams is sufficiently high, the modal functions cannot be factored into a spatial and a temporal parts, but exhibit a spatio-temporal coupling, whose strength can be increased by shortening the pump.	翻訳日:2023-05-01 07:06:42 公開日:2021-03-07
# 量子位相スリップによる光子の非弾性散乱 Inelastic scattering of a photon by a quantum phase-slip ( http://arxiv.org/abs/2010.02099v2 ) ライセンス: Link先を確認	Roman Kuzmin, Nicholas Grabon, Nitish Mehta, Amir Burshtein, Moshe Goldstein, Manuel Houzet, Leonid I. Glazman, Vladimir E. Manucharyan	(参考訳) 単一光子の自然崩壊は、周波数範囲に関係なく自然界で悪名高い非効率過程である。高インピーダンス超伝導導波路における量子位相-滑り変動は、単一入射マイクロ波光子を、ほぼ単位確率で多数の低エネルギー光子に分割することができることを報告した。基礎となる非弾性光子-光子相互作用は非線形光学ではアナログを持たない。代わりに、測定された崩壊速度は、ルッティンガー液体の量子不純物の新しいモデルの枠組みにおいて調整可能なパラメータなしで説明される。この結果は、強相関系の物理学において重要な2次元境界場の臨界現象と回路量子電磁力学を結びつける。光子寿命データは、検証され有用な量子多体シミュレーションの珍しい例である。 Spontaneous decay of a single photon is a notoriously inefficient process in nature irrespective of the frequency range. We report that a quantum phase-slip fluctuation in high-impedance superconducting waveguides can split a single incident microwave photon into a large number of lower-energy photons with a near unit probability. The underlying inelastic photon-photon interaction has no analogs in non-linear optics. Instead, the measured decay rates are explained without adjustable parameters in the framework of a new model of a quantum impurity in a Luttinger liquid. Our result connects circuit quantum electrodynamics to critical phenomena in two-dimensional boundary quantum field theories, important in the physics of strongly-correlated systems. The photon lifetime data represents a rare example of verified and useful quantum many-body simulation.	翻訳日:2023-04-29 22:37:01 公開日:2021-03-07
# 磁気共鳴力センサを用いたスピン質量相互作用の探索 Searching spin-mass interaction using a diamagnetic levitated magnetic resonance force sensor ( http://arxiv.org/abs/2010.14199v3 ) ライセンス: Link先を確認	Fang Xiong, Tong Wu, Yingchun Leng, Rui Li, Changkui Duan, Xi Kong, Pu Huang, Zhengwei Li, Yu Gao, Xing Rong and Jiangfeng Du	(参考訳) 軸索状粒子(ALP)は、スピンと質量の間のエキゾチック相互作用を媒介する。近距離でのスピン質量力に対する最も感度の高いセンサの1つである浮揚マイクロメカニカル発振器に基づくalp探索実験を提案する。提案実験は、偏極電子スピンと磁気浮上マイクロスフィアとの間のスピン質量共鳴相互作用をテストする。電子スピンを周期的に反転させることで、非共鳴背景力からの汚染を除去することができる。浮遊マイクロオシレータは、現在の4 meVから0.4 eVの質量を持つALPの10^3$倍近い感度を期待できる。 Axion-like particles (ALPs) are predicted to mediate exotic interactions between spin and mass. We propose an ALP-searching experiment based on the levitated micromechanical oscillator, which is one of the most sensitive sensors for spin-mass forces at a short distance. The proposed experiment tests the spin-mass resonant interaction between the polarized electron spins and a diamagnetically levitated microsphere. By periodically flipping the electron spins, the contamination from nonresonant background forces can be eliminated. The levitated microoscillator can prospectively enhance the sensitivity by nearly $10^3$ times over current experiments for ALPs with mass in the range 4 meV to 0.4 eV.	翻訳日:2023-04-27 08:51:56 公開日:2021-03-07
# 擬似状態スキームによる多量子Rydberg量子論理ゲート Multiple-qubit Rydberg quantum logic gate via dressed-states scheme ( http://arxiv.org/abs/2010.14704v2 ) ライセンス: Link先を確認	Yucheng He, Jing-Xin Liu, F.-Q. Guo, Lei-Lei Yan, Ronghui Luo, Erjun Liang, Shi-Lei Su, M. Feng	(参考訳) 本稿では,ビタノフ式パルスと,ライドベルク原子の非断熱性(STA)の利点を組み合わせたマルチキュービット量子状態転送と量子論理ゲートを実現する手法を提案する。自発放出に対するスキームのロバスト性は、STA技術を通じてリドベルク励起状態の人口を減少させることによって達成できる。一方、制御誤差はよく設計されたパルスを使うことで最小化できる。さらに, この方式では, 量子状態伝達を高忠実度でスムーズにオン/オフし, 従来の断熱法の近道よりも高速に行うことができる。 Rydberg antiblockade (RAB) 効果を用いることで、パラメータの一般的な選択条件の下でマルチキュービットトフォリゲートを構築することができる。 We present a scheme to realize multiple-qubit quantum state transfer and quantum logic gate by combining the advantages of Vitanov-style pulses and dressed-state-based shortcut to adiabaticity (STA) in Rydberg atoms. The robustness of the scheme to spontaneous emission can be achieved by reducing the population of Rydberg excited states through the STA technology. Meanwhile, the control errors can be minimized through using the well-designed pulses. Moreover, the dressed-state method applied in the scheme makes the quantum state transfer more smoothly turned on or off with high fidelity and also faster than traditional shortcut to adiabaticity methods. By using Rydberg antiblockade (RAB) effect, the multiple-qubit Toffoli gate can be constructed under a general selection conditions of the parameters.	翻訳日:2023-04-27 06:31:51 公開日:2021-03-07
# 量子カーペットの単一から多体へのクロスオーバー Single- to many-body crossover of a quantum carpet ( http://arxiv.org/abs/2011.04582v2 ) ライセンス: Link先を確認	Maciej {\L}ebek, Piotr T. Grochowski, Kazimierz Rz\k{a}\.zewski	(参考訳) 量子カーペットパターンを示すボソンの強相互作用多体系をガウディン解を用いて正確に研究した。箱電位に閉じ込められた超低温ボソニックガス中の弱い原子間相互作用により、非相互作用の単体シナリオに通常存在するこの高一貫性設計が破壊されることを示す。しかし、システムがフェミオン化を受けると、非常に強く相互作用する体制で復活する。単体から多体へのクロスオーバー全体を追跡し,システム内に存在するデ・デファクト・デファレンスの分析を行う。 Strongly interacting many-body system of bosons exhibiting the quantum carpet pattern is investigated exactly by using Gaudin solutions. We show that this highly coherent design usually present in noninteracting, single-body scenarios gets destroyed by weak-to-moderate interatomic interactions in an ultracold bosonic gas trapped in a box potential. However, it becomes revived in a very strongly interacting regime, when the system undergoes fermionization. We track the whole single- to many-body crossover, providing an analysis of de- and rephasing present in the system.	翻訳日:2023-04-24 21:18:18 公開日:2021-03-07
# schr\"odinger--newton基底状態の漸近的崩壊について On the asymptotic decay of the Schr\"odinger--Newton ground state ( http://arxiv.org/abs/2101.01296v4 ) ライセンス: Link先を確認	Michael K.-H. Kiessling	(参考訳) 基底状態 $u(r)$ of the Schr\"odinger--Newton equation in $\mathbb{R}^3$ の漸近は V. Moroz と J. van Schaftingen によって$u(r) \sim A e^{-r}/ r^{1 - \\|u\\|_2^2/8\pi}$ と決定された。彼らは、$\\|u\\|_2^2$, squared $L^2$ norm of $u$ の値を残した。ここで、2^{1/3}3\pi^2\leq \\|u\\|_2^2\leq 2^{3}\pi^{3/2}$ が厳密に示される。数値的に$\\|u\\|_2^2\approx 14.03\pi$ と報告されており、$e^{-r}$ の単項プレファクタは、凹凸な方法で$r$ に増加する。シンガー-ニュートン方程式は外部の$\sim - K/r$電位を持ち、関連するボゾン原子やイオンのハートリー方程式に対しては漸近的な結果が提案される。 The asymptotics of the ground state $u(r)$ of the Schr\"odinger--Newton equation in $\mathbb{R}^3$ was determined by V. Moroz and J. van Schaftingen to be $u(r) \sim A e^{-r}/ r^{1 - \\|u\\|_2^2/8\pi}$ for some $A>0$, in units that fix the exponential rate to unity. They left open the value of $\\|u\\|_2^2$, the squared $L^2$ norm of $u$. Here it is rigorously shown that $2^{1/3}3\pi^2\leq \\|u\\|_2^2\leq 2^{3}\pi^{3/2}$. It is reported that numerically $\\|u\\|_2^2\approx 14.03\pi$, revealing that the monomial prefactor of $e^{-r}$ increases with $r$ in a concave manner. Asymptotic results are proposed for the Schr\"odinger--Newton equation with external $\sim - K/r$ potential, and for the related Hartree equation of a bosonic atom or ion.	翻訳日:2023-04-17 20:13:28 公開日:2021-03-07
# 変形モース様電位 Deformed Morse-like potential ( http://arxiv.org/abs/2101.09703v2 ) ライセンス: Link先を確認	I. A. Assi, A. D. Alhaidari and H. Bahlouli	(参考訳) 我々は、結合状態と共鳴状態の両方をサポートする完全可解な1次元ポテンシャルを導入する。このポテンシャルはよく知られた1次元モースポテンシャルの一般化であり、有限スペクトル特性を保存する変形を導入した。一方、ゼロ変形の極限において、このポテンシャルは A. D. Alhaidari が最近導入した指数的収束ポテンシャルに還元される。後者のポテンシャルは無限スペクトルをサポートするため、ゼロ変形極限は、我々の系が有限スペクトル極限から無限スペクトル極限へ遷移する臨界点である。対応するシュロディンガー方程式を解き、三対角表現法を用いてエネルギースペクトルと固有状態を得る。 We introduce an exactly solvable one-dimensional potential that supports both bound and/or resonance states. This potential is a generalization of the well-known 1D Morse potential where we introduced a deformation that preserves the finite spectrum property. On the other hand, in the limit of zero deformation, the potential reduces to the exponentially confining potential well introduced recently by A. D. Alhaidari. The latter potential supports infinite spectrum which means that the zero deformation limit is a critical point where our system will transition from the finite spectrum limit to the infinite spectrum limit. We solve the corresponding Schrodinger equation and obtain the energy spectrum and the eigenstates using the tridiagonal representation approach.	翻訳日:2023-04-14 02:31:21 公開日:2021-03-07
# ニューラルネットワークを用いた携帯機器の個人リスクプロファイリングによる多変量状況リスク評価に対する認知反応と感情応答の処理 Individual risk profiling for portable devices using a neural network to process the cognitive reactions and the emotional responses to a multivariate situational risk assessment ( http://arxiv.org/abs/2103.00441v2 ) ライセンス: Link先を確認	Frederic Jumelle, Kelvin So, and Didan Deng	(参考訳) 本稿では,認知と感情の関連性を確立するための,神経心理学的パフォーマンステストのための新しい方法とシステムを提案する。ユーザ情報をユーザ名の下に記憶し、携帯装置を介してユーザによってログインされるクラウドサービスと対話するために使用される携帯装置と、このユーザ情報をデバイスを介して直接キャプチャして、人工ニューラルネットワークで処理し、この3次元情報は、ユーザ認知反応、ユーザ感情応答及びユーザクロノメトリを含む。多変量状況リスクアセスメント(multivariate situational risk assessment)は、日常生活のさまざまな状況を記述する一連の30のディコトナス質問に対する各反応の3次元を捉え、ユーザの知識、価値観、倫理、原則に挑戦することで、被験者のパフォーマンスを評価するために使用される。産業用アプリケーションでは、この評価のタイミングは、銀行口座の開設、住宅ローンまたは保険契約の取得、職場での認可の認証、オンライン決済の確保など、提供者からのサービス取得の必要性に依存する。 In this paper, we are presenting a novel method and system for neuropsychological performance testing that can establish a link between cognition and emotion. It comprises a portable device used to interact with a cloud service which stores user information under username and is logged into by the user through the portable device; the user information is directly captured through the device and is processed by artificial neural network; and this tridimensional information comprises user cognitive reactions, user emotional responses and user chronometrics. The multivariate situational risk assessment is used to evaluate the performance of the subject by capturing the 3 dimensions of each reaction to a series of 30 dichotomous questions describing various situations of daily life and challenging the user's knowledge, values, ethics, and principles. In industrial application, the timing of this assessment will depend on the user's need to obtain a service from a provider such as opening a bank account, getting a mortgage or an insurance policy, authenticating clearance at work or securing online payments.	翻訳日:2023-04-09 16:43:02 公開日:2021-03-07
# 2ビットT状態におけるステアリングの必要十分基準 Necessary and sufficient criterion of steering for two-qubit T states ( http://arxiv.org/abs/2103.04280v1 ) ライセンス: Link先を確認	Xiao-Gang Fan, Huan Yang, Fei Ming, Xue-Ke Song, Dong Wang and Liu Ye	(参考訳) アインシュタイン・ポドルスキー・ローゼン(einstein-podolsky-rosen、epr)は、観測者が遠方の観測者に局所的な測定を行うことで絡み合いを共有するよう説得する能力である。量子状態の定式化は、まだ未解決の問題である。ここでは,任意の2量子T状態に対応する無限の測定値を持つ新しいステアリング不等式を,各面のN射影測定設定によるEPRステアリング不等式を考慮して導出した。実際、操舵の不等式は、T状態が操舵不能であることを保証するための十分な基準でもある。したがって、操舵不等式は、t状態が操舵可能か否かを識別するために必要な十分な基準と見なすことができる。ステアブル状態からなる集合が絡み合った状態からなる集合の厳密な部分集合であるという事実を明らかにするために、理論的にはすべての分離可能なT状態がステアリングの不等式に違反できないことを証明している。さらに,T状態が1/4を超える場合,T状態が評価可能であることを示すため,任意の2ビットT状態に対するコンカレンスからの最大違反を推定する手法を提案した。 Einstein-Podolsky-Rosen (EPR) steering is the ability that an observer persuades a distant observer to share entanglement by making local measurements. Determining a quantum state is steerable or unsteerable remains an open problem. Here, we derive a new steering inequality with infinite measurements corresponding to an arbitrary two-qubit T state, from consideration of EPR steering inequalities with N projective measurement settings for each side. In fact, the steering inequality is also a sufficient criterion for guaranteering that the T state is unsteerable. Hence, the steering inequality can be viewed as a necessary and sufficient criterion to distinguish whether the T state is steerable or unsteerable. In order to reveal the fact that the set composed of steerable states is the strict subset of the set made up of entangled states, we prove theoretically that all separable T states can not violate the steering inequality. Moreover, we put forward a method to estimate the maximum violation from concurrence for arbitrary two-qubit T states, which indicates that the T state is steerable if its concurrence exceeds 1/4.	翻訳日:2023-04-08 20:22:56 公開日:2021-03-07
# 内部量子非分離性と外部古典相関のトレードオフの観測 Observation of the tradeoff between internal quantum nonseparability and external classical correlations ( http://arxiv.org/abs/2103.04276v1 ) ライセンス: Link先を確認	Jie Zhu, Yue Dai, S. Camalet, Cheng-Jie Zhang, Bi-Heng Liu, Chuan-Feng Li, Guang-Can Guo, and Yong-Sheng Zhang	(参考訳) 絡み合いの単元関係は非常に重要である。しかし、それらは異なるサブシステムによって共有される絡み合いの量だけを含む。絡み合いと他の種類の相関、特に古典的相関の間の単元関係は、非常に少ない。ここでは、内部量子非分離性とフォトニック系の外部総相関のトレードオフ関係を実験的に観察し、純粋に古典的な外部相関でさえ内部非分離性に有害な影響を及ぼすことを示した。コンカレンスによって測定された非分離性は、同じ光子内の異なる自由度の間であり、標準量子相互情報によって測定された外部古典相関は、タイムビン法を用いて光子対の光子間で生成される。この結果から,システムの内部絡み合いを保つためには,システムと環境との間に,古典的相関を含む低い外部相関を維持する必要があることがわかった。 The monogamy relations of entanglement are highly significant. However, they involve only amounts of entanglement shared by different subsystems. Results on monogamy relations between entanglement and other kinds of correlations, and particularly classical correlations, are very scarce. Here we experimentally observe a tradeoff relation between internal quantum nonseparability and external total correlations in a photonic system and found that even purely classical external correlations have a detrimental effect on internal nonseparability. The nonseparability we consider, measured by the concurrence, is between different degrees of freedom within the same photon, and the external classical correlations, measured by the standard quantum mutual information, are generated between the photons of a photon pair using the time-bin method. Our observations show that to preserve the internal entanglement in a system, it is necessary to maintain low external correlations, including classical ones, between the system and its environment.	翻訳日:2023-04-08 20:22:34 公開日:2021-03-07
# 量子補間アンサンブル:平均エントロピーと直交多項式 Quantum interpolating ensemble: Average entropies and orthogonal polynomials ( http://arxiv.org/abs/2103.04231v1 ) ライセンス: Link先を確認	Lu Wei and Nicholas Witte	(参考訳) 密度行列形式は、量子情報処理における様々な問題を研究するための基本的な道具である。密度行列の空間において、最もよく知られ、物理的に関係のある尺度はヒルベルト=シュミットのアンサンブルとビュール=ハルのアンサンブルである。本研究では,量子補間アンサンブル(quantum interpolating ensemble)と呼ばれる密度行列の一般化アンサンブルを提案する。提案するアンサンブルを理解するための第一歩として,いくつかの最近の結果を一般化したアンサンブル上の絡み合いエントロピーの正確な平均式を導出する。また、対応する直交多項式のいくつかの重要な性質を導出し、エントロピーの他の統計情報を得る。数値実験の結果,量子状態の絡み合いの程度を推定するアンサンブルの有用性が示された。 The density matrix formalism is a fundamental tool in studying various problems in quantum information processing. In the space of density matrices, the most well-known and physically relevant measures are the Hilbert-Schmidt ensemble and the Bures-Hall ensemble. In this work, we propose a generalized ensemble of density matrices, termed quantum interpolating ensemble, which is able to interpolate between these two seemingly unrelated ensembles. As a first step to understand the proposed ensemble, we derive the exact mean formulas of entanglement entropies over such an ensemble generalizing several recent results in the literature. We also derive some key properties of the corresponding orthogonal polynomials relevant to obtaining other statistical information of the entropies. Numerical results demonstrate the usefulness of the proposed ensemble in estimating the degree of entanglement of quantum states.	翻訳日:2023-04-08 20:21:58 公開日:2021-03-07
# 中性子干渉計と重力の短距離修正試験 Neutron interferometry and tests of short-range modifications of gravity ( http://arxiv.org/abs/2103.04218v1 ) ライセンス: Link先を確認	J. M. Rocha and F. Dahia	(参考訳) 中性子干渉計による重力の短距離修正実験を, 大型余剰次元のシナリオで検討する。拡張源の内部重力ポテンシャルの計算における非計算可能性問題(ゼロ幅ブレーンモデルに典型的な)を避けるため、厚いブレーン理論の文脈で、入射中性子と物質媒体との間の高次元重力相互作用に関連する中性子光学ポテンシャルを決定する。このようにして、中性子干渉計が制約できる余剰次元モデルの物理量を特定する。また,位相シフト器を電場とする干渉計測実験を,Aharanov-Casher効果の試験として検討した。この実験は、この非バリロン源を用いて、圧力の容量と重力を生成する内部エネルギーを測定するポストニュートンパラメータの短距離挙動の試験とみなすことができる。 We consider tests of short-distance modifications of gravity based on neutron interferometry in the scenario of large extra dimensions. Avoiding the non-computability problem in the calculation of the internal gravitational potential of extended sources, typical of models with zero-width brane, we determine the neutron optical potential associated with the higher-dimension gravitational interaction between the incident neutron and a material medium in the context of thick brane theories. Proceeding this way, we identify the physical quantity of the extra dimension model that the neutron interferometry is capable of constraining. We also consider interferometric experiments in which the phase shifter is an electric field, as in the test of the Aharanov-Casher effect. We argue that this experiment, with this non-baryonic source, can be viewed as a test of the short-range behavior of Post-Newtonian parameters that measure the capacity of the pressure and the internal energy for producing gravity.	翻訳日:2023-04-08 20:21:43 公開日:2021-03-07
# 自由粒子球面波、擬似調和発振器、三重型ポテンシャルに対するdunkl微分を持つschr\"odinger方程式の厳密解 Exact solutions of the Schr\"odinger Equation with Dunkl Derivative for the Free-Particle Spherical Waves, the Pseudo-Harmonic Oscillator and the Mie-type Potential ( http://arxiv.org/abs/2103.04461v1 ) ライセンス: Link先を確認	R. D. Mota and D. Ojeda-Guill\'en	(参考訳) 我々は、自由粒子、擬調和振動子、三重型ポテンシャルに対するシュル=オディンガー方程式をダンケル微分で正確に3次元で解く。半径部と角部に関する方程式は、球座標と変数の分離を用いて得られる。これらのポテンシャルの波動関数とエネルギースペクトルは解析的な方法で導出され、dunkl微分パラメータを取り除いたときに報告された値に十分に減少することが示されている。 We solve exactly the Schr\"odinger equation for the free-particle, the pseudo-harmonic oscillator and the Mie-type potential in three dimensions with the Dunkl derivative. The equations for the radial and angular parts are obtained by using spherical coordinates and separation of variables. The wave functions and the energy spectrum for these potentials are derived in an analytical way and it is shown that our results are adequately reduced to those previously reported when we remove the Dunkl derivative parameters.	翻訳日:2023-04-08 20:17:35 公開日:2021-03-07
# インドニュースメディアの話題webページにおけるディファレンシャルトラッキング Differential Tracking Across Topical Webpages of Indian News Media ( http://arxiv.org/abs/2103.04442v1 ) ライセンス: Link先を確認	Yash Vekaria, Vibhor Agarwal, Pushkal Agarwal, Sangeeta Mahapatra, Sakthi Balan Muthiah, Nishanth Sastry, Nicolas Kourtellis	(参考訳) オンラインユーザーのプライバシーと追跡は近年広く研究されており、特にEUと米国におけるプライバシーと個人データに関する法律(General Data Protection Regulation、ePrivacy Regulation、California Consumer Privacy Act)によって研究されている。調査により、世界中のウェブサイトで第1および第3者が採用する新しい追跡方法と個人識別可能な情報漏洩方法、およびそのようなウェブサイトで実施される追跡の強度が明らかになった。しかし、ウェブの大部分をカバーするためにスケールするため、過去のほとんどの研究はウェブサイトのホームページに焦点をあて、トピックのサブページにおける追跡の慣行を深く見なかった。研究の大半は、EUや米国のようなグローバル・ノース市場に焦点を当てた。世界の人口の20%をカバーし、明確なプライバシー法を持たないインドのような大市場は、この点で研究されていない。これらのギャップに対処し、以下の研究課題に焦点をあてることを目的としている。インドのニュースサイトにおけるトピックのサブページの追跡は、彼らのホームページと異なるのか? サードパーティのトラッカーは特定のトピックを追跡するのを好むか? この選好は、これらのトピックのサブページで示される内容の類似性と比較してどうだろうか? そこで本研究では,これらの疑問に答えるべく,インドニュースのトピック・サブページをurlの詳細に基づいて自動抽出・分類する手法を提案する。特定したトピックのサブページを調査し,クッキー注入の強度やサードパーティの埋め込み度,タイプについてホームページと比較した。サブページ間、およびサブページとホームページ間で異なるユーザトラッキングを見つける。また、特定のトピックに対するサードパーティのトラッカーの優先的なアタッチメントも見つけました。また、組み込みサードパーティは特定のサブページを同時に追跡する傾向があり、ユーザのプロファイリングが実行可能である。 Online user privacy and tracking have been extensively studied in recent years, especially due to privacy and personal data-related legislations in the EU and the USA, such as the General Data Protection Regulation, ePrivacy Regulation, and California Consumer Privacy Act. Research has revealed novel tracking and personal identifiable information leakage methods that first- and third-parties employ on websites around the world, as well as the intensity of tracking performed on such websites. However, for the sake of scaling to cover a large portion of the Web, most past studies focused on homepages of websites, and did not look deeper into the tracking practices on their topical subpages. The majority of studies focused on the Global North markets such as the EU and the USA. Large markets such as India, which covers 20% of the world population and has no explicit privacy laws, have not been studied in this regard. We aim to address these gaps and focus on the following research questions: Is tracking on topical subpages of Indian news websites different from their homepage? Do third-party trackers prefer to track specific topics? How does this preference compare to the similarity of content shown on these topical subpages? To answer these questions, we propose a novel method for automatic extraction and categorization of Indian news topical subpages based on the details in their URLs. We study the identified topical subpages and compare them with their homepages with respect to the intensity of cookie injection and third-party embeddedness and type. We find differential user tracking among subpages, and between subpages and homepages. We also find a preferential attachment of third-party trackers to specific topics. Also, embedded third-parties tend to track specific subpages simultaneously, revealing possible user profiling in action.	翻訳日:2023-04-08 20:17:24 公開日:2021-03-07
# 非平衡グリーン関数から密度行列の量子マスター方程式と時間外相関式へ:定常状態と断熱力学 From non-equilibrium Green's functions to quantum master equations for the density matrix and out-of-time-order correlators: steady state and adiabatic dynamics ( http://arxiv.org/abs/2103.04373v1 ) ライセンス: Link先を確認	Bibek Bhandari, Rosario Fazio, Fabio Taddei and Liliana Arrachea	(参考訳) 低速駆動下での有限量子系と温度の異なる熱貯水池との弱結合を考える。本稿では,密度行列と時間外相関行列に対する量子マスター方程式の体系的導出について述べる。我々は顕微鏡ハミルトンから始まり、シュウィンガー・ケルディシュ非平衡グリーンの関数形式に関連付けてこれらの量の力学を支配づける方程式を定式化し、系と貯水池の間の結合の摂動展開を行う。本研究では, システム-貯留層結合による緩和時間と運転に伴う時間スケールとの比の線形応答を考慮した断熱力学に着目した。粒子とエネルギーのフラックスを計算しますボソニック貯水池に結合したクトリットとフェルミイオン貯水池に付随する一対の相互作用する量子ドットの場合の形式論を説明し、コヒーレント効果の関連性についても論じる。 We consider a finite quantum system under slow driving and weakly coupled to thermal reservoirs at different temperatures. We present a systematic derivation of the quantum master equation for the density matrix and the out-of-time-order correlators. We start from the microscopic Hamiltonian and we formulate the equations ruling the dynamics of these quantities by recourse to the Schwinger-Keldysh non-equilibrium Green's function formalism, performing a perturbative expansion in the coupling between the system and the reservoirs. We focus on the adiabatic dynamics, which corresponds to considering the linear response in the ratio between the relaxation time due to the system-reservoir coupling and the time scale associated to the driving. We calculate the particle and energy fluxes. We illustrate the formalism in the case of a qutrit coupled to bosonic reservoirs and of a pair of interacting quantum dots attached to fermionic reservoirs, also discussing the relevance of coherent effects.	翻訳日:2023-04-08 20:16:56 公開日:2021-03-07
# クライン空間の不確かさ原理 Uncertainty Principles in Krein Space ( http://arxiv.org/abs/2103.04372v1 ) ライセンス: Link先を確認	Sirous Homayouni and Angelo B. Mingarelli	(参考訳) 2つの一般非可換自己共役作用素間の不確かさ関係はクレイン空間で導かれる。これらの関係はすべてクレイン空間によって誘導される基本対称性作用素 $j$ を含み、これらの一般化された関係のいくつかは、問題の2つの作用素の反交換子、可換子、その他の様々な非線形関数を含んでいる。その結果、ヒルベルト空間上の非自己共役作用素のクラスが存在し、その可換作用素の非有界性は不確実性関係を意味する。すべての関係は、フォン・ノイマンらによってヒルベルト空間で定式化された古典的なハイゼンベルクの不確実性原理を含む。さらに、クレイン空間における作用素依存(非線形)可換不確かさ関係を導出する。 Uncertainty relations between two general non-commuting self-adjoint operators are derived in a Krein space. All of these relations involve a Krein space induced fundamental symmetry operator, $J$, while some of these generalized relations involve an anti-commutator, a commutator, and various other nonlinear functions of the two operators in question. As a consequence there exist classes of non-self-adjoint operators on Hilbert spaces such that the non-vanishing of their commutator implies an uncertainty relation. All relations include the classical Heisenberg uncertainty principle as formulated in Hilbert Space by Von Neumann and others. In addition, we derive an operator dependent (nonlinear) commutator uncertainty relation in Krein space.	翻訳日:2023-04-08 20:16:34 公開日:2021-03-07
# ハイブリッド量子アプリケーションには2つのオーケストレーションが必要:ソフトウェアアーキテクチャの観点から Hybrid Quantum Applications Need Two Orchestrations in Superposition: A Software Architecture Perspective ( http://arxiv.org/abs/2103.04320v1 ) ライセンス: Link先を確認	Frank Leymann, Johanna Barzen	(参考訳) 量子アプリケーションはしばしばハイブリッドであり、純粋な量子アルゴリズムの実装だけでなく、ワークフローやトポロジーを主要なアーティファクトとして、そして処理するデータから作られている。ワークフローとトポロジは現代の用語ではオーケストレーションと呼ばれる(しかし、全く異なる意味を持つ)ため、量子アプリケーションを実現するには2つのオーケストレーションが必要である。これらのオーケストレーション技術をスケッチし、非自明な量子アプリケーションの全体構造と、そのようなアプリケーションの実行環境の暗黙のアーキテクチャを明らかにします。 Quantum applications are most often hybrid, i.e. they are not only made of implementations of pure quantum algorithms but also of classical programs as well as workflows and topologies as key artifacts, and data they process. Since workflows and topologies are referred to as orchestrations in modern terminology (but with very different meaning), two orchestrations that go hand-in-hand are required to realize quantum applications. We motivate this by means of a non-trivial example, sketch these orchestration technologies and reveal the overall structure of nontrivial quantum applications as well as the implied architecture of a runtime environment for such applications.	翻訳日:2023-04-08 20:16:06 公開日:2021-03-07
# 古典力学と量子力学におけるパワーロー双対性 Power law duality in classical and quantum mechanics ( http://arxiv.org/abs/2103.04308v1 ) ライセンス: Link先を確認	Akira Inomata and Georg Junker	(参考訳) ニュートン-フック双対性と、古典的、半古典的、量子力学における任意のパワー則への一般化について論じる。我々は、パワーロー双対性は一連の双対演算の下での作用の対称性であるという考えを追求する。パワー双対対称性はハミルトンの特徴関数の形での作用の不変性と相互性によって定義される。パワーロー双対性は基本的に古典的な概念であり、角量子化のレベルで分解される。量子力学における双対対称性を保存するためのアドホック手順を提案する。エネルギー結合交換写像は、ある系を別の系に導く双対性演算の一部として必要であり、新しいエネルギーと古いエネルギーを関連付けるエネルギー公式に繋がる。放射状schr\"odinger方程式を満たす {the} グリーン関数の変換特性は、新しいグリーン関数を古い関数に関連付ける公式を与える。分数パワーポテンシャルにおける線形運動のエネルギースペクトルを半古典的に評価する。超対称半古典的作用におけるクーロン・フック双対性を示す方法を見出す。また,2項のパワーポテンシャルの双対構造の助けを借りて,閉じ込めポテンシャル問題についても検討する。 The Newton--Hooke duality and its generalization to arbitrary power laws in classical, semiclassical and quantum mechanics are discussed. We pursue a view that the power-law duality is a symmetry of the action under a set of duality operations. The power dual symmetry is defined by invariance and reciprocity of the action in the form of Hamilton's characteristic function. We find that the power-law duality is basically a classical notion and breaks down at the level of angular quantization. We propose an ad hoc procedure to preserve the dual symmetry in quantum mechanics. The energy-coupling exchange maps required as part of the duality operations that take one system to another lead to an energy formula that relates the new energy to the old energy. The transformation property of {the} Green function satisfying the radial Schr\"odinger equation yields a formula that relates the new Green function to the old one. The energy spectrum of the linear motion in a fractional power potential is semiclassically evaluated. We find a way to show the Coulomb--Hooke duality in the supersymmetric semiclassical action. We also study the confinement potential problem with the help of the dual structure of a two-term power potential.	翻訳日:2023-04-08 20:15:55 公開日:2021-03-07
# 異方性を有する高利得パラメトリックダウンコンバージョンにおける明るい相関双ビーム発生と放射整形 Bright correlated twin-beam generation and radiation shaping in high-gain parametric down-conversion with anisotropy ( http://arxiv.org/abs/2103.04305v1 ) ライセンス: Link先を確認	M. Riabinin, P. R. Sharapova, T. Meier	(参考訳) 非線形複屈折結晶における一軸異方性は非線形光学相互作用の効率を制限し、パラメトリックダウンコンバージョン(PDC)プロセスで生じる光の空間対称性を破る。したがって、この効果は通常望ましくないものであり、補償しなければならない。しかし、高利得は異方性の破壊的な役割を克服し、代わりに明るい2モード相関双ビームの生成に使うことができる。本研究では,強い異方性の存在下での明るい励起光の空間特性に関する厳密な理論的記述を提供する。本研究では, 単結晶および2結晶構造について検討し, 異方性による高利得で, 輝く相関したツインビームの発生を示す。生成した光のモード構造を探索し, 結晶間隔とともに異方性がどのように放射形成に利用できるかを示す。 Uniaxial anisotropy in nonlinear birefringent crystals limits the efficiency of nonlinear optical interactions and breaks the spatial symmetry of light generated in the parametric down-conversion (PDC) process. Therefore, this effect is usually undesirable and must be compensated for. However, high gain may be used to overcome the destructive role of anisotropy and instead to use it for the generation of bright two-mode correlated twin-beams. In this work, we provide a rigorous theoretical description of the spatial properties of bright squeezed light in the presence of strong anisotropy. We investigate a single-crystal and a two-crystal configuration and demonstrate the generation of bright correlated twin-beams in such systems at high gain due to anisotropy. We explore the mode structure of the generated light and show how anisotropy, together with crystal spacing, can be used for radiation shaping.	翻訳日:2023-04-08 20:15:38 公開日:2021-03-07
# ランダム性を用いた地域性, リアリズム, エルゴディダリティの決定 Using Randomness to decide among Locality, Realism and Ergodicity ( http://arxiv.org/abs/2001.01752v2 ) ライセンス: Link先を確認	Alejandro Hnilo	(参考訳) ループホールのない実験では、ベルの不平等の違反が観測されたとき、少なくとも3つの特徴のうちの1つが偽であることを示した。発見するために、または少なくとも指示を得るために実験が提案され、どれが偽であるかが示される。これは、パルス化されたベルのセットアップで見つからない一連の結果の速度の時間進化を記録することに基づいている。このような実験の結果は量子力学の基礎だけでなく重要なものとなる。例えば、基礎的な問題がまだ完全に決定されていなくても、量子認証ランダム数生成器の効率的な使用と、絡み合った状態を用いた量子鍵分布の安全性に即時的影響を与えることになる。 Loophole-free experiments have demonstrated that at least one of three features is false when the violation of Bell's inequalities is observed: Locality, Realism or (what is lesser known) Ergodicity. An experiment is proposed to find out, or at least to get an indication about, which one is false. It is based on recording the time evolution of the rate of series of outcomes that are found not-random in a pulsed Bell's setup. The results of such experiment would be important not only to the foundations of Quantum Mechanics. For, even if the foundational issue remained not fully decided, they would have immediate practical impact on the efficient use of quantum-certified Random Number Generators and the security of Quantum Key Distribution using entangled states.	翻訳日:2023-01-14 02:45:22 公開日:2021-03-07
# 選択的に分解されたデータによる因果推論 Causal Inference With Selectively Deconfounded Data ( http://arxiv.org/abs/2002.11096v4 ) ライセンス: Link先を確認	Kyra Gan, Andrew A. Li, Zachary C. Lipton, Sridhar Tayur	(参考訳) 標準的なコンバウンディンググラフと保存されていないコンファウンダリで生成されたデータのみを考えると、平均処理効果(ATE)は識別できない。 ATEを見積もるには、実践者はいずれかにしなければならない (a)非定型データの収集 b) 臨床試験を実施すること,又は (c) ATEを識別できるかもしれない因果グラフのさらなる性質を解明する。本稿では、ateを推定する際に、小型の非共役観測データセット(共同設立者不明)とともに、巨大共役観測データセット(共同設立者不明)を組み込むことの利点を考察する。理論的には, 待ち行列を所望の精度で推定するために必要なデコンストラクタデータの量を大幅に削減できる可能性が示唆された。さらに、遺伝学など一部のケースでは、再考してサンプルを分解する例も考えられる。既に観察されている)治療と結果に基づいて,これらの試料を積極的に選択することにより,試料の複雑さをさらに軽減できることを示す。我々の理論的および実証的な結果は、我々のアプローチの最悪の相対的な性能(例えば、自然ベンチマーク)が有界であり、ベストケースの利得は非有界であることを示す。最後に, がんの遺伝子変異に関連する大規模な実世界データセットを用いて, 選択的解凍の利点を実証する。 Given only data generated by a standard confounding graph with unobserved confounder, the Average Treatment Effect (ATE) is not identifiable. To estimate the ATE, a practitioner must then either (a) collect deconfounded data;(b) run a clinical trial; or (c) elucidate further properties of the causal graph that might render the ATE identifiable. In this paper, we consider the benefit of incorporating a large confounded observational dataset (confounder unobserved) alongside a small deconfounded observational dataset (confounder revealed) when estimating the ATE. Our theoretical results suggest that the inclusion of confounded data can significantly reduce the quantity of deconfounded data required to estimate the ATE to within a desired accuracy level. Moreover, in some cases -- say, genetics -- we could imagine retrospectively selecting samples to deconfound. We demonstrate that by actively selecting these samples based upon the (already observed) treatment and outcome, we can reduce sample complexity further. Our theoretical and empirical results establish that the worst-case relative performance of our approach (vs. a natural benchmark) is bounded while our best-case gains are unbounded. Finally, we demonstrate the benefits of selective deconfounding using a large real-world dataset related to genetic mutation in cancer.	翻訳日:2022-12-28 21:20:57 公開日:2021-03-07
# AraBERT:アラビア語理解のためのトランスフォーマーベースモデル AraBERT: Transformer-based Model for Arabic Language Understanding ( http://arxiv.org/abs/2003.00104v4 ) ライセンス: Link先を確認	Wissam Antoun, Fady Baly, Hazem Hajj	(参考訳) アラビア語は形態学的に豊かな言語であり、英語に比べて比較的資源が少なく、文法も乏しい。これらの制限から、感性分析(SA)、名前付きエンティティ認識(NER)、質問回答(QA)といったアラビア自然言語処理(NLP)タスクは、対処が非常に難しいことが証明されている。近年,トランスフォーマーベースモデルの増加に伴い,言語固有のBERTベースモデルは,非常に大きなコーパスで事前学習されているため,言語理解において非常に効率的であることが証明されている。これらのモデルは新しい標準を設定し、ほとんどのNLPタスクに対して最先端の結果を得ることができた。本稿では、BERTが英語で行ったのと同じ成功を追求するため、アラビア語に特化してBERTを事前訓練した。 AraBERTのパフォーマンスは、Googleや他の最先端アプローチの多言語BERTと比較される。その結果, AraBERTはアラビアのほとんどのNLPタスクで最先端の性能を達成できた。事前訓練されたアラバートモデルは https://github.com/aub-mind/arabert で公開されている。 The Arabic language is a morphologically rich language with relatively few resources and a less explored syntax compared to English. Given these limitations, Arabic Natural Language Processing (NLP) tasks like Sentiment Analysis (SA), Named Entity Recognition (NER), and Question Answering (QA), have proven to be very challenging to tackle. Recently, with the surge of transformers based models, language-specific BERT based models have proven to be very efficient at language understanding, provided they are pre-trained on a very large corpus. Such models were able to set new standards and achieve state-of-the-art results for most NLP tasks. In this paper, we pre-trained BERT specifically for the Arabic language in the pursuit of achieving the same success that BERT did for the English language. The performance of AraBERT is compared to multilingual BERT from Google and other state-of-the-art approaches. The results showed that the newly developed AraBERT achieved state-of-the-art performance on most tested Arabic NLP tasks. The pretrained araBERT models are publicly available on https://github.com/aub-mind/arabert hoping to encourage research and applications for Arabic NLP.	翻訳日:2022-12-28 02:24:23 公開日:2021-03-07
# Image Augmentation:Pixelの深部強化学習を定期的に行う Image Augmentation Is All You Need: Regularizing Deep Reinforcement Learning from Pixels ( http://arxiv.org/abs/2004.13649v4 ) ライセンス: Link先を確認	Ilya Kostrikov, Denis Yarats, Rob Fergus	(参考訳) 本研究では,標準モデルフリー強化学習アルゴリズムに適用可能な簡易データ拡張手法を提案し,補助的損失や事前学習を必要とせず,画素から直接堅牢な学習を可能にする。この手法は、コンピュータビジョンタスクでよく使われる入力摂動を利用して値関数を正規化する。 SAC(Soft Actor-Critic)のような既存のモデルレスアプローチでは、画像ピクセルからディープネットワークを効果的に訓練することはできない。しかし,拡張手法の追加により,SACの性能が劇的に向上し,DeepMindコントロールスイートの最先端性能に到達し,モデルベース(Dreamer, PlaNet, SLAC)メソッドを超越し,最近提案されたコントラスト学習(CURL)が実現した。我々のアプローチはモデルフリーの強化学習アルゴリズムと組み合わせることができ、わずかな修正しか必要としない。実装はhttps://sites.google.com/view/data-regularized-qで確認できる。 We propose a simple data augmentation technique that can be applied to standard model-free reinforcement learning algorithms, enabling robust learning directly from pixels without the need for auxiliary losses or pre-training. The approach leverages input perturbations commonly used in computer vision tasks to regularize the value function. Existing model-free approaches, such as Soft Actor-Critic (SAC), are not able to train deep networks effectively from image pixels. However, the addition of our augmentation method dramatically improves SAC's performance, enabling it to reach state-of-the-art performance on the DeepMind control suite, surpassing model-based (Dreamer, PlaNet, and SLAC) methods and recently proposed contrastive learning (CURL). Our approach can be combined with any model-free reinforcement learning algorithm, requiring only minor modifications. An implementation can be found at https://sites.google.com/view/data-regularized-q.	翻訳日:2022-12-08 21:58:14 公開日:2021-03-07
# Probing Paradigm: Probingの正確性にはタスク関連性があるか? Probing the Probing Paradigm: Does Probing Accuracy Entail Task Relevance? ( http://arxiv.org/abs/2005.00719v3 ) ライセンス: Link先を確認	Abhilasha Ravichander, Yonatan Belinkov, Eduard Hovy	(参考訳) ニューラルモデルはいくつかのNLPベンチマークで印象的な結果を得たが、言語タスクの実行に使用するメカニズムについてはほとんど理解されていない。このように、近年の注目は、ニューラルエンコーダが'プローブ'タスクのレンズを通して学んだ文表現の分析に向けられている。しかし、プローブによって発見された文表現にエンコードされた情報は、実際にモデルがタスクを実行するために実際にどの程度使われたのか? 本研究では,自然言語推論のケーススタディを通じて,モデルが訓練されたタスクに必要とされない場合でも,モデルが言語特性をエンコードすることを学ぶことができることを示す。さらに,事前学習された単語埋め込みが,学習タスク自体よりもこれらの特性をエンコーディングする上で重要な役割を担っていることも確認し,探索実験の設計における注意深い制御の重要性を強調した。最後に、制御された合成タスクのセットを通じて、ランダムノイズとしてデータに分散しても、モデルがこれらの特性をかなり高い確率レベルに符号化できることを示し、探索タスクにおける絶対的なクレームの解釈を疑問視する。 Although neural models have achieved impressive results on several NLP benchmarks, little is understood about the mechanisms they use to perform language tasks. Thus, much recent attention has been devoted to analyzing the sentence representations learned by neural encoders, through the lens of `probing' tasks. However, to what extent was the information encoded in sentence representations, as discovered through a probe, actually used by the model to perform its task? In this work, we examine this probing paradigm through a case study in Natural Language Inference, showing that models can learn to encode linguistic properties even if they are not needed for the task on which the model was trained. We further identify that pretrained word embeddings play a considerable role in encoding these properties rather than the training task itself, highlighting the importance of careful controls when designing probing experiments. Finally, through a set of controlled synthetic tasks, we demonstrate models can encode these properties considerably above chance-level even when distributed in the data as random noise, calling into question the interpretation of absolute claims on probing tasks.	翻訳日:2022-12-07 12:25:01 公開日:2021-03-07
# RISEプロジェクト:産業煙発生の認識 Project RISE: Recognizing Industrial Smoke Emissions ( http://arxiv.org/abs/2005.06111v8 ) ライセンス: Link先を確認	Yen-Chia Hsu, Ting-Hao 'Kenneth' Huang, Ting-Yao Hu, Paul Dille, Sean Prendi, Ryan Hoffman, Anastasia Tsuhlares, Jessica Pachuta, Randy Sargent, Illah Nourbakhsh	(参考訳) 産業用煙の排出は人間の健康に重大な影響を及ぼす。以前の研究では、煙を視覚的証拠として識別するコンピュータビジョン(CV)技術が規制当局の態度に影響を与え、市民に環境正義を追求する権限を与えることが示されている。しかし、既存のデータセットは、空気質の擁護を支援するために必要な堅牢なCVモデルをトレーニングするのに十分な品質や量ではない。産業煙発生の認識のための,最初の大規模ビデオデータセットRISEを紹介する。市民科学的なアプローチを採用し,地域コミュニティのメンバとコラボレーションして,ビデオクリップが煙を排出するかどうかを注釈する。当社のデータセットには、3つの産業施設を監視するカメラから19の異なるビューから12,567のクリップが含まれています。これらの昼間のクリップは、全4シーズンを含む2年間で30日間に及ぶ。ディープニューラルネットワークを用いて、強力なパフォーマンスベースラインを確立し、喫煙認識の課題を明らかにする実験を行った。調査はコミュニティからのフィードバックを議論し,データ分析の結果,市民科学者とクラウドワーカーを社会影響への人工知能の適用に統合する機会が示された。 Industrial smoke emissions pose a significant concern to human health. Prior works have shown that using Computer Vision (CV) techniques to identify smoke as visual evidence can influence the attitude of regulators and empower citizens to pursue environmental justice. However, existing datasets are not of sufficient quality nor quantity to train the robust CV models needed to support air quality advocacy. We introduce RISE, the first large-scale video dataset for Recognizing Industrial Smoke Emissions. We adopted a citizen science approach to collaborate with local community members to annotate whether a video clip has smoke emissions. Our dataset contains 12,567 clips from 19 distinct views from cameras that monitored three industrial facilities. These daytime clips span 30 days over two years, including all four seasons. We ran experiments using deep neural networks to establish a strong performance baseline and reveal smoke recognition challenges. Our survey study discussed community feedback, and our data analysis displayed opportunities for integrating citizen scientists and crowd workers into the application of Artificial Intelligence for Social Impact.	翻訳日:2022-12-03 13:17:43 公開日:2021-03-07
# 有限和最適化のための加速デュアル平均化による分散低減 Variance Reduction via Accelerated Dual Averaging for Finite-Sum Optimization ( http://arxiv.org/abs/2006.10281v4 ) ライセンス: Link先を確認	Chaobing Song, Yong Jiang and Yi Ma	(参考訳) 本稿では,VRADA(Accelerated Dual Averaging)と呼ばれる有限サム凸最適化の簡易かつ統一的な手法を提案する。一般的な凸設定と強い凸設定の両方において、vradaは$o(n\log\log n)$で$o\big(\frac{1}{n}\big)$-accurateの解を得ることができ、最もよく知られた結果である$o(n\log n)$がサンプル数である。一方、vrada は $\log\log n$ まで設定された一般凸の下限と一致し、両方のレジームにおける下限である $n\le \theta(\kappa)$ と $n\gg \kappa$ に一致し、ここで $\kappa$ は条件数を表す。最もよく知られた結果の改善と上記のすべての下位境界の同時マッチングに加えて、VRADAは一般的な凸と強い凸設定の両方に対してより統一的で単純化されたアルゴリズムの実装と収束解析を行う。 VRADAにおける新しい初期化戦略のような新しいアプローチは、独立した関心を持つかもしれない。実データセットの実験を通じて、大規模機械学習問題に対する既存の手法よりも優れたVRADA性能を示す。 In this paper, we introduce a simplified and unified method for finite-sum convex optimization, named \emph{Variance Reduction via Accelerated Dual Averaging (VRADA)}. In both general convex and strongly convex settings, VRADA can attain an $O\big(\frac{1}{n}\big)$-accurate solution in $O(n\log\log n)$ number of stochastic gradient evaluations which improves the best-known result $O(n\log n)$, where $n$ is the number of samples. Meanwhile, VRADA matches the lower bound of the general convex setting up to a $\log\log n$ factor and matches the lower bounds in both regimes $n\le \Theta(\kappa)$ and $n\gg \kappa$ of the strongly convex setting, where $\kappa$ denotes the condition number. Besides improving the best-known results and matching all the above lower bounds simultaneously, VRADA has more unified and simplified algorithmic implementation and convergence analysis for both the general convex and strongly convex settings. The underlying novel approaches such as the novel initialization strategy in VRADA may be of independent interest. Through experiments on real datasets, we show the good performance of VRADA over existing methods for large-scale machine learning problems.	翻訳日:2022-11-19 14:16:20 公開日:2021-03-07
# erdos goes neural:グラフの組合せ最適化のための教師なし学習フレームワーク Erdos Goes Neural: an Unsupervised Learning Framework for Combinatorial Optimization on Graphs ( http://arxiv.org/abs/2006.10643v4 ) ライセンス: Link先を確認	Nikolaos Karalias, Andreas Loukas	(参考訳) 組合せ最適化問題は、特にラベル付きインスタンスの欠如において、ニューラルネットワークにとって非常に難しい。本研究は, グラフ上のCO問題に対する教師なし学習フレームワークを提案する。 erdosの確率的手法に触発され、ニューラルネットワークを用いて集合上の確率分布をパラメトリゼーションする。ネットワークが好適に選択された損失に最適化された場合、学習された分布は、制御された確率で、組合せ問題の制約に従う低コストな積分解を含む。確率論的存在証明は、望ましい解をデコードするためにデランディマイズされる。本稿では,最大傾き問題に対する有効な解と局所グラフクラスタリングを実現するために,本手法の有効性を示す。本手法は,実データと合成ハードインスタンスの双方で競合する結果を得る。 Combinatorial optimization problems are notoriously challenging for neural networks, especially in the absence of labeled instances. This work proposes an unsupervised learning framework for CO problems on graphs that can provide integral solutions of certified quality. Inspired by Erdos' probabilistic method, we use a neural network to parametrize a probability distribution over sets. Crucially, we show that when the network is optimized w.r.t. a suitably chosen loss, the learned distribution contains, with controlled probability, a low-cost integral solution that obeys the constraints of the combinatorial problem. The probabilistic proof of existence is then derandomized to decode the desired solutions. We demonstrate the efficacy of this approach to obtain valid solutions to the maximum clique problem and to perform local graph clustering. Our method achieves competitive results on both real datasets and synthetic hard instances.	翻訳日:2022-11-19 12:57:40 公開日:2021-03-07
# 限定モデル容量下における選択型ダイナスタイル計画 Selective Dyna-style Planning Under Limited Model Capacity ( http://arxiv.org/abs/2007.02418v3 ) ライセンス: Link先を確認	Zaheer Abbas, Samuel Sokota, Erin J. Talvitie, Martha White	(参考訳) モデルベースの強化学習では、不完全な環境モデルによる計画が学習の進捗を損なう可能性がある。しかし、モデルが不完全である場合でも、計画に有用な情報を含む可能性がある。本稿では,不完全モデルの使用を選択的に検討する。エージェントは、モデルが役に立つが、それが有害なモデルの使用を控える状態空間の一部で計画すべきである。効果的な選択的計画機構は、有理不確実性、パラメータ不確実性、およびモデル不確実性から生じる予測不確実性の推定を必要とする。事前の作業は、選択計画のパラメータの不確実性に重点を置いてきた。本研究では,モデル不足の重要性を強調する。パラメータ不確実性を考慮した手法によって検出されるモデル不確かさと相補的なモデル不確かさから生じる予測的不確かさが,パラメータ不確かさとモデル不確かさの両方を考慮すれば,より有望な選択的計画の方向になる可能性が示唆された。 In model-based reinforcement learning, planning with an imperfect model of the environment has the potential to harm learning progress. But even when a model is imperfect, it may still contain information that is useful for planning. In this paper, we investigate the idea of using an imperfect model selectively. The agent should plan in parts of the state space where the model would be helpful but refrain from using the model where it would be harmful. An effective selective planning mechanism requires estimating predictive uncertainty, which arises out of aleatoric uncertainty, parameter uncertainty, and model inadequacy, among other sources. Prior work has focused on parameter uncertainty for selective planning. In this work, we emphasize the importance of model inadequacy. We show that heteroscedastic regression can signal predictive uncertainty arising from model inadequacy that is complementary to that which is detected by methods designed for parameter uncertainty, indicating that considering both parameter uncertainty and model inadequacy may be a more promising direction for effective selective planning than either in isolation.	翻訳日:2022-11-13 07:46:25 公開日:2021-03-07
# ECOCを用いた深部畳み込みニューラルネットワーク Deep Convolutional Neural Network Ensembles using ECOC ( http://arxiv.org/abs/2009.02961v2 ) ライセンス: Link先を確認	Sara Atito Ali Ahmed, Cemre Zor, Berrin Yanikoglu, Muhammad Awais, Josef Kittler	(参考訳) ディープニューラルネットワークは、画像理解を含む多くのアプリケーションにおいて意思決定システムの性能を高め、アンサンブルを構築することでさらなる成果を得ることができる。しかし、ネットワークのトレーニングに要する時間が非常に高く、あるいは得られる性能がそれほど重要でないため、ディープネットワークのアンサンブルを設計することは、しばしば有益ではない。本稿では,深層ネットワークのアンサンブル手法として使用する誤り訂正出力符号化(ecoc)フレームワークを分析し,精度・複雑さトレードオフに対処するための設計戦略を提案する。導入したECOC設計とアンサンブル平均化や勾配向上決定木などの最先端のアンサンブル技術との広範な比較研究を行う。さらに,最も高い分類性能を達成できるコンビネータ技術を提案する。 Deep neural networks have enhanced the performance of decision making systems in many applications including image understanding, and further gains can be achieved by constructing ensembles. However, designing an ensemble of deep networks is often not very beneficial since the time needed to train the networks is very high or the performance gain obtained is not very significant. In this paper, we analyse error correcting output coding (ECOC) framework to be used as an ensemble technique for deep networks and propose different design strategies to address the accuracy-complexity trade-off. We carry out an extensive comparative study between the introduced ECOC designs and the state-of-the-art ensemble techniques such as ensemble averaging and gradient boosting decision trees. Furthermore, we propose a combinatory technique which is shown to achieve the highest classification performance amongst all.	翻訳日:2022-10-21 02:20:18 公開日:2021-03-07
# 制約付き最小方形の準最適性と非線形予測器による改善 Suboptimality of Constrained Least Squares and Improvements via Non-Linear Predictors ( http://arxiv.org/abs/2009.09304v2 ) ライセンス: Link先を確認	Tomas Va\v{s}kevi\v{c}ius and Nikita Zhivotovskiy	(参考訳) 本研究では, 2乗損失に関して有界ユークリッド球における最適線形予測器と同様に予測の問題について検討する。データ生成分布の有界性のみを仮定すると、有界ユークリッド球に制約された最小二乗推定器は古典的$o(d/n)$ の過剰なリスク率に達しない、ただし、$d$ は共変数の次元であり、$n$ はサンプルの数である。特に、制約付き最小二乗推定器が$\omega(d^{3/2}/n)$の過剰なリスクを負うような有界分布を構成するので、ohad shamir [jmlr 2015] の最近の予想を反論する。対照的に、非線形予測器は共変量の分布を仮定せずに最適なレートであるo(d/n)$を達成することができる。最小二乗推定器に対する$O(d/n)$過剰リスク率を保証するのに十分な分布仮定について論じる。それらのなかには、ロバストな統計文献でよく使われるあるモーメント同値仮定がある。このような仮定は、境界のない設定と重み付き設定の分析の中心であるが、我々の研究は、いくつかのケースでは、不利な境界付き分布も除外していることを示している。 We study the problem of predicting as well as the best linear predictor in a bounded Euclidean ball with respect to the squared loss. When only boundedness of the data generating distribution is assumed, we establish that the least squares estimator constrained to a bounded Euclidean ball does not attain the classical $O(d/n)$ excess risk rate, where $d$ is the dimension of the covariates and $n$ is the number of samples. In particular, we construct a bounded distribution such that the constrained least squares estimator incurs an excess risk of order $\Omega(d^{3/2}/n)$ hence refuting a recent conjecture of Ohad Shamir [JMLR 2015]. In contrast, we observe that non-linear predictors can achieve the optimal rate $O(d/n)$ with no assumptions on the distribution of the covariates. We discuss additional distributional assumptions sufficient to guarantee an $O(d/n)$ excess risk rate for the least squares estimator. Among them are certain moment equivalence assumptions often used in the robust statistics literature. While such assumptions are central in the analysis of unbounded and heavy-tailed settings, our work indicates that in some cases, they also rule out unfavorable bounded distributions.	翻訳日:2022-10-16 21:12:17 公開日:2021-03-07
# BFloat16トレーニングの見直し Revisiting BFloat16 Training ( http://arxiv.org/abs/2010.06192v2 ) ライセンス: Link先を確認	Pedram Zamirai, Jian Zhang, Christopher R. Aberger, Christopher De Sa	(参考訳) 最先端の汎用的低精度トレーニングアルゴリズムは16ビットと32ビットの精度を混合し、16ビットのハードウェア演算ユニットだけではモデルの精度を最大化できないという伝承を生み出した。その結果、深層学習アクセラレータは16ビット浮動小数点ユニット(FPU)と32ビット浮動小数点ユニット(FPU)の両方をサポートせざるを得なくなった。私たちは、深層学習モデルを16ビット浮動小数点ユニットでのみトレーニングできますが、32ビットのトレーニングで得られたモデルの精度は相変わらず一致しますか? そこで我々は,広く採用されているBFloat16ユニットの16ビットFPUトレーニングについて検討した。これらのユニットは従来16ビットの精度で出力を出力するために最も近い丸めを用いるが、モデルウェイト更新の最も近い丸めは、しばしば小さな更新をキャンセルし、収束とモデルの精度を低下させる。そこで本研究では,16ビットFPUトレーニングにおけるモデル精度劣化の軽減を目的とした,数値解析,確率的ラウンドリング,カハン和の2つの簡単な手法について検討した。この2つの手法により、16ビットfpuトレーニングで最大7%の絶対検証精度が得られることを示す。これにより、7つのディープラーニングアプリケーションにわたる32ビットトレーニングと比較して、0.1%から0.2%の検証精度が向上する。 State-of-the-art generic low-precision training algorithms use a mix of 16-bit and 32-bit precision, creating the folklore that 16-bit hardware compute units alone are not enough to maximize model accuracy. As a result, deep learning accelerators are forced to support both 16-bit and 32-bit floating-point units (FPUs), which is more costly than only using 16-bit FPUs for hardware design. We ask: can we train deep learning models only with 16-bit floating-point units, while still matching the model accuracy attained by 32-bit training? Towards this end, we study 16-bit-FPU training on the widely adopted BFloat16 unit. While these units conventionally use nearest rounding to cast output to 16-bit precision, we show that nearest rounding for model weight updates often cancels small updates, which degrades the convergence and model accuracy. Motivated by this, we study two simple techniques well-established in numerical analysis, stochastic rounding and Kahan summation, to remedy the model accuracy degradation in 16-bit-FPU training. We demonstrate that these two techniques can enable up to 7% absolute validation accuracy gain in 16-bit-FPU training. This leads to 0.1% lower to 0.2% higher validation accuracy compared to 32-bit training across seven deep learning applications.	翻訳日:2022-10-07 22:53:06 公開日:2021-03-07
# マルチエージェントlqrのデコンポーザビリティと並列計算 Decomposability and Parallel Computation of Multi-Agent LQR ( http://arxiv.org/abs/2010.08615v2 ) ライセンス: Link先を確認	Gangshan Jing, He Bai, Jemin George, Aranya Chakrabortty	(参考訳) マルチエージェントシステム(mas)内の個々のエージェントは、オープンループダイナミクスを分離するかもしれないが、協調制御の目的は通常、結合したクローズドループダイナミクスをもたらすので、制御設計は計算コストがかかる。エージェントのダイナミクスが分かっていない状況に対処するために強化学習(rl)のような学習戦略を適用する必要がある場合、計算時間がさらに高くなる。この問題を解決するために、連続時間線形MASにおける線形二次レギュレータ(LQR)設計のための並列RLスキームを提案する。この考え方は、LQRの目的に、$Q$と$R$の重み付け行列に埋め込まれた2つのグラフの構造特性を利用して、元のLQR設計を複数の分離された小さなLQR設計に変換する直交変換を定義することである。我々は、MAS が均質であれば、この分解は閉ループ最適性を保持することを示す。非均質なmasに適用した場合の分解性条件、変換行列を構成するアルゴリズム、並列rlアルゴリズム、ロバスト性解析について述べる。シミュレーションにより,本手法はlqrコストの累積値を失うことなく,学習の大幅な高速化を保証できることが示された。 Individual agents in a multi-agent system (MAS) may have decoupled open-loop dynamics, but a cooperative control objective usually results in coupled closed-loop dynamics thereby making the control design computationally expensive. The computation time becomes even higher when a learning strategy such as reinforcement learning (RL) needs to be applied to deal with the situation when the agents dynamics are not known. To resolve this problem, we propose a parallel RL scheme for a linear quadratic regulator (LQR) design in a continuous-time linear MAS. The idea is to exploit the structural properties of two graphs embedded in the $Q$ and $R$ weighting matrices in the LQR objective to define an orthogonal transformation that can convert the original LQR design to multiple decoupled smaller-sized LQR designs. We show that if the MAS is homogeneous then this decomposition retains closed-loop optimality. Conditions for decomposability, an algorithm for constructing the transformation matrix, a parallel RL algorithm, and robustness analysis when the design is applied to non-homogeneous MAS are presented. Simulations show that the proposed approach can guarantee significant speed-up in learning without any loss in the cumulative value of the LQR cost.	翻訳日:2022-10-06 21:48:37 公開日:2021-03-07
# ポータブルx線装置を用いたcovid-19患者肺分画の多段階トランスファー学習 Multi-stage transfer learning for lung segmentation using portable X-ray devices for patients with COVID-19 ( http://arxiv.org/abs/2011.00133v2 ) ライセンス: Link先を確認	Pl\'acido L Vidal, Joaquim de Moura, Jorge Novo, Marcos Ortega	(参考訳) 衛生上の緊急時の大きな課題の1つは、新規性、ケースの複雑さ、そしてその実装の緊急性により、利用可能なサンプル数が少ないコンピュータ支援診断システムを迅速に開発することである。新型コロナウイルス(COVID-19)のパンデミックの背景にある。この病原体は、主に合併症の呼吸器系に感染し、肺炎と急性呼吸窮迫症候群の重篤な症例を引き起こす。これにより、胸部x線を用いて検出できる肺の異なる病理構造が形成される。医療サービスの過負荷により、パンデミックの間は携帯型X線デバイスが推奨され、病気の拡散を防いでいる。しかし、これらの装置は、臨床医の主観性とともに診断過程をより困難にし、利用可能なサンプルの不足にもかかわらずコンピュータ支援診断手法の必要性を示唆する、様々な合併症(キャプチャー品質など)を伴っている。そこで本研究では,サンプル数の多いよく知られたドメインからの知識を,比較的少ない数でより複雑な新しいドメインに適応させる手法を提案する。非関連病理の脳磁気共鳴画像から事前訓練した分画モデルを利用し, 2段階の知識伝達を行い, 試料の不足と品質の低下にもかかわらず, 携帯型x線装置から肺領域を分画できる頑健なシステムを得た。この方法では、covid-19患者に$0.9761 \pm 0.0100$、正常患者に$0.9801 \pm 0.0104$、covid-19(肺炎など)と似ているが本物のcovid-19ではない肺疾患患者に$0.9769 \pm 0.0111$という満足な精度を得た。 One of the main challenges in times of sanitary emergency is to quickly develop computer aided diagnosis systems with a limited number of available samples due to the novelty, complexity of the case and the urgency of its implementation. This is the case during the current pandemic of COVID-19. This pathogen primarily infects the respiratory system of the afflicted, resulting in pneumonia and in a severe case of acute respiratory distress syndrome. This results in the formation of different pathological structures in the lungs that can be detected by the use of chest X-rays. Due to the overload of the health services, portable X-ray devices are recommended during the pandemic, preventing the spread of the disease. However, these devices entail different complications (such as capture quality) that, together with the subjectivity of the clinician, make the diagnostic process more difficult and suggest the necessity for computer-aided diagnosis methodologies despite the scarcity of samples available to do so. To solve this problem, we propose a methodology that allows to adapt the knowledge from a well-known domain with a high number of samples to a new domain with a significantly reduced number and greater complexity. We took advantage of a pre-trained segmentation model from brain magnetic resonance imaging of a unrelated pathology and performed two stages of knowledge transfer to obtain a robust system able to segment lung regions from portable X-ray devices despite the scarcity of samples and lesser quality. This way, our methodology obtained a satisfactory accuracy of $0.9761 \pm 0.0100$ for patients with COVID-19, $0.9801 \pm 0.0104$ for normal patients and $0.9769 \pm 0.0111$ for patients with pulmonary diseases with similar characteristics as COVID-19 (such as pneumonia) but not genuine COVID-19.	翻訳日:2022-10-01 17:20:23 公開日:2021-03-07
# 協調物体定位のためのモデルベース推定とグラフ学習の統合による多視点センサ融合 Multi-view Sensor Fusion by Integrating Model-based Estimation and Graph Learning for Collaborative Object Localization ( http://arxiv.org/abs/2011.07704v2 ) ライセンス: Link先を確認	Peng Gao, Rui Guo, Hongsheng Lu and Hao Zhang	(参考訳) コラボレーティブなオブジェクトローカライゼーションは、複数の視点や視点から観察されたオブジェクトの位置を協調的に推定することを目的としている。協調的なローカライゼーションを実現するために,複数のモデルに基づく状態推定と学習に基づくローカライゼーション手法を開発した。モデルに基づく状態推定には、しばしば複数のオブジェクト間の複雑な関係をモデル化する能力が欠けている。本稿では,グラフ学習とモデルに基づく推定を統合し,協調的物体定位のための多視点センサ融合を行う,時空間グラフフィルタ手法を提案する。提案手法は,新しい時空間グラフ表現を用いて複雑なオブジェクト関係をモデル化し,不確実性の下での位置推定を改善するためにベイズ方式で多視点観測を融合する。我々は、コネクテッド・自律運転と複数の歩行者位置決めの応用におけるアプローチを評価する。実験の結果,提案手法は従来の手法よりも優れており,コラボレーションのローカライゼーションにおける最先端のパフォーマンスを達成していることがわかった。 Collaborative object localization aims to collaboratively estimate locations of objects observed from multiple views or perspectives, which is a critical ability for multi-agent systems such as connected vehicles. To enable collaborative localization, several model-based state estimation and learning-based localization methods have been developed. Given their encouraging performance, model-based state estimation often lacks the ability to model the complex relationships among multiple objects, while learning-based methods are typically not able to fuse the observations from an arbitrary number of views and cannot well model uncertainty. In this paper, we introduce a novel spatiotemporal graph filter approach that integrates graph learning and model-based estimation to perform multi-view sensor fusion for collaborative object localization. Our approach models complex object relationships using a new spatiotemporal graph representation and fuses multi-view observations in a Bayesian fashion to improve location estimation under uncertainty. We evaluate our approach in the applications of connected autonomous driving and multiple pedestrian localization. Experimental results show that our approach outperforms previous techniques and achieves the state-of-the-art performance on collaboration localization.	翻訳日:2022-09-25 01:01:05 公開日:2021-03-07
# 連続遷移:ミックスアップによる連続制御問題に対するサンプル効率の改善 Continuous Transition: Improving Sample Efficiency for Continuous Control Problems via MixUp ( http://arxiv.org/abs/2011.14487v2 ) ライセンス: Link先を確認	Junfan Lin, Zhongzhan Huang, Keze Wang, Xiaodan Liang, Weiwei Chen, and Liang Lin	(参考訳) 深部強化学習(RL)は様々なロボット制御タスクにうまく適用されているが、サンプル効率の低さから現実のタスクに応用することは依然として困難である。この欠点を克服しようと、いくつかの研究は、訓練中に収集された軌跡データを政策に関係のない離散的な遷移に分解することで再利用することに焦点を当てた。しかし、その改善は (i) 遷移の量は通常小さく、 (i) 値の割り当ては結合状態でのみ発生するため、多少限界がある。これらの問題に対処するため,本論文では,経路に沿った潜在的な遷移を利用して軌道情報を利用する連続遷移を構築するための簡潔かつ強力な手法を提案する。具体的には,連続的な遷移を線形補間することにより,学習のための新しい遷移を合成する。構築された遷移を本物に保つために、我々は、自動的に構築プロセスを導く識別器も開発する。提案手法は, MuJoCo の複雑な連続ロボット制御問題に対して, サンプル効率を大幅に向上し, モデルベース/モデルフリー RL 法より優れていることを示す。ソースコードは利用可能である。 Although deep reinforcement learning (RL) has been successfully applied to a variety of robotic control tasks, it's still challenging to apply it to real-world tasks, due to the poor sample efficiency. Attempting to overcome this shortcoming, several works focus on reusing the collected trajectory data during the training by decomposing them into a set of policy-irrelevant discrete transitions. However, their improvements are somewhat marginal since i) the amount of the transitions is usually small, and ii) the value assignment only happens in the joint states. To address these issues, this paper introduces a concise yet powerful method to construct Continuous Transition, which exploits the trajectory information by exploiting the potential transitions along the trajectory. Specifically, we propose to synthesize new transitions for training by linearly interpolating the consecutive transitions. To keep the constructed transitions authentic, we also develop a discriminator to guide the construction process automatically. Extensive experiments demonstrate that our proposed method achieves a significant improvement in sample efficiency on various complex continuous robotic control problems in MuJoCo and outperforms the advanced model-based / model-free RL methods. The source code is available.	翻訳日:2021-06-06 14:39:15 公開日:2021-03-07
# オブジェクト認識のための解釈可能なグラフカプセルネットワーク Interpretable Graph Capsule Networks for Object Recognition ( http://arxiv.org/abs/2012.01674v3 ) ライセンス: Link先を確認	Jindong Gu and Volker Tresp	(参考訳) Convolutional Neural Networksに代わるCapsule Networksは、画像からオブジェクトを認識するために提案されている。現在の文献は、CNNに対するCapsNetsの多くの利点を示している。しかし,capsnetsの個別分類についての説明は十分に検討されていない。主にcnnに基づく分類を説明するために広く用いられ、活性化値と対応する勾配(例えばgrad-cam)を組み合わせることで塩分マップを説明する。これらのsaliencyメソッドは、下位の分類器の特定のアーキテクチャを必要とし、繰り返しルーティング機構のため、capsnetsに自明に適用できない。解釈可能性の欠如を克服するために、CapsNetsの新しいポストホック解釈手法を提案するか、ビルトインの説明を持つようにモデルを変更できる。本研究では後者について検討する。具体的には,多面的注意に基づくグラフプーリングアプローチでルーティング部を置き換える,解釈可能なグラフカプセルネットワーク(gracapsnets)を提案する。提案モデルでは,個々の分類説明を効果的かつ効率的に作成することができる。当社のモデルは,CapsNetsの基本部分を置き換えたとしても,予期せぬメリットも示しています。 gracapsnetsは、capsnetsと比較して、パラメータが少なく、敵のロバスト性も向上しています。さらに、gracapsnetsはcapsnetsの他の利点、すなわち不等角表現とアフィン変換のロバスト性も持っている。 Capsule Networks, as alternatives to Convolutional Neural Networks, have been proposed to recognize objects from images. The current literature demonstrates many advantages of CapsNets over CNNs. However, how to create explanations for individual classifications of CapsNets has not been well explored. The widely used saliency methods are mainly proposed for explaining CNN-based classifications; they create saliency map explanations by combining activation values and the corresponding gradients, e.g., Grad-CAM. These saliency methods require a specific architecture of the underlying classifiers and cannot be trivially applied to CapsNets due to the iterative routing mechanism therein. To overcome the lack of interpretability, we can either propose new post-hoc interpretation methods for CapsNets or modifying the model to have build-in explanations. In this work, we explore the latter. Specifically, we propose interpretable Graph Capsule Networks (GraCapsNets), where we replace the routing part with a multi-head attention-based Graph Pooling approach. In the proposed model, individual classification explanations can be created effectively and efficiently. Our model also demonstrates some unexpected benefits, even though it replaces the fundamental part of CapsNets. Our GraCapsNets achieve better classification performance with fewer parameters and better adversarial robustness, when compared to CapsNets. Besides, GraCapsNets also keep other advantages of CapsNets, namely, disentangled representations and affine transformation robustness.	翻訳日:2021-05-23 14:57:28 公開日:2021-03-07
# 学習画像圧縮の従来型コーデックへの転送可能性の活用法 How to Exploit the Transferability of Learned Image Compression to Conventional Codecs ( http://arxiv.org/abs/2012.01874v2 ) ライセンス: Link先を確認	Jan P. Klopp, Keng-Chi Liu, Liang-Gee Chen, Shao-Yi Chien	(参考訳) 損失画像圧縮は、選択された損失測度の単純さによってしばしば制限される。近年の研究では、生成的敵ネットワークは、この制限を克服し、特にテクスチャにおいてマルチモーダル損失として機能する能力を持っていることが示唆されている。学習した画像圧縮とともに、この2つのテクニックは、一般的に使われる歪みの厳密な尺度を緩和する際に大きな効果を発揮する。しかし、畳み込みニューラルネットワークに基づくアルゴリズムは計算フットプリントが大きい。理想的には、既存のコーデックはそのままであり、より高速な採用とバランスの取れた計算エンベロープへの付着を保証する。本研究は,この目標への道筋として,学習した画像の符号化を代用して,画像の符号化を最適化する手法を提案する。画像は学習したフィルタによって変更され、異なるパフォーマンス指標や特定のタスクに最適化される。このアイデアを生成的敵ネットワークで拡張すると、テクスチャ全体がエンコードするコストが低く、詳細さを保っているものに置き換えられることを示す。提案手法は,従来のコーデックを改造して,デコードオーバーヘッドを必要とせず,20%以上のレート改善でms-ssim歪みを調整できる。タスク認識画像圧縮では、類似するがコーデック特有のアプローチに対して好適に実行する。 Lossy image compression is often limited by the simplicity of the chosen loss measure. Recent research suggests that generative adversarial networks have the ability to overcome this limitation and serve as a multi-modal loss, especially for textures. Together with learned image compression, these two techniques can be used to great effect when relaxing the commonly employed tight measures of distortion. However, convolutional neural network based algorithms have a large computational footprint. Ideally, an existing conventional codec should stay in place, which would ensure faster adoption and adhering to a balanced computational envelope. As a possible avenue to this goal, in this work, we propose and investigate how learned image coding can be used as a surrogate to optimize an image for encoding. The image is altered by a learned filter to optimise for a different performance measure or a particular task. Extending this idea with a generative adversarial network, we show how entire textures are replaced by ones that are less costly to encode but preserve sense of detail. Our approach can remodel a conventional codec to adjust for the MS-SSIM distortion with over 20% rate improvement without any decoding overhead. On task-aware image compression, we perform favourably against a similar but codec-specific approach.	翻訳日:2021-05-23 14:44:20 公開日:2021-03-07
# (参考訳) CTR予測における細粒度特徴学習のためのマルチインタラクティブ注意ネットワーク Multi-Interactive Attention Network for Fine-grained Feature Learning in CTR Prediction ( http://arxiv.org/abs/2012.06968v2 ) ライセンス: CC BY 4.0	Kai Zhang, Hao Qian, Qing Cui, Qi Liu, Longfei Li, Jun Zhou, Jianhui Ma, Enhong Chen	(参考訳) クリックスルー率(ctr)予測シナリオでは、ユーザのシーケンシャルな動作を利用して、最近の文献に対するユーザの興味を捉えている。しかし、広く研究されているにもかかわらず、これらのシーケンシャルな方法には3つの制限がある。まず,CTRの予測に必ずしも適さないユーザの行動に注意を払っている。なぜなら,ユーザーは過去の行動とは無関係な新製品をクリックすることが多いからだ。第二に、現実のシナリオでは、昔から多くのユーザが存在しますが、近年では比較的アクティブではありません。したがって、初期の動作でユーザの現在の好みを正確に把握することは困難である。第3に、異なる特徴部分空間におけるユーザの歴史的行動の複数の表現は無視される。これらの問題を解消するために,様々なきめ細かい特徴(例えば,性別,年齢,職業など)の潜伏関係を包括的に抽出するMulti-Interactive Attention Network (MIAN)を提案する。具体的には、MIL(Multi-Interactive Layer)を3つのローカルなインタラクションモジュールに統合し、シーケンシャルな振る舞いを通じてユーザ好みの複数の表現をキャプチャし、きめ細かいユーザ固有の情報とコンテキスト情報を同時に利用する。さらに、高次相互作用を学習し、複数の特徴の異なる影響のバランスをとるために、Global Interaction Module (GIM) を設計する。最後に、Offline実験は、大規模レコメンデーションシステムにおけるオンラインA/Bテストとともに、3つのデータセットから行われ、提案手法の有効性を実証した。 In the Click-Through Rate (CTR) prediction scenario, user's sequential behaviors are well utilized to capture the user interest in the recent literature. However, despite being extensively studied, these sequential methods still suffer from three limitations. First, existing methods mostly utilize attention on the behavior of users, which is not always suitable for CTR prediction, because users often click on new products that are irrelevant to any historical behaviors. Second, in the real scenario, there exist numerous users that have operations a long time ago, but turn relatively inactive in recent times. Thus, it is hard to precisely capture user's current preferences through early behaviors. Third, multiple representations of user's historical behaviors in different feature subspaces are largely ignored. To remedy these issues, we propose a Multi-Interactive Attention Network (MIAN) to comprehensively extract the latent relationship among all kinds of fine-grained features (e.g., gender, age and occupation in user-profile). Specifically, MIAN contains a Multi-Interactive Layer (MIL) that integrates three local interaction modules to capture multiple representations of user preference through sequential behaviors and simultaneously utilize the fine-grained user-specific as well as context information. In addition, we design a Global Interaction Module (GIM) to learn the high-order interactions and balance the different impacts of multiple features. Finally, Offline experiment results from three datasets, together with an Online A/B test in a large-scale recommendation system, demonstrate the effectiveness of our proposed approach.	翻訳日:2021-05-09 19:42:47 公開日:2021-03-07
# リアルタイムユーザクリックによる画像マッチングの改善と不確かさ推定 Improved Image Matting via Real-time User Clicks and Uncertainty Estimation ( http://arxiv.org/abs/2012.08323v2 ) ライセンス: Link先を確認	Tianyi Wei, Dongdong Chen, Wenbo Zhou, Jing Liao, Hanqing Zhao, Weiming Zhang, Nenghai Yu	(参考訳) 画像マッチングはコンピュータビジョンとグラフィックスの基本的な問題である。既存の畳み込み方式の多くは、優れたアルファマットを生成する補助入力として、ユーザ供給のトリマップを利用する。しかし、高品質な trimap 自体を得ることは困難であり、これらの手法の適用を制限している。最近、いくつかのtrimap-freeメソッドが登場しているが、マットングの品質はtrimap-basedメソッドよりはるかに遅れている。主な理由は、トリマップガイダンスがなければ、ターゲットネットワークがフォアグラウンドターゲットについてあいまいである場合もあります。実際、前景を選択することは主観的な手続きであり、ユーザの意図に依存する。そこで本稿では,3マップフリーでユーザクリック操作を数回しか必要とせず,あいまいさを解消できるディープイメージマットリングフレームワークを提案する。さらに,どの部品に研磨が必要なのかを予測できる新しい不確実性推定モジュールと,後続の局所改質モジュールを導入する。計算予算に基づいて、不確実性ガイダンスで改善するローカル部品の数を選択できる。定量的・定性的な結果から,提案手法は既存のtrimapフリーメソッドよりも優れた性能を示し,ユーザによる最小限の労力で,最先端のtrimapベースメソッドと比較できることがわかった。 Image matting is a fundamental and challenging problem in computer vision and graphics. Most existing matting methods leverage a user-supplied trimap as an auxiliary input to produce good alpha matte. However, obtaining high-quality trimap itself is arduous, thus restricting the application of these methods. Recently, some trimap-free methods have emerged, however, the matting quality is still far behind the trimap-based methods. The main reason is that, without the trimap guidance in some cases, the target network is ambiguous about which is the foreground target. In fact, choosing the foreground is a subjective procedure and depends on the user's intention. To this end, this paper proposes an improved deep image matting framework which is trimap-free and only needs several user click interactions to eliminate the ambiguity. Moreover, we introduce a new uncertainty estimation module that can predict which parts need polishing and a following local refinement module. Based on the computation budget, users can choose how many local parts to improve with the uncertainty guidance. Quantitative and qualitative results show that our method performs better than existing trimap-free methods and comparably to state-of-the-art trimap-based methods with minimal user effort.	翻訳日:2021-05-07 05:21:17 公開日:2021-03-07
# モバイルデバイス上でリアルタイムLiDAR 3Dオブジェクト検出を実現する Achieving Real-Time LiDAR 3D Object Detection on a Mobile Device ( http://arxiv.org/abs/2012.13801v2 ) ライセンス: Link先を確認	Pu Zhao, Wei Niu, Geng Yuan, Yuxuan Cai, Hsin-Hsuan Sung, Sijia Liu, Xipeng Shen, Bin Ren, Yanzhi Wang, Xue Lin	(参考訳) 3Dオブジェクト検出は特に自律運転アプリケーション領域において重要なタスクである。しかし、自動運転車のエッジコンピューティングデバイス上での計算とメモリリソースの制限により、リアルタイムパフォーマンスをサポートすることは困難である。そこで本研究では,ネットワークの強化と強化学習手法による探索を取り入れたコンパイラ対応統合フレームワークを提案し,資源限定エッジコンピューティングデバイス上での3Dオブジェクト検出のリアルタイム推論を実現する。具体的には,リカレントニューラルネットワーク(RNN)を用いて,人的知識や支援を伴わずに,ネットワークの強化とプルーニングの両方を自動で行う統一的なスキームを提供する。また、統一スキームの評価性能は、ジェネレータRNNを訓練するためにフィードバックすることができる。実験の結果,提案フレームワークはモバイル端末(Samsung Galaxy S20)におけるリアルタイム3Dオブジェクト検出を競合検出性能で実現していることがわかった。 3D object detection is an important task, especially in the autonomous driving application domain. However, it is challenging to support the real-time performance with the limited computation and memory resources on edge-computing devices in self-driving cars. To achieve this, we propose a compiler-aware unified framework incorporating network enhancement and pruning search with the reinforcement learning techniques, to enable real-time inference of 3D object detection on the resource-limited edge-computing devices. Specifically, a generator Recurrent Neural Network (RNN) is employed to provide the unified scheme for both network enhancement and pruning search automatically, without human expertise and assistance. And the evaluated performance of the unified schemes can be fed back to train the generator RNN. The experimental results demonstrate that the proposed framework firstly achieves real-time 3D object detection on mobile devices (Samsung Galaxy S20 phone) with competitive detection performance.	翻訳日:2021-04-25 01:14:23 公開日:2021-03-07
# (参考訳) マルチエージェント強化学習を用いたOFDMAダウンリンクシステムにおけるバースティトラフィックの公正指向スケジューリング Fairness-Oriented Scheduling for Bursty Traffic in OFDMA Downlink Systems Using Multi-Agent Reinforcement Learning ( http://arxiv.org/abs/2012.15081v8 ) ライセンス: CC BY 4.0	Mingqi Yuan, Qi Cao, Man-on Pun, Yi Chen	(参考訳) ユーザスケジューリングは、無線通信における古典的な問題であり、鍵となる技術である。基地局には、PF(Proportional Fairness)やRRF(Robin Fashion)など、多くの高度なスケジューラが展開されている。オポチュニティ(OP)スケジューリングは、完全なバッファトラフィックを考慮した平均ユーザデータレート(AUDR)を最大化する最適なスケジューラであることが知られている。しかし、最高公平性を達成するための最適な戦略は、フルバッファトラフィックとバーストトラフィックの両方において、いまだに不明である。本研究では,特にRBG割り当てにおける公平性を考慮したユーザスケジューリングの問題について検討する。本稿では,マルチエージェント強化学習(marl)を用いて,通信システムの公平性を最大化する分散最適化を行うユーザスケジューラを構築する。エージェントは層間情報(例)を取る。状態として RSRP, Buffer サイズ) と状態として RBG を割り当て、フェアネスを最大化するように設計された報酬関数に従って最適解を探索する。さらに、5%タイルのユーザデータレート(5TUDR)をキーパフォーマンス指標(KPI)として、PFスケジューリングとRFスケジューリングとMARLスケジューリングの性能を比較する。シミュレーションの結果,提案したMARLスケジューリングは従来のスケジューラよりも優れていた。 User scheduling is a classical problem and key technology in wireless communication, which will still plays an important role in the prospective 6G. There are many sophisticated schedulers that are widely deployed in the base stations, such as Proportional Fairness (PF) and Round-Robin Fashion (RRF). It is known that the Opportunistic (OP) scheduling is the optimal scheduler for maximizing the average user data rate (AUDR) considering the full buffer traffic. But the optimal strategy achieving the highest fairness still remains largely unknown both in the full buffer traffic and the bursty traffic. In this work, we investigate the problem of fairness-oriented user scheduling, especially for the RBG allocation. We build a user scheduler using Multi-Agent Reinforcement Learning (MARL), which conducts distributional optimization to maximize the fairness of the communication system. The agents take the cross-layer information (e.g. RSRP, Buffer size) as state and the RBG allocation result as action, then explore the optimal solution following a well-defined reward function designed for maximizing fairness. Furthermore, we take the 5%-tile user data rate (5TUDR) as the key performance indicator (KPI) of fairness, and compare the performance of MARL scheduling with PF scheduling and RRF scheduling by conducting extensive simulations. And the simulation results show that the proposed MARL scheduling outperforms the traditional schedulers.	翻訳日:2021-04-18 15:50:37 公開日:2021-03-07
# AraELECTRA:アラビア語理解のための事前学習テキスト識別装置 AraELECTRA: Pre-Training Text Discriminators for Arabic Language Understanding ( http://arxiv.org/abs/2012.15516v2 ) ライセンス: Link先を確認	Wissam Antoun, Fady Baly, Hazem Hajj	(参考訳) 英語表現の進歩により、トークン置換を正確に分類するエンコーダ(ELECTRA)を効果的に学習することで、よりサンプル効率のよい事前学習タスクが実現された。これは、マスクされたトークンを復元するモデルをトレーニングする代わりに、ジェネレータネットワークに置き換えられた破損したトークンと真の入力トークンを区別するために識別器モデルを訓練する。一方、現在のアラビア語表現アプローチは、マスク言語モデリングによる事前学習のみに依存している。本稿では,アラエレクトラ(araelectra)というアラビア語表現モデルを開発した。我々のモデルは、大きなアラビア文字コーパス上の代用トークン検出目標を用いて事前訓練されている。我々は,複数のアラビア語nlpタスクにおいて,読み理解,感情分析,名前付きエンティティ認識を含むモデルを評価し,同じ事前学習データとより小さいモデルサイズでアラエレクトラが現在のアラビア語表現モデルよりも優れていることを示す。 Advances in English language representation enabled a more sample-efficient pre-training task by Efficiently Learning an Encoder that Classifies Token Replacements Accurately (ELECTRA). Which, instead of training a model to recover masked tokens, it trains a discriminator model to distinguish true input tokens from corrupted tokens that were replaced by a generator network. On the other hand, current Arabic language representation approaches rely only on pretraining via masked language modeling. In this paper, we develop an Arabic language representation model, which we name AraELECTRA. Our model is pretrained using the replaced token detection objective on large Arabic text corpora. We evaluate our model on multiple Arabic NLP tasks, including reading comprehension, sentiment analysis, and named-entity recognition and we show that AraELECTRA outperforms current state-of-the-art Arabic language representation models, given the same pretraining data and with even a smaller model size.	翻訳日:2021-04-17 17:16:21 公開日:2021-03-07
# aragpt2:アラビア語生成のための事前学習トランスフォーマー AraGPT2: Pre-Trained Transformer for Arabic Language Generation ( http://arxiv.org/abs/2012.15520v2 ) ライセンス: Link先を確認	Wissam Antoun, Fady Baly, Hazem Hajj	(参考訳) 近年、事前学習されたトランスフォーマーベースのアーキテクチャは、十分に大きなコーパスでトレーニングされているため、言語モデリングと理解において非常に効率的であることが証明されている。アラビア語の言語生成の応用は、アラビア語の先進的な生成モデルが欠如していることから、他のNLPの進歩と比べてもまだ遅れている。本稿では,インターネットテキストとニュース記事の巨大なアラビア語コーパスをスクラッチから学習した,最初の高度なアラビア語言語生成モデルであるalgpt2を開発した。私たちの最大のモデルであるAraGPT2-megaは46億のパラメータを持ち、アラビア語のモデルとしては最大です。 megaモデルは評価され、合成ニュース生成やゼロショット質問応答など、さまざまなタスクで成功を収めた。テキスト生成では、wikipediaの記事に29.8のパープレキシティを達成する。 AraGPT2-megaは,人間による記事と区別が難しいニュース記事の生成において,有意な成功を収めた。そこで我々は,モデル生成テキストの検出精度98%の精度で自動判別モデルを開発した。これらのモデルは、アラビア語のNLPのための新しい研究の方向性と応用を促進することを願っている。 Recently, pre-trained transformer-based architectures have proven to be very efficient at language modeling and understanding, given that they are trained on a large enough corpus. Applications in language generation for Arabic are still lagging in comparison to other NLP advances primarily due to the lack of advanced Arabic language generation models. In this paper, we develop the first advanced Arabic language generation model, AraGPT2, trained from scratch on a large Arabic corpus of internet text and news articles. Our largest model, AraGPT2-mega, has 1.46 billion parameters, which makes it the largest Arabic language model available. The Mega model was evaluated and showed success on different tasks including synthetic news generation, and zero-shot question answering. For text generation, our best model achieves a perplexity of 29.8 on held-out Wikipedia articles. A study conducted with human evaluators showed the significant success of AraGPT2-mega in generating news articles that are difficult to distinguish from articles written by humans. We thus develop and release an automatic discriminator model with a 98% percent accuracy in detecting model-generated text. The models are also publicly available, hoping to encourage new research directions and applications for Arabic NLP.	翻訳日:2021-04-17 17:16:05 公開日:2021-03-07
# 時間的点過程を用いた長地平線予測 Long Horizon Forecasting With Temporal Point Processes ( http://arxiv.org/abs/2101.02815v2 ) ライセンス: Link先を確認	Prathamesh Deshpande, Kamlesh Marathe, Abir De, Sunita Sarawagi	(参考訳) 近年,多種多様なアプリケーションにおいて,非同期イベントを特徴付ける強力なモデリング機構として,MTPP( marked temporal point process)が出現している。 mtppsは、イベントの予測、特に近い将来のイベントの予測において大きな可能性を実証している。しかし、現在の設計選択のため、MTPPは遠い未来のイベントの到着を予測する際の予測性能が低いことがしばしばある。本報告では,この制限を緩和するために,特に長い地平線イベント予測に適したDualTPPを設計する。 DualTPPには2つのコンポーネントがある。第1のコンポーネントは強度のないMTPPモデルであり、将来のイベントの時刻をモデル化することで、イベントダイナミクスの微視的または粒度の信号をキャプチャする。第2のコンポーネントは、与えられた時間ウィンドウ内のイベントの集約数をモデル化する異なる双対的な視点を取り、マクロなイベントダイナミクスをカプセル化する。そこで我々は,制約付き二次最適化問題の列を解いて,長方形事象を効率的に予測するための2つのモデル % 上で協調して新しい推論フレームワークを開発した。様々な実際のデータセットを用いた実験により、DualTPPは、長い地平線予測において既存のMTPP法よりも優れた性能を示し、実際の事象と予測の間のワッサーシュタイン距離の約1桁の縮小を実現している。 In recent years, marked temporal point processes (MTPPs) have emerged as a powerful modeling machinery to characterize asynchronous events in a wide variety of applications. MTPPs have demonstrated significant potential in predicting event-timings, especially for events arriving in near future. However, due to current design choices, MTPPs often show poor predictive performance at forecasting event arrivals in distant future. To ameliorate this limitation, in this paper, we design DualTPP which is specifically well-suited to long horizon event forecasting. DualTPP has two components. The first component is an intensity free MTPP model, which captures microscopic or granular level signals of the event dynamics by modeling the time of future events. The second component takes a different dual perspective of modeling aggregated counts of events in a given time-window, thus encapsulating macroscopic event dynamics. Then we develop a novel inference framework jointly over the two models % for efficiently forecasting long horizon events by solving a sequence of constrained quadratic optimization problems. Experiments with a diverse set of real datasets show that DualTPP outperforms existing MTPP methods on long horizon forecasting by substantial margins, achieving almost an order of magnitude reduction in Wasserstein distance between actual events and forecasts.	翻訳日:2021-04-10 05:05:46 公開日:2021-03-07
# (参考訳) アンサンブル学習について On Ensemble Learning ( http://arxiv.org/abs/2103.12521v1 ) ライセンス: CC BY 4.0	Mark Stamp and Aniket Chandak and Gavin Wong and Allen Ye	(参考訳) 本稿では,アンサンブル分類器,すなわち,スコアリング関数の組み合わせを利用した機械学習に基づく分類器について考察する。このような分類を分類するためのフレームワークを提供し、また、それぞれのフレームワークがどのように適合するかを議論するいくつかのアンサンブルテクニックを概説する。この一般的な紹介から,マルウェア分析の文脈におけるアンサンブル学習の話題に方向転換する。本稿では,マルウェア(および関連する)研究で使用されているアンサンブル技術について簡単な調査を行う。最後に,大規模かつ課題の多いマルウェアデータセットにアンサンブル手法を適用する実験を行った。これらのアンサンブル技術の多くはマルウェアの文献に現れるが、これまでは、異なるデータセットや異なる成功度合いが一般的に使用されるため、このような結果を直接比較する方法がなかった。私たちの共通フレームワークと経験的成果は、アンサンブル学習の分野において、マルウェア分析の問題の狭い領域と、マシンラーニング全般の領域の両方において、ある秩序感をもたらすための努力です。 In this paper, we consider ensemble classifiers, that is, machine learning based classifiers that utilize a combination of scoring functions. We provide a framework for categorizing such classifiers, and we outline several ensemble techniques, discussing how each fits into our framework. From this general introduction, we then pivot to the topic of ensemble learning within the context of malware analysis. We present a brief survey of some of the ensemble techniques that have been used in malware (and related) research. We conclude with an extensive set of experiments, where we apply ensemble techniques to a large and challenging malware dataset. While many of these ensemble techniques have appeared in the malware literature, previously there has been no way to directly compare results such as these, as different datasets and different measures of success are typically used. Our common framework and empirical results are an effort to bring some sense of order to the chaos that is evident in the evolving field of ensemble learning -- both within the narrow confines of the malware analysis problem, and in the larger realm of machine learning in general.	翻訳日:2021-04-05 05:22:14 公開日:2021-03-07
# (参考訳) ワイボーにおけるトロール検出の感度解析 Sentiment Analysis for Troll Detection on Weibo ( http://arxiv.org/abs/2103.09054v1 ) ライセンス: CC BY 4.0	Zidong Jiang and Fabio Di Troia and Mark Stamp	(参考訳) ソーシャルメディアが現代世界に与える影響を誇張することは難しい。事実上、あらゆる企業や著名人がtwitterやfacebookなどの人気プラットフォーム上でソーシャルメディアアカウントを持っている。中国では、マイクロブログサービスプロバイダであるsina weiboが最も人気のあるサービスである。世論に影響を与えるため、Weibo Trolls(いわゆるWater Army)は偽りのコメントを投稿するために雇われる。本稿では,Sina Weiboプラットフォーム上での感情分析およびその他のユーザ活動データを用いたトロル検出に焦点を当てた。中国語文のセグメンテーション,単語埋め込み,感情スコア計算のための手法を実装した。近年, トラル検出と感情分析が研究されているが, これまでのトロール検出を感情分析に基づいて検討した研究は知られていない。我々は、様々な機械学習戦略に基づいて、トロール検出のための感情分析アプローチを開発し、テストするために、得られた技術を用いる。実験結果が生成され分析される。提案手法を実装したChromeエクステンションは,ユーザがSina Weiboを閲覧すると,リアルタイムのトロル検出を可能にする。 The impact of social media on the modern world is difficult to overstate. Virtually all companies and public figures have social media accounts on popular platforms such as Twitter and Facebook. In China, the micro-blogging service provider, Sina Weibo, is the most popular such service. To influence public opinion, Weibo trolls -- the so called Water Army -- can be hired to post deceptive comments. In this paper, we focus on troll detection via sentiment analysis and other user activity data on the Sina Weibo platform. We implement techniques for Chinese sentence segmentation, word embedding, and sentiment score calculation. In recent years, troll detection and sentiment analysis have been studied, but we are not aware of previous research that considers troll detection based on sentiment analysis. We employ the resulting techniques to develop and test a sentiment analysis approach for troll detection, based on a variety of machine learning strategies. Experimental results are generated and analyzed. A Chrome extension is presented that implements our proposed technique, which enables real-time troll detection when a user browses Sina Weibo.	翻訳日:2021-04-05 04:59:42 公開日:2021-03-07
# (参考訳) 脳波特徴を用いた抑うつ検出のためのアンサンブルアプローチ Ensemble approach for detection of depression using EEG features ( http://arxiv.org/abs/2103.08467v1 ) ライセンス: CC BY 4.0	Egils Avots, Kla\=vs Jermakovs, Maie Bachmann, Laura Paeske, Cagri Ozcinar, Gholamreza Anbarjafari	(参考訳) うつ病は社会の健康に深刻な影響を与え、社会に悪影響を及ぼす公衆衛生問題である。これらの問題に対する意識を高めるため,脳波信号からうつ病の長期持続効果を判定することを目的とした。本稿では、線形(相対帯域パワー、APV、SASI)および非線形(HFD、LZC、DFA)EEG機能を用いて訓練されたSVM、LDA、NB、kNN、D3バイナリ分類器の精度比較を含む。年齢と性別の一致したデータセットは、健常者10名、うつ病診断者10名で構成された。提案する機能選択と分類の組み合わせのいくつかは90%の精度に達し、10倍のクロス検証を用いて評価したすべてのモデルが、ランダムなサンプル順列で100回以上繰り返した。 Depression is a public health issue which severely affects one's well being and cause negative social and economic effect for society. To rise awareness of these problems, this publication aims to determine if long lasting effects of depression can be determined from electoencephalographic (EEG) signals. The article contains accuracy comparison for SVM, LDA, NB, kNN and D3 binary classifiers which were trained using linear (relative band powers, APV, SASI) and non-linear (HFD, LZC, DFA) EEG features. The age and gender matched dataset consisted of 10 healthy subjects and 10 subjects with depression diagnosis at some point in their lifetime. Several of the proposed feature selection and classifier combinations reached accuracy of 90% where all models where evaluated using 10-fold cross validation and averaged over 100 repetitions with random sample permutations.	翻訳日:2021-04-05 04:46:44 公開日:2021-03-07
# ベイズ畳み込みニューラルネットワークを用いたグレース・グレイス・フォギャップにおける陸水貯留異常の予測の改善 Improving prediction of the terrestrial water storage anomalies during the GRACE and GRACE-FO gap with Bayesian convolutional neural networks ( http://arxiv.org/abs/2101.09361v2 ) ライセンス: Link先を確認	Shaoxing Mo, Yulong Zhong, Xiaoqing Shi, Wei Feng, Xin Yin, Jichun Wu	(参考訳) 重力回復・気候実験(GRACE)衛星とその後継であるGRACE Follow-On(GRACE-FO)は、地球規模の地球規模の貯水異常(TWSA)の貴重な正確な観測を提供する。しかし、GRACEとGRACE-FOの間には、TWSAsの約1年間の観測ギャップがある。これは、twsa観測における不連続性が水文モデル予測に重大なバイアスと不確実性をもたらし、その結果、誤った意思決定をもたらす可能性があるため、実用的な応用にとっての課題となる。この課題に対処するために、気候データによって駆動されるベイズ畳み込みニューラルネットワーク(BCNN)を提案し、このギャップを世界規模で橋渡しする。注意機構や残差と密接性を含むディープラーニングの最近の進歩を統合することで、bcnnはマルチソース入力データからtwsa予測の重要な特徴を自動的かつ効率的に抽出することができる。予測されたtwsaは、水文モデル出力および最近の3つのtwsa予測製品と比較される。この比較は、比較的乾燥した地域でのギャップにおいて、TWSAの予測を改善するBCNNの優れた性能を示唆している。さらに, 降水異常, 干ばつ指数, 地下水位を比較することにより, ギャップ期間中の極端に乾燥し, 湿潤な現象を識別するbcnnの能力はさらに議論され, 総合的に実証された。 BCNNは、TWSAデータの連続性を維持し、ギャップの間における気候極端の影響を定量化するための信頼性の高いソリューションを提供することができることを示している。 The Gravity Recovery and Climate Experiment (GRACE) satellite and its successor GRACE Follow-On (GRACE-FO) provide valuable and accurate observations of terrestrial water storage anomalies (TWSAs) at a global scale. However, there is an approximately one-year observation gap of TWSAs between GRACE and GRACE-FO. This poses a challenge for practical applications, as discontinuity in the TWSA observations may introduce significant biases and uncertainties in the hydrological model predictions and consequently mislead decision making. To tackle this challenge, a Bayesian convolutional neural network (BCNN) driven by climatic data is proposed in this study to bridge this gap at a global scale. Enhanced by integrating recent advances in deep learning, including the attention mechanisms and the residual and dense connections, BCNN can automatically and efficiently extract important features for TWSA predictions from multi-source input data. The predicted TWSAs are compared to the hydrological model outputs and three recent TWSA prediction products. The comparison suggests the superior performance of BCNN in providing improved predictions of TWSAs during the gap in particular in the relatively arid regions. The BCNN's ability to identify the extreme dry and wet events during the gap period is further discussed and comprehensively demonstrated by comparing with the precipitation anomalies, drought index, ground/surface water levels. Results indicate that BCNN is capable of offering a reliable solution to maintain the TWSA data continuity and quantify the impacts of climate extremes during the gap.	翻訳日:2021-03-21 07:46:58 公開日:2021-03-07
# 双方向GRUモデルを用いたアラビア語のアスペクトベース感情分析 Arabic aspect based sentiment analysis using bidirectional GRU based models ( http://arxiv.org/abs/2101.10539v3 ) ライセンス: Link先を確認	Mohammed M.Abdelgwad, Taysir Hassan A Soliman, Ahmed I.Taloba, Mohamed Fawzy Farghaly	(参考訳) アスペクトベースの知覚分析(ABSA)は、与えられた文書や文の側面と各側面について伝達される感情を定義する、きめ細かい分析を行う。このレベルの分析は、レビューの微妙な視点を探求できる最も詳細なバージョンである。 ABSAで利用可能な研究のほとんどは英語に焦点を当てており、アラビア語に関する研究はほとんどない。アラビアにおけるこれまでのほとんどの研究は、主にレキシコンのようなアラビアのコンテンツを分析・処理するための希少なリソースとツールのグループに依存する機械学習の通常の方法に基づいているが、これらのリソースの欠如は別の課題を招いている。これらの障害を克服するために,gru(gated recurrent unit)ニューラルネットワークを用いた2つのモデルを用いた深層学習法を提案する。 1つ目は、双方向のgrg、畳み込みニューラルネットワーク(cnn)、条件付き確率場(crf)の組み合わせによって単語と文字の両方の表現を利用するdlモデルであり、(bgru-cnn-crf)モデルを作成して主意見の側面(ote)を抽出する。 2つ目は、双方向GRU(IAN-BGRU)に基づく対話型アテンションネットワークで、抽出された側面に対する感情極性を特定する。アラビアホテルレビューデータセットを用いて,本モデルの評価を行った。提案手法は,評価対象抽出のためのF1スコアが38.5%,アスペクトベースの感情極性分類(T3)のための精度が7.5%,両タスクのベースライン研究よりも優れていることを示す。 F1得点はT2が69.44%、T3が83.98%である。 Aspect-based Sentiment analysis (ABSA) accomplishes a fine-grained analysis that defines the aspects of a given document or sentence and the sentiments conveyed regarding each aspect. This level of analysis is the most detailed version that is capable of exploring the nuanced viewpoints of the reviews. Most of the research available in ABSA focuses on English language with very few work available on Arabic. Most previous work in Arabic has been based on regular methods of machine learning that mainly depends on a group of rare resources and tools for analyzing and processing Arabic content such as lexicons, but the lack of those resources presents another challenge. To overcome these obstacles, Deep Learning (DL)-based methods are proposed using two models based on Gated Recurrent Units (GRU) neural networks for ABSA. The first one is a DL model that takes advantage of the representations on both words and characters via the combination of bidirectional GRU, Convolutional neural network (CNN), and Conditional Random Field (CRF) which makes up (BGRU-CNN-CRF) model to extract the main opinionated aspects (OTE). The second is an interactive attention network based on bidirectional GRU (IAN-BGRU) to identify sentiment polarity toward extracted aspects. We evaluated our models using the benchmarked Arabic hotel reviews dataset. The results indicate that the proposed methods are better than baseline research on both tasks having 38.5% enhancement in F1-score for opinion target extraction (T2) and 7.5% in accuracy for aspect-based sentiment polarity classification (T3). Obtaining F1 score of 69.44% for T2, and accuracy of 83.98% for T3.	翻訳日:2021-03-19 10:47:20 公開日:2021-03-07
# (参考訳) マルウェア進化検出のための単語埋め込み技術 Word Embedding Techniques for Malware Evolution Detection ( http://arxiv.org/abs/2103.05759v1 ) ライセンス: CC BY 4.0	Sunhera Paul and Mark Stamp	(参考訳) マルウェア検出は情報セキュリティの重要な側面である。問題のひとつは、マルウェアが時間とともに進化することです。効果的なマルウェア検出を維持するためには,マルウェアの進化がいつ発生したのかを判断し,適切な対策を講じる必要がある。マルウェアファミリーが進化した可能性が高い時期のポイントを検出するための様々な実験を行い、進化が実際に発生したことを確認するための二次テストを検討します。いくつかのマルウェアファミリーが分析され、それぞれが長期間にわたって収集された多数のサンプルを含んでいる。実験は, 単語埋め込み技術に基づく機能工学を用いて, 改良結果を得たことを示す。すべての実験は機械学習モデルに基づいており、進化検出戦略は人間の介入を最小限に抑え、簡単に自動化できる。 Malware detection is a critical aspect of information security. One difficulty that arises is that malware often evolves over time. To maintain effective malware detection, it is necessary to determine when malware evolution has occurred so that appropriate countermeasures can be taken. We perform a variety of experiments aimed at detecting points in time where a malware family has likely evolved, and we consider secondary tests designed to confirm that evolution has actually occurred. Several malware families are analyzed, each of which includes a number of samples collected over an extended period of time. Our experiments indicate that improved results are obtained using feature engineering based on word embedding techniques. All of our experiments are based on machine learning models, and hence our evolution detection strategies require minimal human intervention and can easily be automated.	翻訳日:2021-03-12 09:07:22 公開日:2021-03-07
# (参考訳) マルウェア家族関係のクラスタ分析 Cluster Analysis of Malware Family Relationships ( http://arxiv.org/abs/2103.05761v1 ) ライセンス: CC BY 4.0	Samanvitha Basole and Mark Stamp	(参考訳) 本稿では,k$-meansクラスタリングを用いてマルウェアサンプル間の各種関係を分析する。約20のマルウェアファミリーと1家族あたり1000のサンプルからなるデータセットを考察する。これらの家族は7種類のマルウェアに分類される。我々は,家族のペアに基づいてクラスタリングを行い,その結果を用いて家族間の関係を判定する。マルウェアの種類に基づいて同様のクラスタ分析を行います。以上の結果から,K$-meansクラスタリングは,マルウェアの家族関係を探索するための強力なツールとなる可能性が示唆された。 In this paper, we use $K$-means clustering to analyze various relationships between malware samples. We consider a dataset comprising~20 malware families with~1000 samples per family. These families can be categorized into seven different types of malware. We perform clustering based on pairs of families and use the results to determine relationships between families. We perform a similar cluster analysis based on malware type. Our results indicate that $K$-means clustering can be a powerful tool for data exploration of malware family relationships.	翻訳日:2021-03-12 08:55:50 公開日:2021-03-07
# (参考訳) マルウェア分類のためのWord2Vec, HMM2Vec, PCA2Vecの比較 A Comparison of Word2Vec, HMM2Vec, and PCA2Vec for Malware Classification ( http://arxiv.org/abs/2103.05763v1 ) ライセンス: CC BY 4.0	Aniket Chandak and Wendy Lee and Mark Stamp	(参考訳) 単語の埋め込みはしばしば、単語間の関係を定量化する手段として自然言語処理で使用される。より一般的に、これらの同じ単語埋め込み技術は特徴間の関係の定量化に利用できる。本稿では,マルウェア分類の文脈において,複数の単語埋め込み手法を検討する。私たちは隠れマルコフモデルを使用して、HMM2Vecと呼ばれるアプローチで埋め込みベクトルを取得し、主成分分析に基づいてベクトル埋め込みを生成します。また、Word2Vecと呼ばれる一般的なニューラルネットワークベースの単語埋め込み技術も検討します。いずれの場合も,様々な家系のマルウェアサンプルに対して,オプコードシーケンスに基づく特徴埋め込みを導出する。本研究では,これらの特徴埋め込みに基づく分類精度の向上を,オプコードシーケンスを直接使用するHMM実験と比較し,ベースラインの確立に役立つことを示した。これらの結果は,マルウェア解析の分野では,単語埋め込みが有用な機能工学的ステップであることを示す。 Word embeddings are often used in natural language processing as a means to quantify relationships between words. More generally, these same word embedding techniques can be used to quantify relationships between features. In this paper, we first consider multiple different word embedding techniques within the context of malware classification. We use hidden Markov models to obtain embedding vectors in an approach that we refer to as HMM2Vec, and we generate vector embeddings based on principal component analysis. We also consider the popular neural network based word embedding technique known as Word2Vec. In each case, we derive feature embeddings based on opcode sequences for malware samples from a variety of different families. We show that we can obtain better classification accuracy based on these feature embeddings, as compared to HMM experiments that directly use the opcode sequences, and serve to establish a baseline. These results show that word embeddings can be a useful feature engineering step in the field of malware analysis.	翻訳日:2021-03-12 08:43:40 公開日:2021-03-07
# (参考訳) 時系列レコメンダシステムの時間モデルを用いたハイブリッドモデル Hybrid Model with Time Modeling for Sequential Recommender Systems ( http://arxiv.org/abs/2103.06138v1 ) ライセンス: CC BY 4.0	Marlesson R. O. Santana, Anderson Soares	(参考訳) 深層学習に基づく手法は、推薦システム問題に成功している。反復ニューラルネットワーク、トランス、および注意メカニズムを用いたアプローチは、シーケンシャルインタラクションにおけるユーザーの長期および短期の好みをモデル化するのに有用である。さまざまなセッションベースのレコメンデーションソリューションを探求するために、Booking.comは最近WSDM WebTour 2021 Challengeを組織しました。本研究はこの課題に対する我々のアプローチを示す。レコメンダシステムのための最先端のディープラーニングアーキテクチャをテストするために,いくつかの実験を行った。さらに,NARM(Neural Attentive Recommendation Machine)にいくつかの変更を加え,そのアーキテクチャを課題に適応させ,どのセッションベースモデルにも適用可能なトレーニングアプローチを実装して精度を向上した。実験結果から,narmの改善は他のベンチマーク手法よりも優れていた。 Deep learning based methods have been used successfully in recommender system problems. Approaches using recurrent neural networks, transformers, and attention mechanisms are useful to model users' long- and short-term preferences in sequential interactions. To explore different session-based recommendation solutions, Booking.com recently organized the WSDM WebTour 2021 Challenge, which aims to benchmark models to recommend the final city in a trip. This study presents our approach to this challenge. We conducted several experiments to test different state-of-the-art deep learning architectures for recommender systems. Further, we proposed some changes to Neural Attentive Recommendation Machine (NARM), adapted its architecture for the challenge objective, and implemented training approaches that can be used in any session-based model to improve accuracy. Our experimental result shows that the improved NARM outperforms all other state-of-the-art benchmark methods.	翻訳日:2021-03-12 07:16:44 公開日:2021-03-07
# (参考訳) universal adversarial perturbation とイメージスパム分類器 Universal Adversarial Perturbations and Image Spam Classifiers ( http://arxiv.org/abs/2103.05469v1 ) ライセンス: CC BY 4.0	Andy Phung and Mark Stamp	(参考訳) 名前が示すように、画像スパムは画像に埋め込まれたスパムメールだ。画像スパムはテキストベースのフィルターを避けるために開発された。現代のディープラーニングに基づく分類器は、野生で見られる典型的な画像スパムを検出するのによく機能する。本章では,ディープラーニングに基づく画像スパム分類器を攻撃するために,多くの敵手法を評価する。テストした手法のうち、普遍摂動が最善であることがわかった。そこで本稿では, 画像スパムに適応した「自然な摂動」を生成可能な, 変換に基づく新たな対向攻撃を提案し, 解析する。結果として得られるスパム画像は、集中した自然特徴の存在と普遍的な敵の摂動の両方から恩恵を受ける。提案手法は, 精度の低下, 例ごとの計算時間, 摂動距離において, 既存の敵攻撃よりも優れていることを示す。本手法は,画像スパム検出における今後の研究の課題データセットとして使用できる,敵対的スパム画像のデータセットの作成に応用する。 As the name suggests, image spam is spam email that has been embedded in an image. Image spam was developed in an effort to evade text-based filters. Modern deep learning-based classifiers perform well in detecting typical image spam that is seen in the wild. In this chapter, we evaluate numerous adversarial techniques for the purpose of attacking deep learning-based image spam classifiers. Of the techniques tested, we find that universal perturbation performs best. Using universal adversarial perturbations, we propose and analyze a new transformation-based adversarial attack that enables us to create tailored "natural perturbations" in image spam. The resulting spam images benefit from both the presence of concentrated natural features and a universal adversarial perturbation. We show that the proposed technique outperforms existing adversarial attacks in terms of accuracy reduction, computation time per example, and perturbation distance. We apply our technique to create a dataset of adversarial spam images, which can serve as a challenge dataset for future research in image spam detection.	翻訳日:2021-03-11 18:22:51 公開日:2021-03-07
# (参考訳) セマンティックセグメンテーションにおける平方根親和性を用いたラベルの回帰 Use square root affinity to regress labels in semantic segmentation ( http://arxiv.org/abs/2103.04990v1 ) ライセンス: CC BY 4.0	Lumeng Cao, Zhouwang Yang	(参考訳) セマンティックセグメンテーションは、コンピュータビジョンにおける基本的な非自明なタスクです。多くの以前の研究では、アフィニティパターンを利用してセグメンテーションネットワークを強化することに焦点を当てている。これらの研究のほとんどは、アフィニティ行列を特徴融合重みの一種として使用しており、これは注意モデルや非局所モデルなどのネットワークに組み込まれたモジュールの一部である。本稿では,アフィニティ行列とラベルを関連付け,教師付き方法でアフィニティを利用する。具体的には,このラベルを用いてマルチスケールなラベル親和性行列を構造的監視として生成し,平方根カーネルを用いて出力層上の非局所親和性行列を計算する。このような2つの親和性により、Affinity Regression loss(AR損失)と呼ばれる新しい損失を定義します。我々のモデルは訓練が容易であり、実行時推論なしで計算負荷をほとんど加えない。 NYUv2データセットとCityscapesデータセットに関する広範な実験は、提案手法がセマンティックセグメンテーションネットワークを促進するのに十分であることを示す。 Semantic segmentation is a basic but non-trivial task in computer vision. Many previous work focus on utilizing affinity patterns to enhance segmentation networks. Most of these studies use the affinity matrix as a kind of feature fusion weights, which is part of modules embedded in the network, such as attention models and non-local models. In this paper, we associate affinity matrix with labels, exploiting the affinity in a supervised way. Specifically, we utilize the label to generate a multi-scale label affinity matrix as a structural supervision, and we use a square root kernel to compute a non-local affinity matrix on output layers. With such two affinities, we define a novel loss called Affinity Regression loss (AR loss), which can be an auxiliary loss providing pair-wise similarity penalty. Our model is easy to train and adds little computational burden without run-time inference. Extensive experiments on NYUv2 dataset and Cityscapes dataset demonstrate that our proposed method is sufficient in promoting semantic segmentation networks.	翻訳日:2021-03-11 16:53:20 公開日:2021-03-07
# (参考訳) カルマンフィルタを用いた最適物体追跡技術 Optimized Object Tracking Technique Using Kalman Filter ( http://arxiv.org/abs/2103.05467v1 ) ライセンス: CC BY 4.0	Liana Ellen Taylor, Midriem Mirdanies, Roni Permana Saputra	(参考訳) 本稿では, クラッタシーンにおける所望の移動物体の検出精度を維持しつつ, 物体検出プロセスに必要な処理時間を最小化する最適化オブジェクト追跡手法の設計に着目した。カルマンフィルタベースのトリミング画像は、処理時間がビデオフレーム全体よりも小さい検索ウィンドウを使用する場合にオブジェクトを検出するのにかなり少ないため、画像検出プロセスに使用されます。この技術は、トリミングプロセスでウィンドウのさまざまなサイズでテストされました。 MATLABは提案手法の設計とテストに使用された。本論文では, 最大次元の2.16倍のトリミング画像を使用することで, 処理時間が大幅に短縮される一方, 検出成功率が高く, 検出された物体の中心が実物中心に近かったことを明らかにした。 This paper focused on the design of an optimized object tracking technique which would minimize the processing time required in the object detection process while maintaining accuracy in detecting the desired moving object in a cluttered scene. A Kalman filter based cropped image is used for the image detection process as the processing time is significantly less to detect the object when a search window is used that is smaller than the entire video frame. This technique was tested with various sizes of the window in the cropping process. MATLAB was used to design and test the proposed method. This paper found that using a cropped image with 2.16 multiplied by the largest dimension of the object resulted in significantly faster processing time while still providing a high success rate of detection and a detected center of the object that was reasonably close to the actual center.	翻訳日:2021-03-11 15:18:09 公開日:2021-03-07
# 発達期におけるブレッテンベルク車による身体的連続学習 Embodied Continual Learning Across Developmental Time Via Developmental Braitenberg Vehicles ( http://arxiv.org/abs/2103.05753v1 ) ライセンス: Link先を確認	Bradly Alicea, Rishabh Chakrabarty, Akshara Gopi, Anson Lim, and Jesse Parent	(参考訳) 発達生物学、認知科学、計算モデリングの合成を通じて学ぶべきことはたくさんある。この観点から学ぶことができる教訓の1つは、インテリジェントプログラムの初期化は、多数のパラメータの操作にのみ依存できないということです。今後は、Braitenberg Vehicleをベースとした、開発にインスパイアされた学習エージェントの設計を提案する。これらのエージェントを人工体型知能の例に用い,認知発達能力の構成要素としての体型経験と形態形成成長のモデル化に近づいた。成人の表現型の発生と発達経路の同時性に影響を与える生物学的・認知的発達に関する諸要因を考察する。これらのメカニズムは、シフト重みと適応的ネットワークトポロジーによる創発的接続を生み出し、ニューラルネットワークのトレーニングにおける発達過程の重要性を示す。このアプローチは、重要な期間や成長と獲得を活用し、明示的に具体化されたネットワークアーキテクチャを活用し、ニューラルネットワークの組み立てとこれらのネットワークでのアクティブラーニングを区別することで、開発アプローチから生じる適応エージェントの振る舞いの青写真を提供する。 There is much to learn through synthesis of Developmental Biology, Cognitive Science and Computational Modeling. One lesson we can learn from this perspective is that the initialization of intelligent programs cannot solely rely on manipulation of numerous parameters. Our path forward is to present a design for developmentally-inspired learning agents based on the Braitenberg Vehicle. Using these agents to exemplify artificial embodied intelligence, we move closer to modeling embodied experience and morphogenetic growth as components of cognitive developmental capacity. We consider various factors regarding biological and cognitive development which influence the generation of adult phenotypes and the contingency of available developmental pathways. These mechanisms produce emergent connectivity with shifting weights and adaptive network topography, thus illustrating the importance of developmental processes in training neural networks. This approach provides a blueprint for adaptive agent behavior that might result from a developmental approach: namely by exploiting critical periods or growth and acquisition, an explicitly embodied network architecture, and a distinction between the assembly of neural networks and active learning on these networks.	翻訳日:2021-03-11 14:46:44 公開日:2021-03-07
# (参考訳) 深層学習に基づく小型データセットの超解像蛍光顕微鏡 Deep learning-based super-resolution fluorescence microscopy on small datasets ( http://arxiv.org/abs/2103.04989v1 ) ライセンス: CC BY 4.0	Varun Mannam, Yide Zhang, Xiaotong Yuan, and Scott Howard	(参考訳) 蛍光顕微鏡は、生物をマイクロメートルスケールの解像度で可視化することで、現代の生物学における劇的な発展を可能にした。しかし、回折限界のため、サブミクロン/ナノメータの特徴は解決しにくい。ナノメートルの解像度を達成するために様々な超解像技術が開発されているが、高価な光学的セットアップや特殊なフルオロフォを必要とすることが多い。近年、深層学習は、回折制限画像から技術的障壁を減らし、超解像を得る可能性を示している。正確な結果を得るためには、従来のディープラーニング技術はトレーニングデータセットとして数千の画像を必要とする。生物サンプルから大規模なデータセットを得ることは、フルオロフォアのフォトブレッシング、光毒性、生体内で起こる動的プロセスなどによっては実現できないことが多い。したがって、小さなデータセットを用いたディープラーニングベースの超解像の実現は困難である。この制限を、小さなデータセットでうまくトレーニングされ、超高解像度画像を実現する新しい畳み込みニューラルネットワークベースのアプローチで解決します。トレーニングデータセットとして15の異なるフィールドオブビューから合計750枚の画像をキャプチャし,そのテクニックを実証した。各FOVでは、超解像ラジアルゆらぎ法を用いて単一のターゲット画像を生成する。予想通り、この小さなデータセットは、従来の超高解像度アーキテクチャを使用して使用可能なモデルを生成できなかった。しかし、新しいアプローチを使用すると、ネットワークを訓練して、この小さなデータセットから超高解像度の画像を達成できます。このディープラーニングモデルは、大規模なトレーニングデータセットの取得が困難なMRIやX線イメージングなどの他のバイオメディカルイメージングモードに適用できます。 Fluorescence microscopy has enabled a dramatic development in modern biology by visualizing biological organisms with micrometer scale resolution. However, due to the diffraction limit, sub-micron/nanometer features are difficult to resolve. While various super-resolution techniques are developed to achieve nanometer-scale resolution, they often either require expensive optical setup or specialized fluorophores. In recent years, deep learning has shown the potentials to reduce the technical barrier and obtain super-resolution from diffraction-limited images. For accurate results, conventional deep learning techniques require thousands of images as a training dataset. Obtaining large datasets from biological samples is not often feasible due to the photobleaching of fluorophores, phototoxicity, and dynamic processes occurring within the organism. Therefore, achieving deep learning-based super-resolution using small datasets is challenging. We address this limitation with a new convolutional neural network-based approach that is successfully trained with small datasets and achieves super-resolution images. We captured 750 images in total from 15 different field-of-views as the training dataset to demonstrate the technique. In each FOV, a single target image is generated using the super-resolution radial fluctuation method. As expected, this small dataset failed to produce a usable model using traditional super-resolution architecture. However, using the new approach, a network can be trained to achieve super-resolution images from this small dataset. This deep learning model can be applied to other biomedical imaging modalities such as MRI and X-ray imaging, where obtaining large training datasets is challenging.	翻訳日:2021-03-11 11:43:12 公開日:2021-03-07
# (参考訳) 蛍光ライフタイムイメージング顕微鏡(FLIM)における畳み込みニューラルネットワーク Convolutional Neural Network Denoising in Fluorescence Lifetime Imaging Microscopy (FLIM) ( http://arxiv.org/abs/2103.05448v1 ) ライセンス: CC BY 4.0	Varun Mannam, Yide Zhang, Xiaotong Yuan, Takashi Hato, Pierre C. Dagher, Evan L. Nichols, Cody J. Smith, Kenneth W. Dunn, and Scott Howard	(参考訳) 蛍光寿命イメージング顕微鏡(FLIM)システムは、その遅い処理速度、低信号対雑音比(SNR)、および高価で困難なハードウェアセットアップによって制限されています。そこで本研究では,FLIM SNRを改善するために畳み込み畳み込みネットワークを適用した。ネットワークは、アナログ信号処理に基づく高速なデータ取得、高効率パルス変調を用いた高SNR、オフザシェルフ無線周波数成分を用いたコスト効率実装を備えたインスタントFLIMシステムと統合される。我々のインスタントFLIMシステムは同時に、強度、寿命、薬理プロット \textit{in vivo} と \textit{ex vivo} を提供する。 FLIMデータに訓練されたディープラーニングモデルを用いて画像の復調を統合することにより、正確なFLIMファサー計測が得られる。 K平均クラスタリングセグメンテーション(K-means clustering segmentation)法は、異なる蛍光体を正確に分離する、偏見のない教師なしの機械学習技術である。マウスの腎臓実験では, セグメント化前に深層学習画像の認知モデルを導入することで, 既存の方法と比較して, ファーザーのノイズを効果的に除去し, より明瞭なセグメントを提供することが示された。そこで,提案する深層学習に基づくワークフローは,インスタントflimを用いた蛍光画像の自動セグメンテーションを高速かつ高精度に実現する。 FLIM測定がノイズの多い場合, 除音操作はセグメンテーションに有効である。クラスタリングは、バイオメディカルイメージングアプリケーションに関心のある生物学的構造の検出を効果的に強化できます。 Fluorescence lifetime imaging microscopy (FLIM) systems are limited by their slow processing speed, low signal-to-noise ratio (SNR), and expensive and challenging hardware setups. In this work, we demonstrate applying a denoising convolutional network to improve FLIM SNR. The network will be integrated with an instant FLIM system with fast data acquisition based on analog signal processing, high SNR using high-efficiency pulse-modulation, and cost-effective implementation utilizing off-the-shelf radio-frequency components. Our instant FLIM system simultaneously provides the intensity, lifetime, and phasor plots \textit{in vivo} and \textit{ex vivo}. By integrating image denoising using the trained deep learning model on the FLIM data, provide accurate FLIM phasor measurements are obtained. The enhanced phasor is then passed through the K-means clustering segmentation method, an unbiased and unsupervised machine learning technique to separate different fluorophores accurately. Our experimental \textit{in vivo} mouse kidney results indicate that introducing the deep learning image denoising model before the segmentation effectively removes the noise in the phasor compared to existing methods and provides clearer segments. Hence, the proposed deep learning-based workflow provides fast and accurate automatic segmentation of fluorescence images using instant FLIM. The denoising operation is effective for the segmentation if the FLIM measurements are noisy. The clustering can effectively enhance the detection of biological structures of interest in biomedical imaging applications.	翻訳日:2021-03-11 08:43:08 公開日:2021-03-07
# (参考訳) オランダ考古学領域におけるオンラインプロフェッショナル検索の有用性評価 Usability Evaluation for Online Professional Search in the Dutch Archaeology Domain ( http://arxiv.org/abs/2103.04437v1 ) ライセンス: CC BY 4.0	Alex Brandsen, Suzan Verberne, Karsten Lambers, Milco Wansleeben	(参考訳) 本稿では,これらの長文考古学文献のフルテキスト検索を可能にする,最初の考古学グレイ文学情報検索システムagnesについて述べる。この検索システムは、考古学の専門家や学者が6万以上のオランダの発掘レポートのコレクションを通じて検索することができるWebインターフェイスを持っています。我々はAGNESの検索インタフェースの評価のために,小規模ながら多様なユーザグループを用いてユーザスタディを行った。評価はスクリーンキャプチャとthink aloudプロトコルによって行われ、ユーザインタフェースフィードバックアンケートが組み合わされた。評価は、制御された使用(事前定義されたタスクの補完)と自由使用(自由選択されたタスクの補完)の両方をカバーした。自由に利用することで、考古学者のニーズや、検索システムとの相互作用を研究することができます。結論として,(1) 考古学者の情報要求は概ねリコール指向であり,回答として項目のリストを必要とすること,(2) ユーザはメタデータフィルタよりも自由テキストクエリの使用を好み,自由テキスト検索システムの価値を確認すること,(3) 多様なユーザグループの編集が,システム改善のフィードバックとして多様な課題の収集に寄与すること,などがあげられる。私たちは現在、AGNESのユーザーインターフェイスを改良し、考古学的実体のための精度を向上させることで、考古学者が研究の質問をより効率的かつ効率的に回答できるようにし、過去をより一貫性のある物語に導きます。 This paper presents AGNES, the first information retrieval system for archaeological grey literature, allowing full-text search of these long archaeological documents. This search system has a web interface that allows archaeology professionals and scholars to search through a collection of over 60,000 Dutch excavation reports, totalling 361 million words. We conducted a user study for the evaluation of AGNES's search interface, with a small but diverse user group. The evaluation was done by screen capturing and a think aloud protocol, combined with a user interface feedback questionnaire. The evaluation covered both controlled use (completion of a pre-defined task) as well as free use (completion of a freely chosen task). The free use allows us to study the information needs of archaeologists, as well as their interactions with the search system. We conclude that: (1) the information needs of archaeologists are typically recall-oriented, often requiring a list of items as answer; (2) the users prefer the use of free-text queries over metadata filters, confirming the value of a free-text search system; (3) the compilation of a diverse user group contributed to the collection of diverse issues as feedback for improving the system. We are currently refining AGNES's user interface and improving its precision for archaeological entities, so that AGNES will help archaeologists to answer their research questions more effectively and efficiently, leading to a more coherent narrative of the past.	翻訳日:2021-03-11 00:17:46 公開日:2021-03-07
# (参考訳) Smooth Stochastic Optimizationのレトロスペクティブ近似 Retrospective Approximation for Smooth Stochastic Optimization ( http://arxiv.org/abs/2103.04392v1 ) ライセンス: CC BY 4.0	David Newton, Raghu Bollapragada, Raghu Pasupathy, Nung Kwan Yip	(参考訳) 確率的一階オラクルを用いてスムーズな(そして潜在的に非凸な)目的を最小化する確率的最適化問題を考察する。このような問題は、シミュレーション最適化からディープラーニングまで、多くの設定で発生します。本論文では,各イテレーションで$k$のサンプルパス近似問題を適応したサンプルサイズ$M_k$を用いて暗黙的に生成し,行探索準ニュートン法のような「決定論的方法」を用いて,適応した誤差許容度$\epsilon_k$に(事前の解法で)解決する,普遍的な逐次サンプル平均近似(SAA)パラダイムとして,Retrospective Approximation (RA) を提案する。 RAの主な利点は、最適化を確率的近似から切り離すことであり、既存の決定論的アルゴリズムを修正なしで直接採用できるため、確率的コンテキストのためのアルゴリズムを再設計する必要性が軽減される。 2つめの利点は、RAが並列化に寄与する明らかな方法である。 m_k, k \geq 1\}$ および $\{\epsilon_k, k\geq 1\}$ の条件を特定し、ほぼ確実に $l_1$-norm での収束と収束を保証し、最適なイテレーションと作業の複雑さ率を提供する。線形探索準ニュートンを用いたRAの性能について,未条件の最小二乗問題と深部畳み込みニューラルネットを用いた画像分類問題について述べる。 We consider stochastic optimization problems where a smooth (and potentially nonconvex) objective is to be minimized using a stochastic first-order oracle. These type of problems arise in many settings from simulation optimization to deep learning. We present Retrospective Approximation (RA) as a universal sequential sample-average approximation (SAA) paradigm where during each iteration $k$, a sample-path approximation problem is implicitly generated using an adapted sample size $M_k$, and solved (with prior solutions as "warm start") to an adapted error tolerance $\epsilon_k$, using a "deterministic method" such as the line search quasi-Newton method. The principal advantage of RA is that decouples optimization from stochastic approximation, allowing the direct adoption of existing deterministic algorithms without modification, thus mitigating the need to redesign algorithms for the stochastic context. A second advantage is the obvious manner in which RA lends itself to parallelization. We identify conditions on $\{M_k, k \geq 1\}$ and $\{\epsilon_k, k\geq 1\}$ that ensure almost sure convergence and convergence in $L_1$-norm, along with optimal iteration and work complexity rates. We illustrate the performance of RA with line-search quasi-Newton on an ill-conditioned least squares problem, as well as an image classification problem using a deep convolutional neural net.	翻訳日:2021-03-11 00:00:00 公開日:2021-03-07
# (参考訳) グラフベースピラミッドグローバルコンテキスト推論によるCOVID-19肺感染症セグメンテーションの検討 Graph-based Pyramid Global Context Reasoning with a Saliency-aware Projection for COVID-19 Lung Infections Segmentation ( http://arxiv.org/abs/2103.04235v1 ) ライセンス: CC BY 4.0	Huimin Huang, Ming Cai, Lanfen Lin, Jing Zheng, Xiongwei Mao, Xiaohan Qian, Zhiyi Peng, Jianying Zhou, Yutaro Iwamoto, Xian-Hua Han, Yen-Wei Chen, Ruofeng Tong	(参考訳) コロナウイルス病2019(COVID-19)は2020年に急速に広まり、CT画像から肺感染症のセグメンテーションに関する大量の研究が浮かび上がっている。この問題には多くの方法が提案されているが、さまざまなサイズの感染が異なるローブゾーンに現れるため、それは困難な課題である。そこで本研究では,不整合感染の長期依存性のモデル化とサイズ変化の適応が可能なグラフベースのPyramid Global Context Reasoning(Graph-PGCR)モジュールを提案する。最初にグラフ畳み込みを組み込んで、複数のローブゾーンから長期のコンテキスト情報を利用する。従来の平均プールや最大オブジェクト確率とは異なり、感染関連ピクセルをグラフノードの集合としてピックアップするサリエンシー対応のプロジェクションメカニズムを提案する。グラフ推論の後、関係認識機能はダウンストリームタスクのために元の座標空間に戻される。さらに,異なるサンプリングレートで複数のグラフをコンストラクトして,サイズ変動問題に対処する。この目的のために、異なるマルチスケールの長距離コンテキストパターンをキャプチャできる。当社のGraph-PGCRモジュールはプラグアンドプレイで、パフォーマンスを向上させるためにあらゆるアーキテクチャに統合できます。実験により、提案手法は、パブリックとプライベートのCOVID-19データセットの両方において、最先端のバックボーンアーキテクチャのパフォーマンスを継続的に向上することを示した。 Coronavirus Disease 2019 (COVID-19) has rapidly spread in 2020, emerging a mass of studies for lung infection segmentation from CT images. Though many methods have been proposed for this issue, it is a challenging task because of infections of various size appearing in different lobe zones. To tackle these issues, we propose a Graph-based Pyramid Global Context Reasoning (Graph-PGCR) module, which is capable of modeling long-range dependencies among disjoint infections as well as adapt size variation. We first incorporate graph convolution to exploit long-term contextual information from multiple lobe zones. Different from previous average pooling or maximum object probability, we propose a saliency-aware projection mechanism to pick up infection-related pixels as a set of graph nodes. After graph reasoning, the relation-aware features are reversed back to the original coordinate space for the down-stream tasks. We further con- struct multiple graphs with different sampling rates to handle the size variation problem. To this end, distinct multi-scale long-range contextual patterns can be captured. Our Graph- PGCR module is plug-and-play, which can be integrated into any architecture to improve its performance. Experiments demonstrated that the proposed method consistently boost the performance of state-of-the-art backbone architectures on both of public and our private COVID-19 datasets.	翻訳日:2021-03-10 23:16:47 公開日:2021-03-07
# (参考訳) ディファレンスを学ぶ: Sim2Real Small Defection Segmentation Network Learn to Differ: Sim2Real Small Defection Segmentation Network ( http://arxiv.org/abs/2103.04297v1 ) ライセンス: CC BY 4.0	Zexi Chen, Zheyuan Huang, Yunkai Wang, Xuecheng Xu, Yue Wang, Rong Xiong	(参考訳) 深層学習に基づく小さな欠陥セグメント化手法に関する最近の研究は、特定の設定で訓練されており、一定の文脈で制限される傾向にある。トレーニング中、ネットワークは、欠陥を突き止める前に、トレーニングデータの背景の表現を必然的に学習します。コンテキストが変更されると推論段階ではパフォーマンスが低下し、新しい設定ごとにトレーニングすることでのみ解決できる。これは最終的に、コンテキストが変化し続ける実用的ロボットアプリケーションに制限をもたらす。これに対処するために、ネットワークコンテキストをコンテキスト別にトレーニングし、一般化を期待するのではなく、制限されたコンテキストで誤解し、純粋なシミュレーションでトレーニングを開始すべきなのか? 本稿では,コンテキストに関わらず2つの画像間の小さな欠陥を識別する方法を学習するネットワークSSDSを提案する。画像間の位相相関のポーズ感度を利用した小さな欠陥検出層が導入され、その後に異常マスキング層が続く。ネットワークは、単純な形状でランダムに生成されたシミュレーションデータに基づいて訓練され、現実世界で一般化される。最後に、SSDSは実世界の収集されたデータに基づいて検証され、安価なシミュレーションでトレーニングしても、実際の世界で小さな欠陥を見つけ、実用的応用の可能性を示す能力を示す。 Recent studies on deep-learning-based small defection segmentation approaches are trained in specific settings and tend to be limited by fixed context. Throughout the training, the network inevitably learns the representation of the background of the training data before figuring out the defection. They underperform in the inference stage once the context changed and can only be solved by training in every new setting. This eventually leads to the limitation in practical robotic applications where contexts keep varying. To cope with this, instead of training a network context by context and hoping it to generalize, why not stop misleading it with any limited context and start training it with pure simulation? In this paper, we propose the network SSDS that learns a way of distinguishing small defections between two images regardless of the context, so that the network can be trained once for all. A small defection detection layer utilizing the pose sensitivity of phase correlation between images is introduced and is followed by an outlier masking layer. The network is trained on randomly generated simulated data with simple shapes and is generalized across the real world. Finally, SSDS is validated on real-world collected data and demonstrates the ability that even when trained in cheap simulation, SSDS can still find small defections in the real world showing the effectiveness and its potential for practical applications.	翻訳日:2021-03-10 23:07:21 公開日:2021-03-07
# (参考訳) オンライン機械学習手法が電力市場の長期投資決定と発電機利用に与える影響 The impact of online machine-learning methods on long-term investment decisions and generator utilization in electricity markets ( http://arxiv.org/abs/2103.04327v1 ) ライセンス: CC BY 4.0	Alexander J. M. Kell, A. Stephen McGough, Matthew Forshaw	(参考訳) 電力供給は常に需要に合致する必要があります。これにより、負荷周波数制御や停電といった問題が発生する確率を減らすことができる。今後24時間以内に必要となるであろう負荷をよりよく理解するためには、不確実性に基づく推定が必要である。これは、多くのマイクロプロデューサが中央制御下にない分散電力市場では特に困難である。本稿では,11のオフライン学習と5つのオンライン学習アルゴリズムによる次の24時間における電力需要プロファイルの予測について検討する。長期エージェントベースのモデルであるElecSimに統合することで実現します。今後の24時間における電力需要プロファイルの予測を通じて、日頭市場における予測をシミュレートすることができる。これらの予測を行った後、残留分布からサンプルを採取し、シミュレーションであるElecSimを用いて電力市場需要を摂動させる。これにより、分散型電力市場の長期的ダイナミクスに対するエラーの影響を理解することができる。提案手法では,オンラインアルゴリズムを用いて平均絶対誤差を30%削減でき,また,必要となる余裕のある全国的グリッドリザーブを削減できることを示した。この国別埋蔵量の減少は、コストと排出量の節約につながります。また, 予測精度の大きな誤差は, 17年間の時間枠での投資に不均等な誤差があり, 電気の混合も可能であることを示した。 Electricity supply must be matched with demand at all times. This helps reduce the chances of issues such as load frequency control and the chances of electricity blackouts. To gain a better understanding of the load that is likely to be required over the next 24h, estimations under uncertainty are needed. This is especially difficult in a decentralized electricity market with many micro-producers which are not under central control. In this paper, we investigate the impact of eleven offline learning and five online learning algorithms to predict the electricity demand profile over the next 24h. We achieve this through integration within the long-term agent-based model, ElecSim. Through the prediction of electricity demand profile over the next 24h, we can simulate the predictions made for a day-ahead market. Once we have made these predictions, we sample from the residual distributions and perturb the electricity market demand using the simulation, ElecSim. This enables us to understand the impact of errors on the long-term dynamics of a decentralized electricity market. We show we can reduce the mean absolute error by 30% using an online algorithm when compared to the best offline algorithm, whilst reducing the required tendered national grid reserve required. This reduction in national grid reserves leads to savings in costs and emissions. We also show that large errors in prediction accuracy have a disproportionate error on investments made over a 17-year time frame, as well as electricity mix.	翻訳日:2021-03-10 22:19:11 公開日:2021-03-07
# (参考訳) Markov Cricket: 1日の国際クリケットにおけるベッティングパフォーマンスのモデル化、予測、最適化にフォワードと逆強化学習を使う Markov Cricket: Using Forward and Inverse Reinforcement Learning to Model, Predict And Optimize Batting Performance in One-Day International Cricket ( http://arxiv.org/abs/2103.04349v1 ) ライセンス: CC BY 4.0	Manohar Vohra and George S. D. Gordon	(参考訳) 本稿では,1日の国際クリケット競技をマルコフプロセスとしてモデル化し,フォワードおよびインバース強化学習(rl)を適用し,新たな3つのツールを開発した。まず,モンテカルロ学習をスコアに基づく報酬モデルを用いて,ゲームの各状態に対する値関数の非線形近似に適用する。本手法は,残るスコアリング資源のプロキシとして使用する場合,プロの試合で使用されるダックワース・ルイス・ステルン法を3倍から10倍に上回っている。次に、逆強化学習(特にガイド付きコスト学習の変種)を用いて、エキスパートのパフォーマンスに基づいて報酬の線形モデルを推論し、ここでは勝利チームのプレーシーケンスと仮定する。このモデルから各状態に対する最適ポリシーを明示的に決定し、ゲームに関する一般的な直観と一致することを見つける。最後に、推定報酬モデルを用いて、異なるポリシーの下で最終スコアの後方分布をモデル化するゲームシミュレータを構築する。予測とシミュレーションのテクニックは中断されたゲームの最終スコアを推定するためのより公平な代替手段となり得るが、推定された報酬モデルはプロのゲームがプレイ戦略を最適化するための有用な洞察を提供するかもしれない。さらに,この競技にRLを適用する方法が,野球や球技など,チームが交互にプレーする個別の状態のスポーツに広く適用される可能性があることを期待する。 In this paper, we model one-day international cricket games as Markov processes, applying forward and inverse Reinforcement Learning (RL) to develop three novel tools for the game. First, we apply Monte-Carlo learning to fit a nonlinear approximation of the value function for each state of the game using a score-based reward model. We show that, when used as a proxy for remaining scoring resources, this approach outperforms the state-of-the-art Duckworth-Lewis-Stern method used in professional matches by 3 to 10 fold. Next, we use inverse reinforcement learning, specifically a variant of guided-cost learning, to infer a linear model of rewards based on expert performances, assumed here to be play sequences of winning teams. From this model we explicitly determine the optimal policy for each state and find this agrees with common intuitions about the game. Finally, we use the inferred reward models to construct a game simulator that models the posterior distribution of final scores under different policies. We envisage our prediction and simulation techniques may provide a fairer alternative for estimating final scores in interrupted games, while the inferred reward model may provide useful insights for the professional game to optimize playing strategy. Further, we anticipate our method of applying RL to this game may have broader application to other sports with discrete states of play where teams take turns, such as baseball and rounders.	翻訳日:2021-03-10 19:49:39 公開日:2021-03-07
# (参考訳) コード埋め込みを用いた半自動誤解発見に向けて Toward Semi-Automatic Misconception Discovery Using Code Embeddings ( http://arxiv.org/abs/2103.04448v1 ) ライセンス: CC BY 4.0	Yang Shi, Krupal Shah, Wengran Wang, Samiha Marwan, Poorvaja Penmetsa and Thomas W. Price	(参考訳) 生徒の誤解を理解することは効果的な指導と評価に重要である。しかし、そのような誤解を手動で発見することは時間と労力を要する。自動誤解発見(automated misconception discovery)は、学生データのパターンを強調することで、これらの課題に対処することができる。本研究では,現状のコード分類モデルを用いて,コンピュータコースにおける生徒のプログラムコードから問題固有の誤解を半自動で発見する手法を提案する。ブロックベースのプログラミングデータセットでモデルをトレーニングし、学習した埋め込みをクラスタの不正な学生の応募に使用しました。これらのクラスターは問題に関する特定の誤解に対応しており、既存のアプローチでは容易には発見できなかった。また、私たちのアプローチの潜在的な応用と、これらの誤解が学生の学習プロセスにドメイン固有の洞察をどう伝えるかについて議論します。 Understanding students' misconceptions is important for effective teaching and assessment. However, discovering such misconceptions manually can be time-consuming and laborious. Automated misconception discovery can address these challenges by highlighting patterns in student data, which domain experts can then inspect to identify misconceptions. In this work, we present a novel method for the semi-automated discovery of problem-specific misconceptions from students' program code in computing courses, using a state-of-the-art code classification model. We trained the model on a block-based programming dataset and used the learned embedding to cluster incorrect student submissions. We found these clusters correspond to specific misconceptions about the problem and would not have been easily discovered with existing approaches. We also discuss potential applications of our approach and how these misconceptions inform domain-specific insights into students' learning processes.	翻訳日:2021-03-10 19:35:45 公開日:2021-03-07
# (参考訳) 深層学習層のスペクトルテンソルトレインパラメータ化 Spectral Tensor Train Parameterization of Deep Learning Layers ( http://arxiv.org/abs/2103.04217v1 ) ライセンス: CC BY-SA 4.0	Anton Obukhov, Maxim Rakhuba, Alexander Liniger, Zhiwu Huang, Stamatios Georgoulis, Dengxin Dai, Luc Van Gool	(参考訳) 重み行列の低ランクパラメータ化をDeep Learningコンテキストに埋め込まれたスペクトル特性を用いて検討する。低ランク特性はパラメータ効率をもたらし、マッピングを計算する際に計算ショートカットを行うことができる。スペクトル特性はしばしば最適化問題に制約を受け、より良いモデルと最適化の安定性をもたらす。まず、重み行列のコンパクトなSVDパラメータ化とパラメータ化における冗長性源の同定から始める。さらに, テンソルトレイン(TT)分解をコンパクトなSVD成分に適用し, スペクトルテンソルトレインパラメータ化(STTP)と呼ばれる固定されたTTランクテンソル多様体の非冗長微分パラメータ化を提案する。画像分類設定におけるニューラルネットワーク圧縮の効果と,生成敵対的トレーニング設定における圧縮とトレーニング安定性の改善を実証する。 We study low-rank parameterizations of weight matrices with embedded spectral properties in the Deep Learning context. The low-rank property leads to parameter efficiency and permits taking computational shortcuts when computing mappings. Spectral properties are often subject to constraints in optimization problems, leading to better models and stability of optimization. We start by looking at the compact SVD parameterization of weight matrices and identifying redundancy sources in the parameterization. We further apply the Tensor Train (TT) decomposition to the compact SVD components, and propose a non-redundant differentiable parameterization of fixed TT-rank tensor manifolds, termed the Spectral Tensor Train Parameterization (STTP). We demonstrate the effects of neural network compression in the image classification setting and both compression and improved training stability in the generative adversarial training setting.	翻訳日:2021-03-10 17:48:26 公開日:2021-03-07
# 動的プログラミングを伴わないcnn音声単語検出と局所化 CNN-based Spoken Term Detection and Localization without Dynamic Programming ( http://arxiv.org/abs/2103.05468v1 ) ライセンス: Link先を確認	Tzeviya Sylvia Fuchs, Yael Segal and Joseph Keshet	(参考訳) 本稿では,音声セグメント内の語彙内および語彙外用語の同時予測と局所化のための音声項検出アルゴリズムを提案する。提案アルゴリズムは、音声信号の様々な部分の単語埋め込みを予測し、所望の単語埋め込みと比較することにより、ある単語が所定の音声信号内に発声されたか否かを推定する。このアルゴリズムはこのタスクに既存の埋め込みスペースを利用し、タスク固有の埋め込みスペースをトレーニングする必要がない。推定では、アルゴリズムはターゲット項のすべての可能な位置を同時に予測し、最適な検索のために動的プログラミングを必要としません。読み上げ音声コーポラにおける複数の音声単語検出タスクのシステム評価を行った。 In this paper, we propose a spoken term detection algorithm for simultaneous prediction and localization of in-vocabulary and out-of-vocabulary terms within an audio segment. The proposed algorithm infers whether a term was uttered within a given speech signal or not by predicting the word embeddings of various parts of the speech signal and comparing them to the word embedding of the desired term. The algorithm utilizes an existing embedding space for this task and does not need to train a task-specific embedding space. At inference the algorithm simultaneously predicts all possible locations of the target term and does not need dynamic programming for optimal search. We evaluate our system on several spoken term detection tasks on read speech corpora.	翻訳日:2021-03-10 14:42:32 公開日:2021-03-07
# (参考訳) ARVo:ビデオデブリのための全行ボリューム対応学習 ARVo: Learning All-Range Volumetric Correspondence for Video Deblurring ( http://arxiv.org/abs/2103.04260v1 ) ライセンス: CC BY 4.0	Dongxu Li, Chenchen Xu, Kaihao Zhang, Xin Yu, Yiran Zhong, Wenqi Ren, Hanna Suominen, Hongdong Li	(参考訳) ビデオデブラリングモデルは連続フレームを利用して、カメラの揺動や物体の動きからぼやけを取り除く。隣接するシャープパッチを利用するために、典型的な手法は主にホモグラフィや光学フローに依存し、隣接するぼやけたフレームを空間的に整列させる。しかし、そのような明示的なアプローチは、大きなピクセル変位を持つ高速な動きの存在において効果が低い。本研究では,特徴空間におけるぼやけたフレーム間の空間対応を学習する新しい暗黙的手法を提案する。遠方画素対応を構築するために, 隣接フレーム間のすべての画素対間の相関体積ピラミッドを構築する。参照フレームの特徴を高めるために,ボリュームピラミッドに基づいて,近傍とのピクセルペア相関を最大化する相関アグリゲーションモジュールを設計した。最後に,集約された特徴を復元モジュールに供給し,復元されたフレームを得る。我々は,モデルを漸進的に最適化するための生成的逆パラダイムを設計する。提案手法は,ビデオデブロアリング用高フレームレート(1000fps)データセット(HFR-DVD)とともに,広く採用されているDVDデータセットを用いて評価する。定量的および定性的な実験は、従来の最先端の手法に対する両方のデータセットで好適に動作し、ビデオデブレーションのための全範囲空間対応のモデリングの利点を確認することを示しています。 Video deblurring models exploit consecutive frames to remove blurs from camera shakes and object motions. In order to utilize neighboring sharp patches, typical methods rely mainly on homography or optical flows to spatially align neighboring blurry frames. However, such explicit approaches are less effective in the presence of fast motions with large pixel displacements. In this work, we propose a novel implicit method to learn spatial correspondence among blurry frames in the feature space. To construct distant pixel correspondences, our model builds a correlation volume pyramid among all the pixel-pairs between neighboring frames. To enhance the features of the reference frame, we design a correlative aggregation module that maximizes the pixel-pair correlations with its neighbors based on the volume pyramid. Finally, we feed the aggregated features into a reconstruction module to obtain the restored frame. We design a generative adversarial paradigm to optimize the model progressively. Our proposed method is evaluated on the widely-adopted DVD dataset, along with a newly collected High-Frame-Rate (1000 fps) Dataset for Video Deblurring (HFR-DVD). Quantitative and qualitative experiments show that our model performs favorably on both datasets against previous state-of-the-art methods, confirming the benefit of modeling all-range spatial correspondence for video deblurring.	翻訳日:2021-03-10 14:33:21 公開日:2021-03-07
# (参考訳) 局所単語統計は超越性によらず読解時間に影響する Local word statistics affect reading times independently of surprisal ( http://arxiv.org/abs/2103.04469v1 ) ライセンス: CC BY 4.0	Adam Goodkind and Klinton Bicknell	(参考訳) 代用的理論は、文処理における多くの現象を理解するための統一的な枠組み(hale, 2001; levy, 2008a)を提供し、全ての事前文脈で与えられた単語の条件付き確率が処理の困難を完全に決定することを示した。この主張の問題点として、条件付き確率が一定である場合でも、ある局所統計的単語頻度も処理に影響を与えることが示されている。ここでは、他のローカル統計が処理に役割を持つか、単語頻度が特別な場合であるかどうかを尋ねます。我々は,より複雑な局所統計量であるbigram と trigram の確率が,超越性とは独立に処理に影響を与えることを示す最初の明確な証拠を示す。これらの結果は、処理における局所統計の重要かつ独立した役割を示唆している。さらに、地域統計情報に大きすぎる効果がある理由を説明できるような仮説の新たな一般化の研究を動機付けている。 Surprisal theory has provided a unifying framework for understanding many phenomena in sentence processing (Hale, 2001; Levy, 2008a), positing that a word's conditional probability given all prior context fully determines processing difficulty. Problematically for this claim, one local statistic, word frequency, has also been shown to affect processing, even when conditional probability given context is held constant. Here, we ask whether other local statistics have a role in processing, or whether word frequency is a special case. We present the first clear evidence that more complex local statistics, word bigram and trigram probability, also affect processing independently of surprisal. These findings suggest a significant and independent role of local statistics in processing. Further, it motivates research into new generalizations of surprisal that can also explain why local statistical information should have an outsized effect.	翻訳日:2021-03-10 14:06:26 公開日:2021-03-07
# (参考訳) 説明可能な人工知能における反事実と原因:理論、アルゴリズム、応用 Counterfactuals and Causability in Explainable Artificial Intelligence: Theory, Algorithms, and Applications ( http://arxiv.org/abs/2103.04244v1 ) ライセンス: CC BY 4.0	Yu-Liang Chou and Catarina Moreira and Peter Bruza and Chun Ouyang and Joaquim Jorge	(参考訳) ディープラーニングモデルをより透明で説明しやすいものにする、モデルに依存しない方法への関心が高まっている。一部の研究者は、機械がある程度の人間レベルの説明可能性を達成するためには、この機械は人間の因果的理解可能な説明を提供する必要があると主張した。可利用性を提供する可能性のある特定のアルゴリズムのクラスは偽物である。本稿では,多種多様な文献を体系的に検証し,その事実と説明可能な人工知能の因果性について述べる。 PRISMAフレームワークの下でLDAトピックモデリング解析を行い、最も関連性の高い文献記事を見つけました。この分析の結果、調査されたアルゴリズムの接地理論と、その基礎となる特性と実世界データへの応用を考える新しい分類法が導かれた。この研究は、現在のAIのモデル非依存の反ファクトアルゴリズムは因果論的形式主義に基づいておらず、したがって人間の意思決定者への因果性を促進することができないことを示唆している。本研究では, 文献における主要なアルゴリズムから得られた説明は, 因果関係ではなく, 散発的な相関関係を提供し, 副最適, 誤った, あるいは偏見のある説明につながることを示唆した。本稿では,人工知能のモデル非依存的アプローチにおける可利用性向上のための新たな方向性と課題について述べる。 There has been a growing interest in model-agnostic methods that can make deep learning models more transparent and explainable to a user. Some researchers recently argued that for a machine to achieve a certain degree of human-level explainability, this machine needs to provide human causally understandable explanations, also known as causability. A specific class of algorithms that have the potential to provide causability are counterfactuals. This paper presents an in-depth systematic review of the diverse existing body of literature on counterfactuals and causability for explainable artificial intelligence. We performed an LDA topic modelling analysis under a PRISMA framework to find the most relevant literature articles. This analysis resulted in a novel taxonomy that considers the grounding theories of the surveyed algorithms, together with their underlying properties and applications in real-world data. This research suggests that current model-agnostic counterfactual algorithms for explainable AI are not grounded on a causal theoretical formalism and, consequently, cannot promote causability to a human decision-maker. Our findings suggest that the explanations derived from major algorithms in the literature provide spurious correlations rather than cause/effects relationships, leading to sub-optimal, erroneous or even biased explanations. This paper also advances the literature with new directions and challenges on promoting causability in model-agnostic approaches for explainable artificial intelligence.	翻訳日:2021-03-10 12:45:26 公開日:2021-03-07
# (参考訳) グラフデータ補完のための畳み込みグラフテンソルネット Convolutional Graph-Tensor Net for Graph Data Completion ( http://arxiv.org/abs/2103.04485v1 ) ライセンス: CC BY 4.0	Xiao-Yang Liu, Ming Zhu	(参考訳) グラフデータ補完は、一般的には、ソーシャルネットワーク、レコメンデーションシステム、モノのインターネットといったグラフ構造を持つため、基本的に重要な問題である。我々は,各ノードがデータ行列を持つグラフを,データ行列を3次元に積み重ねることにより,「textit{graph-tensor}」として表現する。本稿では,ディープニューラルネットワークを用いてグラフテンソルの一般変換を学習するグラフデータ補完問題に対して, \textit{Convolutional Graph-Tensor Net} (\textit{Conv GT-Net})を提案する。実験の結果、提案された \textit{Conv GT-Net} は、既存のアルゴリズムに対する完成精度 (50\% 高い) と完成速度 (3.6x $\sim$ 8.1x 速い) の両方において有意な改善を達成できることが示された。 Graph data completion is a fundamentally important issue as data generally has a graph structure, e.g., social networks, recommendation systems, and the Internet of Things. We consider a graph where each node has a data matrix, represented as a \textit{graph-tensor} by stacking the data matrices in the third dimension. In this paper, we propose a \textit{Convolutional Graph-Tensor Net} (\textit{Conv GT-Net}) for the graph data completion problem, which uses deep neural networks to learn the general transform of graph-tensors. The experimental results on the ego-Facebook data sets show that the proposed \textit{Conv GT-Net} achieves significant improvements on both completion accuracy (50\% higher) and completion speed (3.6x $\sim$ 8.1x faster) over the existing algorithms.	翻訳日:2021-03-10 12:44:15 公開日:2021-03-07
# (参考訳) 衝突層除去によるディープニューラルネットワークの自動チューニング Auto-tuning of Deep Neural Networks by Conflicting Layer Removal ( http://arxiv.org/abs/2103.04331v1 ) ライセンス: CC BY 4.0	David Peer, Sebastian Stabinger, Antonio Rodriguez-Sanchez	(参考訳) ニューラルネットワークアーキテクチャを設計することは難しい作業であり、どのモデルの特定の層をパフォーマンスを改善するために適応しなければならないかを知ることは、ほぼ謎である。本稿では,学習モデルのテスト精度を低下させる層を識別する新しい手法を提案する。矛盾する層は、トレーニングの開始時に早期に検出される。最悪のシナリオでは、そのような層がまったく訓練できないネットワークにつながる可能性があることを証明します。理論的分析は、ネットワーク全体のパフォーマンスが低下するこれらの層の起源について提供され、これは広範な実証的評価によって補完されます。より正確には、競合するトレーニングバンドルを生成するため、パフォーマンスを悪化させるレイヤを特定しました。トレーニングされた残存ネットワークのレイヤの約60%が、テストエラーの有意な増加なしに、アーキテクチャから完全に削除できることを示します。さらに、トレーニングの開始時に相反する層を識別する新しいニューラルアーキテクチャサーチ(NAS)アルゴリズムを紹介します。自動チューニングアルゴリズムが検出したアーキテクチャは、より複雑な最先端アーキテクチャと比較すると、競合精度が向上する一方で、異なるコンピュータビジョンタスクのメモリ消費と推論時間を劇的に削減する。ソースコードはhttps://github.com/peerdavid/conflicting-bundlesで入手できる。 Designing neural network architectures is a challenging task and knowing which specific layers of a model must be adapted to improve the performance is almost a mystery. In this paper, we introduce a novel methodology to identify layers that decrease the test accuracy of trained models. Conflicting layers are detected as early as the beginning of training. In the worst-case scenario, we prove that such a layer could lead to a network that cannot be trained at all. A theoretical analysis is provided on what is the origin of those layers that result in a lower overall network performance, which is complemented by our extensive empirical evaluation. More precisely, we identified those layers that worsen the performance because they would produce what we name conflicting training bundles. We will show that around 60% of the layers of trained residual networks can be completely removed from the architecture with no significant increase in the test-error. We will further present a novel neural-architecture-search (NAS) algorithm that identifies conflicting layers at the beginning of the training. Architectures found by our auto-tuning algorithm achieve competitive accuracy values when compared against more complex state-of-the-art architectures, while drastically reducing memory consumption and inference time for different computer vision tasks. The source code is available on https://github.com/peerdavid/conflicting-bundles	翻訳日:2021-03-10 09:20:01 公開日:2021-03-07
# (参考訳) 特徴履歴による効率的なモデル性能推定 Efficient Model Performance Estimation via Feature Histories ( http://arxiv.org/abs/2103.04450v1 ) ライセンス: CC BY 4.0	Shengcao Cao, Xiaofang Wang, Kris Kitani	(参考訳) ハイパーパラメータ最適化(HPO)やニューラルアーキテクチャサーチ(NAS)といったニューラルネットワーク設計の課題の重要なステップは、候補モデルの性能を評価することである。一定の計算リソースがあれば、各モデルにより多くの時間を費やして最終的なパフォーマンスの正確な見積もりを得るか、設定スペースでより多様なモデルを探索するより多くの時間を費やすことができます。本研究では、トレーニングプロセスの早期にモデルの最大性能を正確に近似することにより、画像分類のためのHPOとNASの文脈でこの探索-探索トレードオフを最適化することを目指しています。特定の検索空間向けにカスタマイズされた最近の高速化NAS手法とは対照的に、例えば、検索空間の微分が要求される場合、我々の手法は柔軟であり、検索空間にほとんど制約を課さない。本手法は,訓練の初期段階におけるネットワークの特徴の進化履歴を用いて,検討中のネットワークのピーク性能に一致するプロキシ分類器を構築する。本手法は複数の探索アルゴリズムと組み合わせ、HPOやNASの幅広いタスクに対するより良いソリューションを見つけることができることを示す。サンプリングに基づく検索アルゴリズムと並列計算を用いて,dartよりも優れたアーキテクチャを探索し,壁時間探索時間の80%削減を実現する。 An important step in the task of neural network design, such as hyper-parameter optimization (HPO) or neural architecture search (NAS), is the evaluation of a candidate model's performance. Given fixed computational resources, one can either invest more time training each model to obtain more accurate estimates of final performance, or spend more time exploring a greater variety of models in the configuration space. In this work, we aim to optimize this exploration-exploitation trade-off in the context of HPO and NAS for image classification by accurately approximating a model's maximal performance early in the training process. In contrast to recent accelerated NAS methods customized for certain search spaces, e.g., requiring the search space to be differentiable, our method is flexible and imposes almost no constraints on the search space. Our method uses the evolution history of features of a network during the early stages of training to build a proxy classifier that matches the peak performance of the network under consideration. We show that our method can be combined with multiple search algorithms to find better solutions to a wide range of tasks in HPO and NAS. Using a sampling-based search algorithm and parallel computing, our method can find an architecture which is better than DARTS and with an 80% reduction in wall-clock search time.	翻訳日:2021-03-10 08:59:08 公開日:2021-03-07
# (参考訳) 対人学習による公正度の推定と改善 Estimating and Improving Fairness with Adversarial Learning ( http://arxiv.org/abs/2103.04243v1 ) ライセンス: CC BY 4.0	Xiaoxiao Li, Ziteng Cui, Yifan Wu, Li Gu, Tatsuya Harada	(参考訳) 医療における信頼される人工知能(AI)には、公平性と説明責任が不可欠です。しかし、既存のAIモデルは決定マーキングに偏る可能性があります。そこで本研究では,深層学習に基づく医用画像解析システムにおいて,バイアスの軽減と検出を同時に行うマルチタスク学習戦略を提案する。具体的には,バイアスに対する識別モジュールと,ベース分類モデルにおける不公平性を予測するクリティカルモジュールを追加することを提案する。さらに、トレーニング中に2つのモジュールが独立するように直交正規化を強制します。したがって、これらの深層学習タスクを互いに区別し、多様体上の特異点に分解することを避けることができる。この敵対的なトレーニング方法を通じて、性別や肌のトーンなどの属性のためにバイアスに対して脆弱である恵まれないグループからのデータは、これらの属性に対して中立なドメインに転送されます。さらに、クリティカルモジュールは、未知の敏感な属性を持つデータの公平度スコアを予測できます。各種フェアネス評価指標に基づいて, 大規模皮膚病変データセット上での枠組みの評価を行った。本実験は,深層学習に基づく医用画像解析システムにおいて,フェアネスを推定・改善するための提案手法の有効性を示すものである。 Fairness and accountability are two essential pillars for trustworthy Artificial Intelligence (AI) in healthcare. However, the existing AI model may be biased in its decision marking. To tackle this issue, we propose an adversarial multi-task training strategy to simultaneously mitigate and detect bias in the deep learning-based medical image analysis system. Specifically, we propose to add a discrimination module against bias and a critical module that predicts unfairness within the base classification model. We further impose an orthogonality regularization to force the two modules to be independent during training. Hence, we can keep these deep learning tasks distinct from one another, and avoid collapsing them into a singular point on the manifold. Through this adversarial training method, the data from the underprivileged group, which is vulnerable to bias because of attributes such as sex and skin tone, are transferred into a domain that is neutral relative to these attributes. Furthermore, the critical module can predict fairness scores for the data with unknown sensitive attributes. We evaluate our framework on a large-scale public-available skin lesion dataset under various fairness evaluation metrics. The experiments demonstrate the effectiveness of our proposed method for estimating and improving fairness in the deep learning-based medical image analysis system.	翻訳日:2021-03-10 08:05:07 公開日:2021-03-07
# (参考訳) 離散構成エネルギーネットワークを用いたRNA代替スプライシング予測 RNA Alternative Splicing Prediction with Discrete Compositional Energy Network ( http://arxiv.org/abs/2103.04246v1 ) ライセンス: CC BY 4.0	Alvin Chan, Anna Korsakova, Yew-Soon Ong, Fernaldo Richtia Winnerdy, Kah Wai Lim, Anh Tuan Phan	(参考訳) 単一の遺伝子は、代替スプライシングと呼ばれるプロセスを通じて、異なるタンパク質バージョンをコードすることができる。タンパク質は細胞機能において主要な役割を果たすため、異常なスプライシングプロファイルはがんを含む様々な疾患を引き起こす可能性がある。代替スプライシングは、遺伝子の一次配列およびRNA結合タンパク質レベルなどの他の調節因子によって決定される。これを入力として、RNAスプライシングの予測を回帰タスクとして定式化し、学習モデルをベンチマークするための新しいトレーニングデータセット(CAPD)を構築します。本研究では,スプライスサイト,接合部,転写部間の階層的関係を利用した離散構成エネルギーネットワーク(DCEN)を提案する。代替スプライシング予測の場合、DCENはその構成スプライス接合のエネルギー値を通じてmRNA転写確率をモデル化する。これらの転写確率はその後、キーヌクレオチドの相対的存在量値にマッピングされ、基礎実験によって訓練される。 CAPDの実験を通じて、DCENがベースラインとアブレーションバリアントを上回っていることを示します。 A single gene can encode for different protein versions through a process called alternative splicing. Since proteins play major roles in cellular functions, aberrant splicing profiles can result in a variety of diseases, including cancers. Alternative splicing is determined by the gene's primary sequence and other regulatory factors such as RNA-binding protein levels. With these as input, we formulate the prediction of RNA splicing as a regression task and build a new training dataset (CAPD) to benchmark learned models. We propose discrete compositional energy network (DCEN) which leverages the hierarchical relationships between splice sites, junctions and transcripts to approach this task. In the case of alternative splicing prediction, DCEN models mRNA transcript probabilities through its constituent splice junctions' energy values. These transcript probabilities are subsequently mapped to relative abundance values of key nucleotides and trained with ground-truth experimental measurements. Through our experiments on CAPD, we show that DCEN outperforms baselines and ablation variants.	翻訳日:2021-03-10 07:31:20 公開日:2021-03-07
# (参考訳) 無監視異常検出のための学生教師特徴ピラミッドマッチング Student-Teacher Feature Pyramid Matching for Unsupervised Anomaly Detection ( http://arxiv.org/abs/2103.04257v1 ) ライセンス: CC BY 4.0	Guodong Wang, Shumin Han, Errui Ding, Di Huang	(参考訳) 異常検出は難しい課題であり、通常、異常の予期せぬ性質に対する教師なし学習問題として定式化される。本稿では,生徒-教員の枠組みにその利点を生かして実装するが,精度と効率の両面で大幅に拡張する,シンプルかつ強力な手法を提案する。イメージ分類を教師として事前訓練した強いモデルから,その知識を同一のアーキテクチャで単一学生ネットワークに抽出し,異常な画像の分布を学習し,このワンステップ転送は可能な限り重要な手がかりを保存する。さらに,マルチスケールな特徴マッチング戦略をフレームワークに統合し,この階層的な特徴アライメントにより,より優れた監視の下で,学生ネットワークが特徴ピラミッドから多段階の知識を混在させることで,様々な大きさの異常を検出することができる。 2つのネットワークによって生成される特徴ピラミッドの違いは、異常が起こる確率を示すスコア関数として機能する。このような操作により、正確で高速なピクセルレベルの異常検出を実現します。非常に競争力のある結果は、3つの主要なベンチマークで提供されます。さらに、非常に高速(256x256のサイズの画像のための100 FPS)で推論を行い、最新のものよりも少なくとも数十倍高速です。 Anomaly detection is a challenging task and usually formulated as an unsupervised learning problem for the unexpectedness of anomalies. This paper proposes a simple yet powerful approach to this issue, which is implemented in the student-teacher framework for its advantages but substantially extends it in terms of both accuracy and efficiency. Given a strong model pre-trained on image classification as the teacher, we distill the knowledge into a single student network with the identical architecture to learn the distribution of anomaly-free images and this one-step transfer preserves the crucial clues as much as possible. Moreover, we integrate the multi-scale feature matching strategy into the framework, and this hierarchical feature alignment enables the student network to receive a mixture of multi-level knowledge from the feature pyramid under better supervision, thus allowing to detect anomalies of various sizes. The difference between feature pyramids generated by the two networks serves as a scoring function indicating the probability of anomaly occurring. Due to such operations, our approach achieves accurate and fast pixel-level anomaly detection. Very competitive results are delivered on three major benchmarks, significantly superior to the state of the art ones. In addition, it makes inferences at a very high speed (with 100 FPS for images of the size at 256x256), at least dozens of times faster than the latest counterparts.	翻訳日:2021-03-10 01:33:32 公開日:2021-03-07
# (参考訳) ランドマーク検出を用いた多項式曲線フィッティングによる歯根ファジィエッジの高分解能分割 High-Resolution Segmentation of Tooth Root Fuzzy Edge Based on Polynomial Curve Fitting with Landmark Detection ( http://arxiv.org/abs/2103.04258v1 ) ライセンス: CC BY 4.0	Yunxiang Li, Yifan Zhang, Yaqi Wang, Shuai Wang, Ruizi Peng, Kai Tang, Qianni Zhang, Jun Wang, Qun Jin, Lingling Sun	(参考訳) 根管治療の診断における最も経済的かつ定期的な補助的検査として、口腔X線は、皮膚科医によって広く用いられている。従来の画像分割法では歯根をぼやけた境界で分割することは依然として困難である。そこで,ランドマーク検出(HS-PCL)を用いた多項式曲線フィッティングに基づく高分解能セグメンテーションモデルを提案する。歯根の縁に均等に分布する複数のランドマークを検出し、滑らかな多項式曲線を歯根のセグメント化として適合させ、ファジィエッジの問題を解決する。本モデルでは,不正確に検出された間違ったランドマークの悪影響を自動的に低減し,適合結果に歯根から逸脱する最短距離アルゴリズム(mnsda)の最大数を提案する。数値実験により,提案手法は最先端の手法と比較して,Hausdorff95 (HD95) を33.9%,Average Surface Distance (ASD) を42.1%削減するだけでなく,データセットの分量にも優れた結果が得られ,医用画像処理による根管自動治療の有効性が向上した。 As the most economical and routine auxiliary examination in the diagnosis of root canal treatment, oral X-ray has been widely used by stomatologists. It is still challenging to segment the tooth root with a blurry boundary for the traditional image segmentation method. To this end, we propose a model for high-resolution segmentation based on polynomial curve fitting with landmark detection (HS-PCL). It is based on detecting multiple landmarks evenly distributed on the edge of the tooth root to fit a smooth polynomial curve as the segmentation of the tooth root, thereby solving the problem of fuzzy edge. In our model, a maximum number of the shortest distances algorithm (MNSDA) is proposed to automatically reduce the negative influence of the wrong landmarks which are detected incorrectly and deviate from the tooth root on the fitting result. Our numerical experiments demonstrate that the proposed approach not only reduces Hausdorff95 (HD95) by 33.9% and Average Surface Distance (ASD) by 42.1% compared with the state-of-the-art method, but it also achieves excellent results on the minute quantity of datasets, which greatly improves the feasibility of automatic root canal therapy evaluation by medical image computing.	翻訳日:2021-03-10 01:20:03 公開日:2021-03-07
# (参考訳) 部分アスペクト角SARターゲット認識のためのPose Disrepancy Spatial Transformer Pose Discrepancy Spatial Transformer Based Feature Disentangling for Partial Aspect Angles SAR Target Recognition ( http://arxiv.org/abs/2103.04329v1 ) ライセンス: CC BY 4.0	Zaidao Wen, Jiaxiang Liu, Zhunga Liu, Quan Pan	(参考訳) 本文は,合成開口レーダ(SAR)自動目標認識(ATR)タスクのための新しいフレームワークであるDistSTNを提示する。従来のSAR ATRアルゴリズムとは対照的に、DistSTNは、トレーニングのアスペクト角が不完全で部分的な範囲に制限されている非協力的ターゲットに対して、テストサンプルの角度が無制限であるより困難な実用シナリオを検討している。この問題に対処するため、ポーズ不変の特徴を学習する代わりに、DistSTNは、SARターゲットの学習したポーズファクタとアイデンティティファクタを分離し、ターゲットイメージの表現プロセスを独立して制御できるように、精巧な機能分離モデルを含む。説明可能なポーズ因子を分離するために、DistSTNのポーズ不一致空間トランスフォーマーモジュールを開発し、2つの異なるターゲットの因子間の固有の変換を明示的な幾何学的モデルで特徴付ける。さらに、DistSTNは、エンコーダ・デコーダ機構を用いて効率的な特徴抽出と認識を可能にする、償却推論方式を開発した。移動目標獲得・認識(MSTAR)ベンチマークによる実験結果から,提案手法の有効性が示された。他のatrアルゴリズムと比較して、diststnは高い認識精度を達成できる。 This letter presents a novel framework termed DistSTN for the task of synthetic aperture radar (SAR) automatic target recognition (ATR). In contrast to the conventional SAR ATR algorithms, DistSTN considers a more challenging practical scenario for non-cooperative targets whose aspect angles for training are incomplete and limited in a partial range while those of testing samples are unlimited. To address this issue, instead of learning the pose invariant features, DistSTN newly involves an elaborated feature disentangling model to separate the learned pose factors of a SAR target from the identity ones so that they can independently control the representation process of the target image. To disentangle the explainable pose factors, we develop a pose discrepancy spatial transformer module in DistSTN to characterize the intrinsic transformation between the factors of two different targets with an explicit geometric model. Furthermore, DistSTN develops an amortized inference scheme that enables efficient feature extraction and recognition using an encoder-decoder mechanism. Experimental results with the moving and stationary target acquisition and recognition (MSTAR) benchmark demonstrate the effectiveness of our proposed approach. Compared with the other ATR algorithms, DistSTN can achieve higher recognition accuracy.	翻訳日:2021-03-10 01:12:39 公開日:2021-03-07
# (参考訳) IRON:不変ベースの高ロバストポイントクラウド登録 IRON: Invariant-based Highly Robust Point Cloud Registration ( http://arxiv.org/abs/2103.04357v1 ) ライセンス: CC0 1.0	Lei Sun	(参考訳) 本稿では,非最小かつ高ロバストなポイントクラウド登録法であるiron (invariant-based global robust estimation and optimization)を提案する。これを実現するために、登録問題をそれぞれスケール、回転、翻訳の推定に分離します。最初のコントリビューションは、ランダムなサンプル間のインリエリエンスを求めるために不変互換性を採用し、2つの点群間のスケールを堅牢に推定するRANSIC(RANdom Samples with Invariant Compatibility)を提案することです。スケールを見積もると、第2の貢献は、SOS(Sum-of-Squares)緩和を用いて、非凸なグローバル登録問題をSDP(convex Semi-Definite Program)に緩和し、緩和がきついことを示すことである。また、ロバストな推定のために、従来のGNCよりもロバスト性および時間効率のよいグローバルな外乱拒絶ヒューリスティックであるRT-GNC(Rough Trimming and Graduated Non-Convexity)を3番目の貢献として提案する。これらの貢献により、登録アルゴリズム、ironをレンダリングできます。実データセット上での実験を通じて,鉄は99%の異常値に対して効率的,高精度,堅牢であり,既存の最先端アルゴリズムを上回っていることを示した。 In this paper, we present IRON (Invariant-based global Robust estimation and OptimizatioN), a non-minimal and highly robust solution for point cloud registration with a great number of outliers among the correspondences. To realize this, we decouple the registration problem into the estimation of scale, rotation and translation, respectively. Our first contribution is to propose RANSIC (RANdom Samples with Invariant Compatibility), which employs the invariant compatibility to seek inliers among random samples and robustly estimates the scale between two sets of point clouds in the meantime. Once the scale is estimated, our second contribution is to relax the non-convex global registration problem into a convex Semi-Definite Program (SDP) in a certifiable way using Sum-of-Squares (SOS) Relaxation and show that the relaxation is tight. For robust estimation, we further propose RT-GNC (Rough Trimming and Graduated Non-Convexity), a global outlier rejection heuristic having better robustness and time-efficiency than traditional GNC, as our third contribution. With these contributions, we can render our registration algorithm, IRON. Through experiments over real datasets, we show that IRON is efficient, highly accurate and robust against as many as 99% outliers whether the scale is known or unknown, outperforming the existing state-of-the-art algorithms.	翻訳日:2021-03-10 01:01:08 公開日:2021-03-07
# (参考訳) 写真における自動フレアスポットアーティファクト検出と除去 Automatic Flare Spot Artifact Detection and Removal in Photographs ( http://arxiv.org/abs/2103.04384v1 ) ライセンス: CC BY 4.0	Patricia Vitoria and Coloma Ballester	(参考訳) フレアスポットは、多くの条件によって引き起こされる1つのタイプのフレアアーティファクトであり、しばしばカメラの視野内または近くで1つ以上の高輝度光源によって誘発される。高輝度源からの光がカメラの前面要素に到達すると、撮像された画像に非画像情報またはフレアを形成するフィルム面に出現するカメラ素子の内部反射を生成することができる。予防機構が用いられるが、アーティファクトが現れることもある。本稿では,フレアスポットアーティファクトを自動的に検出・除去する頑健な計算手法を提案する。第一に、フレアスポットが満たされる可能性のある本質的な特性に基づく特性評価を提案し、第二に、候補者の中からフレアスポットを選択できる新たな信頼度尺度を定義し、最後に、フレア領域を正確に決定する手法を提供します。そして、前記検出されたアーティファクトを、模範ベースの塗布を用いて除去する。アルゴリズムが最高水準の定量的および定性的な性能を達成することを示します。 Flare spot is one type of flare artifact caused by a number of conditions, frequently provoked by one or more high-luminance sources within or close to the camera field of view. When light rays coming from a high-luminance source reach the front element of a camera, it can produce intra-reflections within camera elements that emerge at the film plane forming non-image information or flare on the captured image. Even though preventive mechanisms are used, artifacts can appear. In this paper, we propose a robust computational method to automatically detect and remove flare spot artifacts. Our contribution is threefold: firstly, we propose a characterization which is based on intrinsic properties that a flare spot is likely to satisfy; secondly, we define a new confidence measure able to select flare spots among the candidates; and, finally, a method to accurately determine the flare region is given. Then, the detected artifacts are removed by using exemplar-based inpainting. We show that our algorithm achieve top-tier quantitative and qualitative performance.	翻訳日:2021-03-10 00:28:02 公開日:2021-03-07
# (参考訳) シーンテキスト認識に本当のデータセットしか使わないとしたら? ラベルの少ないシーンテキスト認識に向けて What If We Only Use Real Datasets for Scene Text Recognition? Toward Scene Text Recognition With Fewer Labels ( http://arxiv.org/abs/2103.04400v1 ) ライセンス: CC BY-SA 4.0	Jeonghun Baek, Yusuke Matsui, Kiyoharu Aizawa	(参考訳) シーンテキスト認識(STR)タスクは、一般的なプラクティスを持っています:すべての最先端のSTRモデルは、大規模な合成データで訓練されます。この練習とは対照的に、合成データなしでSTRモデルを訓練する必要があるとき、より少ない実ラベル(ラベルの少ないSTR)でのみSTRモデルをトレーニングすることは重要です。しかし、実際のデータは不十分であるため、実データ上でSTRモデルをトレーニングすることはほぼ不可能であるという暗黙の共通知識がある。この共通知識がラベルの少ないSTRの研究を妨げていると考えます。本研究では,共通知識を否定し,少ないラベルでSTRを再活性化する。我々は最近蓄積した公開実データを統合することで、STRモデルを実際のラベル付きデータでのみ満足に訓練できることを示します。その後、実データを完全に活用するための単純なデータ拡張が見つかる。さらに,ラベルなしデータを収集し,半教師付きおよび自己教師付き手法を導入することで,モデルを改善する。その結果,最先端手法に対する競争モデルが得られた。我々の知る限りでは、1)実際のラベルのみを用いることで十分な性能を示す最初の研究であり、2)より少ないラベルを持つSTRに半自己監督手法を導入する。私たちのコードとデータが利用可能です。 https://github.com/ku21fan/STR-Fewer-Labels Scene text recognition (STR) task has a common practice: All state-of-the-art STR models are trained on large synthetic data. In contrast to this practice, training STR models only on fewer real labels (STR with fewer labels) is important when we have to train STR models without synthetic data: for handwritten or artistic texts that are difficult to generate synthetically and for languages other than English for which we do not always have synthetic data. However, there has been implicit common knowledge that training STR models on real data is nearly impossible because real data is insufficient. We consider that this common knowledge has obstructed the study of STR with fewer labels. In this work, we would like to reactivate STR with fewer labels by disproving the common knowledge. We consolidate recently accumulated public real data and show that we can train STR models satisfactorily only with real labeled data. Subsequently, we find simple data augmentation to fully exploit real data. Furthermore, we improve the models by collecting unlabeled data and introducing semi- and self-supervised methods. As a result, we obtain a competitive model to state-of-the-art methods. To the best of our knowledge, this is the first study that 1) shows sufficient performance by only using real labels and 2) introduces semi- and self-supervised methods into STR with fewer labels. Our code and data are available: https://github.com/ku21fan/STR-Fewer-Labels	翻訳日:2021-03-09 23:28:55 公開日:2021-03-07
# (参考訳) スナップショット圧縮イメージング:原理、実装、理論、アルゴリズムおよび応用 Snapshot Compressive Imaging: Principle, Implementation, Theory, Algorithms and Applications ( http://arxiv.org/abs/2103.04421v1 ) ライセンス: CC BY 4.0	Xin Yuan and David J. Brady and Aggelos K. Katsaggelos	(参考訳) 高次元(HD)データの取得は、信号処理とその関連分野における長期的な課題である。スナップショット圧縮イメージング(SCI)は、2次元(2D)検出器を使用して、シュエムスナップショット測定でHD($\ge3$D)データをキャプチャします。新規な光学設計により、2D検出器は、HDデータを圧縮的にサンプリングし、その後、アルゴリズムを用いて所望のHDデータキューブを再構築する。 SCIは、ハイパースペクトルイメージング、ビデオ、ホログラフィー、トモグラフィー、焦点深度イメージング、偏光イメージング、顕微鏡、画像撮影に使われてきた。ディープラーニングにインスパイアされた様々なディープニューラルネットワークも、スペクトルSCIとビデオSCIのHDデータキューブを再構築するために開発されている。本稿では、最適化に基づくアルゴリズムとディープラーニングに基づくアルゴリズムの両方を含む、sciハードウェア、理論、アルゴリズムの最近の進歩を概観する。様々な応用やSCIの展望についても論じる。 Capturing high-dimensional (HD) data is a long-term challenge in signal processing and related fields. Snapshot compressive imaging (SCI) uses a two-dimensional (2D) detector to capture HD ($\ge3$D) data in a {\em snapshot} measurement. Via novel optical designs, the 2D detector samples the HD data in a {\em compressive} manner; following this, algorithms are employed to reconstruct the desired HD data-cube. SCI has been used in hyperspectral imaging, video, holography, tomography, focal depth imaging, polarization imaging, microscopy, \etc.~Though the hardware has been investigated for more than a decade, the theoretical guarantees have only recently been derived. Inspired by deep learning, various deep neural networks have also been developed to reconstruct the HD data-cube in spectral SCI and video SCI. This article reviews recent advances in SCI hardware, theory and algorithms, including both optimization-based and deep-learning-based algorithms. Diverse applications and the outlook of SCI are also discussed.	翻訳日:2021-03-09 22:59:56 公開日:2021-03-07
# (参考訳) TypeShift: タイピング生産プロセスを視覚化するためのユーザインターフェース TypeShift: A User Interface for Visualizing the Typing Production Process ( http://arxiv.org/abs/2103.04222v1 ) ライセンス: CC BY 4.0	Adam Goodkind	(参考訳) TypeShiftは、生産をタイプするタイミングで言語パターンを視覚化するためのツールです。言語生産は、言語、認知、運動能力に基づく複雑なプロセスです。タイピングプロセスにおける全体的なトレンドを視覚化することで、typeshiftは、単語レベルと文字レベルの両方でタイピングパターンを表現するのに使われるノイズの多い情報信号を明らかにすることを目指している。これは研究者が特定の言語現象を比較・比較し、個別のタイピングセッションを複数のグループ平均と比較することで達成される。最後に、TypeShiftはもともとデータタイピング用に設計されていたが、音声データにも容易に適応できる。 Webデモはhttps://angoodkind.shinyapps.io/TypeShift/で公開されている。ソースコードはhttps://github.com/angoodkind/typeshiftからアクセスできる。 TypeShift is a tool for visualizing linguistic patterns in the timing of typing production. Language production is a complex process which draws on linguistic, cognitive and motor skills. By visualizing holistic trends in the typing process, TypeShift aims to elucidate the often noisy information signals that are used to represent typing patterns, both at the word-level and character-level. It accomplishes this by enabling a researcher to compare and contrast specific linguistic phenomena, and compare an individual typing session to multiple group averages. Finally, although TypeShift was originally designed for typing data, it can easy be adapted to accommodate speech data, as well. A web demo is available at https://angoodkind.shinyapps.io/TypeShift/. The source code can be accessed at https://github.com/angoodkind/TypeShift.	翻訳日:2021-03-09 21:33:31 公開日:2021-03-07
# (参考訳) モデルなし強化学習におけるQ-関数の再利用がトータルレグレットに及ぼす影響 The Effect of Q-function Reuse on the Total Regret of Tabular, Model-Free, Reinforcement Learning ( http://arxiv.org/abs/2103.04416v1 ) ライセンス: CC BY 4.0	Volodymyr Tkachuk, Sriram Ganapathi Subramanian, Matthew E. Taylor	(参考訳) 一部の強化学習方法は、実世界では実用的ではない高いサンプル複雑性に苦しんでいます。転送学習メソッドである$Q$-functionの再利用は、学習のサンプル複雑さを低減し、既存のアルゴリズムの有用性を向上させる1つの方法です。これまでの研究は、モデルフリーアルゴリズムに適用した場合、様々な環境における$Q$-functionの再利用の実証的な効果を示してきた。私たちの知る限りでは、表型でモデルフリーな設定に適用される場合、$q$-関数再利用の後悔を示す理論的研究は存在しません。 UCB-Hoeffdingアルゴリズムを用いた$Q$-learningに適用した場合の$Q$-functionの再利用効果に関する理論的知見を提供することで、$Q$-functionの再利用における理論的作業と経験的作業のギャップを埋めることを目指している。 q$-関数の再利用がucb-hoeffdingアルゴリズムによる$q$-learningに適用された場合、状態やアクション空間とは無関係な後悔があることを示すことが私たちの大きな貢献です。また,理論的な知見を裏付ける実証的な結果も提供する。 Some reinforcement learning methods suffer from high sample complexity causing them to not be practical in real-world situations. $Q$-function reuse, a transfer learning method, is one way to reduce the sample complexity of learning, potentially improving usefulness of existing algorithms. Prior work has shown the empirical effectiveness of $Q$-function reuse for various environments when applied to model-free algorithms. To the best of our knowledge, there has been no theoretical work showing the regret of $Q$-function reuse when applied to the tabular, model-free setting. We aim to bridge the gap between theoretical and empirical work in $Q$-function reuse by providing some theoretical insights on the effectiveness of $Q$-function reuse when applied to the $Q$-learning with UCB-Hoeffding algorithm. Our main contribution is showing that in a specific case if $Q$-function reuse is applied to the $Q$-learning with UCB-Hoeffding algorithm it has a regret that is independent of the state or action space. We also provide empirical results supporting our theoretical findings.	翻訳日:2021-03-09 19:32:13 公開日:2021-03-07
# (参考訳) 自動運転のV&Vと安全保証のためのカバレッジベーステスト:システム文献レビュー Coverage based testing for V&V and Safety Assurance of Self-driving Autonomous Vehicles: A Systematic Literature Review ( http://arxiv.org/abs/2103.04364v1 ) ライセンス: CC BY 4.0	Zaid Tahir, Rob Alexander	(参考訳) 自動運転車(SAV)は、業界だけでなく一般の人々によって毎日より多くの関心を集めています。テクノロジー企業や自動車会社は、将来のSAV市場でのヘッドスタートを確実にするために、SAVの研究開発に膨大な資金を投資しています。 SAVが公道に到達する際の大きなハードルの1つは、SAVの安全面における公衆の信頼の欠如である。世界中の研究者は、安全を確保し、SAVの安全性に国民に信頼を提供するために、SAVの検証と検証(V&V)と安全保証のためのカバレッジベースのテストを使用しています。本論文の目的は,過去10年間に研究者が用いたカバレッジ基準とカバレッジ最大化手法を検討し,SAVの安全性を保証することである。本稿では,本研究のための体系的文献レビュー(SLR)を実施している。適用範囲の基準に基づいて、既存の研究の分類を提示します。この領域のさらなる研究を可能にするために、このSLRにはいくつかの研究ギャップと研究方向も設けられている。本稿では,SAVの安全保証分野における知識の体系を提供する。このSLRの結果は、V&Vの進展とSAVの安全性確保に有効であると考えています。 Self-driving Autonomous Vehicles (SAVs) are gaining more interest each passing day by the industry as well as the general public. Tech and automobile companies are investing huge amounts of capital in research and development of SAVs to make sure they have a head start in the SAV market in the future. One of the major hurdles in the way of SAVs making it to the public roads is the lack of confidence of public in the safety aspect of SAVs. In order to assure safety and provide confidence to the public in the safety of SAVs, researchers around the world have used coverage-based testing for Verification and Validation (V&V) and safety assurance of SAVs. The objective of this paper is to investigate the coverage criteria proposed and coverage maximizing techniques used by researchers in the last decade up till now, to assure safety of SAVs. We conduct a Systematic Literature Review (SLR) for this investigation in our paper. We present a classification of existing research based on the coverage criteria used. Several research gaps and research directions are also provided in this SLR to enable further research in this domain. This paper provides a body of knowledge in the domain of safety assurance of SAVs. We believe the results of this SLR will be helpful in the progression of V&V and safety assurance of SAVs.	翻訳日:2021-03-09 17:09:51 公開日:2021-03-07
# Unseen の翻訳? Yor\`ub\'a $\rightarrow$ English MT in Low-Resource, Morphologically-unmarked settingss Translating the Unseen? Yor\`ub\'a $\rightarrow$ English MT in Low-Resource, Morphologically-Unmarked Settings ( http://arxiv.org/abs/2103.04225v1 ) ライセンス: Link先を確認	Ife Adebara Miikka Silfverberg Muhammad Abdul-Mageed	(参考訳) 特定の特徴が一方で形態素的にマークされているが、他方で欠落または文脈的にマークされている言語間の翻訳は、機械翻訳の重要なテストケースである。定型性(in)を形態的にマークする英語に翻訳する場合、Yor\`ub\'a は素名詞を用いるが、これらの特徴を文脈的にマークする。本研究では、Yor\`ub\'a の素名詞を英語に翻訳する際に、SMT システムが 2 つの NMT システム (BiLSTM と Transformer) とどのように比較するかを細かく分析する。システムがどのようにBNを識別し、正しく翻訳し、人間の翻訳パターンと比較するかを検討する。また,各モデルが犯す誤りの種類を分析し,それらの誤りを言語的に記述する。低リソース設定でモデルパフォーマンスを評価するための洞察を得る。素名詞の翻訳では, トランスフォーマーモデルは4つのカテゴリでSMT, BiLSTMモデルより優れ, BiLSTMは3つのカテゴリでSMTモデルより優れ, SMTは1つのカテゴリでNMTモデルより優れていた。 Translating between languages where certain features are marked morphologically in one but absent or marked contextually in the other is an important test case for machine translation. When translating into English which marks (in)definiteness morphologically, from Yor\`ub\'a which uses bare nouns but marks these features contextually, ambiguities arise. In this work, we perform fine-grained analysis on how an SMT system compares with two NMT systems (BiLSTM and Transformer) when translating bare nouns in Yor\`ub\'a into English. We investigate how the systems what extent they identify BNs, correctly translate them, and compare with human translation patterns. We also analyze the type of errors each model makes and provide a linguistic description of these errors. We glean insights for evaluating model performance in low-resource settings. In translating bare nouns, our results show the transformer model outperforms the SMT and BiLSTM models for 4 categories, the BiLSTM outperforms the SMT model for 3 categories while the SMT outperforms the NMT models for 1 category.	翻訳日:2021-03-09 16:06:50 公開日:2021-03-07
# 乱雑な動的環境における状態表現とナビゲーションの学習 Learning a State Representation and Navigation in Cluttered and Dynamic Environments ( http://arxiv.org/abs/2103.04351v1 ) ライセンス: Link先を確認	David Hoeller, Lorenz Wellhausen, Farbod Farshidian, Marco Hutter	(参考訳) 本研究では,静的および動的障害のあるクラッタ環境において,四足ロボットを用いた局所ナビゲーションを実現するための学習ベースのパイプラインを提案する。高レベルのナビゲーションコマンドにより、ロボットは環境の明示的なマッピングをすることなく、深度カメラからフレームに基づいてターゲットの場所に安全に移動することができます。まず、画像のシーケンスとカメラの現在の軌道を融合して、状態表現学習を用いて世界のモデルを形成する。この軽量モジュールの出力は、強化学習で訓練された目標到達および障害物回避ポリシーに直接供給される。パイプラインをこれらのコンポーネントに分離すると、わずか数十分でシミュレーションで完全にトレーニングできるサンプル効率的なポリシー学習ステージになることを示します。重要な部分は状態表現であり、監視されていない方法で世界の隠れた状態を推定するだけでなく、現実のギャップを橋渡しし、シミュレーションから現実への転送を成功させるのに役立ちます。シミュレーションと実演で4足歩行ロボットanymalを用いた実験では,ノイズの多い奥行き画像の処理や,トレーニング中の動的障害物の回避,局所的な空間意識の付与などが可能であった。 In this work, we present a learning-based pipeline to realise local navigation with a quadrupedal robot in cluttered environments with static and dynamic obstacles. Given high-level navigation commands, the robot is able to safely locomote to a target location based on frames from a depth camera without any explicit mapping of the environment. First, the sequence of images and the current trajectory of the camera are fused to form a model of the world using state representation learning. The output of this lightweight module is then directly fed into a target-reaching and obstacle-avoiding policy trained with reinforcement learning. We show that decoupling the pipeline into these components results in a sample efficient policy learning stage that can be fully trained in simulation in just a dozen minutes. The key part is the state representation, which is trained to not only estimate the hidden state of the world in an unsupervised fashion, but also helps bridging the reality gap, enabling successful sim-to-real transfer. In our experiments with the quadrupedal robot ANYmal in simulation and in reality, we show that our system can handle noisy depth images, avoid dynamic obstacles unseen during training, and is endowed with local spatial awareness.	翻訳日:2021-03-09 16:04:50 公開日:2021-03-07
# Watching You: ビデオベースの人物再識別のためのグローバルガイドによる相互学習 Watching You: Global-guided Reciprocal Learning for Video-based Person Re-identification ( http://arxiv.org/abs/2103.04337v1 ) ライセンス: Link先を確認	Xuehu Liu and Pingping Zhang and Chenyang Yu and Huchuan Lu and Xiaoyun Yang	(参考訳) ビデオベースの人物再識別(Re-ID)は、重複しないカメラで同一人物のビデオシーケンスを自動的に取得することを目的としている。この目的を達成するために、ビデオに豊富な空間的および時間的手がかりを十分に活用することが鍵となる。既存の手法は通常、最も顕著な画像領域に焦点を合わせており、画像シーケンスの人物の多様性によって、きめ細かな手がかりを見逃しがちである。そこで本論文では,映像に基づくRe-IDのためのGLL(Global-Guided Reciprocal Learning)フレームワークを提案する。具体的には,GCE(Global-Guided Correlation Estimation)を提案し,局所的特徴とグローバル特徴の特徴相関マップを生成し,同一人物を識別するための高相関領域と低相関領域をローカライズする。その後、グローバル表現の指導の下で、識別的特徴を高相関特徴と低相関特徴に分解する。さらに, テンポラル・相互学習(TRL)機構は, 高相関意味情報を逐次強化し, 低相関のサブクリティカルな手がかりを蓄積するように設計されている。 3つの公開ベンチマークに関する広範な実験は、私たちのアプローチが他の最先端のアプローチよりも優れたパフォーマンスを達成できることを示しています。 Video-based person re-identification (Re-ID) aims to automatically retrieve video sequences of the same person under non-overlapping cameras. To achieve this goal, it is the key to fully utilize abundant spatial and temporal cues in videos. Existing methods usually focus on the most conspicuous image regions, thus they may easily miss out fine-grained clues due to the person varieties in image sequences. To address above issues, in this paper, we propose a novel Global-guided Reciprocal Learning (GRL) framework for video-based person Re-ID. Specifically, we first propose a Global-guided Correlation Estimation (GCE) to generate feature correlation maps of local features and global features, which help to localize the high-and low-correlation regions for identifying the same person. After that, the discriminative features are disentangled into high-correlation features and low-correlation features under the guidance of the global representations. Moreover, a novel Temporal Reciprocal Learning (TRL) mechanism is designed to sequentially enhance the high-correlation semantic information and accumulate the low-correlation sub-critical clues. Extensive experiments on three public benchmarks indicate that our approach can achieve better performance than other state-of-the-art approaches.	翻訳日:2021-03-09 16:02:56 公開日:2021-03-07
# TransBTS: Transformer を用いたマルチモーダル脳腫瘍切除 TransBTS: Multimodal Brain Tumor Segmentation Using Transformer ( http://arxiv.org/abs/2103.04430v1 ) ライセンス: Link先を確認	Wenxuan Wang, Chen Chen, Meng Ding, Jiangyun Li, Hong Yu, Sen Zha	(参考訳) 自己着脱機構を用いたグローバル(長距離)情報モデリングの恩恵を受けるトランスフォーマは,近年,自然言語処理と2次元画像分類に成功している。しかし,特に3次元医用画像セグメンテーションでは,局所的特徴とグローバル特徴の両方が重要となる。本稿では、MRI脳腫瘍セグメンテーションのための3D CNNにおけるTransformerを初めて利用し、エンコーダデコーダ構造に基づくTransBTSという新しいネットワークを提案する。ローカルな3dコンテキスト情報をキャプチャするために、エンコーダはまず3d cnnを使用して体積空間特徴マップを抽出する。一方、機能マップは、グローバル機能モデリングのためにTransformerに供給されるトークンのために精巧に再構成されます。デコーダはtransformerに埋め込まれた機能を活用し、詳細なセグメンテーションマップを予測するためにプログレッシブアップサンプリングを行う。 BraTS 2019データセットの実験結果は、TransBTSが3D MRIスキャンで脳腫瘍のセグメント化の最先端の手法より優れていることを示している。コードはhttps://github.com/Wenxuan-1119/TransBTSで入手できる。 Transformer, which can benefit from global (long-range) information modeling using self-attention mechanisms, has been successful in natural language processing and 2D image classification recently. However, both local and global features are crucial for dense prediction tasks, especially for 3D medical image segmentation. In this paper, we for the first time exploit Transformer in 3D CNN for MRI Brain Tumor Segmentation and propose a novel network named TransBTS based on the encoder-decoder structure. To capture the local 3D context information, the encoder first utilizes 3D CNN to extract the volumetric spatial feature maps. Meanwhile, the feature maps are reformed elaborately for tokens that are fed into Transformer for global feature modeling. The decoder leverages the features embedded by Transformer and performs progressive upsampling to predict the detailed segmentation map. Experimental results on the BraTS 2019 dataset show that TransBTS outperforms state-of-the-art methods for brain tumor segmentation on 3D MRI scans. Code is available at https://github.com/Wenxuan-1119/TransBTS	翻訳日:2021-03-09 16:02:34 公開日:2021-03-07
# 直交注意:クローズスタイルアプローチによる否定スコープの解決 Orthogonal Attention: A Cloze-Style Approach to Negation Scope Resolution ( http://arxiv.org/abs/2103.04294v1 ) ライセンス: Link先を確認	Aditya Khandelwal and Vahida Attar	(参考訳) Negation Scope Resolutionは広く研究されている問題であり、ネゲーションキューの影響を受ける単語を文に見つけるために使用されます。最近の研究では、トランスフォーマーベースのアーキテクチャの微調整が、このタスクに最先端の結果をもたらすことが示されている。本研究では,文を文脈として,手がかり語をクエリとして,否定スコープの解決をクローズ的なタスクとして捉えた。また, 自己注意に触発された直交注意と呼ばれる新しいクロゼスタイルの注意機構も導入する。まず, オーソゴナル・アテンション・バリアントを開発するためのフレームワークを提案し, OA-C, OA-CA, OA-EM, OA-EMBの4種類のオーソゴナル・アテンテンション・バリアントを提案する。 XLNetのバックボーンの上にこれらの直交アテンションレイヤーを使用して、我々は、私たちが実験するすべてのデータセットで今まで最高の結果を達成し、微調整XLNet最先端のネゲーションスコープ解像度を上回ります:BioScope Abstracts、BioScope Full Papers、SFU Review Corpus、およびsem 2012 Dataset(Sherlock)。 Negation Scope Resolution is an extensively researched problem, which is used to locate the words affected by a negation cue in a sentence. Recent works have shown that simply finetuning transformer-based architectures yield state-of-the-art results on this task. In this work, we look at Negation Scope Resolution as a Cloze-Style task, with the sentence as the Context and the cue words as the Query. We also introduce a novel Cloze-Style Attention mechanism called Orthogonal Attention, which is inspired by Self Attention. First, we propose a framework for developing Orthogonal Attention variants, and then propose 4 Orthogonal Attention variants: OA-C, OA-CA, OA-EM, and OA-EMB. Using these Orthogonal Attention layers on top of an XLNet backbone, we outperform the finetuned XLNet state-of-the-art for Negation Scope Resolution, achieving the best results to date on all 4 datasets we experiment with: BioScope Abstracts, BioScope Full Papers, SFU Review Corpus and the sem 2012 Dataset (Sherlock).	翻訳日:2021-03-09 16:01:32 公開日:2021-03-07
# エキスパートシステムグラジエントディサントスタイルトレーニング: 防衛可能な人工知能技術の開発。 Expert System Gradient Descent Style Training: Development of a Defensible Artificial Intelligence Technique ( http://arxiv.org/abs/2103.04314v1 ) ライセンス: Link先を確認	Jeremy Straub	(参考訳) 提示されたデータから学習する能力を備えた人工知能システムは、社会全体で使用されています。これらのシステムは、ローン申請者のスクリーニング、刑事被告に対する判決の推薦、禁止コンテンツに対するソーシャルメディア投稿のスキャンなどに使われる。これらのシステムは、複雑な学習された相関ネットワークに意味を割り当てないため、因果関係に等しくない関連を学習することができ、最適で防御不能な決定が下される。準最適な意思決定に加えて、これらのシステムは、差別防止法に違反している相関関係を学習することで、設計者やオペレーターに法的責任を負う可能性がある。本稿では,意味割り当てノード (facts) と相関関係 (rules) を用いて開発した機械学習エキスパートシステムについて述べる。複数の潜在的な実装は、異なるネットワークエラーと拡張レベルと異なるトレーニングレベルを含む、異なる条件下で検討および評価されます。これらのシステムの性能は、ランダムで完全に接続されたネットワークと比較される。 Artificial intelligence systems, which are designed with a capability to learn from the data presented to them, are used throughout society. These systems are used to screen loan applicants, make sentencing recommendations for criminal defendants, scan social media posts for disallowed content and more. Because these systems don't assign meaning to their complex learned correlation network, they can learn associations that don't equate to causality, resulting in non-optimal and indefensible decisions being made. In addition to making decisions that are sub-optimal, these systems may create legal liability for their designers and operators by learning correlations that violate anti-discrimination and other laws regarding what factors can be used in different types of decision making. This paper presents the use of a machine learning expert system, which is developed with meaning-assigned nodes (facts) and correlations (rules). Multiple potential implementations are considered and evaluated under different conditions, including different network error and augmentation levels and different training levels. The performance of these systems is compared to random and fully connected networks.	翻訳日:2021-03-09 16:00:46 公開日:2021-03-07
# TensorFlow-Kerasによるグラフニューラルネットワークの実装 Implementing graph neural networks with TensorFlow-Keras ( http://arxiv.org/abs/2103.04318v1 ) ライセンス: Link先を確認	Patrick Reiser, Andre Eberhard and Pascal Friederich	(参考訳) グラフニューラルネットワークは、最近多くの注目を集めた汎用的な機械学習アーキテクチャです。本稿では、TensorFlow-Kerasモデルの畳み込み層とプール層の実装について述べる。これにより、標準的なKeras層にシームレスかつ柔軟な統合により、グラフモデルを機能的に設定できる。これは、グラフに適したtensorflowの新しいraggedtensorクラスを通じて実現可能な、最初のテンソル次元としてのミニバッチの使用を意味する。 tensorflow-kerasをベースとしたkeras graph convolutional neural network pythonパッケージを開発した。tensorflow-kerasは、層間で渡される透明なテンソル構造と、使いやすいマインドセットに焦点を当てた、グラフネットワーク用のkerasレイヤセットを提供する。 Graph neural networks are a versatile machine learning architecture that received a lot of attention recently. In this technical report, we present an implementation of convolution and pooling layers for TensorFlow-Keras models, which allows a seamless and flexible integration into standard Keras layers to set up graph models in a functional way. This implies the usage of mini-batches as the first tensor dimension, which can be realized via the new RaggedTensor class of TensorFlow best suited for graphs. We developed the Keras Graph Convolutional Neural Network Python package kgcnn based on TensorFlow-Keras that provides a set of Keras layers for graph networks which focus on a transparent tensor structure passed between layers and an ease-of-use mindset.	翻訳日:2021-03-09 16:00:26 公開日:2021-03-07
# Hierarchical Causal Bandit Hierarchical Causal Bandit ( http://arxiv.org/abs/2103.04215v1 ) ライセンス: Link先を確認	Ruiyang Song, Stefano Rini, Kuang Xu	(参考訳) 因果バンディット(英: Causal Bandit)は、エージェントが変数の因果ネットワークで連続的に実験し、報酬の最大化介入を特定する、創発的な学習モデルである。モデルの適用性は広いが、既存の分析結果は、全ての変数が互いに独立な並列バンディットバージョンに大きく制限されている。本研究では,階層型因果バンディットモデルを,従属変数による一般因果バンディット理解への有効な経路として紹介する。コアのアイデアは、直接的な効果を持つすべての変数間の相互作用をキャプチャするコンテキスト変数を組み込むことです。この階層的枠組みを用いることで、因果的包帯と従属腕のアルゴリズム設計の鋭い洞察を導き、二項文脈の場合、ほぼ一致する後悔境界を得る。 Causal bandit is a nascent learning model where an agent sequentially experiments in a causal network of variables, in order to identify the reward-maximizing intervention. Despite the model's wide applicability, existing analytical results are largely restricted to a parallel bandit version where all variables are mutually independent. We introduce in this work the hierarchical causal bandit model as a viable path towards understanding general causal bandits with dependent variables. The core idea is to incorporate a contextual variable that captures the interaction among all variables with direct effects. Using this hierarchical framework, we derive sharp insights on algorithmic design in causal bandits with dependent arms and obtain nearly matching regret bounds in the case of a binary context.	翻訳日:2021-03-09 15:57:51 公開日:2021-03-07
# アクティブシーケンシャル仮説テストのための近似アルゴリズム Approximation Algorithms for Active Sequential Hypothesis Testing ( http://arxiv.org/abs/2103.04250v1 ) ライセンス: Link先を確認	Kyra Gan, Su Jia, Andrew Li	(参考訳) アクティブ・シーケンシャル仮説テスト(英語版)(asht)の問題において、学習者は、一連の仮説のうち、真の仮説である$h^$を同定しようとする。学習者は一連の行動を与えられ、任意の真の仮説の下での行動の結果分布を知る。アクションの集合全体を繰り返し再生すると、$h^$を識別するのに十分だが、アクションごとにコストがかかる。したがって、ターゲットエラー $\delta>0$ が与えられた場合、少なくとも1 - \delta$ の確率で $h^$ を識別するアクションを逐次選択するための最小コストポリシーを見つけることが目的である。本稿では2種類の適応性の下でASHTの最初の近似アルゴリズムを提供する。まず、ポリシーが事前に一連のアクションを修正し、いつ終了し、どの仮説を返すかを適応的に決定すれば、部分的に適応する。部分的適応性の下では、$o\big(s^{-1}(1+\log_{1/\delta}\|h\|)\log (s^{-1}\|h\| \log \|h\|)\big)$近似アルゴリズムを提供する。第二に、アクションの選択が以前の結果に依存している場合、ポリシーは完全に適応的です。完全な適応性の下で、$O(s^{-1}\log (\|H\|/\delta)\log \|H\|)$-近似アルゴリズムを提供する。合成データと実世界のデータの両方を用いて,アルゴリズムの性能を数値的に検討し,提案したヒューリスティック・ポリシーよりも優れていることを示す。 In the problem of active sequential hypotheses testing (ASHT), a learner seeks to identify the true hypothesis $h^$ from among a set of hypotheses $H$. The learner is given a set of actions and knows the outcome distribution of any action under any true hypothesis. While repeatedly playing the entire set of actions suffices to identify $h^$, a cost is incurred with each action. Thus, given a target error $\delta>0$, the goal is to find the minimal cost policy for sequentially selecting actions that identify $h^$ with probability at least $1 - \delta$. This paper provides the first approximation algorithms for ASHT, under two types of adaptivity. First, a policy is partially adaptive if it fixes a sequence of actions in advance and adaptively decides when to terminate and what hypothesis to return. Under partial adaptivity, we provide an $O\big(s^{-1}(1+\log_{1/\delta}\|H\|)\log (s^{-1}\|H\| \log \|H\|)\big)$-approximation algorithm, where $s$ is a natural separation parameter between the hypotheses. Second, a policy is fully adaptive if action selection is allowed to depend on previous outcomes. Under full adaptivity, we provide an $O(s^{-1}\log (\|H\|/\delta)\log \|H\|)$-approximation algorithm. We numerically investigate the performance of our algorithms using both synthetic and real-world data, showing that our algorithms outperform a previously proposed heuristic policy.	翻訳日:2021-03-09 15:57:39 公開日:2021-03-07
# CORe:バンディット探索における報酬の活用 CORe: Capitalizing On Rewards in Bandit Exploration ( http://arxiv.org/abs/2103.04387v1 ) ライセンス: Link先を確認	Nan Wang, Branislav Kveton, Maryam Karimzadehgan	(参考訳) 過去の観測をランダム化して純粋に探索するバンディットアルゴリズムを提案する。特に、平均報酬推定における十分な楽観性は、過去の観測された報酬の分散を利用して達成される。我々は報酬(コア)に乗じたアルゴリズムを命名する。アルゴリズムは一般的であり、様々なバンディット設定に容易に適用できる。 COReの主な利点は、その探索が完全にデータに依存していることです。外部ノイズに依存しず、パラメータチューニングなしでさまざまな問題に適応します。我々は、$d$が特徴の数であり、$K$が腕の数である確率的な線形バンドイットにおいて、$n$-roundのCOReの後悔に、$\tilde O(d\sqrt{n\log K})$のギャップフリー境界を導出する。複数の合成および実世界の問題に関する広範な経験的評価は、COReの有効性を示す。 We propose a bandit algorithm that explores purely by randomizing its past observations. In particular, the sufficient optimism in the mean reward estimates is achieved by exploiting the variance in the past observed rewards. We name the algorithm Capitalizing On Rewards (CORe). The algorithm is general and can be easily applied to different bandit settings. The main benefit of CORe is that its exploration is fully data-dependent. It does not rely on any external noise and adapts to different problems without parameter tuning. We derive a $\tilde O(d\sqrt{n\log K})$ gap-free bound on the $n$-round regret of CORe in a stochastic linear bandit, where $d$ is the number of features and $K$ is the number of arms. Extensive empirical evaluation on multiple synthetic and real-world problems demonstrates the effectiveness of CORe.	翻訳日:2021-03-09 15:57:10 公開日:2021-03-07
# 逆強化学習のサンプル複雑性のための下界 A Lower Bound for the Sample Complexity of Inverse Reinforcement Learning ( http://arxiv.org/abs/2103.04446v1 ) ライセンス: Link先を確認	Abi Komanduru, Jean Honorio	(参考訳) 逆強化学習(IRL)は、与えられたマルコフ決定プロセス(MDP)に対して望ましい最適ポリシーを生成する報酬関数を求めるタスクである。本稿では, 有限状態, 有限作用IRL問題のサンプル複雑性に対する情報理論の下界について述べる。球面符号を用いた $\beta$-strict separable IRL 問題の幾何学的構成を考える。生成した軌跡間のkullback-leibler発散と同様にアンサンブルサイズの性質を導出する。結果として得られるアンサンブルはファノの不等式とともに、mdp 内の状態数である $n$ で$o(n \log n)$ 以下のサンプル複雑性を導出するために用いられる。 Inverse reinforcement learning (IRL) is the task of finding a reward function that generates a desired optimal policy for a given Markov Decision Process (MDP). This paper develops an information-theoretic lower bound for the sample complexity of the finite state, finite action IRL problem. A geometric construction of $\beta$-strict separable IRL problems using spherical codes is considered. Properties of the ensemble size as well as the Kullback-Leibler divergence between the generated trajectories are derived. The resulting ensemble is then used along with Fano's inequality to derive a sample complexity lower bound of $O(n \log n)$, where $n$ is the number of states in the MDP.	翻訳日:2021-03-09 15:56:56 公開日:2021-03-07
# クラスカプセルの識別力に向けたルーティング Routing Towards Discriminative Power of Class Capsules ( http://arxiv.org/abs/2103.04278v1 ) ライセンス: Link先を確認	Haoyu Yang, Shuhe Li, Bei Yu	(参考訳) カプセルネットワークは最近、現代のニューラルネットワークアーキテクチャの代替として提案されている。ニューロンは、正常化されたベクトルまたは行列を持つ特定の特徴または実体を表すカプセルユニットに置き換えられる。下層カプセルの活性化は、特定のルーティングアルゴリズムによって訓練中に構築されるルーティングリンクを介して、以下のカプセルの挙動に影響する。本稿では,ネットワークを最適性から遠ざける動的ルーティングアルゴリズムにおけるルーティング・バイ・アグリーメントスキームについて考察する。より良く,より高速な収束を得るため,規則化された二次プログラミング問題を効率的に解くことができるルーティングアルゴリズムを提案する。特に,提案したルーティングアルゴリズムは,入力インスタンスの正しい判定を行うクラスカプセルの識別力を直接ターゲットとする。 mnist,mnist-fashion,cifar-10の実験を行い,既存のカプセルネットワークと比較して競合分類結果を示す。 Capsule networks are recently proposed as an alternative to modern neural network architectures. Neurons are replaced with capsule units that represent specific features or entities with normalized vectors or matrices. The activation of lower layer capsules affects the behavior of the following capsules via routing links that are constructed during training via certain routing algorithms. We discuss the routing-by-agreement scheme in dynamic routing algorithm which, in certain cases, leads the networks away from optimality. To obtain better and faster convergence, we propose a routing algorithm that incorporates a regularized quadratic programming problem which can be solved efficiently. Particularly, the proposed routing algorithm targets directly on the discriminative power of class capsules making the correct decision on input instances. We conduct experiments on MNIST, MNIST-Fashion, and CIFAR-10 and show competitive classification results compared to existing capsule networks.	翻訳日:2021-03-09 15:55:23 公開日:2021-03-07
# 階層的自己注意に基づく人的活動認識のためのオートエンコーダ Hierarchical Self Attention Based Autoencoder for Open-Set Human Activity Recognition ( http://arxiv.org/abs/2103.04279v1 ) ライセンス: Link先を確認	M Tanjid Hasan Tonmoy, Saif Mahmud, A K M Mahbubur Rahman, M Ashraful Amin, and Amin Ahsan Ali	(参考訳) ウェアラブルセンサーベースの人間の活動認識は、センサー信号の空間的および時間的依存性のモデリングが困難であるため、困難な問題です。閉集合仮定における認識モデルは、既知のアクティビティクラスのメンバを予測として生成しなければならない。しかし, 運動認識モデルでは, 身体感覚の異常や, 動作中の被験者の障害により, 目に見えない活動に遭遇する可能性がある。この問題は、オープンセット認識の仮定に従ってモデリングソリューションを通じて対処することができる。したがって、提案した自己注意ベースのアプローチは、さまざまなセンサー配置からのデータを階層的に組み合わせてクローズドセットアクティビティを分類し、5つの公開データセット上での最先端モデルに対する顕著なパフォーマンス改善を得る。このオートエンコーダアーキテクチャのデコーダには、エンコーダからの自己認識に基づく特徴表現が組み込まれ、オープンセット認識設定で未確認のアクティビティクラスを検出する。さらに、階層モデルによって生成された注目マップは、アクティビティ認識における特徴の説明可能な選択を示す。本研究は,騒音に対する頑健性および体温センサ信号の特異的変動性を著しく改善した検証実験を広範囲に実施する。ソースコードはgithub.com/saif-mahmud/hierarchical-attention-HARで入手できる。 Wearable sensor based human activity recognition is a challenging problem due to difficulty in modeling spatial and temporal dependencies of sensor signals. Recognition models in closed-set assumption are forced to yield members of known activity classes as prediction. However, activity recognition models can encounter an unseen activity due to body-worn sensor malfunction or disability of the subject performing the activities. This problem can be addressed through modeling solution according to the assumption of open-set recognition. Hence, the proposed self attention based approach combines data hierarchically from different sensor placements across time to classify closed-set activities and it obtains notable performance improvement over state-of-the-art models on five publicly available datasets. The decoder in this autoencoder architecture incorporates self-attention based feature representations from encoder to detect unseen activity classes in open-set recognition setting. Furthermore, attention maps generated by the hierarchical model demonstrate explainable selection of features in activity recognition. We conduct extensive leave one subject out validation experiments that indicate significantly improved robustness to noise and subject specific variability in body-worn sensor signals. The source code is available at: github.com/saif-mahmud/hierarchical-attention-HAR	翻訳日:2021-03-09 15:55:10 公開日:2021-03-07
# 単発セマンティック部品セグメンテーションのためのGANの再利用 Repurposing GANs for One-shot Semantic Part Segmentation ( http://arxiv.org/abs/2103.04379v1 ) ライセンス: Link先を確認	Nontawat Tritrong, Pitchaporn Rewatbowornwong, Supasorn Suwajanakorn	(参考訳) GANは現実的な画像生成に成功したが、合成とは無関係な他のタスクにGANを使用することのアイデアは明らかにされていない。 GANは、それらのオブジェクトを再生する過程で、オブジェクトの有意義な構造的部分を学ぶか? そこで本研究では,この仮説を検証し,ラベルなしデータセットとともにラベルを1つも必要としない,意味部分セグメンテーションのためのgansに基づく単純かつ効果的なアプローチを提案する。我々のキーとなるアイデアは、訓練されたGANを利用して、入力画像からピクセルワイズ表現を抽出し、セグメンテーションネットワークのための特徴ベクトルとして利用することです。我々の実験は、GANの表現が「可読的に差別的」であり、かなり多くのラベルで訓練された教師付きベースラインと同等の驚くほど良い結果をもたらすことを示した。我々は、gansのこの新しい再提案は、他の多くのタスクに適用可能な教師なし表現学習の新たなクラスであると信じている。詳細は https://repurposegans.github.io/ をご覧ください。 While GANs have shown success in realistic image generation, the idea of using GANs for other tasks unrelated to synthesis is underexplored. Do GANs learn meaningful structural parts of objects during their attempt to reproduce those objects? In this work, we test this hypothesis and propose a simple and effective approach based on GANs for semantic part segmentation that requires as few as one label example along with an unlabeled dataset. Our key idea is to leverage a trained GAN to extract pixel-wise representation from the input image and use it as feature vectors for a segmentation network. Our experiments demonstrate that GANs representation is "readily discriminative" and produces surprisingly good results that are comparable to those from supervised baselines trained with significantly more labels. We believe this novel repurposing of GANs underlies a new class of unsupervised representation learning that is applicable to many other tasks. More results are available at https://repurposegans.github.io/.	翻訳日:2021-03-09 15:54:51 公開日:2021-03-07
# 空間変換領域の感度不一致からの逆例の検出 Detecting Adversarial Examples from Sensitivity Inconsistency of Spatial-Transform Domain ( http://arxiv.org/abs/2103.04302v1 ) ライセンス: Link先を確認	Jinyu Tian, Jiantao Zhou, Yuanman Li, Jia Duan	(参考訳) ディープニューラルネットワーク(DNN)は、劇的なモデル出力エラーを引き起こすように悪質に設計された敵の例(AE)に対して脆弱であることが示されている。本研究では、通常の例(NE)は、決定境界の高度に湾曲した領域で発生する変動に敏感であるのに対し、AEは1つの領域(主に空間領域)上に設計され、そのような変動に対して極端に敏感であることを示す。この現象は、感度の不整合により、元の分類器(原始分類器)と協調してAEを検出することができる変換決定境界を持つ別の分類器(二重分類器)を設計する動機となる。 LID(Local Intrinsic Dimensionality)、MD(Mahalanobis Distance)、FS(Feature Squeezing)に基づく最先端のアルゴリズムと比較して、提案された感度インシネンスディテクタ(SID)は、特に敵対的な摂動レベルが小さい場合において、AE検出性能と優れた一般化能力の向上を実現します。 ResNet と VGG の総合的な実験結果から,提案した SID の優位性を検証した。 Deep neural networks (DNNs) have been shown to be vulnerable against adversarial examples (AEs), which are maliciously designed to cause dramatic model output errors. In this work, we reveal that normal examples (NEs) are insensitive to the fluctuations occurring at the highly-curved region of the decision boundary, while AEs typically designed over one single domain (mostly spatial domain) exhibit exorbitant sensitivity on such fluctuations. This phenomenon motivates us to design another classifier (called dual classifier) with transformed decision boundary, which can be collaboratively used with the original classifier (called primal classifier) to detect AEs, by virtue of the sensitivity inconsistency. When comparing with the state-of-the-art algorithms based on Local Intrinsic Dimensionality (LID), Mahalanobis Distance (MD), and Feature Squeezing (FS), our proposed Sensitivity Inconsistency Detector (SID) achieves improved AE detection performance and superior generalization capabilities, especially in the challenging cases where the adversarial perturbation levels are small. Intensive experimental results on ResNet and VGG validate the superiority of the proposed SID.	翻訳日:2021-03-09 15:47:27 公開日:2021-03-07
# MTLHealth: 学生の摂動コンテンツ検出のための深層学習システム MTLHealth: A Deep Learning System for Detecting Disturbing Contentin Student Essays ( http://arxiv.org/abs/2103.04290v1 ) ライセンス: Link先を確認	Joseph Valencia, Erin Yao	(参考訳) ACTのような標準化されたテストへのエッセイの提出には、いじめ、自己害、暴力、および邪魔となるコンテンツの他の形態への言及が含まれる。生徒はこのような事件を識別し、危険にさらされている可能性のある学生のために当局に警告するかどうかを判断しなければならない。コンテンツが乱れる可能性を自動で警告することで、人間の意思決定を支援する堅牢なコンピュータシステムの必要性が高まっている。本稿では,計算言語学,特に事前学習型言語モデルトランスフォーマーネットワークの最近の進歩を中心に構築された,乱れたコンテンツ検出パイプラインであるMTLHealthについて述べる。 Essay submissions to standardized tests like the ACT occasionally include references to bullying, self-harm, violence, and other forms of disturbing content. Graders must take great care to identify cases like these and decide whether to alert authorities on behalf of students who may be in danger. There is a growing need for robust computer systems to support human decision-makers by automatically flagging potential instances of disturbing content. This paper describes MTLHealth, a disturbing content detection pipeline built around recent advances from computational linguistics, particularly pre-trained language model Transformer networks.	翻訳日:2021-03-09 15:45:44 公開日:2021-03-07
# Syntax-BERT: シンタックスツリーによるプリトレーニングトランスの改善 Syntax-BERT: Improving Pre-trained Transformers with Syntax Trees ( http://arxiv.org/abs/2103.04350v1 ) ライセンス: Link先を確認	Jiangang Bai, Yujing Wang, Yiren Chen, Yaming Yang, Jing Bai, Jing Yu, Yunhai Tong	(参考訳) BERTのような事前訓練された言語モデルは、構文情報を明確に考慮することなく、様々なNLPタスクで優れたパフォーマンスを実現します。一方、シンタクティック情報はNLPアプリケーションの成功に不可欠であることが証明されています。しかし、構文木を効率的に効率的にトレーニング済みのTransformerに組み込む方法はまだ未定である。本稿では,syntax-bertという新しいフレームワークを提案することで,この問題に解決する。このフレームワークはプラグアンドプレイモードで動作し、Transformerアーキテクチャに基づく任意の事前トレーニングされたチェックポイントに適用できる。自然言語理解のさまざまなデータセットの実験は、構文木の有効性を検証し、BERT、RoBERTa、T5を含む複数の事前学習モデルに対して一貫した改善を実現する。 Pre-trained language models like BERT achieve superior performances in various NLP tasks without explicit consideration of syntactic information. Meanwhile, syntactic information has been proved to be crucial for the success of NLP applications. However, how to incorporate the syntax trees effectively and efficiently into pre-trained Transformers is still unsettled. In this paper, we address this problem by proposing a novel framework named Syntax-BERT. This framework works in a plug-and-play mode and is applicable to an arbitrary pre-trained checkpoint based on Transformer architecture. Experiments on various datasets of natural language understanding verify the effectiveness of syntax trees and achieve consistent improvement over multiple pre-trained models, including BERT, RoBERTa, and T5.	翻訳日:2021-03-09 15:45:33 公開日:2021-03-07
# 共感的BERT2BERT会話モデル:少ないデータでアラビア語生成を学習する Empathetic BERT2BERT Conversational Model: Learning Arabic Language Generation with Little Data ( http://arxiv.org/abs/2103.04353v1 ) ライセンス: Link先を確認	Tarek Naous, Wissam Antoun, Reem A. Mahmoud, and Hazem Hajj	(参考訳) アラビア語対話エージェントにおける共感行動の実現は、人間のような会話モデルを構築する上で重要な側面である。アラビア自然言語処理はAraBERTのような言語モデルで自然言語理解(NLU)に大きな進歩を遂げているが、自然言語生成(NLG)は依然として課題である。 NLGエンコーダデコーダモデルの欠点は、主に会話エージェントなどのNLGモデルのトレーニングに適したアラビア語データセットがないためです。そこで本論文では,AraBERTパラメータを初期化したトランスベースのエンコーダデコーダを提案する。エンコーダとデコーダの重みをAraBERT事前学習重みで初期化することにより,本モデルでは知識伝達の活用と応答生成の性能向上を実現した。会話モデルにおける共感を可能にするために, arabicempatheticdialoguesデータセットを用いて学習し, 共感応答生成における高いパフォーマンスを達成する。具体的には,従来の最先端モデルと比較して,低パープレキシティ値 17.0 と 5 BLEU 点の増加を達成した。また,提案モデルは85人の評価者によって高く評価され,オープンドメイン設定において,共感を呈示する上で高い能力が検証された。 Enabling empathetic behavior in Arabic dialogue agents is an important aspect of building human-like conversational models. While Arabic Natural Language Processing has seen significant advances in Natural Language Understanding (NLU) with language models such as AraBERT, Natural Language Generation (NLG) remains a challenge. The shortcomings of NLG encoder-decoder models are primarily due to the lack of Arabic datasets suitable to train NLG models such as conversational agents. To overcome this issue, we propose a transformer-based encoder-decoder initialized with AraBERT parameters. By initializing the weights of the encoder and decoder with AraBERT pre-trained weights, our model was able to leverage knowledge transfer and boost performance in response generation. To enable empathy in our conversational model, we train it using the ArabicEmpatheticDialogues dataset and achieve high performance in empathetic response generation. Specifically, our model achieved a low perplexity value of 17.0 and an increase in 5 BLEU points compared to the previous state-of-the-art model. Also, our proposed model was rated highly by 85 human evaluators, validating its high capability in exhibiting empathy while generating relevant and fluent responses in open-domain settings.	翻訳日:2021-03-09 15:45:22 公開日:2021-03-07
# アラビア語文の自動難易度分類 Automatic Difficulty Classification of Arabic Sentences ( http://arxiv.org/abs/2103.04386v1 ) ライセンス: Link先を確認	Nouran Khallaf, Serge Sharoff	(参考訳) 本論文では,CEFRの習熟度レベルと2進分分類を単純あるいは複雑として用いた言語学習者の文の難易度を予測する,現代標準アラビア語(MSA)文難易度分類器を提案する。異なる種類の文埋め込み(fastText, mBERT, XLM-R, Arabic-BERT)とPOSタグ, 依存性木, 可読性スコア, 言語学習者の頻度リストなど, 従来の言語機能との比較を行った。きめ細やかなアラビア-BERTで最高の結果が得られました。 3方向cefr分類の精度はアラビア語-bert分類では0.80, xlm-r分類では0.75, 回帰では0.71スピアマン相関である。我々の二項難易度分類器は文対意味類似度分類器の F-1 0.94 と F-1 0.98 に達する。 In this paper, we present a Modern Standard Arabic (MSA) Sentence difficulty classifier, which predicts the difficulty of sentences for language learners using either the CEFR proficiency levels or the binary classification as simple or complex. We compare the use of sentence embeddings of different kinds (fastText, mBERT , XLM-R and Arabic-BERT), as well as traditional language features such as POS tags, dependency trees, readability scores and frequency lists for language learners. Our best results have been achieved using fined-tuned Arabic-BERT. The accuracy of our 3-way CEFR classification is F-1 of 0.80 and 0.75 for Arabic-Bert and XLM-R classification respectively and 0.71 Spearman correlation for regression. Our binary difficulty classifier reaches F-1 0.94 and F-1 0.98 for sentence-pair semantic similarity classifier.	翻訳日:2021-03-09 15:45:01 公開日:2021-03-07
# スキーマ依存学習によるテキストからSQLへの改善 Improving Text-to-SQL with Schema Dependency Learning ( http://arxiv.org/abs/2103.04399v1 ) ライセンス: Link先を確認	Binyuan Hui, Xiang Shi, Ruiying Geng, Binhua Li, Yongbin Li, Jian Sun, Xiaodan Zhu	(参考訳) Text-to-SQLは自然言語の質問をSQLクエリにマップすることを目的としている。スケッチベースの手法と実行誘導(EG)デコーディング戦略を組み合わせることで、WikiSQLベンチマークでは高いパフォーマンスを示している。しかし、実行誘導型デコーディングはデータベースの実行に依存しており、推論プロセスが大幅に遅くなるため、多くの現実世界のアプリケーションには不満足である。本稿では、質問とスキーマ間の相互作用を効果的に捉えるためのネットワークをガイドするために、スキーマ依存性ガイド付きマルチタスクテキスト・ツー・SQLモデル(SDSQL)を紹介します。提案モデルは,eg の有無に関わらず,既存のメソッドをすべて上回っている。スキーマ依存性の学習は、EGのメリットを部分的にカバーし、その必要性を軽減します。 EGなしのSDSQLは、推論時の時間消費を大幅に削減し、少数のパフォーマンスを犠牲にし、ダウンストリームアプリケーションに柔軟性を提供します。 Text-to-SQL aims to map natural language questions to SQL queries. The sketch-based method combined with execution-guided (EG) decoding strategy has shown a strong performance on the WikiSQL benchmark. However, execution-guided decoding relies on database execution, which significantly slows down the inference process and is hence unsatisfactory for many real-world applications. In this paper, we present the Schema Dependency guided multi-task Text-to-SQL model (SDSQL) to guide the network to effectively capture the interactions between questions and schemas. The proposed model outperforms all existing methods in both the settings with or without EG. We show the schema dependency learning partially cover the benefit from EG and alleviates the need for it. SDSQL without EG significantly reduces time consumption during inference, sacrificing only a small amount of performance and provides more flexibility for downstream applications.	翻訳日:2021-03-09 15:44:44 公開日:2021-03-07
# 仮想常態:精度とロバスト深さ予測のための幾何学的制約を強制する Virtual Normal: Enforcing Geometric Constraintsfor Accurate and Robust Depth Prediction ( http://arxiv.org/abs/2103.04216v1 ) ライセンス: Link先を確認	Wei Yin and Yifan Liu and Chunhua Shen	(参考訳) 単眼深度予測は3次元シーン形状の理解において重要な役割を担っている。近年の手法は画素単位の相対誤差などの評価指標で顕著な進歩を遂げているが、ほとんどの手法は3次元空間における幾何的制約を無視している。本研究では,深度予測のための高次3次元幾何学的制約の重要性を示す。再構成された3次元空間でランダムにサンプリングされた3点によって決定される仮想正規方向という単純な幾何学的制約を強制する損失項を設計することにより、単眼深度推定の精度とロバスト性を大幅に向上させる。重要なことは、仮想正規損失は、学習メートル法深度の性能を向上するだけでなく、スケール情報を解き、より優れた形状情報でモデルを豊かにする。したがって、絶対距離深度トレーニングデータにアクセスできない場合、仮想正規法を用いて多様なシーンで生成される強固なアフィン不変深さを学ぶことができる。実験では,NYU Depth-V2 と KITTI の学習深度について,最先端の学習結果を示す。高品質の予測深度から、ポイント雲や表面の正常といったシーンの優れた3次元構造を復元することが可能となり、これまでやってきたような追加モデルに頼る必要がなくなる。仮想正規損失による多様なデータに対するアフィン不変深度学習の汎用性を示すために、アフィン不変深度トレーニングのための大規模かつ多様なデータセット、いわゆるDiverse Scene Depthデータセット(DiverseDepth)を構築し、ゼロショットテスト設定で5つのデータセットをテストする。 Monocular depth prediction plays a crucial role in understanding 3D scene geometry. Although recent methods have achieved impressive progress in terms of evaluation metrics such as the pixel-wise relative error, most methods neglect the geometric constraints in the 3D space. In this work, we show the importance of the high-order 3D geometric constraints for depth prediction. By designing a loss term that enforces a simple geometric constraint, namely, virtual normal directions determined by randomly sampled three points in the reconstructed 3D space, we significantly improve the accuracy and robustness of monocular depth estimation. Significantly, the virtual normal loss can not only improve the performance of learning metric depth, but also disentangle the scale information and enrich the model with better shape information. Therefore, when not having access to absolute metric depth training data, we can use virtual normal to learn a robust affine-invariant depth generated on diverse scenes. In experiments, We show state-of-the-art results of learning metric depth on NYU Depth-V2 and KITTI. From the high-quality predicted depth, we are now able to recover good 3D structures of the scene such as the point cloud and surface normal directly, eliminating the necessity of relying on additional models as was previously done. To demonstrate the excellent generalizability of learning affine-invariant depth on diverse data with the virtual normal loss, we construct a large-scale and diverse dataset for training affine-invariant depth, termed Diverse Scene Depth dataset (DiverseDepth), and test on five datasets with the zero-shot test setting.	翻訳日:2021-03-09 15:40:32 公開日:2021-03-07
# MeGA-CDA: カテゴリ別非監視ドメイン適応オブジェクト検出のためのメモリガイド注意 MeGA-CDA: Memory Guided Attention for Category-Aware Unsupervised Domain Adaptive Object Detection ( http://arxiv.org/abs/2103.04224v1 ) ライセンス: Link先を確認	Vibashan VS, Poojan Oza, Vishwanath A. Sindagi, Vikram Gupta, Vishal M. Patel	(参考訳) 教師なしドメイン適応オブジェクト検出のための既存のアプローチは、逆訓練によって機能アライメントを実行する。これらの手法は性能を合理的に改善するが、典型的にはカテゴリーに依存しない領域アライメントを行い、結果として特徴の負の移動をもたらす。そこで本研究では,カテゴリ・アウェア・ドメイン適応(MeGA-CDA)のためのメモリガイドアテンションを提案することで,カテゴリ情報をドメイン適応プロセスに組み込もうとする。提案手法は,カテゴリー別識別器を用いて,カテゴリ別識別特徴を学習するためのカテゴリ別特徴アライメントを保証する。しかし,対象のサンプルではカテゴリ情報が利用できないため,その特徴を対応するカテゴリ判別器に適切にルーティングするために,メモリガイド付きカテゴリ特異的注意マップを作成することを提案する。提案手法はいくつかのベンチマークデータセットで評価され,既存のアプローチを上回っていることが示された。 Existing approaches for unsupervised domain adaptive object detection perform feature alignment via adversarial training. While these methods achieve reasonable improvements in performance, they typically perform category-agnostic domain alignment, thereby resulting in negative transfer of features. To overcome this issue, in this work, we attempt to incorporate category information into the domain adaptation process by proposing Memory Guided Attention for Category-Aware Domain Adaptation (MeGA-CDA). The proposed method consists of employing category-wise discriminators to ensure category-aware feature alignment for learning domain-invariant discriminative features. However, since the category information is not available for the target samples, we propose to generate memory-guided category-specific attention maps which are then used to route the features appropriately to the corresponding category discriminator. The proposed method is evaluated on several benchmark datasets and is shown to outperform existing approaches.	翻訳日:2021-03-09 15:40:04 公開日:2021-03-07
# ディープグラフマッチングに基づくロバストポイントクラウド登録フレームワーク Robust Point Cloud Registration Framework Based on Deep Graph Matching ( http://arxiv.org/abs/2103.04256v1 ) ライセンス: Link先を確認	Kexue Fu and Shaolei Liu and Xiaoyuan Luo and Manning Wang	(参考訳) 3Dポイントクラウド登録は、コンピュータビジョンとロボティクスにおける基本的な問題です。この分野では広範な研究が行われてきたが、既存の手法は、多くの異常値と時間的制約がある状況において大きな課題を満たしている。近年,学習に基づくアルゴリズムが多数導入され,高速化のメリットが示された。それらの多くは2つの点雲間の対応に基づいているため、変換初期化に依存しない。しかし、これらの学習に基づく手法は外れ値に敏感であり、より誤った対応をもたらす。本稿では,ポイントクラウド登録のための新しいディープグラフマッチングベースのフレームワークを提案する。具体的には、まず点雲をグラフに変換し、各点の深い特徴を抽出する。そこで我々は, 深部グラフマッチングに基づくモジュールを開発し, ソフト対応行列を計算した。グラフマッチングを用いることで、各点の局所幾何学だけでなく、より広い範囲におけるその構造やトポロジーも対応付けを確立することで、より正確な対応が見出される。通信上で直接定義された損失でネットワークを訓練し、テスト段階ではソフト対応をハードな1対1対応に変換し、単価分解により登録が行えるようにします。さらに,グラフ構築のためのエッジを生成するトランスベース手法を導入し,対応文の品質をさらに向上させる。クリーン,ノイズ,部分的・部分的・不可視のカテゴリー点雲の登録実験により,提案手法が最先端の性能を達成することを示す。コードはhttps://github.com/fukexue/RGMで公開される。 3D point cloud registration is a fundamental problem in computer vision and robotics. There has been extensive research in this area, but existing methods meet great challenges in situations with a large proportion of outliers and time constraints, but without good transformation initialization. Recently, a series of learning-based algorithms have been introduced and show advantages in speed. Many of them are based on correspondences between the two point clouds, so they do not rely on transformation initialization. However, these learning-based methods are sensitive to outliers, which lead to more incorrect correspondences. In this paper, we propose a novel deep graph matchingbased framework for point cloud registration. Specifically, we first transform point clouds into graphs and extract deep features for each point. Then, we develop a module based on deep graph matching to calculate a soft correspondence matrix. By using graph matching, not only the local geometry of each point but also its structure and topology in a larger range are considered in establishing correspondences, so that more correct correspondences are found. We train the network with a loss directly defined on the correspondences, and in the test stage the soft correspondences are transformed into hard one-to-one correspondences so that registration can be performed by singular value decomposition. Furthermore, we introduce a transformer-based method to generate edges for graph construction, which further improves the quality of the correspondences. Extensive experiments on registering clean, noisy, partial-to-partial and unseen category point clouds show that the proposed method achieves state-of-the-art performance. The code will be made publicly available at https://github.com/fukexue/RGM.	翻訳日:2021-03-09 15:39:47 公開日:2021-03-07
# リフレクションフリーフラッシュのみによるロバスト反射除去 Robust Reflection Removal with Reflection-free Flash-only Cues ( http://arxiv.org/abs/2103.04273v1 ) ライセンス: Link先を確認	Chenyang Lei and Qifeng Chen	(参考訳) フラッシュとアンビエント(非フラッシュ)の2つの画像から、堅牢な反射除去のためのシンプルで効果的な反射レスキューを提案します。反射フリーキューは、対応するフラッシュ画像から周囲画像を原データ空間に減じて得られるフラッシュ専用画像を利用する。フラッシュのみの画像は、フラッシュオンのみの暗い環境で撮影された画像と同等である。このフラッシュのみの画像は視覚的に反射しないため、周囲画像の反射を推測するロバストな手がかりが得られる。フラッシュのみの画像には通常アーティファクトがあるため、反射のないキューを利用するだけでなく、反射と透過を正確に推定するアーティファクトの導入を避ける専用モデルも提案する。私たちのモデルはPSNRの5.23dB、SSIMの0.04、LPIPSの0.068以上の最先端の反射除去アプローチを上回っています。ソースコードとデータセットは \href{https://github.com/ChenyangLEI/flash-reflection-removal}{このウェブサイト} で公開されます。 We propose a simple yet effective reflection-free cue for robust reflection removal from a pair of flash and ambient (no-flash) images. The reflection-free cue exploits a flash-only image obtained by subtracting the ambient image from the corresponding flash image in raw data space. The flash-only image is equivalent to an image taken in a dark environment with only a flash on. We observe that this flash-only image is visually reflection-free, and thus it can provide robust cues to infer the reflection in the ambient image. Since the flash-only image usually has artifacts, we further propose a dedicated model that not only utilizes the reflection-free cue but also avoids introducing artifacts, which helps accurately estimate reflection and transmission. Our experiments on real-world images with various types of reflection demonstrate the effectiveness of our model with reflection-free flash-only cues: our model outperforms state-of-the-art reflection removal approaches by more than 5.23dB in PSNR, 0.04 in SSIM, and 0.068 in LPIPS. Our source code and dataset will be publicly available at \href{https://github.com/ChenyangLEI/flash-reflection-removal}{this website}.	翻訳日:2021-03-09 15:39:23 公開日:2021-03-07
# 教師なしクロスドメイン翻訳のための交代MCMC指導による学習サイクル一貫性協調ネットワーク Learning Cycle-Consistent Cooperative Networks via Alternating MCMC Teaching for Unsupervised Cross-Domain Translation ( http://arxiv.org/abs/2103.04285v1 ) ライセンス: Link先を確認	Jianwen Xie, Zilong Zheng, Xiaolin Fang, Song-Chun Zhu, Ying Nian Wu	(参考訳) 本稿では,各領域の確率分布をエネルギーベースモデルと潜在変数モデルからなる生成協調ネットワークで表現する生成フレームワークを提案することにより,教師なしのクロスドメイン翻訳問題について検討する。生成協調ネットワークを利用することで、MCMC教育によるドメインモデルの最大限の学習が可能となり、エネルギーベースモデルでは、ドメインのデータ分布に適合し、MCMCを介して潜在変数モデルにその知識を蒸留する。具体的には、MCMCの指導過程において、エンコーダデコーダによりパラメータ化された潜在変数モデルは、ソースドメインから対象ドメインにマッピングする一方、エネルギーベースモデルは、学習エネルギー関数によって定義される統計特性の観点から、修正結果が対象ドメインの例と一致するように、ランゲヴィンリビジョンによってマッピングされた結果をさらに洗練する。 2つのドメイン間の対応を構築するために,MCMC教育の交互化により,2つのドメイン間の双方向翻訳を考慮し,サイクル整合性のある協調ネットワークのペアを同時に学習する。提案手法は,教師なし画像から画像への変換とペアなし画像のシーケンス変換に有用であることを示す。 This paper studies the unsupervised cross-domain translation problem by proposing a generative framework, in which the probability distribution of each domain is represented by a generative cooperative network that consists of an energy-based model and a latent variable model. The use of generative cooperative network enables maximum likelihood learning of the domain model by MCMC teaching, where the energy-based model seeks to fit the data distribution of domain and distills its knowledge to the latent variable model via MCMC. Specifically, in the MCMC teaching process, the latent variable model parameterized by an encoder-decoder maps examples from the source domain to the target domain, while the energy-based model further refines the mapped results by Langevin revision such that the revised results match to the examples in the target domain in terms of the statistical properties, which are defined by the learned energy function. For the purpose of building up a correspondence between two unpaired domains, the proposed framework simultaneously learns a pair of cooperative networks with cycle consistency, accounting for a two-way translation between two domains, by alternating MCMC teaching. Experiments show that the proposed framework is useful for unsupervised image-to-image translation and unpaired image sequence translation.	翻訳日:2021-03-09 15:39:03 公開日:2021-03-07
# RFN-Nest:赤外・可視画像のためのエンドツーエンド残差核融合ネットワーク RFN-Nest: An end-to-end residual fusion network for infrared and visible images ( http://arxiv.org/abs/2103.04286v1 ) ライセンス: Link先を確認	Hui Li, Xiao-Jun Wu, Josef Kittler	(参考訳) 画像融合分野では、深層学習に基づく融合法の設計は日常的ではない。それは常に融合タスク特異的であり、慎重な考慮が必要です。設計の最も難しい部分は、特定のタスクの融合画像を生成するための適切な戦略を選択することです。したがって、学習可能な融合戦略の考案は、画像融合のコミュニティで非常に困難な問題です。この問題を解決するために、赤外線および可視画像融合のための新しいエンドツーエンド融合ネットワークアーキテクチャ(RFN-Nest)を開発した。本稿では,従来の核融合方式を代替する残差構造に基づく残差核融合ネットワーク(RFN)を提案する。 RFNを訓練するために、新しい詳細保存損失関数と機能強化損失関数が提案される。融合モデル学習は、新しい二段階学習戦略によって達成される。最初の段階では、革新的なネスト接続(Nest)の概念に基づいて自動エンコーダをトレーニングします。次に、提案された損失関数を用いてrfnを訓練する。パブリックドメインデータセットにおける実験結果は,既存の手法と比較して,主観的および客観的評価において,エンドツーエンドのフュージョンネットワークが最先端の手法よりも優れた性能を提供することを示した。私たちの融合メソッドのコードはhttps://github.com/hli1221/imagefusion-rfn-nestで入手できます。 In the image fusion field, the design of deep learning-based fusion methods is far from routine. It is invariably fusion-task specific and requires a careful consideration. The most difficult part of the design is to choose an appropriate strategy to generate the fused image for a specific task in hand. Thus, devising learnable fusion strategy is a very challenging problem in the community of image fusion. To address this problem, a novel end-to-end fusion network architecture (RFN-Nest) is developed for infrared and visible image fusion. We propose a residual fusion network (RFN) which is based on a residual architecture to replace the traditional fusion approach. A novel detail-preserving loss function, and a feature enhancing loss function are proposed to train RFN. The fusion model learning is accomplished by a novel two-stage training strategy. In the first stage, we train an auto-encoder based on an innovative nest connection (Nest) concept. Next, the RFN is trained using the proposed loss functions. The experimental results on public domain data sets show that, compared with the existing methods, our end-to-end fusion network delivers a better performance than the state-of-the-art methods in both subjective and objective evaluation. The code of our fusion method is available at https://github.com/hli1221/imagefusion-rfn-nest	翻訳日:2021-03-09 15:38:39 公開日:2021-03-07
# ハイパースペクトル画像の超解像のための空間スペクトルフィードバックネットワーク Spatial-Spectral Feedback Network for Super-Resolution of Hyperspectral Imagery ( http://arxiv.org/abs/2103.04354v1 ) ライセンス: Link先を確認	Enhai Liu, Zhenjie Tang, Bin Pan, Zhenwei Shi	(参考訳) 近年、深層学習に基づく単一グレー/RGB画像スーパーレゾリューション(SR)法が大きな成功を収めています。しかし、単一ハイパースペクトル画像超解像の技術的発展を制限するには2つの障害がある。 1つは高スペクトル像の高次元および複雑なスペクトルパターンであり、バンド間の空間情報とスペクトル情報の同時探索が困難である。もうひとつは、利用可能なハイパースペクトルトレーニングサンプルの数は極めて少なく、ディープニューラルネットワークのトレーニング時にオーバーフィットする可能性があることだ。そこで本論文では,局所スペクトル帯域間の低レベル表現をグローバルスペクトル帯域から高レベル情報を用いて改善するSSFN(Spatial-Spectral Feedback Network)を提案する。ハイパースペクトルデータの高い次元による特徴抽出の難しさを軽減するだけでなく、トレーニングプロセスをより安定させます。具体的には、そのようなフィードバックの仕方を達成するために、有限展開を持つRNNの隠れ状態を用いる。 SSFB(Spatial-Spectral Feedback Block)は、空間とスペクトルの事前利用のために、フィードバック接続を処理し、強力な高レベルの表現を生成するように設計されている。提案したSSFNは早期予測を伴い、最終高分解能ハイパースペクトル像を段階的に再構成することができる。 3つのベンチマークデータセットの大規模な実験結果から,提案したSSFNは最先端手法と比較して優れた性能を示した。ソースコードはhttps://github.com/tangzhenjie/ssfnで入手できる。 Recently, single gray/RGB image super-resolution (SR) methods based on deep learning have achieved great success. However, there are two obstacles to limit technical development in the single hyperspectral image super-resolution. One is the high-dimensional and complex spectral patterns in hyperspectral image, which make it difficult to explore spatial information and spectral information among bands simultaneously. The other is that the number of available hyperspectral training samples is extremely small, which can easily lead to overfitting when training a deep neural network. To address these issues, in this paper, we propose a novel Spatial-Spectral Feedback Network (SSFN) to refine low-level representations among local spectral bands with high-level information from global spectral bands. It will not only alleviate the difficulty in feature extraction due to high dimensional of hyperspectral data, but also make the training process more stable. Specifically, we use hidden states in an RNN with finite unfoldings to achieve such feedback manner. To exploit the spatial and spectral prior, a Spatial-Spectral Feedback Block (SSFB) is designed to handle the feedback connections and generate powerful high-level representations. The proposed SSFN comes with a early predictions and can reconstruct the final high-resolution hyperspectral image step by step. Extensive experimental results on three benchmark datasets demonstrate that the proposed SSFN achieves superior performance in comparison with the state-of-the-art methods. The source code is available at https://github.com/tangzhenjie/SSFN.	翻訳日:2021-03-09 15:38:18 公開日:2021-03-07
# Insta-RS: ロバスト性と精度を向上するためのインスタンスワイズランダム化スムージング Insta-RS: Instance-wise Randomized Smoothing for Improved Robustness and Accuracy ( http://arxiv.org/abs/2103.04436v1 ) ライセンス: Link先を確認	Chen Chen, Kezhi Kong, Peihong Yu, Juan Luque, Furong Huang	(参考訳) ランダム化平滑化(英: randomized smoothing、rs)は、ニューラルネットワークの分類器を構築するための効果的でスケーラブルな手法である。ほとんどのrsは、滑らかなモデルの認定された堅牢性を高める優れたベースモデルのトレーニングにフォーカスしています。しかし、既存のRS技術は全てのデータポイントを同じ扱い、すなわち、滑らかなモデルを形成するために使用されるガウスノイズの分散は、すべてのトレーニングデータとテストデータに対してプリセットされ普遍的である。このプリセットおよび普遍ガウスノイズ分散は、異なるデータポイントが異なるマージンを持ち、ベースモデルの局所特性が入力例によって異なるため、最適である。本稿では、サンプルのカスタマイズ処理の影響について検討し、サンプルにカスタマイズされたガウス分散を割り当てるマルチスタート探索アルゴリズムであるインスタンスワイズランダム化平滑化(Insta-RS)を提案する。また、インスタンスワイズガウス平滑化モデルの認証された堅牢性を高めるベースモデルをトレーニングするために、各トレーニング例のノイズレベルを適応的に調整およびカスタマイズする新しい2段階トレーニングアルゴリズムであるInsta-RS Trainも設計しています。 CIFAR-10 と ImageNet の広範な実験により,本手法は既存の最先端の堅牢な分類器と比較して,平均認証半径 (ACR) とクリーンデータの精度を著しく向上させることを示した。 Randomized smoothing (RS) is an effective and scalable technique for constructing neural network classifiers that are certifiably robust to adversarial perturbations. Most RS works focus on training a good base model that boosts the certified robustness of the smoothed model. However, existing RS techniques treat every data point the same, i.e., the variance of the Gaussian noise used to form the smoothed model is preset and universal for all training and test data. This preset and universal Gaussian noise variance is suboptimal since different data points have different margins and the local properties of the base model vary across the input examples. In this paper, we examine the impact of customized handling of examples and propose Instance-wise Randomized Smoothing (Insta-RS) -- a multiple-start search algorithm that assigns customized Gaussian variances to test examples. We also design Insta-RS Train -- a novel two-stage training algorithm that adaptively adjusts and customizes the noise level of each training example for training a base model that boosts the certified robustness of the instance-wise Gaussian smoothed model. Through extensive experiments on CIFAR-10 and ImageNet, we show that our method significantly enhances the average certified radius (ACR) as well as the clean data accuracy compared to existing state-of-the-art provably robust classifiers.	翻訳日:2021-03-09 15:28:59 公開日:2021-03-07
# マルチエージェントゲームにおける潜在知能レベル推定による人間報酬の学習 : 運転データへの適用による極小アプローチ Learning Human Rewards by Inferring Their Latent Intelligence Levels in Multi-Agent Games: A Theory-of-Mind Approach with Application to Driving Data ( http://arxiv.org/abs/2103.04289v1 ) ライセンス: Link先を確認	Ran Tian, Masayoshi Tomizuka, and Liting Sun	(参考訳) リワード機能は、人間のエージェントを認識し、人間の行動を合理化するインセンティブとして、特に人間とロボットの相互作用における人間の行動のモデル化に魅力がある。逆強化学習は、デモから報酬関数を取得する効果的な方法です。しかし,エージェント間の相互影響を適切にモデル化する必要があるため,マルチエージェント設定に適用することは常に困難である。この課題に取り組むために、以前の研究では、人間を無限の知性を持つ完全合理的なオプティマイザと仮定することによって平衡解の概念を利用するか、人間の相互作用戦略を優先順位付けする。本研究では、他者の意思決定過程を推論するとき、人間は理性に縛られ、異なる知能レベルを持つことを提唱し、このような固有的および潜在的特性は報酬学習アルゴリズムにおいて考慮されるべきである。そこで我々は,このような知見を心の理論から活用し,学習中の人間の潜在知性レベルを理由とする,新しい多エージェント逆強化学習フレームワークを提案する。ゼロサムとジェネラルサムの両方のゲームにおけるアプローチを合成エージェントで検証し、実際の運転データから人間のドライバーの報酬機能を学ぶための実用的なアプリケーションを示しています。アプローチを2つのベースラインアルゴリズムと比較する。その結果、人間の潜伏した知能レベルを推察することで、提案手法は人間の運転行動をよりよく説明できる報酬関数をより柔軟かつ高めることができることがわかった。 Reward function, as an incentive representation that recognizes humans' agency and rationalizes humans' actions, is particularly appealing for modeling human behavior in human-robot interaction. Inverse Reinforcement Learning is an effective way to retrieve reward functions from demonstrations. However, it has always been challenging when applying it to multi-agent settings since the mutual influence between agents has to be appropriately modeled. To tackle this challenge, previous work either exploits equilibrium solution concepts by assuming humans as perfectly rational optimizers with unbounded intelligence or pre-assigns humans' interaction strategies a priori. In this work, we advocate that humans are bounded rational and have different intelligence levels when reasoning about others' decision-making process, and such an inherent and latent characteristic should be accounted for in reward learning algorithms. Hence, we exploit such insights from Theory-of-Mind and propose a new multi-agent Inverse Reinforcement Learning framework that reasons about humans' latent intelligence levels during learning. We validate our approach in both zero-sum and general-sum games with synthetic agents and illustrate a practical application to learning human drivers' reward functions from real driving data. We compare our approach with two baseline algorithms. The results show that by reasoning about humans' latent intelligence levels, the proposed approach has more flexibility and capability to retrieve reward functions that explain humans' driving behaviors better.	翻訳日:2021-03-09 15:25:56 公開日:2021-03-07
# 無線エッジネットワークを用いた分散学習のための共同符号化とスケジューリング最適化 Joint Coding and Scheduling Optimization for Distributed Learning over Wireless Edge Networks ( http://arxiv.org/abs/2103.04303v1 ) ライセンス: Link先を確認	Nguyen Van Huynh, Dinh Thai Hoang, Diep N. Nguyen, and Eryk Dutkiewicz	(参考訳) 理論的分散学習(DL)とは異なり、無線エッジネットワーク上のDLは、無線接続とエッジノードの固有のダイナミクス/不確実性に直面しており、非常にダイナミックな無線エッジネットワーク(例えばmmWインターフェースを使用して)下でDLを効率性や適用性が低下させる。本稿では,近年のコーデックコンピューティングとディープデューリングニューラルネットワークアーキテクチャを活用し,これらの問題に対処する。コード化された構造/冗長性を導入することで、ノードをつまずくのを待つことなく、分散学習タスクを完了することができる。コード構造のみを最適化する従来のコードドコンピューティングとは異なり、ワイヤレスエッジ上のコードド分散学習では、異種接続によるワイヤレスエッジノードの選択/スケジュール、計算能力、ストラグリング効果も最適化する必要がある。しかし、前述のダイナミクス/未知性を無視しても、分散学習時間を最小化するためのコーディングとスケジューリングの協調最適化はnpハードであることが判明した。そこで我々は,無線接続とエッジノードのダイナミクスと不確実性を考慮し,問題をマルコフ決定プロセスとして再構成し,ディープ・デュリングニューラルネットワークアーキテクチャを用いた新しい深層強化学習アルゴリズムを設計し,無線環境とエッジノードのストラグリングパラメータに関する情報を明示することなく,異なる学習タスクのための最適な符号化方式と最良エッジノードを探索する。シミュレーションでは、提案されたフレームワークは、他のDLアプローチと比較して、無線エッジコンピューティングの平均学習遅延を最大66%削減する。本記事での共同最適フレームワークは、異種および不確実な計算ノードを持つ任意の分散学習スキームにも適用可能である。 Unlike theoretical distributed learning (DL), DL over wireless edge networks faces the inherent dynamics/uncertainty of wireless connections and edge nodes, making DL less efficient or even inapplicable under the highly dynamic wireless edge networks (e.g., using mmW interfaces). This article addresses these problems by leveraging recent advances in coded computing and the deep dueling neural network architecture. By introducing coded structures/redundancy, a distributed learning task can be completed without waiting for straggling nodes. Unlike conventional coded computing that only optimizes the code structure, coded distributed learning over the wireless edge also requires to optimize the selection/scheduling of wireless edge nodes with heterogeneous connections, computing capability, and straggling effects. However, even neglecting the aforementioned dynamics/uncertainty, the resulting joint optimization of coding and scheduling to minimize the distributed learning time turns out to be NP-hard. To tackle this and to account for the dynamics and uncertainty of wireless connections and edge nodes, we reformulate the problem as a Markov Decision Process and then design a novel deep reinforcement learning algorithm that employs the deep dueling neural network architecture to find the jointly optimal coding scheme and the best set of edge nodes for different learning tasks without explicit information about the wireless environment and edge nodes' straggling parameters. Simulations show that the proposed framework reduces the average learning delay in wireless edge computing up to 66% compared with other DL approaches. The jointly optimal framework in this article is also applicable to any distributed learning scheme with heterogeneous and uncertain computing nodes.	翻訳日:2021-03-09 15:25:30 公開日:2021-03-07
# マルチモーダルVAEアクティブ推論コントローラ Multimodal VAE Active Inference Controller ( http://arxiv.org/abs/2103.04412v1 ) ライセンス: Link先を確認	Cristian Meo and Pablo Lanillos	(参考訳) 脳処理に触発された理論的構造であるアクティブ推論は、人工薬剤を制御するための有望な代替手段である。しかし、現在の方法は、連続制御の高次元入力にはまだスケールしない。本稿では,従来の受容的アプローチの適応特性を維持しつつ,大規模なマルチモーダル統合(生画像など)を可能にする産業用アーム用アクティブ・推論トルクコントローラを提案する。線形結合型マルチモーダル変分オートエンコーダを用いたマルチモーダル状態表現学習を含む以前の数学的定式化を拡張した。シミュレーションされた7DOF Franka Emika Pandaロボットアーム上でモデルを評価し、その動作を以前のアクティブ推論ベースラインとPanda組み込み最適化コントローラと比較した。その結果, 生成モデルやパラメータの調整を必要とせず, 表現力の増大, 騒音に対するロバスト性, 環境条件やロボットパラメータの変化への適応性等により, 目標方向到達時の追従性, 制御性が向上した。 Active inference, a theoretical construct inspired by brain processing, is a promising alternative to control artificial agents. However, current methods do not yet scale to high-dimensional inputs in continuous control. Here we present a novel active inference torque controller for industrial arms that maintains the adaptive characteristics of previous proprioceptive approaches but also enables large-scale multimodal integration (e.g., raw images). We extended our previous mathematical formulation by including multimodal state representation learning using a linearly coupled multimodal variational autoencoder. We evaluated our model on a simulated 7DOF Franka Emika Panda robot arm and compared its behavior with a previous active inference baseline and the Panda built-in optimized controller. Results showed improved tracking and control in goal-directed reaching due to the increased representation power, high robustness to noise and adaptability in changes on the environmental conditions and robot parameters without the need to relearn the generative models nor parameters retuning.	翻訳日:2021-03-09 15:25:02 公開日:2021-03-07
# 確率制御確率勾配法によるサドル点のエスケープ Escaping Saddle Points with Stochastically Controlled Stochastic Gradient Methods ( http://arxiv.org/abs/2103.04413v1 ) ライセンス: Link先を確認	Guannan Liang, Qianqian Tong, Chunjiang Zhu, Jinbo bi	(参考訳) 確率的に制御された確率勾配(SCSG)法は1次定常点に効率よく収束することが証明されているが、非凸最適化ではサドル点となる。確率勾配降下 (SGD) ステップは, 深層学習と非凸空間学習問題に対するサドル点周辺の異方性雑音を生じさせ, これらの問題に対してSGDが相関負曲率 (CNC) 条件を満たすことを示す。そこで我々は,SCSG法が厳密なサドル点から脱出するのを助けるためにSGDステップを別々に使用し,CNC-SCSG法を提案する。 SGDステップはノイズ注入と同じような役割を果たすが、より安定している。結果のアルゴリズムは、$\tilde{O}( \epsilon^{-2} log( 1/\epsilon)$の収束率を持つ2次定常点に収束することを証明している。この収束率は問題次元とは独立であり、cnc-sgdよりも高速である。より一般的なフレームワークは、提案する cnc-scsg を任意の一階法に組み込むように設計されている。シミュレーション研究により、提案アルゴリズムはノイズ注入またはSGDステップによって摂動される勾配降下法よりもはるかに少ないエポックでサドル点を回避できることを示した。 Stochastically controlled stochastic gradient (SCSG) methods have been proved to converge efficiently to first-order stationary points which, however, can be saddle points in nonconvex optimization. It has been observed that a stochastic gradient descent (SGD) step introduces anistropic noise around saddle points for deep learning and non-convex half space learning problems, which indicates that SGD satisfies the correlated negative curvature (CNC) condition for these problems. Therefore, we propose to use a separate SGD step to help the SCSG method escape from strict saddle points, resulting in the CNC-SCSG method. The SGD step plays a role similar to noise injection but is more stable. We prove that the resultant algorithm converges to a second-order stationary point with a convergence rate of $\tilde{O}( \epsilon^{-2} log( 1/\epsilon))$ where $\epsilon$ is the pre-specified error tolerance. This convergence rate is independent of the problem dimension, and is faster than that of CNC-SGD. A more general framework is further designed to incorporate the proposed CNC-SCSG into any first-order method for the method to escape saddle points. Simulation studies illustrate that the proposed algorithm can escape saddle points in much fewer epochs than the gradient descent methods perturbed by either noise injection or a SGD step.	翻訳日:2021-03-09 15:24:46 公開日:2021-03-07
# GANav:非構造屋外環境におけるナビゲーション可能な地域分類のためのグループワイドアテンションネットワーク GANav: Group-wise Attention Network for Classifying Navigable Regions in Unstructured Outdoor Environments ( http://arxiv.org/abs/2103.04233v1 ) ライセンス: Link先を確認	Tianrui Guan, Divya Kothandaraman, Rohan Chandra and Dinesh Manocha	(参考訳) 本稿では,RGB画像から,オフロード地形および非構造環境における安全かつ航行可能な領域を識別する新しい学習手法を提案する。本手法は,粒度の粗いセマンティックセグメンテーションを用いて,そのナビビリティレベルに基づいて地形分類群を分類する。本稿では,新しいグループアテンション機構を用いて異なる地形の航行性レベルを識別する,ボトルネックトランスに基づくディープニューラルネットワークアーキテクチャを提案する。グループアテンションヘッドにより,ネットワークが異なるグループに明示的に焦点を合わせ,精度を向上させることができる。さらに,データセットの長い尾の性質を扱うために,動的重み付きクロスエントロピー損失関数を提案する。 RUGD と RELLIS-3D のデータセットを広範囲に評価することにより,我々の学習アルゴリズムがナビゲーションのためのオフロード地形における視覚知覚の精度を向上させることを示す。これらのデータセットに対する先行研究と比較し,rugdでは6.74-39.1%,rellis-3dでは3.82-10.64%改善した。 We present a new learning-based method for identifying safe and navigable regions in off-road terrains and unstructured environments from RGB images. Our approach consists of classifying groups of terrain classes based on their navigability levels using coarse-grained semantic segmentation. We propose a bottleneck transformer-based deep neural network architecture that uses a novel group-wise attention mechanism to distinguish between navigability levels of different terrains.Our group-wise attention heads enable the network to explicitly focus on the different groups and improve the accuracy. In addition, we propose a dynamic weighted cross entropy loss function to handle the long-tailed nature of the dataset. We show through extensive evaluations on the RUGD and RELLIS-3D datasets that our learning algorithm improves the accuracy of visual perception in off-road terrains for navigation. We compare our approach with prior work on these datasets and achieve an improvement over the state-of-the-art mIoU by 6.74-39.1% on RUGD and 3.82-10.64% on RELLIS-3D.	翻訳日:2021-03-09 15:21:17 公開日:2021-03-07
# 野生のディープフェイクビデオ:分析と検出 Deepfake Videos in the Wild: Analysis and Detection ( http://arxiv.org/abs/2103.04263v1 ) ライセンス: Link先を確認	Jiameng Pu, Neal Mangaokar, Lauren Kelly, Parantapa Bhattacharya, Kavya Sundaram, Mobin Javed, Bolun Wang, Bimal Viswanath	(参考訳) aiが操作するビデオ、通称deepfakesは、新しい問題だ。近年、学界や産業界の研究者が、いくつかの(自己作成)ベンチマークdeepfakeデータセットとdeepfake検出アルゴリズムに貢献している。しかし、ディープフェイク動画の理解に向けた努力はほとんど行っていないため、この分野における研究貢献の現実的な適用性についての理解は限られている。既存のデータセットで検出スキームがうまく機能していることが示されたとしても、実際のディープフェイクに対するメソッドの一般性は明らかでない。まず、YouTubeとBilibiliからの1,869のビデオを含む、野生のディープフェイクビデオの最大のデータセットを収集し、提示し、コンテンツの4.8Mフレーム以上を抽出します。第2に,実世界におけるディープフェイクコンテンツの成長パターン,人気,クリエーター,操作戦略,生産方法の包括的分析を行った。第三に、我々は新しいデータセットを使って既存の防衛を体系的に評価し、実際の世界に配備する準備が整っていないことを観察する。第四に、我々は防御を改善するための転送学習スキームと競争に勝った技術の可能性を模索します。 AI-manipulated videos, commonly known as deepfakes, are an emerging problem. Recently, researchers in academia and industry have contributed several (self-created) benchmark deepfake datasets, and deepfake detection algorithms. However, little effort has gone towards understanding deepfake videos in the wild, leading to a limited understanding of the real-world applicability of research contributions in this space. Even if detection schemes are shown to perform well on existing datasets, it is unclear how well the methods generalize to real-world deepfakes. To bridge this gap in knowledge, we make the following contributions: First, we collect and present the largest dataset of deepfake videos in the wild, containing 1,869 videos from YouTube and Bilibili, and extract over 4.8M frames of content. Second, we present a comprehensive analysis of the growth patterns, popularity, creators, manipulation strategies, and production methods of deepfake content in the real-world. Third, we systematically evaluate existing defenses using our new dataset, and observe that they are not ready for deployment in the real-world. Fourth, we explore the potential for transfer learning schemes and competition-winning techniques to improve defenses.	翻訳日:2021-03-09 15:20:49 公開日:2021-03-07
# ERASOR:静的3次元クラウドマップ構築のための擬似占有率に基づく動的物体除去 ERASOR: Egocentric Ratio of Pseudo Occupancy-based Dynamic Object Removal for Static 3D Point Cloud Map Building ( http://arxiv.org/abs/2103.04316v1 ) ライセンス: Link先を確認	Hyungtae Lim, Sungwon Hwang, and Hyun Myung	(参考訳) 都市環境のスキャンデータには、車両や歩行者などの動的物体の表現が含まれることが多い。しかし、スキャンデータの連続的な蓄積を持つ3Dポイントクラウドマップの構築に関しては、動的オブジェクトはしばしば地図に不要なトレースを残します。これらの動的オブジェクトのトレースは障害として機能し、モバイル車両が良好なローカリゼーションおよびナビゲーション性能を達成するのを妨げます。そこで本研究では,pSeudo Occupancyをベースとした動的物体除去手法であるERASOR(Egocentric RAtio of pSeudo Occupancy-based dynamic object removal)を提案する。私たちのアプローチは、必然的に地面と接触している都市環境における最もダイナミックなオブジェクトの性質にその注意を向けます。そこで我々は,単位空間の占有を表現し,異なる占有の空間を識別する,擬似占有という新しい概念を提案する。最後に、R-GPF(Regional-wise Ground Plane Fitting)が採用され、動的点を含む可能性のある候補ビン内の動的点から静的点を区別する。 SemanticKITTIで実験的に検証されたこの手法は、既存のレイトレースベースおよび可視性ベースのメソッドの限界を克服する最先端の手法に対して有望なパフォーマンスをもたらす。 Scan data of urban environments often include representations of dynamic objects, such as vehicles, pedestrians, and so forth. However, when it comes to constructing a 3D point cloud map with sequential accumulations of the scan data, the dynamic objects often leave unwanted traces in the map. These traces of dynamic objects act as obstacles and thus impede mobile vehicles from achieving good localization and navigation performances. To tackle the problem, this paper presents a novel static map building method called ERASOR, Egocentric RAtio of pSeudo Occupancy-based dynamic object Removal, which is fast and robust to motion ambiguity. Our approach directs its attention to the nature of most dynamic objects in urban environments being inevitably in contact with the ground. Accordingly, we propose the novel concept called pseudo occupancy to express the occupancy of unit space and then discriminate spaces of varying occupancy. Finally, Region-wise Ground Plane Fitting (R-GPF) is adopted to distinguish static points from dynamic points within the candidate bins that potentially contain dynamic points. As experimentally verified on SemanticKITTI, our proposed method yields promising performance against state-of-the-art methods overcoming the limitations of existing ray tracing-based and visibility-based methods.	翻訳日:2021-03-09 15:20:29 公開日:2021-03-07
# リアルタイム人間エージェントチームのための適応エージェントアーキテクチャ Adaptive Agent Architecture for Real-time Human-Agent Teaming ( http://arxiv.org/abs/2103.04439v1 ) ライセンス: Link先を確認	Tianwei Ni, Huao Li, Siddharth Agrawal, Suhas Raja, Fan Jia, Yikang Gui, Dana Hughes, Michael Lewis, Katia Sycara	(参考訳) チームワークは、共通の目的を促進するチームメンバの相互関係的な推論、行動、行動のセットです。チームワーク理論と実験は、人間とエージェントエージェントの両方のチームの有効性のための一連の状態とプロセスをもたらしました。しかし、人間とエージェントのチーム化は、非常に新しいものであり、人間のチームには存在しない方針や意図の非対称性が伴うため、あまり研究されていない。人間エージェントチームにおけるチームパフォーマンスを最適化するには、エージェントが人間の意図を推測し、警察を円滑な調整に適応させることが重要です。ほとんどの文献は、学習された人間のモデルを参照するエージェントを構築している。これらのエージェントは学習されたモデルでうまく機能することが保証されているが、最適性や一貫性といった人間のポリシーに重きを置いている。本稿では,TSF(Team Space Fortress)と呼ばれる2人プレイヤ協調ゲームにおいて,人間モデルフリー設定における新しい適応エージェントアーキテクチャを提案する。これまでの人間と人間のチームの研究では、tsfゲームにおける相補的なポリシーと、プレイヤーのスキルの多様性が示されている。したがって、私たちは人間のデータから人間モデルの学習を破棄し、RLアルゴリズムまたはルールベースの方法で構成された事前訓練された例ポリシーライブラリの適応戦略を使用して、人間の行動を最小に仮定します。適応戦略は、人間のポリシーを推論するための新しい類似度メトリクスに依存し、チームのパフォーマンスを最大化するために、我々のライブラリで最も補完的なポリシーを選択します。アダプティブエージェントアーキテクチャはリアルタイムでデプロイでき、任意のオフセットの静的エージェントに一般化できる。提案する適応エージェントフレームワークを評価するために,人間エージェント実験を実施し,人間エージェントチームにおけるヒューマンポリシーの最適性,多様性,適応性について検証した。 Teamwork is a set of interrelated reasoning, actions and behaviors of team members that facilitate common objectives. Teamwork theory and experiments have resulted in a set of states and processes for team effectiveness in both human-human and agent-agent teams. However, human-agent teaming is less well studied because it is so new and involves asymmetry in policy and intent not present in human teams. To optimize team performance in human-agent teaming, it is critical that agents infer human intent and adapt their polices for smooth coordination. Most literature in human-agent teaming builds agents referencing a learned human model. Though these agents are guaranteed to perform well with the learned model, they lay heavy assumptions on human policy such as optimality and consistency, which is unlikely in many real-world scenarios. In this paper, we propose a novel adaptive agent architecture in human-model-free setting on a two-player cooperative game, namely Team Space Fortress (TSF). Previous human-human team research have shown complementary policies in TSF game and diversity in human players' skill, which encourages us to relax the assumptions on human policy. Therefore, we discard learning human models from human data, and instead use an adaptation strategy on a pre-trained library of exemplar policies composed of RL algorithms or rule-based methods with minimal assumptions of human behavior. The adaptation strategy relies on a novel similarity metric to infer human policy and then selects the most complementary policy in our library to maximize the team performance. The adaptive agent architecture can be deployed in real-time and generalize to any off-the-shelf static agents. We conducted human-agent experiments to evaluate the proposed adaptive agent framework, and demonstrated the suboptimality, diversity, and adaptability of human policies in human-agent teams.	翻訳日:2021-03-09 15:16:42 公開日:2021-03-07
# T-Miner: DNNテキスト分類におけるトロイの木馬攻撃対策のためのジェネレーティブアプローチ T-Miner: A Generative Approach to Defend Against Trojan Attacks on DNN-based Text Classification ( http://arxiv.org/abs/2103.04264v1 ) ライセンス: Link先を確認	Ahmadreza Azizi, Ibrahim Asadullah Tahmid, Asim Waheed, Neal Mangaokar, Jiameng Pu, Mobin Javed, Chandan K. Reddy, Bimal Viswanath	(参考訳) ディープニューラルネットワーク(dnn)分類器はトロイの木馬やバックドア攻撃に対して脆弱であることが知られており、分類器は攻撃者によって決定されたトロイの木馬トリガーを含む入力を誤分類するように操作される。バックドアはモデルの整合性を損なうため、DNNベースの分類の状況に深刻な脅威をもたらす。このような攻撃に対する複数の防御は画像ドメインの分類器に対して存在するが、テキストドメインの分類器を保護する努力は限られている。我々は、DNNベースのテキスト分類器に対するトロイの木馬攻撃のための防御フレームワークであるTrojan-Miner(T-Miner)を紹介する。 T-Minerはシークエンス・ツー・シークエンス(seq-2-seq)生成モデルを用いて、疑わしい分類器を探索し、トロイの木馬トリガーを含む可能性が高いテキストシーケンスを生成する。 T-Minerは、生成モデルによって生成されたテキストを分析し、トリガーフレーズを含むかどうかを決定し、テストされた分類器にバックドアがあるかどうかを判断します。 T-Minerは、不審な分類器のトレーニングデータセットやクリーンな入力へのアクセスを必要とせず、代わりに合成された「非意味」テキスト入力を使用して生成モデルをトレーニングする。 3つのユビキタスDNNモデルアーキテクチャ、5つの分類タスク、さまざまなトリガーフレーズからなる1100モデルインスタンスのT-Minerを幅広く評価します。 T-Minerがトロイの木馬とクリーンモデルを98.75%の全体的な精度で検出し、クリーンモデルの偽陽性を低く抑えることを示した。また、T-Minerはアダプティブアタッカーからの様々な標的の高度な攻撃に対して堅牢であることも示しています。 Deep Neural Network (DNN) classifiers are known to be vulnerable to Trojan or backdoor attacks, where the classifier is manipulated such that it misclassifies any input containing an attacker-determined Trojan trigger. Backdoors compromise a model's integrity, thereby posing a severe threat to the landscape of DNN-based classification. While multiple defenses against such attacks exist for classifiers in the image domain, there have been limited efforts to protect classifiers in the text domain. We present Trojan-Miner (T-Miner) -- a defense framework for Trojan attacks on DNN-based text classifiers. T-Miner employs a sequence-to-sequence (seq-2-seq) generative model that probes the suspicious classifier and learns to produce text sequences that are likely to contain the Trojan trigger. T-Miner then analyzes the text produced by the generative model to determine if they contain trigger phrases, and correspondingly, whether the tested classifier has a backdoor. T-Miner requires no access to the training dataset or clean inputs of the suspicious classifier, and instead uses synthetically crafted "nonsensical" text inputs to train the generative model. We extensively evaluate T-Miner on 1100 model instances spanning 3 ubiquitous DNN model architectures, 5 different classification tasks, and a variety of trigger phrases. We show that T-Miner detects Trojan and clean models with a 98.75% overall accuracy, while achieving low false positives on clean models. We also show that T-Miner is robust against a variety of targeted, advanced attacks from an adaptive attacker.	翻訳日:2021-03-09 15:14:53 公開日:2021-03-07
# 複数のディープラーニングモデルの比較テストを促進するための識別測定 Measuring Discrimination to Boost Comparative Testing for Multiple Deep Learning Models ( http://arxiv.org/abs/2103.04333v1 ) ライセンス: Link先を確認	Linghan Meng, Yanhui Li, Lin Chen, Zhi Wang, Di Wu, Yuming Zhou, Baowen Xu	(参考訳) DL技術のブームは巨大なDLモデルの構築と共有をもたらし、DLモデルの取得と再利用を促進する。与えられたタスクに対して、同じ機能で利用可能な複数のDLモデルに遭遇する。テスターは複数のDLモデルを比較し、より適したものを選択することが期待される。テストのコンテキスト全体。分類の努力の限界のために、テスターはこれらのモデルのためにできるだけ正確なランクの推定をするサンプルの有効なサブセットを選ぶことを目標にします。この問題に対処するために,複数のモデルを識別可能な効率的なサンプルを選択するために,サンプル識別に基づく選択(SDS)を提案する。 SDSを評価するために,広範に利用されている3つの画像データセットと80個の実世界DLモデルを用いて広範な実験研究を行った。実験の結果,SDSは最先端のベースライン法と比較して,複数のDLモデルのランク付けに有効で効率的なサンプル選択法であることがわかった。 The boom of DL technology leads to massive DL models built and shared, which facilitates the acquisition and reuse of DL models. For a given task, we encounter multiple DL models available with the same functionality, which are considered as candidates to achieve this task. Testers are expected to compare multiple DL models and select the more suitable ones w.r.t. the whole testing context. Due to the limitation of labeling effort, testers aim to select an efficient subset of samples to make an as precise rank estimation as possible for these models. To tackle this problem, we propose Sample Discrimination based Selection (SDS) to select efficient samples that could discriminate multiple models, i.e., the prediction behaviors (right/wrong) of these samples would be helpful to indicate the trend of model performance. To evaluate SDS, we conduct an extensive empirical study with three widely-used image datasets and 80 real world DL models. The experimental results show that, compared with state-of-the-art baseline methods, SDS is an effective and efficient sample selection method to rank multiple DL models.	翻訳日:2021-03-09 15:14:19 公開日:2021-03-07
# ネットワーク表現学習:伝統的な特徴学習から深層学習へ Network Representation Learning: From Traditional Feature Learning to Deep Learning ( http://arxiv.org/abs/2103.04339v1 ) ライセンス: Link先を確認	Ke Sun, Lei Wang, Bo Xu, Wenhong Zhao, Shyh Wei Teng, Feng Xia	(参考訳) ネットワーク表現学習(NRL)は,グラフデータの隠れた特徴を深く理解するための効果的なグラフ解析手法である。ソーシャルネットワークデータ処理、生物学的情報処理、レコメンダシステムなど、ネットワーク科学に関連する多くの実世界のタスクにうまく応用されている。ディープラーニングはデータ機能を学ぶための強力なツールです。しかし、空間情報を持つ画像や時間情報を持つ音といった正規データとは異なるため、グラフ構造データへのディープラーニングの一般化は非自明である。近年, NRL領域で多くの深層学習手法が提案されている。本研究では,従来の特徴学習手法から深層学習モデルまでの古典的なNRLを調査し,それらの関係を分析し,最新の進歩をまとめる。最後に、NRLを考慮したオープンな問題について議論し、この分野の今後の方向性を指摘する。 Network representation learning (NRL) is an effective graph analytics technique and promotes users to deeply understand the hidden characteristics of graph data. It has been successfully applied in many real-world tasks related to network science, such as social network data processing, biological information processing, and recommender systems. Deep Learning is a powerful tool to learn data features. However, it is non-trivial to generalize deep learning to graph-structured data since it is different from the regular data such as pictures having spatial information and sounds having temporal information. Recently, researchers proposed many deep learning-based methods in the area of NRL. In this survey, we investigate classical NRL from traditional feature learning method to the deep learning-based model, analyze relationships between them, and summarize the latest progress. Finally, we discuss open issues considering NRL and point out the future directions in this field.	翻訳日:2021-03-09 15:14:01 公開日:2021-03-07
# グラフ力学習 Graph Force Learning ( http://arxiv.org/abs/2103.04344v1 ) ライセンス: Link先を確認	Ke Sun, Jiaying Liu, Shuo Yu, Bo Xu, Feng Xia	(参考訳) 機能表現は、ネットワーク分析タスクの大きな力を活用します。しかし、ほとんどの機能は離散的であり、効果的に利用するための大きな課題をもたらす。近年,離散的な特徴を連続空間にマップできるネットワーク機能学習に注目が集まっている。残念なことに、現在の研究では、トレーニング中にランダムな負のサンプリング戦略によって特徴空間の構造情報が完全に保存されない。この問題に対処するために,機能学習の課題と新規性について検討し,春電モデルに着想を得た力に基づくグラフ学習モデルGForceを提案する。 GForceは、ノードが魅力的な力と反発力を持っていると仮定し、特徴学習における元の構造情報と同じ表現をもたらす。ベンチマークデータセットに関する総合的な実験は、提案フレームワークの有効性を実証する。さらにgforceは、グラフ学習のためのノードインタラクションのモデル化に物理モデルを使用する機会を開放する。 Features representation leverages the great power in network analysis tasks. However, most features are discrete which poses tremendous challenges to effective use. Recently, increasing attention has been paid on network feature learning, which could map discrete features to continued space. Unfortunately, current studies fail to fully preserve the structural information in the feature space due to random negative sampling strategy during training. To tackle this problem, we study the problem of feature learning and novelty propose a force-based graph learning model named GForce inspired by the spring-electrical model. GForce assumes that nodes are in attractive forces and repulsive forces, thus leading to the same representation with the original structural information in feature learning. Comprehensive experiments on benchmark datasets demonstrate the effectiveness of the proposed framework. Furthermore, GForce opens up opportunities to use physics models to model node interaction for graph learning.	翻訳日:2021-03-09 15:13:49 公開日:2021-03-07
# Bio-JOIE:生物知識基盤の共同表現学習 Bio-JOIE: Joint Representation Learning of Biological Knowledge Bases ( http://arxiv.org/abs/2103.04283v1 ) ライセンス: Link先を確認	Junheng Hao, Chelsea Ju, Muhao Chen, Yizhou Sun, Carlo Zaniolo, Wei Wang	(参考訳) コロナウイルスの流行は世界的なパンデミックを引き起こし、死亡率は高い。現在、このウイルスに関するさまざまな研究から得られた知識は非常に限られている。他の近縁種の遺伝子オントロジーやタンパク質-タンパク質相互作用(PPI)ネットワークなどの幅広い生物学的知識を活用して、新しい種の分子影響を推定する重要なアプローチを提示します。本稿では,遺伝子オントロジーとppiネットワークの知識を捉え,sars-cov-2-ヒトタンパク質相互作用のモデル化における超能力を示す,トランスファーマルチリレーショナル組込みモデルbio-joieを提案する。 Bio-JOIEは2つのモデルコンポーネントを共同でトレーニングする。知識モデルは、GO用語に用いられる階層認識符号化技術を用いて、タンパク質とGOドメインの関連事実を分離埋め込み空間にエンコードする。さらに、トランスファーモデルは、PPIの知識と遺伝子オントロジーアノテーションを埋め込み空間全体に転送するための非線形変換を学習する。構造化知識のみを活用することにより、Bio-JOIEはPPI型予測において、既存の最先端の手法よりも著しく優れている。さらに,酵素活性を有するタンパク質群における学習表現を酵素系に活用する可能性を実証した。最後に,Bio-JOIEはSARS-CoV-2タンパク質とヒトタンパク質のPPIを正確に同定し,本疾患の研究を進める上で貴重な知見を提供する。 The widespread of Coronavirus has led to a worldwide pandemic with a high mortality rate. Currently, the knowledge accumulated from different studies about this virus is very limited. Leveraging a wide-range of biological knowledge, such as gene ontology and protein-protein interaction (PPI) networks from other closely related species presents a vital approach to infer the molecular impact of a new species. In this paper, we propose the transferred multi-relational embedding model Bio-JOIE to capture the knowledge of gene ontology and PPI networks, which demonstrates superb capability in modeling the SARS-CoV-2-human protein interactions. Bio-JOIE jointly trains two model components. The knowledge model encodes the relational facts from the protein and GO domains into separated embedding spaces, using a hierarchy-aware encoding technique employed for the GO terms. On top of that, the transfer model learns a non-linear transformation to transfer the knowledge of PPIs and gene ontology annotations across their embedding spaces. By leveraging only structured knowledge, Bio-JOIE significantly outperforms existing state-of-the-art methods in PPI type prediction on multiple species. Furthermore, we also demonstrate the potential of leveraging the learned representations on clustering proteins with enzymatic function into enzyme commission families. Finally, we show that Bio-JOIE can accurately identify PPIs between the SARS-CoV-2 proteins and human proteins, providing valuable insights for advancing research on this new disease.	翻訳日:2021-03-09 15:07:51 公開日:2021-03-07
# 非線形システムの適応制御指向メタ学習 Adaptive-Control-Oriented Meta-Learning for Nonlinear Systems ( http://arxiv.org/abs/2103.04490v1 ) ライセンス: Link先を確認	Spencer M. Richards, Navid Azizan, Jean-Jacques E. Slotine, and Marco Pavone	(参考訳) リアルタイム適応は、複雑な動的環境で動作するロボットの制御に不可欠である。適応制御則は、不確定なダイナミクス項が既知の非線形特徴で線形にパラメータ化可能であれば、軌道追従性能の良好な非線形システムでさえも付与することができる。しかし、ロータークラフトの空力障害やマニピュレータアームと様々な物体との相互作用力など、先駆的な特徴を特定することはしばしば困難である。本稿では、ニューラルネットワークを用いたデータ駆動モデルを用いて、過去のデータからオフラインで学習し、これらの非線形特徴を内部パラメトリックモデルで適応制御する。私たちの重要な洞察は、入出力データに適合する機能の回帰指向メタ学習よりも、クローズドループシミュレーションにおける機能の制御指向メタラーニングによるデプロイメントのためのコントローラを準備できるということです。具体的には,アダプティブコントローラをメタ学習し,クローズドループ追跡シミュレーションをベースラーナーとし,平均トラッキング誤差をメタ対象とする。風を受ける非線形平面ロータークラフトを用いて,軌道追従制御のためにクローズドループに配置した場合,適応型コントローラが回帰指向メタラーニングにより訓練された他のコントローラよりも優れることを示す。 Real-time adaptation is imperative to the control of robots operating in complex, dynamic environments. Adaptive control laws can endow even nonlinear systems with good trajectory tracking performance, provided that any uncertain dynamics terms are linearly parameterizable with known nonlinear features. However, it is often difficult to specify such features a priori, such as for aerodynamic disturbances on rotorcraft or interaction forces between a manipulator arm and various objects. In this paper, we turn to data-driven modeling with neural networks to learn, offline from past data, an adaptive controller with an internal parametric model of these nonlinear features. Our key insight is that we can better prepare the controller for deployment with control-oriented meta-learning of features in closed-loop simulation, rather than regression-oriented meta-learning of features to fit input-output data. Specifically, we meta-learn the adaptive controller with closed-loop tracking simulation as the base-learner and the average tracking error as the meta-objective. With a nonlinear planar rotorcraft subject to wind, we demonstrate that our adaptive controller outperforms other controllers trained with regression-oriented meta-learning when deployed in closed-loop for trajectory tracking control.	翻訳日:2021-03-09 15:07:27 公開日:2021-03-07

Title

Authors

Abstract

論文公表日・翻訳日

# 量子接触過程における連続的に変化する指数による吸収相転移:ニューラルネットワークアプローチ

Absorbing phase transition with a continuously varying exponent in a quantum contact process: a neural network approach ( http://arxiv.org/abs/2004.02672v4 )

ライセンス: Link先を確認

Minjae Jo, Jongshin Lee, K. Choi, and B. Kahng

(参考訳) 散逸量子系における相転移は、コヒーレント量子と非コヒーレント古典ゆらぎの相互作用によって引き起こされるため興味深い。本稿では,量子接触過程(QCP)における量子から古典的吸収相転移への交叉について検討する。 lindblad方程式は、それぞれ量子効果と古典効果の寄与を調整する2つのパラメータ、$\omega$と$\kappa$を含んでいる。ある次元において、QCPがすべての活性部位を持つ同質状態から始まるとき、指数$\alpha$(活性部位の密度と関連している)が量子から古典的有向パーコレーション(DP)値に連続的に減少する領域に臨界線が存在することが分かる。この挙動は、量子コヒーレント効果がある程度は$\kappa=0$に近いことを示唆する。しかし、1次元のQCPが1つの活性部位を除いてすべての不活性部位を持つ不均一状態から始まるとき、全ての臨界指数は$\kappa \ge 0$の古典的なDP値を持つ。 2次元では、異常なクロスオーバー動作は発生せず、古典的なDPの挙動は初期構成にかかわらず$\kappa \ge 0$の全領域に現れる。ニューラルネットワーク機械学習を用いて臨界線を特定し、相関長指数を決定する。量子ジャンプモンテカルロ法とテンソルネットワーク法による数値シミュレーションを行い、QCPの他の臨界指数を全て決定する。

Phase transitions in dissipative quantum systems are intriguing because they are induced by the interplay between coherent quantum and incoherent classical fluctuations. Here, we investigate the crossover from a quantum to a classical absorbing phase transition arising in the quantum contact process (QCP). The Lindblad equation contains two parameters, $\omega$ and $\kappa$, which adjust the contributions of the quantum and classical effects, respectively. We find that in one dimension when the QCP starts from a homogeneous state with all active sites, there exists a critical line in the region $0 \le \kappa < \kappa_*$ along which the exponent $\alpha$ (which is associated with the density of active sites) decreases continuously from a quantum to the classical directed percolation (DP) value. This behavior suggests that the quantum coherent effect remains to some extent near $\kappa=0$. However, when the QCP in one dimension starts from a heterogeneous state with all inactive sites except for one active site, all the critical exponents have the classical DP values for $\kappa \ge 0$. In two dimensions, anomalous crossover behavior does not occur, and classical DP behavior appears in the entire region of $\kappa \ge 0$ regardless of the initial configuration. Neural network machine learning is used to identify the critical line and determine the correlation length exponent. Numerical simulations using the quantum jump Monte Carlo technique and tensor network method are performed to determine all the other critical exponents of the QCP.

翻訳日:2023-05-26 06:25:12 公開日:2021-03-07

# 原子スペクトルにおける量子時間拡張

Quantum time dilation in atomic spectra ( http://arxiv.org/abs/2006.10084v3 )

ライセンス: Link先を確認

Piotr T. Grochowski, Alexander R. H. Smith, Andrzej Dragan and Kacper D\k{e}bski

(参考訳) 量子時間拡張は、相対論的運動量波パケットの重ね合わせで時計が動くときに起こる。我々は、励起水素様原子の寿命を時計として利用し、自然放出過程において量子時間拡張がどのように現れるかを示す。結果として生じる放出速度は、運動量波パケットの混合で調製された原子の放出速度に対して、順に$v^2/c^2$である場合と異なる。この効果は、運動量波パケット間のコヒーレンスによるドップラーシフトに対する量子補正を伴う。この量子ドップラーシフトは、$v/c$のスペクトル線形状に影響を与える。しかし、その減衰速度への影響は、量子時間拡張の効果と比較して抑制される。我々は、分光実験が量子時間拡張の効果を研究するための技術的に実現可能なプラットフォームを提供すると主張する。

Quantum time dilation occurs when a clock moves in a superposition of relativistic momentum wave packets. We utilize the lifetime of an excited hydrogen-like atom as a clock to demonstrate how quantum time dilation manifests in a spontaneous emission process. The resulting emission rate differs when compared to the emission rate of an atom prepared in a mixture of momentum wave packets at order $v^2/c^2$. This effect is accompanied by a quantum correction to the Doppler shift due to the coherence between momentum wave packets. This quantum Doppler shift affects the spectral line shape at order $v/c$. However, its effect on the decay rate is suppressed when compared to the effect of quantum time dilation. We argue that spectroscopic experiments offer a technologically feasible platform to explore the effects of quantum time dilation.

翻訳日:2023-05-13 15:38:49 公開日:2021-03-07

# チップスケール検出器を用いた効率的・低反発量子計測

Efficient and Low-Backaction Quantum Measurement Using a Chip-Scale Detector ( http://arxiv.org/abs/2008.03805v2 )

ライセンス: Link先を確認

Eric I. Rosenthal, Christian M. F. Schneider, Maxime Malnou, Ziyi Zhao, Felix Leditzky, Benjamin J. Chapman, Waltraut Wustmann, Xizheng Ma, Daniel A. Palken, Maximilian F. Zanner, Leila R. Vale, Gene C. Hilton, Jiansong Gao, Graeme Smith, Gerhard Kirchmair, and K. W. Lehnert

(参考訳) 超伝導量子ビットはスケーラブルな量子コンピューティングと量子誤り訂正のための主要なプラットフォームである。このプラットフォームの1つの特徴は、qubitのデコヒーレンス時間よりも桁違いに射影的測定を行う能力である。このような測定は、量子制限パラメトリック増幅器とフェライト循環器(増幅器のバックアクションによるノイズやデコヒーレンスからの隔離を提供する磁気装置)を併用することで可能となる。これらの非相互要素は性能が限られており、チップ上で簡単に統合できないため、スケーラブルな代替品に置き換えるという長年の目標であった。本稿では,量子ビットと増幅器の結合を制御するための超伝導スイッチを用いて,この問題に対する解法を示す。これにより、1つのチップスケールのデバイスを用いてトランスモン量子ビットを計測し、パラメトリック増幅とアンプバック動作のバルクからの分離の両方を提供する。この測定は高速で信頼性が高く、70%の効率で、超伝導量子ビット測定で報告された最高の測定値に匹敵する。このように、この研究は超伝導量子ビットのスケーラブルな測定のための高品質なプラットフォームを構成する。

Superconducting qubits are a leading platform for scalable quantum computing and quantum error correction. One feature of this platform is the ability to perform projective measurements orders of magnitude more quickly than qubit decoherence times. Such measurements are enabled by the use of quantum-limited parametric amplifiers in conjunction with ferrite circulators - magnetic devices which provide isolation from noise and decoherence due to amplifier backaction. Because these non-reciprocal elements have limited performance and are not easily integrated on-chip, it has been a longstanding goal to replace them with a scalable alternative. Here, we demonstrate a solution to this problem by using a superconducting switch to control the coupling between a qubit and amplifier. Doing so, we measure a transmon qubit using a single, chip-scale device to provide both parametric amplification and isolation from the bulk of amplifier backaction. This measurement is also fast, high fidelity, and has 70% efficiency, comparable to the best that has been reported in any superconducting qubit measurement. As such, this work constitutes a high-quality platform for the scalable measurement of superconducting qubits.

翻訳日:2023-05-06 18:03:43 公開日:2021-03-07

# SYKモデルとJT重力における作用素の複雑性成長

Complexity growth of operators in the SYK model and in JT gravity ( http://arxiv.org/abs/2008.12274v3 )

ライセンス: Link先を確認

Shao-Kai Jian, Brian Swingle, and Zhuo-Yu Xian

(参考訳) 作用素の大きさと計算複雑性の概念は、時間発展するハイゼンベルク作用素の構造を特徴づけるのに役立つため、量子カオスとホログラフィック双対の研究において重要な役割を果たす。特に、これらの顕微鏡的に定義された複雑性の測度が、複雑性体積(CV)双対性のような双対ホログラフィック幾何学で定義される複雑性の概念とどのように関連しているかを理解することが重要である。本稿では, Sachdev-Ye-Kitaev(SYK)モデルにおける部分絡み合った熱状態と, ジャッキー・ティーテルボイム(JT)重力におけるブラックホールの内部に挿入された作用素の双対記述について述べる。我々は,K-複雑度として知られるSYKモデルにおける複雑性の顕微鏡的定義とJT重力におけるCV双対性を用いた計算との比較を行い,両量とも指数的-直線的成長挙動を示すことを示した。また,時間発展に伴うオペレータサイズの成長を計算し,サイズと複雑性の関連を見出す。演算子の大きさの概念はスクランブル時間に飽和するが、量子系と重力理論の両方でよく定義されている複雑性は、初期と後期の両方における演算子の進化の有用な尺度として有用であることが示唆される。

The concepts of operator size and computational complexity play important roles in the study of quantum chaos and holographic duality because they help characterize the structure of time-evolving Heisenberg operators. It is particularly important to understand how these microscopically defined measures of complexity are related to notions of complexity defined in terms of a dual holographic geometry, such as complexity-volume (CV) duality. Here we study partially entangled thermal states in the Sachdev-Ye-Kitaev (SYK) model and their dual description in terms of operators inserted in the interior of a black hole in Jackiw-Teitelboim (JT) gravity. We compare a microscopic definition of complexity in the SYK model known as K-complexity to calculations using CV duality in JT gravity and find that both quantities show an exponential-to-linear growth behavior. We also calculate the growth of operator size under time evolution and find connections between size and complexity. While the notion of operator size saturates at the scrambling time, our study suggests that complexity, which is well defined in both quantum systems and gravity theories, can serve as a useful measure of operator evolution at both early and late times.

翻訳日:2023-05-04 19:40:25 公開日:2021-03-07

# 非線形光パラメトリック増幅器における時空間エンタングルメント

Spatiotemporal entanglement in a noncollinear optical parametric amplifier ( http://arxiv.org/abs/2009.10511v2 )

ライセンス: Link先を確認

L. La Volpe, S. De, M. I. Kolobov, V. Parigi, C. Fabre, N. Treps, D. B. Horoshko

(参考訳) 超短パルスポンプを用いた単パス型非線形周波数縮退パラメトリックダウンコンバージョンにおける2本の絡み合った光の発生を理論的に検討した。本研究では, 時空間的固有値とそれに対応する固有値とを数値的に, 解析的に検索する。解析解は、曲線座標中のガウス関数によって場の合同スペクトル振幅をモデル化することによって得られる。この方法は非常に効率的であり,数値解とよく一致していることを示す。また、生成したビームの総帯域幅が十分に高い場合には、モーダル関数を空間的および時間的部分に分解することはできないが、ポンプを短くすることで強度を増大させることができる時空間結合を示す。

We theoretically investigate the generation of two entangled beams of light in the process of single-pass type-I noncollinear frequency degenerate parametric downconversion with an ultrashort pulsed pump. We find the spatio-temporal squeezing eigenmodes and the corresponding squeezing eigenvalues of the generated field both numerically and analytically. The analytical solution is obtained by modeling the joint spectral amplitude of the field by a Gaussian function in curvilinear coordinates. We show that this method is highly efficient and is in a good agreement with the numerical solution. We also reveal that when the total bandwidth of the generated beams is sufficiently high, the modal functions cannot be factored into a spatial and a temporal parts, but exhibit a spatio-temporal coupling, whose strength can be increased by shortening the pump.

翻訳日:2023-05-01 07:06:42 公開日:2021-03-07

# 量子位相スリップによる光子の非弾性散乱

Inelastic scattering of a photon by a quantum phase-slip ( http://arxiv.org/abs/2010.02099v2 )

ライセンス: Link先を確認

Roman Kuzmin, Nicholas Grabon, Nitish Mehta, Amir Burshtein, Moshe Goldstein, Manuel Houzet, Leonid I. Glazman, Vladimir E. Manucharyan

(参考訳) 単一光子の自然崩壊は、周波数範囲に関係なく自然界で悪名高い非効率過程である。高インピーダンス超伝導導波路における量子位相-滑り変動は、単一入射マイクロ波光子を、ほぼ単位確率で多数の低エネルギー光子に分割することができることを報告した。基礎となる非弾性光子-光子相互作用は非線形光学ではアナログを持たない。代わりに、測定された崩壊速度は、ルッティンガー液体の量子不純物の新しいモデルの枠組みにおいて調整可能なパラメータなしで説明される。この結果は、強相関系の物理学において重要な2次元境界場の臨界現象と回路量子電磁力学を結びつける。光子寿命データは、検証され有用な量子多体シミュレーションの珍しい例である。

Spontaneous decay of a single photon is a notoriously inefficient process in nature irrespective of the frequency range. We report that a quantum phase-slip fluctuation in high-impedance superconducting waveguides can split a single incident microwave photon into a large number of lower-energy photons with a near unit probability. The underlying inelastic photon-photon interaction has no analogs in non-linear optics. Instead, the measured decay rates are explained without adjustable parameters in the framework of a new model of a quantum impurity in a Luttinger liquid. Our result connects circuit quantum electrodynamics to critical phenomena in two-dimensional boundary quantum field theories, important in the physics of strongly-correlated systems. The photon lifetime data represents a rare example of verified and useful quantum many-body simulation.

翻訳日:2023-04-29 22:37:01 公開日:2021-03-07

# 磁気共鳴力センサを用いたスピン質量相互作用の探索

Searching spin-mass interaction using a diamagnetic levitated magnetic resonance force sensor ( http://arxiv.org/abs/2010.14199v3 )

ライセンス: Link先を確認

Fang Xiong, Tong Wu, Yingchun Leng, Rui Li, Changkui Duan, Xi Kong, Pu Huang, Zhengwei Li, Yu Gao, Xing Rong and Jiangfeng Du

(参考訳) 軸索状粒子(ALP)は、スピンと質量の間のエキゾチック相互作用を媒介する。近距離でのスピン質量力に対する最も感度の高いセンサの1つである浮揚マイクロメカニカル発振器に基づくalp探索実験を提案する。提案実験は、偏極電子スピンと磁気浮上マイクロスフィアとの間のスピン質量共鳴相互作用をテストする。電子スピンを周期的に反転させることで、非共鳴背景力からの汚染を除去することができる。浮遊マイクロオシレータは、現在の4 meVから0.4 eVの質量を持つALPの10^3$倍近い感度を期待できる。

Axion-like particles (ALPs) are predicted to mediate exotic interactions between spin and mass. We propose an ALP-searching experiment based on the levitated micromechanical oscillator, which is one of the most sensitive sensors for spin-mass forces at a short distance. The proposed experiment tests the spin-mass resonant interaction between the polarized electron spins and a diamagnetically levitated microsphere. By periodically flipping the electron spins, the contamination from nonresonant background forces can be eliminated. The levitated microoscillator can prospectively enhance the sensitivity by nearly $10^3$ times over current experiments for ALPs with mass in the range 4 meV to 0.4 eV.

翻訳日:2023-04-27 08:51:56 公開日:2021-03-07

# 擬似状態スキームによる多量子Rydberg量子論理ゲート

Multiple-qubit Rydberg quantum logic gate via dressed-states scheme ( http://arxiv.org/abs/2010.14704v2 )

ライセンス: Link先を確認

Yucheng He, Jing-Xin Liu, F.-Q. Guo, Lei-Lei Yan, Ronghui Luo, Erjun Liang, Shi-Lei Su, M. Feng

(参考訳) 本稿では,ビタノフ式パルスと,ライドベルク原子の非断熱性(STA)の利点を組み合わせたマルチキュービット量子状態転送と量子論理ゲートを実現する手法を提案する。自発放出に対するスキームのロバスト性は、STA技術を通じてリドベルク励起状態の人口を減少させることによって達成できる。一方、制御誤差はよく設計されたパルスを使うことで最小化できる。さらに, この方式では, 量子状態伝達を高忠実度でスムーズにオン/オフし, 従来の断熱法の近道よりも高速に行うことができる。 Rydberg antiblockade (RAB) 効果を用いることで、パラメータの一般的な選択条件の下でマルチキュービットトフォリゲートを構築することができる。

We present a scheme to realize multiple-qubit quantum state transfer and quantum logic gate by combining the advantages of Vitanov-style pulses and dressed-state-based shortcut to adiabaticity (STA) in Rydberg atoms. The robustness of the scheme to spontaneous emission can be achieved by reducing the population of Rydberg excited states through the STA technology. Meanwhile, the control errors can be minimized through using the well-designed pulses. Moreover, the dressed-state method applied in the scheme makes the quantum state transfer more smoothly turned on or off with high fidelity and also faster than traditional shortcut to adiabaticity methods. By using Rydberg antiblockade (RAB) effect, the multiple-qubit Toffoli gate can be constructed under a general selection conditions of the parameters.

翻訳日:2023-04-27 06:31:51 公開日:2021-03-07

# 量子カーペットの単一から多体へのクロスオーバー

Single- to many-body crossover of a quantum carpet ( http://arxiv.org/abs/2011.04582v2 )

ライセンス: Link先を確認

Maciej {\L}ebek, Piotr T. Grochowski, Kazimierz Rz\k{a}\.zewski

(参考訳) 量子カーペットパターンを示すボソンの強相互作用多体系をガウディン解を用いて正確に研究した。箱電位に閉じ込められた超低温ボソニックガス中の弱い原子間相互作用により、非相互作用の単体シナリオに通常存在するこの高一貫性設計が破壊されることを示す。しかし、システムがフェミオン化を受けると、非常に強く相互作用する体制で復活する。単体から多体へのクロスオーバー全体を追跡し,システム内に存在するデ・デファクト・デファレンスの分析を行う。

Strongly interacting many-body system of bosons exhibiting the quantum carpet pattern is investigated exactly by using Gaudin solutions. We show that this highly coherent design usually present in noninteracting, single-body scenarios gets destroyed by weak-to-moderate interatomic interactions in an ultracold bosonic gas trapped in a box potential. However, it becomes revived in a very strongly interacting regime, when the system undergoes fermionization. We track the whole single- to many-body crossover, providing an analysis of de- and rephasing present in the system.

翻訳日:2023-04-24 21:18:18 公開日:2021-03-07

# schr\"odinger--newton基底状態の漸近的崩壊について

On the asymptotic decay of the Schr\"odinger--Newton ground state ( http://arxiv.org/abs/2101.01296v4 )

ライセンス: Link先を確認

Michael K.-H. Kiessling

(参考訳) 基底状態 $u(r)$ of the Schr\"odinger--Newton equation in $\mathbb{R}^3$ の漸近は V. Moroz と J. van Schaftingen によって$u(r) \sim A e^{-r}/ r^{1 - \|u\|_2^2/8\pi}$ と決定された。彼らは、$\|u\|_2^2$, squared $L^2$ norm of $u$ の値を残した。ここで、2^{1/3}3\pi^2\leq \|u\|_2^2\leq 2^{3}\pi^{3/2}$ が厳密に示される。数値的に$\|u\|_2^2\approx 14.03\pi$ と報告されており、$e^{-r}$ の単項プレファクタは、凹凸な方法で$r$ に増加する。シンガー-ニュートン方程式は外部の$\sim - K/r$電位を持ち、関連するボゾン原子やイオンのハートリー方程式に対しては漸近的な結果が提案される。

The asymptotics of the ground state $u(r)$ of the Schr\"odinger--Newton equation in $\mathbb{R}^3$ was determined by V. Moroz and J. van Schaftingen to be $u(r) \sim A e^{-r}/ r^{1 - \|u\|_2^2/8\pi}$ for some $A>0$, in units that fix the exponential rate to unity. They left open the value of $\|u\|_2^2$, the squared $L^2$ norm of $u$. Here it is rigorously shown that $2^{1/3}3\pi^2\leq \|u\|_2^2\leq 2^{3}\pi^{3/2}$. It is reported that numerically $\|u\|_2^2\approx 14.03\pi$, revealing that the monomial prefactor of $e^{-r}$ increases with $r$ in a concave manner. Asymptotic results are proposed for the Schr\"odinger--Newton equation with external $\sim - K/r$ potential, and for the related Hartree equation of a bosonic atom or ion.

翻訳日:2023-04-17 20:13:28 公開日:2021-03-07

# 変形モース様電位

Deformed Morse-like potential ( http://arxiv.org/abs/2101.09703v2 )

ライセンス: Link先を確認

I. A. Assi, A. D. Alhaidari and H. Bahlouli

(参考訳) 我々は、結合状態と共鳴状態の両方をサポートする完全可解な1次元ポテンシャルを導入する。このポテンシャルはよく知られた1次元モースポテンシャルの一般化であり、有限スペクトル特性を保存する変形を導入した。一方、ゼロ変形の極限において、このポテンシャルは A. D. Alhaidari が最近導入した指数的収束ポテンシャルに還元される。後者のポテンシャルは無限スペクトルをサポートするため、ゼロ変形極限は、我々の系が有限スペクトル極限から無限スペクトル極限へ遷移する臨界点である。対応するシュロディンガー方程式を解き、三対角表現法を用いてエネルギースペクトルと固有状態を得る。

We introduce an exactly solvable one-dimensional potential that supports both bound and/or resonance states. This potential is a generalization of the well-known 1D Morse potential where we introduced a deformation that preserves the finite spectrum property. On the other hand, in the limit of zero deformation, the potential reduces to the exponentially confining potential well introduced recently by A. D. Alhaidari. The latter potential supports infinite spectrum which means that the zero deformation limit is a critical point where our system will transition from the finite spectrum limit to the infinite spectrum limit. We solve the corresponding Schrodinger equation and obtain the energy spectrum and the eigenstates using the tridiagonal representation approach.

翻訳日:2023-04-14 02:31:21 公開日:2021-03-07

# ニューラルネットワークを用いた携帯機器の個人リスクプロファイリングによる多変量状況リスク評価に対する認知反応と感情応答の処理

Individual risk profiling for portable devices using a neural network to process the cognitive reactions and the emotional responses to a multivariate situational risk assessment ( http://arxiv.org/abs/2103.00441v2 )

ライセンス: Link先を確認

Frederic Jumelle, Kelvin So, and Didan Deng

(参考訳) 本稿では,認知と感情の関連性を確立するための,神経心理学的パフォーマンステストのための新しい方法とシステムを提案する。ユーザ情報をユーザ名の下に記憶し、携帯装置を介してユーザによってログインされるクラウドサービスと対話するために使用される携帯装置と、このユーザ情報をデバイスを介して直接キャプチャして、人工ニューラルネットワークで処理し、この3次元情報は、ユーザ認知反応、ユーザ感情応答及びユーザクロノメトリを含む。多変量状況リスクアセスメント(multivariate situational risk assessment)は、日常生活のさまざまな状況を記述する一連の30のディコトナス質問に対する各反応の3次元を捉え、ユーザの知識、価値観、倫理、原則に挑戦することで、被験者のパフォーマンスを評価するために使用される。産業用アプリケーションでは、この評価のタイミングは、銀行口座の開設、住宅ローンまたは保険契約の取得、職場での認可の認証、オンライン決済の確保など、提供者からのサービス取得の必要性に依存する。

In this paper, we are presenting a novel method and system for neuropsychological performance testing that can establish a link between cognition and emotion. It comprises a portable device used to interact with a cloud service which stores user information under username and is logged into by the user through the portable device; the user information is directly captured through the device and is processed by artificial neural network; and this tridimensional information comprises user cognitive reactions, user emotional responses and user chronometrics. The multivariate situational risk assessment is used to evaluate the performance of the subject by capturing the 3 dimensions of each reaction to a series of 30 dichotomous questions describing various situations of daily life and challenging the user's knowledge, values, ethics, and principles. In industrial application, the timing of this assessment will depend on the user's need to obtain a service from a provider such as opening a bank account, getting a mortgage or an insurance policy, authenticating clearance at work or securing online payments.

翻訳日:2023-04-09 16:43:02 公開日:2021-03-07

# 2ビットT状態におけるステアリングの必要十分基準

Necessary and sufficient criterion of steering for two-qubit T states ( http://arxiv.org/abs/2103.04280v1 )

ライセンス: Link先を確認

Xiao-Gang Fan, Huan Yang, Fei Ming, Xue-Ke Song, Dong Wang and Liu Ye

(参考訳) アインシュタイン・ポドルスキー・ローゼン(einstein-podolsky-rosen、epr)は、観測者が遠方の観測者に局所的な測定を行うことで絡み合いを共有するよう説得する能力である。量子状態の定式化は、まだ未解決の問題である。ここでは,任意の2量子T状態に対応する無限の測定値を持つ新しいステアリング不等式を,各面のN射影測定設定によるEPRステアリング不等式を考慮して導出した。実際、操舵の不等式は、T状態が操舵不能であることを保証するための十分な基準でもある。したがって、操舵不等式は、t状態が操舵可能か否かを識別するために必要な十分な基準と見なすことができる。ステアブル状態からなる集合が絡み合った状態からなる集合の厳密な部分集合であるという事実を明らかにするために、理論的にはすべての分離可能なT状態がステアリングの不等式に違反できないことを証明している。さらに,T状態が1/4を超える場合,T状態が評価可能であることを示すため,任意の2ビットT状態に対するコンカレンスからの最大違反を推定する手法を提案した。

Einstein-Podolsky-Rosen (EPR) steering is the ability that an observer persuades a distant observer to share entanglement by making local measurements. Determining a quantum state is steerable or unsteerable remains an open problem. Here, we derive a new steering inequality with infinite measurements corresponding to an arbitrary two-qubit T state, from consideration of EPR steering inequalities with N projective measurement settings for each side. In fact, the steering inequality is also a sufficient criterion for guaranteering that the T state is unsteerable. Hence, the steering inequality can be viewed as a necessary and sufficient criterion to distinguish whether the T state is steerable or unsteerable. In order to reveal the fact that the set composed of steerable states is the strict subset of the set made up of entangled states, we prove theoretically that all separable T states can not violate the steering inequality. Moreover, we put forward a method to estimate the maximum violation from concurrence for arbitrary two-qubit T states, which indicates that the T state is steerable if its concurrence exceeds 1/4.

翻訳日:2023-04-08 20:22:56 公開日:2021-03-07

# 内部量子非分離性と外部古典相関のトレードオフの観測

Observation of the tradeoff between internal quantum nonseparability and external classical correlations ( http://arxiv.org/abs/2103.04276v1 )

ライセンス: Link先を確認

Jie Zhu, Yue Dai, S. Camalet, Cheng-Jie Zhang, Bi-Heng Liu, Chuan-Feng Li, Guang-Can Guo, and Yong-Sheng Zhang

(参考訳) 絡み合いの単元関係は非常に重要である。しかし、それらは異なるサブシステムによって共有される絡み合いの量だけを含む。絡み合いと他の種類の相関、特に古典的相関の間の単元関係は、非常に少ない。ここでは、内部量子非分離性とフォトニック系の外部総相関のトレードオフ関係を実験的に観察し、純粋に古典的な外部相関でさえ内部非分離性に有害な影響を及ぼすことを示した。コンカレンスによって測定された非分離性は、同じ光子内の異なる自由度の間であり、標準量子相互情報によって測定された外部古典相関は、タイムビン法を用いて光子対の光子間で生成される。この結果から,システムの内部絡み合いを保つためには,システムと環境との間に,古典的相関を含む低い外部相関を維持する必要があることがわかった。

The monogamy relations of entanglement are highly significant. However, they involve only amounts of entanglement shared by different subsystems. Results on monogamy relations between entanglement and other kinds of correlations, and particularly classical correlations, are very scarce. Here we experimentally observe a tradeoff relation between internal quantum nonseparability and external total correlations in a photonic system and found that even purely classical external correlations have a detrimental effect on internal nonseparability. The nonseparability we consider, measured by the concurrence, is between different degrees of freedom within the same photon, and the external classical correlations, measured by the standard quantum mutual information, are generated between the photons of a photon pair using the time-bin method. Our observations show that to preserve the internal entanglement in a system, it is necessary to maintain low external correlations, including classical ones, between the system and its environment.

翻訳日:2023-04-08 20:22:34 公開日:2021-03-07

# 量子補間アンサンブル:平均エントロピーと直交多項式

Quantum interpolating ensemble: Average entropies and orthogonal polynomials ( http://arxiv.org/abs/2103.04231v1 )

ライセンス: Link先を確認

Lu Wei and Nicholas Witte

(参考訳) 密度行列形式は、量子情報処理における様々な問題を研究するための基本的な道具である。密度行列の空間において、最もよく知られ、物理的に関係のある尺度はヒルベルト=シュミットのアンサンブルとビュール=ハルのアンサンブルである。本研究では,量子補間アンサンブル(quantum interpolating ensemble)と呼ばれる密度行列の一般化アンサンブルを提案する。提案するアンサンブルを理解するための第一歩として,いくつかの最近の結果を一般化したアンサンブル上の絡み合いエントロピーの正確な平均式を導出する。また、対応する直交多項式のいくつかの重要な性質を導出し、エントロピーの他の統計情報を得る。数値実験の結果,量子状態の絡み合いの程度を推定するアンサンブルの有用性が示された。

The density matrix formalism is a fundamental tool in studying various problems in quantum information processing. In the space of density matrices, the most well-known and physically relevant measures are the Hilbert-Schmidt ensemble and the Bures-Hall ensemble. In this work, we propose a generalized ensemble of density matrices, termed quantum interpolating ensemble, which is able to interpolate between these two seemingly unrelated ensembles. As a first step to understand the proposed ensemble, we derive the exact mean formulas of entanglement entropies over such an ensemble generalizing several recent results in the literature. We also derive some key properties of the corresponding orthogonal polynomials relevant to obtaining other statistical information of the entropies. Numerical results demonstrate the usefulness of the proposed ensemble in estimating the degree of entanglement of quantum states.

翻訳日:2023-04-08 20:21:58 公開日:2021-03-07

# 中性子干渉計と重力の短距離修正試験

Neutron interferometry and tests of short-range modifications of gravity ( http://arxiv.org/abs/2103.04218v1 )

ライセンス: Link先を確認

J. M. Rocha and F. Dahia

(参考訳) 中性子干渉計による重力の短距離修正実験を, 大型余剰次元のシナリオで検討する。拡張源の内部重力ポテンシャルの計算における非計算可能性問題(ゼロ幅ブレーンモデルに典型的な)を避けるため、厚いブレーン理論の文脈で、入射中性子と物質媒体との間の高次元重力相互作用に関連する中性子光学ポテンシャルを決定する。このようにして、中性子干渉計が制約できる余剰次元モデルの物理量を特定する。また,位相シフト器を電場とする干渉計測実験を,Aharanov-Casher効果の試験として検討した。この実験は、この非バリロン源を用いて、圧力の容量と重力を生成する内部エネルギーを測定するポストニュートンパラメータの短距離挙動の試験とみなすことができる。

We consider tests of short-distance modifications of gravity based on neutron interferometry in the scenario of large extra dimensions. Avoiding the non-computability problem in the calculation of the internal gravitational potential of extended sources, typical of models with zero-width brane, we determine the neutron optical potential associated with the higher-dimension gravitational interaction between the incident neutron and a material medium in the context of thick brane theories. Proceeding this way, we identify the physical quantity of the extra dimension model that the neutron interferometry is capable of constraining. We also consider interferometric experiments in which the phase shifter is an electric field, as in the test of the Aharanov-Casher effect. We argue that this experiment, with this non-baryonic source, can be viewed as a test of the short-range behavior of Post-Newtonian parameters that measure the capacity of the pressure and the internal energy for producing gravity.

翻訳日:2023-04-08 20:21:43 公開日:2021-03-07

# 自由粒子球面波、擬似調和発振器、三重型ポテンシャルに対するdunkl微分を持つschr\"odinger方程式の厳密解

Exact solutions of the Schr\"odinger Equation with Dunkl Derivative for the Free-Particle Spherical Waves, the Pseudo-Harmonic Oscillator and the Mie-type Potential ( http://arxiv.org/abs/2103.04461v1 )

ライセンス: Link先を確認

R. D. Mota and D. Ojeda-Guill\'en

(参考訳) 我々は、自由粒子、擬調和振動子、三重型ポテンシャルに対するシュル=オディンガー方程式をダンケル微分で正確に3次元で解く。半径部と角部に関する方程式は、球座標と変数の分離を用いて得られる。これらのポテンシャルの波動関数とエネルギースペクトルは解析的な方法で導出され、dunkl微分パラメータを取り除いたときに報告された値に十分に減少することが示されている。

We solve exactly the Schr\"odinger equation for the free-particle, the pseudo-harmonic oscillator and the Mie-type potential in three dimensions with the Dunkl derivative. The equations for the radial and angular parts are obtained by using spherical coordinates and separation of variables. The wave functions and the energy spectrum for these potentials are derived in an analytical way and it is shown that our results are adequately reduced to those previously reported when we remove the Dunkl derivative parameters.

翻訳日:2023-04-08 20:17:35 公開日:2021-03-07

# インドニュースメディアの話題webページにおけるディファレンシャルトラッキング

Differential Tracking Across Topical Webpages of Indian News Media ( http://arxiv.org/abs/2103.04442v1 )

ライセンス: Link先を確認

Yash Vekaria, Vibhor Agarwal, Pushkal Agarwal, Sangeeta Mahapatra, Sakthi Balan Muthiah, Nishanth Sastry, Nicolas Kourtellis

(参考訳) オンラインユーザーのプライバシーと追跡は近年広く研究されており、特にEUと米国におけるプライバシーと個人データに関する法律(General Data Protection Regulation、ePrivacy Regulation、California Consumer Privacy Act)によって研究されている。調査により、世界中のウェブサイトで第1および第3者が採用する新しい追跡方法と個人識別可能な情報漏洩方法、およびそのようなウェブサイトで実施される追跡の強度が明らかになった。しかし、ウェブの大部分をカバーするためにスケールするため、過去のほとんどの研究はウェブサイトのホームページに焦点をあて、トピックのサブページにおける追跡の慣行を深く見なかった。研究の大半は、EUや米国のようなグローバル・ノース市場に焦点を当てた。世界の人口の20%をカバーし、明確なプライバシー法を持たないインドのような大市場は、この点で研究されていない。これらのギャップに対処し、以下の研究課題に焦点をあてることを目的としている。インドのニュースサイトにおけるトピックのサブページの追跡は、彼らのホームページと異なるのか? サードパーティのトラッカーは特定のトピックを追跡するのを好むか? この選好は、これらのトピックのサブページで示される内容の類似性と比較してどうだろうか? そこで本研究では,これらの疑問に答えるべく,インドニュースのトピック・サブページをurlの詳細に基づいて自動抽出・分類する手法を提案する。特定したトピックのサブページを調査し,クッキー注入の強度やサードパーティの埋め込み度,タイプについてホームページと比較した。サブページ間、およびサブページとホームページ間で異なるユーザトラッキングを見つける。また、特定のトピックに対するサードパーティのトラッカーの優先的なアタッチメントも見つけました。また、組み込みサードパーティは特定のサブページを同時に追跡する傾向があり、ユーザのプロファイリングが実行可能である。

Online user privacy and tracking have been extensively studied in recent years, especially due to privacy and personal data-related legislations in the EU and the USA, such as the General Data Protection Regulation, ePrivacy Regulation, and California Consumer Privacy Act. Research has revealed novel tracking and personal identifiable information leakage methods that first- and third-parties employ on websites around the world, as well as the intensity of tracking performed on such websites. However, for the sake of scaling to cover a large portion of the Web, most past studies focused on homepages of websites, and did not look deeper into the tracking practices on their topical subpages. The majority of studies focused on the Global North markets such as the EU and the USA. Large markets such as India, which covers 20% of the world population and has no explicit privacy laws, have not been studied in this regard. We aim to address these gaps and focus on the following research questions: Is tracking on topical subpages of Indian news websites different from their homepage? Do third-party trackers prefer to track specific topics? How does this preference compare to the similarity of content shown on these topical subpages? To answer these questions, we propose a novel method for automatic extraction and categorization of Indian news topical subpages based on the details in their URLs. We study the identified topical subpages and compare them with their homepages with respect to the intensity of cookie injection and third-party embeddedness and type. We find differential user tracking among subpages, and between subpages and homepages. We also find a preferential attachment of third-party trackers to specific topics. Also, embedded third-parties tend to track specific subpages simultaneously, revealing possible user profiling in action.

翻訳日:2023-04-08 20:17:24 公開日:2021-03-07

# 非平衡グリーン関数から密度行列の量子マスター方程式と時間外相関式へ:定常状態と断熱力学

From non-equilibrium Green's functions to quantum master equations for the density matrix and out-of-time-order correlators: steady state and adiabatic dynamics ( http://arxiv.org/abs/2103.04373v1 )

ライセンス: Link先を確認

Bibek Bhandari, Rosario Fazio, Fabio Taddei and Liliana Arrachea

(参考訳) 低速駆動下での有限量子系と温度の異なる熱貯水池との弱結合を考える。本稿では,密度行列と時間外相関行列に対する量子マスター方程式の体系的導出について述べる。我々は顕微鏡ハミルトンから始まり、シュウィンガー・ケルディシュ非平衡グリーンの関数形式に関連付けてこれらの量の力学を支配づける方程式を定式化し、系と貯水池の間の結合の摂動展開を行う。本研究では, システム-貯留層結合による緩和時間と運転に伴う時間スケールとの比の線形応答を考慮した断熱力学に着目した。粒子とエネルギーのフラックスを計算しますボソニック貯水池に結合したクトリットとフェルミイオン貯水池に付随する一対の相互作用する量子ドットの場合の形式論を説明し、コヒーレント効果の関連性についても論じる。

We consider a finite quantum system under slow driving and weakly coupled to thermal reservoirs at different temperatures. We present a systematic derivation of the quantum master equation for the density matrix and the out-of-time-order correlators. We start from the microscopic Hamiltonian and we formulate the equations ruling the dynamics of these quantities by recourse to the Schwinger-Keldysh non-equilibrium Green's function formalism, performing a perturbative expansion in the coupling between the system and the reservoirs. We focus on the adiabatic dynamics, which corresponds to considering the linear response in the ratio between the relaxation time due to the system-reservoir coupling and the time scale associated to the driving. We calculate the particle and energy fluxes. We illustrate the formalism in the case of a qutrit coupled to bosonic reservoirs and of a pair of interacting quantum dots attached to fermionic reservoirs, also discussing the relevance of coherent effects.

翻訳日:2023-04-08 20:16:56 公開日:2021-03-07

# クライン空間の不確かさ原理

Uncertainty Principles in Krein Space ( http://arxiv.org/abs/2103.04372v1 )

ライセンス: Link先を確認

Sirous Homayouni and Angelo B. Mingarelli

(参考訳) 2つの一般非可換自己共役作用素間の不確かさ関係はクレイン空間で導かれる。これらの関係はすべてクレイン空間によって誘導される基本対称性作用素 $j$ を含み、これらの一般化された関係のいくつかは、問題の2つの作用素の反交換子、可換子、その他の様々な非線形関数を含んでいる。その結果、ヒルベルト空間上の非自己共役作用素のクラスが存在し、その可換作用素の非有界性は不確実性関係を意味する。すべての関係は、フォン・ノイマンらによってヒルベルト空間で定式化された古典的なハイゼンベルクの不確実性原理を含む。さらに、クレイン空間における作用素依存(非線形)可換不確かさ関係を導出する。

Uncertainty relations between two general non-commuting self-adjoint operators are derived in a Krein space. All of these relations involve a Krein space induced fundamental symmetry operator, $J$, while some of these generalized relations involve an anti-commutator, a commutator, and various other nonlinear functions of the two operators in question. As a consequence there exist classes of non-self-adjoint operators on Hilbert spaces such that the non-vanishing of their commutator implies an uncertainty relation. All relations include the classical Heisenberg uncertainty principle as formulated in Hilbert Space by Von Neumann and others. In addition, we derive an operator dependent (nonlinear) commutator uncertainty relation in Krein space.

翻訳日:2023-04-08 20:16:34 公開日:2021-03-07

# ハイブリッド量子アプリケーションには2つのオーケストレーションが必要:ソフトウェアアーキテクチャの観点から

Hybrid Quantum Applications Need Two Orchestrations in Superposition: A Software Architecture Perspective ( http://arxiv.org/abs/2103.04320v1 )

ライセンス: Link先を確認

Frank Leymann, Johanna Barzen

(参考訳) 量子アプリケーションはしばしばハイブリッドであり、純粋な量子アルゴリズムの実装だけでなく、ワークフローやトポロジーを主要なアーティファクトとして、そして処理するデータから作られている。ワークフローとトポロジは現代の用語ではオーケストレーションと呼ばれる(しかし、全く異なる意味を持つ)ため、量子アプリケーションを実現するには2つのオーケストレーションが必要である。これらのオーケストレーション技術をスケッチし、非自明な量子アプリケーションの全体構造と、そのようなアプリケーションの実行環境の暗黙のアーキテクチャを明らかにします。

Quantum applications are most often hybrid, i.e. they are not only made of implementations of pure quantum algorithms but also of classical programs as well as workflows and topologies as key artifacts, and data they process. Since workflows and topologies are referred to as orchestrations in modern terminology (but with very different meaning), two orchestrations that go hand-in-hand are required to realize quantum applications. We motivate this by means of a non-trivial example, sketch these orchestration technologies and reveal the overall structure of nontrivial quantum applications as well as the implied architecture of a runtime environment for such applications.

翻訳日:2023-04-08 20:16:06 公開日:2021-03-07

# 古典力学と量子力学におけるパワーロー双対性

Power law duality in classical and quantum mechanics ( http://arxiv.org/abs/2103.04308v1 )

ライセンス: Link先を確認

Akira Inomata and Georg Junker

(参考訳) ニュートン-フック双対性と、古典的、半古典的、量子力学における任意のパワー則への一般化について論じる。我々は、パワーロー双対性は一連の双対演算の下での作用の対称性であるという考えを追求する。パワー双対対称性はハミルトンの特徴関数の形での作用の不変性と相互性によって定義される。パワーロー双対性は基本的に古典的な概念であり、角量子化のレベルで分解される。量子力学における双対対称性を保存するためのアドホック手順を提案する。エネルギー結合交換写像は、ある系を別の系に導く双対性演算の一部として必要であり、新しいエネルギーと古いエネルギーを関連付けるエネルギー公式に繋がる。放射状schr\"odinger方程式を満たす {the} グリーン関数の変換特性は、新しいグリーン関数を古い関数に関連付ける公式を与える。分数パワーポテンシャルにおける線形運動のエネルギースペクトルを半古典的に評価する。超対称半古典的作用におけるクーロン・フック双対性を示す方法を見出す。また,2項のパワーポテンシャルの双対構造の助けを借りて,閉じ込めポテンシャル問題についても検討する。

The Newton--Hooke duality and its generalization to arbitrary power laws in classical, semiclassical and quantum mechanics are discussed. We pursue a view that the power-law duality is a symmetry of the action under a set of duality operations. The power dual symmetry is defined by invariance and reciprocity of the action in the form of Hamilton's characteristic function. We find that the power-law duality is basically a classical notion and breaks down at the level of angular quantization. We propose an ad hoc procedure to preserve the dual symmetry in quantum mechanics. The energy-coupling exchange maps required as part of the duality operations that take one system to another lead to an energy formula that relates the new energy to the old energy. The transformation property of {the} Green function satisfying the radial Schr\"odinger equation yields a formula that relates the new Green function to the old one. The energy spectrum of the linear motion in a fractional power potential is semiclassically evaluated. We find a way to show the Coulomb--Hooke duality in the supersymmetric semiclassical action. We also study the confinement potential problem with the help of the dual structure of a two-term power potential.

翻訳日:2023-04-08 20:15:55 公開日:2021-03-07

# 異方性を有する高利得パラメトリックダウンコンバージョンにおける明るい相関双ビーム発生と放射整形

Bright correlated twin-beam generation and radiation shaping in high-gain parametric down-conversion with anisotropy ( http://arxiv.org/abs/2103.04305v1 )

ライセンス: Link先を確認

M. Riabinin, P. R. Sharapova, T. Meier

(参考訳) 非線形複屈折結晶における一軸異方性は非線形光学相互作用の効率を制限し、パラメトリックダウンコンバージョン(PDC)プロセスで生じる光の空間対称性を破る。したがって、この効果は通常望ましくないものであり、補償しなければならない。しかし、高利得は異方性の破壊的な役割を克服し、代わりに明るい2モード相関双ビームの生成に使うことができる。本研究では,強い異方性の存在下での明るい励起光の空間特性に関する厳密な理論的記述を提供する。本研究では, 単結晶および2結晶構造について検討し, 異方性による高利得で, 輝く相関したツインビームの発生を示す。生成した光のモード構造を探索し, 結晶間隔とともに異方性がどのように放射形成に利用できるかを示す。

Uniaxial anisotropy in nonlinear birefringent crystals limits the efficiency of nonlinear optical interactions and breaks the spatial symmetry of light generated in the parametric down-conversion (PDC) process. Therefore, this effect is usually undesirable and must be compensated for. However, high gain may be used to overcome the destructive role of anisotropy and instead to use it for the generation of bright two-mode correlated twin-beams. In this work, we provide a rigorous theoretical description of the spatial properties of bright squeezed light in the presence of strong anisotropy. We investigate a single-crystal and a two-crystal configuration and demonstrate the generation of bright correlated twin-beams in such systems at high gain due to anisotropy. We explore the mode structure of the generated light and show how anisotropy, together with crystal spacing, can be used for radiation shaping.

翻訳日:2023-04-08 20:15:38 公開日:2021-03-07

# ランダム性を用いた地域性, リアリズム, エルゴディダリティの決定

Using Randomness to decide among Locality, Realism and Ergodicity ( http://arxiv.org/abs/2001.01752v2 )

ライセンス: Link先を確認

Alejandro Hnilo

(参考訳) ループホールのない実験では、ベルの不平等の違反が観測されたとき、少なくとも3つの特徴のうちの1つが偽であることを示した。発見するために、または少なくとも指示を得るために実験が提案され、どれが偽であるかが示される。これは、パルス化されたベルのセットアップで見つからない一連の結果の速度の時間進化を記録することに基づいている。このような実験の結果は量子力学の基礎だけでなく重要なものとなる。例えば、基礎的な問題がまだ完全に決定されていなくても、量子認証ランダム数生成器の効率的な使用と、絡み合った状態を用いた量子鍵分布の安全性に即時的影響を与えることになる。

Loophole-free experiments have demonstrated that at least one of three features is false when the violation of Bell's inequalities is observed: Locality, Realism or (what is lesser known) Ergodicity. An experiment is proposed to find out, or at least to get an indication about, which one is false. It is based on recording the time evolution of the rate of series of outcomes that are found not-random in a pulsed Bell's setup. The results of such experiment would be important not only to the foundations of Quantum Mechanics. For, even if the foundational issue remained not fully decided, they would have immediate practical impact on the efficient use of quantum-certified Random Number Generators and the security of Quantum Key Distribution using entangled states.

翻訳日:2023-01-14 02:45:22 公開日:2021-03-07

# 選択的に分解されたデータによる因果推論

Causal Inference With Selectively Deconfounded Data ( http://arxiv.org/abs/2002.11096v4 )

ライセンス: Link先を確認

Kyra Gan, Andrew A. Li, Zachary C. Lipton, Sridhar Tayur

(参考訳) 標準的なコンバウンディンググラフと保存されていないコンファウンダリで生成されたデータのみを考えると、平均処理効果(ATE)は識別できない。 ATEを見積もるには、実践者はいずれかにしなければならない (a)非定型データの収集 b) 臨床試験を実施すること,又は (c) ATEを識別できるかもしれない因果グラフのさらなる性質を解明する。本稿では、ateを推定する際に、小型の非共役観測データセット(共同設立者不明)とともに、巨大共役観測データセット(共同設立者不明)を組み込むことの利点を考察する。理論的には, 待ち行列を所望の精度で推定するために必要なデコンストラクタデータの量を大幅に削減できる可能性が示唆された。さらに、遺伝学など一部のケースでは、再考してサンプルを分解する例も考えられる。既に観察されている)治療と結果に基づいて,これらの試料を積極的に選択することにより,試料の複雑さをさらに軽減できることを示す。我々の理論的および実証的な結果は、我々のアプローチの最悪の相対的な性能(例えば、自然ベンチマーク)が有界であり、ベストケースの利得は非有界であることを示す。最後に, がんの遺伝子変異に関連する大規模な実世界データセットを用いて, 選択的解凍の利点を実証する。

Given only data generated by a standard confounding graph with unobserved confounder, the Average Treatment Effect (ATE) is not identifiable. To estimate the ATE, a practitioner must then either (a) collect deconfounded data;(b) run a clinical trial; or (c) elucidate further properties of the causal graph that might render the ATE identifiable. In this paper, we consider the benefit of incorporating a large confounded observational dataset (confounder unobserved) alongside a small deconfounded observational dataset (confounder revealed) when estimating the ATE. Our theoretical results suggest that the inclusion of confounded data can significantly reduce the quantity of deconfounded data required to estimate the ATE to within a desired accuracy level. Moreover, in some cases -- say, genetics -- we could imagine retrospectively selecting samples to deconfound. We demonstrate that by actively selecting these samples based upon the (already observed) treatment and outcome, we can reduce sample complexity further. Our theoretical and empirical results establish that the worst-case relative performance of our approach (vs. a natural benchmark) is bounded while our best-case gains are unbounded. Finally, we demonstrate the benefits of selective deconfounding using a large real-world dataset related to genetic mutation in cancer.

翻訳日:2022-12-28 21:20:57 公開日:2021-03-07

# AraBERT:アラビア語理解のためのトランスフォーマーベースモデル

AraBERT: Transformer-based Model for Arabic Language Understanding ( http://arxiv.org/abs/2003.00104v4 )

ライセンス: Link先を確認

Wissam Antoun, Fady Baly, Hazem Hajj

(参考訳) アラビア語は形態学的に豊かな言語であり、英語に比べて比較的資源が少なく、文法も乏しい。これらの制限から、感性分析(SA)、名前付きエンティティ認識(NER)、質問回答(QA)といったアラビア自然言語処理(NLP)タスクは、対処が非常に難しいことが証明されている。近年,トランスフォーマーベースモデルの増加に伴い,言語固有のBERTベースモデルは,非常に大きなコーパスで事前学習されているため,言語理解において非常に効率的であることが証明されている。これらのモデルは新しい標準を設定し、ほとんどのNLPタスクに対して最先端の結果を得ることができた。本稿では、BERTが英語で行ったのと同じ成功を追求するため、アラビア語に特化してBERTを事前訓練した。 AraBERTのパフォーマンスは、Googleや他の最先端アプローチの多言語BERTと比較される。その結果, AraBERTはアラビアのほとんどのNLPタスクで最先端の性能を達成できた。事前訓練されたアラバートモデルは https://github.com/aub-mind/arabert で公開されている。

The Arabic language is a morphologically rich language with relatively few resources and a less explored syntax compared to English. Given these limitations, Arabic Natural Language Processing (NLP) tasks like Sentiment Analysis (SA), Named Entity Recognition (NER), and Question Answering (QA), have proven to be very challenging to tackle. Recently, with the surge of transformers based models, language-specific BERT based models have proven to be very efficient at language understanding, provided they are pre-trained on a very large corpus. Such models were able to set new standards and achieve state-of-the-art results for most NLP tasks. In this paper, we pre-trained BERT specifically for the Arabic language in the pursuit of achieving the same success that BERT did for the English language. The performance of AraBERT is compared to multilingual BERT from Google and other state-of-the-art approaches. The results showed that the newly developed AraBERT achieved state-of-the-art performance on most tested Arabic NLP tasks. The pretrained araBERT models are publicly available on https://github.com/aub-mind/arabert hoping to encourage research and applications for Arabic NLP.

翻訳日:2022-12-28 02:24:23 公開日:2021-03-07

# Image Augmentation:Pixelの深部強化学習を定期的に行う

Image Augmentation Is All You Need: Regularizing Deep Reinforcement Learning from Pixels ( http://arxiv.org/abs/2004.13649v4 )

ライセンス: Link先を確認

Ilya Kostrikov, Denis Yarats, Rob Fergus

(参考訳) 本研究では,標準モデルフリー強化学習アルゴリズムに適用可能な簡易データ拡張手法を提案し,補助的損失や事前学習を必要とせず,画素から直接堅牢な学習を可能にする。この手法は、コンピュータビジョンタスクでよく使われる入力摂動を利用して値関数を正規化する。 SAC(Soft Actor-Critic)のような既存のモデルレスアプローチでは、画像ピクセルからディープネットワークを効果的に訓練することはできない。しかし,拡張手法の追加により,SACの性能が劇的に向上し,DeepMindコントロールスイートの最先端性能に到達し,モデルベース(Dreamer, PlaNet, SLAC)メソッドを超越し,最近提案されたコントラスト学習(CURL)が実現した。我々のアプローチはモデルフリーの強化学習アルゴリズムと組み合わせることができ、わずかな修正しか必要としない。実装はhttps://sites.google.com/view/data-regularized-qで確認できる。

We propose a simple data augmentation technique that can be applied to standard model-free reinforcement learning algorithms, enabling robust learning directly from pixels without the need for auxiliary losses or pre-training. The approach leverages input perturbations commonly used in computer vision tasks to regularize the value function. Existing model-free approaches, such as Soft Actor-Critic (SAC), are not able to train deep networks effectively from image pixels. However, the addition of our augmentation method dramatically improves SAC's performance, enabling it to reach state-of-the-art performance on the DeepMind control suite, surpassing model-based (Dreamer, PlaNet, and SLAC) methods and recently proposed contrastive learning (CURL). Our approach can be combined with any model-free reinforcement learning algorithm, requiring only minor modifications. An implementation can be found at https://sites.google.com/view/data-regularized-q.

翻訳日:2022-12-08 21:58:14 公開日:2021-03-07

# Probing Paradigm: Probingの正確性にはタスク関連性があるか?

Probing the Probing Paradigm: Does Probing Accuracy Entail Task Relevance? ( http://arxiv.org/abs/2005.00719v3 )

ライセンス: Link先を確認

Abhilasha Ravichander, Yonatan Belinkov, Eduard Hovy

(参考訳) ニューラルモデルはいくつかのNLPベンチマークで印象的な結果を得たが、言語タスクの実行に使用するメカニズムについてはほとんど理解されていない。このように、近年の注目は、ニューラルエンコーダが'プローブ'タスクのレンズを通して学んだ文表現の分析に向けられている。しかし、プローブによって発見された文表現にエンコードされた情報は、実際にモデルがタスクを実行するために実際にどの程度使われたのか? 本研究では,自然言語推論のケーススタディを通じて,モデルが訓練されたタスクに必要とされない場合でも,モデルが言語特性をエンコードすることを学ぶことができることを示す。さらに,事前学習された単語埋め込みが,学習タスク自体よりもこれらの特性をエンコーディングする上で重要な役割を担っていることも確認し,探索実験の設計における注意深い制御の重要性を強調した。最後に、制御された合成タスクのセットを通じて、ランダムノイズとしてデータに分散しても、モデルがこれらの特性をかなり高い確率レベルに符号化できることを示し、探索タスクにおける絶対的なクレームの解釈を疑問視する。

Although neural models have achieved impressive results on several NLP benchmarks, little is understood about the mechanisms they use to perform language tasks. Thus, much recent attention has been devoted to analyzing the sentence representations learned by neural encoders, through the lens of `probing' tasks. However, to what extent was the information encoded in sentence representations, as discovered through a probe, actually used by the model to perform its task? In this work, we examine this probing paradigm through a case study in Natural Language Inference, showing that models can learn to encode linguistic properties even if they are not needed for the task on which the model was trained. We further identify that pretrained word embeddings play a considerable role in encoding these properties rather than the training task itself, highlighting the importance of careful controls when designing probing experiments. Finally, through a set of controlled synthetic tasks, we demonstrate models can encode these properties considerably above chance-level even when distributed in the data as random noise, calling into question the interpretation of absolute claims on probing tasks.

翻訳日:2022-12-07 12:25:01 公開日:2021-03-07

# RISEプロジェクト:産業煙発生の認識

Project RISE: Recognizing Industrial Smoke Emissions ( http://arxiv.org/abs/2005.06111v8 )

ライセンス: Link先を確認

Yen-Chia Hsu, Ting-Hao 'Kenneth' Huang, Ting-Yao Hu, Paul Dille, Sean Prendi, Ryan Hoffman, Anastasia Tsuhlares, Jessica Pachuta, Randy Sargent, Illah Nourbakhsh

(参考訳) 産業用煙の排出は人間の健康に重大な影響を及ぼす。以前の研究では、煙を視覚的証拠として識別するコンピュータビジョン(CV)技術が規制当局の態度に影響を与え、市民に環境正義を追求する権限を与えることが示されている。しかし、既存のデータセットは、空気質の擁護を支援するために必要な堅牢なCVモデルをトレーニングするのに十分な品質や量ではない。産業煙発生の認識のための,最初の大規模ビデオデータセットRISEを紹介する。市民科学的なアプローチを採用し,地域コミュニティのメンバとコラボレーションして,ビデオクリップが煙を排出するかどうかを注釈する。当社のデータセットには、3つの産業施設を監視するカメラから19の異なるビューから12,567のクリップが含まれています。これらの昼間のクリップは、全4シーズンを含む2年間で30日間に及ぶ。ディープニューラルネットワークを用いて、強力なパフォーマンスベースラインを確立し、喫煙認識の課題を明らかにする実験を行った。調査はコミュニティからのフィードバックを議論し,データ分析の結果,市民科学者とクラウドワーカーを社会影響への人工知能の適用に統合する機会が示された。

Industrial smoke emissions pose a significant concern to human health. Prior works have shown that using Computer Vision (CV) techniques to identify smoke as visual evidence can influence the attitude of regulators and empower citizens to pursue environmental justice. However, existing datasets are not of sufficient quality nor quantity to train the robust CV models needed to support air quality advocacy. We introduce RISE, the first large-scale video dataset for Recognizing Industrial Smoke Emissions. We adopted a citizen science approach to collaborate with local community members to annotate whether a video clip has smoke emissions. Our dataset contains 12,567 clips from 19 distinct views from cameras that monitored three industrial facilities. These daytime clips span 30 days over two years, including all four seasons. We ran experiments using deep neural networks to establish a strong performance baseline and reveal smoke recognition challenges. Our survey study discussed community feedback, and our data analysis displayed opportunities for integrating citizen scientists and crowd workers into the application of Artificial Intelligence for Social Impact.

翻訳日:2022-12-03 13:17:43 公開日:2021-03-07

# 有限和最適化のための加速デュアル平均化による分散低減

Variance Reduction via Accelerated Dual Averaging for Finite-Sum Optimization ( http://arxiv.org/abs/2006.10281v4 )

ライセンス: Link先を確認

Chaobing Song, Yong Jiang and Yi Ma

(参考訳) 本稿では,VRADA(Accelerated Dual Averaging)と呼ばれる有限サム凸最適化の簡易かつ統一的な手法を提案する。一般的な凸設定と強い凸設定の両方において、vradaは$o(n\log\log n)$で$o\big(\frac{1}{n}\big)$-accurateの解を得ることができ、最もよく知られた結果である$o(n\log n)$がサンプル数である。一方、vrada は $\log\log n$ まで設定された一般凸の下限と一致し、両方のレジームにおける下限である $n\le \theta(\kappa)$ と $n\gg \kappa$ に一致し、ここで $\kappa$ は条件数を表す。最もよく知られた結果の改善と上記のすべての下位境界の同時マッチングに加えて、VRADAは一般的な凸と強い凸設定の両方に対してより統一的で単純化されたアルゴリズムの実装と収束解析を行う。 VRADAにおける新しい初期化戦略のような新しいアプローチは、独立した関心を持つかもしれない。実データセットの実験を通じて、大規模機械学習問題に対する既存の手法よりも優れたVRADA性能を示す。

In this paper, we introduce a simplified and unified method for finite-sum convex optimization, named \emph{Variance Reduction via Accelerated Dual Averaging (VRADA)}. In both general convex and strongly convex settings, VRADA can attain an $O\big(\frac{1}{n}\big)$-accurate solution in $O(n\log\log n)$ number of stochastic gradient evaluations which improves the best-known result $O(n\log n)$, where $n$ is the number of samples. Meanwhile, VRADA matches the lower bound of the general convex setting up to a $\log\log n$ factor and matches the lower bounds in both regimes $n\le \Theta(\kappa)$ and $n\gg \kappa$ of the strongly convex setting, where $\kappa$ denotes the condition number. Besides improving the best-known results and matching all the above lower bounds simultaneously, VRADA has more unified and simplified algorithmic implementation and convergence analysis for both the general convex and strongly convex settings. The underlying novel approaches such as the novel initialization strategy in VRADA may be of independent interest. Through experiments on real datasets, we show the good performance of VRADA over existing methods for large-scale machine learning problems.

翻訳日:2022-11-19 14:16:20 公開日:2021-03-07

# erdos goes neural:グラフの組合せ最適化のための教師なし学習フレームワーク

Erdos Goes Neural: an Unsupervised Learning Framework for Combinatorial Optimization on Graphs ( http://arxiv.org/abs/2006.10643v4 )

ライセンス: Link先を確認

Nikolaos Karalias, Andreas Loukas

(参考訳) 組合せ最適化問題は、特にラベル付きインスタンスの欠如において、ニューラルネットワークにとって非常に難しい。本研究は, グラフ上のCO問題に対する教師なし学習フレームワークを提案する。 erdosの確率的手法に触発され、ニューラルネットワークを用いて集合上の確率分布をパラメトリゼーションする。ネットワークが好適に選択された損失に最適化された場合、学習された分布は、制御された確率で、組合せ問題の制約に従う低コストな積分解を含む。確率論的存在証明は、望ましい解をデコードするためにデランディマイズされる。本稿では,最大傾き問題に対する有効な解と局所グラフクラスタリングを実現するために,本手法の有効性を示す。本手法は,実データと合成ハードインスタンスの双方で競合する結果を得る。

Combinatorial optimization problems are notoriously challenging for neural networks, especially in the absence of labeled instances. This work proposes an unsupervised learning framework for CO problems on graphs that can provide integral solutions of certified quality. Inspired by Erdos' probabilistic method, we use a neural network to parametrize a probability distribution over sets. Crucially, we show that when the network is optimized w.r.t. a suitably chosen loss, the learned distribution contains, with controlled probability, a low-cost integral solution that obeys the constraints of the combinatorial problem. The probabilistic proof of existence is then derandomized to decode the desired solutions. We demonstrate the efficacy of this approach to obtain valid solutions to the maximum clique problem and to perform local graph clustering. Our method achieves competitive results on both real datasets and synthetic hard instances.

翻訳日:2022-11-19 12:57:40 公開日:2021-03-07

# 限定モデル容量下における選択型ダイナスタイル計画

Selective Dyna-style Planning Under Limited Model Capacity ( http://arxiv.org/abs/2007.02418v3 )

ライセンス: Link先を確認

Zaheer Abbas, Samuel Sokota, Erin J. Talvitie, Martha White

(参考訳) モデルベースの強化学習では、不完全な環境モデルによる計画が学習の進捗を損なう可能性がある。しかし、モデルが不完全である場合でも、計画に有用な情報を含む可能性がある。本稿では,不完全モデルの使用を選択的に検討する。エージェントは、モデルが役に立つが、それが有害なモデルの使用を控える状態空間の一部で計画すべきである。効果的な選択的計画機構は、有理不確実性、パラメータ不確実性、およびモデル不確実性から生じる予測不確実性の推定を必要とする。事前の作業は、選択計画のパラメータの不確実性に重点を置いてきた。本研究では,モデル不足の重要性を強調する。パラメータ不確実性を考慮した手法によって検出されるモデル不確かさと相補的なモデル不確かさから生じる予測的不確かさが,パラメータ不確かさとモデル不確かさの両方を考慮すれば,より有望な選択的計画の方向になる可能性が示唆された。

In model-based reinforcement learning, planning with an imperfect model of the environment has the potential to harm learning progress. But even when a model is imperfect, it may still contain information that is useful for planning. In this paper, we investigate the idea of using an imperfect model selectively. The agent should plan in parts of the state space where the model would be helpful but refrain from using the model where it would be harmful. An effective selective planning mechanism requires estimating predictive uncertainty, which arises out of aleatoric uncertainty, parameter uncertainty, and model inadequacy, among other sources. Prior work has focused on parameter uncertainty for selective planning. In this work, we emphasize the importance of model inadequacy. We show that heteroscedastic regression can signal predictive uncertainty arising from model inadequacy that is complementary to that which is detected by methods designed for parameter uncertainty, indicating that considering both parameter uncertainty and model inadequacy may be a more promising direction for effective selective planning than either in isolation.

翻訳日:2022-11-13 07:46:25 公開日:2021-03-07

# ECOCを用いた深部畳み込みニューラルネットワーク

Deep Convolutional Neural Network Ensembles using ECOC ( http://arxiv.org/abs/2009.02961v2 )

ライセンス: Link先を確認

Sara Atito Ali Ahmed, Cemre Zor, Berrin Yanikoglu, Muhammad Awais, Josef Kittler

(参考訳) ディープニューラルネットワークは、画像理解を含む多くのアプリケーションにおいて意思決定システムの性能を高め、アンサンブルを構築することでさらなる成果を得ることができる。しかし、ネットワークのトレーニングに要する時間が非常に高く、あるいは得られる性能がそれほど重要でないため、ディープネットワークのアンサンブルを設計することは、しばしば有益ではない。本稿では,深層ネットワークのアンサンブル手法として使用する誤り訂正出力符号化(ecoc)フレームワークを分析し,精度・複雑さトレードオフに対処するための設計戦略を提案する。導入したECOC設計とアンサンブル平均化や勾配向上決定木などの最先端のアンサンブル技術との広範な比較研究を行う。さらに,最も高い分類性能を達成できるコンビネータ技術を提案する。

Deep neural networks have enhanced the performance of decision making systems in many applications including image understanding, and further gains can be achieved by constructing ensembles. However, designing an ensemble of deep networks is often not very beneficial since the time needed to train the networks is very high or the performance gain obtained is not very significant. In this paper, we analyse error correcting output coding (ECOC) framework to be used as an ensemble technique for deep networks and propose different design strategies to address the accuracy-complexity trade-off. We carry out an extensive comparative study between the introduced ECOC designs and the state-of-the-art ensemble techniques such as ensemble averaging and gradient boosting decision trees. Furthermore, we propose a combinatory technique which is shown to achieve the highest classification performance amongst all.

翻訳日:2022-10-21 02:20:18 公開日:2021-03-07

# 制約付き最小方形の準最適性と非線形予測器による改善

Suboptimality of Constrained Least Squares and Improvements via Non-Linear Predictors ( http://arxiv.org/abs/2009.09304v2 )

ライセンス: Link先を確認

Tomas Va\v{s}kevi\v{c}ius and Nikita Zhivotovskiy

(参考訳) 本研究では, 2乗損失に関して有界ユークリッド球における最適線形予測器と同様に予測の問題について検討する。データ生成分布の有界性のみを仮定すると、有界ユークリッド球に制約された最小二乗推定器は古典的$o(d/n)$ の過剰なリスク率に達しない、ただし、$d$ は共変数の次元であり、$n$ はサンプルの数である。特に、制約付き最小二乗推定器が$\omega(d^{3/2}/n)$の過剰なリスクを負うような有界分布を構成するので、ohad shamir [jmlr 2015] の最近の予想を反論する。対照的に、非線形予測器は共変量の分布を仮定せずに最適なレートであるo(d/n)$を達成することができる。最小二乗推定器に対する$O(d/n)$過剰リスク率を保証するのに十分な分布仮定について論じる。それらのなかには、ロバストな統計文献でよく使われるあるモーメント同値仮定がある。このような仮定は、境界のない設定と重み付き設定の分析の中心であるが、我々の研究は、いくつかのケースでは、不利な境界付き分布も除外していることを示している。

We study the problem of predicting as well as the best linear predictor in a bounded Euclidean ball with respect to the squared loss. When only boundedness of the data generating distribution is assumed, we establish that the least squares estimator constrained to a bounded Euclidean ball does not attain the classical $O(d/n)$ excess risk rate, where $d$ is the dimension of the covariates and $n$ is the number of samples. In particular, we construct a bounded distribution such that the constrained least squares estimator incurs an excess risk of order $\Omega(d^{3/2}/n)$ hence refuting a recent conjecture of Ohad Shamir [JMLR 2015]. In contrast, we observe that non-linear predictors can achieve the optimal rate $O(d/n)$ with no assumptions on the distribution of the covariates. We discuss additional distributional assumptions sufficient to guarantee an $O(d/n)$ excess risk rate for the least squares estimator. Among them are certain moment equivalence assumptions often used in the robust statistics literature. While such assumptions are central in the analysis of unbounded and heavy-tailed settings, our work indicates that in some cases, they also rule out unfavorable bounded distributions.

翻訳日:2022-10-16 21:12:17 公開日:2021-03-07

# BFloat16トレーニングの見直し

Revisiting BFloat16 Training ( http://arxiv.org/abs/2010.06192v2 )

ライセンス: Link先を確認

Pedram Zamirai, Jian Zhang, Christopher R. Aberger, Christopher De Sa

(参考訳) 最先端の汎用的低精度トレーニングアルゴリズムは16ビットと32ビットの精度を混合し、16ビットのハードウェア演算ユニットだけではモデルの精度を最大化できないという伝承を生み出した。その結果、深層学習アクセラレータは16ビット浮動小数点ユニット(FPU)と32ビット浮動小数点ユニット(FPU)の両方をサポートせざるを得なくなった。私たちは、深層学習モデルを16ビット浮動小数点ユニットでのみトレーニングできますが、32ビットのトレーニングで得られたモデルの精度は相変わらず一致しますか? そこで我々は,広く採用されているBFloat16ユニットの16ビットFPUトレーニングについて検討した。これらのユニットは従来16ビットの精度で出力を出力するために最も近い丸めを用いるが、モデルウェイト更新の最も近い丸めは、しばしば小さな更新をキャンセルし、収束とモデルの精度を低下させる。そこで本研究では,16ビットFPUトレーニングにおけるモデル精度劣化の軽減を目的とした,数値解析,確率的ラウンドリング,カハン和の2つの簡単な手法について検討した。この2つの手法により、16ビットfpuトレーニングで最大7%の絶対検証精度が得られることを示す。これにより、7つのディープラーニングアプリケーションにわたる32ビットトレーニングと比較して、0.1%から0.2%の検証精度が向上する。

State-of-the-art generic low-precision training algorithms use a mix of 16-bit and 32-bit precision, creating the folklore that 16-bit hardware compute units alone are not enough to maximize model accuracy. As a result, deep learning accelerators are forced to support both 16-bit and 32-bit floating-point units (FPUs), which is more costly than only using 16-bit FPUs for hardware design. We ask: can we train deep learning models only with 16-bit floating-point units, while still matching the model accuracy attained by 32-bit training? Towards this end, we study 16-bit-FPU training on the widely adopted BFloat16 unit. While these units conventionally use nearest rounding to cast output to 16-bit precision, we show that nearest rounding for model weight updates often cancels small updates, which degrades the convergence and model accuracy. Motivated by this, we study two simple techniques well-established in numerical analysis, stochastic rounding and Kahan summation, to remedy the model accuracy degradation in 16-bit-FPU training. We demonstrate that these two techniques can enable up to 7% absolute validation accuracy gain in 16-bit-FPU training. This leads to 0.1% lower to 0.2% higher validation accuracy compared to 32-bit training across seven deep learning applications.

翻訳日:2022-10-07 22:53:06 公開日:2021-03-07

# マルチエージェントlqrのデコンポーザビリティと並列計算

Decomposability and Parallel Computation of Multi-Agent LQR ( http://arxiv.org/abs/2010.08615v2 )

ライセンス: Link先を確認

Gangshan Jing, He Bai, Jemin George, Aranya Chakrabortty

(参考訳) マルチエージェントシステム(mas)内の個々のエージェントは、オープンループダイナミクスを分離するかもしれないが、協調制御の目的は通常、結合したクローズドループダイナミクスをもたらすので、制御設計は計算コストがかかる。エージェントのダイナミクスが分かっていない状況に対処するために強化学習(rl)のような学習戦略を適用する必要がある場合、計算時間がさらに高くなる。この問題を解決するために、連続時間線形MASにおける線形二次レギュレータ(LQR)設計のための並列RLスキームを提案する。この考え方は、LQRの目的に、$Q$と$R$の重み付け行列に埋め込まれた2つのグラフの構造特性を利用して、元のLQR設計を複数の分離された小さなLQR設計に変換する直交変換を定義することである。我々は、MAS が均質であれば、この分解は閉ループ最適性を保持することを示す。非均質なmasに適用した場合の分解性条件、変換行列を構成するアルゴリズム、並列rlアルゴリズム、ロバスト性解析について述べる。シミュレーションにより,本手法はlqrコストの累積値を失うことなく,学習の大幅な高速化を保証できることが示された。

Individual agents in a multi-agent system (MAS) may have decoupled open-loop dynamics, but a cooperative control objective usually results in coupled closed-loop dynamics thereby making the control design computationally expensive. The computation time becomes even higher when a learning strategy such as reinforcement learning (RL) needs to be applied to deal with the situation when the agents dynamics are not known. To resolve this problem, we propose a parallel RL scheme for a linear quadratic regulator (LQR) design in a continuous-time linear MAS. The idea is to exploit the structural properties of two graphs embedded in the $Q$ and $R$ weighting matrices in the LQR objective to define an orthogonal transformation that can convert the original LQR design to multiple decoupled smaller-sized LQR designs. We show that if the MAS is homogeneous then this decomposition retains closed-loop optimality. Conditions for decomposability, an algorithm for constructing the transformation matrix, a parallel RL algorithm, and robustness analysis when the design is applied to non-homogeneous MAS are presented. Simulations show that the proposed approach can guarantee significant speed-up in learning without any loss in the cumulative value of the LQR cost.

翻訳日:2022-10-06 21:48:37 公開日:2021-03-07

# ポータブルx線装置を用いたcovid-19患者肺分画の多段階トランスファー学習

Multi-stage transfer learning for lung segmentation using portable X-ray devices for patients with COVID-19 ( http://arxiv.org/abs/2011.00133v2 )

ライセンス: Link先を確認

Pl\'acido L Vidal, Joaquim de Moura, Jorge Novo, Marcos Ortega

(参考訳) 衛生上の緊急時の大きな課題の1つは、新規性、ケースの複雑さ、そしてその実装の緊急性により、利用可能なサンプル数が少ないコンピュータ支援診断システムを迅速に開発することである。新型コロナウイルス(COVID-19)のパンデミックの背景にある。この病原体は、主に合併症の呼吸器系に感染し、肺炎と急性呼吸窮迫症候群の重篤な症例を引き起こす。これにより、胸部x線を用いて検出できる肺の異なる病理構造が形成される。医療サービスの過負荷により、パンデミックの間は携帯型X線デバイスが推奨され、病気の拡散を防いでいる。しかし、これらの装置は、臨床医の主観性とともに診断過程をより困難にし、利用可能なサンプルの不足にもかかわらずコンピュータ支援診断手法の必要性を示唆する、様々な合併症(キャプチャー品質など)を伴っている。そこで本研究では,サンプル数の多いよく知られたドメインからの知識を,比較的少ない数でより複雑な新しいドメインに適応させる手法を提案する。非関連病理の脳磁気共鳴画像から事前訓練した分画モデルを利用し, 2段階の知識伝達を行い, 試料の不足と品質の低下にもかかわらず, 携帯型x線装置から肺領域を分画できる頑健なシステムを得た。この方法では、covid-19患者に$0.9761 \pm 0.0100$、正常患者に$0.9801 \pm 0.0104$、covid-19(肺炎など)と似ているが本物のcovid-19ではない肺疾患患者に$0.9769 \pm 0.0111$という満足な精度を得た。

One of the main challenges in times of sanitary emergency is to quickly develop computer aided diagnosis systems with a limited number of available samples due to the novelty, complexity of the case and the urgency of its implementation. This is the case during the current pandemic of COVID-19. This pathogen primarily infects the respiratory system of the afflicted, resulting in pneumonia and in a severe case of acute respiratory distress syndrome. This results in the formation of different pathological structures in the lungs that can be detected by the use of chest X-rays. Due to the overload of the health services, portable X-ray devices are recommended during the pandemic, preventing the spread of the disease. However, these devices entail different complications (such as capture quality) that, together with the subjectivity of the clinician, make the diagnostic process more difficult and suggest the necessity for computer-aided diagnosis methodologies despite the scarcity of samples available to do so. To solve this problem, we propose a methodology that allows to adapt the knowledge from a well-known domain with a high number of samples to a new domain with a significantly reduced number and greater complexity. We took advantage of a pre-trained segmentation model from brain magnetic resonance imaging of a unrelated pathology and performed two stages of knowledge transfer to obtain a robust system able to segment lung regions from portable X-ray devices despite the scarcity of samples and lesser quality. This way, our methodology obtained a satisfactory accuracy of $0.9761 \pm 0.0100$ for patients with COVID-19, $0.9801 \pm 0.0104$ for normal patients and $0.9769 \pm 0.0111$ for patients with pulmonary diseases with similar characteristics as COVID-19 (such as pneumonia) but not genuine COVID-19.

翻訳日:2022-10-01 17:20:23 公開日:2021-03-07

# 協調物体定位のためのモデルベース推定とグラフ学習の統合による多視点センサ融合

Multi-view Sensor Fusion by Integrating Model-based Estimation and Graph Learning for Collaborative Object Localization ( http://arxiv.org/abs/2011.07704v2 )

ライセンス: Link先を確認

Peng Gao, Rui Guo, Hongsheng Lu and Hao Zhang

(参考訳) コラボレーティブなオブジェクトローカライゼーションは、複数の視点や視点から観察されたオブジェクトの位置を協調的に推定することを目的としている。協調的なローカライゼーションを実現するために,複数のモデルに基づく状態推定と学習に基づくローカライゼーション手法を開発した。モデルに基づく状態推定には、しばしば複数のオブジェクト間の複雑な関係をモデル化する能力が欠けている。本稿では,グラフ学習とモデルに基づく推定を統合し,協調的物体定位のための多視点センサ融合を行う,時空間グラフフィルタ手法を提案する。提案手法は,新しい時空間グラフ表現を用いて複雑なオブジェクト関係をモデル化し,不確実性の下での位置推定を改善するためにベイズ方式で多視点観測を融合する。我々は、コネクテッド・自律運転と複数の歩行者位置決めの応用におけるアプローチを評価する。実験の結果,提案手法は従来の手法よりも優れており,コラボレーションのローカライゼーションにおける最先端のパフォーマンスを達成していることがわかった。

Collaborative object localization aims to collaboratively estimate locations of objects observed from multiple views or perspectives, which is a critical ability for multi-agent systems such as connected vehicles. To enable collaborative localization, several model-based state estimation and learning-based localization methods have been developed. Given their encouraging performance, model-based state estimation often lacks the ability to model the complex relationships among multiple objects, while learning-based methods are typically not able to fuse the observations from an arbitrary number of views and cannot well model uncertainty. In this paper, we introduce a novel spatiotemporal graph filter approach that integrates graph learning and model-based estimation to perform multi-view sensor fusion for collaborative object localization. Our approach models complex object relationships using a new spatiotemporal graph representation and fuses multi-view observations in a Bayesian fashion to improve location estimation under uncertainty. We evaluate our approach in the applications of connected autonomous driving and multiple pedestrian localization. Experimental results show that our approach outperforms previous techniques and achieves the state-of-the-art performance on collaboration localization.

翻訳日:2022-09-25 01:01:05 公開日:2021-03-07

# 連続遷移:ミックスアップによる連続制御問題に対するサンプル効率の改善

Continuous Transition: Improving Sample Efficiency for Continuous Control Problems via MixUp ( http://arxiv.org/abs/2011.14487v2 )

ライセンス: Link先を確認

Junfan Lin, Zhongzhan Huang, Keze Wang, Xiaodan Liang, Weiwei Chen, and Liang Lin

(参考訳) 深部強化学習(RL)は様々なロボット制御タスクにうまく適用されているが、サンプル効率の低さから現実のタスクに応用することは依然として困難である。この欠点を克服しようと、いくつかの研究は、訓練中に収集された軌跡データを政策に関係のない離散的な遷移に分解することで再利用することに焦点を当てた。しかし、その改善は (i) 遷移の量は通常小さく、 (i) 値の割り当ては結合状態でのみ発生するため、多少限界がある。これらの問題に対処するため,本論文では,経路に沿った潜在的な遷移を利用して軌道情報を利用する連続遷移を構築するための簡潔かつ強力な手法を提案する。具体的には,連続的な遷移を線形補間することにより,学習のための新しい遷移を合成する。構築された遷移を本物に保つために、我々は、自動的に構築プロセスを導く識別器も開発する。提案手法は, MuJoCo の複雑な連続ロボット制御問題に対して, サンプル効率を大幅に向上し, モデルベース/モデルフリー RL 法より優れていることを示す。ソースコードは利用可能である。

Although deep reinforcement learning (RL) has been successfully applied to a variety of robotic control tasks, it's still challenging to apply it to real-world tasks, due to the poor sample efficiency. Attempting to overcome this shortcoming, several works focus on reusing the collected trajectory data during the training by decomposing them into a set of policy-irrelevant discrete transitions. However, their improvements are somewhat marginal since i) the amount of the transitions is usually small, and ii) the value assignment only happens in the joint states. To address these issues, this paper introduces a concise yet powerful method to construct Continuous Transition, which exploits the trajectory information by exploiting the potential transitions along the trajectory. Specifically, we propose to synthesize new transitions for training by linearly interpolating the consecutive transitions. To keep the constructed transitions authentic, we also develop a discriminator to guide the construction process automatically. Extensive experiments demonstrate that our proposed method achieves a significant improvement in sample efficiency on various complex continuous robotic control problems in MuJoCo and outperforms the advanced model-based / model-free RL methods. The source code is available.

翻訳日:2021-06-06 14:39:15 公開日:2021-03-07

# オブジェクト認識のための解釈可能なグラフカプセルネットワーク

Interpretable Graph Capsule Networks for Object Recognition ( http://arxiv.org/abs/2012.01674v3 )

ライセンス: Link先を確認

Jindong Gu and Volker Tresp

(参考訳) Convolutional Neural Networksに代わるCapsule Networksは、画像からオブジェクトを認識するために提案されている。現在の文献は、CNNに対するCapsNetsの多くの利点を示している。しかし,capsnetsの個別分類についての説明は十分に検討されていない。主にcnnに基づく分類を説明するために広く用いられ、活性化値と対応する勾配(例えばgrad-cam)を組み合わせることで塩分マップを説明する。これらのsaliencyメソッドは、下位の分類器の特定のアーキテクチャを必要とし、繰り返しルーティング機構のため、capsnetsに自明に適用できない。解釈可能性の欠如を克服するために、CapsNetsの新しいポストホック解釈手法を提案するか、ビルトインの説明を持つようにモデルを変更できる。本研究では後者について検討する。具体的には,多面的注意に基づくグラフプーリングアプローチでルーティング部を置き換える,解釈可能なグラフカプセルネットワーク(gracapsnets)を提案する。提案モデルでは,個々の分類説明を効果的かつ効率的に作成することができる。当社のモデルは,CapsNetsの基本部分を置き換えたとしても,予期せぬメリットも示しています。 gracapsnetsは、capsnetsと比較して、パラメータが少なく、敵のロバスト性も向上しています。さらに、gracapsnetsはcapsnetsの他の利点、すなわち不等角表現とアフィン変換のロバスト性も持っている。

Capsule Networks, as alternatives to Convolutional Neural Networks, have been proposed to recognize objects from images. The current literature demonstrates many advantages of CapsNets over CNNs. However, how to create explanations for individual classifications of CapsNets has not been well explored. The widely used saliency methods are mainly proposed for explaining CNN-based classifications; they create saliency map explanations by combining activation values and the corresponding gradients, e.g., Grad-CAM. These saliency methods require a specific architecture of the underlying classifiers and cannot be trivially applied to CapsNets due to the iterative routing mechanism therein. To overcome the lack of interpretability, we can either propose new post-hoc interpretation methods for CapsNets or modifying the model to have build-in explanations. In this work, we explore the latter. Specifically, we propose interpretable Graph Capsule Networks (GraCapsNets), where we replace the routing part with a multi-head attention-based Graph Pooling approach. In the proposed model, individual classification explanations can be created effectively and efficiently. Our model also demonstrates some unexpected benefits, even though it replaces the fundamental part of CapsNets. Our GraCapsNets achieve better classification performance with fewer parameters and better adversarial robustness, when compared to CapsNets. Besides, GraCapsNets also keep other advantages of CapsNets, namely, disentangled representations and affine transformation robustness.

翻訳日:2021-05-23 14:57:28 公開日:2021-03-07

# 学習画像圧縮の従来型コーデックへの転送可能性の活用法

How to Exploit the Transferability of Learned Image Compression to Conventional Codecs ( http://arxiv.org/abs/2012.01874v2 )

ライセンス: Link先を確認

Jan P. Klopp, Keng-Chi Liu, Liang-Gee Chen, Shao-Yi Chien

(参考訳) 損失画像圧縮は、選択された損失測度の単純さによってしばしば制限される。近年の研究では、生成的敵ネットワークは、この制限を克服し、特にテクスチャにおいてマルチモーダル損失として機能する能力を持っていることが示唆されている。学習した画像圧縮とともに、この2つのテクニックは、一般的に使われる歪みの厳密な尺度を緩和する際に大きな効果を発揮する。しかし、畳み込みニューラルネットワークに基づくアルゴリズムは計算フットプリントが大きい。理想的には、既存のコーデックはそのままであり、より高速な採用とバランスの取れた計算エンベロープへの付着を保証する。本研究は,この目標への道筋として,学習した画像の符号化を代用して,画像の符号化を最適化する手法を提案する。画像は学習したフィルタによって変更され、異なるパフォーマンス指標や特定のタスクに最適化される。このアイデアを生成的敵ネットワークで拡張すると、テクスチャ全体がエンコードするコストが低く、詳細さを保っているものに置き換えられることを示す。提案手法は,従来のコーデックを改造して,デコードオーバーヘッドを必要とせず,20%以上のレート改善でms-ssim歪みを調整できる。タスク認識画像圧縮では、類似するがコーデック特有のアプローチに対して好適に実行する。

Lossy image compression is often limited by the simplicity of the chosen loss measure. Recent research suggests that generative adversarial networks have the ability to overcome this limitation and serve as a multi-modal loss, especially for textures. Together with learned image compression, these two techniques can be used to great effect when relaxing the commonly employed tight measures of distortion. However, convolutional neural network based algorithms have a large computational footprint. Ideally, an existing conventional codec should stay in place, which would ensure faster adoption and adhering to a balanced computational envelope. As a possible avenue to this goal, in this work, we propose and investigate how learned image coding can be used as a surrogate to optimize an image for encoding. The image is altered by a learned filter to optimise for a different performance measure or a particular task. Extending this idea with a generative adversarial network, we show how entire textures are replaced by ones that are less costly to encode but preserve sense of detail. Our approach can remodel a conventional codec to adjust for the MS-SSIM distortion with over 20% rate improvement without any decoding overhead. On task-aware image compression, we perform favourably against a similar but codec-specific approach.

翻訳日:2021-05-23 14:44:20 公開日:2021-03-07

# (参考訳) CTR予測における細粒度特徴学習のためのマルチインタラクティブ注意ネットワーク

Multi-Interactive Attention Network for Fine-grained Feature Learning in CTR Prediction ( http://arxiv.org/abs/2012.06968v2 )

ライセンス: CC BY 4.0

Kai Zhang, Hao Qian, Qing Cui, Qi Liu, Longfei Li, Jun Zhou, Jianhui Ma, Enhong Chen

(参考訳) クリックスルー率(ctr)予測シナリオでは、ユーザのシーケンシャルな動作を利用して、最近の文献に対するユーザの興味を捉えている。しかし、広く研究されているにもかかわらず、これらのシーケンシャルな方法には3つの制限がある。まず,CTRの予測に必ずしも適さないユーザの行動に注意を払っている。なぜなら,ユーザーは過去の行動とは無関係な新製品をクリックすることが多いからだ。第二に、現実のシナリオでは、昔から多くのユーザが存在しますが、近年では比較的アクティブではありません。したがって、初期の動作でユーザの現在の好みを正確に把握することは困難である。第3に、異なる特徴部分空間におけるユーザの歴史的行動の複数の表現は無視される。これらの問題を解消するために,様々なきめ細かい特徴(例えば,性別,年齢,職業など)の潜伏関係を包括的に抽出するMulti-Interactive Attention Network (MIAN)を提案する。具体的には、MIL(Multi-Interactive Layer)を3つのローカルなインタラクションモジュールに統合し、シーケンシャルな振る舞いを通じてユーザ好みの複数の表現をキャプチャし、きめ細かいユーザ固有の情報とコンテキスト情報を同時に利用する。さらに、高次相互作用を学習し、複数の特徴の異なる影響のバランスをとるために、Global Interaction Module (GIM) を設計する。最後に、Offline実験は、大規模レコメンデーションシステムにおけるオンラインA/Bテストとともに、3つのデータセットから行われ、提案手法の有効性を実証した。

In the Click-Through Rate (CTR) prediction scenario, user's sequential behaviors are well utilized to capture the user interest in the recent literature. However, despite being extensively studied, these sequential methods still suffer from three limitations. First, existing methods mostly utilize attention on the behavior of users, which is not always suitable for CTR prediction, because users often click on new products that are irrelevant to any historical behaviors. Second, in the real scenario, there exist numerous users that have operations a long time ago, but turn relatively inactive in recent times. Thus, it is hard to precisely capture user's current preferences through early behaviors. Third, multiple representations of user's historical behaviors in different feature subspaces are largely ignored. To remedy these issues, we propose a Multi-Interactive Attention Network (MIAN) to comprehensively extract the latent relationship among all kinds of fine-grained features (e.g., gender, age and occupation in user-profile). Specifically, MIAN contains a Multi-Interactive Layer (MIL) that integrates three local interaction modules to capture multiple representations of user preference through sequential behaviors and simultaneously utilize the fine-grained user-specific as well as context information. In addition, we design a Global Interaction Module (GIM) to learn the high-order interactions and balance the different impacts of multiple features. Finally, Offline experiment results from three datasets, together with an Online A/B test in a large-scale recommendation system, demonstrate the effectiveness of our proposed approach.

翻訳日:2021-05-09 19:42:47 公開日:2021-03-07

# リアルタイムユーザクリックによる画像マッチングの改善と不確かさ推定

Improved Image Matting via Real-time User Clicks and Uncertainty Estimation ( http://arxiv.org/abs/2012.08323v2 )

ライセンス: Link先を確認

Tianyi Wei, Dongdong Chen, Wenbo Zhou, Jing Liao, Hanqing Zhao, Weiming Zhang, Nenghai Yu

(参考訳) 画像マッチングはコンピュータビジョンとグラフィックスの基本的な問題である。既存の畳み込み方式の多くは、優れたアルファマットを生成する補助入力として、ユーザ供給のトリマップを利用する。しかし、高品質な trimap 自体を得ることは困難であり、これらの手法の適用を制限している。最近、いくつかのtrimap-freeメソッドが登場しているが、マットングの品質はtrimap-basedメソッドよりはるかに遅れている。主な理由は、トリマップガイダンスがなければ、ターゲットネットワークがフォアグラウンドターゲットについてあいまいである場合もあります。実際、前景を選択することは主観的な手続きであり、ユーザの意図に依存する。そこで本稿では,3マップフリーでユーザクリック操作を数回しか必要とせず,あいまいさを解消できるディープイメージマットリングフレームワークを提案する。さらに,どの部品に研磨が必要なのかを予測できる新しい不確実性推定モジュールと,後続の局所改質モジュールを導入する。計算予算に基づいて、不確実性ガイダンスで改善するローカル部品の数を選択できる。定量的・定性的な結果から,提案手法は既存のtrimapフリーメソッドよりも優れた性能を示し,ユーザによる最小限の労力で,最先端のtrimapベースメソッドと比較できることがわかった。

Image matting is a fundamental and challenging problem in computer vision and graphics. Most existing matting methods leverage a user-supplied trimap as an auxiliary input to produce good alpha matte. However, obtaining high-quality trimap itself is arduous, thus restricting the application of these methods. Recently, some trimap-free methods have emerged, however, the matting quality is still far behind the trimap-based methods. The main reason is that, without the trimap guidance in some cases, the target network is ambiguous about which is the foreground target. In fact, choosing the foreground is a subjective procedure and depends on the user's intention. To this end, this paper proposes an improved deep image matting framework which is trimap-free and only needs several user click interactions to eliminate the ambiguity. Moreover, we introduce a new uncertainty estimation module that can predict which parts need polishing and a following local refinement module. Based on the computation budget, users can choose how many local parts to improve with the uncertainty guidance. Quantitative and qualitative results show that our method performs better than existing trimap-free methods and comparably to state-of-the-art trimap-based methods with minimal user effort.

翻訳日:2021-05-07 05:21:17 公開日:2021-03-07

# モバイルデバイス上でリアルタイムLiDAR 3Dオブジェクト検出を実現する

Achieving Real-Time LiDAR 3D Object Detection on a Mobile Device ( http://arxiv.org/abs/2012.13801v2 )

ライセンス: Link先を確認

Pu Zhao, Wei Niu, Geng Yuan, Yuxuan Cai, Hsin-Hsuan Sung, Sijia Liu, Xipeng Shen, Bin Ren, Yanzhi Wang, Xue Lin

(参考訳) 3Dオブジェクト検出は特に自律運転アプリケーション領域において重要なタスクである。しかし、自動運転車のエッジコンピューティングデバイス上での計算とメモリリソースの制限により、リアルタイムパフォーマンスをサポートすることは困難である。そこで本研究では,ネットワークの強化と強化学習手法による探索を取り入れたコンパイラ対応統合フレームワークを提案し,資源限定エッジコンピューティングデバイス上での3Dオブジェクト検出のリアルタイム推論を実現する。具体的には,リカレントニューラルネットワーク(RNN)を用いて,人的知識や支援を伴わずに,ネットワークの強化とプルーニングの両方を自動で行う統一的なスキームを提供する。また、統一スキームの評価性能は、ジェネレータRNNを訓練するためにフィードバックすることができる。実験の結果,提案フレームワークはモバイル端末(Samsung Galaxy S20)におけるリアルタイム3Dオブジェクト検出を競合検出性能で実現していることがわかった。

3D object detection is an important task, especially in the autonomous driving application domain. However, it is challenging to support the real-time performance with the limited computation and memory resources on edge-computing devices in self-driving cars. To achieve this, we propose a compiler-aware unified framework incorporating network enhancement and pruning search with the reinforcement learning techniques, to enable real-time inference of 3D object detection on the resource-limited edge-computing devices. Specifically, a generator Recurrent Neural Network (RNN) is employed to provide the unified scheme for both network enhancement and pruning search automatically, without human expertise and assistance. And the evaluated performance of the unified schemes can be fed back to train the generator RNN. The experimental results demonstrate that the proposed framework firstly achieves real-time 3D object detection on mobile devices (Samsung Galaxy S20 phone) with competitive detection performance.

翻訳日:2021-04-25 01:14:23 公開日:2021-03-07

# (参考訳) マルチエージェント強化学習を用いたOFDMAダウンリンクシステムにおけるバースティトラフィックの公正指向スケジューリング

Fairness-Oriented Scheduling for Bursty Traffic in OFDMA Downlink Systems Using Multi-Agent Reinforcement Learning ( http://arxiv.org/abs/2012.15081v8 )

ライセンス: CC BY 4.0

Mingqi Yuan, Qi Cao, Man-on Pun, Yi Chen

(参考訳) ユーザスケジューリングは、無線通信における古典的な問題であり、鍵となる技術である。基地局には、PF(Proportional Fairness)やRRF(Robin Fashion)など、多くの高度なスケジューラが展開されている。オポチュニティ(OP)スケジューリングは、完全なバッファトラフィックを考慮した平均ユーザデータレート(AUDR)を最大化する最適なスケジューラであることが知られている。しかし、最高公平性を達成するための最適な戦略は、フルバッファトラフィックとバーストトラフィックの両方において、いまだに不明である。本研究では,特にRBG割り当てにおける公平性を考慮したユーザスケジューリングの問題について検討する。本稿では,マルチエージェント強化学習(marl)を用いて,通信システムの公平性を最大化する分散最適化を行うユーザスケジューラを構築する。エージェントは層間情報(例)を取る。状態として RSRP, Buffer サイズ) と状態として RBG を割り当て、フェアネスを最大化するように設計された報酬関数に従って最適解を探索する。さらに、5%タイルのユーザデータレート(5TUDR)をキーパフォーマンス指標(KPI)として、PFスケジューリングとRFスケジューリングとMARLスケジューリングの性能を比較する。シミュレーションの結果,提案したMARLスケジューリングは従来のスケジューラよりも優れていた。

User scheduling is a classical problem and key technology in wireless communication, which will still plays an important role in the prospective 6G. There are many sophisticated schedulers that are widely deployed in the base stations, such as Proportional Fairness (PF) and Round-Robin Fashion (RRF). It is known that the Opportunistic (OP) scheduling is the optimal scheduler for maximizing the average user data rate (AUDR) considering the full buffer traffic. But the optimal strategy achieving the highest fairness still remains largely unknown both in the full buffer traffic and the bursty traffic. In this work, we investigate the problem of fairness-oriented user scheduling, especially for the RBG allocation. We build a user scheduler using Multi-Agent Reinforcement Learning (MARL), which conducts distributional optimization to maximize the fairness of the communication system. The agents take the cross-layer information (e.g. RSRP, Buffer size) as state and the RBG allocation result as action, then explore the optimal solution following a well-defined reward function designed for maximizing fairness. Furthermore, we take the 5%-tile user data rate (5TUDR) as the key performance indicator (KPI) of fairness, and compare the performance of MARL scheduling with PF scheduling and RRF scheduling by conducting extensive simulations. And the simulation results show that the proposed MARL scheduling outperforms the traditional schedulers.

翻訳日:2021-04-18 15:50:37 公開日:2021-03-07

# AraELECTRA:アラビア語理解のための事前学習テキスト識別装置

AraELECTRA: Pre-Training Text Discriminators for Arabic Language Understanding ( http://arxiv.org/abs/2012.15516v2 )

ライセンス: Link先を確認

Wissam Antoun, Fady Baly, Hazem Hajj

(参考訳) 英語表現の進歩により、トークン置換を正確に分類するエンコーダ(ELECTRA)を効果的に学習することで、よりサンプル効率のよい事前学習タスクが実現された。これは、マスクされたトークンを復元するモデルをトレーニングする代わりに、ジェネレータネットワークに置き換えられた破損したトークンと真の入力トークンを区別するために識別器モデルを訓練する。一方、現在のアラビア語表現アプローチは、マスク言語モデリングによる事前学習のみに依存している。本稿では,アラエレクトラ(araelectra)というアラビア語表現モデルを開発した。我々のモデルは、大きなアラビア文字コーパス上の代用トークン検出目標を用いて事前訓練されている。我々は,複数のアラビア語nlpタスクにおいて,読み理解,感情分析,名前付きエンティティ認識を含むモデルを評価し,同じ事前学習データとより小さいモデルサイズでアラエレクトラが現在のアラビア語表現モデルよりも優れていることを示す。

Advances in English language representation enabled a more sample-efficient pre-training task by Efficiently Learning an Encoder that Classifies Token Replacements Accurately (ELECTRA). Which, instead of training a model to recover masked tokens, it trains a discriminator model to distinguish true input tokens from corrupted tokens that were replaced by a generator network. On the other hand, current Arabic language representation approaches rely only on pretraining via masked language modeling. In this paper, we develop an Arabic language representation model, which we name AraELECTRA. Our model is pretrained using the replaced token detection objective on large Arabic text corpora. We evaluate our model on multiple Arabic NLP tasks, including reading comprehension, sentiment analysis, and named-entity recognition and we show that AraELECTRA outperforms current state-of-the-art Arabic language representation models, given the same pretraining data and with even a smaller model size.

翻訳日:2021-04-17 17:16:21 公開日:2021-03-07

# aragpt2:アラビア語生成のための事前学習トランスフォーマー

AraGPT2: Pre-Trained Transformer for Arabic Language Generation ( http://arxiv.org/abs/2012.15520v2 )

ライセンス: Link先を確認

Wissam Antoun, Fady Baly, Hazem Hajj

(参考訳) 近年、事前学習されたトランスフォーマーベースのアーキテクチャは、十分に大きなコーパスでトレーニングされているため、言語モデリングと理解において非常に効率的であることが証明されている。アラビア語の言語生成の応用は、アラビア語の先進的な生成モデルが欠如していることから、他のNLPの進歩と比べてもまだ遅れている。本稿では,インターネットテキストとニュース記事の巨大なアラビア語コーパスをスクラッチから学習した,最初の高度なアラビア語言語生成モデルであるalgpt2を開発した。私たちの最大のモデルであるAraGPT2-megaは46億のパラメータを持ち、アラビア語のモデルとしては最大です。 megaモデルは評価され、合成ニュース生成やゼロショット質問応答など、さまざまなタスクで成功を収めた。テキスト生成では、wikipediaの記事に29.8のパープレキシティを達成する。 AraGPT2-megaは,人間による記事と区別が難しいニュース記事の生成において,有意な成功を収めた。そこで我々は,モデル生成テキストの検出精度98%の精度で自動判別モデルを開発した。これらのモデルは、アラビア語のNLPのための新しい研究の方向性と応用を促進することを願っている。

Recently, pre-trained transformer-based architectures have proven to be very efficient at language modeling and understanding, given that they are trained on a large enough corpus. Applications in language generation for Arabic are still lagging in comparison to other NLP advances primarily due to the lack of advanced Arabic language generation models. In this paper, we develop the first advanced Arabic language generation model, AraGPT2, trained from scratch on a large Arabic corpus of internet text and news articles. Our largest model, AraGPT2-mega, has 1.46 billion parameters, which makes it the largest Arabic language model available. The Mega model was evaluated and showed success on different tasks including synthetic news generation, and zero-shot question answering. For text generation, our best model achieves a perplexity of 29.8 on held-out Wikipedia articles. A study conducted with human evaluators showed the significant success of AraGPT2-mega in generating news articles that are difficult to distinguish from articles written by humans. We thus develop and release an automatic discriminator model with a 98% percent accuracy in detecting model-generated text. The models are also publicly available, hoping to encourage new research directions and applications for Arabic NLP.

翻訳日:2021-04-17 17:16:05 公開日:2021-03-07

# 時間的点過程を用いた長地平線予測

Long Horizon Forecasting With Temporal Point Processes ( http://arxiv.org/abs/2101.02815v2 )

ライセンス: Link先を確認

Prathamesh Deshpande, Kamlesh Marathe, Abir De, Sunita Sarawagi

(参考訳) 近年,多種多様なアプリケーションにおいて,非同期イベントを特徴付ける強力なモデリング機構として,MTPP( marked temporal point process)が出現している。 mtppsは、イベントの予測、特に近い将来のイベントの予測において大きな可能性を実証している。しかし、現在の設計選択のため、MTPPは遠い未来のイベントの到着を予測する際の予測性能が低いことがしばしばある。本報告では,この制限を緩和するために,特に長い地平線イベント予測に適したDualTPPを設計する。 DualTPPには2つのコンポーネントがある。第1のコンポーネントは強度のないMTPPモデルであり、将来のイベントの時刻をモデル化することで、イベントダイナミクスの微視的または粒度の信号をキャプチャする。第2のコンポーネントは、与えられた時間ウィンドウ内のイベントの集約数をモデル化する異なる双対的な視点を取り、マクロなイベントダイナミクスをカプセル化する。そこで我々は,制約付き二次最適化問題の列を解いて,長方形事象を効率的に予測するための2つのモデル % 上で協調して新しい推論フレームワークを開発した。様々な実際のデータセットを用いた実験により、DualTPPは、長い地平線予測において既存のMTPP法よりも優れた性能を示し、実際の事象と予測の間のワッサーシュタイン距離の約1桁の縮小を実現している。

In recent years, marked temporal point processes (MTPPs) have emerged as a powerful modeling machinery to characterize asynchronous events in a wide variety of applications. MTPPs have demonstrated significant potential in predicting event-timings, especially for events arriving in near future. However, due to current design choices, MTPPs often show poor predictive performance at forecasting event arrivals in distant future. To ameliorate this limitation, in this paper, we design DualTPP which is specifically well-suited to long horizon event forecasting. DualTPP has two components. The first component is an intensity free MTPP model, which captures microscopic or granular level signals of the event dynamics by modeling the time of future events. The second component takes a different dual perspective of modeling aggregated counts of events in a given time-window, thus encapsulating macroscopic event dynamics. Then we develop a novel inference framework jointly over the two models % for efficiently forecasting long horizon events by solving a sequence of constrained quadratic optimization problems. Experiments with a diverse set of real datasets show that DualTPP outperforms existing MTPP methods on long horizon forecasting by substantial margins, achieving almost an order of magnitude reduction in Wasserstein distance between actual events and forecasts.

翻訳日:2021-04-10 05:05:46 公開日:2021-03-07

# (参考訳) アンサンブル学習について

On Ensemble Learning ( http://arxiv.org/abs/2103.12521v1 )

ライセンス: CC BY 4.0

Mark Stamp and Aniket Chandak and Gavin Wong and Allen Ye

(参考訳) 本稿では,アンサンブル分類器,すなわち,スコアリング関数の組み合わせを利用した機械学習に基づく分類器について考察する。このような分類を分類するためのフレームワークを提供し、また、それぞれのフレームワークがどのように適合するかを議論するいくつかのアンサンブルテクニックを概説する。この一般的な紹介から,マルウェア分析の文脈におけるアンサンブル学習の話題に方向転換する。本稿では,マルウェア(および関連する)研究で使用されているアンサンブル技術について簡単な調査を行う。最後に,大規模かつ課題の多いマルウェアデータセットにアンサンブル手法を適用する実験を行った。これらのアンサンブル技術の多くはマルウェアの文献に現れるが、これまでは、異なるデータセットや異なる成功度合いが一般的に使用されるため、このような結果を直接比較する方法がなかった。私たちの共通フレームワークと経験的成果は、アンサンブル学習の分野において、マルウェア分析の問題の狭い領域と、マシンラーニング全般の領域の両方において、ある秩序感をもたらすための努力です。

In this paper, we consider ensemble classifiers, that is, machine learning based classifiers that utilize a combination of scoring functions. We provide a framework for categorizing such classifiers, and we outline several ensemble techniques, discussing how each fits into our framework. From this general introduction, we then pivot to the topic of ensemble learning within the context of malware analysis. We present a brief survey of some of the ensemble techniques that have been used in malware (and related) research. We conclude with an extensive set of experiments, where we apply ensemble techniques to a large and challenging malware dataset. While many of these ensemble techniques have appeared in the malware literature, previously there has been no way to directly compare results such as these, as different datasets and different measures of success are typically used. Our common framework and empirical results are an effort to bring some sense of order to the chaos that is evident in the evolving field of ensemble learning -- both within the narrow confines of the malware analysis problem, and in the larger realm of machine learning in general.

翻訳日:2021-04-05 05:22:14 公開日:2021-03-07

# (参考訳) ワイボーにおけるトロール検出の感度解析

Sentiment Analysis for Troll Detection on Weibo ( http://arxiv.org/abs/2103.09054v1 )

ライセンス: CC BY 4.0

Zidong Jiang and Fabio Di Troia and Mark Stamp

(参考訳) ソーシャルメディアが現代世界に与える影響を誇張することは難しい。事実上、あらゆる企業や著名人がtwitterやfacebookなどの人気プラットフォーム上でソーシャルメディアアカウントを持っている。中国では、マイクロブログサービスプロバイダであるsina weiboが最も人気のあるサービスである。世論に影響を与えるため、Weibo Trolls(いわゆるWater Army)は偽りのコメントを投稿するために雇われる。本稿では,Sina Weiboプラットフォーム上での感情分析およびその他のユーザ活動データを用いたトロル検出に焦点を当てた。中国語文のセグメンテーション,単語埋め込み,感情スコア計算のための手法を実装した。近年, トラル検出と感情分析が研究されているが, これまでのトロール検出を感情分析に基づいて検討した研究は知られていない。我々は、様々な機械学習戦略に基づいて、トロール検出のための感情分析アプローチを開発し、テストするために、得られた技術を用いる。実験結果が生成され分析される。提案手法を実装したChromeエクステンションは,ユーザがSina Weiboを閲覧すると,リアルタイムのトロル検出を可能にする。

The impact of social media on the modern world is difficult to overstate. Virtually all companies and public figures have social media accounts on popular platforms such as Twitter and Facebook. In China, the micro-blogging service provider, Sina Weibo, is the most popular such service. To influence public opinion, Weibo trolls -- the so called Water Army -- can be hired to post deceptive comments. In this paper, we focus on troll detection via sentiment analysis and other user activity data on the Sina Weibo platform. We implement techniques for Chinese sentence segmentation, word embedding, and sentiment score calculation. In recent years, troll detection and sentiment analysis have been studied, but we are not aware of previous research that considers troll detection based on sentiment analysis. We employ the resulting techniques to develop and test a sentiment analysis approach for troll detection, based on a variety of machine learning strategies. Experimental results are generated and analyzed. A Chrome extension is presented that implements our proposed technique, which enables real-time troll detection when a user browses Sina Weibo.

翻訳日:2021-04-05 04:59:42 公開日:2021-03-07

# (参考訳) 脳波特徴を用いた抑うつ検出のためのアンサンブルアプローチ

Ensemble approach for detection of depression using EEG features ( http://arxiv.org/abs/2103.08467v1 )

ライセンス: CC BY 4.0

Egils Avots, Kla\=vs Jermakovs, Maie Bachmann, Laura Paeske, Cagri Ozcinar, Gholamreza Anbarjafari

(参考訳) うつ病は社会の健康に深刻な影響を与え、社会に悪影響を及ぼす公衆衛生問題である。これらの問題に対する意識を高めるため,脳波信号からうつ病の長期持続効果を判定することを目的とした。本稿では、線形(相対帯域パワー、APV、SASI)および非線形(HFD、LZC、DFA)EEG機能を用いて訓練されたSVM、LDA、NB、kNN、D3バイナリ分類器の精度比較を含む。年齢と性別の一致したデータセットは、健常者10名、うつ病診断者10名で構成された。提案する機能選択と分類の組み合わせのいくつかは90%の精度に達し、10倍のクロス検証を用いて評価したすべてのモデルが、ランダムなサンプル順列で100回以上繰り返した。

Depression is a public health issue which severely affects one's well being and cause negative social and economic effect for society. To rise awareness of these problems, this publication aims to determine if long lasting effects of depression can be determined from electoencephalographic (EEG) signals. The article contains accuracy comparison for SVM, LDA, NB, kNN and D3 binary classifiers which were trained using linear (relative band powers, APV, SASI) and non-linear (HFD, LZC, DFA) EEG features. The age and gender matched dataset consisted of 10 healthy subjects and 10 subjects with depression diagnosis at some point in their lifetime. Several of the proposed feature selection and classifier combinations reached accuracy of 90% where all models where evaluated using 10-fold cross validation and averaged over 100 repetitions with random sample permutations.

翻訳日:2021-04-05 04:46:44 公開日:2021-03-07

# ベイズ畳み込みニューラルネットワークを用いたグレース・グレイス・フォギャップにおける陸水貯留異常の予測の改善

Improving prediction of the terrestrial water storage anomalies during the GRACE and GRACE-FO gap with Bayesian convolutional neural networks ( http://arxiv.org/abs/2101.09361v2 )

ライセンス: Link先を確認

Shaoxing Mo, Yulong Zhong, Xiaoqing Shi, Wei Feng, Xin Yin, Jichun Wu

(参考訳) 重力回復・気候実験(GRACE)衛星とその後継であるGRACE Follow-On(GRACE-FO)は、地球規模の地球規模の貯水異常(TWSA)の貴重な正確な観測を提供する。しかし、GRACEとGRACE-FOの間には、TWSAsの約1年間の観測ギャップがある。これは、twsa観測における不連続性が水文モデル予測に重大なバイアスと不確実性をもたらし、その結果、誤った意思決定をもたらす可能性があるため、実用的な応用にとっての課題となる。この課題に対処するために、気候データによって駆動されるベイズ畳み込みニューラルネットワーク(BCNN)を提案し、このギャップを世界規模で橋渡しする。注意機構や残差と密接性を含むディープラーニングの最近の進歩を統合することで、bcnnはマルチソース入力データからtwsa予測の重要な特徴を自動的かつ効率的に抽出することができる。予測されたtwsaは、水文モデル出力および最近の3つのtwsa予測製品と比較される。この比較は、比較的乾燥した地域でのギャップにおいて、TWSAの予測を改善するBCNNの優れた性能を示唆している。さらに, 降水異常, 干ばつ指数, 地下水位を比較することにより, ギャップ期間中の極端に乾燥し, 湿潤な現象を識別するbcnnの能力はさらに議論され, 総合的に実証された。 BCNNは、TWSAデータの連続性を維持し、ギャップの間における気候極端の影響を定量化するための信頼性の高いソリューションを提供することができることを示している。

The Gravity Recovery and Climate Experiment (GRACE) satellite and its successor GRACE Follow-On (GRACE-FO) provide valuable and accurate observations of terrestrial water storage anomalies (TWSAs) at a global scale. However, there is an approximately one-year observation gap of TWSAs between GRACE and GRACE-FO. This poses a challenge for practical applications, as discontinuity in the TWSA observations may introduce significant biases and uncertainties in the hydrological model predictions and consequently mislead decision making. To tackle this challenge, a Bayesian convolutional neural network (BCNN) driven by climatic data is proposed in this study to bridge this gap at a global scale. Enhanced by integrating recent advances in deep learning, including the attention mechanisms and the residual and dense connections, BCNN can automatically and efficiently extract important features for TWSA predictions from multi-source input data. The predicted TWSAs are compared to the hydrological model outputs and three recent TWSA prediction products. The comparison suggests the superior performance of BCNN in providing improved predictions of TWSAs during the gap in particular in the relatively arid regions. The BCNN's ability to identify the extreme dry and wet events during the gap period is further discussed and comprehensively demonstrated by comparing with the precipitation anomalies, drought index, ground/surface water levels. Results indicate that BCNN is capable of offering a reliable solution to maintain the TWSA data continuity and quantify the impacts of climate extremes during the gap.

翻訳日:2021-03-21 07:46:58 公開日:2021-03-07

# 双方向GRUモデルを用いたアラビア語のアスペクトベース感情分析

Arabic aspect based sentiment analysis using bidirectional GRU based models ( http://arxiv.org/abs/2101.10539v3 )

ライセンス: Link先を確認

Mohammed M.Abdelgwad, Taysir Hassan A Soliman, Ahmed I.Taloba, Mohamed Fawzy Farghaly

(参考訳) アスペクトベースの知覚分析(ABSA)は、与えられた文書や文の側面と各側面について伝達される感情を定義する、きめ細かい分析を行う。このレベルの分析は、レビューの微妙な視点を探求できる最も詳細なバージョンである。 ABSAで利用可能な研究のほとんどは英語に焦点を当てており、アラビア語に関する研究はほとんどない。アラビアにおけるこれまでのほとんどの研究は、主にレキシコンのようなアラビアのコンテンツを分析・処理するための希少なリソースとツールのグループに依存する機械学習の通常の方法に基づいているが、これらのリソースの欠如は別の課題を招いている。これらの障害を克服するために,gru(gated recurrent unit)ニューラルネットワークを用いた2つのモデルを用いた深層学習法を提案する。 1つ目は、双方向のgrg、畳み込みニューラルネットワーク(cnn)、条件付き確率場(crf)の組み合わせによって単語と文字の両方の表現を利用するdlモデルであり、(bgru-cnn-crf)モデルを作成して主意見の側面(ote)を抽出する。 2つ目は、双方向GRU(IAN-BGRU)に基づく対話型アテンションネットワークで、抽出された側面に対する感情極性を特定する。アラビアホテルレビューデータセットを用いて,本モデルの評価を行った。提案手法は,評価対象抽出のためのF1スコアが38.5%,アスペクトベースの感情極性分類(T3)のための精度が7.5%,両タスクのベースライン研究よりも優れていることを示す。 F1得点はT2が69.44%、T3が83.98%である。

Aspect-based Sentiment analysis (ABSA) accomplishes a fine-grained analysis that defines the aspects of a given document or sentence and the sentiments conveyed regarding each aspect. This level of analysis is the most detailed version that is capable of exploring the nuanced viewpoints of the reviews. Most of the research available in ABSA focuses on English language with very few work available on Arabic. Most previous work in Arabic has been based on regular methods of machine learning that mainly depends on a group of rare resources and tools for analyzing and processing Arabic content such as lexicons, but the lack of those resources presents another challenge. To overcome these obstacles, Deep Learning (DL)-based methods are proposed using two models based on Gated Recurrent Units (GRU) neural networks for ABSA. The first one is a DL model that takes advantage of the representations on both words and characters via the combination of bidirectional GRU, Convolutional neural network (CNN), and Conditional Random Field (CRF) which makes up (BGRU-CNN-CRF) model to extract the main opinionated aspects (OTE). The second is an interactive attention network based on bidirectional GRU (IAN-BGRU) to identify sentiment polarity toward extracted aspects. We evaluated our models using the benchmarked Arabic hotel reviews dataset. The results indicate that the proposed methods are better than baseline research on both tasks having 38.5% enhancement in F1-score for opinion target extraction (T2) and 7.5% in accuracy for aspect-based sentiment polarity classification (T3). Obtaining F1 score of 69.44% for T2, and accuracy of 83.98% for T3.

翻訳日:2021-03-19 10:47:20 公開日:2021-03-07

# (参考訳) マルウェア進化検出のための単語埋め込み技術

Word Embedding Techniques for Malware Evolution Detection ( http://arxiv.org/abs/2103.05759v1 )

ライセンス: CC BY 4.0

Sunhera Paul and Mark Stamp

(参考訳) マルウェア検出は情報セキュリティの重要な側面である。問題のひとつは、マルウェアが時間とともに進化することです。効果的なマルウェア検出を維持するためには,マルウェアの進化がいつ発生したのかを判断し,適切な対策を講じる必要がある。マルウェアファミリーが進化した可能性が高い時期のポイントを検出するための様々な実験を行い、進化が実際に発生したことを確認するための二次テストを検討します。いくつかのマルウェアファミリーが分析され、それぞれが長期間にわたって収集された多数のサンプルを含んでいる。実験は, 単語埋め込み技術に基づく機能工学を用いて, 改良結果を得たことを示す。すべての実験は機械学習モデルに基づいており、進化検出戦略は人間の介入を最小限に抑え、簡単に自動化できる。

Malware detection is a critical aspect of information security. One difficulty that arises is that malware often evolves over time. To maintain effective malware detection, it is necessary to determine when malware evolution has occurred so that appropriate countermeasures can be taken. We perform a variety of experiments aimed at detecting points in time where a malware family has likely evolved, and we consider secondary tests designed to confirm that evolution has actually occurred. Several malware families are analyzed, each of which includes a number of samples collected over an extended period of time. Our experiments indicate that improved results are obtained using feature engineering based on word embedding techniques. All of our experiments are based on machine learning models, and hence our evolution detection strategies require minimal human intervention and can easily be automated.

翻訳日:2021-03-12 09:07:22 公開日:2021-03-07

# (参考訳) マルウェア家族関係のクラスタ分析

Cluster Analysis of Malware Family Relationships ( http://arxiv.org/abs/2103.05761v1 )

ライセンス: CC BY 4.0

Samanvitha Basole and Mark Stamp

(参考訳) 本稿では,k$-meansクラスタリングを用いてマルウェアサンプル間の各種関係を分析する。約20のマルウェアファミリーと1家族あたり1000のサンプルからなるデータセットを考察する。これらの家族は7種類のマルウェアに分類される。我々は,家族のペアに基づいてクラスタリングを行い,その結果を用いて家族間の関係を判定する。マルウェアの種類に基づいて同様のクラスタ分析を行います。以上の結果から,K$-meansクラスタリングは,マルウェアの家族関係を探索するための強力なツールとなる可能性が示唆された。

In this paper, we use $K$-means clustering to analyze various relationships between malware samples. We consider a dataset comprising~20 malware families with~1000 samples per family. These families can be categorized into seven different types of malware. We perform clustering based on pairs of families and use the results to determine relationships between families. We perform a similar cluster analysis based on malware type. Our results indicate that $K$-means clustering can be a powerful tool for data exploration of malware family relationships.

翻訳日:2021-03-12 08:55:50 公開日:2021-03-07

# (参考訳) マルウェア分類のためのWord2Vec, HMM2Vec, PCA2Vecの比較

A Comparison of Word2Vec, HMM2Vec, and PCA2Vec for Malware Classification ( http://arxiv.org/abs/2103.05763v1 )

ライセンス: CC BY 4.0

Aniket Chandak and Wendy Lee and Mark Stamp

(参考訳) 単語の埋め込みはしばしば、単語間の関係を定量化する手段として自然言語処理で使用される。より一般的に、これらの同じ単語埋め込み技術は特徴間の関係の定量化に利用できる。本稿では,マルウェア分類の文脈において,複数の単語埋め込み手法を検討する。私たちは隠れマルコフモデルを使用して、HMM2Vecと呼ばれるアプローチで埋め込みベクトルを取得し、主成分分析に基づいてベクトル埋め込みを生成します。また、Word2Vecと呼ばれる一般的なニューラルネットワークベースの単語埋め込み技術も検討します。いずれの場合も,様々な家系のマルウェアサンプルに対して,オプコードシーケンスに基づく特徴埋め込みを導出する。本研究では,これらの特徴埋め込みに基づく分類精度の向上を,オプコードシーケンスを直接使用するHMM実験と比較し,ベースラインの確立に役立つことを示した。これらの結果は,マルウェア解析の分野では,単語埋め込みが有用な機能工学的ステップであることを示す。

Word embeddings are often used in natural language processing as a means to quantify relationships between words. More generally, these same word embedding techniques can be used to quantify relationships between features. In this paper, we first consider multiple different word embedding techniques within the context of malware classification. We use hidden Markov models to obtain embedding vectors in an approach that we refer to as HMM2Vec, and we generate vector embeddings based on principal component analysis. We also consider the popular neural network based word embedding technique known as Word2Vec. In each case, we derive feature embeddings based on opcode sequences for malware samples from a variety of different families. We show that we can obtain better classification accuracy based on these feature embeddings, as compared to HMM experiments that directly use the opcode sequences, and serve to establish a baseline. These results show that word embeddings can be a useful feature engineering step in the field of malware analysis.

翻訳日:2021-03-12 08:43:40 公開日:2021-03-07

# (参考訳) 時系列レコメンダシステムの時間モデルを用いたハイブリッドモデル

Hybrid Model with Time Modeling for Sequential Recommender Systems ( http://arxiv.org/abs/2103.06138v1 )

ライセンス: CC BY 4.0

Marlesson R. O. Santana, Anderson Soares

(参考訳) 深層学習に基づく手法は、推薦システム問題に成功している。反復ニューラルネットワーク、トランス、および注意メカニズムを用いたアプローチは、シーケンシャルインタラクションにおけるユーザーの長期および短期の好みをモデル化するのに有用である。さまざまなセッションベースのレコメンデーションソリューションを探求するために、Booking.comは最近WSDM WebTour 2021 Challengeを組織しました。本研究はこの課題に対する我々のアプローチを示す。レコメンダシステムのための最先端のディープラーニングアーキテクチャをテストするために,いくつかの実験を行った。さらに,NARM(Neural Attentive Recommendation Machine)にいくつかの変更を加え,そのアーキテクチャを課題に適応させ,どのセッションベースモデルにも適用可能なトレーニングアプローチを実装して精度を向上した。実験結果から,narmの改善は他のベンチマーク手法よりも優れていた。

Deep learning based methods have been used successfully in recommender system problems. Approaches using recurrent neural networks, transformers, and attention mechanisms are useful to model users' long- and short-term preferences in sequential interactions. To explore different session-based recommendation solutions, Booking.com recently organized the WSDM WebTour 2021 Challenge, which aims to benchmark models to recommend the final city in a trip. This study presents our approach to this challenge. We conducted several experiments to test different state-of-the-art deep learning architectures for recommender systems. Further, we proposed some changes to Neural Attentive Recommendation Machine (NARM), adapted its architecture for the challenge objective, and implemented training approaches that can be used in any session-based model to improve accuracy. Our experimental result shows that the improved NARM outperforms all other state-of-the-art benchmark methods.

翻訳日:2021-03-12 07:16:44 公開日:2021-03-07

# (参考訳) universal adversarial perturbation とイメージスパム分類器

Universal Adversarial Perturbations and Image Spam Classifiers ( http://arxiv.org/abs/2103.05469v1 )

ライセンス: CC BY 4.0

Andy Phung and Mark Stamp

(参考訳) 名前が示すように、画像スパムは画像に埋め込まれたスパムメールだ。画像スパムはテキストベースのフィルターを避けるために開発された。現代のディープラーニングに基づく分類器は、野生で見られる典型的な画像スパムを検出するのによく機能する。本章では,ディープラーニングに基づく画像スパム分類器を攻撃するために,多くの敵手法を評価する。テストした手法のうち、普遍摂動が最善であることがわかった。そこで本稿では, 画像スパムに適応した「自然な摂動」を生成可能な, 変換に基づく新たな対向攻撃を提案し, 解析する。結果として得られるスパム画像は、集中した自然特徴の存在と普遍的な敵の摂動の両方から恩恵を受ける。提案手法は, 精度の低下, 例ごとの計算時間, 摂動距離において, 既存の敵攻撃よりも優れていることを示す。本手法は,画像スパム検出における今後の研究の課題データセットとして使用できる,敵対的スパム画像のデータセットの作成に応用する。

As the name suggests, image spam is spam email that has been embedded in an image. Image spam was developed in an effort to evade text-based filters. Modern deep learning-based classifiers perform well in detecting typical image spam that is seen in the wild. In this chapter, we evaluate numerous adversarial techniques for the purpose of attacking deep learning-based image spam classifiers. Of the techniques tested, we find that universal perturbation performs best. Using universal adversarial perturbations, we propose and analyze a new transformation-based adversarial attack that enables us to create tailored "natural perturbations" in image spam. The resulting spam images benefit from both the presence of concentrated natural features and a universal adversarial perturbation. We show that the proposed technique outperforms existing adversarial attacks in terms of accuracy reduction, computation time per example, and perturbation distance. We apply our technique to create a dataset of adversarial spam images, which can serve as a challenge dataset for future research in image spam detection.

翻訳日:2021-03-11 18:22:51 公開日:2021-03-07

# (参考訳) セマンティックセグメンテーションにおける平方根親和性を用いたラベルの回帰

Use square root affinity to regress labels in semantic segmentation ( http://arxiv.org/abs/2103.04990v1 )

ライセンス: CC BY 4.0

Lumeng Cao, Zhouwang Yang

(参考訳) セマンティックセグメンテーションは、コンピュータビジョンにおける基本的な非自明なタスクです。多くの以前の研究では、アフィニティパターンを利用してセグメンテーションネットワークを強化することに焦点を当てている。これらの研究のほとんどは、アフィニティ行列を特徴融合重みの一種として使用しており、これは注意モデルや非局所モデルなどのネットワークに組み込まれたモジュールの一部である。本稿では,アフィニティ行列とラベルを関連付け,教師付き方法でアフィニティを利用する。具体的には,このラベルを用いてマルチスケールなラベル親和性行列を構造的監視として生成し,平方根カーネルを用いて出力層上の非局所親和性行列を計算する。このような2つの親和性により、Affinity Regression loss(AR損失)と呼ばれる新しい損失を定義します。我々のモデルは訓練が容易であり、実行時推論なしで計算負荷をほとんど加えない。 NYUv2データセットとCityscapesデータセットに関する広範な実験は、提案手法がセマンティックセグメンテーションネットワークを促進するのに十分であることを示す。

Semantic segmentation is a basic but non-trivial task in computer vision. Many previous work focus on utilizing affinity patterns to enhance segmentation networks. Most of these studies use the affinity matrix as a kind of feature fusion weights, which is part of modules embedded in the network, such as attention models and non-local models. In this paper, we associate affinity matrix with labels, exploiting the affinity in a supervised way. Specifically, we utilize the label to generate a multi-scale label affinity matrix as a structural supervision, and we use a square root kernel to compute a non-local affinity matrix on output layers. With such two affinities, we define a novel loss called Affinity Regression loss (AR loss), which can be an auxiliary loss providing pair-wise similarity penalty. Our model is easy to train and adds little computational burden without run-time inference. Extensive experiments on NYUv2 dataset and Cityscapes dataset demonstrate that our proposed method is sufficient in promoting semantic segmentation networks.

翻訳日:2021-03-11 16:53:20 公開日:2021-03-07

# (参考訳) カルマンフィルタを用いた最適物体追跡技術

Optimized Object Tracking Technique Using Kalman Filter ( http://arxiv.org/abs/2103.05467v1 )

ライセンス: CC BY 4.0

Liana Ellen Taylor, Midriem Mirdanies, Roni Permana Saputra

(参考訳) 本稿では, クラッタシーンにおける所望の移動物体の検出精度を維持しつつ, 物体検出プロセスに必要な処理時間を最小化する最適化オブジェクト追跡手法の設計に着目した。カルマンフィルタベースのトリミング画像は、処理時間がビデオフレーム全体よりも小さい検索ウィンドウを使用する場合にオブジェクトを検出するのにかなり少ないため、画像検出プロセスに使用されます。この技術は、トリミングプロセスでウィンドウのさまざまなサイズでテストされました。 MATLABは提案手法の設計とテストに使用された。本論文では, 最大次元の2.16倍のトリミング画像を使用することで, 処理時間が大幅に短縮される一方, 検出成功率が高く, 検出された物体の中心が実物中心に近かったことを明らかにした。

This paper focused on the design of an optimized object tracking technique which would minimize the processing time required in the object detection process while maintaining accuracy in detecting the desired moving object in a cluttered scene. A Kalman filter based cropped image is used for the image detection process as the processing time is significantly less to detect the object when a search window is used that is smaller than the entire video frame. This technique was tested with various sizes of the window in the cropping process. MATLAB was used to design and test the proposed method. This paper found that using a cropped image with 2.16 multiplied by the largest dimension of the object resulted in significantly faster processing time while still providing a high success rate of detection and a detected center of the object that was reasonably close to the actual center.

翻訳日:2021-03-11 15:18:09 公開日:2021-03-07

# 発達期におけるブレッテンベルク車による身体的連続学習

Embodied Continual Learning Across Developmental Time Via Developmental Braitenberg Vehicles ( http://arxiv.org/abs/2103.05753v1 )

ライセンス: Link先を確認

Bradly Alicea, Rishabh Chakrabarty, Akshara Gopi, Anson Lim, and Jesse Parent

(参考訳) 発達生物学、認知科学、計算モデリングの合成を通じて学ぶべきことはたくさんある。この観点から学ぶことができる教訓の1つは、インテリジェントプログラムの初期化は、多数のパラメータの操作にのみ依存できないということです。今後は、Braitenberg Vehicleをベースとした、開発にインスパイアされた学習エージェントの設計を提案する。これらのエージェントを人工体型知能の例に用い,認知発達能力の構成要素としての体型経験と形態形成成長のモデル化に近づいた。成人の表現型の発生と発達経路の同時性に影響を与える生物学的・認知的発達に関する諸要因を考察する。これらのメカニズムは、シフト重みと適応的ネットワークトポロジーによる創発的接続を生み出し、ニューラルネットワークのトレーニングにおける発達過程の重要性を示す。このアプローチは、重要な期間や成長と獲得を活用し、明示的に具体化されたネットワークアーキテクチャを活用し、ニューラルネットワークの組み立てとこれらのネットワークでのアクティブラーニングを区別することで、開発アプローチから生じる適応エージェントの振る舞いの青写真を提供する。

There is much to learn through synthesis of Developmental Biology, Cognitive Science and Computational Modeling. One lesson we can learn from this perspective is that the initialization of intelligent programs cannot solely rely on manipulation of numerous parameters. Our path forward is to present a design for developmentally-inspired learning agents based on the Braitenberg Vehicle. Using these agents to exemplify artificial embodied intelligence, we move closer to modeling embodied experience and morphogenetic growth as components of cognitive developmental capacity. We consider various factors regarding biological and cognitive development which influence the generation of adult phenotypes and the contingency of available developmental pathways. These mechanisms produce emergent connectivity with shifting weights and adaptive network topography, thus illustrating the importance of developmental processes in training neural networks. This approach provides a blueprint for adaptive agent behavior that might result from a developmental approach: namely by exploiting critical periods or growth and acquisition, an explicitly embodied network architecture, and a distinction between the assembly of neural networks and active learning on these networks.

翻訳日:2021-03-11 14:46:44 公開日:2021-03-07

# (参考訳) 深層学習に基づく小型データセットの超解像蛍光顕微鏡

Deep learning-based super-resolution fluorescence microscopy on small datasets ( http://arxiv.org/abs/2103.04989v1 )

ライセンス: CC BY 4.0

Varun Mannam, Yide Zhang, Xiaotong Yuan, and Scott Howard

(参考訳) 蛍光顕微鏡は、生物をマイクロメートルスケールの解像度で可視化することで、現代の生物学における劇的な発展を可能にした。しかし、回折限界のため、サブミクロン/ナノメータの特徴は解決しにくい。ナノメートルの解像度を達成するために様々な超解像技術が開発されているが、高価な光学的セットアップや特殊なフルオロフォを必要とすることが多い。近年、深層学習は、回折制限画像から技術的障壁を減らし、超解像を得る可能性を示している。正確な結果を得るためには、従来のディープラーニング技術はトレーニングデータセットとして数千の画像を必要とする。生物サンプルから大規模なデータセットを得ることは、フルオロフォアのフォトブレッシング、光毒性、生体内で起こる動的プロセスなどによっては実現できないことが多い。したがって、小さなデータセットを用いたディープラーニングベースの超解像の実現は困難である。この制限を、小さなデータセットでうまくトレーニングされ、超高解像度画像を実現する新しい畳み込みニューラルネットワークベースのアプローチで解決します。トレーニングデータセットとして15の異なるフィールドオブビューから合計750枚の画像をキャプチャし,そのテクニックを実証した。各FOVでは、超解像ラジアルゆらぎ法を用いて単一のターゲット画像を生成する。予想通り、この小さなデータセットは、従来の超高解像度アーキテクチャを使用して使用可能なモデルを生成できなかった。しかし、新しいアプローチを使用すると、ネットワークを訓練して、この小さなデータセットから超高解像度の画像を達成できます。このディープラーニングモデルは、大規模なトレーニングデータセットの取得が困難なMRIやX線イメージングなどの他のバイオメディカルイメージングモードに適用できます。

Fluorescence microscopy has enabled a dramatic development in modern biology by visualizing biological organisms with micrometer scale resolution. However, due to the diffraction limit, sub-micron/nanometer features are difficult to resolve. While various super-resolution techniques are developed to achieve nanometer-scale resolution, they often either require expensive optical setup or specialized fluorophores. In recent years, deep learning has shown the potentials to reduce the technical barrier and obtain super-resolution from diffraction-limited images. For accurate results, conventional deep learning techniques require thousands of images as a training dataset. Obtaining large datasets from biological samples is not often feasible due to the photobleaching of fluorophores, phototoxicity, and dynamic processes occurring within the organism. Therefore, achieving deep learning-based super-resolution using small datasets is challenging. We address this limitation with a new convolutional neural network-based approach that is successfully trained with small datasets and achieves super-resolution images. We captured 750 images in total from 15 different field-of-views as the training dataset to demonstrate the technique. In each FOV, a single target image is generated using the super-resolution radial fluctuation method. As expected, this small dataset failed to produce a usable model using traditional super-resolution architecture. However, using the new approach, a network can be trained to achieve super-resolution images from this small dataset. This deep learning model can be applied to other biomedical imaging modalities such as MRI and X-ray imaging, where obtaining large training datasets is challenging.

翻訳日:2021-03-11 11:43:12 公開日:2021-03-07

# (参考訳) 蛍光ライフタイムイメージング顕微鏡(FLIM)における畳み込みニューラルネットワーク

Convolutional Neural Network Denoising in Fluorescence Lifetime Imaging Microscopy (FLIM) ( http://arxiv.org/abs/2103.05448v1 )

ライセンス: CC BY 4.0

Varun Mannam, Yide Zhang, Xiaotong Yuan, Takashi Hato, Pierre C. Dagher, Evan L. Nichols, Cody J. Smith, Kenneth W. Dunn, and Scott Howard

(参考訳) 蛍光寿命イメージング顕微鏡(FLIM)システムは、その遅い処理速度、低信号対雑音比(SNR)、および高価で困難なハードウェアセットアップによって制限されています。そこで本研究では,FLIM SNRを改善するために畳み込み畳み込みネットワークを適用した。ネットワークは、アナログ信号処理に基づく高速なデータ取得、高効率パルス変調を用いた高SNR、オフザシェルフ無線周波数成分を用いたコスト効率実装を備えたインスタントFLIMシステムと統合される。我々のインスタントFLIMシステムは同時に、強度、寿命、薬理プロット \textit{in vivo} と \textit{ex vivo} を提供する。 FLIMデータに訓練されたディープラーニングモデルを用いて画像の復調を統合することにより、正確なFLIMファサー計測が得られる。 K平均クラスタリングセグメンテーション(K-means clustering segmentation)法は、異なる蛍光体を正確に分離する、偏見のない教師なしの機械学習技術である。マウスの腎臓実験では, セグメント化前に深層学習画像の認知モデルを導入することで, 既存の方法と比較して, ファーザーのノイズを効果的に除去し, より明瞭なセグメントを提供することが示された。そこで,提案する深層学習に基づくワークフローは,インスタントflimを用いた蛍光画像の自動セグメンテーションを高速かつ高精度に実現する。 FLIM測定がノイズの多い場合, 除音操作はセグメンテーションに有効である。クラスタリングは、バイオメディカルイメージングアプリケーションに関心のある生物学的構造の検出を効果的に強化できます。

Fluorescence lifetime imaging microscopy (FLIM) systems are limited by their slow processing speed, low signal-to-noise ratio (SNR), and expensive and challenging hardware setups. In this work, we demonstrate applying a denoising convolutional network to improve FLIM SNR. The network will be integrated with an instant FLIM system with fast data acquisition based on analog signal processing, high SNR using high-efficiency pulse-modulation, and cost-effective implementation utilizing off-the-shelf radio-frequency components. Our instant FLIM system simultaneously provides the intensity, lifetime, and phasor plots \textit{in vivo} and \textit{ex vivo}. By integrating image denoising using the trained deep learning model on the FLIM data, provide accurate FLIM phasor measurements are obtained. The enhanced phasor is then passed through the K-means clustering segmentation method, an unbiased and unsupervised machine learning technique to separate different fluorophores accurately. Our experimental \textit{in vivo} mouse kidney results indicate that introducing the deep learning image denoising model before the segmentation effectively removes the noise in the phasor compared to existing methods and provides clearer segments. Hence, the proposed deep learning-based workflow provides fast and accurate automatic segmentation of fluorescence images using instant FLIM. The denoising operation is effective for the segmentation if the FLIM measurements are noisy. The clustering can effectively enhance the detection of biological structures of interest in biomedical imaging applications.

翻訳日:2021-03-11 08:43:08 公開日:2021-03-07

# (参考訳) オランダ考古学領域におけるオンラインプロフェッショナル検索の有用性評価

Usability Evaluation for Online Professional Search in the Dutch Archaeology Domain ( http://arxiv.org/abs/2103.04437v1 )

ライセンス: CC BY 4.0

Alex Brandsen, Suzan Verberne, Karsten Lambers, Milco Wansleeben

(参考訳) 本稿では,これらの長文考古学文献のフルテキスト検索を可能にする,最初の考古学グレイ文学情報検索システムagnesについて述べる。この検索システムは、考古学の専門家や学者が6万以上のオランダの発掘レポートのコレクションを通じて検索することができるWebインターフェイスを持っています。我々はAGNESの検索インタフェースの評価のために,小規模ながら多様なユーザグループを用いてユーザスタディを行った。評価はスクリーンキャプチャとthink aloudプロトコルによって行われ、ユーザインタフェースフィードバックアンケートが組み合わされた。評価は、制御された使用(事前定義されたタスクの補完)と自由使用(自由選択されたタスクの補完)の両方をカバーした。自由に利用することで、考古学者のニーズや、検索システムとの相互作用を研究することができます。結論として,(1) 考古学者の情報要求は概ねリコール指向であり,回答として項目のリストを必要とすること,(2) ユーザはメタデータフィルタよりも自由テキストクエリの使用を好み,自由テキスト検索システムの価値を確認すること,(3) 多様なユーザグループの編集が,システム改善のフィードバックとして多様な課題の収集に寄与すること,などがあげられる。私たちは現在、AGNESのユーザーインターフェイスを改良し、考古学的実体のための精度を向上させることで、考古学者が研究の質問をより効率的かつ効率的に回答できるようにし、過去をより一貫性のある物語に導きます。

This paper presents AGNES, the first information retrieval system for archaeological grey literature, allowing full-text search of these long archaeological documents. This search system has a web interface that allows archaeology professionals and scholars to search through a collection of over 60,000 Dutch excavation reports, totalling 361 million words. We conducted a user study for the evaluation of AGNES's search interface, with a small but diverse user group. The evaluation was done by screen capturing and a think aloud protocol, combined with a user interface feedback questionnaire. The evaluation covered both controlled use (completion of a pre-defined task) as well as free use (completion of a freely chosen task). The free use allows us to study the information needs of archaeologists, as well as their interactions with the search system. We conclude that: (1) the information needs of archaeologists are typically recall-oriented, often requiring a list of items as answer; (2) the users prefer the use of free-text queries over metadata filters, confirming the value of a free-text search system; (3) the compilation of a diverse user group contributed to the collection of diverse issues as feedback for improving the system. We are currently refining AGNES's user interface and improving its precision for archaeological entities, so that AGNES will help archaeologists to answer their research questions more effectively and efficiently, leading to a more coherent narrative of the past.

翻訳日:2021-03-11 00:17:46 公開日:2021-03-07

# (参考訳) Smooth Stochastic Optimizationのレトロスペクティブ近似

Retrospective Approximation for Smooth Stochastic Optimization ( http://arxiv.org/abs/2103.04392v1 )

ライセンス: CC BY 4.0

David Newton, Raghu Bollapragada, Raghu Pasupathy, Nung Kwan Yip

(参考訳) 確率的一階オラクルを用いてスムーズな(そして潜在的に非凸な)目的を最小化する確率的最適化問題を考察する。このような問題は、シミュレーション最適化からディープラーニングまで、多くの設定で発生します。本論文では,各イテレーションで$k$のサンプルパス近似問題を適応したサンプルサイズ$M_k$を用いて暗黙的に生成し,行探索準ニュートン法のような「決定論的方法」を用いて,適応した誤差許容度$\epsilon_k$に(事前の解法で)解決する,普遍的な逐次サンプル平均近似(SAA)パラダイムとして,Retrospective Approximation (RA) を提案する。 RAの主な利点は、最適化を確率的近似から切り離すことであり、既存の決定論的アルゴリズムを修正なしで直接採用できるため、確率的コンテキストのためのアルゴリズムを再設計する必要性が軽減される。 2つめの利点は、RAが並列化に寄与する明らかな方法である。 m_k, k \geq 1\}$ および $\{\epsilon_k, k\geq 1\}$ の条件を特定し、ほぼ確実に $l_1$-norm での収束と収束を保証し、最適なイテレーションと作業の複雑さ率を提供する。線形探索準ニュートンを用いたRAの性能について,未条件の最小二乗問題と深部畳み込みニューラルネットを用いた画像分類問題について述べる。

We consider stochastic optimization problems where a smooth (and potentially nonconvex) objective is to be minimized using a stochastic first-order oracle. These type of problems arise in many settings from simulation optimization to deep learning. We present Retrospective Approximation (RA) as a universal sequential sample-average approximation (SAA) paradigm where during each iteration $k$, a sample-path approximation problem is implicitly generated using an adapted sample size $M_k$, and solved (with prior solutions as "warm start") to an adapted error tolerance $\epsilon_k$, using a "deterministic method" such as the line search quasi-Newton method. The principal advantage of RA is that decouples optimization from stochastic approximation, allowing the direct adoption of existing deterministic algorithms without modification, thus mitigating the need to redesign algorithms for the stochastic context. A second advantage is the obvious manner in which RA lends itself to parallelization. We identify conditions on $\{M_k, k \geq 1\}$ and $\{\epsilon_k, k\geq 1\}$ that ensure almost sure convergence and convergence in $L_1$-norm, along with optimal iteration and work complexity rates. We illustrate the performance of RA with line-search quasi-Newton on an ill-conditioned least squares problem, as well as an image classification problem using a deep convolutional neural net.

翻訳日:2021-03-11 00:00:00 公開日:2021-03-07

# (参考訳) グラフベースピラミッドグローバルコンテキスト推論によるCOVID-19肺感染症セグメンテーションの検討

Graph-based Pyramid Global Context Reasoning with a Saliency-aware Projection for COVID-19 Lung Infections Segmentation ( http://arxiv.org/abs/2103.04235v1 )

ライセンス: CC BY 4.0

Huimin Huang, Ming Cai, Lanfen Lin, Jing Zheng, Xiongwei Mao, Xiaohan Qian, Zhiyi Peng, Jianying Zhou, Yutaro Iwamoto, Xian-Hua Han, Yen-Wei Chen, Ruofeng Tong

(参考訳) コロナウイルス病2019(COVID-19)は2020年に急速に広まり、CT画像から肺感染症のセグメンテーションに関する大量の研究が浮かび上がっている。この問題には多くの方法が提案されているが、さまざまなサイズの感染が異なるローブゾーンに現れるため、それは困難な課題である。そこで本研究では,不整合感染の長期依存性のモデル化とサイズ変化の適応が可能なグラフベースのPyramid Global Context Reasoning(Graph-PGCR)モジュールを提案する。最初にグラフ畳み込みを組み込んで、複数のローブゾーンから長期のコンテキスト情報を利用する。従来の平均プールや最大オブジェクト確率とは異なり、感染関連ピクセルをグラフノードの集合としてピックアップするサリエンシー対応のプロジェクションメカニズムを提案する。グラフ推論の後、関係認識機能はダウンストリームタスクのために元の座標空間に戻される。さらに,異なるサンプリングレートで複数のグラフをコンストラクトして,サイズ変動問題に対処する。この目的のために、異なるマルチスケールの長距離コンテキストパターンをキャプチャできる。当社のGraph-PGCRモジュールはプラグアンドプレイで、パフォーマンスを向上させるためにあらゆるアーキテクチャに統合できます。実験により、提案手法は、パブリックとプライベートのCOVID-19データセットの両方において、最先端のバックボーンアーキテクチャのパフォーマンスを継続的に向上することを示した。

Coronavirus Disease 2019 (COVID-19) has rapidly spread in 2020, emerging a mass of studies for lung infection segmentation from CT images. Though many methods have been proposed for this issue, it is a challenging task because of infections of various size appearing in different lobe zones. To tackle these issues, we propose a Graph-based Pyramid Global Context Reasoning (Graph-PGCR) module, which is capable of modeling long-range dependencies among disjoint infections as well as adapt size variation. We first incorporate graph convolution to exploit long-term contextual information from multiple lobe zones. Different from previous average pooling or maximum object probability, we propose a saliency-aware projection mechanism to pick up infection-related pixels as a set of graph nodes. After graph reasoning, the relation-aware features are reversed back to the original coordinate space for the down-stream tasks. We further con- struct multiple graphs with different sampling rates to handle the size variation problem. To this end, distinct multi-scale long-range contextual patterns can be captured. Our Graph- PGCR module is plug-and-play, which can be integrated into any architecture to improve its performance. Experiments demonstrated that the proposed method consistently boost the performance of state-of-the-art backbone architectures on both of public and our private COVID-19 datasets.

翻訳日:2021-03-10 23:16:47 公開日:2021-03-07

# (参考訳) ディファレンスを学ぶ: Sim2Real Small Defection Segmentation Network

Learn to Differ: Sim2Real Small Defection Segmentation Network ( http://arxiv.org/abs/2103.04297v1 )

ライセンス: CC BY 4.0

Zexi Chen, Zheyuan Huang, Yunkai Wang, Xuecheng Xu, Yue Wang, Rong Xiong

(参考訳) 深層学習に基づく小さな欠陥セグメント化手法に関する最近の研究は、特定の設定で訓練されており、一定の文脈で制限される傾向にある。トレーニング中、ネットワークは、欠陥を突き止める前に、トレーニングデータの背景の表現を必然的に学習します。コンテキストが変更されると推論段階ではパフォーマンスが低下し、新しい設定ごとにトレーニングすることでのみ解決できる。これは最終的に、コンテキストが変化し続ける実用的ロボットアプリケーションに制限をもたらす。これに対処するために、ネットワークコンテキストをコンテキスト別にトレーニングし、一般化を期待するのではなく、制限されたコンテキストで誤解し、純粋なシミュレーションでトレーニングを開始すべきなのか? 本稿では,コンテキストに関わらず2つの画像間の小さな欠陥を識別する方法を学習するネットワークSSDSを提案する。画像間の位相相関のポーズ感度を利用した小さな欠陥検出層が導入され、その後に異常マスキング層が続く。ネットワークは、単純な形状でランダムに生成されたシミュレーションデータに基づいて訓練され、現実世界で一般化される。最後に、SSDSは実世界の収集されたデータに基づいて検証され、安価なシミュレーションでトレーニングしても、実際の世界で小さな欠陥を見つけ、実用的応用の可能性を示す能力を示す。

Recent studies on deep-learning-based small defection segmentation approaches are trained in specific settings and tend to be limited by fixed context. Throughout the training, the network inevitably learns the representation of the background of the training data before figuring out the defection. They underperform in the inference stage once the context changed and can only be solved by training in every new setting. This eventually leads to the limitation in practical robotic applications where contexts keep varying. To cope with this, instead of training a network context by context and hoping it to generalize, why not stop misleading it with any limited context and start training it with pure simulation? In this paper, we propose the network SSDS that learns a way of distinguishing small defections between two images regardless of the context, so that the network can be trained once for all. A small defection detection layer utilizing the pose sensitivity of phase correlation between images is introduced and is followed by an outlier masking layer. The network is trained on randomly generated simulated data with simple shapes and is generalized across the real world. Finally, SSDS is validated on real-world collected data and demonstrates the ability that even when trained in cheap simulation, SSDS can still find small defections in the real world showing the effectiveness and its potential for practical applications.

翻訳日:2021-03-10 23:07:21 公開日:2021-03-07

# (参考訳) オンライン機械学習手法が電力市場の長期投資決定と発電機利用に与える影響

The impact of online machine-learning methods on long-term investment decisions and generator utilization in electricity markets ( http://arxiv.org/abs/2103.04327v1 )

ライセンス: CC BY 4.0

Alexander J. M. Kell, A. Stephen McGough, Matthew Forshaw

(参考訳) 電力供給は常に需要に合致する必要があります。これにより、負荷周波数制御や停電といった問題が発生する確率を減らすことができる。今後24時間以内に必要となるであろう負荷をよりよく理解するためには、不確実性に基づく推定が必要である。これは、多くのマイクロプロデューサが中央制御下にない分散電力市場では特に困難である。本稿では,11のオフライン学習と5つのオンライン学習アルゴリズムによる次の24時間における電力需要プロファイルの予測について検討する。長期エージェントベースのモデルであるElecSimに統合することで実現します。今後の24時間における電力需要プロファイルの予測を通じて、日頭市場における予測をシミュレートすることができる。これらの予測を行った後、残留分布からサンプルを採取し、シミュレーションであるElecSimを用いて電力市場需要を摂動させる。これにより、分散型電力市場の長期的ダイナミクスに対するエラーの影響を理解することができる。提案手法では,オンラインアルゴリズムを用いて平均絶対誤差を30%削減でき,また,必要となる余裕のある全国的グリッドリザーブを削減できることを示した。この国別埋蔵量の減少は、コストと排出量の節約につながります。また, 予測精度の大きな誤差は, 17年間の時間枠での投資に不均等な誤差があり, 電気の混合も可能であることを示した。

Electricity supply must be matched with demand at all times. This helps reduce the chances of issues such as load frequency control and the chances of electricity blackouts. To gain a better understanding of the load that is likely to be required over the next 24h, estimations under uncertainty are needed. This is especially difficult in a decentralized electricity market with many micro-producers which are not under central control. In this paper, we investigate the impact of eleven offline learning and five online learning algorithms to predict the electricity demand profile over the next 24h. We achieve this through integration within the long-term agent-based model, ElecSim. Through the prediction of electricity demand profile over the next 24h, we can simulate the predictions made for a day-ahead market. Once we have made these predictions, we sample from the residual distributions and perturb the electricity market demand using the simulation, ElecSim. This enables us to understand the impact of errors on the long-term dynamics of a decentralized electricity market. We show we can reduce the mean absolute error by 30% using an online algorithm when compared to the best offline algorithm, whilst reducing the required tendered national grid reserve required. This reduction in national grid reserves leads to savings in costs and emissions. We also show that large errors in prediction accuracy have a disproportionate error on investments made over a 17-year time frame, as well as electricity mix.

翻訳日:2021-03-10 22:19:11 公開日:2021-03-07

# (参考訳) Markov Cricket: 1日の国際クリケットにおけるベッティングパフォーマンスのモデル化、予測、最適化にフォワードと逆強化学習を使う

Markov Cricket: Using Forward and Inverse Reinforcement Learning to Model, Predict And Optimize Batting Performance in One-Day International Cricket ( http://arxiv.org/abs/2103.04349v1 )

ライセンス: CC BY 4.0

Manohar Vohra and George S. D. Gordon

(参考訳) 本稿では,1日の国際クリケット競技をマルコフプロセスとしてモデル化し,フォワードおよびインバース強化学習(rl)を適用し,新たな3つのツールを開発した。まず,モンテカルロ学習をスコアに基づく報酬モデルを用いて,ゲームの各状態に対する値関数の非線形近似に適用する。本手法は,残るスコアリング資源のプロキシとして使用する場合,プロの試合で使用されるダックワース・ルイス・ステルン法を3倍から10倍に上回っている。次に、逆強化学習(特にガイド付きコスト学習の変種)を用いて、エキスパートのパフォーマンスに基づいて報酬の線形モデルを推論し、ここでは勝利チームのプレーシーケンスと仮定する。このモデルから各状態に対する最適ポリシーを明示的に決定し、ゲームに関する一般的な直観と一致することを見つける。最後に、推定報酬モデルを用いて、異なるポリシーの下で最終スコアの後方分布をモデル化するゲームシミュレータを構築する。予測とシミュレーションのテクニックは中断されたゲームの最終スコアを推定するためのより公平な代替手段となり得るが、推定された報酬モデルはプロのゲームがプレイ戦略を最適化するための有用な洞察を提供するかもしれない。さらに,この競技にRLを適用する方法が,野球や球技など,チームが交互にプレーする個別の状態のスポーツに広く適用される可能性があることを期待する。

In this paper, we model one-day international cricket games as Markov processes, applying forward and inverse Reinforcement Learning (RL) to develop three novel tools for the game. First, we apply Monte-Carlo learning to fit a nonlinear approximation of the value function for each state of the game using a score-based reward model. We show that, when used as a proxy for remaining scoring resources, this approach outperforms the state-of-the-art Duckworth-Lewis-Stern method used in professional matches by 3 to 10 fold. Next, we use inverse reinforcement learning, specifically a variant of guided-cost learning, to infer a linear model of rewards based on expert performances, assumed here to be play sequences of winning teams. From this model we explicitly determine the optimal policy for each state and find this agrees with common intuitions about the game. Finally, we use the inferred reward models to construct a game simulator that models the posterior distribution of final scores under different policies. We envisage our prediction and simulation techniques may provide a fairer alternative for estimating final scores in interrupted games, while the inferred reward model may provide useful insights for the professional game to optimize playing strategy. Further, we anticipate our method of applying RL to this game may have broader application to other sports with discrete states of play where teams take turns, such as baseball and rounders.

翻訳日:2021-03-10 19:49:39 公開日:2021-03-07

# (参考訳) コード埋め込みを用いた半自動誤解発見に向けて

Toward Semi-Automatic Misconception Discovery Using Code Embeddings ( http://arxiv.org/abs/2103.04448v1 )

ライセンス: CC BY 4.0

Yang Shi, Krupal Shah, Wengran Wang, Samiha Marwan, Poorvaja Penmetsa and Thomas W. Price

(参考訳) 生徒の誤解を理解することは効果的な指導と評価に重要である。しかし、そのような誤解を手動で発見することは時間と労力を要する。自動誤解発見(automated misconception discovery)は、学生データのパターンを強調することで、これらの課題に対処することができる。本研究では,現状のコード分類モデルを用いて,コンピュータコースにおける生徒のプログラムコードから問題固有の誤解を半自動で発見する手法を提案する。ブロックベースのプログラミングデータセットでモデルをトレーニングし、学習した埋め込みをクラスタの不正な学生の応募に使用しました。これらのクラスターは問題に関する特定の誤解に対応しており、既存のアプローチでは容易には発見できなかった。また、私たちのアプローチの潜在的な応用と、これらの誤解が学生の学習プロセスにドメイン固有の洞察をどう伝えるかについて議論します。

Understanding students' misconceptions is important for effective teaching and assessment. However, discovering such misconceptions manually can be time-consuming and laborious. Automated misconception discovery can address these challenges by highlighting patterns in student data, which domain experts can then inspect to identify misconceptions. In this work, we present a novel method for the semi-automated discovery of problem-specific misconceptions from students' program code in computing courses, using a state-of-the-art code classification model. We trained the model on a block-based programming dataset and used the learned embedding to cluster incorrect student submissions. We found these clusters correspond to specific misconceptions about the problem and would not have been easily discovered with existing approaches. We also discuss potential applications of our approach and how these misconceptions inform domain-specific insights into students' learning processes.

翻訳日:2021-03-10 19:35:45 公開日:2021-03-07

# (参考訳) 深層学習層のスペクトルテンソルトレインパラメータ化

Spectral Tensor Train Parameterization of Deep Learning Layers ( http://arxiv.org/abs/2103.04217v1 )

ライセンス: CC BY-SA 4.0

Anton Obukhov, Maxim Rakhuba, Alexander Liniger, Zhiwu Huang, Stamatios Georgoulis, Dengxin Dai, Luc Van Gool

(参考訳) 重み行列の低ランクパラメータ化をDeep Learningコンテキストに埋め込まれたスペクトル特性を用いて検討する。低ランク特性はパラメータ効率をもたらし、マッピングを計算する際に計算ショートカットを行うことができる。スペクトル特性はしばしば最適化問題に制約を受け、より良いモデルと最適化の安定性をもたらす。まず、重み行列のコンパクトなSVDパラメータ化とパラメータ化における冗長性源の同定から始める。さらに, テンソルトレイン(TT)分解をコンパクトなSVD成分に適用し, スペクトルテンソルトレインパラメータ化(STTP)と呼ばれる固定されたTTランクテンソル多様体の非冗長微分パラメータ化を提案する。画像分類設定におけるニューラルネットワーク圧縮の効果と,生成敵対的トレーニング設定における圧縮とトレーニング安定性の改善を実証する。

We study low-rank parameterizations of weight matrices with embedded spectral properties in the Deep Learning context. The low-rank property leads to parameter efficiency and permits taking computational shortcuts when computing mappings. Spectral properties are often subject to constraints in optimization problems, leading to better models and stability of optimization. We start by looking at the compact SVD parameterization of weight matrices and identifying redundancy sources in the parameterization. We further apply the Tensor Train (TT) decomposition to the compact SVD components, and propose a non-redundant differentiable parameterization of fixed TT-rank tensor manifolds, termed the Spectral Tensor Train Parameterization (STTP). We demonstrate the effects of neural network compression in the image classification setting and both compression and improved training stability in the generative adversarial training setting.

翻訳日:2021-03-10 17:48:26 公開日:2021-03-07

# 動的プログラミングを伴わないcnn音声単語検出と局所化

CNN-based Spoken Term Detection and Localization without Dynamic Programming ( http://arxiv.org/abs/2103.05468v1 )

ライセンス: Link先を確認

Tzeviya Sylvia Fuchs, Yael Segal and Joseph Keshet

(参考訳) 本稿では,音声セグメント内の語彙内および語彙外用語の同時予測と局所化のための音声項検出アルゴリズムを提案する。提案アルゴリズムは、音声信号の様々な部分の単語埋め込みを予測し、所望の単語埋め込みと比較することにより、ある単語が所定の音声信号内に発声されたか否かを推定する。このアルゴリズムはこのタスクに既存の埋め込みスペースを利用し、タスク固有の埋め込みスペースをトレーニングする必要がない。推定では、アルゴリズムはターゲット項のすべての可能な位置を同時に予測し、最適な検索のために動的プログラミングを必要としません。読み上げ音声コーポラにおける複数の音声単語検出タスクのシステム評価を行った。

In this paper, we propose a spoken term detection algorithm for simultaneous prediction and localization of in-vocabulary and out-of-vocabulary terms within an audio segment. The proposed algorithm infers whether a term was uttered within a given speech signal or not by predicting the word embeddings of various parts of the speech signal and comparing them to the word embedding of the desired term. The algorithm utilizes an existing embedding space for this task and does not need to train a task-specific embedding space. At inference the algorithm simultaneously predicts all possible locations of the target term and does not need dynamic programming for optimal search. We evaluate our system on several spoken term detection tasks on read speech corpora.

翻訳日:2021-03-10 14:42:32 公開日:2021-03-07

# (参考訳) ARVo:ビデオデブリのための全行ボリューム対応学習

ARVo: Learning All-Range Volumetric Correspondence for Video Deblurring ( http://arxiv.org/abs/2103.04260v1 )

ライセンス: CC BY 4.0

Dongxu Li, Chenchen Xu, Kaihao Zhang, Xin Yu, Yiran Zhong, Wenqi Ren, Hanna Suominen, Hongdong Li

(参考訳) ビデオデブラリングモデルは連続フレームを利用して、カメラの揺動や物体の動きからぼやけを取り除く。隣接するシャープパッチを利用するために、典型的な手法は主にホモグラフィや光学フローに依存し、隣接するぼやけたフレームを空間的に整列させる。しかし、そのような明示的なアプローチは、大きなピクセル変位を持つ高速な動きの存在において効果が低い。本研究では,特徴空間におけるぼやけたフレーム間の空間対応を学習する新しい暗黙的手法を提案する。遠方画素対応を構築するために, 隣接フレーム間のすべての画素対間の相関体積ピラミッドを構築する。参照フレームの特徴を高めるために,ボリュームピラミッドに基づいて,近傍とのピクセルペア相関を最大化する相関アグリゲーションモジュールを設計した。最後に,集約された特徴を復元モジュールに供給し,復元されたフレームを得る。我々は,モデルを漸進的に最適化するための生成的逆パラダイムを設計する。提案手法は,ビデオデブロアリング用高フレームレート(1000fps)データセット(HFR-DVD)とともに,広く採用されているDVDデータセットを用いて評価する。定量的および定性的な実験は、従来の最先端の手法に対する両方のデータセットで好適に動作し、ビデオデブレーションのための全範囲空間対応のモデリングの利点を確認することを示しています。

Video deblurring models exploit consecutive frames to remove blurs from camera shakes and object motions. In order to utilize neighboring sharp patches, typical methods rely mainly on homography or optical flows to spatially align neighboring blurry frames. However, such explicit approaches are less effective in the presence of fast motions with large pixel displacements. In this work, we propose a novel implicit method to learn spatial correspondence among blurry frames in the feature space. To construct distant pixel correspondences, our model builds a correlation volume pyramid among all the pixel-pairs between neighboring frames. To enhance the features of the reference frame, we design a correlative aggregation module that maximizes the pixel-pair correlations with its neighbors based on the volume pyramid. Finally, we feed the aggregated features into a reconstruction module to obtain the restored frame. We design a generative adversarial paradigm to optimize the model progressively. Our proposed method is evaluated on the widely-adopted DVD dataset, along with a newly collected High-Frame-Rate (1000 fps) Dataset for Video Deblurring (HFR-DVD). Quantitative and qualitative experiments show that our model performs favorably on both datasets against previous state-of-the-art methods, confirming the benefit of modeling all-range spatial correspondence for video deblurring.

翻訳日:2021-03-10 14:33:21 公開日:2021-03-07

# (参考訳) 局所単語統計は超越性によらず読解時間に影響する

Local word statistics affect reading times independently of surprisal ( http://arxiv.org/abs/2103.04469v1 )

ライセンス: CC BY 4.0

Adam Goodkind and Klinton Bicknell

(参考訳) 代用的理論は、文処理における多くの現象を理解するための統一的な枠組み(hale, 2001; levy, 2008a)を提供し、全ての事前文脈で与えられた単語の条件付き確率が処理の困難を完全に決定することを示した。この主張の問題点として、条件付き確率が一定である場合でも、ある局所統計的単語頻度も処理に影響を与えることが示されている。ここでは、他のローカル統計が処理に役割を持つか、単語頻度が特別な場合であるかどうかを尋ねます。我々は,より複雑な局所統計量であるbigram と trigram の確率が,超越性とは独立に処理に影響を与えることを示す最初の明確な証拠を示す。これらの結果は、処理における局所統計の重要かつ独立した役割を示唆している。さらに、地域統計情報に大きすぎる効果がある理由を説明できるような仮説の新たな一般化の研究を動機付けている。

Surprisal theory has provided a unifying framework for understanding many phenomena in sentence processing (Hale, 2001; Levy, 2008a), positing that a word's conditional probability given all prior context fully determines processing difficulty. Problematically for this claim, one local statistic, word frequency, has also been shown to affect processing, even when conditional probability given context is held constant. Here, we ask whether other local statistics have a role in processing, or whether word frequency is a special case. We present the first clear evidence that more complex local statistics, word bigram and trigram probability, also affect processing independently of surprisal. These findings suggest a significant and independent role of local statistics in processing. Further, it motivates research into new generalizations of surprisal that can also explain why local statistical information should have an outsized effect.

翻訳日:2021-03-10 14:06:26 公開日:2021-03-07

# (参考訳) 説明可能な人工知能における反事実と原因:理論、アルゴリズム、応用

Counterfactuals and Causability in Explainable Artificial Intelligence: Theory, Algorithms, and Applications ( http://arxiv.org/abs/2103.04244v1 )

ライセンス: CC BY 4.0

Yu-Liang Chou and Catarina Moreira and Peter Bruza and Chun Ouyang and Joaquim Jorge

(参考訳) ディープラーニングモデルをより透明で説明しやすいものにする、モデルに依存しない方法への関心が高まっている。一部の研究者は、機械がある程度の人間レベルの説明可能性を達成するためには、この機械は人間の因果的理解可能な説明を提供する必要があると主張した。可利用性を提供する可能性のある特定のアルゴリズムのクラスは偽物である。本稿では,多種多様な文献を体系的に検証し,その事実と説明可能な人工知能の因果性について述べる。 PRISMAフレームワークの下でLDAトピックモデリング解析を行い、最も関連性の高い文献記事を見つけました。この分析の結果、調査されたアルゴリズムの接地理論と、その基礎となる特性と実世界データへの応用を考える新しい分類法が導かれた。この研究は、現在のAIのモデル非依存の反ファクトアルゴリズムは因果論的形式主義に基づいておらず、したがって人間の意思決定者への因果性を促進することができないことを示唆している。本研究では, 文献における主要なアルゴリズムから得られた説明は, 因果関係ではなく, 散発的な相関関係を提供し, 副最適, 誤った, あるいは偏見のある説明につながることを示唆した。本稿では,人工知能のモデル非依存的アプローチにおける可利用性向上のための新たな方向性と課題について述べる。

There has been a growing interest in model-agnostic methods that can make deep learning models more transparent and explainable to a user. Some researchers recently argued that for a machine to achieve a certain degree of human-level explainability, this machine needs to provide human causally understandable explanations, also known as causability. A specific class of algorithms that have the potential to provide causability are counterfactuals. This paper presents an in-depth systematic review of the diverse existing body of literature on counterfactuals and causability for explainable artificial intelligence. We performed an LDA topic modelling analysis under a PRISMA framework to find the most relevant literature articles. This analysis resulted in a novel taxonomy that considers the grounding theories of the surveyed algorithms, together with their underlying properties and applications in real-world data. This research suggests that current model-agnostic counterfactual algorithms for explainable AI are not grounded on a causal theoretical formalism and, consequently, cannot promote causability to a human decision-maker. Our findings suggest that the explanations derived from major algorithms in the literature provide spurious correlations rather than cause/effects relationships, leading to sub-optimal, erroneous or even biased explanations. This paper also advances the literature with new directions and challenges on promoting causability in model-agnostic approaches for explainable artificial intelligence.

翻訳日:2021-03-10 12:45:26 公開日:2021-03-07

# (参考訳) グラフデータ補完のための畳み込みグラフテンソルネット

Convolutional Graph-Tensor Net for Graph Data Completion ( http://arxiv.org/abs/2103.04485v1 )

ライセンス: CC BY 4.0

Xiao-Yang Liu, Ming Zhu

(参考訳) グラフデータ補完は、一般的には、ソーシャルネットワーク、レコメンデーションシステム、モノのインターネットといったグラフ構造を持つため、基本的に重要な問題である。我々は,各ノードがデータ行列を持つグラフを,データ行列を3次元に積み重ねることにより,「textit{graph-tensor}」として表現する。本稿では,ディープニューラルネットワークを用いてグラフテンソルの一般変換を学習するグラフデータ補完問題に対して, \textit{Convolutional Graph-Tensor Net} (\textit{Conv GT-Net})を提案する。実験の結果、提案された \textit{Conv GT-Net} は、既存のアルゴリズムに対する完成精度 (50\% 高い) と完成速度 (3.6x $\sim$ 8.1x 速い) の両方において有意な改善を達成できることが示された。

Graph data completion is a fundamentally important issue as data generally has a graph structure, e.g., social networks, recommendation systems, and the Internet of Things. We consider a graph where each node has a data matrix, represented as a \textit{graph-tensor} by stacking the data matrices in the third dimension. In this paper, we propose a \textit{Convolutional Graph-Tensor Net} (\textit{Conv GT-Net}) for the graph data completion problem, which uses deep neural networks to learn the general transform of graph-tensors. The experimental results on the ego-Facebook data sets show that the proposed \textit{Conv GT-Net} achieves significant improvements on both completion accuracy (50\% higher) and completion speed (3.6x $\sim$ 8.1x faster) over the existing algorithms.

翻訳日:2021-03-10 12:44:15 公開日:2021-03-07

# (参考訳) 衝突層除去によるディープニューラルネットワークの自動チューニング

Auto-tuning of Deep Neural Networks by Conflicting Layer Removal ( http://arxiv.org/abs/2103.04331v1 )

ライセンス: CC BY 4.0

David Peer, Sebastian Stabinger, Antonio Rodriguez-Sanchez

(参考訳) ニューラルネットワークアーキテクチャを設計することは難しい作業であり、どのモデルの特定の層をパフォーマンスを改善するために適応しなければならないかを知ることは、ほぼ謎である。本稿では,学習モデルのテスト精度を低下させる層を識別する新しい手法を提案する。矛盾する層は、トレーニングの開始時に早期に検出される。最悪のシナリオでは、そのような層がまったく訓練できないネットワークにつながる可能性があることを証明します。理論的分析は、ネットワーク全体のパフォーマンスが低下するこれらの層の起源について提供され、これは広範な実証的評価によって補完されます。より正確には、競合するトレーニングバンドルを生成するため、パフォーマンスを悪化させるレイヤを特定しました。トレーニングされた残存ネットワークのレイヤの約60%が、テストエラーの有意な増加なしに、アーキテクチャから完全に削除できることを示します。さらに、トレーニングの開始時に相反する層を識別する新しいニューラルアーキテクチャサーチ(NAS)アルゴリズムを紹介します。自動チューニングアルゴリズムが検出したアーキテクチャは、より複雑な最先端アーキテクチャと比較すると、競合精度が向上する一方で、異なるコンピュータビジョンタスクのメモリ消費と推論時間を劇的に削減する。ソースコードはhttps://github.com/peerdavid/conflicting-bundlesで入手できる。

Designing neural network architectures is a challenging task and knowing which specific layers of a model must be adapted to improve the performance is almost a mystery. In this paper, we introduce a novel methodology to identify layers that decrease the test accuracy of trained models. Conflicting layers are detected as early as the beginning of training. In the worst-case scenario, we prove that such a layer could lead to a network that cannot be trained at all. A theoretical analysis is provided on what is the origin of those layers that result in a lower overall network performance, which is complemented by our extensive empirical evaluation. More precisely, we identified those layers that worsen the performance because they would produce what we name conflicting training bundles. We will show that around 60% of the layers of trained residual networks can be completely removed from the architecture with no significant increase in the test-error. We will further present a novel neural-architecture-search (NAS) algorithm that identifies conflicting layers at the beginning of the training. Architectures found by our auto-tuning algorithm achieve competitive accuracy values when compared against more complex state-of-the-art architectures, while drastically reducing memory consumption and inference time for different computer vision tasks. The source code is available on https://github.com/peerdavid/conflicting-bundles

翻訳日:2021-03-10 09:20:01 公開日:2021-03-07

# (参考訳) 特徴履歴による効率的なモデル性能推定

Efficient Model Performance Estimation via Feature Histories ( http://arxiv.org/abs/2103.04450v1 )

ライセンス: CC BY 4.0

Shengcao Cao, Xiaofang Wang, Kris Kitani

(参考訳) ハイパーパラメータ最適化(HPO)やニューラルアーキテクチャサーチ(NAS)といったニューラルネットワーク設計の課題の重要なステップは、候補モデルの性能を評価することである。一定の計算リソースがあれば、各モデルにより多くの時間を費やして最終的なパフォーマンスの正確な見積もりを得るか、設定スペースでより多様なモデルを探索するより多くの時間を費やすことができます。本研究では、トレーニングプロセスの早期にモデルの最大性能を正確に近似することにより、画像分類のためのHPOとNASの文脈でこの探索-探索トレードオフを最適化することを目指しています。特定の検索空間向けにカスタマイズされた最近の高速化NAS手法とは対照的に、例えば、検索空間の微分が要求される場合、我々の手法は柔軟であり、検索空間にほとんど制約を課さない。本手法は,訓練の初期段階におけるネットワークの特徴の進化履歴を用いて,検討中のネットワークのピーク性能に一致するプロキシ分類器を構築する。本手法は複数の探索アルゴリズムと組み合わせ、HPOやNASの幅広いタスクに対するより良いソリューションを見つけることができることを示す。サンプリングに基づく検索アルゴリズムと並列計算を用いて,dartよりも優れたアーキテクチャを探索し,壁時間探索時間の80%削減を実現する。

An important step in the task of neural network design, such as hyper-parameter optimization (HPO) or neural architecture search (NAS), is the evaluation of a candidate model's performance. Given fixed computational resources, one can either invest more time training each model to obtain more accurate estimates of final performance, or spend more time exploring a greater variety of models in the configuration space. In this work, we aim to optimize this exploration-exploitation trade-off in the context of HPO and NAS for image classification by accurately approximating a model's maximal performance early in the training process. In contrast to recent accelerated NAS methods customized for certain search spaces, e.g., requiring the search space to be differentiable, our method is flexible and imposes almost no constraints on the search space. Our method uses the evolution history of features of a network during the early stages of training to build a proxy classifier that matches the peak performance of the network under consideration. We show that our method can be combined with multiple search algorithms to find better solutions to a wide range of tasks in HPO and NAS. Using a sampling-based search algorithm and parallel computing, our method can find an architecture which is better than DARTS and with an 80% reduction in wall-clock search time.

翻訳日:2021-03-10 08:59:08 公開日:2021-03-07

# (参考訳) 対人学習による公正度の推定と改善

Estimating and Improving Fairness with Adversarial Learning ( http://arxiv.org/abs/2103.04243v1 )

ライセンス: CC BY 4.0

Xiaoxiao Li, Ziteng Cui, Yifan Wu, Li Gu, Tatsuya Harada

(参考訳) 医療における信頼される人工知能(AI)には、公平性と説明責任が不可欠です。しかし、既存のAIモデルは決定マーキングに偏る可能性があります。そこで本研究では,深層学習に基づく医用画像解析システムにおいて,バイアスの軽減と検出を同時に行うマルチタスク学習戦略を提案する。具体的には,バイアスに対する識別モジュールと,ベース分類モデルにおける不公平性を予測するクリティカルモジュールを追加することを提案する。さらに、トレーニング中に2つのモジュールが独立するように直交正規化を強制します。したがって、これらの深層学習タスクを互いに区別し、多様体上の特異点に分解することを避けることができる。この敵対的なトレーニング方法を通じて、性別や肌のトーンなどの属性のためにバイアスに対して脆弱である恵まれないグループからのデータは、これらの属性に対して中立なドメインに転送されます。さらに、クリティカルモジュールは、未知の敏感な属性を持つデータの公平度スコアを予測できます。各種フェアネス評価指標に基づいて, 大規模皮膚病変データセット上での枠組みの評価を行った。本実験は,深層学習に基づく医用画像解析システムにおいて,フェアネスを推定・改善するための提案手法の有効性を示すものである。

Fairness and accountability are two essential pillars for trustworthy Artificial Intelligence (AI) in healthcare. However, the existing AI model may be biased in its decision marking. To tackle this issue, we propose an adversarial multi-task training strategy to simultaneously mitigate and detect bias in the deep learning-based medical image analysis system. Specifically, we propose to add a discrimination module against bias and a critical module that predicts unfairness within the base classification model. We further impose an orthogonality regularization to force the two modules to be independent during training. Hence, we can keep these deep learning tasks distinct from one another, and avoid collapsing them into a singular point on the manifold. Through this adversarial training method, the data from the underprivileged group, which is vulnerable to bias because of attributes such as sex and skin tone, are transferred into a domain that is neutral relative to these attributes. Furthermore, the critical module can predict fairness scores for the data with unknown sensitive attributes. We evaluate our framework on a large-scale public-available skin lesion dataset under various fairness evaluation metrics. The experiments demonstrate the effectiveness of our proposed method for estimating and improving fairness in the deep learning-based medical image analysis system.

翻訳日:2021-03-10 08:05:07 公開日:2021-03-07

# (参考訳) 離散構成エネルギーネットワークを用いたRNA代替スプライシング予測

RNA Alternative Splicing Prediction with Discrete Compositional Energy Network ( http://arxiv.org/abs/2103.04246v1 )

ライセンス: CC BY 4.0

Alvin Chan, Anna Korsakova, Yew-Soon Ong, Fernaldo Richtia Winnerdy, Kah Wai Lim, Anh Tuan Phan

(参考訳) 単一の遺伝子は、代替スプライシングと呼ばれるプロセスを通じて、異なるタンパク質バージョンをコードすることができる。タンパク質は細胞機能において主要な役割を果たすため、異常なスプライシングプロファイルはがんを含む様々な疾患を引き起こす可能性がある。代替スプライシングは、遺伝子の一次配列およびRNA結合タンパク質レベルなどの他の調節因子によって決定される。これを入力として、RNAスプライシングの予測を回帰タスクとして定式化し、学習モデルをベンチマークするための新しいトレーニングデータセット(CAPD)を構築します。本研究では,スプライスサイト,接合部,転写部間の階層的関係を利用した離散構成エネルギーネットワーク(DCEN)を提案する。代替スプライシング予測の場合、DCENはその構成スプライス接合のエネルギー値を通じてmRNA転写確率をモデル化する。これらの転写確率はその後、キーヌクレオチドの相対的存在量値にマッピングされ、基礎実験によって訓練される。 CAPDの実験を通じて、DCENがベースラインとアブレーションバリアントを上回っていることを示します。

A single gene can encode for different protein versions through a process called alternative splicing. Since proteins play major roles in cellular functions, aberrant splicing profiles can result in a variety of diseases, including cancers. Alternative splicing is determined by the gene's primary sequence and other regulatory factors such as RNA-binding protein levels. With these as input, we formulate the prediction of RNA splicing as a regression task and build a new training dataset (CAPD) to benchmark learned models. We propose discrete compositional energy network (DCEN) which leverages the hierarchical relationships between splice sites, junctions and transcripts to approach this task. In the case of alternative splicing prediction, DCEN models mRNA transcript probabilities through its constituent splice junctions' energy values. These transcript probabilities are subsequently mapped to relative abundance values of key nucleotides and trained with ground-truth experimental measurements. Through our experiments on CAPD, we show that DCEN outperforms baselines and ablation variants.

翻訳日:2021-03-10 07:31:20 公開日:2021-03-07

# (参考訳) 無監視異常検出のための学生教師特徴ピラミッドマッチング

Student-Teacher Feature Pyramid Matching for Unsupervised Anomaly Detection ( http://arxiv.org/abs/2103.04257v1 )

ライセンス: CC BY 4.0

Guodong Wang, Shumin Han, Errui Ding, Di Huang

(参考訳) 異常検出は難しい課題であり、通常、異常の予期せぬ性質に対する教師なし学習問題として定式化される。本稿では,生徒-教員の枠組みにその利点を生かして実装するが,精度と効率の両面で大幅に拡張する,シンプルかつ強力な手法を提案する。イメージ分類を教師として事前訓練した強いモデルから,その知識を同一のアーキテクチャで単一学生ネットワークに抽出し,異常な画像の分布を学習し,このワンステップ転送は可能な限り重要な手がかりを保存する。さらに,マルチスケールな特徴マッチング戦略をフレームワークに統合し,この階層的な特徴アライメントにより,より優れた監視の下で,学生ネットワークが特徴ピラミッドから多段階の知識を混在させることで,様々な大きさの異常を検出することができる。 2つのネットワークによって生成される特徴ピラミッドの違いは、異常が起こる確率を示すスコア関数として機能する。このような操作により、正確で高速なピクセルレベルの異常検出を実現します。非常に競争力のある結果は、3つの主要なベンチマークで提供されます。さらに、非常に高速(256x256のサイズの画像のための100 FPS)で推論を行い、最新のものよりも少なくとも数十倍高速です。

Anomaly detection is a challenging task and usually formulated as an unsupervised learning problem for the unexpectedness of anomalies. This paper proposes a simple yet powerful approach to this issue, which is implemented in the student-teacher framework for its advantages but substantially extends it in terms of both accuracy and efficiency. Given a strong model pre-trained on image classification as the teacher, we distill the knowledge into a single student network with the identical architecture to learn the distribution of anomaly-free images and this one-step transfer preserves the crucial clues as much as possible. Moreover, we integrate the multi-scale feature matching strategy into the framework, and this hierarchical feature alignment enables the student network to receive a mixture of multi-level knowledge from the feature pyramid under better supervision, thus allowing to detect anomalies of various sizes. The difference between feature pyramids generated by the two networks serves as a scoring function indicating the probability of anomaly occurring. Due to such operations, our approach achieves accurate and fast pixel-level anomaly detection. Very competitive results are delivered on three major benchmarks, significantly superior to the state of the art ones. In addition, it makes inferences at a very high speed (with 100 FPS for images of the size at 256x256), at least dozens of times faster than the latest counterparts.

翻訳日:2021-03-10 01:33:32 公開日:2021-03-07

# (参考訳) ランドマーク検出を用いた多項式曲線フィッティングによる歯根ファジィエッジの高分解能分割

High-Resolution Segmentation of Tooth Root Fuzzy Edge Based on Polynomial Curve Fitting with Landmark Detection ( http://arxiv.org/abs/2103.04258v1 )

ライセンス: CC BY 4.0

Yunxiang Li, Yifan Zhang, Yaqi Wang, Shuai Wang, Ruizi Peng, Kai Tang, Qianni Zhang, Jun Wang, Qun Jin, Lingling Sun

(参考訳) 根管治療の診断における最も経済的かつ定期的な補助的検査として、口腔X線は、皮膚科医によって広く用いられている。従来の画像分割法では歯根をぼやけた境界で分割することは依然として困難である。そこで,ランドマーク検出(HS-PCL)を用いた多項式曲線フィッティングに基づく高分解能セグメンテーションモデルを提案する。歯根の縁に均等に分布する複数のランドマークを検出し、滑らかな多項式曲線を歯根のセグメント化として適合させ、ファジィエッジの問題を解決する。本モデルでは,不正確に検出された間違ったランドマークの悪影響を自動的に低減し,適合結果に歯根から逸脱する最短距離アルゴリズム(mnsda)の最大数を提案する。数値実験により,提案手法は最先端の手法と比較して,Hausdorff95 (HD95) を33.9%,Average Surface Distance (ASD) を42.1%削減するだけでなく,データセットの分量にも優れた結果が得られ,医用画像処理による根管自動治療の有効性が向上した。

As the most economical and routine auxiliary examination in the diagnosis of root canal treatment, oral X-ray has been widely used by stomatologists. It is still challenging to segment the tooth root with a blurry boundary for the traditional image segmentation method. To this end, we propose a model for high-resolution segmentation based on polynomial curve fitting with landmark detection (HS-PCL). It is based on detecting multiple landmarks evenly distributed on the edge of the tooth root to fit a smooth polynomial curve as the segmentation of the tooth root, thereby solving the problem of fuzzy edge. In our model, a maximum number of the shortest distances algorithm (MNSDA) is proposed to automatically reduce the negative influence of the wrong landmarks which are detected incorrectly and deviate from the tooth root on the fitting result. Our numerical experiments demonstrate that the proposed approach not only reduces Hausdorff95 (HD95) by 33.9% and Average Surface Distance (ASD) by 42.1% compared with the state-of-the-art method, but it also achieves excellent results on the minute quantity of datasets, which greatly improves the feasibility of automatic root canal therapy evaluation by medical image computing.

翻訳日:2021-03-10 01:20:03 公開日:2021-03-07

# (参考訳) 部分アスペクト角SARターゲット認識のためのPose Disrepancy Spatial Transformer

Pose Discrepancy Spatial Transformer Based Feature Disentangling for Partial Aspect Angles SAR Target Recognition ( http://arxiv.org/abs/2103.04329v1 )

ライセンス: CC BY 4.0

Zaidao Wen, Jiaxiang Liu, Zhunga Liu, Quan Pan

(参考訳) 本文は,合成開口レーダ(SAR)自動目標認識(ATR)タスクのための新しいフレームワークであるDistSTNを提示する。従来のSAR ATRアルゴリズムとは対照的に、DistSTNは、トレーニングのアスペクト角が不完全で部分的な範囲に制限されている非協力的ターゲットに対して、テストサンプルの角度が無制限であるより困難な実用シナリオを検討している。この問題に対処するため、ポーズ不変の特徴を学習する代わりに、DistSTNは、SARターゲットの学習したポーズファクタとアイデンティティファクタを分離し、ターゲットイメージの表現プロセスを独立して制御できるように、精巧な機能分離モデルを含む。説明可能なポーズ因子を分離するために、DistSTNのポーズ不一致空間トランスフォーマーモジュールを開発し、2つの異なるターゲットの因子間の固有の変換を明示的な幾何学的モデルで特徴付ける。さらに、DistSTNは、エンコーダ・デコーダ機構を用いて効率的な特徴抽出と認識を可能にする、償却推論方式を開発した。移動目標獲得・認識(MSTAR)ベンチマークによる実験結果から,提案手法の有効性が示された。他のatrアルゴリズムと比較して、diststnは高い認識精度を達成できる。

This letter presents a novel framework termed DistSTN for the task of synthetic aperture radar (SAR) automatic target recognition (ATR). In contrast to the conventional SAR ATR algorithms, DistSTN considers a more challenging practical scenario for non-cooperative targets whose aspect angles for training are incomplete and limited in a partial range while those of testing samples are unlimited. To address this issue, instead of learning the pose invariant features, DistSTN newly involves an elaborated feature disentangling model to separate the learned pose factors of a SAR target from the identity ones so that they can independently control the representation process of the target image. To disentangle the explainable pose factors, we develop a pose discrepancy spatial transformer module in DistSTN to characterize the intrinsic transformation between the factors of two different targets with an explicit geometric model. Furthermore, DistSTN develops an amortized inference scheme that enables efficient feature extraction and recognition using an encoder-decoder mechanism. Experimental results with the moving and stationary target acquisition and recognition (MSTAR) benchmark demonstrate the effectiveness of our proposed approach. Compared with the other ATR algorithms, DistSTN can achieve higher recognition accuracy.

翻訳日:2021-03-10 01:12:39 公開日:2021-03-07

# (参考訳) IRON:不変ベースの高ロバストポイントクラウド登録

IRON: Invariant-based Highly Robust Point Cloud Registration ( http://arxiv.org/abs/2103.04357v1 )

ライセンス: CC0 1.0

Lei Sun

(参考訳) 本稿では,非最小かつ高ロバストなポイントクラウド登録法であるiron (invariant-based global robust estimation and optimization)を提案する。これを実現するために、登録問題をそれぞれスケール、回転、翻訳の推定に分離します。最初のコントリビューションは、ランダムなサンプル間のインリエリエンスを求めるために不変互換性を採用し、2つの点群間のスケールを堅牢に推定するRANSIC(RANdom Samples with Invariant Compatibility)を提案することです。スケールを見積もると、第2の貢献は、SOS(Sum-of-Squares)緩和を用いて、非凸なグローバル登録問題をSDP(convex Semi-Definite Program)に緩和し、緩和がきついことを示すことである。また、ロバストな推定のために、従来のGNCよりもロバスト性および時間効率のよいグローバルな外乱拒絶ヒューリスティックであるRT-GNC(Rough Trimming and Graduated Non-Convexity)を3番目の貢献として提案する。これらの貢献により、登録アルゴリズム、ironをレンダリングできます。実データセット上での実験を通じて,鉄は99%の異常値に対して効率的,高精度,堅牢であり,既存の最先端アルゴリズムを上回っていることを示した。

In this paper, we present IRON (Invariant-based global Robust estimation and OptimizatioN), a non-minimal and highly robust solution for point cloud registration with a great number of outliers among the correspondences. To realize this, we decouple the registration problem into the estimation of scale, rotation and translation, respectively. Our first contribution is to propose RANSIC (RANdom Samples with Invariant Compatibility), which employs the invariant compatibility to seek inliers among random samples and robustly estimates the scale between two sets of point clouds in the meantime. Once the scale is estimated, our second contribution is to relax the non-convex global registration problem into a convex Semi-Definite Program (SDP) in a certifiable way using Sum-of-Squares (SOS) Relaxation and show that the relaxation is tight. For robust estimation, we further propose RT-GNC (Rough Trimming and Graduated Non-Convexity), a global outlier rejection heuristic having better robustness and time-efficiency than traditional GNC, as our third contribution. With these contributions, we can render our registration algorithm, IRON. Through experiments over real datasets, we show that IRON is efficient, highly accurate and robust against as many as 99% outliers whether the scale is known or unknown, outperforming the existing state-of-the-art algorithms.

翻訳日:2021-03-10 01:01:08 公開日:2021-03-07

# (参考訳) 写真における自動フレアスポットアーティファクト検出と除去

Automatic Flare Spot Artifact Detection and Removal in Photographs ( http://arxiv.org/abs/2103.04384v1 )

ライセンス: CC BY 4.0

Patricia Vitoria and Coloma Ballester

(参考訳) フレアスポットは、多くの条件によって引き起こされる1つのタイプのフレアアーティファクトであり、しばしばカメラの視野内または近くで1つ以上の高輝度光源によって誘発される。高輝度源からの光がカメラの前面要素に到達すると、撮像された画像に非画像情報またはフレアを形成するフィルム面に出現するカメラ素子の内部反射を生成することができる。予防機構が用いられるが、アーティファクトが現れることもある。本稿では,フレアスポットアーティファクトを自動的に検出・除去する頑健な計算手法を提案する。第一に、フレアスポットが満たされる可能性のある本質的な特性に基づく特性評価を提案し、第二に、候補者の中からフレアスポットを選択できる新たな信頼度尺度を定義し、最後に、フレア領域を正確に決定する手法を提供します。そして、前記検出されたアーティファクトを、模範ベースの塗布を用いて除去する。アルゴリズムが最高水準の定量的および定性的な性能を達成することを示します。

Flare spot is one type of flare artifact caused by a number of conditions, frequently provoked by one or more high-luminance sources within or close to the camera field of view. When light rays coming from a high-luminance source reach the front element of a camera, it can produce intra-reflections within camera elements that emerge at the film plane forming non-image information or flare on the captured image. Even though preventive mechanisms are used, artifacts can appear. In this paper, we propose a robust computational method to automatically detect and remove flare spot artifacts. Our contribution is threefold: firstly, we propose a characterization which is based on intrinsic properties that a flare spot is likely to satisfy; secondly, we define a new confidence measure able to select flare spots among the candidates; and, finally, a method to accurately determine the flare region is given. Then, the detected artifacts are removed by using exemplar-based inpainting. We show that our algorithm achieve top-tier quantitative and qualitative performance.

翻訳日:2021-03-10 00:28:02 公開日:2021-03-07

# (参考訳) シーンテキスト認識に本当のデータセットしか使わないとしたら? ラベルの少ないシーンテキスト認識に向けて

What If We Only Use Real Datasets for Scene Text Recognition? Toward Scene Text Recognition With Fewer Labels ( http://arxiv.org/abs/2103.04400v1 )

ライセンス: CC BY-SA 4.0

Jeonghun Baek, Yusuke Matsui, Kiyoharu Aizawa

(参考訳) シーンテキスト認識(STR)タスクは、一般的なプラクティスを持っています:すべての最先端のSTRモデルは、大規模な合成データで訓練されます。この練習とは対照的に、合成データなしでSTRモデルを訓練する必要があるとき、より少ない実ラベル(ラベルの少ないSTR)でのみSTRモデルをトレーニングすることは重要です。しかし、実際のデータは不十分であるため、実データ上でSTRモデルをトレーニングすることはほぼ不可能であるという暗黙の共通知識がある。この共通知識がラベルの少ないSTRの研究を妨げていると考えます。本研究では,共通知識を否定し,少ないラベルでSTRを再活性化する。我々は最近蓄積した公開実データを統合することで、STRモデルを実際のラベル付きデータでのみ満足に訓練できることを示します。その後、実データを完全に活用するための単純なデータ拡張が見つかる。さらに,ラベルなしデータを収集し,半教師付きおよび自己教師付き手法を導入することで,モデルを改善する。その結果,最先端手法に対する競争モデルが得られた。我々の知る限りでは、1)実際のラベルのみを用いることで十分な性能を示す最初の研究であり、2)より少ないラベルを持つSTRに半自己監督手法を導入する。私たちのコードとデータが利用可能です。 https://github.com/ku21fan/STR-Fewer-Labels

Scene text recognition (STR) task has a common practice: All state-of-the-art STR models are trained on large synthetic data. In contrast to this practice, training STR models only on fewer real labels (STR with fewer labels) is important when we have to train STR models without synthetic data: for handwritten or artistic texts that are difficult to generate synthetically and for languages other than English for which we do not always have synthetic data. However, there has been implicit common knowledge that training STR models on real data is nearly impossible because real data is insufficient. We consider that this common knowledge has obstructed the study of STR with fewer labels. In this work, we would like to reactivate STR with fewer labels by disproving the common knowledge. We consolidate recently accumulated public real data and show that we can train STR models satisfactorily only with real labeled data. Subsequently, we find simple data augmentation to fully exploit real data. Furthermore, we improve the models by collecting unlabeled data and introducing semi- and self-supervised methods. As a result, we obtain a competitive model to state-of-the-art methods. To the best of our knowledge, this is the first study that 1) shows sufficient performance by only using real labels and 2) introduces semi- and self-supervised methods into STR with fewer labels. Our code and data are available: https://github.com/ku21fan/STR-Fewer-Labels

翻訳日:2021-03-09 23:28:55 公開日:2021-03-07

# (参考訳) スナップショット圧縮イメージング:原理、実装、理論、アルゴリズムおよび応用

Snapshot Compressive Imaging: Principle, Implementation, Theory, Algorithms and Applications ( http://arxiv.org/abs/2103.04421v1 )

ライセンス: CC BY 4.0

Xin Yuan and David J. Brady and Aggelos K. Katsaggelos

(参考訳) 高次元(HD)データの取得は、信号処理とその関連分野における長期的な課題である。スナップショット圧縮イメージング(SCI)は、2次元(2D)検出器を使用して、シュエムスナップショット測定でHD($\ge3$D)データをキャプチャします。新規な光学設計により、2D検出器は、HDデータを圧縮的にサンプリングし、その後、アルゴリズムを用いて所望のHDデータキューブを再構築する。 SCIは、ハイパースペクトルイメージング、ビデオ、ホログラフィー、トモグラフィー、焦点深度イメージング、偏光イメージング、顕微鏡、画像撮影に使われてきた。ディープラーニングにインスパイアされた様々なディープニューラルネットワークも、スペクトルSCIとビデオSCIのHDデータキューブを再構築するために開発されている。本稿では、最適化に基づくアルゴリズムとディープラーニングに基づくアルゴリズムの両方を含む、sciハードウェア、理論、アルゴリズムの最近の進歩を概観する。様々な応用やSCIの展望についても論じる。

Capturing high-dimensional (HD) data is a long-term challenge in signal processing and related fields. Snapshot compressive imaging (SCI) uses a two-dimensional (2D) detector to capture HD ($\ge3$D) data in a {\em snapshot} measurement. Via novel optical designs, the 2D detector samples the HD data in a {\em compressive} manner; following this, algorithms are employed to reconstruct the desired HD data-cube. SCI has been used in hyperspectral imaging, video, holography, tomography, focal depth imaging, polarization imaging, microscopy, \etc.~Though the hardware has been investigated for more than a decade, the theoretical guarantees have only recently been derived. Inspired by deep learning, various deep neural networks have also been developed to reconstruct the HD data-cube in spectral SCI and video SCI. This article reviews recent advances in SCI hardware, theory and algorithms, including both optimization-based and deep-learning-based algorithms. Diverse applications and the outlook of SCI are also discussed.

翻訳日:2021-03-09 22:59:56 公開日:2021-03-07

# (参考訳) TypeShift: タイピング生産プロセスを視覚化するためのユーザインターフェース

TypeShift: A User Interface for Visualizing the Typing Production Process ( http://arxiv.org/abs/2103.04222v1 )

ライセンス: CC BY 4.0

Adam Goodkind

(参考訳) TypeShiftは、生産をタイプするタイミングで言語パターンを視覚化するためのツールです。言語生産は、言語、認知、運動能力に基づく複雑なプロセスです。タイピングプロセスにおける全体的なトレンドを視覚化することで、typeshiftは、単語レベルと文字レベルの両方でタイピングパターンを表現するのに使われるノイズの多い情報信号を明らかにすることを目指している。これは研究者が特定の言語現象を比較・比較し、個別のタイピングセッションを複数のグループ平均と比較することで達成される。最後に、TypeShiftはもともとデータタイピング用に設計されていたが、音声データにも容易に適応できる。 Webデモはhttps://angoodkind.shinyapps.io/TypeShift/で公開されている。ソースコードはhttps://github.com/angoodkind/typeshiftからアクセスできる。

TypeShift is a tool for visualizing linguistic patterns in the timing of typing production. Language production is a complex process which draws on linguistic, cognitive and motor skills. By visualizing holistic trends in the typing process, TypeShift aims to elucidate the often noisy information signals that are used to represent typing patterns, both at the word-level and character-level. It accomplishes this by enabling a researcher to compare and contrast specific linguistic phenomena, and compare an individual typing session to multiple group averages. Finally, although TypeShift was originally designed for typing data, it can easy be adapted to accommodate speech data, as well. A web demo is available at https://angoodkind.shinyapps.io/TypeShift/. The source code can be accessed at https://github.com/angoodkind/TypeShift.

翻訳日:2021-03-09 21:33:31 公開日:2021-03-07

# (参考訳) モデルなし強化学習におけるQ-関数の再利用がトータルレグレットに及ぼす影響

The Effect of Q-function Reuse on the Total Regret of Tabular, Model-Free, Reinforcement Learning ( http://arxiv.org/abs/2103.04416v1 )

ライセンス: CC BY 4.0

Volodymyr Tkachuk, Sriram Ganapathi Subramanian, Matthew E. Taylor

(参考訳) 一部の強化学習方法は、実世界では実用的ではない高いサンプル複雑性に苦しんでいます。転送学習メソッドである$Q$-functionの再利用は、学習のサンプル複雑さを低減し、既存のアルゴリズムの有用性を向上させる1つの方法です。これまでの研究は、モデルフリーアルゴリズムに適用した場合、様々な環境における$Q$-functionの再利用の実証的な効果を示してきた。私たちの知る限りでは、表型でモデルフリーな設定に適用される場合、$q$-関数再利用の後悔を示す理論的研究は存在しません。 UCB-Hoeffdingアルゴリズムを用いた$Q$-learningに適用した場合の$Q$-functionの再利用効果に関する理論的知見を提供することで、$Q$-functionの再利用における理論的作業と経験的作業のギャップを埋めることを目指している。 q$-関数の再利用がucb-hoeffdingアルゴリズムによる$q$-learningに適用された場合、状態やアクション空間とは無関係な後悔があることを示すことが私たちの大きな貢献です。また,理論的な知見を裏付ける実証的な結果も提供する。

Some reinforcement learning methods suffer from high sample complexity causing them to not be practical in real-world situations. $Q$-function reuse, a transfer learning method, is one way to reduce the sample complexity of learning, potentially improving usefulness of existing algorithms. Prior work has shown the empirical effectiveness of $Q$-function reuse for various environments when applied to model-free algorithms. To the best of our knowledge, there has been no theoretical work showing the regret of $Q$-function reuse when applied to the tabular, model-free setting. We aim to bridge the gap between theoretical and empirical work in $Q$-function reuse by providing some theoretical insights on the effectiveness of $Q$-function reuse when applied to the $Q$-learning with UCB-Hoeffding algorithm. Our main contribution is showing that in a specific case if $Q$-function reuse is applied to the $Q$-learning with UCB-Hoeffding algorithm it has a regret that is independent of the state or action space. We also provide empirical results supporting our theoretical findings.

翻訳日:2021-03-09 19:32:13 公開日:2021-03-07

# (参考訳) 自動運転のV&Vと安全保証のためのカバレッジベーステスト:システム文献レビュー

Coverage based testing for V&V and Safety Assurance of Self-driving Autonomous Vehicles: A Systematic Literature Review ( http://arxiv.org/abs/2103.04364v1 )

ライセンス: CC BY 4.0

Zaid Tahir, Rob Alexander

(参考訳) 自動運転車(SAV)は、業界だけでなく一般の人々によって毎日より多くの関心を集めています。テクノロジー企業や自動車会社は、将来のSAV市場でのヘッドスタートを確実にするために、SAVの研究開発に膨大な資金を投資しています。 SAVが公道に到達する際の大きなハードルの1つは、SAVの安全面における公衆の信頼の欠如である。世界中の研究者は、安全を確保し、SAVの安全性に国民に信頼を提供するために、SAVの検証と検証(V&V)と安全保証のためのカバレッジベースのテストを使用しています。本論文の目的は,過去10年間に研究者が用いたカバレッジ基準とカバレッジ最大化手法を検討し,SAVの安全性を保証することである。本稿では,本研究のための体系的文献レビュー(SLR)を実施している。適用範囲の基準に基づいて、既存の研究の分類を提示します。この領域のさらなる研究を可能にするために、このSLRにはいくつかの研究ギャップと研究方向も設けられている。本稿では,SAVの安全保証分野における知識の体系を提供する。このSLRの結果は、V&Vの進展とSAVの安全性確保に有効であると考えています。

Self-driving Autonomous Vehicles (SAVs) are gaining more interest each passing day by the industry as well as the general public. Tech and automobile companies are investing huge amounts of capital in research and development of SAVs to make sure they have a head start in the SAV market in the future. One of the major hurdles in the way of SAVs making it to the public roads is the lack of confidence of public in the safety aspect of SAVs. In order to assure safety and provide confidence to the public in the safety of SAVs, researchers around the world have used coverage-based testing for Verification and Validation (V&V) and safety assurance of SAVs. The objective of this paper is to investigate the coverage criteria proposed and coverage maximizing techniques used by researchers in the last decade up till now, to assure safety of SAVs. We conduct a Systematic Literature Review (SLR) for this investigation in our paper. We present a classification of existing research based on the coverage criteria used. Several research gaps and research directions are also provided in this SLR to enable further research in this domain. This paper provides a body of knowledge in the domain of safety assurance of SAVs. We believe the results of this SLR will be helpful in the progression of V&V and safety assurance of SAVs.

翻訳日:2021-03-09 17:09:51 公開日:2021-03-07

# Unseen の翻訳? Yor\`ub\'a $\rightarrow$ English MT in Low-Resource, Morphologically-unmarked settingss

Translating the Unseen? Yor\`ub\'a $\rightarrow$ English MT in Low-Resource, Morphologically-Unmarked Settings ( http://arxiv.org/abs/2103.04225v1 )

ライセンス: Link先を確認

Ife Adebara Miikka Silfverberg Muhammad Abdul-Mageed

(参考訳) 特定の特徴が一方で形態素的にマークされているが、他方で欠落または文脈的にマークされている言語間の翻訳は、機械翻訳の重要なテストケースである。定型性(in)を形態的にマークする英語に翻訳する場合、Yor\`ub\'a は素名詞を用いるが、これらの特徴を文脈的にマークする。本研究では、Yor\`ub\'a の素名詞を英語に翻訳する際に、SMT システムが 2 つの NMT システム (BiLSTM と Transformer) とどのように比較するかを細かく分析する。システムがどのようにBNを識別し、正しく翻訳し、人間の翻訳パターンと比較するかを検討する。また,各モデルが犯す誤りの種類を分析し,それらの誤りを言語的に記述する。低リソース設定でモデルパフォーマンスを評価するための洞察を得る。素名詞の翻訳では, トランスフォーマーモデルは4つのカテゴリでSMT, BiLSTMモデルより優れ, BiLSTMは3つのカテゴリでSMTモデルより優れ, SMTは1つのカテゴリでNMTモデルより優れていた。

Translating between languages where certain features are marked morphologically in one but absent or marked contextually in the other is an important test case for machine translation. When translating into English which marks (in)definiteness morphologically, from Yor\`ub\'a which uses bare nouns but marks these features contextually, ambiguities arise. In this work, we perform fine-grained analysis on how an SMT system compares with two NMT systems (BiLSTM and Transformer) when translating bare nouns in Yor\`ub\'a into English. We investigate how the systems what extent they identify BNs, correctly translate them, and compare with human translation patterns. We also analyze the type of errors each model makes and provide a linguistic description of these errors. We glean insights for evaluating model performance in low-resource settings. In translating bare nouns, our results show the transformer model outperforms the SMT and BiLSTM models for 4 categories, the BiLSTM outperforms the SMT model for 3 categories while the SMT outperforms the NMT models for 1 category.

翻訳日:2021-03-09 16:06:50 公開日:2021-03-07

# 乱雑な動的環境における状態表現とナビゲーションの学習

Learning a State Representation and Navigation in Cluttered and Dynamic Environments ( http://arxiv.org/abs/2103.04351v1 )

ライセンス: Link先を確認

David Hoeller, Lorenz Wellhausen, Farbod Farshidian, Marco Hutter

(参考訳) 本研究では,静的および動的障害のあるクラッタ環境において,四足ロボットを用いた局所ナビゲーションを実現するための学習ベースのパイプラインを提案する。高レベルのナビゲーションコマンドにより、ロボットは環境の明示的なマッピングをすることなく、深度カメラからフレームに基づいてターゲットの場所に安全に移動することができます。まず、画像のシーケンスとカメラの現在の軌道を融合して、状態表現学習を用いて世界のモデルを形成する。この軽量モジュールの出力は、強化学習で訓練された目標到達および障害物回避ポリシーに直接供給される。パイプラインをこれらのコンポーネントに分離すると、わずか数十分でシミュレーションで完全にトレーニングできるサンプル効率的なポリシー学習ステージになることを示します。重要な部分は状態表現であり、監視されていない方法で世界の隠れた状態を推定するだけでなく、現実のギャップを橋渡しし、シミュレーションから現実への転送を成功させるのに役立ちます。シミュレーションと実演で4足歩行ロボットanymalを用いた実験では,ノイズの多い奥行き画像の処理や,トレーニング中の動的障害物の回避,局所的な空間意識の付与などが可能であった。

In this work, we present a learning-based pipeline to realise local navigation with a quadrupedal robot in cluttered environments with static and dynamic obstacles. Given high-level navigation commands, the robot is able to safely locomote to a target location based on frames from a depth camera without any explicit mapping of the environment. First, the sequence of images and the current trajectory of the camera are fused to form a model of the world using state representation learning. The output of this lightweight module is then directly fed into a target-reaching and obstacle-avoiding policy trained with reinforcement learning. We show that decoupling the pipeline into these components results in a sample efficient policy learning stage that can be fully trained in simulation in just a dozen minutes. The key part is the state representation, which is trained to not only estimate the hidden state of the world in an unsupervised fashion, but also helps bridging the reality gap, enabling successful sim-to-real transfer. In our experiments with the quadrupedal robot ANYmal in simulation and in reality, we show that our system can handle noisy depth images, avoid dynamic obstacles unseen during training, and is endowed with local spatial awareness.

翻訳日:2021-03-09 16:04:50 公開日:2021-03-07

# Watching You: ビデオベースの人物再識別のためのグローバルガイドによる相互学習

Watching You: Global-guided Reciprocal Learning for Video-based Person Re-identification ( http://arxiv.org/abs/2103.04337v1 )

ライセンス: Link先を確認

Xuehu Liu and Pingping Zhang and Chenyang Yu and Huchuan Lu and Xiaoyun Yang

(参考訳) ビデオベースの人物再識別(Re-ID)は、重複しないカメラで同一人物のビデオシーケンスを自動的に取得することを目的としている。この目的を達成するために、ビデオに豊富な空間的および時間的手がかりを十分に活用することが鍵となる。既存の手法は通常、最も顕著な画像領域に焦点を合わせており、画像シーケンスの人物の多様性によって、きめ細かな手がかりを見逃しがちである。そこで本論文では,映像に基づくRe-IDのためのGLL(Global-Guided Reciprocal Learning)フレームワークを提案する。具体的には,GCE(Global-Guided Correlation Estimation)を提案し,局所的特徴とグローバル特徴の特徴相関マップを生成し,同一人物を識別するための高相関領域と低相関領域をローカライズする。その後、グローバル表現の指導の下で、識別的特徴を高相関特徴と低相関特徴に分解する。さらに, テンポラル・相互学習(TRL)機構は, 高相関意味情報を逐次強化し, 低相関のサブクリティカルな手がかりを蓄積するように設計されている。 3つの公開ベンチマークに関する広範な実験は、私たちのアプローチが他の最先端のアプローチよりも優れたパフォーマンスを達成できることを示しています。

Video-based person re-identification (Re-ID) aims to automatically retrieve video sequences of the same person under non-overlapping cameras. To achieve this goal, it is the key to fully utilize abundant spatial and temporal cues in videos. Existing methods usually focus on the most conspicuous image regions, thus they may easily miss out fine-grained clues due to the person varieties in image sequences. To address above issues, in this paper, we propose a novel Global-guided Reciprocal Learning (GRL) framework for video-based person Re-ID. Specifically, we first propose a Global-guided Correlation Estimation (GCE) to generate feature correlation maps of local features and global features, which help to localize the high-and low-correlation regions for identifying the same person. After that, the discriminative features are disentangled into high-correlation features and low-correlation features under the guidance of the global representations. Moreover, a novel Temporal Reciprocal Learning (TRL) mechanism is designed to sequentially enhance the high-correlation semantic information and accumulate the low-correlation sub-critical clues. Extensive experiments on three public benchmarks indicate that our approach can achieve better performance than other state-of-the-art approaches.

翻訳日:2021-03-09 16:02:56 公開日:2021-03-07

# TransBTS: Transformer を用いたマルチモーダル脳腫瘍切除

TransBTS: Multimodal Brain Tumor Segmentation Using Transformer ( http://arxiv.org/abs/2103.04430v1 )

ライセンス: Link先を確認

Wenxuan Wang, Chen Chen, Meng Ding, Jiangyun Li, Hong Yu, Sen Zha

(参考訳) 自己着脱機構を用いたグローバル(長距離)情報モデリングの恩恵を受けるトランスフォーマは,近年,自然言語処理と2次元画像分類に成功している。しかし,特に3次元医用画像セグメンテーションでは,局所的特徴とグローバル特徴の両方が重要となる。本稿では、MRI脳腫瘍セグメンテーションのための3D CNNにおけるTransformerを初めて利用し、エンコーダデコーダ構造に基づくTransBTSという新しいネットワークを提案する。ローカルな3dコンテキスト情報をキャプチャするために、エンコーダはまず3d cnnを使用して体積空間特徴マップを抽出する。一方、機能マップは、グローバル機能モデリングのためにTransformerに供給されるトークンのために精巧に再構成されます。デコーダはtransformerに埋め込まれた機能を活用し、詳細なセグメンテーションマップを予測するためにプログレッシブアップサンプリングを行う。 BraTS 2019データセットの実験結果は、TransBTSが3D MRIスキャンで脳腫瘍のセグメント化の最先端の手法より優れていることを示している。コードはhttps://github.com/Wenxuan-1119/TransBTSで入手できる。

Transformer, which can benefit from global (long-range) information modeling using self-attention mechanisms, has been successful in natural language processing and 2D image classification recently. However, both local and global features are crucial for dense prediction tasks, especially for 3D medical image segmentation. In this paper, we for the first time exploit Transformer in 3D CNN for MRI Brain Tumor Segmentation and propose a novel network named TransBTS based on the encoder-decoder structure. To capture the local 3D context information, the encoder first utilizes 3D CNN to extract the volumetric spatial feature maps. Meanwhile, the feature maps are reformed elaborately for tokens that are fed into Transformer for global feature modeling. The decoder leverages the features embedded by Transformer and performs progressive upsampling to predict the detailed segmentation map. Experimental results on the BraTS 2019 dataset show that TransBTS outperforms state-of-the-art methods for brain tumor segmentation on 3D MRI scans. Code is available at https://github.com/Wenxuan-1119/TransBTS

翻訳日:2021-03-09 16:02:34 公開日:2021-03-07

# 直交注意:クローズスタイルアプローチによる否定スコープの解決

Orthogonal Attention: A Cloze-Style Approach to Negation Scope Resolution ( http://arxiv.org/abs/2103.04294v1 )

ライセンス: Link先を確認

Aditya Khandelwal and Vahida Attar

(参考訳) Negation Scope Resolutionは広く研究されている問題であり、ネゲーションキューの影響を受ける単語を文に見つけるために使用されます。最近の研究では、トランスフォーマーベースのアーキテクチャの微調整が、このタスクに最先端の結果をもたらすことが示されている。本研究では,文を文脈として,手がかり語をクエリとして,否定スコープの解決をクローズ的なタスクとして捉えた。また, 自己注意に触発された直交注意と呼ばれる新しいクロゼスタイルの注意機構も導入する。まず, オーソゴナル・アテンション・バリアントを開発するためのフレームワークを提案し, OA-C, OA-CA, OA-EM, OA-EMBの4種類のオーソゴナル・アテンテンション・バリアントを提案する。 XLNetのバックボーンの上にこれらの直交アテンションレイヤーを使用して、我々は、私たちが実験するすべてのデータセットで今まで最高の結果を達成し、微調整XLNet最先端のネゲーションスコープ解像度を上回ります:BioScope Abstracts、BioScope Full Papers、SFU Review Corpus、および*sem 2012 Dataset(Sherlock)。

Negation Scope Resolution is an extensively researched problem, which is used to locate the words affected by a negation cue in a sentence. Recent works have shown that simply finetuning transformer-based architectures yield state-of-the-art results on this task. In this work, we look at Negation Scope Resolution as a Cloze-Style task, with the sentence as the Context and the cue words as the Query. We also introduce a novel Cloze-Style Attention mechanism called Orthogonal Attention, which is inspired by Self Attention. First, we propose a framework for developing Orthogonal Attention variants, and then propose 4 Orthogonal Attention variants: OA-C, OA-CA, OA-EM, and OA-EMB. Using these Orthogonal Attention layers on top of an XLNet backbone, we outperform the finetuned XLNet state-of-the-art for Negation Scope Resolution, achieving the best results to date on all 4 datasets we experiment with: BioScope Abstracts, BioScope Full Papers, SFU Review Corpus and the *sem 2012 Dataset (Sherlock).

翻訳日:2021-03-09 16:01:32 公開日:2021-03-07

# エキスパートシステムグラジエントディサントスタイルトレーニング: 防衛可能な人工知能技術の開発。

Expert System Gradient Descent Style Training: Development of a Defensible Artificial Intelligence Technique ( http://arxiv.org/abs/2103.04314v1 )

ライセンス: Link先を確認

Jeremy Straub

(参考訳) 提示されたデータから学習する能力を備えた人工知能システムは、社会全体で使用されています。これらのシステムは、ローン申請者のスクリーニング、刑事被告に対する判決の推薦、禁止コンテンツに対するソーシャルメディア投稿のスキャンなどに使われる。これらのシステムは、複雑な学習された相関ネットワークに意味を割り当てないため、因果関係に等しくない関連を学習することができ、最適で防御不能な決定が下される。準最適な意思決定に加えて、これらのシステムは、差別防止法に違反している相関関係を学習することで、設計者やオペレーターに法的責任を負う可能性がある。本稿では,意味割り当てノード (facts) と相関関係 (rules) を用いて開発した機械学習エキスパートシステムについて述べる。複数の潜在的な実装は、異なるネットワークエラーと拡張レベルと異なるトレーニングレベルを含む、異なる条件下で検討および評価されます。これらのシステムの性能は、ランダムで完全に接続されたネットワークと比較される。

Artificial intelligence systems, which are designed with a capability to learn from the data presented to them, are used throughout society. These systems are used to screen loan applicants, make sentencing recommendations for criminal defendants, scan social media posts for disallowed content and more. Because these systems don't assign meaning to their complex learned correlation network, they can learn associations that don't equate to causality, resulting in non-optimal and indefensible decisions being made. In addition to making decisions that are sub-optimal, these systems may create legal liability for their designers and operators by learning correlations that violate anti-discrimination and other laws regarding what factors can be used in different types of decision making. This paper presents the use of a machine learning expert system, which is developed with meaning-assigned nodes (facts) and correlations (rules). Multiple potential implementations are considered and evaluated under different conditions, including different network error and augmentation levels and different training levels. The performance of these systems is compared to random and fully connected networks.

翻訳日:2021-03-09 16:00:46 公開日:2021-03-07

# TensorFlow-Kerasによるグラフニューラルネットワークの実装

Implementing graph neural networks with TensorFlow-Keras ( http://arxiv.org/abs/2103.04318v1 )

ライセンス: Link先を確認

Patrick Reiser, Andre Eberhard and Pascal Friederich

(参考訳) グラフニューラルネットワークは、最近多くの注目を集めた汎用的な機械学習アーキテクチャです。本稿では、TensorFlow-Kerasモデルの畳み込み層とプール層の実装について述べる。これにより、標準的なKeras層にシームレスかつ柔軟な統合により、グラフモデルを機能的に設定できる。これは、グラフに適したtensorflowの新しいraggedtensorクラスを通じて実現可能な、最初のテンソル次元としてのミニバッチの使用を意味する。 tensorflow-kerasをベースとしたkeras graph convolutional neural network pythonパッケージを開発した。tensorflow-kerasは、層間で渡される透明なテンソル構造と、使いやすいマインドセットに焦点を当てた、グラフネットワーク用のkerasレイヤセットを提供する。

Graph neural networks are a versatile machine learning architecture that received a lot of attention recently. In this technical report, we present an implementation of convolution and pooling layers for TensorFlow-Keras models, which allows a seamless and flexible integration into standard Keras layers to set up graph models in a functional way. This implies the usage of mini-batches as the first tensor dimension, which can be realized via the new RaggedTensor class of TensorFlow best suited for graphs. We developed the Keras Graph Convolutional Neural Network Python package kgcnn based on TensorFlow-Keras that provides a set of Keras layers for graph networks which focus on a transparent tensor structure passed between layers and an ease-of-use mindset.

翻訳日:2021-03-09 16:00:26 公開日:2021-03-07

# Hierarchical Causal Bandit

Hierarchical Causal Bandit ( http://arxiv.org/abs/2103.04215v1 )

ライセンス: Link先を確認

Ruiyang Song, Stefano Rini, Kuang Xu

(参考訳) 因果バンディット(英: Causal Bandit)は、エージェントが変数の因果ネットワークで連続的に実験し、報酬の最大化介入を特定する、創発的な学習モデルである。モデルの適用性は広いが、既存の分析結果は、全ての変数が互いに独立な並列バンディットバージョンに大きく制限されている。本研究では,階層型因果バンディットモデルを,従属変数による一般因果バンディット理解への有効な経路として紹介する。コアのアイデアは、直接的な効果を持つすべての変数間の相互作用をキャプチャするコンテキスト変数を組み込むことです。この階層的枠組みを用いることで、因果的包帯と従属腕のアルゴリズム設計の鋭い洞察を導き、二項文脈の場合、ほぼ一致する後悔境界を得る。

Causal bandit is a nascent learning model where an agent sequentially experiments in a causal network of variables, in order to identify the reward-maximizing intervention. Despite the model's wide applicability, existing analytical results are largely restricted to a parallel bandit version where all variables are mutually independent. We introduce in this work the hierarchical causal bandit model as a viable path towards understanding general causal bandits with dependent variables. The core idea is to incorporate a contextual variable that captures the interaction among all variables with direct effects. Using this hierarchical framework, we derive sharp insights on algorithmic design in causal bandits with dependent arms and obtain nearly matching regret bounds in the case of a binary context.

翻訳日:2021-03-09 15:57:51 公開日:2021-03-07

# アクティブシーケンシャル仮説テストのための近似アルゴリズム

Approximation Algorithms for Active Sequential Hypothesis Testing ( http://arxiv.org/abs/2103.04250v1 )

ライセンス: Link先を確認

Kyra Gan, Su Jia, Andrew Li

(参考訳) アクティブ・シーケンシャル仮説テスト(英語版)(asht)の問題において、学習者は、一連の仮説のうち、真の仮説である$h^*$を同定しようとする。学習者は一連の行動を与えられ、任意の真の仮説の下での行動の結果分布を知る。アクションの集合全体を繰り返し再生すると、$h^*$を識別するのに十分だが、アクションごとにコストがかかる。したがって、ターゲットエラー $\delta>0$ が与えられた場合、少なくとも1 - \delta$ の確率で $h^*$ を識別するアクションを逐次選択するための最小コストポリシーを見つけることが目的である。本稿では2種類の適応性の下でASHTの最初の近似アルゴリズムを提供する。まず、ポリシーが事前に一連のアクションを修正し、いつ終了し、どの仮説を返すかを適応的に決定すれば、部分的に適応する。部分的適応性の下では、$o\big(s^{-1}(1+\log_{1/\delta}|h|)\log (s^{-1}|h| \log |h|)\big)$近似アルゴリズムを提供する。第二に、アクションの選択が以前の結果に依存している場合、ポリシーは完全に適応的です。完全な適応性の下で、$O(s^{-1}\log (|H|/\delta)\log |H|)$-近似アルゴリズムを提供する。合成データと実世界のデータの両方を用いて,アルゴリズムの性能を数値的に検討し,提案したヒューリスティック・ポリシーよりも優れていることを示す。

In the problem of active sequential hypotheses testing (ASHT), a learner seeks to identify the true hypothesis $h^*$ from among a set of hypotheses $H$. The learner is given a set of actions and knows the outcome distribution of any action under any true hypothesis. While repeatedly playing the entire set of actions suffices to identify $h^*$, a cost is incurred with each action. Thus, given a target error $\delta>0$, the goal is to find the minimal cost policy for sequentially selecting actions that identify $h^*$ with probability at least $1 - \delta$. This paper provides the first approximation algorithms for ASHT, under two types of adaptivity. First, a policy is partially adaptive if it fixes a sequence of actions in advance and adaptively decides when to terminate and what hypothesis to return. Under partial adaptivity, we provide an $O\big(s^{-1}(1+\log_{1/\delta}|H|)\log (s^{-1}|H| \log |H|)\big)$-approximation algorithm, where $s$ is a natural separation parameter between the hypotheses. Second, a policy is fully adaptive if action selection is allowed to depend on previous outcomes. Under full adaptivity, we provide an $O(s^{-1}\log (|H|/\delta)\log |H|)$-approximation algorithm. We numerically investigate the performance of our algorithms using both synthetic and real-world data, showing that our algorithms outperform a previously proposed heuristic policy.

翻訳日:2021-03-09 15:57:39 公開日:2021-03-07

# CORe:バンディット探索における報酬の活用

CORe: Capitalizing On Rewards in Bandit Exploration ( http://arxiv.org/abs/2103.04387v1 )

ライセンス: Link先を確認

Nan Wang, Branislav Kveton, Maryam Karimzadehgan

(参考訳) 過去の観測をランダム化して純粋に探索するバンディットアルゴリズムを提案する。特に、平均報酬推定における十分な楽観性は、過去の観測された報酬の分散を利用して達成される。我々は報酬(コア)に乗じたアルゴリズムを命名する。アルゴリズムは一般的であり、様々なバンディット設定に容易に適用できる。 COReの主な利点は、その探索が完全にデータに依存していることです。外部ノイズに依存しず、パラメータチューニングなしでさまざまな問題に適応します。我々は、$d$が特徴の数であり、$K$が腕の数である確率的な線形バンドイットにおいて、$n$-roundのCOReの後悔に、$\tilde O(d\sqrt{n\log K})$のギャップフリー境界を導出する。複数の合成および実世界の問題に関する広範な経験的評価は、COReの有効性を示す。

We propose a bandit algorithm that explores purely by randomizing its past observations. In particular, the sufficient optimism in the mean reward estimates is achieved by exploiting the variance in the past observed rewards. We name the algorithm Capitalizing On Rewards (CORe). The algorithm is general and can be easily applied to different bandit settings. The main benefit of CORe is that its exploration is fully data-dependent. It does not rely on any external noise and adapts to different problems without parameter tuning. We derive a $\tilde O(d\sqrt{n\log K})$ gap-free bound on the $n$-round regret of CORe in a stochastic linear bandit, where $d$ is the number of features and $K$ is the number of arms. Extensive empirical evaluation on multiple synthetic and real-world problems demonstrates the effectiveness of CORe.

翻訳日:2021-03-09 15:57:10 公開日:2021-03-07

# 逆強化学習のサンプル複雑性のための下界

A Lower Bound for the Sample Complexity of Inverse Reinforcement Learning ( http://arxiv.org/abs/2103.04446v1 )

ライセンス: Link先を確認

Abi Komanduru, Jean Honorio

(参考訳) 逆強化学習(IRL)は、与えられたマルコフ決定プロセス(MDP)に対して望ましい最適ポリシーを生成する報酬関数を求めるタスクである。本稿では, 有限状態, 有限作用IRL問題のサンプル複雑性に対する情報理論の下界について述べる。球面符号を用いた $\beta$-strict separable IRL 問題の幾何学的構成を考える。生成した軌跡間のkullback-leibler発散と同様にアンサンブルサイズの性質を導出する。結果として得られるアンサンブルはファノの不等式とともに、mdp 内の状態数である $n$ で$o(n \log n)$ 以下のサンプル複雑性を導出するために用いられる。

Inverse reinforcement learning (IRL) is the task of finding a reward function that generates a desired optimal policy for a given Markov Decision Process (MDP). This paper develops an information-theoretic lower bound for the sample complexity of the finite state, finite action IRL problem. A geometric construction of $\beta$-strict separable IRL problems using spherical codes is considered. Properties of the ensemble size as well as the Kullback-Leibler divergence between the generated trajectories are derived. The resulting ensemble is then used along with Fano's inequality to derive a sample complexity lower bound of $O(n \log n)$, where $n$ is the number of states in the MDP.

翻訳日:2021-03-09 15:56:56 公開日:2021-03-07

# クラスカプセルの識別力に向けたルーティング

Routing Towards Discriminative Power of Class Capsules ( http://arxiv.org/abs/2103.04278v1 )

ライセンス: Link先を確認

Haoyu Yang, Shuhe Li, Bei Yu

(参考訳) カプセルネットワークは最近、現代のニューラルネットワークアーキテクチャの代替として提案されている。ニューロンは、正常化されたベクトルまたは行列を持つ特定の特徴または実体を表すカプセルユニットに置き換えられる。下層カプセルの活性化は、特定のルーティングアルゴリズムによって訓練中に構築されるルーティングリンクを介して、以下のカプセルの挙動に影響する。本稿では,ネットワークを最適性から遠ざける動的ルーティングアルゴリズムにおけるルーティング・バイ・アグリーメントスキームについて考察する。より良く,より高速な収束を得るため,規則化された二次プログラミング問題を効率的に解くことができるルーティングアルゴリズムを提案する。特に,提案したルーティングアルゴリズムは,入力インスタンスの正しい判定を行うクラスカプセルの識別力を直接ターゲットとする。 mnist,mnist-fashion,cifar-10の実験を行い,既存のカプセルネットワークと比較して競合分類結果を示す。

Capsule networks are recently proposed as an alternative to modern neural network architectures. Neurons are replaced with capsule units that represent specific features or entities with normalized vectors or matrices. The activation of lower layer capsules affects the behavior of the following capsules via routing links that are constructed during training via certain routing algorithms. We discuss the routing-by-agreement scheme in dynamic routing algorithm which, in certain cases, leads the networks away from optimality. To obtain better and faster convergence, we propose a routing algorithm that incorporates a regularized quadratic programming problem which can be solved efficiently. Particularly, the proposed routing algorithm targets directly on the discriminative power of class capsules making the correct decision on input instances. We conduct experiments on MNIST, MNIST-Fashion, and CIFAR-10 and show competitive classification results compared to existing capsule networks.

翻訳日:2021-03-09 15:55:23 公開日:2021-03-07

# 階層的自己注意に基づく人的活動認識のためのオートエンコーダ

Hierarchical Self Attention Based Autoencoder for Open-Set Human Activity Recognition ( http://arxiv.org/abs/2103.04279v1 )

ライセンス: Link先を確認

M Tanjid Hasan Tonmoy, Saif Mahmud, A K M Mahbubur Rahman, M Ashraful Amin, and Amin Ahsan Ali

(参考訳) ウェアラブルセンサーベースの人間の活動認識は、センサー信号の空間的および時間的依存性のモデリングが困難であるため、困難な問題です。閉集合仮定における認識モデルは、既知のアクティビティクラスのメンバを予測として生成しなければならない。しかし, 運動認識モデルでは, 身体感覚の異常や, 動作中の被験者の障害により, 目に見えない活動に遭遇する可能性がある。この問題は、オープンセット認識の仮定に従ってモデリングソリューションを通じて対処することができる。したがって、提案した自己注意ベースのアプローチは、さまざまなセンサー配置からのデータを階層的に組み合わせてクローズドセットアクティビティを分類し、5つの公開データセット上での最先端モデルに対する顕著なパフォーマンス改善を得る。このオートエンコーダアーキテクチャのデコーダには、エンコーダからの自己認識に基づく特徴表現が組み込まれ、オープンセット認識設定で未確認のアクティビティクラスを検出する。さらに、階層モデルによって生成された注目マップは、アクティビティ認識における特徴の説明可能な選択を示す。本研究は,騒音に対する頑健性および体温センサ信号の特異的変動性を著しく改善した検証実験を広範囲に実施する。ソースコードはgithub.com/saif-mahmud/hierarchical-attention-HARで入手できる。

Wearable sensor based human activity recognition is a challenging problem due to difficulty in modeling spatial and temporal dependencies of sensor signals. Recognition models in closed-set assumption are forced to yield members of known activity classes as prediction. However, activity recognition models can encounter an unseen activity due to body-worn sensor malfunction or disability of the subject performing the activities. This problem can be addressed through modeling solution according to the assumption of open-set recognition. Hence, the proposed self attention based approach combines data hierarchically from different sensor placements across time to classify closed-set activities and it obtains notable performance improvement over state-of-the-art models on five publicly available datasets. The decoder in this autoencoder architecture incorporates self-attention based feature representations from encoder to detect unseen activity classes in open-set recognition setting. Furthermore, attention maps generated by the hierarchical model demonstrate explainable selection of features in activity recognition. We conduct extensive leave one subject out validation experiments that indicate significantly improved robustness to noise and subject specific variability in body-worn sensor signals. The source code is available at: github.com/saif-mahmud/hierarchical-attention-HAR

翻訳日:2021-03-09 15:55:10 公開日:2021-03-07

# 単発セマンティック部品セグメンテーションのためのGANの再利用

Repurposing GANs for One-shot Semantic Part Segmentation ( http://arxiv.org/abs/2103.04379v1 )

ライセンス: Link先を確認

Nontawat Tritrong, Pitchaporn Rewatbowornwong, Supasorn Suwajanakorn

(参考訳) GANは現実的な画像生成に成功したが、合成とは無関係な他のタスクにGANを使用することのアイデアは明らかにされていない。 GANは、それらのオブジェクトを再生する過程で、オブジェクトの有意義な構造的部分を学ぶか? そこで本研究では,この仮説を検証し,ラベルなしデータセットとともにラベルを1つも必要としない,意味部分セグメンテーションのためのgansに基づく単純かつ効果的なアプローチを提案する。我々のキーとなるアイデアは、訓練されたGANを利用して、入力画像からピクセルワイズ表現を抽出し、セグメンテーションネットワークのための特徴ベクトルとして利用することです。我々の実験は、GANの表現が「可読的に差別的」であり、かなり多くのラベルで訓練された教師付きベースラインと同等の驚くほど良い結果をもたらすことを示した。我々は、gansのこの新しい再提案は、他の多くのタスクに適用可能な教師なし表現学習の新たなクラスであると信じている。詳細は https://repurposegans.github.io/ をご覧ください。

While GANs have shown success in realistic image generation, the idea of using GANs for other tasks unrelated to synthesis is underexplored. Do GANs learn meaningful structural parts of objects during their attempt to reproduce those objects? In this work, we test this hypothesis and propose a simple and effective approach based on GANs for semantic part segmentation that requires as few as one label example along with an unlabeled dataset. Our key idea is to leverage a trained GAN to extract pixel-wise representation from the input image and use it as feature vectors for a segmentation network. Our experiments demonstrate that GANs representation is "readily discriminative" and produces surprisingly good results that are comparable to those from supervised baselines trained with significantly more labels. We believe this novel repurposing of GANs underlies a new class of unsupervised representation learning that is applicable to many other tasks. More results are available at https://repurposegans.github.io/.

翻訳日:2021-03-09 15:54:51 公開日:2021-03-07

# 空間変換領域の感度不一致からの逆例の検出

Detecting Adversarial Examples from Sensitivity Inconsistency of Spatial-Transform Domain ( http://arxiv.org/abs/2103.04302v1 )

ライセンス: Link先を確認

Jinyu Tian, Jiantao Zhou, Yuanman Li, Jia Duan

(参考訳) ディープニューラルネットワーク(DNN)は、劇的なモデル出力エラーを引き起こすように悪質に設計された敵の例(AE)に対して脆弱であることが示されている。本研究では、通常の例(NE)は、決定境界の高度に湾曲した領域で発生する変動に敏感であるのに対し、AEは1つの領域(主に空間領域)上に設計され、そのような変動に対して極端に敏感であることを示す。この現象は、感度の不整合により、元の分類器(原始分類器)と協調してAEを検出することができる変換決定境界を持つ別の分類器(二重分類器)を設計する動機となる。 LID(Local Intrinsic Dimensionality)、MD(Mahalanobis Distance)、FS(Feature Squeezing)に基づく最先端のアルゴリズムと比較して、提案された感度インシネンスディテクタ(SID)は、特に敵対的な摂動レベルが小さい場合において、AE検出性能と優れた一般化能力の向上を実現します。 ResNet と VGG の総合的な実験結果から,提案した SID の優位性を検証した。

Deep neural networks (DNNs) have been shown to be vulnerable against adversarial examples (AEs), which are maliciously designed to cause dramatic model output errors. In this work, we reveal that normal examples (NEs) are insensitive to the fluctuations occurring at the highly-curved region of the decision boundary, while AEs typically designed over one single domain (mostly spatial domain) exhibit exorbitant sensitivity on such fluctuations. This phenomenon motivates us to design another classifier (called dual classifier) with transformed decision boundary, which can be collaboratively used with the original classifier (called primal classifier) to detect AEs, by virtue of the sensitivity inconsistency. When comparing with the state-of-the-art algorithms based on Local Intrinsic Dimensionality (LID), Mahalanobis Distance (MD), and Feature Squeezing (FS), our proposed Sensitivity Inconsistency Detector (SID) achieves improved AE detection performance and superior generalization capabilities, especially in the challenging cases where the adversarial perturbation levels are small. Intensive experimental results on ResNet and VGG validate the superiority of the proposed SID.

翻訳日:2021-03-09 15:47:27 公開日:2021-03-07

# MTLHealth: 学生の摂動コンテンツ検出のための深層学習システム

MTLHealth: A Deep Learning System for Detecting Disturbing Contentin Student Essays ( http://arxiv.org/abs/2103.04290v1 )

ライセンス: Link先を確認

Joseph Valencia, Erin Yao

(参考訳) ACTのような標準化されたテストへのエッセイの提出には、いじめ、自己害、暴力、および邪魔となるコンテンツの他の形態への言及が含まれる。生徒はこのような事件を識別し、危険にさらされている可能性のある学生のために当局に警告するかどうかを判断しなければならない。コンテンツが乱れる可能性を自動で警告することで、人間の意思決定を支援する堅牢なコンピュータシステムの必要性が高まっている。本稿では,計算言語学,特に事前学習型言語モデルトランスフォーマーネットワークの最近の進歩を中心に構築された,乱れたコンテンツ検出パイプラインであるMTLHealthについて述べる。

Essay submissions to standardized tests like the ACT occasionally include references to bullying, self-harm, violence, and other forms of disturbing content. Graders must take great care to identify cases like these and decide whether to alert authorities on behalf of students who may be in danger. There is a growing need for robust computer systems to support human decision-makers by automatically flagging potential instances of disturbing content. This paper describes MTLHealth, a disturbing content detection pipeline built around recent advances from computational linguistics, particularly pre-trained language model Transformer networks.

翻訳日:2021-03-09 15:45:44 公開日:2021-03-07

# Syntax-BERT: シンタックスツリーによるプリトレーニングトランスの改善

Syntax-BERT: Improving Pre-trained Transformers with Syntax Trees ( http://arxiv.org/abs/2103.04350v1 )

ライセンス: Link先を確認

Jiangang Bai, Yujing Wang, Yiren Chen, Yaming Yang, Jing Bai, Jing Yu, Yunhai Tong

(参考訳) BERTのような事前訓練された言語モデルは、構文情報を明確に考慮することなく、様々なNLPタスクで優れたパフォーマンスを実現します。一方、シンタクティック情報はNLPアプリケーションの成功に不可欠であることが証明されています。しかし、構文木を効率的に効率的にトレーニング済みのTransformerに組み込む方法はまだ未定である。本稿では,syntax-bertという新しいフレームワークを提案することで,この問題に解決する。このフレームワークはプラグアンドプレイモードで動作し、Transformerアーキテクチャに基づく任意の事前トレーニングされたチェックポイントに適用できる。自然言語理解のさまざまなデータセットの実験は、構文木の有効性を検証し、BERT、RoBERTa、T5を含む複数の事前学習モデルに対して一貫した改善を実現する。

Pre-trained language models like BERT achieve superior performances in various NLP tasks without explicit consideration of syntactic information. Meanwhile, syntactic information has been proved to be crucial for the success of NLP applications. However, how to incorporate the syntax trees effectively and efficiently into pre-trained Transformers is still unsettled. In this paper, we address this problem by proposing a novel framework named Syntax-BERT. This framework works in a plug-and-play mode and is applicable to an arbitrary pre-trained checkpoint based on Transformer architecture. Experiments on various datasets of natural language understanding verify the effectiveness of syntax trees and achieve consistent improvement over multiple pre-trained models, including BERT, RoBERTa, and T5.

翻訳日:2021-03-09 15:45:33 公開日:2021-03-07

# 共感的BERT2BERT会話モデル:少ないデータでアラビア語生成を学習する

Empathetic BERT2BERT Conversational Model: Learning Arabic Language Generation with Little Data ( http://arxiv.org/abs/2103.04353v1 )

ライセンス: Link先を確認

Tarek Naous, Wissam Antoun, Reem A. Mahmoud, and Hazem Hajj

(参考訳) アラビア語対話エージェントにおける共感行動の実現は、人間のような会話モデルを構築する上で重要な側面である。アラビア自然言語処理はAraBERTのような言語モデルで自然言語理解(NLU)に大きな進歩を遂げているが、自然言語生成(NLG)は依然として課題である。 NLGエンコーダデコーダモデルの欠点は、主に会話エージェントなどのNLGモデルのトレーニングに適したアラビア語データセットがないためです。そこで本論文では,AraBERTパラメータを初期化したトランスベースのエンコーダデコーダを提案する。エンコーダとデコーダの重みをAraBERT事前学習重みで初期化することにより,本モデルでは知識伝達の活用と応答生成の性能向上を実現した。会話モデルにおける共感を可能にするために, arabicempatheticdialoguesデータセットを用いて学習し, 共感応答生成における高いパフォーマンスを達成する。具体的には,従来の最先端モデルと比較して,低パープレキシティ値 17.0 と 5 BLEU 点の増加を達成した。また,提案モデルは85人の評価者によって高く評価され,オープンドメイン設定において,共感を呈示する上で高い能力が検証された。

Enabling empathetic behavior in Arabic dialogue agents is an important aspect of building human-like conversational models. While Arabic Natural Language Processing has seen significant advances in Natural Language Understanding (NLU) with language models such as AraBERT, Natural Language Generation (NLG) remains a challenge. The shortcomings of NLG encoder-decoder models are primarily due to the lack of Arabic datasets suitable to train NLG models such as conversational agents. To overcome this issue, we propose a transformer-based encoder-decoder initialized with AraBERT parameters. By initializing the weights of the encoder and decoder with AraBERT pre-trained weights, our model was able to leverage knowledge transfer and boost performance in response generation. To enable empathy in our conversational model, we train it using the ArabicEmpatheticDialogues dataset and achieve high performance in empathetic response generation. Specifically, our model achieved a low perplexity value of 17.0 and an increase in 5 BLEU points compared to the previous state-of-the-art model. Also, our proposed model was rated highly by 85 human evaluators, validating its high capability in exhibiting empathy while generating relevant and fluent responses in open-domain settings.

翻訳日:2021-03-09 15:45:22 公開日:2021-03-07

# アラビア語文の自動難易度分類

Automatic Difficulty Classification of Arabic Sentences ( http://arxiv.org/abs/2103.04386v1 )

ライセンス: Link先を確認

Nouran Khallaf, Serge Sharoff

(参考訳) 本論文では,CEFRの習熟度レベルと2進分分類を単純あるいは複雑として用いた言語学習者の文の難易度を予測する,現代標準アラビア語(MSA)文難易度分類器を提案する。異なる種類の文埋め込み(fastText, mBERT, XLM-R, Arabic-BERT)とPOSタグ, 依存性木, 可読性スコア, 言語学習者の頻度リストなど, 従来の言語機能との比較を行った。きめ細やかなアラビア-BERTで最高の結果が得られました。 3方向cefr分類の精度はアラビア語-bert分類では0.80, xlm-r分類では0.75, 回帰では0.71スピアマン相関である。我々の二項難易度分類器は文対意味類似度分類器の F-1 0.94 と F-1 0.98 に達する。

In this paper, we present a Modern Standard Arabic (MSA) Sentence difficulty classifier, which predicts the difficulty of sentences for language learners using either the CEFR proficiency levels or the binary classification as simple or complex. We compare the use of sentence embeddings of different kinds (fastText, mBERT , XLM-R and Arabic-BERT), as well as traditional language features such as POS tags, dependency trees, readability scores and frequency lists for language learners. Our best results have been achieved using fined-tuned Arabic-BERT. The accuracy of our 3-way CEFR classification is F-1 of 0.80 and 0.75 for Arabic-Bert and XLM-R classification respectively and 0.71 Spearman correlation for regression. Our binary difficulty classifier reaches F-1 0.94 and F-1 0.98 for sentence-pair semantic similarity classifier.

翻訳日:2021-03-09 15:45:01 公開日:2021-03-07

# スキーマ依存学習によるテキストからSQLへの改善

Improving Text-to-SQL with Schema Dependency Learning ( http://arxiv.org/abs/2103.04399v1 )

ライセンス: Link先を確認

Binyuan Hui, Xiang Shi, Ruiying Geng, Binhua Li, Yongbin Li, Jian Sun, Xiaodan Zhu

(参考訳) Text-to-SQLは自然言語の質問をSQLクエリにマップすることを目的としている。スケッチベースの手法と実行誘導(EG)デコーディング戦略を組み合わせることで、WikiSQLベンチマークでは高いパフォーマンスを示している。しかし、実行誘導型デコーディングはデータベースの実行に依存しており、推論プロセスが大幅に遅くなるため、多くの現実世界のアプリケーションには不満足である。本稿では、質問とスキーマ間の相互作用を効果的に捉えるためのネットワークをガイドするために、スキーマ依存性ガイド付きマルチタスクテキスト・ツー・SQLモデル(SDSQL)を紹介します。提案モデルは,eg の有無に関わらず,既存のメソッドをすべて上回っている。スキーマ依存性の学習は、EGのメリットを部分的にカバーし、その必要性を軽減します。 EGなしのSDSQLは、推論時の時間消費を大幅に削減し、少数のパフォーマンスを犠牲にし、ダウンストリームアプリケーションに柔軟性を提供します。

Text-to-SQL aims to map natural language questions to SQL queries. The sketch-based method combined with execution-guided (EG) decoding strategy has shown a strong performance on the WikiSQL benchmark. However, execution-guided decoding relies on database execution, which significantly slows down the inference process and is hence unsatisfactory for many real-world applications. In this paper, we present the Schema Dependency guided multi-task Text-to-SQL model (SDSQL) to guide the network to effectively capture the interactions between questions and schemas. The proposed model outperforms all existing methods in both the settings with or without EG. We show the schema dependency learning partially cover the benefit from EG and alleviates the need for it. SDSQL without EG significantly reduces time consumption during inference, sacrificing only a small amount of performance and provides more flexibility for downstream applications.

翻訳日:2021-03-09 15:44:44 公開日:2021-03-07

# 仮想常態:精度とロバスト深さ予測のための幾何学的制約を強制する

Virtual Normal: Enforcing Geometric Constraintsfor Accurate and Robust Depth Prediction ( http://arxiv.org/abs/2103.04216v1 )

ライセンス: Link先を確認

Wei Yin and Yifan Liu and Chunhua Shen

(参考訳) 単眼深度予測は3次元シーン形状の理解において重要な役割を担っている。近年の手法は画素単位の相対誤差などの評価指標で顕著な進歩を遂げているが、ほとんどの手法は3次元空間における幾何的制約を無視している。本研究では,深度予測のための高次3次元幾何学的制約の重要性を示す。再構成された3次元空間でランダムにサンプリングされた3点によって決定される仮想正規方向という単純な幾何学的制約を強制する損失項を設計することにより、単眼深度推定の精度とロバスト性を大幅に向上させる。重要なことは、仮想正規損失は、学習メートル法深度の性能を向上するだけでなく、スケール情報を解き、より優れた形状情報でモデルを豊かにする。したがって、絶対距離深度トレーニングデータにアクセスできない場合、仮想正規法を用いて多様なシーンで生成される強固なアフィン不変深さを学ぶことができる。実験では,NYU Depth-V2 と KITTI の学習深度について,最先端の学習結果を示す。高品質の予測深度から、ポイント雲や表面の正常といったシーンの優れた3次元構造を復元することが可能となり、これまでやってきたような追加モデルに頼る必要がなくなる。仮想正規損失による多様なデータに対するアフィン不変深度学習の汎用性を示すために、アフィン不変深度トレーニングのための大規模かつ多様なデータセット、いわゆるDiverse Scene Depthデータセット(DiverseDepth)を構築し、ゼロショットテスト設定で5つのデータセットをテストする。

Monocular depth prediction plays a crucial role in understanding 3D scene geometry. Although recent methods have achieved impressive progress in terms of evaluation metrics such as the pixel-wise relative error, most methods neglect the geometric constraints in the 3D space. In this work, we show the importance of the high-order 3D geometric constraints for depth prediction. By designing a loss term that enforces a simple geometric constraint, namely, virtual normal directions determined by randomly sampled three points in the reconstructed 3D space, we significantly improve the accuracy and robustness of monocular depth estimation. Significantly, the virtual normal loss can not only improve the performance of learning metric depth, but also disentangle the scale information and enrich the model with better shape information. Therefore, when not having access to absolute metric depth training data, we can use virtual normal to learn a robust affine-invariant depth generated on diverse scenes. In experiments, We show state-of-the-art results of learning metric depth on NYU Depth-V2 and KITTI. From the high-quality predicted depth, we are now able to recover good 3D structures of the scene such as the point cloud and surface normal directly, eliminating the necessity of relying on additional models as was previously done. To demonstrate the excellent generalizability of learning affine-invariant depth on diverse data with the virtual normal loss, we construct a large-scale and diverse dataset for training affine-invariant depth, termed Diverse Scene Depth dataset (DiverseDepth), and test on five datasets with the zero-shot test setting.

翻訳日:2021-03-09 15:40:32 公開日:2021-03-07

# MeGA-CDA: カテゴリ別非監視ドメイン適応オブジェクト検出のためのメモリガイド注意

MeGA-CDA: Memory Guided Attention for Category-Aware Unsupervised Domain Adaptive Object Detection ( http://arxiv.org/abs/2103.04224v1 )

ライセンス: Link先を確認

Vibashan VS, Poojan Oza, Vishwanath A. Sindagi, Vikram Gupta, Vishal M. Patel

(参考訳) 教師なしドメイン適応オブジェクト検出のための既存のアプローチは、逆訓練によって機能アライメントを実行する。これらの手法は性能を合理的に改善するが、典型的にはカテゴリーに依存しない領域アライメントを行い、結果として特徴の負の移動をもたらす。そこで本研究では,カテゴリ・アウェア・ドメイン適応(MeGA-CDA)のためのメモリガイドアテンションを提案することで,カテゴリ情報をドメイン適応プロセスに組み込もうとする。提案手法は,カテゴリー別識別器を用いて,カテゴリ別識別特徴を学習するためのカテゴリ別特徴アライメントを保証する。しかし,対象のサンプルではカテゴリ情報が利用できないため,その特徴を対応するカテゴリ判別器に適切にルーティングするために,メモリガイド付きカテゴリ特異的注意マップを作成することを提案する。提案手法はいくつかのベンチマークデータセットで評価され,既存のアプローチを上回っていることが示された。

Existing approaches for unsupervised domain adaptive object detection perform feature alignment via adversarial training. While these methods achieve reasonable improvements in performance, they typically perform category-agnostic domain alignment, thereby resulting in negative transfer of features. To overcome this issue, in this work, we attempt to incorporate category information into the domain adaptation process by proposing Memory Guided Attention for Category-Aware Domain Adaptation (MeGA-CDA). The proposed method consists of employing category-wise discriminators to ensure category-aware feature alignment for learning domain-invariant discriminative features. However, since the category information is not available for the target samples, we propose to generate memory-guided category-specific attention maps which are then used to route the features appropriately to the corresponding category discriminator. The proposed method is evaluated on several benchmark datasets and is shown to outperform existing approaches.

翻訳日:2021-03-09 15:40:04 公開日:2021-03-07

# ディープグラフマッチングに基づくロバストポイントクラウド登録フレームワーク

Robust Point Cloud Registration Framework Based on Deep Graph Matching ( http://arxiv.org/abs/2103.04256v1 )

ライセンス: Link先を確認

Kexue Fu and Shaolei Liu and Xiaoyuan Luo and Manning Wang

(参考訳) 3Dポイントクラウド登録は、コンピュータビジョンとロボティクスにおける基本的な問題です。この分野では広範な研究が行われてきたが、既存の手法は、多くの異常値と時間的制約がある状況において大きな課題を満たしている。近年,学習に基づくアルゴリズムが多数導入され,高速化のメリットが示された。それらの多くは2つの点雲間の対応に基づいているため、変換初期化に依存しない。しかし、これらの学習に基づく手法は外れ値に敏感であり、より誤った対応をもたらす。本稿では,ポイントクラウド登録のための新しいディープグラフマッチングベースのフレームワークを提案する。具体的には、まず点雲をグラフに変換し、各点の深い特徴を抽出する。そこで我々は, 深部グラフマッチングに基づくモジュールを開発し, ソフト対応行列を計算した。グラフマッチングを用いることで、各点の局所幾何学だけでなく、より広い範囲におけるその構造やトポロジーも対応付けを確立することで、より正確な対応が見出される。通信上で直接定義された損失でネットワークを訓練し、テスト段階ではソフト対応をハードな1対1対応に変換し、単価分解により登録が行えるようにします。さらに,グラフ構築のためのエッジを生成するトランスベース手法を導入し,対応文の品質をさらに向上させる。クリーン,ノイズ,部分的・部分的・不可視のカテゴリー点雲の登録実験により,提案手法が最先端の性能を達成することを示す。コードはhttps://github.com/fukexue/RGMで公開される。

3D point cloud registration is a fundamental problem in computer vision and robotics. There has been extensive research in this area, but existing methods meet great challenges in situations with a large proportion of outliers and time constraints, but without good transformation initialization. Recently, a series of learning-based algorithms have been introduced and show advantages in speed. Many of them are based on correspondences between the two point clouds, so they do not rely on transformation initialization. However, these learning-based methods are sensitive to outliers, which lead to more incorrect correspondences. In this paper, we propose a novel deep graph matchingbased framework for point cloud registration. Specifically, we first transform point clouds into graphs and extract deep features for each point. Then, we develop a module based on deep graph matching to calculate a soft correspondence matrix. By using graph matching, not only the local geometry of each point but also its structure and topology in a larger range are considered in establishing correspondences, so that more correct correspondences are found. We train the network with a loss directly defined on the correspondences, and in the test stage the soft correspondences are transformed into hard one-to-one correspondences so that registration can be performed by singular value decomposition. Furthermore, we introduce a transformer-based method to generate edges for graph construction, which further improves the quality of the correspondences. Extensive experiments on registering clean, noisy, partial-to-partial and unseen category point clouds show that the proposed method achieves state-of-the-art performance. The code will be made publicly available at https://github.com/fukexue/RGM.

翻訳日:2021-03-09 15:39:47 公開日:2021-03-07

# リフレクションフリーフラッシュのみによるロバスト反射除去

Robust Reflection Removal with Reflection-free Flash-only Cues ( http://arxiv.org/abs/2103.04273v1 )

ライセンス: Link先を確認

Chenyang Lei and Qifeng Chen

(参考訳) フラッシュとアンビエント(非フラッシュ)の2つの画像から、堅牢な反射除去のためのシンプルで効果的な反射レスキューを提案します。反射フリーキューは、対応するフラッシュ画像から周囲画像を原データ空間に減じて得られるフラッシュ専用画像を利用する。フラッシュのみの画像は、フラッシュオンのみの暗い環境で撮影された画像と同等である。このフラッシュのみの画像は視覚的に反射しないため、周囲画像の反射を推測するロバストな手がかりが得られる。フラッシュのみの画像には通常アーティファクトがあるため、反射のないキューを利用するだけでなく、反射と透過を正確に推定するアーティファクトの導入を避ける専用モデルも提案する。私たちのモデルはPSNRの5.23dB、SSIMの0.04、LPIPSの0.068以上の最先端の反射除去アプローチを上回っています。ソースコードとデータセットは \href{https://github.com/ChenyangLEI/flash-reflection-removal}{このウェブサイト} で公開されます。

We propose a simple yet effective reflection-free cue for robust reflection removal from a pair of flash and ambient (no-flash) images. The reflection-free cue exploits a flash-only image obtained by subtracting the ambient image from the corresponding flash image in raw data space. The flash-only image is equivalent to an image taken in a dark environment with only a flash on. We observe that this flash-only image is visually reflection-free, and thus it can provide robust cues to infer the reflection in the ambient image. Since the flash-only image usually has artifacts, we further propose a dedicated model that not only utilizes the reflection-free cue but also avoids introducing artifacts, which helps accurately estimate reflection and transmission. Our experiments on real-world images with various types of reflection demonstrate the effectiveness of our model with reflection-free flash-only cues: our model outperforms state-of-the-art reflection removal approaches by more than 5.23dB in PSNR, 0.04 in SSIM, and 0.068 in LPIPS. Our source code and dataset will be publicly available at \href{https://github.com/ChenyangLEI/flash-reflection-removal}{this website}.

翻訳日:2021-03-09 15:39:23 公開日:2021-03-07

# 教師なしクロスドメイン翻訳のための交代MCMC指導による学習サイクル一貫性協調ネットワーク

Learning Cycle-Consistent Cooperative Networks via Alternating MCMC Teaching for Unsupervised Cross-Domain Translation ( http://arxiv.org/abs/2103.04285v1 )

ライセンス: Link先を確認

Jianwen Xie, Zilong Zheng, Xiaolin Fang, Song-Chun Zhu, Ying Nian Wu

(参考訳) 本稿では,各領域の確率分布をエネルギーベースモデルと潜在変数モデルからなる生成協調ネットワークで表現する生成フレームワークを提案することにより,教師なしのクロスドメイン翻訳問題について検討する。生成協調ネットワークを利用することで、MCMC教育によるドメインモデルの最大限の学習が可能となり、エネルギーベースモデルでは、ドメインのデータ分布に適合し、MCMCを介して潜在変数モデルにその知識を蒸留する。具体的には、MCMCの指導過程において、エンコーダデコーダによりパラメータ化された潜在変数モデルは、ソースドメインから対象ドメインにマッピングする一方、エネルギーベースモデルは、学習エネルギー関数によって定義される統計特性の観点から、修正結果が対象ドメインの例と一致するように、ランゲヴィンリビジョンによってマッピングされた結果をさらに洗練する。 2つのドメイン間の対応を構築するために,MCMC教育の交互化により,2つのドメイン間の双方向翻訳を考慮し,サイクル整合性のある協調ネットワークのペアを同時に学習する。提案手法は,教師なし画像から画像への変換とペアなし画像のシーケンス変換に有用であることを示す。

This paper studies the unsupervised cross-domain translation problem by proposing a generative framework, in which the probability distribution of each domain is represented by a generative cooperative network that consists of an energy-based model and a latent variable model. The use of generative cooperative network enables maximum likelihood learning of the domain model by MCMC teaching, where the energy-based model seeks to fit the data distribution of domain and distills its knowledge to the latent variable model via MCMC. Specifically, in the MCMC teaching process, the latent variable model parameterized by an encoder-decoder maps examples from the source domain to the target domain, while the energy-based model further refines the mapped results by Langevin revision such that the revised results match to the examples in the target domain in terms of the statistical properties, which are defined by the learned energy function. For the purpose of building up a correspondence between two unpaired domains, the proposed framework simultaneously learns a pair of cooperative networks with cycle consistency, accounting for a two-way translation between two domains, by alternating MCMC teaching. Experiments show that the proposed framework is useful for unsupervised image-to-image translation and unpaired image sequence translation.

翻訳日:2021-03-09 15:39:03 公開日:2021-03-07

# RFN-Nest:赤外・可視画像のためのエンドツーエンド残差核融合ネットワーク

RFN-Nest: An end-to-end residual fusion network for infrared and visible images ( http://arxiv.org/abs/2103.04286v1 )

ライセンス: Link先を確認

Hui Li, Xiao-Jun Wu, Josef Kittler

(参考訳) 画像融合分野では、深層学習に基づく融合法の設計は日常的ではない。それは常に融合タスク特異的であり、慎重な考慮が必要です。設計の最も難しい部分は、特定のタスクの融合画像を生成するための適切な戦略を選択することです。したがって、学習可能な融合戦略の考案は、画像融合のコミュニティで非常に困難な問題です。この問題を解決するために、赤外線および可視画像融合のための新しいエンドツーエンド融合ネットワークアーキテクチャ(RFN-Nest)を開発した。本稿では,従来の核融合方式を代替する残差構造に基づく残差核融合ネットワーク(RFN)を提案する。 RFNを訓練するために、新しい詳細保存損失関数と機能強化損失関数が提案される。融合モデル学習は、新しい二段階学習戦略によって達成される。最初の段階では、革新的なネスト接続(Nest)の概念に基づいて自動エンコーダをトレーニングします。次に、提案された損失関数を用いてrfnを訓練する。パブリックドメインデータセットにおける実験結果は,既存の手法と比較して,主観的および客観的評価において,エンドツーエンドのフュージョンネットワークが最先端の手法よりも優れた性能を提供することを示した。私たちの融合メソッドのコードはhttps://github.com/hli1221/imagefusion-rfn-nestで入手できます。

In the image fusion field, the design of deep learning-based fusion methods is far from routine. It is invariably fusion-task specific and requires a careful consideration. The most difficult part of the design is to choose an appropriate strategy to generate the fused image for a specific task in hand. Thus, devising learnable fusion strategy is a very challenging problem in the community of image fusion. To address this problem, a novel end-to-end fusion network architecture (RFN-Nest) is developed for infrared and visible image fusion. We propose a residual fusion network (RFN) which is based on a residual architecture to replace the traditional fusion approach. A novel detail-preserving loss function, and a feature enhancing loss function are proposed to train RFN. The fusion model learning is accomplished by a novel two-stage training strategy. In the first stage, we train an auto-encoder based on an innovative nest connection (Nest) concept. Next, the RFN is trained using the proposed loss functions. The experimental results on public domain data sets show that, compared with the existing methods, our end-to-end fusion network delivers a better performance than the state-of-the-art methods in both subjective and objective evaluation. The code of our fusion method is available at https://github.com/hli1221/imagefusion-rfn-nest

翻訳日:2021-03-09 15:38:39 公開日:2021-03-07

# ハイパースペクトル画像の超解像のための空間スペクトルフィードバックネットワーク

Spatial-Spectral Feedback Network for Super-Resolution of Hyperspectral Imagery ( http://arxiv.org/abs/2103.04354v1 )

ライセンス: Link先を確認

Enhai Liu, Zhenjie Tang, Bin Pan, Zhenwei Shi

(参考訳) 近年、深層学習に基づく単一グレー/RGB画像スーパーレゾリューション(SR)法が大きな成功を収めています。しかし、単一ハイパースペクトル画像超解像の技術的発展を制限するには2つの障害がある。 1つは高スペクトル像の高次元および複雑なスペクトルパターンであり、バンド間の空間情報とスペクトル情報の同時探索が困難である。もうひとつは、利用可能なハイパースペクトルトレーニングサンプルの数は極めて少なく、ディープニューラルネットワークのトレーニング時にオーバーフィットする可能性があることだ。そこで本論文では,局所スペクトル帯域間の低レベル表現をグローバルスペクトル帯域から高レベル情報を用いて改善するSSFN(Spatial-Spectral Feedback Network)を提案する。ハイパースペクトルデータの高い次元による特徴抽出の難しさを軽減するだけでなく、トレーニングプロセスをより安定させます。具体的には、そのようなフィードバックの仕方を達成するために、有限展開を持つRNNの隠れ状態を用いる。 SSFB(Spatial-Spectral Feedback Block)は、空間とスペクトルの事前利用のために、フィードバック接続を処理し、強力な高レベルの表現を生成するように設計されている。提案したSSFNは早期予測を伴い、最終高分解能ハイパースペクトル像を段階的に再構成することができる。 3つのベンチマークデータセットの大規模な実験結果から,提案したSSFNは最先端手法と比較して優れた性能を示した。ソースコードはhttps://github.com/tangzhenjie/ssfnで入手できる。

Recently, single gray/RGB image super-resolution (SR) methods based on deep learning have achieved great success. However, there are two obstacles to limit technical development in the single hyperspectral image super-resolution. One is the high-dimensional and complex spectral patterns in hyperspectral image, which make it difficult to explore spatial information and spectral information among bands simultaneously. The other is that the number of available hyperspectral training samples is extremely small, which can easily lead to overfitting when training a deep neural network. To address these issues, in this paper, we propose a novel Spatial-Spectral Feedback Network (SSFN) to refine low-level representations among local spectral bands with high-level information from global spectral bands. It will not only alleviate the difficulty in feature extraction due to high dimensional of hyperspectral data, but also make the training process more stable. Specifically, we use hidden states in an RNN with finite unfoldings to achieve such feedback manner. To exploit the spatial and spectral prior, a Spatial-Spectral Feedback Block (SSFB) is designed to handle the feedback connections and generate powerful high-level representations. The proposed SSFN comes with a early predictions and can reconstruct the final high-resolution hyperspectral image step by step. Extensive experimental results on three benchmark datasets demonstrate that the proposed SSFN achieves superior performance in comparison with the state-of-the-art methods. The source code is available at https://github.com/tangzhenjie/SSFN.

翻訳日:2021-03-09 15:38:18 公開日:2021-03-07

# Insta-RS: ロバスト性と精度を向上するためのインスタンスワイズランダム化スムージング

Insta-RS: Instance-wise Randomized Smoothing for Improved Robustness and Accuracy ( http://arxiv.org/abs/2103.04436v1 )

ライセンス: Link先を確認

Chen Chen, Kezhi Kong, Peihong Yu, Juan Luque, Furong Huang

(参考訳) ランダム化平滑化(英: randomized smoothing、rs)は、ニューラルネットワークの分類器を構築するための効果的でスケーラブルな手法である。ほとんどのrsは、滑らかなモデルの認定された堅牢性を高める優れたベースモデルのトレーニングにフォーカスしています。しかし、既存のRS技術は全てのデータポイントを同じ扱い、すなわち、滑らかなモデルを形成するために使用されるガウスノイズの分散は、すべてのトレーニングデータとテストデータに対してプリセットされ普遍的である。このプリセットおよび普遍ガウスノイズ分散は、異なるデータポイントが異なるマージンを持ち、ベースモデルの局所特性が入力例によって異なるため、最適である。本稿では、サンプルのカスタマイズ処理の影響について検討し、サンプルにカスタマイズされたガウス分散を割り当てるマルチスタート探索アルゴリズムであるインスタンスワイズランダム化平滑化(Insta-RS)を提案する。また、インスタンスワイズガウス平滑化モデルの認証された堅牢性を高めるベースモデルをトレーニングするために、各トレーニング例のノイズレベルを適応的に調整およびカスタマイズする新しい2段階トレーニングアルゴリズムであるInsta-RS Trainも設計しています。 CIFAR-10 と ImageNet の広範な実験により,本手法は既存の最先端の堅牢な分類器と比較して,平均認証半径 (ACR) とクリーンデータの精度を著しく向上させることを示した。

Randomized smoothing (RS) is an effective and scalable technique for constructing neural network classifiers that are certifiably robust to adversarial perturbations. Most RS works focus on training a good base model that boosts the certified robustness of the smoothed model. However, existing RS techniques treat every data point the same, i.e., the variance of the Gaussian noise used to form the smoothed model is preset and universal for all training and test data. This preset and universal Gaussian noise variance is suboptimal since different data points have different margins and the local properties of the base model vary across the input examples. In this paper, we examine the impact of customized handling of examples and propose Instance-wise Randomized Smoothing (Insta-RS) -- a multiple-start search algorithm that assigns customized Gaussian variances to test examples. We also design Insta-RS Train -- a novel two-stage training algorithm that adaptively adjusts and customizes the noise level of each training example for training a base model that boosts the certified robustness of the instance-wise Gaussian smoothed model. Through extensive experiments on CIFAR-10 and ImageNet, we show that our method significantly enhances the average certified radius (ACR) as well as the clean data accuracy compared to existing state-of-the-art provably robust classifiers.

翻訳日:2021-03-09 15:28:59 公開日:2021-03-07

# マルチエージェントゲームにおける潜在知能レベル推定による人間報酬の学習 : 運転データへの適用による極小アプローチ

Learning Human Rewards by Inferring Their Latent Intelligence Levels in Multi-Agent Games: A Theory-of-Mind Approach with Application to Driving Data ( http://arxiv.org/abs/2103.04289v1 )

ライセンス: Link先を確認

Ran Tian, Masayoshi Tomizuka, and Liting Sun

(参考訳) リワード機能は、人間のエージェントを認識し、人間の行動を合理化するインセンティブとして、特に人間とロボットの相互作用における人間の行動のモデル化に魅力がある。逆強化学習は、デモから報酬関数を取得する効果的な方法です。しかし,エージェント間の相互影響を適切にモデル化する必要があるため,マルチエージェント設定に適用することは常に困難である。この課題に取り組むために、以前の研究では、人間を無限の知性を持つ完全合理的なオプティマイザと仮定することによって平衡解の概念を利用するか、人間の相互作用戦略を優先順位付けする。本研究では、他者の意思決定過程を推論するとき、人間は理性に縛られ、異なる知能レベルを持つことを提唱し、このような固有的および潜在的特性は報酬学習アルゴリズムにおいて考慮されるべきである。そこで我々は,このような知見を心の理論から活用し,学習中の人間の潜在知性レベルを理由とする,新しい多エージェント逆強化学習フレームワークを提案する。ゼロサムとジェネラルサムの両方のゲームにおけるアプローチを合成エージェントで検証し、実際の運転データから人間のドライバーの報酬機能を学ぶための実用的なアプリケーションを示しています。アプローチを2つのベースラインアルゴリズムと比較する。その結果、人間の潜伏した知能レベルを推察することで、提案手法は人間の運転行動をよりよく説明できる報酬関数をより柔軟かつ高めることができることがわかった。

Reward function, as an incentive representation that recognizes humans' agency and rationalizes humans' actions, is particularly appealing for modeling human behavior in human-robot interaction. Inverse Reinforcement Learning is an effective way to retrieve reward functions from demonstrations. However, it has always been challenging when applying it to multi-agent settings since the mutual influence between agents has to be appropriately modeled. To tackle this challenge, previous work either exploits equilibrium solution concepts by assuming humans as perfectly rational optimizers with unbounded intelligence or pre-assigns humans' interaction strategies a priori. In this work, we advocate that humans are bounded rational and have different intelligence levels when reasoning about others' decision-making process, and such an inherent and latent characteristic should be accounted for in reward learning algorithms. Hence, we exploit such insights from Theory-of-Mind and propose a new multi-agent Inverse Reinforcement Learning framework that reasons about humans' latent intelligence levels during learning. We validate our approach in both zero-sum and general-sum games with synthetic agents and illustrate a practical application to learning human drivers' reward functions from real driving data. We compare our approach with two baseline algorithms. The results show that by reasoning about humans' latent intelligence levels, the proposed approach has more flexibility and capability to retrieve reward functions that explain humans' driving behaviors better.

翻訳日:2021-03-09 15:25:56 公開日:2021-03-07

# 無線エッジネットワークを用いた分散学習のための共同符号化とスケジューリング最適化

Joint Coding and Scheduling Optimization for Distributed Learning over Wireless Edge Networks ( http://arxiv.org/abs/2103.04303v1 )

ライセンス: Link先を確認

Nguyen Van Huynh, Dinh Thai Hoang, Diep N. Nguyen, and Eryk Dutkiewicz

(参考訳) 理論的分散学習(DL)とは異なり、無線エッジネットワーク上のDLは、無線接続とエッジノードの固有のダイナミクス/不確実性に直面しており、非常にダイナミックな無線エッジネットワーク(例えばmmWインターフェースを使用して)下でDLを効率性や適用性が低下させる。本稿では,近年のコーデックコンピューティングとディープデューリングニューラルネットワークアーキテクチャを活用し,これらの問題に対処する。コード化された構造/冗長性を導入することで、ノードをつまずくのを待つことなく、分散学習タスクを完了することができる。コード構造のみを最適化する従来のコードドコンピューティングとは異なり、ワイヤレスエッジ上のコードド分散学習では、異種接続によるワイヤレスエッジノードの選択/スケジュール、計算能力、ストラグリング効果も最適化する必要がある。しかし、前述のダイナミクス/未知性を無視しても、分散学習時間を最小化するためのコーディングとスケジューリングの協調最適化はnpハードであることが判明した。そこで我々は,無線接続とエッジノードのダイナミクスと不確実性を考慮し,問題をマルコフ決定プロセスとして再構成し,ディープ・デュリングニューラルネットワークアーキテクチャを用いた新しい深層強化学習アルゴリズムを設計し,無線環境とエッジノードのストラグリングパラメータに関する情報を明示することなく,異なる学習タスクのための最適な符号化方式と最良エッジノードを探索する。シミュレーションでは、提案されたフレームワークは、他のDLアプローチと比較して、無線エッジコンピューティングの平均学習遅延を最大66%削減する。本記事での共同最適フレームワークは、異種および不確実な計算ノードを持つ任意の分散学習スキームにも適用可能である。

Unlike theoretical distributed learning (DL), DL over wireless edge networks faces the inherent dynamics/uncertainty of wireless connections and edge nodes, making DL less efficient or even inapplicable under the highly dynamic wireless edge networks (e.g., using mmW interfaces). This article addresses these problems by leveraging recent advances in coded computing and the deep dueling neural network architecture. By introducing coded structures/redundancy, a distributed learning task can be completed without waiting for straggling nodes. Unlike conventional coded computing that only optimizes the code structure, coded distributed learning over the wireless edge also requires to optimize the selection/scheduling of wireless edge nodes with heterogeneous connections, computing capability, and straggling effects. However, even neglecting the aforementioned dynamics/uncertainty, the resulting joint optimization of coding and scheduling to minimize the distributed learning time turns out to be NP-hard. To tackle this and to account for the dynamics and uncertainty of wireless connections and edge nodes, we reformulate the problem as a Markov Decision Process and then design a novel deep reinforcement learning algorithm that employs the deep dueling neural network architecture to find the jointly optimal coding scheme and the best set of edge nodes for different learning tasks without explicit information about the wireless environment and edge nodes' straggling parameters. Simulations show that the proposed framework reduces the average learning delay in wireless edge computing up to 66% compared with other DL approaches. The jointly optimal framework in this article is also applicable to any distributed learning scheme with heterogeneous and uncertain computing nodes.

翻訳日:2021-03-09 15:25:30 公開日:2021-03-07

# マルチモーダルVAEアクティブ推論コントローラ

Multimodal VAE Active Inference Controller ( http://arxiv.org/abs/2103.04412v1 )

ライセンス: Link先を確認

Cristian Meo and Pablo Lanillos

(参考訳) 脳処理に触発された理論的構造であるアクティブ推論は、人工薬剤を制御するための有望な代替手段である。しかし、現在の方法は、連続制御の高次元入力にはまだスケールしない。本稿では,従来の受容的アプローチの適応特性を維持しつつ,大規模なマルチモーダル統合(生画像など)を可能にする産業用アーム用アクティブ・推論トルクコントローラを提案する。線形結合型マルチモーダル変分オートエンコーダを用いたマルチモーダル状態表現学習を含む以前の数学的定式化を拡張した。シミュレーションされた7DOF Franka Emika Pandaロボットアーム上でモデルを評価し、その動作を以前のアクティブ推論ベースラインとPanda組み込み最適化コントローラと比較した。その結果, 生成モデルやパラメータの調整を必要とせず, 表現力の増大, 騒音に対するロバスト性, 環境条件やロボットパラメータの変化への適応性等により, 目標方向到達時の追従性, 制御性が向上した。

Active inference, a theoretical construct inspired by brain processing, is a promising alternative to control artificial agents. However, current methods do not yet scale to high-dimensional inputs in continuous control. Here we present a novel active inference torque controller for industrial arms that maintains the adaptive characteristics of previous proprioceptive approaches but also enables large-scale multimodal integration (e.g., raw images). We extended our previous mathematical formulation by including multimodal state representation learning using a linearly coupled multimodal variational autoencoder. We evaluated our model on a simulated 7DOF Franka Emika Panda robot arm and compared its behavior with a previous active inference baseline and the Panda built-in optimized controller. Results showed improved tracking and control in goal-directed reaching due to the increased representation power, high robustness to noise and adaptability in changes on the environmental conditions and robot parameters without the need to relearn the generative models nor parameters retuning.

翻訳日:2021-03-09 15:25:02 公開日:2021-03-07

# 確率制御確率勾配法によるサドル点のエスケープ

Escaping Saddle Points with Stochastically Controlled Stochastic Gradient Methods ( http://arxiv.org/abs/2103.04413v1 )

ライセンス: Link先を確認

Guannan Liang, Qianqian Tong, Chunjiang Zhu, Jinbo bi

(参考訳) 確率的に制御された確率勾配(SCSG)法は1次定常点に効率よく収束することが証明されているが、非凸最適化ではサドル点となる。確率勾配降下 (SGD) ステップは, 深層学習と非凸空間学習問題に対するサドル点周辺の異方性雑音を生じさせ, これらの問題に対してSGDが相関負曲率 (CNC) 条件を満たすことを示す。そこで我々は,SCSG法が厳密なサドル点から脱出するのを助けるためにSGDステップを別々に使用し,CNC-SCSG法を提案する。 SGDステップはノイズ注入と同じような役割を果たすが、より安定している。結果のアルゴリズムは、$\tilde{O}( \epsilon^{-2} log( 1/\epsilon)$の収束率を持つ2次定常点に収束することを証明している。この収束率は問題次元とは独立であり、cnc-sgdよりも高速である。より一般的なフレームワークは、提案する cnc-scsg を任意の一階法に組み込むように設計されている。シミュレーション研究により、提案アルゴリズムはノイズ注入またはSGDステップによって摂動される勾配降下法よりもはるかに少ないエポックでサドル点を回避できることを示した。

Stochastically controlled stochastic gradient (SCSG) methods have been proved to converge efficiently to first-order stationary points which, however, can be saddle points in nonconvex optimization. It has been observed that a stochastic gradient descent (SGD) step introduces anistropic noise around saddle points for deep learning and non-convex half space learning problems, which indicates that SGD satisfies the correlated negative curvature (CNC) condition for these problems. Therefore, we propose to use a separate SGD step to help the SCSG method escape from strict saddle points, resulting in the CNC-SCSG method. The SGD step plays a role similar to noise injection but is more stable. We prove that the resultant algorithm converges to a second-order stationary point with a convergence rate of $\tilde{O}( \epsilon^{-2} log( 1/\epsilon))$ where $\epsilon$ is the pre-specified error tolerance. This convergence rate is independent of the problem dimension, and is faster than that of CNC-SGD. A more general framework is further designed to incorporate the proposed CNC-SCSG into any first-order method for the method to escape saddle points. Simulation studies illustrate that the proposed algorithm can escape saddle points in much fewer epochs than the gradient descent methods perturbed by either noise injection or a SGD step.

翻訳日:2021-03-09 15:24:46 公開日:2021-03-07

# GANav:非構造屋外環境におけるナビゲーション可能な地域分類のためのグループワイドアテンションネットワーク

GANav: Group-wise Attention Network for Classifying Navigable Regions in Unstructured Outdoor Environments ( http://arxiv.org/abs/2103.04233v1 )

ライセンス: Link先を確認

Tianrui Guan, Divya Kothandaraman, Rohan Chandra and Dinesh Manocha

(参考訳) 本稿では,RGB画像から,オフロード地形および非構造環境における安全かつ航行可能な領域を識別する新しい学習手法を提案する。本手法は,粒度の粗いセマンティックセグメンテーションを用いて,そのナビビリティレベルに基づいて地形分類群を分類する。本稿では,新しいグループアテンション機構を用いて異なる地形の航行性レベルを識別する,ボトルネックトランスに基づくディープニューラルネットワークアーキテクチャを提案する。グループアテンションヘッドにより,ネットワークが異なるグループに明示的に焦点を合わせ,精度を向上させることができる。さらに,データセットの長い尾の性質を扱うために,動的重み付きクロスエントロピー損失関数を提案する。 RUGD と RELLIS-3D のデータセットを広範囲に評価することにより,我々の学習アルゴリズムがナビゲーションのためのオフロード地形における視覚知覚の精度を向上させることを示す。これらのデータセットに対する先行研究と比較し,rugdでは6.74-39.1%,rellis-3dでは3.82-10.64%改善した。

We present a new learning-based method for identifying safe and navigable regions in off-road terrains and unstructured environments from RGB images. Our approach consists of classifying groups of terrain classes based on their navigability levels using coarse-grained semantic segmentation. We propose a bottleneck transformer-based deep neural network architecture that uses a novel group-wise attention mechanism to distinguish between navigability levels of different terrains.Our group-wise attention heads enable the network to explicitly focus on the different groups and improve the accuracy. In addition, we propose a dynamic weighted cross entropy loss function to handle the long-tailed nature of the dataset. We show through extensive evaluations on the RUGD and RELLIS-3D datasets that our learning algorithm improves the accuracy of visual perception in off-road terrains for navigation. We compare our approach with prior work on these datasets and achieve an improvement over the state-of-the-art mIoU by 6.74-39.1% on RUGD and 3.82-10.64% on RELLIS-3D.

翻訳日:2021-03-09 15:21:17 公開日:2021-03-07

# 野生のディープフェイクビデオ:分析と検出

Deepfake Videos in the Wild: Analysis and Detection ( http://arxiv.org/abs/2103.04263v1 )

ライセンス: Link先を確認

Jiameng Pu, Neal Mangaokar, Lauren Kelly, Parantapa Bhattacharya, Kavya Sundaram, Mobin Javed, Bolun Wang, Bimal Viswanath

(参考訳) aiが操作するビデオ、通称deepfakesは、新しい問題だ。近年、学界や産業界の研究者が、いくつかの(自己作成)ベンチマークdeepfakeデータセットとdeepfake検出アルゴリズムに貢献している。しかし、ディープフェイク動画の理解に向けた努力はほとんど行っていないため、この分野における研究貢献の現実的な適用性についての理解は限られている。既存のデータセットで検出スキームがうまく機能していることが示されたとしても、実際のディープフェイクに対するメソッドの一般性は明らかでない。まず、YouTubeとBilibiliからの1,869のビデオを含む、野生のディープフェイクビデオの最大のデータセットを収集し、提示し、コンテンツの4.8Mフレーム以上を抽出します。第2に,実世界におけるディープフェイクコンテンツの成長パターン,人気,クリエーター,操作戦略,生産方法の包括的分析を行った。第三に、我々は新しいデータセットを使って既存の防衛を体系的に評価し、実際の世界に配備する準備が整っていないことを観察する。第四に、我々は防御を改善するための転送学習スキームと競争に勝った技術の可能性を模索します。

AI-manipulated videos, commonly known as deepfakes, are an emerging problem. Recently, researchers in academia and industry have contributed several (self-created) benchmark deepfake datasets, and deepfake detection algorithms. However, little effort has gone towards understanding deepfake videos in the wild, leading to a limited understanding of the real-world applicability of research contributions in this space. Even if detection schemes are shown to perform well on existing datasets, it is unclear how well the methods generalize to real-world deepfakes. To bridge this gap in knowledge, we make the following contributions: First, we collect and present the largest dataset of deepfake videos in the wild, containing 1,869 videos from YouTube and Bilibili, and extract over 4.8M frames of content. Second, we present a comprehensive analysis of the growth patterns, popularity, creators, manipulation strategies, and production methods of deepfake content in the real-world. Third, we systematically evaluate existing defenses using our new dataset, and observe that they are not ready for deployment in the real-world. Fourth, we explore the potential for transfer learning schemes and competition-winning techniques to improve defenses.

翻訳日:2021-03-09 15:20:49 公開日:2021-03-07

# ERASOR:静的3次元クラウドマップ構築のための擬似占有率に基づく動的物体除去

ERASOR: Egocentric Ratio of Pseudo Occupancy-based Dynamic Object Removal for Static 3D Point Cloud Map Building ( http://arxiv.org/abs/2103.04316v1 )

ライセンス: Link先を確認

Hyungtae Lim, Sungwon Hwang, and Hyun Myung

(参考訳) 都市環境のスキャンデータには、車両や歩行者などの動的物体の表現が含まれることが多い。しかし、スキャンデータの連続的な蓄積を持つ3Dポイントクラウドマップの構築に関しては、動的オブジェクトはしばしば地図に不要なトレースを残します。これらの動的オブジェクトのトレースは障害として機能し、モバイル車両が良好なローカリゼーションおよびナビゲーション性能を達成するのを妨げます。そこで本研究では,pSeudo Occupancyをベースとした動的物体除去手法であるERASOR(Egocentric RAtio of pSeudo Occupancy-based dynamic object removal)を提案する。私たちのアプローチは、必然的に地面と接触している都市環境における最もダイナミックなオブジェクトの性質にその注意を向けます。そこで我々は,単位空間の占有を表現し,異なる占有の空間を識別する,擬似占有という新しい概念を提案する。最後に、R-GPF(Regional-wise Ground Plane Fitting)が採用され、動的点を含む可能性のある候補ビン内の動的点から静的点を区別する。 SemanticKITTIで実験的に検証されたこの手法は、既存のレイトレースベースおよび可視性ベースのメソッドの限界を克服する最先端の手法に対して有望なパフォーマンスをもたらす。

Scan data of urban environments often include representations of dynamic objects, such as vehicles, pedestrians, and so forth. However, when it comes to constructing a 3D point cloud map with sequential accumulations of the scan data, the dynamic objects often leave unwanted traces in the map. These traces of dynamic objects act as obstacles and thus impede mobile vehicles from achieving good localization and navigation performances. To tackle the problem, this paper presents a novel static map building method called ERASOR, Egocentric RAtio of pSeudo Occupancy-based dynamic object Removal, which is fast and robust to motion ambiguity. Our approach directs its attention to the nature of most dynamic objects in urban environments being inevitably in contact with the ground. Accordingly, we propose the novel concept called pseudo occupancy to express the occupancy of unit space and then discriminate spaces of varying occupancy. Finally, Region-wise Ground Plane Fitting (R-GPF) is adopted to distinguish static points from dynamic points within the candidate bins that potentially contain dynamic points. As experimentally verified on SemanticKITTI, our proposed method yields promising performance against state-of-the-art methods overcoming the limitations of existing ray tracing-based and visibility-based methods.

翻訳日:2021-03-09 15:20:29 公開日:2021-03-07

# リアルタイム人間エージェントチームのための適応エージェントアーキテクチャ

Adaptive Agent Architecture for Real-time Human-Agent Teaming ( http://arxiv.org/abs/2103.04439v1 )

ライセンス: Link先を確認

Tianwei Ni, Huao Li, Siddharth Agrawal, Suhas Raja, Fan Jia, Yikang Gui, Dana Hughes, Michael Lewis, Katia Sycara

(参考訳) チームワークは、共通の目的を促進するチームメンバの相互関係的な推論、行動、行動のセットです。チームワーク理論と実験は、人間とエージェントエージェントの両方のチームの有効性のための一連の状態とプロセスをもたらしました。しかし、人間とエージェントのチーム化は、非常に新しいものであり、人間のチームには存在しない方針や意図の非対称性が伴うため、あまり研究されていない。人間エージェントチームにおけるチームパフォーマンスを最適化するには、エージェントが人間の意図を推測し、警察を円滑な調整に適応させることが重要です。ほとんどの文献は、学習された人間のモデルを参照するエージェントを構築している。これらのエージェントは学習されたモデルでうまく機能することが保証されているが、最適性や一貫性といった人間のポリシーに重きを置いている。本稿では,TSF(Team Space Fortress)と呼ばれる2人プレイヤ協調ゲームにおいて,人間モデルフリー設定における新しい適応エージェントアーキテクチャを提案する。これまでの人間と人間のチームの研究では、tsfゲームにおける相補的なポリシーと、プレイヤーのスキルの多様性が示されている。したがって、私たちは人間のデータから人間モデルの学習を破棄し、RLアルゴリズムまたはルールベースの方法で構成された事前訓練された例ポリシーライブラリの適応戦略を使用して、人間の行動を最小に仮定します。適応戦略は、人間のポリシーを推論するための新しい類似度メトリクスに依存し、チームのパフォーマンスを最大化するために、我々のライブラリで最も補完的なポリシーを選択します。アダプティブエージェントアーキテクチャはリアルタイムでデプロイでき、任意のオフセットの静的エージェントに一般化できる。提案する適応エージェントフレームワークを評価するために,人間エージェント実験を実施し,人間エージェントチームにおけるヒューマンポリシーの最適性,多様性,適応性について検証した。

Teamwork is a set of interrelated reasoning, actions and behaviors of team members that facilitate common objectives. Teamwork theory and experiments have resulted in a set of states and processes for team effectiveness in both human-human and agent-agent teams. However, human-agent teaming is less well studied because it is so new and involves asymmetry in policy and intent not present in human teams. To optimize team performance in human-agent teaming, it is critical that agents infer human intent and adapt their polices for smooth coordination. Most literature in human-agent teaming builds agents referencing a learned human model. Though these agents are guaranteed to perform well with the learned model, they lay heavy assumptions on human policy such as optimality and consistency, which is unlikely in many real-world scenarios. In this paper, we propose a novel adaptive agent architecture in human-model-free setting on a two-player cooperative game, namely Team Space Fortress (TSF). Previous human-human team research have shown complementary policies in TSF game and diversity in human players' skill, which encourages us to relax the assumptions on human policy. Therefore, we discard learning human models from human data, and instead use an adaptation strategy on a pre-trained library of exemplar policies composed of RL algorithms or rule-based methods with minimal assumptions of human behavior. The adaptation strategy relies on a novel similarity metric to infer human policy and then selects the most complementary policy in our library to maximize the team performance. The adaptive agent architecture can be deployed in real-time and generalize to any off-the-shelf static agents. We conducted human-agent experiments to evaluate the proposed adaptive agent framework, and demonstrated the suboptimality, diversity, and adaptability of human policies in human-agent teams.

翻訳日:2021-03-09 15:16:42 公開日:2021-03-07

# T-Miner: DNNテキスト分類におけるトロイの木馬攻撃対策のためのジェネレーティブアプローチ

T-Miner: A Generative Approach to Defend Against Trojan Attacks on DNN-based Text Classification ( http://arxiv.org/abs/2103.04264v1 )

ライセンス: Link先を確認

Ahmadreza Azizi, Ibrahim Asadullah Tahmid, Asim Waheed, Neal Mangaokar, Jiameng Pu, Mobin Javed, Chandan K. Reddy, Bimal Viswanath

(参考訳) ディープニューラルネットワーク(dnn)分類器はトロイの木馬やバックドア攻撃に対して脆弱であることが知られており、分類器は攻撃者によって決定されたトロイの木馬トリガーを含む入力を誤分類するように操作される。バックドアはモデルの整合性を損なうため、DNNベースの分類の状況に深刻な脅威をもたらす。このような攻撃に対する複数の防御は画像ドメインの分類器に対して存在するが、テキストドメインの分類器を保護する努力は限られている。我々は、DNNベースのテキスト分類器に対するトロイの木馬攻撃のための防御フレームワークであるTrojan-Miner(T-Miner)を紹介する。 T-Minerはシークエンス・ツー・シークエンス(seq-2-seq)生成モデルを用いて、疑わしい分類器を探索し、トロイの木馬トリガーを含む可能性が高いテキストシーケンスを生成する。 T-Minerは、生成モデルによって生成されたテキストを分析し、トリガーフレーズを含むかどうかを決定し、テストされた分類器にバックドアがあるかどうかを判断します。 T-Minerは、不審な分類器のトレーニングデータセットやクリーンな入力へのアクセスを必要とせず、代わりに合成された「非意味」テキスト入力を使用して生成モデルをトレーニングする。 3つのユビキタスDNNモデルアーキテクチャ、5つの分類タスク、さまざまなトリガーフレーズからなる1100モデルインスタンスのT-Minerを幅広く評価します。 T-Minerがトロイの木馬とクリーンモデルを98.75%の全体的な精度で検出し、クリーンモデルの偽陽性を低く抑えることを示した。また、T-Minerはアダプティブアタッカーからの様々な標的の高度な攻撃に対して堅牢であることも示しています。

Deep Neural Network (DNN) classifiers are known to be vulnerable to Trojan or backdoor attacks, where the classifier is manipulated such that it misclassifies any input containing an attacker-determined Trojan trigger. Backdoors compromise a model's integrity, thereby posing a severe threat to the landscape of DNN-based classification. While multiple defenses against such attacks exist for classifiers in the image domain, there have been limited efforts to protect classifiers in the text domain. We present Trojan-Miner (T-Miner) -- a defense framework for Trojan attacks on DNN-based text classifiers. T-Miner employs a sequence-to-sequence (seq-2-seq) generative model that probes the suspicious classifier and learns to produce text sequences that are likely to contain the Trojan trigger. T-Miner then analyzes the text produced by the generative model to determine if they contain trigger phrases, and correspondingly, whether the tested classifier has a backdoor. T-Miner requires no access to the training dataset or clean inputs of the suspicious classifier, and instead uses synthetically crafted "nonsensical" text inputs to train the generative model. We extensively evaluate T-Miner on 1100 model instances spanning 3 ubiquitous DNN model architectures, 5 different classification tasks, and a variety of trigger phrases. We show that T-Miner detects Trojan and clean models with a 98.75% overall accuracy, while achieving low false positives on clean models. We also show that T-Miner is robust against a variety of targeted, advanced attacks from an adaptive attacker.

翻訳日:2021-03-09 15:14:53 公開日:2021-03-07

# 複数のディープラーニングモデルの比較テストを促進するための識別測定

Measuring Discrimination to Boost Comparative Testing for Multiple Deep Learning Models ( http://arxiv.org/abs/2103.04333v1 )

ライセンス: Link先を確認

Linghan Meng, Yanhui Li, Lin Chen, Zhi Wang, Di Wu, Yuming Zhou, Baowen Xu

(参考訳) DL技術のブームは巨大なDLモデルの構築と共有をもたらし、DLモデルの取得と再利用を促進する。与えられたタスクに対して、同じ機能で利用可能な複数のDLモデルに遭遇する。テスターは複数のDLモデルを比較し、より適したものを選択することが期待される。テストのコンテキスト全体。分類の努力の限界のために、テスターはこれらのモデルのためにできるだけ正確なランクの推定をするサンプルの有効なサブセットを選ぶことを目標にします。この問題に対処するために,複数のモデルを識別可能な効率的なサンプルを選択するために,サンプル識別に基づく選択(SDS)を提案する。 SDSを評価するために,広範に利用されている3つの画像データセットと80個の実世界DLモデルを用いて広範な実験研究を行った。実験の結果,SDSは最先端のベースライン法と比較して,複数のDLモデルのランク付けに有効で効率的なサンプル選択法であることがわかった。

The boom of DL technology leads to massive DL models built and shared, which facilitates the acquisition and reuse of DL models. For a given task, we encounter multiple DL models available with the same functionality, which are considered as candidates to achieve this task. Testers are expected to compare multiple DL models and select the more suitable ones w.r.t. the whole testing context. Due to the limitation of labeling effort, testers aim to select an efficient subset of samples to make an as precise rank estimation as possible for these models. To tackle this problem, we propose Sample Discrimination based Selection (SDS) to select efficient samples that could discriminate multiple models, i.e., the prediction behaviors (right/wrong) of these samples would be helpful to indicate the trend of model performance. To evaluate SDS, we conduct an extensive empirical study with three widely-used image datasets and 80 real world DL models. The experimental results show that, compared with state-of-the-art baseline methods, SDS is an effective and efficient sample selection method to rank multiple DL models.

翻訳日:2021-03-09 15:14:19 公開日:2021-03-07

# ネットワーク表現学習:伝統的な特徴学習から深層学習へ

Network Representation Learning: From Traditional Feature Learning to Deep Learning ( http://arxiv.org/abs/2103.04339v1 )

ライセンス: Link先を確認

Ke Sun, Lei Wang, Bo Xu, Wenhong Zhao, Shyh Wei Teng, Feng Xia

(参考訳) ネットワーク表現学習(NRL)は,グラフデータの隠れた特徴を深く理解するための効果的なグラフ解析手法である。ソーシャルネットワークデータ処理、生物学的情報処理、レコメンダシステムなど、ネットワーク科学に関連する多くの実世界のタスクにうまく応用されている。ディープラーニングはデータ機能を学ぶための強力なツールです。しかし、空間情報を持つ画像や時間情報を持つ音といった正規データとは異なるため、グラフ構造データへのディープラーニングの一般化は非自明である。近年, NRL領域で多くの深層学習手法が提案されている。本研究では,従来の特徴学習手法から深層学習モデルまでの古典的なNRLを調査し,それらの関係を分析し,最新の進歩をまとめる。最後に、NRLを考慮したオープンな問題について議論し、この分野の今後の方向性を指摘する。

Network representation learning (NRL) is an effective graph analytics technique and promotes users to deeply understand the hidden characteristics of graph data. It has been successfully applied in many real-world tasks related to network science, such as social network data processing, biological information processing, and recommender systems. Deep Learning is a powerful tool to learn data features. However, it is non-trivial to generalize deep learning to graph-structured data since it is different from the regular data such as pictures having spatial information and sounds having temporal information. Recently, researchers proposed many deep learning-based methods in the area of NRL. In this survey, we investigate classical NRL from traditional feature learning method to the deep learning-based model, analyze relationships between them, and summarize the latest progress. Finally, we discuss open issues considering NRL and point out the future directions in this field.

翻訳日:2021-03-09 15:14:01 公開日:2021-03-07

# グラフ力学習

Graph Force Learning ( http://arxiv.org/abs/2103.04344v1 )

ライセンス: Link先を確認

Ke Sun, Jiaying Liu, Shuo Yu, Bo Xu, Feng Xia

(参考訳) 機能表現は、ネットワーク分析タスクの大きな力を活用します。しかし、ほとんどの機能は離散的であり、効果的に利用するための大きな課題をもたらす。近年,離散的な特徴を連続空間にマップできるネットワーク機能学習に注目が集まっている。残念なことに、現在の研究では、トレーニング中にランダムな負のサンプリング戦略によって特徴空間の構造情報が完全に保存されない。この問題に対処するために,機能学習の課題と新規性について検討し,春電モデルに着想を得た力に基づくグラフ学習モデルGForceを提案する。 GForceは、ノードが魅力的な力と反発力を持っていると仮定し、特徴学習における元の構造情報と同じ表現をもたらす。ベンチマークデータセットに関する総合的な実験は、提案フレームワークの有効性を実証する。さらにgforceは、グラフ学習のためのノードインタラクションのモデル化に物理モデルを使用する機会を開放する。

Features representation leverages the great power in network analysis tasks. However, most features are discrete which poses tremendous challenges to effective use. Recently, increasing attention has been paid on network feature learning, which could map discrete features to continued space. Unfortunately, current studies fail to fully preserve the structural information in the feature space due to random negative sampling strategy during training. To tackle this problem, we study the problem of feature learning and novelty propose a force-based graph learning model named GForce inspired by the spring-electrical model. GForce assumes that nodes are in attractive forces and repulsive forces, thus leading to the same representation with the original structural information in feature learning. Comprehensive experiments on benchmark datasets demonstrate the effectiveness of the proposed framework. Furthermore, GForce opens up opportunities to use physics models to model node interaction for graph learning.

翻訳日:2021-03-09 15:13:49 公開日:2021-03-07

# Bio-JOIE:生物知識基盤の共同表現学習

Bio-JOIE: Joint Representation Learning of Biological Knowledge Bases ( http://arxiv.org/abs/2103.04283v1 )

ライセンス: Link先を確認

Junheng Hao, Chelsea Ju, Muhao Chen, Yizhou Sun, Carlo Zaniolo, Wei Wang

(参考訳) コロナウイルスの流行は世界的なパンデミックを引き起こし、死亡率は高い。現在、このウイルスに関するさまざまな研究から得られた知識は非常に限られている。他の近縁種の遺伝子オントロジーやタンパク質-タンパク質相互作用(PPI)ネットワークなどの幅広い生物学的知識を活用して、新しい種の分子影響を推定する重要なアプローチを提示します。本稿では,遺伝子オントロジーとppiネットワークの知識を捉え,sars-cov-2-ヒトタンパク質相互作用のモデル化における超能力を示す,トランスファーマルチリレーショナル組込みモデルbio-joieを提案する。 Bio-JOIEは2つのモデルコンポーネントを共同でトレーニングする。知識モデルは、GO用語に用いられる階層認識符号化技術を用いて、タンパク質とGOドメインの関連事実を分離埋め込み空間にエンコードする。さらに、トランスファーモデルは、PPIの知識と遺伝子オントロジーアノテーションを埋め込み空間全体に転送するための非線形変換を学習する。構造化知識のみを活用することにより、Bio-JOIEはPPI型予測において、既存の最先端の手法よりも著しく優れている。さらに,酵素活性を有するタンパク質群における学習表現を酵素系に活用する可能性を実証した。最後に,Bio-JOIEはSARS-CoV-2タンパク質とヒトタンパク質のPPIを正確に同定し,本疾患の研究を進める上で貴重な知見を提供する。

The widespread of Coronavirus has led to a worldwide pandemic with a high mortality rate. Currently, the knowledge accumulated from different studies about this virus is very limited. Leveraging a wide-range of biological knowledge, such as gene ontology and protein-protein interaction (PPI) networks from other closely related species presents a vital approach to infer the molecular impact of a new species. In this paper, we propose the transferred multi-relational embedding model Bio-JOIE to capture the knowledge of gene ontology and PPI networks, which demonstrates superb capability in modeling the SARS-CoV-2-human protein interactions. Bio-JOIE jointly trains two model components. The knowledge model encodes the relational facts from the protein and GO domains into separated embedding spaces, using a hierarchy-aware encoding technique employed for the GO terms. On top of that, the transfer model learns a non-linear transformation to transfer the knowledge of PPIs and gene ontology annotations across their embedding spaces. By leveraging only structured knowledge, Bio-JOIE significantly outperforms existing state-of-the-art methods in PPI type prediction on multiple species. Furthermore, we also demonstrate the potential of leveraging the learned representations on clustering proteins with enzymatic function into enzyme commission families. Finally, we show that Bio-JOIE can accurately identify PPIs between the SARS-CoV-2 proteins and human proteins, providing valuable insights for advancing research on this new disease.

翻訳日:2021-03-09 15:07:51 公開日:2021-03-07

# 非線形システムの適応制御指向メタ学習

Adaptive-Control-Oriented Meta-Learning for Nonlinear Systems ( http://arxiv.org/abs/2103.04490v1 )

ライセンス: Link先を確認

Spencer M. Richards, Navid Azizan, Jean-Jacques E. Slotine, and Marco Pavone

(参考訳) リアルタイム適応は、複雑な動的環境で動作するロボットの制御に不可欠である。適応制御則は、不確定なダイナミクス項が既知の非線形特徴で線形にパラメータ化可能であれば、軌道追従性能の良好な非線形システムでさえも付与することができる。しかし、ロータークラフトの空力障害やマニピュレータアームと様々な物体との相互作用力など、先駆的な特徴を特定することはしばしば困難である。本稿では、ニューラルネットワークを用いたデータ駆動モデルを用いて、過去のデータからオフラインで学習し、これらの非線形特徴を内部パラメトリックモデルで適応制御する。私たちの重要な洞察は、入出力データに適合する機能の回帰指向メタ学習よりも、クローズドループシミュレーションにおける機能の制御指向メタラーニングによるデプロイメントのためのコントローラを準備できるということです。具体的には,アダプティブコントローラをメタ学習し,クローズドループ追跡シミュレーションをベースラーナーとし,平均トラッキング誤差をメタ対象とする。風を受ける非線形平面ロータークラフトを用いて,軌道追従制御のためにクローズドループに配置した場合,適応型コントローラが回帰指向メタラーニングにより訓練された他のコントローラよりも優れることを示す。

Real-time adaptation is imperative to the control of robots operating in complex, dynamic environments. Adaptive control laws can endow even nonlinear systems with good trajectory tracking performance, provided that any uncertain dynamics terms are linearly parameterizable with known nonlinear features. However, it is often difficult to specify such features a priori, such as for aerodynamic disturbances on rotorcraft or interaction forces between a manipulator arm and various objects. In this paper, we turn to data-driven modeling with neural networks to learn, offline from past data, an adaptive controller with an internal parametric model of these nonlinear features. Our key insight is that we can better prepare the controller for deployment with control-oriented meta-learning of features in closed-loop simulation, rather than regression-oriented meta-learning of features to fit input-output data. Specifically, we meta-learn the adaptive controller with closed-loop tracking simulation as the base-learner and the average tracking error as the meta-objective. With a nonlinear planar rotorcraft subject to wind, we demonstrate that our adaptive controller outperforms other controllers trained with regression-oriented meta-learning when deployed in closed-loop for trajectory tracking control.

翻訳日:2021-03-09 15:07:27 公開日:2021-03-07

PDF登録状況（公開日: 20210307）