Fugu-MT: Translations of arXiv Papers

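This site provides Japanese translations of arXiv papers that are 30 pages or fewer and released under a Creative Commons license (CC 0, CC BY, CC BY-SA). For papers whose body text is not CC-licensed, or that are too long, only the metadata is translated. (arXiv metadata is CC 0.) The translations are licensed under CC BY-SA 4.0. Translation is performed with Fugu-Machine Translator.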

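The operator of this site does not guarantee the quality of this site (including all information and translations) and accepts no responsibility for any outcome arising from the use of this site (including all information and translations).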

Papers with a publication date of 2020-08-09.

Title	Authors	Abstract	Publication date / translation date
# Witnessing non-classicality beyond quantum theory ( http://arxiv.org/abs/2003.07974v3 ) License: see the linked page	Chiara Marletto, Vlatko Vedral	We propose a general argument to show that if a physical system can locally mediate the generation of entanglement between two quantum systems, then it must itself be non-classical. Remarkably, we do not assume any classical or quantum formalism to describe the mediating physical system: our result follows from general information-theoretic principles, drawn from the recently proposed constructor theory of information. This argument provides the indispensable theoretical basis for recently proposed tests of non-classicality in gravity, based on witnessing gravitationally-induced entanglement in quantum probes.	Translated: 2023-05-28 22:11:16 Published: 2020-08-09
# Vector Properties of Entanglement in a Three-Qubit System ( http://arxiv.org/abs/2003.14390v2 ) License: see the linked page	Dmitry B. Uskov, Paul M. Alsing	We suggest a dynamical vector model of entanglement in a three-qubit system based on the isomorphism between the $su(4)$ and $so(6)$ Lie algebras. Generalizing the Plücker-type description of three-qubit local invariants, we introduce three pairs of real-valued $3D$ vectors (denoted here as $A_{R,I}$, $B_{R,I}$ and $C_{R,I}$). The magnitudes of these vectors determine the two- and three-qubit entanglement parameters of the system. We show that the evolution of vectors $A$, $B$, $C$ under local $SU(2)$ operations is identical to the $SO(3)$ evolution of the single-qubit Bloch vectors of qubits $a$, $b$ and $c$, respectively. At the same time, general two-qubit $su(4)$ Hamiltonians incorporating $a-b$, $a-c$ and $b-c$ two-qubit coupling terms generate $SO(6)$ coupling between vectors $A$ and $B$, $A$ and $C$, and $B$ and $C$, respectively. It turns out that the dynamics of entanglement induced by different two-qubit coupling terms is entirely determined by the mutual orientation of vectors $A$, $B$, $C$, which can be controlled by single-qubit transformations. We illustrate the power of this vector description of entanglement by solving quantum control problems involving transformations between $W$, Greenberger-Horne-Zeilinger ($GHZ$) and biseparable states.	Translated: 2023-05-27 07:32:03 Published: 2020-08-09
# Critical supercurrent and $\phi_0$ state for probing a persistent spin helix ( http://arxiv.org/abs/2004.14586v3 ) License: see the linked page	Mohammad Alidoust	We theoretically study the profile of a supercurrent in two-dimensional Josephson junctions with Rashba-Dresselhaus spin-orbit interaction (RDSOI) in the presence of a Zeeman field. By investigating the self-biased supercurrent (the so-called $\varphi_0$-Josephson state), we obtain explicit expressions for the functionality of the $\varphi_0$ state with respect to the RDSOI parameters ($\alpha,\beta$) and the in-plane Zeeman field components ($h_x,h_y$). Our findings reveal that, when the chemical potential ($\mu$) is high enough compared to the energy gap ($\Delta$) in the superconducting electrodes, i.e., $\mu \gg \Delta$, RSOI and DSOI with equal strengths ($|\alpha|=|\beta|$) cause a vanishing $\varphi_0$ state independent of magnetization and the type of RDSOI. A Zeeman field with unequal components, i.e., $|h_x|\neq |h_y|$, however, can counteract and nullify the destructive impact of equal-strength RDSOIs (for one type only), where $\mu\sim\Delta$, although $|h_x|= |h_y|$ can still eliminate the $\varphi_0$ state. Remarkably, in the $\mu\sim\Delta$ limit, the $\varphi_0$ state is proportional to the product of both components of the in-plane Zeeman field, i.e., $h_xh_y$, a term that is absent in the $\mu \gg \Delta$ limit. Furthermore, our results for critical supercurrents demonstrate that persistent spin helices can be revealed in a high enough chemical potential regime, $\mu\gg \Delta$, while the opposite regime, i.e., $\mu\sim\Delta$, introduces an adverse effect. In the ballistic regime, the "maximum" of the critical supercurrent occurs at $|\alpha|=|\beta|$ and the Zeeman field can boost this feature. The presence of disorder and nonmagnetic impurities changes this picture drastically, so that the "minimum" of the critical supercurrent occurs at and around the symmetry lines $|\alpha|=|\beta|$.	Translated: 2023-05-21 17:31:39 Published: 2020-08-09
# Structure-based Hamiltonian model for IsiA uncovers a highly robust pigment protein complex ( http://arxiv.org/abs/2006.00947v2 ) License: see the linked page	Hanan Schoffman, William M. Brown, Yossi Paltiel, Nir Keren and Erik M. Gauger	The iron stress-induced protein A (IsiA) is a source of interest and debate in biological research. The IsiA super-complex, binding over 200 chlorophylls, assembles in multimeric rings around photosystem I (PSI). Recently, the IsiA-PSI structure was resolved to 3.48 Å. Based on this structure, we created a model simulating a single excitation event in an IsiA monomer. This model enabled us to calculate the fluorescence and the localisation of the excitation in the IsiA structure. To further examine this system, noise was introduced to the model in two forms: thermal and positional. Introducing noise highlights the functional differences in the system between cryogenic temperatures and biologically relevant temperatures. Our results show that the energetics of the IsiA pigment-protein complex are very robust at room temperature. Nevertheless, shifts in the position of specific chlorophylls lead to large changes in their optical and fluorescence properties. Based on these results we discuss the implications of highly robust structures, with potential for serving different roles in a context-dependent manner, for our understanding of the function and evolution of photosynthetic processes.	Translated: 2023-05-17 11:28:16 Published: 2020-08-09
# Discrimination of Ohmic thermal baths by quantum dephasing probes ( http://arxiv.org/abs/2008.02526v2 ) License: see the linked page	Alessandro Candeloro, Matteo G. A. Paris	We address the discrimination of structured baths at different temperatures by dephasing quantum probes. We derive the exact reduced dynamics and evaluate the minimum error probability achievable by three different kinds of quantum probes, namely a qubit, a qutrit and a quantum register made of two qubits. Our results indicate that dephasing quantum probes are useful in discriminating low values of temperature, and that lower probabilities of error are achieved for intermediate values of the interaction time. A qutrit probe outperforms a qubit one in the discrimination task, whereas a register made of two qubits does not offer any advantage compared to two single qubits used sequentially.	Translated: 2023-05-07 00:07:52 Published: 2020-08-09
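The "minimum error probability" in the entry above is, for two equally likely hypotheses, commonly quantified by the Helstrom bound $P_e = \frac{1}{2}\left(1 - \frac{1}{2}\lVert\rho_1 - \rho_2\rVert_1\right)$. The following is only a minimal sketch of that formula, not the authors' calculation: it assumes a single-qubit probe prepared in $|+\rangle$ whose coherences are damped by a factor $e^{-\Gamma}$, and the function names and the values of $\Gamma$ are hypothetical placeholders.

```python
import numpy as np


def dephased_plus_state(gamma):
    """Density matrix of a qubit prepared in |+> after pure dephasing.

    `gamma` is a hypothetical decoherence exponent: the off-diagonal
    elements are damped by exp(-gamma), the populations are unchanged.
    """
    coherence = 0.5 * np.exp(-gamma)
    return np.array([[0.5, coherence], [coherence, 0.5]])


def helstrom_error(rho1, rho2):
    """Helstrom minimum error probability for two equiprobable states."""
    # The trace norm of a Hermitian matrix is the sum of |eigenvalues|.
    eigenvalues = np.linalg.eigvalsh(rho1 - rho2)
    trace_norm = np.sum(np.abs(eigenvalues))
    return 0.5 * (1.0 - 0.5 * trace_norm)


if __name__ == "__main__":
    # Two baths are assumed to imprint different dephasing exponents.
    rho_cold = dephased_plus_state(0.2)
    rho_hot = dephased_plus_state(0.8)
    print(helstrom_error(rho_cold, rho_hot))
```

Identical states give an error probability of 1/2 (pure guessing), and orthogonal states give 0, which is a quick sanity check on any implementation of the bound.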
# Geometric Quantum Information Structure in Quantum Fields and their Lattice Simulation ( http://arxiv.org/abs/2008.03647v1 ) License: see the linked page	Natalie Klco and Martin J. Savage	An upper limit to distillable entanglement between two disconnected regions of massless non-interacting scalar field theory has an exponential decay defined by a geometric decay constant. When regulated at short distances with a spatial lattice, this entanglement abruptly vanishes beyond a dimensionless separation, defining a negativity sphere. In two spatial dimensions, we determine this geometric decay constant between a pair of disks and the growth of the negativity sphere toward the continuum through a series of lattice calculations. Making the connection to quantum field theories in three spatial dimensions, and assuming such quantum information scales also appear in quantum chromodynamics (QCD), a new relative scale may be present in effective field theories describing the low-energy dynamics of nucleons and nuclei. We highlight potential impacts of the distillable entanglement structure on effective field theories, lattice QCD calculations and future quantum simulations.	Translated: 2023-05-06 18:05:54 Published: 2020-08-09
# Consumer UAV Cybersecurity Vulnerability Assessment Using Fuzzing Tests ( http://arxiv.org/abs/2008.03621v1 ) License: see the linked page	David Rudo and Dr. Kai Zeng	Unmanned Aerial Vehicles (UAVs) are remote-controlled vehicles capable of flight and are present in a variety of environments, from military operations to domestic enjoyment. These vehicles are great assets, but just as their pilot can control them remotely, cyberattacks can be executed in a similar manner. Cyberattacks on UAVs can bring a plethora of issues to physical and virtual systems. Such malfunctions are capable of giving an attacker the ability to steal data, incapacitate the UAV, or hijack the UAV. To mitigate such attacks, it is necessary to identify and patch vulnerabilities that may be maliciously exploited. In this paper, a new UAV vulnerability is explored with related UAV security practices identified for possible exploitation using large streams of data sent at specific ports. The more in-depth model involves strings of data containing FTP-specific keywords sent to the UAV's FTP port in the form of a fuzzing test, as well as launching thousands of packets at other ports on the UAV. During these tests, virtual and physical systems are monitored extensively to identify specific patterns and vulnerabilities. This model is applied to a Parrot Bebop 2, which accurately portrays a UAV whose network has been compromised by an attacker and is representative of many lower-end UAV models for domestic use. During testing, the Parrot Bebop 2 is monitored for degradation in GPS performance, video speed, the UAV's reactivity to the pilot, motor function, and the accuracy of the UAV's sensor data. All these points of monitoring give a comprehensive view of the UAV's reaction to each individual test. In this paper, countermeasures to combat the exploitation of this vulnerability will be discussed, as well as possible attacks that can branch from the fuzzing tests.	Translated: 2023-05-06 18:05:40 Published: 2020-08-09
# Approximate evolution for a hybrid system: An optomechanical Jaynes-Cummings model ( http://arxiv.org/abs/2008.03839v1 ) License: see the linked page	L. Medina-Dozal, I. Ramos-Prieto, J. Récamier	In this work we start from a phenomenological Hamiltonian built from two known systems: the Hamiltonian of a pumped optomechanical system and the Jaynes-Cummings (JC) Hamiltonian. Using algebraic techniques we construct an approximate time evolution operator $\hat U_{opt}$ for the forced optomechanical system (as a product of exponentials) and take the JC Hamiltonian as an interaction. We transform the latter with $\hat U_{opt}$ to obtain a generalized interaction picture Hamiltonian which can be linearized and whose time evolution operator is written in a product form. The analytic results are compared with purely numerical calculations using the full Hamiltonian and the agreement between them is remarkable.	Translated: 2023-05-06 18:04:16 Published: 2020-08-09
# Anxiety Detection Leveraging Mobile Passive Sensing ( http://arxiv.org/abs/2008.03810v1 ) License: see the linked page	Lionel Levine, Migyeong Gwak, Kimmo Karkkainen, Shayan Fazeli, Bita Zadeh, Tara Peris, Alexander Young, Majid Sarrafzadeh	Anxiety disorders are the most common class of psychiatric problems affecting both children and adults. However, tools to effectively monitor and manage anxiety are lacking, and comparatively limited research has been applied to addressing the unique challenges around anxiety. Leveraging passive and unobtrusive data collection from smartphones could be a viable alternative to classical methods, allowing for real-time mental health surveillance and disease management. This paper presents eWellness, an experimental mobile application designed to track a full suite of sensor and user-log data from an individual's device in a continuous and passive manner. We report on an initial pilot study tracking ten people over the course of a month that showed a nearly 76% success rate at predicting daily anxiety and depression levels based solely on the passively monitored features.	Translated: 2023-05-06 18:03:59 Published: 2020-08-09
# Effects of Extended Uncertainty Principle on the Relativistic Coulomb Potential ( http://arxiv.org/abs/2008.03807v1 ) License: see the linked page	B. Hamil, M. Merad, T. Birkandan	The relativistic bound-state energy spectrum and the wavefunctions for the Coulomb potential are studied for de Sitter and anti-de Sitter spaces in the context of the extended uncertainty principle. The Klein-Gordon and Dirac equations are solved analytically to obtain the results. The electron energies of hydrogen-like atoms are studied numerically.	Translated: 2023-05-06 18:03:48 Published: 2020-08-09
# Tactics for Internal Compliance: A Literature Review ( http://arxiv.org/abs/2008.03775v1 ) License: see the linked page	Ralph Foorthuis	Compliance of organizations with internal and external norms is a highly relevant topic for both practitioners and academics nowadays. However, the substantive, elementary compliance tactics that organizations can use for achieving internal compliance have been described in a fragmented manner and in the literatures of distinct academic disciplines. Using a multidisciplinary structured literature review of 134 publications, this study offers three contributions. First, we present a typology of 45 compliance tactics, which constitutes a comprehensive and rich overview of elementary ways of bringing the organization into compliance. Secondly, we provide an overview of fundamental concepts in the theory of compliance, which forms the basis for the framework we developed for positioning compliance tactics and for analyzing or developing compliance strategies. Thirdly, we present insights for moving from compliance tactics to compliance strategies. In the process, and using the multidisciplinary literature review to take a bird's-eye view, we demonstrate that compliance strategies need to be regarded as a richer concept than perceived hitherto. We also show that opportunities for innovation exist.	Translated: 2023-05-06 18:03:15 Published: 2020-08-09
# Modification of quantum many-body relaxation by perturbations exhibiting a banded matrix structure ( http://arxiv.org/abs/2008.03745v1 ) License: see the linked page	Lennart Dabelow, Patrick Vorndamme, and Peter Reimann	We investigate how the observable relaxation behavior of an isolated quantum many-body system is modified in response to weak-to-moderate perturbations within a nonperturbative typicality framework. A key role is played by the so-called perturbation profile, which characterizes the dependence of the perturbation matrix elements in the eigenbasis of the unperturbed Hamiltonian on the difference of the corresponding energy eigenvalues. In particular, a banded matrix structure is quantitatively captured by a perturbation profile which approaches zero for large energy differences. The temporal modification of the relaxation is linked to the perturbation profile via a nonlinear integral equation, which admits approximate analytical solutions for sufficiently weak and strong perturbations, and for which we work out a numerical solution scheme in the general case. As an example, we consider a spin lattice model with a pronounced banded matrix structure, and we find very good agreement of the numerics with our analytical predictions without any free fit parameter.	Translated: 2023-05-06 18:02:51 Published: 2020-08-09
# Agricultural Knowledge Management Using Smart Voice Messaging Systems: Combination of Physical and Human Sensors ( http://arxiv.org/abs/2008.03711v1 ) License: see the linked page	Naoshi Uchihira and Masami Yoshida	The use of the Internet of Things (IoT) in agricultural knowledge management systems is one of the most promising approaches to increasing the efficiency of agriculture. However, the existing physical sensors in agriculture are limited in their ability to monitor various changes in the characteristics of crops and may be expensive for the average farmer. We propose a combination of physical and human sensors (the five human senses). By using their own eyes, ears, noses, tongues, and fingers, farmers could check the various changes in the characteristics and conditions (colors of leaves, diseases, pests, faulty or malfunctioning equipment) of their crops and equipment, verbally describe their observations, and capture the descriptions with audio recording devices, such as smartphones. The voice recordings could be transcribed into text by web servers. The data captured by the physical and human sensors (voice messages) are analyzed by data and text mining to create and improve agricultural knowledge. An agricultural knowledge management system using physical and human sensors encourages farmers to share and transfer knowledge for the purpose of improving the efficiency and productivity of agriculture. We applied one such agricultural knowledge management system (a smart voice messaging system) to a greenhouse vegetable farm in Hokkaido. A qualitative analysis of accumulated voice messages and an interview with the farmer demonstrated the effectiveness of this system. The contributions of this study include a new and practical approach to an "agricultural Internet of Everything (IoE)" and evidence of its effectiveness as a result of our trial experiment at a real vegetable farm.	Translated: 2023-05-06 18:02:35 Published: 2020-08-09
# On the exact discretization of Schrödinger equation ( http://arxiv.org/abs/2008.03698v1 ) License: see the linked page	Chih-Lung Chou	We show that the exact discrete analogue of the Schrödinger equation can be derived naturally from the Hamiltonian operator of a Schrödinger field theory by using the discrete Fourier transform that transforms the operator from the momentum representation into the position representation. The standard central difference equation that is often used as the discretized Schrödinger equation actually describes a different theory, since it is derived from a different Hamiltonian operator. The commutator relation between the position and momentum operators in discrete space is also derived and found to be different from the conventional commutator relation in continuous space. A comparison between the two discretization formulas is made by numerically studying the transmission probability for a wave packet passing through a square potential barrier in one-dimensional space. Both discretization formulas are shown to give sensible and accurate numerical results as compared to theoretical calculation, though the exact discretization formula takes more computation time. The average wave number $k_0$ of the incident wave packet must satisfy $|k_0\ell| < 0.35$, where $\ell$ is the lattice spacing in position space, in order to obtain an accurate numerical result with the standard central difference formula.	Translated: 2023-05-06 18:02:10 Published: 2020-08-09
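The contrast drawn in the entry above can be illustrated for the kinetic term alone. The sketch below is an assumption-laden illustration rather than the paper's derivation: it assumes a periodic lattice of `n` sites with spacing `ell`, works with the operator $k^2$ (units with $\hbar^2/2m = 1$), and all function names are hypothetical. The DFT-based operator reproduces $k^2$ exactly on lattice plane waves, while the central-difference stencil gives $(2 - 2\cos k\ell)/\ell^2$ instead.

```python
import numpy as np


def kinetic_exact(n, ell):
    """k^2 operator carried from momentum to position space by the DFT.

    Returns a dense n x n matrix (assumed periodic lattice, spacing ell).
    """
    k = 2.0 * np.pi * np.fft.fftfreq(n, d=ell)        # lattice wave numbers
    f = np.fft.fft(np.eye(n), axis=0) / np.sqrt(n)    # unitary DFT matrix
    return (f.conj().T * k**2) @ f                    # F^dagger diag(k^2) F


def kinetic_central(n, ell):
    """Standard three-point central-difference version of -d^2/dx^2."""
    identity = np.eye(n)
    shift_up = np.roll(identity, 1, axis=1)     # picks the neighbour x + ell
    shift_down = np.roll(identity, -1, axis=1)  # picks the neighbour x - ell
    return (2.0 * identity - shift_up - shift_down) / ell**2


def central_relative_error(k_ell):
    """Relative deviation of the central-difference eigenvalue from k^2."""
    return 1.0 - (2.0 - 2.0 * np.cos(k_ell)) / k_ell**2


if __name__ == "__main__":
    for k_ell in (0.1, 0.35, 1.0):
        print(k_ell, central_relative_error(k_ell))
```

For small $k\ell$ the deviation behaves like $(k\ell)^2/12$, so at $|k_0\ell| = 0.35$ the central-difference kinetic energy is off by roughly 1%. The exact operator is a dense matrix, whereas the central-difference one is tridiagonal apart from the periodic corner elements, which is one plausible reading of the extra computation time mentioned in the abstract.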
# Using social media to measure demographic responses to natural disaster: Insights from a large-scale Facebook survey following the 2019 Australia Bushfires ( http://arxiv.org/abs/2008.03665v1 ) License: see the linked page	Paige Maas and Zack Almquist and Eugenia Giraudy and JW Schneider	In this paper we explore a novel method for collecting survey data following a natural disaster and then combine this data with device-derived mobility information to explore demographic outcomes. Using social media as a survey platform for measuring demographic outcomes, especially those that are challenging or expensive to field for, is increasingly of interest to the demographic community. Recent work by Schneider and Harknett (2019) explores the use of Facebook targeted advertisements to collect data on low-income shift workers in the United States. Other work has addressed immigrant assimilation (Stewart et al., 2019), world fertility (Ribeiro et al., 2020), and world migration stocks (Zagheni et al., 2017). We build on this work by introducing a rapid-response survey of post-disaster demographic and economic outcomes fielded through the Facebook app itself. We use these survey responses to augment app-derived mobility data that comprises Facebook Displacement Maps to assess the validity of and drivers underlying those observed behavioral trends. This survey was deployed following the 2019 Australia bushfires to better understand how these events displaced residents. In doing so we are able to test a number of key hypotheses around displacement and demographics. In particular, we uncover several gender differences in key areas, including in displacement decision-making and timing, and in access to protective equipment such as smoke masks. We conclude with a brief discussion of research and policy implications.	Translated: 2023-05-06 18:01:49 Published: 2020-08-09
# An Unsupervised Generative Neural Approach for InSAR Phase Filtering and Coherence Estimation ( http://arxiv.org/abs/2001.09631v3 ) License: see the linked page	Subhayan Mukherjee, Aaron Zimmer, Xinyao Sun, Parwant Ghuman, Irene Cheng	Phase filtering and pixel quality (coherence) estimation are critical in producing Digital Elevation Models (DEMs) from Interferometric Synthetic Aperture Radar (InSAR) images, as they remove spatial inconsistencies (residues) and immensely improve the subsequent unwrapping. Large amounts of InSAR data facilitate Wide Area Monitoring (WAM) over geographical regions. Advances in parallel computing have accelerated Convolutional Neural Networks (CNNs), giving them advantages over human performance on visual pattern recognition, which makes CNNs a good choice for WAM. Nevertheless, this research is largely unexplored. We thus propose "GenInSAR", a CNN-based generative model for joint phase filtering and coherence estimation that directly learns the InSAR data distribution. GenInSAR's unsupervised training on satellite and simulated noisy InSAR images outperforms five other related methods in total residue reduction (over 16.5% better on average) with less over-smoothing and fewer artefacts around branch cuts. GenInSAR's phase and coherence root-mean-squared errors and phase cosine error show average improvements of 0.54, 0.07, and 0.05, respectively, compared to the related methods.	Translated: 2023-01-06 08:00:36 Published: 2020-08-09
# Causal Structure Discovery from Distributions Arising from Mixtures of DAGs ( http://arxiv.org/abs/2001.11940v2 ) License: see the linked page	Basil Saeed, Snigdha Panigrahi, Caroline Uhler	We consider distributions arising from a mixture of causal models, where each model is represented by a directed acyclic graph (DAG). We provide a graphical representation of such mixture distributions and prove that this representation encodes the conditional independence relations of the mixture distribution. We then consider the problem of structure learning based on samples from such distributions. Since the mixing variable is latent, we consider causal structure discovery algorithms such as FCI that can deal with latent variables. We show that such algorithms recover a "union" of the component DAGs and can identify variables whose conditional distribution varies across the component DAGs. We demonstrate our results on synthetic and real data, showing that the inferred graph identifies nodes that vary between the different mixture components. As an immediate application, we demonstrate how retrieval of this causal information can be used to cluster samples according to each mixture component.	Translated: 2023-01-05 05:52:26 Published: 2020-08-09
# Adversarial Color Enhancement: Generating Unrestricted Adversarial Images by Optimizing a Color Filter ( http://arxiv.org/abs/2002.01008v3 ) License: see the linked page	Zhengyu Zhao, Zhuoran Liu, Martha Larson	We introduce an approach that enhances images using a color filter in order to create adversarial effects, which fool neural networks into misclassification. Our approach, Adversarial Color Enhancement (ACE), generates unrestricted adversarial images by optimizing the color filter via gradient descent. The novelty of ACE is its incorporation of established practice for image enhancement in a transparent manner. Experimental results validate the white-box adversarial strength and black-box transferability of ACE. A range of examples demonstrates the perceptual quality of images that ACE produces. ACE makes an important contribution to recent work that moves beyond $L_p$ imperceptibility and focuses on unrestricted adversarial modifications that yield large perceptible perturbations but remain non-suspicious to the human eye. The future potential of filter-based adversaries is also explored in two directions: guiding ACE with common enhancement practices (e.g., Instagram filters) towards specific attractive image styles, and adapting ACE to image semantics. Code is available at https://github.com/ZhengyuZhao/ACE.	Translated: 2023-01-04 09:16:06 Published: 2020-08-09
# AOL: Adaptive Online Learning for Human Trajectory Prediction in Dynamic Video Scenes ( http://arxiv.org/abs/2002.06666v2 ) License: see the linked page	Manh Huynh, Gita Alaghband	We present a novel adaptive online learning (AOL) framework to predict human movement trajectories in dynamic video scenes. Our framework learns and adapts to changes in the scene environment and generates the best network weights for different scenarios. The framework can be applied to prediction models and improves their performance, as it dynamically adjusts when it encounters changes in the scene and can apply the best training weights for predicting the next locations. We demonstrate this by integrating our framework with two existing prediction models: LSTM [3] and Future Person Location (FPL) [1]. Furthermore, we analyze the number of network weights needed for optimal performance and show that we can achieve real-time performance with a fixed number of networks, using the least recently used (LRU) strategy for maintaining the most recently trained network weights. With extensive experiments, we show that our framework increases the prediction accuracies of LSTM and FPL by ~17% and 28% on average, and by up to ~50% for FPL in the worst case, while achieving real-time performance (20fps).	Translated: 2022-12-31 18:16:23 Published: 2020-08-09
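
The LRU idea mentioned in the AOL abstract can be illustrated with a minimal sketch. The abstract does not say how weights are keyed or stored, so the class name `LRUWeightStore`, the scenario keys, and the placeholder weight values below are all hypothetical; only the eviction policy (drop the least recently used entry once a fixed capacity is exceeded) comes from the text.

```python
from collections import OrderedDict


class LRUWeightStore:
    """Hypothetical fixed-capacity store that evicts the least recently used entry."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._store = OrderedDict()  # scenario key -> network weights

    def get(self, key):
        """Return the weights for key and mark them as most recently used."""
        if key not in self._store:
            return None
        self._store.move_to_end(key)
        return self._store[key]

    def put(self, key, weights):
        """Insert or update weights; evict the least recently used entry if full."""
        self._store[key] = weights
        self._store.move_to_end(key)
        if len(self._store) > self.capacity:
            self._store.popitem(last=False)  # oldest entry sits at the front

    def keys(self):
        """Keys ordered from least to most recently used."""
        return list(self._store.keys())


store = LRUWeightStore(capacity=2)
store.put("scene_a", [0.1, 0.2])   # placeholder weights, not real model parameters
store.put("scene_b", [0.3, 0.4])
store.get("scene_a")               # touching scene_a makes scene_b the eviction candidate
store.put("scene_c", [0.5, 0.6])   # capacity exceeded, scene_b is dropped
```

A fixed capacity bounds memory and lookup cost, which is presumably what makes the real-time claim feasible, although the abstract gives no implementation details.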
# From Paris to Berlin: Discovering Fashion Style Influences Around the World ( http://arxiv.org/abs/2004.01316v2 ) License: see the linked page	Ziad Al-Halah, Kristen Grauman	The evolution of clothing styles and their migration across the world is intriguing, yet difficult to describe quantitatively. We propose to discover and quantify fashion influences from everyday images of people wearing clothes. We introduce an approach that detects which cities influence which other cities in terms of propagating their styles. We then leverage the discovered influence patterns to inform a forecasting model that predicts the popularity of any given style at any given city into the future. Demonstrating our idea with GeoStyle, a large-scale dataset of 7.7M images covering 44 major world cities, we present the discovered influence relationships, revealing how cities exert and receive fashion influence for an array of 50 observed visual styles. Furthermore, the proposed forecasting model achieves state-of-the-art results for a challenging style forecasting task, showing the advantage of grounding visual style evolution both spatially and temporally.	Translated: 2022-12-17 04:53:50 Published: 2020-08-09
# TuiGAN: Learning Versatile Image-to-Image Translation with Two Unpaired Images ( http://arxiv.org/abs/2004.04634v2 ) License: see the linked page	Jianxin Lin, Yingxue Pang, Yingce Xia, Zhibo Chen, Jiebo Luo	An unsupervised image-to-image translation (UI2I) task deals with learning a mapping between two domains without paired images. While existing UI2I methods usually require numerous unpaired images from different domains for training, there are many scenarios where training data is quite limited. In this paper, we argue that even if each domain contains a single image, UI2I can still be achieved. To this end, we propose TuiGAN, a generative model that is trained on only two unpaired images and amounts to one-shot unsupervised learning. With TuiGAN, an image is translated in a coarse-to-fine manner where the generated image is gradually refined from global structures to local details. We conduct extensive experiments to verify that our versatile method can outperform strong baselines on a wide variety of UI2I tasks. Moreover, TuiGAN is capable of achieving comparable performance with the state-of-the-art UI2I models trained with sufficient data.	Translated: 2022-12-15 02:36:07 Published: 2020-08-09
# Network Medicine Framework for Identifying Drug Repurposing Opportunities for COVID-19 ( http://arxiv.org/abs/2004.07229v2 ) License: see the linked page	Deisy Morselli Gysi, Ítalo Do Valle, Marinka Zitnik, Asher Ameli, Xiao Gan, Onur Varol, Susan Dina Ghiassian, JJ Patten, Robert Davey, Joseph Loscalzo, Albert-László Barabási	The current pandemic has highlighted the need for methodologies that can quickly and reliably prioritize clinically approved compounds for their potential effectiveness for SARS-CoV-2 infections. In the past decade, network medicine has developed and validated multiple predictive algorithms for drug repurposing, exploiting the sub-cellular network-based relationship between a drug's targets and disease genes. Here, we deployed algorithms relying on artificial intelligence, network diffusion, and network proximity, tasking each of them to rank 6,340 drugs for their expected efficacy against SARS-CoV-2. To test the predictions, we used as ground truth 918 drugs that had been experimentally screened in VeroE6 cells, and the list of drugs under clinical trial, which captures the medical community's assessment of drugs with potential COVID-19 efficacy. We find that while most algorithms offer predictive power for these ground truth data, no single method offers consistently reliable outcomes across all datasets and metrics. This prompted us to develop a multimodal approach that fuses the predictions of all algorithms, showing that a consensus among the different predictive methods consistently exceeds the performance of the best individual pipelines. We find that 76 of the 77 drugs that successfully reduced viral infection do not bind the proteins targeted by SARS-CoV-2, indicating that these drugs rely on network-based actions that cannot be identified using docking-based strategies. These advances offer a methodological pathway to identify repurposable drugs for future pathogens and neglected diseases underserved by the costs and extended timeline of de novo drug development.	Translated: 2022-12-13 03:15:44 Published: 2020-08-09
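
To make the idea of fusing several drug rankings concrete, here is a small sketch. The abstract does not describe the aggregation rule the authors use, so mean-rank fusion below is a stand-in technique chosen for simplicity, and the function name `consensus_ranking` and the drug labels are hypothetical.

```python
def consensus_ranking(rankings):
    """Fuse several ranked lists by mean rank position (lower is better).

    rankings: list of lists, each ordering the same items from most to
    least promising. Mean-rank fusion is an illustrative assumption, not
    the method of the paper.
    """
    if not rankings:
        raise ValueError("at least one ranking is required")
    items = set(rankings[0])
    for ranking in rankings:
        if set(ranking) != items or len(ranking) != len(items):
            raise ValueError("every ranking must order the same items exactly once")
    mean_rank = {
        item: sum(ranking.index(item) for ranking in rankings) / len(rankings)
        for item in items
    }
    # Sort by mean rank; break ties alphabetically so the output is deterministic.
    return sorted(items, key=lambda item: (mean_rank[item], item))


# Three hypothetical pipelines ranking the same three placeholder drugs.
pipelines = [
    ["drug_A", "drug_B", "drug_C"],
    ["drug_B", "drug_A", "drug_C"],
    ["drug_A", "drug_C", "drug_B"],
]
fused = consensus_ranking(pipelines)
```

A consensus of this kind dampens the idiosyncratic errors of any single pipeline, which is consistent with the abstract's observation that the fused ranking outperforms the best individual one.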
# Curating a COVID-19 data repository and forecasting county-level death counts in the United States ( http://arxiv.org/abs/2005.07882v2 ) License: see the linked page	Nick Altieri, Rebecca L. Barter, James Duncan, Raaz Dwivedi, Karl Kumbier, Xiao Li, Robert Netzorg, Briton Park, Chandan Singh, Yan Shuo Tan, Tiffany Tang, Yu Wang, Chao Zhang, Bin Yu	As the COVID-19 outbreak evolves, accurate forecasting continues to play an extremely important role in informing policy decisions. In this paper, we present our continuous curation of a large data repository containing COVID-19 information from a range of sources. We use this data to develop predictions and corresponding prediction intervals for the short-term trajectory of COVID-19 cumulative death counts at the county level in the United States up to two weeks ahead. Using data from January 22 to June 20, 2020, we develop and combine multiple forecasts using ensembling techniques, resulting in an ensemble we refer to as Combined Linear and Exponential Predictors (CLEP). Our individual predictors include county-specific exponential and linear predictors, a shared exponential predictor that pools data together across counties, an expanded shared exponential predictor that uses data from neighboring counties, and a demographics-based shared exponential predictor. We use prediction errors from the past five days to assess the uncertainty of our death predictions, resulting in generally applicable prediction intervals, Maximum (absolute) Error Prediction Intervals (MEPI). MEPI achieves a coverage rate of more than 94% when averaged across counties for predicting cumulative recorded death counts two weeks in the future. Our forecasts are currently being used by the non-profit organization Response4Life to determine the medical supply need for individual hospitals and have directly contributed to the distribution of medical supplies across the country. We hope that our forecasts and data repository at https://covidseverity.com can help guide necessary county-specific decision-making and help counties prepare for their continued fight against COVID-19.	Translated: 2022-12-02 14:00:50 Published: 2020-08-09
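
The MEPI construction can be sketched as follows. The abstract states only that prediction errors from the past five days are used and that the interval is built from the maximum absolute error; normalizing the error as a relative error, scaling the new prediction by it, and clipping the lower bound at zero are assumptions made for this illustration. The function name `mepi_interval` and the numbers are hypothetical.

```python
def mepi_interval(past_predictions, past_observed, new_prediction, window=5):
    """Prediction interval from the maximum recent relative error.

    Assumed form: err_t = |observed_t / predicted_t - 1| over the last
    `window` days, interval = new_prediction * (1 -/+ max err_t).
    """
    if len(past_predictions) != len(past_observed):
        raise ValueError("past predictions and observations must align")
    recent = list(zip(past_predictions, past_observed))[-window:]
    if not recent:
        raise ValueError("at least one past day is required")
    if any(pred <= 0 for pred, _ in recent):
        raise ValueError("past predictions must be positive")
    max_err = max(abs(obs / pred - 1.0) for pred, obs in recent)
    lower = max(new_prediction * (1.0 - max_err), 0.0)  # counts cannot be negative
    upper = new_prediction * (1.0 + max_err)
    return lower, upper


# Made-up cumulative death counts for one county over five days.
predicted = [100, 110, 120, 130, 140]
observed = [105, 110, 114, 130, 147]
low, high = mepi_interval(predicted, observed, new_prediction=200.0)
```

With these made-up numbers the largest relative error over the window is 5%, so the interval is roughly 190 to 210 around the new prediction of 200.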
# Impact studies of nationwide measures COVID-19 anti-pandemic: compartmental model and machine learning ( http://arxiv.org/abs/2005.08395v2 ) License: see the linked page	Mouhamadou A.M.T. Balde, Coura Balde, Babacar M. Ndiaye	In this paper, we study the impact of nationwide anti-pandemic measures against COVID-19. We carry out two processes to analyze COVID-19 data in light of these measures. We associate the level of the nationwide measures with the values of the model parameters related to the contact rate. A parametric solve with respect to those parameters then shows different possible evolutions of the pandemic. Two machine learning tools are used to forecast the evolution of the pandemic. Finally, we compare the deterministic tool with the two machine learning tools.	Translated: 2022-12-02 06:02:10 Published: 2020-08-09
# Identify Speakers in Cocktail Parties with End-to-End Attention ( http://arxiv.org/abs/2005.11408v2 ) License: see the linked page	Junzhe Zhu, Mark Hasegawa-Johnson, Leda Sari	In scenarios where multiple speakers talk at the same time, it is important to be able to identify the talkers accurately. This paper presents an end-to-end system that integrates speech source extraction and speaker identification, and proposes a new way to jointly optimize these two parts by max-pooling the speaker predictions along the channel dimension. Residual attention permits us to learn spectrogram masks that are optimized for the purpose of speaker identification, while residual forward connections permit dilated convolution with a sufficiently large context window to guarantee correct streaming across syllable boundaries. End-to-end training results in a system that recognizes one speaker in a two-speaker broadcast speech mixture with 99.9% accuracy and both speakers with 93.9% accuracy, and that recognizes all speakers in three-speaker scenarios with 81.2% accuracy.	Translated: 2022-11-30 09:50:47 Published: 2020-08-09
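
The phrase "max-pooling the speaker predictions along the channel dimension" can be read as follows: each extracted channel yields a score per candidate speaker, and the per-speaker maximum over channels gives one mixture-level score. The sketch below shows only that pooling step in plain Python; the score layout (channels by speakers), the function name `pool_speaker_predictions`, and the numbers are assumptions, not details from the paper.

```python
def pool_speaker_predictions(channel_scores):
    """Max-pool per-channel speaker scores into one score per speaker.

    channel_scores: list of rows, one row per extracted channel, each row
    holding one score per candidate speaker (assumed layout).
    """
    if not channel_scores:
        raise ValueError("at least one channel is required")
    width = len(channel_scores[0])
    if any(len(row) != width for row in channel_scores):
        raise ValueError("every channel must score the same set of speakers")
    # zip(*rows) walks the speaker axis; max collapses the channel axis.
    return [max(column) for column in zip(*channel_scores)]


# Two channels, three candidate speakers (made-up scores).
scores = [
    [0.90, 0.10, 0.20],  # channel 0 mostly matches speaker 0
    [0.05, 0.80, 0.10],  # channel 1 mostly matches speaker 1
]
pooled = pool_speaker_predictions(scores)
# Indices of the two highest pooled scores, i.e. the speakers judged present.
top_two = sorted(range(len(pooled)), key=lambda i: pooled[i], reverse=True)[:2]
```

Because the maximum ignores which channel a speaker landed in, a loss on the pooled scores does not depend on the channel ordering, which is one plausible reason for this design.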
# Discriminative Feature Alignment: Improving Transferability of Unsupervised Domain Adaptation by Gaussian-guided Latent Alignment ( http://arxiv.org/abs/2006.12770v5 ) License: see the linked page	Jing Wang, Jiahong Chen, Jianzhe Lin, Leonid Sigal, Clarence W. de Silva	In this study, we focus on the unsupervised domain adaptation problem where an approximate inference model is to be learned from a labeled data domain and expected to generalize well to an unlabeled data domain. The success of unsupervised domain adaptation largely relies on the cross-domain feature alignment. Previous work has attempted to directly align latent features by the classifier-induced discrepancies. Nevertheless, a common feature space cannot always be learned via this direct feature alignment, especially when a large domain gap exists. To solve this problem, we introduce a Gaussian-guided latent alignment approach to align the latent feature distributions of the two domains under the guidance of the prior distribution. In such an indirect way, the distributions over the samples from the two domains will be constructed on a common feature space, i.e., the space of the prior, which promotes better feature alignment. To effectively align the target latent distribution with this prior distribution, we also propose a novel unpaired L1-distance by taking advantage of the formulation of the encoder-decoder. The extensive evaluations on nine benchmark datasets validate the superior knowledge transferability through outperforming state-of-the-art methods and the versatility of the proposed method by improving the existing work significantly.	Translated: 2022-11-17 23:09:25 Published: 2020-08-09
# Hands-off Model Integration in Spatial Index Structures ( http://arxiv.org/abs/2006.16411v2 ) License: see the linked page	Ali Hadian, Ankit Kumar, Thomas Heinis	Spatial indexes are crucial for the analysis of the increasing amounts of spatial data, for example generated through IoT applications. The plethora of indexes that has been developed in recent decades has primarily been optimised for disk. With increasing amounts of memory even on commodity machines, however, moving them to main memory is an option. Doing so opens up the opportunity to use additional optimizations that are only amenable to main memory. In this paper we thus explore the opportunity to use light-weight machine learning models to accelerate queries on spatial indexes. We do so by exploring the potential of using interpolation and similar techniques on the R-tree, arguably the most broadly used spatial index. As we show in our experimental analysis, the query execution time can be reduced by up to 60% while simultaneously shrinking the index's memory footprint by over 90%.	Translated: 2022-11-15 15:25:51 Published: 2020-08-09
# Learning-based Defect Recognition for Quasi-Periodic Microscope Images ( http://arxiv.org/abs/2007.01309v2 ) License: see the linked page	Nik Dennler, Antonio Foncubierta-Rodriguez, Titus Neupert, Marilyne Sousa	Controlling crystalline material defects is crucial, as they affect properties of the material that may be detrimental or beneficial for the final performance of a device. Defect analysis on the sub-nanometer scale is enabled by high-resolution (scanning) transmission electron microscopy [HR(S)TEM], where the identification of defects is currently carried out based on human expertise. However, the process is tedious, highly time consuming and, in some cases, yields ambiguous results. Here we propose a semi-supervised machine learning method that assists in the detection of lattice defects from atomic resolution microscope images. It involves a convolutional neural network that classifies image patches as defective or non-defective, a graph-based heuristic that chooses one non-defective patch as a model, and finally an automatically generated convolutional filter bank, which highlights symmetry breaking such as stacking faults, twin defects and grain boundaries. Additionally, we suggest a variance filter to segment amorphous regions and beam defects. The algorithm is tested on III-V/Si crystalline materials and successfully evaluated against different metrics, showing promising results even for extremely small training data sets. By combining the data-driven classification generality, robustness and speed of deep learning with the effectiveness of image filters in segmenting faulty symmetry arrangements, we provide a valuable open-source tool to the microscopist community that can streamline future HR(S)TEM analyses of crystalline materials.	Translated: 2022-11-14 14:46:42 Published: 2020-08-09
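
As a rough illustration of what a variance filter computes, the sketch below builds a local-variance map over a small grayscale image in plain Python. The abstract does not give the window size, the border handling, or how the map is thresholded, so the square window, the truncated borders, and the function name `local_variance` are assumptions for this example only.

```python
def local_variance(image, radius=1):
    """Variance of pixel values inside a square window around each pixel.

    image: list of equal-length rows of numbers. Windows are truncated at
    the image border (an assumption; other border rules are possible).
    """
    rows, cols = len(image), len(image[0])
    if any(len(row) != cols for row in image):
        raise ValueError("image rows must have equal length")
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [
                image[rr][cc]
                for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            mean = sum(window) / len(window)
            out[r][c] = sum((v - mean) ** 2 for v in window) / len(window)
    return out


# A flat patch has zero local variance; a checkerboard patch does not.
flat = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]
checker = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
flat_map = local_variance(flat)
checker_map = local_variance(checker)
```

Thresholding such a map is one simple way to separate regions whose local texture statistics differ; whether the authors segment amorphous regions exactly this way is not stated in the abstract.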
# Differentiable Manifold Reconstruction for Point Cloud Denoising ( http://arxiv.org/abs/2007.13551v2 ) License: see the linked page	Shitong Luo, Wei Hu	3D point clouds are often perturbed by noise due to the inherent limitations of acquisition equipment, which obstructs downstream tasks such as surface reconstruction and rendering. Previous works mostly infer the displacement of noisy points from the underlying surface, but they are not designed to recover the surface explicitly and may lead to sub-optimal denoising results. To this end, we propose to learn the underlying manifold of a noisy point cloud from differentiably subsampled points with trivial noise perturbation and their embedded neighborhood feature, aiming to capture intrinsic structures in point clouds. Specifically, we present an autoencoder-like neural network. The encoder learns both local and non-local feature representations of each point, and then samples points with low noise via an adaptive differentiable pooling operation. Afterwards, the decoder infers the underlying manifold by transforming each sampled point along with the embedded feature of its neighborhood to a local surface centered around the point. By resampling on the reconstructed manifold, we obtain a denoised point cloud. Further, we design an unsupervised training loss, so that our network can be trained in either an unsupervised or supervised fashion. Experiments show that our method significantly outperforms state-of-the-art denoising methods under both synthetic noise and real world noise. The code and data are available at https://github.com/luost26/DMRDenoise.	Translated: 2022-11-06 08:46:06 Published: 2020-08-09
# 3D Human Shape and Pose from a Single Low-Resolution Image with Self-Supervised Learning ( http://arxiv.org/abs/2007.13666v2 ) License: see the linked page	Xiangyu Xu, Hao Chen, Francesc Moreno-Noguer, Laszlo A. Jeni, Fernando De la Torre	3D human shape and pose estimation from monocular images has been an active area of research in computer vision, having a substantial impact on the development of new applications, from activity recognition to creating virtual avatars. Existing deep learning methods for 3D human shape and pose estimation rely on relatively high-resolution input images; however, high-resolution visual content is not always available in several practical scenarios such as video surveillance and sports broadcasting. Low-resolution images in real scenarios can vary in a wide range of sizes, and a model trained in one resolution does not typically degrade gracefully across resolutions. Two common approaches to solve the problem of low-resolution input are applying super-resolution techniques to the input images, which may result in visual artifacts, or simply training one model for each resolution, which is impractical in many realistic applications. To address the above issues, this paper proposes a novel algorithm called RSC-Net, which consists of a Resolution-aware network, a Self-supervision loss, and a Contrastive learning scheme. The proposed network is able to learn the 3D body shape and pose across different resolutions with a single model. The self-supervision loss encourages scale-consistency of the output, and the contrastive learning scheme enforces scale-consistency of the deep features. We show that both these new training losses provide robustness when learning 3D shape and pose in a weakly-supervised manner. Extensive experiments demonstrate that the RSC-Net can achieve consistently better results than the state-of-the-art methods for challenging low-resolution images.	Translated: 2022-11-06 08:19:06 Published: 2020-08-09
# Denoising individual bias for a fairer binary submatrix detection ( http://arxiv.org/abs/2007.15816v2 ) License: see the linked page	Changlin Wan, Wennan Chang, Tong Zhao, Sha Cao, Chi Zhang	Low rank representation of binary matrices is powerful in disentangling sparse individual-attribute associations, and has received wide applications. Existing binary matrix factorization (BMF) or co-clustering (CC) methods often assume i.i.d. background noise. However, this assumption could be easily violated in real data, where heterogeneous row- or column-wise probability of binary entries results in disparate element-wise background distribution, and paralyzes the rationality of existing methods. We propose a binary data denoising framework, namely BIND, which optimizes the detection of true patterns by estimating the row- or column-wise mixture distribution of patterns and disparate background, and eliminating the binary attributes that are more likely from the background. BIND is supported by thoroughly derived mathematical properties of the row- and column-wise mixture distributions. Our experiments on synthetic and real-world data demonstrate that BIND effectively removes background noise and drastically increases the fairness and accuracy of state-of-the-art BMF and CC methods.	Translated: 2022-11-04 05:53:36 Published: 2020-08-09
# Accurate Detection of Wake Word Start and End Using a CNN ( http://arxiv.org/abs/2008.03790v1 ) License: see the linked page	Christin Jose, Yuriy Mishchenko, Thibaud Senechal, Anish Shah, Alex Escott, Shiv Vitaladevuni	Small footprint embedded devices require keyword spotters (KWS) with small model size and detection latency for enabling voice assistants. Such a keyword is often referred to as a "wake word", as it is used to wake up voice assistant enabled devices. Together with wake word detection, accurate estimation of wake word endpoints (start and end) is an important task of KWS. In this paper, we propose two new methods for detecting the endpoints of wake words in neural KWS that use single-stage word-level neural networks. Our results show that the new techniques give superior accuracy for detecting wake words' endpoints of up to 50 msec standard error versus human annotations, on par with the conventional Acoustic Model plus HMM forced alignment. To our knowledge, this is the first study of wake word endpoint detection methods for single-stage neural KWS.	Translated: 2022-11-01 04:57:39 Published: 2020-08-09
# Speaker Conditional WaveRNN: Towards Universal Neural Vocoder for Unseen Speaker and Recording Conditions ( http://arxiv.org/abs/2008.05289v1 ) License: see the linked page	Dipjyoti Paul, Yannis Pantazis, Yannis Stylianou	Recent advancements in deep learning led to human-level performance in single-speaker speech synthesis. However, there are still limitations in terms of speech quality when generalizing those systems into multiple-speaker models, especially for unseen speakers and unseen recording qualities. For instance, conventional neural vocoders are adjusted to the training speaker and have poor generalization capabilities to unseen speakers. In this work, we propose a variant of WaveRNN, referred to as speaker conditional WaveRNN (SC-WaveRNN). We aim to develop an efficient universal vocoder even for unseen speakers and recording conditions. In contrast to standard WaveRNN, SC-WaveRNN exploits additional information given in the form of speaker embeddings. Using publicly-available data for training, SC-WaveRNN achieves significantly better performance over baseline WaveRNN on both subjective and objective metrics. In MOS, SC-WaveRNN achieves an improvement of about 23% for seen speaker and seen recording condition and up to 95% for unseen speaker and unseen condition. Finally, we extend our work by implementing a multi-speaker text-to-speech (TTS) synthesis similar to zero-shot speaker adaptation. In terms of performance, our system has been preferred over the baseline TTS system by 60% over 15.5% and by 60.9% over 32.6%, for seen and unseen speakers, respectively.	Translated: 2022-11-01 04:57:27 Published: 2020-08-09
# A Deep Learning Approach for COVID-19 Trend Prediction ( http://arxiv.org/abs/2008.05644v1 ) License: see the linked page	Tong Yang, Long Sha, Justin Li, Pengyu Hong	In this work, we developed a deep learning model-based approach to forecast the spreading trend of SARS-CoV-2 in the United States. We implemented the designed model using United States confirmed-case counts and state demographic data and achieved promising trend prediction results. The model incorporates demographic information and epidemic time-series data through a Gated Recurrent Unit structure. Finally, we identify the dominating demographic factors.	Translated: 2022-11-01 04:57:07 Published: 2020-08-09
# MODEL: Motif-based Deep Feature Learning for Link Prediction ( http://arxiv.org/abs/2008.03637v1 ) License: see the linked page	Lei Wang, Jing Ren, Bo Xu, Jianxin Li, Wei Luo, Feng Xia	Link prediction plays an important role in network analysis and applications. Recently, approaches for link prediction have evolved from traditional similarity-based algorithms into embedding-based algorithms. However, most existing approaches fail to exploit the fact that real-world networks are different from random networks. In particular, real-world networks are known to contain motifs, natural network building blocks reflecting the underlying network-generating processes. In this paper, we propose a novel embedding algorithm that incorporates network motifs to capture higher-order structures in the network. To evaluate its effectiveness for link prediction, experiments were conducted on three types of networks: social networks, biological networks, and academic networks. The results demonstrate that our algorithm outperforms both the traditional similarity-based algorithms by 20% and the state-of-the-art embedding-based algorithms by 19%.	Translated: 2022-11-01 04:53:55 Published: 2020-08-09
# ソーシャルネットワークにおける多変量関係集約学習 Multivariate Relations Aggregation Learning in Social Networks ( http://arxiv.org/abs/2008.03654v1 ) ライセンス: Link先を確認	Jin Xu, Shuo Yu, Ke Sun, Jing Ren, Ivan Lee, Shirui Pan, Feng Xia	(参考訳) 多変量関係は、生物ネットワーク、ソーシャルネットワーク、輸送ネットワーク、学術ネットワークなど、様々な種類のネットワークにおいて一般的である。三次閉鎖の原則とグループ形成の傾向により、ソーシャルネットワークにおける多変量関係は複雑で豊かなものである。したがって、ソーシャルネットワークのグラフ学習タスクでは、多変量関係情報の同定と活用がより重要である。既存のグラフ学習手法は近隣情報拡散機構に基づいており、これは多くの場合、部分的欠落や多変量関係情報の欠如を招き、最終的にタスクの正確性と実行効率に影響を及ぼす。これらの課題に対処するために,ネットワーク環境における多変量関係情報を効果的に把握できる多変量関係集約学習法(MORE)を提案する。 node属性と構造特徴を集約することで、より精度が高く、コンバージェンス速度が速くなる。 1つの引用ネットワークと5つのソーシャルネットワークで実験を行った。実験の結果,MOREモデルはノード分類タスクにおけるGCN(Graph Convolutional Network)モデルよりも精度が高く,時間コストを大幅に削減できることがわかった。 Multivariate relations are general in various types of networks, such as biological networks, social networks, transportation networks, and academic networks. Due to the principle of ternary closures and the trend of group formation, the multivariate relationships in social networks are complex and rich. Therefore, in graph learning tasks of social networks, the identification and utilization of multivariate relationship information are more important. Existing graph learning methods are based on the neighborhood information diffusion mechanism, which often leads to partial omission or even lack of multivariate relationship information, and ultimately affects the accuracy and execution efficiency of the task. To address these challenges, this paper proposes the multivariate relationship aggregation learning (MORE) method, which can effectively capture the multivariate relationship information in the network environment. By aggregating node attribute features and structural features, MORE achieves higher accuracy and faster convergence speed. We conducted experiments on one citation network and five social networks. 
The experimental results show that the MORE model has higher accuracy than the GCN (Graph Convolutional Network) model in node classification tasks, and can significantly reduce time cost.	翻訳日:2022-11-01 04:53:44 公開日:2020-08-09
# DINE: ディープ不完全なネットワーク埋め込みのためのフレームワーク DINE: A Framework for Deep Incomplete Network Embedding ( http://arxiv.org/abs/2008.06311v1 ) ライセンス: Link先を確認	Ke Hou, Jiaying Liu, Yin Peng, Bo Xu, Ivan Lee, Feng Xia	(参考訳) ネットワーク表現学習(NRL)は,ノード分類やリンク予測など,さまざまなタスクにおいて重要な役割を果たす。ネットワーク構造やノード属性に基づいて,ノードの低次元ベクトル表現を学習することを目的とする。完全ネットワークへの埋め込み技術は集中的に研究されてきたが、実世界のアプリケーションでは、完全ネットワークの収集は依然として難しい課題である。本稿では,このギャップを埋めるために,ディープ不完全ネットワーク埋め込み法,すなわちDINEを提案する。具体的には、期待最大化フレームワークを用いて、部分的に観測可能なネットワーク内のノードとエッジの両方を含む欠落部分を最初に完成する。組込み性能を向上させるため,ノード表現の学習にはネットワーク構造とノード属性の両方を考慮する。実験により,マルチラベル分類およびリンク予測タスクにおいて,DINEを3つのネットワーク上で評価する。その結果,最先端のベースラインと比較して提案手法の優位性を示した。 Network representation learning (NRL) plays a vital role in a variety of tasks such as node classification and link prediction. It aims to learn low-dimensional vector representations for nodes based on network structures or node attributes. While embedding techniques on complete networks have been intensively studied, in real-world applications, it is still a challenging task to collect complete networks. To bridge the gap, in this paper, we propose a Deep Incomplete Network Embedding method, namely DINE. Specifically, we first complete the missing part including both nodes and edges in a partially observable network by using the expectation-maximization framework. To improve the embedding performance, we consider both network structures and node attributes to learn node representations. Empirically, we evaluate DINE over three networks on multi-label classification and link prediction tasks. The results demonstrate the superiority of our proposed approach compared against state-of-the-art baselines.	翻訳日:2022-11-01 04:53:08 公開日:2020-08-09
# 野生における教師なし視線補正とアニメーションのためのデュアルインペインティングモデル Dual In-painting Model for Unsupervised Gaze Correction and Animation in the Wild ( http://arxiv.org/abs/2008.03834v1 ) ライセンス: Link先を確認	Jichao Zhang, Jingjing Chen, Hao Tang, Wei Wang, Yan Yan, Enver Sangineto, Nicu Sebe	(参考訳) 本稿では,野生における教師なし視線補正の問題に対処し,視線角と頭部姿勢の正確な注釈を必要としない解決法を提案する。私たちはCelebAGazeという新しいデータセットを作成しました。このデータセットは、目がカメラを見つめているか別の場所を見ているかで分かれた2つのドメイン X, Y で構成されています。本手法は,Gaze Correction Module (GCM), Gaze Animation Module (GAM), Pretrained Autoencoder Module (PAM)の3つの新しいモジュールから構成される。具体的には、GCMとGAMは、視線補正のためのドメイン$X$のデータと、視線アニメーションのためのドメイン$Y$のデータを使用して、デュアルインペインティングネットワークを別々に訓練する。また、GAMのトレーニングにおいて、眼領域から符号化された特徴と角度情報との相関を促し、潜在空間の補間によって視線アニメーションを実現するためのSynthesis-As-Training法を提案する。識別情報（目の形、虹彩の色など）をさらに保存するために,自己教師ありミラー学習に基づくオートエンコーダを用いたPAMを提案する。そのボトルネック特徴は角度不変であり,デュアルインペインティングモデルへの追加入力として機能する。広範な実験により,提案手法の有効性を検証し,本手法が最先端のベースラインよりも説得力のある結果を得る上での優位性を実証した。私たちのコード、事前訓練されたモデル、補足資料は、https://github.com/zhangqianhui/GazeAnimation で公開されています。 In this paper we address the problem of unsupervised gaze correction in the wild, presenting a solution that works without the need for precise annotations of the gaze angle and the head pose. We have created a new dataset called CelebAGaze, which consists of two domains X, Y, where the eyes are either staring at the camera or somewhere else. Our method consists of three novel modules: the Gaze Correction module (GCM), the Gaze Animation module (GAM), and the Pretrained Autoencoder module (PAM). Specifically, GCM and GAM separately train a dual in-painting network using data from the domain $X$ for gaze correction and data from the domain $Y$ for gaze animation. Additionally, a Synthesis-As-Training method is proposed when training GAM to encourage the features encoded from the eye region to be correlated with the angle information, resulting in a gaze animation which can be achieved by interpolation in the latent space. 
To further preserve the identity information (e.g., eye shape, iris color), we propose the PAM with an Autoencoder, which is based on Self-Supervised mirror learning where the bottleneck features are angle-invariant and which works as an extra input to the dual in-painting models. Extensive experiments validate the effectiveness of the proposed method for gaze correction and gaze animation in the wild and demonstrate the superiority of our approach in producing more compelling results than state-of-the-art baselines. Our code, the pretrained models and the supplementary material are available at: https://github.com/zhangqianhui/GazeAnimation.	翻訳日:2022-11-01 04:52:57 公開日:2020-08-09
# ネットワーク侵入検出のための多段階最適化機械学習フレームワーク Multi-Stage Optimized Machine Learning Framework for Network Intrusion Detection ( http://arxiv.org/abs/2008.03297v1 ) ライセンス: Link先を確認	MohammadNoor Injadat, Abdallah Moubayed, Ali Bou Nassif, Abdallah Shami	(参考訳) サイバーセキュリティは、インターネット上の個人や組織の依存度の増加と、オンライン活動のセキュリティとプライバシに関する懸念から、大きな注目を集めた。従来の機械学習(ML)ベースのネットワーク侵入検知システム(NIDS)は、悪意のあるオンライン行動から保護するために開発された。本稿では,その検出性能を維持しつつ計算複雑性を低減できる多段最適化mlベースのnidsフレームワークを提案する。本研究は,オーバーサンプリング手法がモデルのトレーニングサンプルサイズに与える影響を調査し,最小のトレーニングサンプルサイズを決定する。さらに、情報ゲインと相関に基づく2つの特徴選択手法を比較し、検出性能と時間複雑性への影響について検討する。さらに、NIDSの性能を高めるため、異なるMLハイパーパラメータ最適化手法について検討した。提案フレームワークの性能は、CICIDS 2017とUNSW-NB 2015データセットの2つの最近の侵入検知データセットを用いて評価される。実験の結果,提案モデルでは,必要なトレーニングサンプルサイズ (最大74%) と特徴セットサイズ (最大50%) を著しく削減できることがわかった。さらに、モデル性能はハイパーパラメータ最適化により向上し、両方のデータセットに対して99%以上の精度で検出精度が向上し、最近の文献の精度が1-2%、誤警報率が1-2%向上した。 Cyber-security garnered significant attention due to the increased dependency of individuals and organizations on the Internet and their concern about the security and privacy of their online activities. Several previous machine learning (ML)-based network intrusion detection systems (NIDSs) have been developed to protect against malicious online behavior. This paper proposes a novel multi-stage optimized ML-based NIDS framework that reduces computational complexity while maintaining its detection performance. This work studies the impact of oversampling techniques on the models' training sample size and determines the minimal suitable training sample size. Furthermore, it compares between two feature selection techniques, information gain and correlation-based, and explores their effect on detection performance and time complexity. Moreover, different ML hyper-parameter optimization techniques are investigated to enhance the NIDS's performance. The performance of the proposed framework is evaluated using two recent intrusion detection datasets, the CICIDS 2017 and the UNSW-NB 2015 datasets. 
Experimental results show that the proposed model significantly reduces the required training sample size (up to 74%) and feature set size (up to 50%). Moreover, the model performance is enhanced with hyper-parameter optimization with detection accuracies over 99% for both datasets, outperforming recent literature works by 1-2% higher accuracy and 1-2% lower false alarm rate.	翻訳日:2022-11-01 04:52:29 公開日:2020-08-09
# Big Networks: 調査 Big Networks: A Survey ( http://arxiv.org/abs/2008.03638v1 ) ライセンス: Link先を確認	Hayat Dino Bedru, Shuo Yu, Xinru Xiao, Da Zhang, Liangtian Wan, He Guo, Feng Xia	(参考訳) ネットワークは、ネットワークの構成要素間の相互作用のパターンが複雑である頂点とリンクの観点から複雑なシステムを表現する典型的な表現形式である。ネットワークは、時間とともに変化しない静的でもよいし、時間とともに進化する動的でもよい。ネットワーク解析の複雑さは,ネットワークサイズの爆発が増加する新しい状況下で異なる。本稿では,big networkという新たなネットワーク科学概念を提案する。大きなネットワークは通常、複雑で高次の内部構造を持つ大規模である。本稿では,大規模ネットワークの観点から,ネットワーク科学の分野における主要なトピックを考察するガイドラインフレームワークを提案する。まず,マイクロレベル,メソレベル,マクロレベルという3段階の大規模ネットワークの構造特性を紹介する。次に,大規模ネットワーク解析の最先端のトピックについて論じる。ランク付け手法や分割手法,ネットワーク埋め込みアルゴリズムなど,ネットワークモデルと関連するアプローチが体系的に導入されている。ビッグネットワークの典型的なアプリケーションは、コミュニティ検出、リンク予測、レコメンデーションなど、レビューされる。さらに、さらに調査すべき重要なオープンな問題についても指摘します。 A network is a typical expressive form of representing complex systems in terms of vertices and links, in which the pattern of interactions amongst components of the network is intricate. The network can be static that does not change over time or dynamic that evolves through time. The complication of network analysis is different under the new circumstance of network size explosive increasing. In this paper, we introduce a new network science concept called big network. Big networks are generally in large-scale with a complicated and higher-order inner structure. This paper proposes a guideline framework that gives an insight into the major topics in the area of network science from the viewpoint of a big network. We first introduce the structural characteristics of big networks from three levels, which are micro-level, meso-level, and macro-level. We then discuss some state-of-the-art advanced topics of big network analysis. Big network models and related approaches, including ranking methods, partition approaches, as well as network embedding algorithms are systematically introduced. Some typical applications in big networks are then reviewed, such as community detection, link prediction, recommendation, etc. 
Moreover, we also pinpoint some critical open issues that need to be investigated further.	翻訳日:2022-11-01 04:52:08 公開日:2020-08-09
# Random Walks: アルゴリズムと応用のレビュー Random Walks: A Review of Algorithms and Applications ( http://arxiv.org/abs/2008.03639v1 ) ライセンス: Link先を確認	Feng Xia, Jiaying Liu, Hansong Nie, Yonghao Fu, Liangtian Wan, Xiangjie Kong	(参考訳) ランダムウォークは、数学空間におけるランダムなステップの連続を含む経路を記述するランダムプロセスとして知られている。数学や計算機科学などの様々な分野で人気を集めている。さらに量子力学では、量子ウォークは古典的ランダムウォークの量子アナログと見なすことができる。古典的なランダムウォークと量子ウォークは、ノード間の近接を計算し、ネットワーク内のトポロジーを抽出するために使用できる。様々なランダムウォーク関連モデルは、リンク予測、レコメンデーション、コンピュータビジョン、半教師付き学習、ネットワーク埋め込みといった下流タスクに非常に重要である。本稿では,古典的ランダムウォークと量子ウォークの総合的なレビューを行う。まず,古典的ランダムウォークと量子ウォークの知識,基本的な概念と一般的なアルゴリズムについて概説する。また,時間複雑性の観点から,量子ウォークと古典ランダムウォークに基づくアルゴリズムを比較する。次に,その応用を計算機科学の分野に導入する。最後に、効率性、主記憶容量、既存アルゴリズムの計算時間の観点から、オープンな問題について議論する。この研究は、ランダムウォークと量子ウォークを一緒に探索することで、この成長する研究領域に寄与することを目的としている。 A random walk is known as a random process which describes a path including a succession of random steps in the mathematical space. It has increasingly been popular in various disciplines such as mathematics and computer science. Furthermore, in quantum mechanics, quantum walks can be regarded as quantum analogues of classical random walks. Classical random walks and quantum walks can be used to calculate the proximity between nodes and extract the topology in the network. Various random walk related models can be applied in different fields, which is of great significance to downstream tasks such as link prediction, recommendation, computer vision, semi-supervised learning, and network embedding. In this paper, we aim to provide a comprehensive review of classical random walks and quantum walks. We first review the knowledge of classical random walks and quantum walks, including basic concepts and some typical algorithms. We also compare the algorithms based on quantum walks and classical random walks from the perspective of time complexity. Then we introduce their applications in the field of computer science. 
Finally we discuss the open issues from the perspectives of efficiency, main-memory volume, and computing time of existing algorithms. This study aims to contribute to this growing area of research by exploring random walks and quantum walks together.	翻訳日:2022-11-01 04:51:56 公開日:2020-08-09
# ネットワーク侵入検知システムにおける敵対的サンプルに対するロバスト性向上 Enhancing Robustness Against Adversarial Examples in Network Intrusion Detection Systems ( http://arxiv.org/abs/2008.03677v1 ) ライセンス: Link先を確認	Mohammad J. Hashemi, Eric Keller	(参考訳) 近年のサイバー攻撃の増加は、より洗練されたネットワーク侵入検知システム(NIDS)の構築を要求している。これらのNIDSは、SDN(Software-Defined Network)にデプロイされる場合のように、ネットワークを経由するすべてのトラフィックを監視することができれば、パフォーマンスが向上する。ゼロデイ攻撃を検出できないため、従来悪意のあるトラフィックを検出するために使われていたシグネチャベースのNIDSは、ニューラルネットワーク上に構築された異常ベースのNIDSに置き換えられ始めている。しかし、近年、このようなNIDSには独自の欠点、すなわち敵対的サンプル攻撃に弱いことが示されている。さらに、これらは、ネットワークシステムが現在直面しうるさまざまな攻撃を反映していない古いデータセットで主に評価されてきた。本稿では、デノイジングオートエンコーダを用いてNIDSを構築する新しい仕組みとしてReconstruction from Partial Observation (RePO)を提案する。これは、誤警報の少ない設定でさまざまな種類のネットワーク攻撃を検出でき、敵対的サンプル攻撃に対する堅牢性も高めている。多様なネットワーク攻撃を含むデータセットを用いて行った評価の結果、デノイジングオートエンコーダにより、最近提案された他の異常検出器と比較して、通常の設定では最大29%、敵対的設定では最大45%、悪意のあるトラフィックの検出が向上することがわかった。 The increase of cyber attacks in both the numbers and varieties in recent years demands to build a more sophisticated network intrusion detection system (NIDS). These NIDS perform better when they can monitor all the traffic traversing through the network like when being deployed on a Software-Defined Network (SDN). Because of the inability to detect zero-day attacks, signature-based NIDS which were traditionally used for detecting malicious traffic are beginning to get replaced by anomaly-based NIDS built on neural networks. However, recently it has been shown that such NIDS have their own drawback namely being vulnerable to the adversarial example attack. Moreover, they were mostly evaluated on the old datasets which don't represent the variety of attacks network systems might face these days. In this paper, we present Reconstruction from Partial Observation (RePO) as a new mechanism to build an NIDS with the help of denoising autoencoders capable of detecting different types of network attacks in a low false alert setting with an enhanced robustness against adversarial example attack. 
Our evaluation conducted on a dataset with a variety of network attacks shows denoising autoencoders can improve detection of malicious traffic by up to 29% in a normal setting and by up to 45% in an adversarial setting compared to other recently proposed anomaly detectors.	翻訳日:2022-11-01 04:51:38 公開日:2020-08-09
# クラスタベースモデリングによる合成音声の深部MOS予測 Deep MOS Predictor for Synthetic Speech Using Cluster-Based Modeling ( http://arxiv.org/abs/2008.03710v1 ) ライセンス: Link先を確認	Yeunju Choi, Youngmoon Jung, Hoirin Kim	(参考訳) 深層学習は音声合成と音声変換において顕著な進歩を遂げてきたが, 人工音声の評価はまだ人間の被験者によって行われている。近年, 深層学習に基づく評価モデルを提案し, 音声品質評価の自動化の可能性を示した。先述した評価モデルであるmosnetを改善するために,クラスタベースのモデリング手法を用いて,グローバル品質トークン(gqt)層の使用,エンコーディング層の使用,および両者の使用という3つのモデルを提案する。我々は、音声変換チャレンジ2018の評価結果を用いて、合成音声の平均意見スコアと合成音声と参照音声の類似度スコアを予測する実験を行った。その結果、gqt層はタスクの有用な品質トークンを自動的に学習することで、人間評価の予測に役立ち、符号化層はフレームレベルのスコアをより正確に活用するのに役立ちます。 While deep learning has made impressive progress in speech synthesis and voice conversion, the assessment of the synthesized speech is still carried out by human participants. Several recent papers have proposed deep-learning-based assessment models and shown the potential to automate the speech quality assessment. To improve the previously proposed assessment model, MOSNet, we propose three models using cluster-based modeling methods: using a global quality token (GQT) layer, using an Encoding Layer, and using both of them. We perform experiments using the evaluation results of the Voice Conversion Challenge 2018 to predict the mean opinion score of synthesized speech and similarity score between synthesized speech and reference speech. The results show that the GQT layer helps to predict human assessment better by automatically learning the useful quality tokens for the task and that the Encoding Layer helps to utilize frame-level scores more precisely.	翻訳日:2022-11-01 04:51:13 公開日:2020-08-09
# グラフ分割による進化的構成木抽出の高速化 Accelerating Evolutionary Construction Tree Extraction via Graph Partitioning ( http://arxiv.org/abs/2008.03669v1 ) ライセンス: Link先を確認	Markus Friedrich and Sebastian Feld and Thomy Phan and Pierre-Alain Fayolle	(参考訳) 潜在的にノイズの多い点群から構成木を抽出することは、コンピュータ支援設計におけるリバースエンジニアリングタスクの重要な側面である。アルゴリズム幾何学に基づく解は、使用可能なモデル表現(例えば二次曲面のみ)と雑音ロバスト性に制約を課す。問題を組合せ最適化問題として再定式化し、進化的アルゴリズムで解くことで、計算複雑性の増大を犠牲にしてこれらの制約の一部を緩和することができる。本稿では,最新のCPUの並列化機能を活用しつつ,進化的構成木抽出を高速化するグラフベースの探索空間分割スキームを提案する。評価の結果、ベースラインアプローチと比較して最大$46.6$倍の高速化が得られた一方で、得られる木のサイズは$25.2\%$から$88.6\%$増加した。 Extracting a Construction Tree from potentially noisy point clouds is an important aspect of Reverse Engineering tasks in Computer Aided Design. Solutions based on algorithmic geometry impose constraints on usable model representations (e.g. quadric surfaces only) and noise robustness. Re-formulating the problem as a combinatorial optimization problem and solving it with an Evolutionary Algorithm can mitigate some of these constraints at the cost of increased computational complexity. This paper proposes a graph-based search space partitioning scheme that is able to accelerate Evolutionary Construction Tree extraction while exploiting parallelization capabilities of modern CPUs. The evaluation indicates a speed-up up to a factor of $46.6$ compared to the baseline approach while resulting tree sizes increased by $25.2\%$ to $88.6\%$.	翻訳日:2022-11-01 04:45:14 公開日:2020-08-09
# ブロックシャッフル:メモリ制限のある高分解能高速スタイル転送方式 Block Shuffle: A Method for High-resolution Fast Style Transfer with Limited Memory ( http://arxiv.org/abs/2008.03706v1 ) ライセンス: Link先を確認	Weifeng Ma, Zhe Chen, Caoting Ji	(参考訳) Fast Style Transferは、フィードフォワードニューラルネットワークを使って入力画像をレンダリングする一連のNeural Style Transferアルゴリズムである。出力層の高次元のため、これらのネットワークは計算に多くのメモリを必要とする。したがって、高解像度画像の場合、ほとんどのモバイルデバイスやパーソナルコンピュータはそれらをスタイリングできないため、Fast Style Transferのアプリケーションシナリオは大幅に制限される。現在、既存の解決策は、より多くのメモリを購入することと、フェザリングベースの方法を使用することの2つであるが、前者は追加コストが必要であり、後者は画質が劣っている。そこで本研究では,高メモリ消費の単一タスクを低メモリ消費の複数のサブタスクに変換する新しい画像合成手法である「block shuffle」を提案する。この手法は、ネットワークアーキテクチャを変更することなく、Fast Style Transferのプラグインとして機能することができる。私たちはGitHubで最も人気のあるFast Style Transferリポジトリをベースラインとして使用しています。実験により,本手法による高分解能画像の品質がフェザリングベースの方法より優れていることを示した。本手法はベースラインよりも桁違いに遅いが,メモリに制限のある環境でも高分解能画像のスタイリングが可能であり,これはベースラインでは不可能である。コードとモデルは https://github.com/czczup/block-shuffle で公開される予定である。 Fast Style Transfer is a series of Neural Style Transfer algorithms that use feed-forward neural networks to render input images. Because of the high dimension of the output layer, these networks require much memory for computation. Therefore, for high-resolution images, most mobile devices and personal computers cannot stylize them, which greatly limits the application scenarios of Fast Style Transfer. At present, the two existing solutions are purchasing more memory and using the feathering-based method, but the former requires additional cost, and the latter has poor image quality. To solve this problem, we propose a novel image synthesis method named block shuffle, which converts a single task with high memory consumption to multiple subtasks with low memory consumption. This method can act as a plug-in for Fast Style Transfer without any modification to the network architecture. We use the most popular Fast Style Transfer repository on GitHub as the baseline. 
Experiments show that the quality of high-resolution images generated by our method is better than that of the feathering-based method. Although our method is an order of magnitude slower than the baseline, it can stylize high-resolution images with limited memory, which is impossible with the baseline. The code and models will be made available on https://github.com/czczup/block-shuffle.	翻訳日:2022-11-01 04:45:00 公開日:2020-08-09
# 病理組織における一般核検出のためのスイッチング損失 Switching Loss for Generalized Nucleus Detection in Histopathology ( http://arxiv.org/abs/2008.03750v1 ) ライセンス: Link先を確認	Deepak Anand, Gaurav Patel, Yaman Dang, Amit Sethi	(参考訳) 医用画像解析における2つの基礎的課題（検出とセグメンテーション）に対する深層学習手法の精度は、クラス不均衡によって低下する可能性がある。本稿では,前景クラスと背景クラスの間で重点を適応的に切り替える「スイッチング損失」関数を提案する。この問題に対処する既存の損失関数は分類タスクによって動機付けられているが、スイッチング損失はDice損失に基づいており、セグメンテーションや検出により適している。さらに、トレーニングサンプルを最大限に活用するために、トレーニングセット全体に対して一度だけ適応する以前の提案とは異なり、ミニバッチごとに損失を適応させる。ソースデータセット上で提案した損失関数を用いて訓練した核検出器は、クロスエントロピー損失、Dice損失、焦点損失を用いて訓練したものよりも優れていた。驚くべきことに、ターゲットデータセットで再訓練することなく、我々の事前訓練済み核検出器は、ターゲットデータセットの少なくとも一部の画像で訓練された既存の核検出器をも上回った。提案した損失の幅広い有用性を確立するため,MRIにおける心室セグメンテーションでも他の損失関数より高い精度が得られることを確認した。当社のGPU対応の事前訓練済み核検出ソフトウェアは、そのままの状態でスライド画像全体を処理でき、実用的な速度で動作する。 The accuracy of deep learning methods for two foundational tasks in medical image analysis -- detection and segmentation -- can suffer from class imbalance. We propose a 'switching loss' function that adaptively shifts the emphasis between foreground and background classes. While the existing loss functions to address this problem were motivated by the classification task, the switching loss is based on Dice loss, which is better suited for segmentation and detection. Furthermore, to get the most out of the training samples, we adapt the loss with each mini-batch, unlike previous proposals that adapt once for the entire training set. A nucleus detector trained using the proposed loss function on a source dataset outperformed those trained using cross-entropy, Dice, or focal losses. Remarkably, without retraining on target datasets, our pre-trained nucleus detector also outperformed existing nucleus detectors that were trained on at least some of the images from the target datasets. To establish a broad utility of the proposed loss, we also confirmed that it led to more accurate ventricle segmentation in MRI as compared to the other loss functions. 
Our GPU-enabled pre-trained nucleus detection software is also ready to process whole slide images right out-of-the-box and is usably fast.	翻訳日:2022-11-01 04:44:28 公開日:2020-08-09
# コンピュータビジョンと慣性センサを用いた軌道形状計測手法 A methodology for the measurement of track geometry based on computer vision and inertial sensors ( http://arxiv.org/abs/2008.03763v1 ) ライセンス: Link先を確認	José L. Escalona	(参考訳) 本論文は,鉄道車両に搭載される軌道形状測定システム(TGMS)における軌道形状の不規則性の計算に使用される理論について述べる。TGMSは、データ取得と処理のためのコンピュータと、慣性測定ユニット(IMU、3Dジャイロスコープおよび3D加速度計)、2つのビデオカメラ、エンコーダを含む一連のセンサーで構成される。提案システムの主な特徴は次のとおりである。 1. 非接触技術を用いて、軌道アライメント、垂直プロファイル、クロスレベル、ゲージ、ツイスト、レールヘッドプロファイルを測定することができる。 2. 鉄道車両に設置可能で、コンパクトかつ低コストである。車両の移動時に装置からレールヘッドが見えていれば、輪軸レベル、プライマリサスペンションの上(台車枠)、セカンダリサスペンションの上(車体)のいずれであっても、車両の任意の部位に設置することができる。 This document describes the theory used for the calculation of track geometric irregularities on a Track Geometry Measuring System (TGMS) to be installed in railway vehicles. The TGMS includes a computer for data acquisition and processing, a set of sensors including an inertial measuring unit (IMU, 3D gyroscope and 3D accelerometer), two video cameras and an encoder. The main features of the proposed system are: 1. It is capable of measuring track alignment, vertical profile, cross-level, gauge, twist and rail-head profile using non-contact technology. 2. It can be installed in line railway vehicles. It is compact and low cost. Provided that the equipment sees the rail heads when the vehicle is moving, it can be installed in any body of the vehicle: at the wheelsets level, above primary suspension (bogie frame) or above the secondary suspension (car body).	翻訳日:2022-11-01 04:44:06 公開日:2020-08-09
# 正則照明最適化と深部雑音抑圧による低光海洋画像強調 Low-Light Maritime Image Enhancement with Regularized Illumination Optimization and Deep Noise Suppression ( http://arxiv.org/abs/2008.03765v1 ) ライセンス: Link先を確認	Yu Guo, Yuxu Lu, Ryan Wen Liu, Meifang Yang, Kwok Tai Chui	(参考訳) 低光撮像条件下で撮影された海洋画像は視認性が低く、予期せぬノイズが発生しやすいため、海上交通の監督や管理に悪影響を及ぼす。画像性能向上のためには、劣化した低光度画像から重要な視覚情報を復元する必要がある。本稿では,正規化照明最適化とディープノイズ抑圧による低照度画像の高精細化を提案する。特に,L0-ノルム勾配間隔と構造認識正規化を併用したハイブリッド正規化変分モデルを示し,Max-RGBを用いて推定した粗い照明マップを改良する。次に、洗練された照明マップを調整するために適応ガンマ補正法を導入する。 Retinex理論の仮定に基づいて、リフレクションマップを最適化するために、ガイド付きフィルタに基づく詳細強化手法を導入する。調整された照明と最適化された反射マップを組み合わせて、拡張された海洋画像を生成する。望ましくないノイズが撮像性能に与える影響を抑制するため、深層学習に基づくブラインドデノイングフレームワークを更に導入し、強調画像の視覚的品質を向上する。特に、このフレームワークは2つのサブネットワーク(E-NetとD-Net)で構成されており、それぞれノイズレベル推定と非ブラインドノイズ低減に採用されている。画像強調手法の主な利点は、規則化された照明最適化と深い目隠しを最大限に活用できることである。人工海事画像と現実海事画像の総合的な実験を行い,提案手法と最先端画像との比較を行った。実験結果から,定量評価と定性評価の両面で優れた性能を示した。 Maritime images captured under low-light imaging condition easily suffer from low visibility and unexpected noise, leading to negative effects on maritime traffic supervision and management. To promote imaging performance, it is necessary to restore the important visual information from degraded low-light images. In this paper, we propose to enhance the low-light images through regularized illumination optimization and deep noise suppression. In particular, a hybrid regularized variational model, which combines L0-norm gradient sparsity prior with structure-aware regularization, is presented to refine the coarse illumination map originally estimated using Max-RGB. The adaptive gamma correction method is then introduced to adjust the refined illumination map. Based on the assumption of Retinex theory, a guided filter-based detail boosting method is introduced to optimize the reflection map. The adjusted illumination and optimized reflection maps are finally combined to generate the enhanced maritime images. 
To suppress the effect of unwanted noise on imaging performance, a deep learning-based blind denoising framework is further introduced to promote the visual quality of enhanced image. In particular, this framework is composed of two sub-networks, i.e., E-Net and D-Net adopted for noise level estimation and non-blind noise reduction, respectively. The main benefit of our image enhancement method is that it takes full advantage of the regularized illumination optimization and deep blind denoising. Comprehensive experiments have been conducted on both synthetic and realistic maritime images to compare our proposed method with several state-of-the-art imaging methods. Experimental results have illustrated its superior performance in terms of both quantitative and qualitative evaluations.	翻訳日:2022-11-01 04:43:49 公開日:2020-08-09
# Sequence-to-Sequence ASRにおけるBERTの知識の蒸留 Distilling the Knowledge of BERT for Sequence-to-Sequence ASR ( http://arxiv.org/abs/2008.03822v1 ) ライセンス: Link先を確認	Hayato Futami, Hirofumi Inaguma, Sei Ueno, Masato Mimura, Shinsuke Sakai and Tatsuya Kawahara	(参考訳) 注意に基づくシーケンス・ツー・シーケンス(seq2seq)モデルは自動音声認識(ASR)において有望な結果を得た。しかし、これらのモデルは左から右にデコードするので、右のコンテキストにアクセスできない。我々は、知識蒸留を通じてseq2seq ASRにBERTを外部言語モデルとして適用することで、左右両方の文脈を活用する。提案手法では,BERTがseq2seq ASRのトレーニングを導くソフトラベルを生成する。さらに,現在の発話を超えた文脈をBERTの入力として活用する。実験的評価により、日本語話し言葉コーパス(CSJ)において、本手法がseq2seqベースラインからASR性能を有意に向上させることを示す。BERTからの知識蒸留は、左の文脈だけを見るTransformer LMからの知識蒸留よりも優れている。また,現在の発話を超えた文脈の活用の有効性を示す。提案手法は,n-best rescoringや浅層融合(shallow fusion)といった他のLM適用手法よりも優れており,しかも追加の推論コストを必要としない。 Attention-based sequence-to-sequence (seq2seq) models have achieved promising results in automatic speech recognition (ASR). However, as these models decode in a left-to-right way, they do not have access to context on the right. We leverage both left and right context by applying BERT as an external language model to seq2seq ASR through knowledge distillation. In our proposed method, BERT generates soft labels to guide the training of seq2seq ASR. Furthermore, we leverage context beyond the current utterance as input to BERT. Experimental evaluations show that our method significantly improves the ASR performance from the seq2seq baseline on the Corpus of Spontaneous Japanese (CSJ). Knowledge distillation from BERT outperforms that from a transformer LM that only looks at left context. We also show the effectiveness of leveraging context beyond the current utterance. Our method outperforms other LM application approaches such as n-best rescoring and shallow fusion, while it does not require extra inference cost.	翻訳日:2022-11-01 04:35:53 公開日:2020-08-09
# SVMモデルからのホワイトボックス誘導:ロジックプログラミングによる説明可能なAI White-box Induction From SVM Models: Explainable AI with Logic Programming ( http://arxiv.org/abs/2008.03301v1 ) ライセンス: Link先を確認	Farhad Shakerin, Gopal Gupta	(参考訳) 本稿では,サポートベクトルマシン(SVM)アルゴリズムで学習したモデルを説明するロジックプログラムの誘導問題に焦点をあてる。トップダウンの逐次被覆型帰納論理プログラミング(ILP)アルゴリズム(例えば、FOIL)は、情報理論からのヒューリスティックスを用いたヒルクライミング探索を適用する。このタイプのアルゴリズムの大きな問題は、局所最適解に陥ってしまうことだ。これに対し我々の新たなアプローチでは,データ依存のヒルクライミング探索をモデル依存の探索に置き換える。まず大域的に最適なSVMモデルをトレーニングし,次に,そのモデルにおいて最も影響力のあるデータポイントであるサポートベクトルに着目し,そのサポートベクトルおよびそれに最もよく似た点をカバーする節を誘導する。固定の仮説探索空間を定義する代わりに、我々のアルゴリズムは、説明可能なAIにおける事例単位の解釈手法であるSHAPを用いて、関連する特徴の集合を決定する。このアプローチにより、SVMモデルの基盤となるロジックを捉え、誘導される節の数と分類評価メトリクスの観点から他のILPアルゴリズムを上回るアルゴリズムが得られる。本論文は「Theory and Practice of Logic Programming」誌での出版に向けて検討中である。 We focus on the problem of inducing logic programs that explain models learned by the support vector machine (SVM) algorithm. The top-down sequential covering inductive logic programming (ILP) algorithms (e.g., FOIL) apply hill-climbing search using heuristics from information theory. A major issue with this class of algorithms is getting stuck in a local optimum. In our new approach, however, the data-dependent hill-climbing search is replaced with a model-dependent search where a globally optimal SVM model is trained first, then the algorithm looks into support vectors as the most influential data points in the model, and induces a clause that would cover the support vector and points that are most similar to that support vector. Instead of defining a fixed hypothesis search space, our algorithm makes use of SHAP, an example-specific interpreter in explainable AI, to determine a relevant set of features. This approach yields an algorithm that captures SVM model's underlying logic and outperforms other ILP algorithms in terms of the number of induced clauses and classification evaluation metrics. 
This paper is under consideration for publication in the journal "Theory and Practice of Logic Programming".	翻訳日:2022-11-01 04:35:39 公開日:2020-08-09
# 網膜画像における異常検出のためのp-netとの符号化構造-テキスト関係 Encoding Structure-Texture Relation with P-Net for Anomaly Detection in Retinal Images ( http://arxiv.org/abs/2008.03632v1 ) ライセンス: Link先を確認	Kang Zhou, Yuting Xiao, Jianlong Yang, Jun Cheng, Wen Liu, Weixin Luo, Zaiwang Gu, Jiang Liu, Shenghua Gao	(参考訳) 網膜画像における異常検出は、トレーニング段階において正常な画像のみを活用することにより、様々な網膜疾患/疾患による異常の同定を指す。健康な被験者の正常な画像は、しばしば規則的な構造を持つ(例えば、基底画像の血管構造、光コヒーレンス断層画像の解剖学的構造など)。逆に、疾患や病変はしばしばこれらの構造を破壊する。そこで本研究では,画像テクスチャと構造の関係を利用して,異常検出のためのディープニューラルネットワークの設計を提案する。具体的には、まず網膜画像の構造を抽出し、次に、元の健康画像から抽出された構造特徴と最終層特徴の両方を組み合わせて、元の入力された健康画像の再構成を行う。画像特徴はテクスチャ情報を提供し、構造から回収された画像の特異性を保証する。最後に、再構成画像を利用して構造を抽出し、原画像から抽出した構造と再構成画像との差を測定する。一方、再構成差の最小化は正則化器のように振舞い、画像の復元が保証される。一方、そのような構造差は正規度測定の計量としても用いられる。ネットワーク全体は ``P'' 形状であるため、P-Net と呼ばれる。 RESCデータセットとiSeeデータセットの大規模な実験により、網膜画像における異常検出に対するアプローチの有効性が検証された。さらに,本手法は,網膜画像における新たなクラス発見や実世界画像における異常検出にもよく適用できる。 Anomaly detection in retinal image refers to the identification of abnormality caused by various retinal diseases/lesions, by only leveraging normal images in training phase. Normal images from healthy subjects often have regular structures (e.g., the structured blood vessels in the fundus image, or structured anatomy in optical coherence tomography image). On the contrary, the diseases and lesions often destroy these structures. Motivated by this, we propose to leverage the relation between the image texture and structure to design a deep neural network for anomaly detection. Specifically, we first extract the structure of the retinal images, then we combine both the structure features and the last layer features extracted from original health image to reconstruct the original input healthy image. The image feature provides the texture information and guarantees the uniqueness of the image recovered from the structure. In the end, we further utilize the reconstructed image to extract the structure and measure the difference between structure extracted from original and the reconstructed image. 
On the one hand, minimizing the reconstruction difference behaves like a regularizer to guarantee that the image is correctly reconstructed. On the other hand, such structure difference can also be used as a metric for normality measurement. The whole network is termed P-Net because it has a ``P'' shape. Extensive experiments on the RESC and iSee datasets validate the effectiveness of our approach for anomaly detection in retinal images. Further, our method also generalizes well to novel class discovery in retinal images and anomaly detection in real-world images.	翻訳日:2022-11-01 04:34:14 公開日:2020-08-09
# ベクター表現による分子画像の増強と薬物分類への応用 Augmenting Molecular Images with Vector Representations as a Featurization Technique for Drug Classification ( http://arxiv.org/abs/2008.03646v1 ) ライセンス: Link先を確認	Daniel de Marchi, Amarjit Budhiraja	(参考訳) 薬物の分類と生成のためのディープラーニングシステムを構築するための重要なステップの1つは、分子の創成の選択である。以前は、分子画像、二分文字列、グラフ、スマイル文字列などがあった。本稿では,分子画像のみに含まれる,あるいは理解が容易でない情報をエンコードする2進ベクトルをキャプションとした分子画像の作成を提案する。具体的には、より高いレベルの構造情報をエンコードするMorgan fingerprintsと、分子の性質や構造についてイエスかノーかをエンコードするMACCS keysを使用します。本手法をパンデ研究所が公開しているHIVデータセットで検証し,HIVウイルスを阻害すると41,127個の分子がラベル付けされた。我々の最終モデルは、HIVデータセット上のAUC ROCの状態を達成し、他のすべての方法よりも優れています。さらに、モデルは他のほとんどの方法よりもはるかに高速に収束し、未拡張画像よりも計算能力が劇的に低下した。 One of the key steps in building deep learning systems for drug classification and generation is the choice of featurization for the molecules. Previous featurization methods have included molecular images, binary strings, graphs, and SMILES strings. This paper proposes the creation of molecular images captioned with binary vectors that encode information not contained in or easily understood from a molecular image alone. Specifically, we use Morgan fingerprints, which encode higher-level structural information, and MACCS keys, which encode yes-or-no questions about a molecule's properties and structure. We tested our method on the HIV dataset published by the Pande lab, which consists of 41,127 molecules labeled by whether they inhibit HIV. Our final model achieved a state-of-the-art AUC ROC on the HIV dataset, outperforming all other methods. Moreover, the model converged significantly faster than most other methods, requiring dramatically less computational power than unaugmented images.	翻訳日:2022-11-01 04:33:17 公開日:2020-08-09
# リアルタイムuav追跡のための学習一貫性追従相関フィルタ Learning Consistency Pursued Correlation Filters for Real-Time UAV Tracking ( http://arxiv.org/abs/2008.03704v1 ) ライセンス: Link先を確認	Changhong Fu, Xiaoxiao Yang, Fan Li, Juntao Xu, Changjing Liu, and Peng Lu	(参考訳) 相関フィルタ(CF)に基づく手法は、無人航空機(UAV)の視覚的物体追跡において例外的な性能を示すが、望ましくない境界効果に悩まされている。この問題を解決するために,空間正規化相関フィルタ(srdcf)は,フィルタ係数をペナライズする空間正規化を提案する。しかし、応答マップに隠された時間情報はsrdcfでは考慮されないため、正確な追跡のための識別能力とロバスト性が制限される。本研究は,動的整合性追従相関フィルタ,すなわちCPCFトラッカーを用いた新しい手法を提案する。具体的には、隣接する応答マップ間の相関操作により、フレーム間の一貫性レベルを表す実用的な一貫性マップを生成する。実用的理想的一貫性マップと計画的理想的一貫性マップとの差を最小化することにより、時間的滑らかさを維持するために一貫性レベルを制約し、応答マップに含まれる豊富な時間情報を導入する。さらに,複雑な状況下でのトラッカーの適応性を向上させるために,動的制約戦略を提案する。包括的な実験は、UAV123@10FPS、UAVDT、DTB70という3つの挑戦的なUAVベンチマークで行われている。実験結果に基づき、提案したトラッカーは、1つのCPU上でのリアルタイムランニング速度($43FPS)の他の25の最先端トラッカーを好んで上回っている。 Correlation filter (CF)-based methods have demonstrated exceptional performance in visual object tracking for unmanned aerial vehicle (UAV) applications, but suffer from the undesirable boundary effect. To solve this issue, spatially regularized correlation filters (SRDCF) proposes the spatial regularization to penalize filter coefficients, thereby significantly improving the tracking performance. However, the temporal information hidden in the response maps is not considered in SRDCF, which limits the discriminative power and the robustness for accurate tracking. This work proposes a novel approach with dynamic consistency pursued correlation filters, i.e., the CPCF tracker. Specifically, through a correlation operation between adjacent response maps, a practical consistency map is generated to represent the consistency level across frames. By minimizing the difference between the practical and the scheduled ideal consistency map, the consistency level is constrained to maintain temporal smoothness, and rich temporal information contained in response maps is introduced. 
Besides, a dynamic constraint strategy is proposed to further improve the adaptability of the proposed tracker in complex situations. Comprehensive experiments are conducted on three challenging UAV benchmarks, i.e., UAV123@10FPS, UAVDT, and DTB70. Based on the experimental results, the proposed tracker favorably surpasses the other 25 state-of-the-art trackers with real-time running speed ($\sim$43FPS) on a single CPU.	翻訳日:2022-11-01 04:27:28 公開日:2020-08-09
# レーン幅に先行した車線境界観測を用いた時間的一貫性のあるipmのためのオンラインextrinsicカメラキャリブレーション Online Extrinsic Camera Calibration for Temporally Consistent IPM Using Lane Boundary Observations with a Lane Width Prior ( http://arxiv.org/abs/2008.03722v1 ) ライセンス: Link先を確認	Jeong-Kyun Lee and Young-Ki Baik and Hankyu Cho and Seungwoo Yoo	(参考訳) 本稿では,道路面からのピッチ,ヨー,ロール角,カメラ高さを逐次駆動シーン画像から推定するオンライン外部カメラキャリブレーション手法を提案する。提案手法では,2段階のカメラパラメータを推定する。 1)一組の車線境界観測から計算した消滅点を用いてピッチとヨー角を同時に推定する。 2)車線幅観測と車線幅の差を最小化して、ロール角度とカメラ高さを算出する。外部カメラパラメータは拡張カルマンフィルタリング(EKF)を用いて順次更新され、最終的に逆視点マッピング(IPM)により時間的に一貫した鳥眼ビュー(BEV)画像を生成する。合成および実世界のデータセットにおける提案手法の優位性を示す。 In this paper, we propose a method for online extrinsic camera calibration, i.e., estimating pitch, yaw, roll angles and camera height from road surface in sequential driving scene images. The proposed method estimates the extrinsic camera parameters in two steps: 1) pitch and yaw angles are estimated simultaneously using a vanishing point computed from a set of lane boundary observations, and then 2) roll angle and camera height are computed by minimizing difference between lane width observations and a lane width prior. The extrinsic camera parameters are sequentially updated using extended Kalman filtering (EKF) and are finally used to generate a temporally consistent bird-eye-view (BEV) image by inverse perspective mapping (IPM). We demonstrate the superiority of the proposed method in synthetic and real-world datasets.	翻訳日:2022-11-01 04:26:33 公開日:2020-08-09
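
Step 1 of the abstract above (pitch and yaw from a vanishing point) can be illustrated with a minimal sketch. It assumes a pinhole camera with known intrinsics `fx`, `fy`, `cx`, `cy`, an image y-axis that points down, and lane boundaries parallel to the driving direction. The function name and the sign conventions are illustrative assumptions; the paper's exact formulation is not given in the abstract.

```python
import math


def pitch_yaw_from_vanishing_point(u, v, fx, fy, cx, cy):
    """Return (pitch, yaw) in radians of the lane direction in the camera frame.

    (u, v) is the vanishing point in pixels; fx, fy, cx, cy are assumed
    pinhole intrinsics (hypothetical parameter names, not from the paper).
    """
    # Back-project the vanishing point to a 3D direction (x right, y down, z forward).
    dx = (u - cx) / fx
    dy = (v - cy) / fy
    dz = 1.0
    # Yaw: positive when the vanishing point lies to the right of the principal point.
    yaw = math.atan2(dx, dz)
    # Pitch: positive when the vanishing point lies above the principal point.
    pitch = math.atan2(-dy, math.hypot(dx, dz))
    return pitch, yaw
```

A vanishing point that coincides with the principal point gives zero pitch and yaw under these conventions; roll and camera height are not recoverable from this single point, which is consistent with the abstract treating them in a separate second step.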
# SOFA-Net: クラウドカウントのための2次および1次アテンションネットワーク SOFA-Net: Second-Order and First-order Attention Network for Crowd Counting ( http://arxiv.org/abs/2008.03723v1 ) ライセンス: Link先を確認	Haoran Duan, Shidong Wang, Yu Guan	(参考訳) 近年、スマートシティーに広範に応用されているため、画像や動画からの群衆自動カウントが注目されている。しかし、密集した群衆をモデル化することは困難であり、既存の作品のほとんどが信頼性が低下する。本研究で提案したSOFA-Net(Second-Order and First-order Attention Network)では,高密度頭部のチャネルワイド空間情報の選択性を維持するために2次統計を抽出し,頭部領域の特徴識別を強化する1次統計を補完情報として用いた。マルチストリームアーキテクチャにより,提案する2次/1次統計を学習し,ロバスト表現の洗練に注意を向けた。提案手法を4つの公開データセットで評価し,そのほとんどは最新技術に到達した。また,提案するsofa-netの構成成分について広範な実験を行い,課題シナリオにおけるモデル群における2次・1次統計の高機能化を示唆した。私たちの知る限りでは、クラウドカウントの2/1次統計を探求する最初の仕事です。 Automated crowd counting from images/videos has attracted more attention in recent years because of its wide application in smart cities. But modelling the dense crowd heads is challenging and most of the existing works become less reliable. To obtain the appropriate crowd representation, in this work we proposed SOFA-Net(Second-Order and First-order Attention Network): second-order statistics were extracted to retain selectivity of the channel-wise spatial information for dense heads while first-order statistics, which can enhance the feature discrimination for the heads' areas, were used as complementary information. Via a multi-stream architecture, the proposed second/first-order statistics were learned and transformed into attention for robust representation refinement. We evaluated our method on four public datasets and the performance reached state-of-the-art on most of them. Extensive experiments were also conducted to study the components in the proposed SOFA-Net, and the results suggested the high-capability of second/first-order statistics on modelling crowd in challenging scenarios. To the best of our knowledge, we are the first work to explore the second/first-order statistics for crowd counting.	翻訳日:2022-11-01 04:26:17 公開日:2020-08-09
# イメージインペインティングのための繰り返し特徴推論 Recurrent Feature Reasoning for Image Inpainting ( http://arxiv.org/abs/2008.03737v1 ) ライセンス: Link先を確認	Jingyuan Li, Ning Wang, Lefei Zhang, Bo Du, Dacheng Tao	(参考訳) 既存の塗装法は, 画像欠陥の回復に有望な性能を達成している。しかし,穴中心の制約の欠如により,大きな連続孔への充填は困難である。本稿では、主にプラグ・アンド・プレイのリカレント特徴推論モジュールと知識一貫性注意(kca)モジュールによって構築されるリカレント特徴推論(rfr)ネットワークを考案する。 RFRモジュールは、人間がパズルを解く方法(つまり、より簡単な部分を解き、難解な部分を解くための追加情報として結果を使用する)に似て、畳み込み特徴写像の穴の境界を反復的に推論し、さらに推論するための手がかりとして使用する。モジュールは徐々にホールセンターの制約を強化し、その結果は明確になる。 RFRの特徴マップ内の離れた場所からの情報を取得するため、我々はさらにKCAを開発し、RFRに組み込む。経験的に、提案したRFR-Netと既存のバックボーンを比較して、RFR-Netがより効率的であることを示す(例えば、同じモデルサイズで4\%のSSIMの改善)。次に、ネットワークを現在の最先端のコンテキストに配置し、パフォーマンスを向上させます。対応するソースコードは、https://github.com/jingyuanli001/RFR-Inpaintingで入手できる。 Existing inpainting methods have achieved promising performance for recovering regular or small image defects. However, filling in large continuous holes remains difficult due to the lack of constraints for the hole center. In this paper, we devise a Recurrent Feature Reasoning (RFR) network which is mainly constructed by a plug-and-play Recurrent Feature Reasoning module and a Knowledge Consistent Attention (KCA) module. Analogous to how humans solve puzzles (i.e., first solve the easier parts and then use the results as additional information to solve difficult parts), the RFR module recurrently infers the hole boundaries of the convolutional feature maps and then uses them as clues for further inference. The module progressively strengthens the constraints for the hole center and the results become explicit. To capture information from distant places in the feature map for RFR, we further develop KCA and incorporate it in RFR. Empirically, we first compare the proposed RFR-Net with existing backbones, demonstrating that RFR-Net is more efficient (e.g., a 4\% SSIM improvement for the same model size). We then place the network in the context of the current state-of-the-art, where it exhibits improved performance. 
The corresponding source code is available at: https://github.com/jingyuanli001/RFR-Inpainting	翻訳日:2022-11-01 04:25:53 公開日:2020-08-09
# 核ノルムと学習グラフモデルを用いた奥行き画像の復調 Depth image denoising using nuclear norm and learning graph model ( http://arxiv.org/abs/2008.03741v1 ) ライセンス: Link先を確認	Chenggang Yan, Zhisheng Li, Yongbing Zhang, Yutao Liu, Xiangyang Ji, Yongdong Zhang	(参考訳) 3次元(3D)シーンを反映しており、コンピュータビジョンの様々な分野に適用できるため、近年では奥行き画像がホットな研究テーマになりつつある。しかし、深度カメラから得られた深度画像にはノイズなどの汚れが含まれており、深度関連のアプリケーションの性能を著しく損なう。本稿では,パッチ間の類似性収集にグループベース画像復元手法が有効であることを考慮し,グループベース核ノルム・学習グラフ(GNNLG)モデルを提案した。各パッチに対して、検索ウィンドウ内で最もよく似たパッチを見つけてグループ化する。本モデルでは,グループパッチの内在的低ランク特性を利用した。さらに,画像のトポロジ的構造を反映したグラフラプラシアン行列を探索し,よりスムーズな事前処理を行うことを目的として,多様体学習手法を検証し,効率的な学習戦略を考案した。高速で高速な収束を実現するために,GNNLG を解くために乗算器の交互方向法 (ADMM) を提案する。実験の結果,提案手法は主観的,客観的両基準において,他の最先端の復調法よりも優れていることがわかった。 Depth image denoising is becoming a hot research topic because depth images reflect the three-dimensional (3D) scene and can be applied in various fields of computer vision. However, depth images obtained from a depth camera usually contain contamination such as noise, which greatly impairs the performance of depth-related applications. In this paper, considering that group-based image restoration methods are more effective at gathering the similarity among patches, a group-based nuclear norm and learning graph (GNNLG) model is proposed. For each patch, we find and group the most similar patches within a search window. The intrinsic low-rank property of the grouped patches is exploited in our model. In addition, we study the manifold learning method and devise an effective optimized learning strategy to obtain the graph Laplacian matrix, which reflects the topological structure of the image, to further impose smoothing priors on the denoised depth image. To achieve fast speed and high convergence, the alternating direction method of multipliers (ADMM) is adopted to solve our GNNLG model. The experimental results show that the proposed method is superior to other current state-of-the-art denoising methods in both subjective and objective criteria.	翻訳日:2022-11-01 04:25:29 公開日:2020-08-09
# SemEval-2020 Task 8: Memotion Analysis - The Visuo-Lingual Metaphor! SemEval-2020 Task 8: Memotion Analysis -- The Visuo-Lingual Metaphor! ( http://arxiv.org/abs/2008.03781v1 ) ライセンス: Link先を確認	Chhavi Sharma and Deepesh Bhageria and William Scott and Srinivas PYKL and Amitava Das and Tanmoy Chakraborty and Viswanath Pulabaigari and Bjorn Gamback	(参考訳) ソーシャルメディア上の情報は、テキスト、ビジュアル、オーディオなどの様々なモダリティから構成される。 NLPとコンピュータビジョンのコミュニティは、ソーシャルメディアを研究するために単独で1つの顕著なモダリティしか利用していない。しかし、インターネットミームの計算処理にはハイブリッドアプローチが必要である。 Facebook、Instagram、Twitterなどのソーシャルメディアプラットフォームにおけるインターネットミームの普及はさらに、そのようなマルチモーダルコンテンツは無視できないことを示唆している。我々の知る限りでは、ミームの感情分析にはあまり注意が払わない。本提案の目的は,インターネットミームの自動処理に研究コミュニティの注意を向けることである。 task memotion analysisは、約10kの注釈付きミームをリリースし、感情(ポジティブ、ネガティブ、ニュートラル)、感情の種類(皮肉、面白い、不快、モチベーション)、それに対応する強度という、人間の注釈付きラベルを付けた。課題は、ミームの感情分析(肯定的、否定的、中立的)、ミームの全体的な感情分類(ユーモア、皮肉、攻撃的、動機づけ的)、ミームの強さの分類という3つのサブタスクで構成されていた。最高成績は3つのサブタスクごとにそれぞれ0.35, 0.51, 0.32のf1スコアであった。 Information on social media comprises various modalities such as textual, visual and audio. NLP and Computer Vision communities often leverage only one prominent modality in isolation to study social media. However, the computational processing of Internet memes needs a hybrid approach. The growing ubiquity of Internet memes on social media platforms such as Facebook, Instagram, and Twitter further suggests that we cannot ignore such multimodal content anymore. To the best of our knowledge, meme emotion analysis has received little attention. The objective of this proposal is to bring the attention of the research community towards the automatic processing of Internet memes. The task Memotion analysis released approximately 10K annotated memes, with human-annotated labels, namely sentiment (positive, negative, neutral), type of emotion (sarcastic, funny, offensive, motivation) and their corresponding intensity.
The challenge consisted of three subtasks: sentiment (positive, negative, and neutral) analysis of memes, overall emotion (humour, sarcasm, offensive, and motivational) classification of memes, and classifying intensity of meme emotion. The best performances achieved were F1 (macro average) scores of 0.35, 0.51 and 0.32, respectively for each of the three subtasks.	翻訳日:2022-11-01 04:25:08 公開日:2020-08-09
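
The macro-averaged F1 score used to report the results above can be sketched as follows. This is a generic, dependency-free illustration of the metric and not the task organizers' evaluation script; the function name is an assumption.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

Because every class contributes equally regardless of its frequency, a rare class that is classified poorly lowers the macro average as much as a frequent one would.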
# 高解像度画像に基づくディープラーニングによるLiDARデータ強化:低コストLiDARを用いた高性能LiDARSLAMの実現へのアプローチ LiDAR Data Enrichment Using Deep Learning Based on High-Resolution Image: An Approach to Achieve High-Performance LiDAR SLAM Using Low-cost LiDAR ( http://arxiv.org/abs/2008.03694v1 ) ライセンス: Link先を確認	Jiang Yue, Weisong Wen, Jing Han, and Li-Ta Hsu	(参考訳) LiDARベースのSLAMアルゴリズムは、過去数十年で自動運転車(ADV)の堅牢で正確な位置決めを提供するために、広範囲に研究されている。 64チャンネルの高品位3dlidarで十分な性能を得ることができ、密集した点雲を提供することができる。残念ながら、高い価格がADVの広範な商業化を著しく妨げている。コスト効率のよい16チャンネルの3D LiDARは、有望な代替品だ。しかし、16チャンネルのlidarによってのみ、限定的かつスパースポイントの雲が提供され、動的環境におけるadvの十分な位置決め精度を保証することはできない。低コストカメラからの高解像度画像は、周囲についての豊富な情報を提供することができる。しかし、画像から明らかな深度情報は得られない。本稿では,3D LiDARとカメラの相補性に着想を得て,カメラからの高解像度画像を用いて,最先端のディープラーニングアルゴリズムに基づいて,低コストの16チャンネルLiDARから生の3D点雲を濃縮する手法を提案する。 ERFNetは、まず、生のスパース3D点雲の助けを借りて画像を分割するために使用される。一方、スパース畳み込みニューラルネットワークは、生のスパース3d点雲に基づいて密集した点雲を予測するために用いられる。そして、新たな多層畳み込みニューラルネットワークを用いてerfnetのセグメンテーション出力と予測された濃密点雲を融合させ、予測した3d点雲を精製する。最後に、高密度点雲を用いて、最先端の正規分布変換(NDT)に基づいてLiDAR SLAMを実行する。再編集されたkittiデータセットに対して,(1)スパース3dポイントの雲は,平均2乗誤差1.1mseで著しく濃縮されている。 2) LiDAR SLAM から生成された地図はより密集しており, 精度が著しく低下しない。 LiDAR-based SLAM algorithms are extensively studied to providing robust and accurate positioning for autonomous driving vehicles (ADV) in the past decades. Satisfactory performance can be obtained using high-grade 3D LiDAR with 64 channels, which can provide dense point clouds. Unfortunately, the high price significantly prevents its extensive commercialization in ADV. The cost-effective 3D LiDAR with 16 channels is a promising replacement. However, only limited and sparse point clouds can be provided by the 16 channels LiDAR, which cannot guarantee sufficient positioning accuracy for ADV in challenging dynamic environments. The high-resolution image from the low-cost camera can provide ample information about the surroundings. However, the explicit depth information is not available from the image. 
Inspired by the complementarity of 3D LiDAR and camera, this paper proposes to use the high-resolution images from a camera to enrich the raw 3D point clouds from the low-cost 16-channel LiDAR based on a state-of-the-art deep learning algorithm. An ERFNet is first employed to segment the image with the aid of the raw sparse 3D point clouds. Meanwhile, a sparse convolutional neural network is employed to predict the dense point clouds based on the raw sparse 3D point clouds. Then, the predicted dense point clouds are fused with the segmentation outputs from ERFNet using a novel multi-layer convolutional neural network to refine the predicted 3D point clouds. Finally, the enriched point clouds are employed to perform LiDAR SLAM based on the state-of-the-art normal distributions transform (NDT). We tested our approach on the re-edited KITTI datasets: (1) the sparse 3D point clouds are significantly enriched, with a mean square error (MSE) of 1.1 m; (2) the map generated by the LiDAR SLAM is denser and includes more detail, without significant accuracy loss.	翻訳日:2022-11-01 04:18:20 公開日:2020-08-09
# 新型コロナウイルス(covid-19)診断のための深層学習技術のレビュー A Review on Deep Learning Techniques for the Diagnosis of Novel Coronavirus (COVID-19) ( http://arxiv.org/abs/2008.04815v1 ) ライセンス: Link先を確認	Md. Milon Islam, Fakhri Karray, Reda Alhajj, Jia Zeng	(参考訳) 新型コロナウイルス(COVID-19)の感染拡大は世界中で悲惨な状況となり、過去100年で最も急性で深刻な病気の1つとなっている。新型コロナウイルス(covid-19)の感染率は世界中で急速に増加している。このパンデミックのワクチンはまだ発見されていないが、深層学習技術は、新型コロナウイルスの自動診断に臨床医が使用する兵器の強力なツールであることが証明された。本稿では,CT (Computer Tomography) やX線といった様々な医用画像を用いた深層学習技術に基づく最近のシステムの概要を述べる。本稿では、深層学習技術を用いて新型コロナウイルスの診断のために開発されたシステムについて論じ、これらのネットワークのトレーニングに使用されるよく知られたデータセットについて考察する。また、この分野の研究者によって開発されたデータパーティショニング技術や様々なパフォーマンス対策についても強調する。分類学は、最近の著作を適切な洞察のために分類するために引き起こされる。最後に、新型コロナウイルス検出におけるディープラーニング手法の使用に伴う課題と、この研究領域における今後の展望について述べる。本論文は、この点において、深層学習技術がどのように使われているか、また新型コロナウイルスの感染拡大と戦うためにどのように機能するかについて、専門家や技術者に新たな洞察を提供することを目的としている。 Novel coronavirus (COVID-19) outbreak, has raised a calamitous situation all over the world and has become one of the most acute and severe ailments in the past hundred years. The prevalence rate of COVID-19 is rapidly rising every day throughout the globe. Although no vaccines for this pandemic have been discovered yet, deep learning techniques proved themselves to be a powerful tool in the arsenal used by clinicians for the automatic diagnosis of COVID-19. This paper aims to overview the recently developed systems based on deep learning techniques using different medical imaging modalities like Computer Tomography (CT) and X-ray. This review specifically discusses the systems developed for COVID-19 diagnosis using deep learning techniques and provides insights on well-known data sets used to train these networks. It also highlights the data partitioning techniques and various performance measures developed by researchers in this field. A taxonomy is drawn to categorize the recent works for proper insight. 
Finally, we conclude by addressing the challenges associated with the use of deep learning methods for COVID-19 detection and probable future trends in this research area. This paper is intended to provide experts (medical or otherwise) and technicians with new insights into the ways deep learning techniques are used in this regard and how they could further work on combating the outbreak of COVID-19.	翻訳日:2022-11-01 04:17:49 公開日:2020-08-09
# 有向ネットワークにおけるコミュニティ検出のためのスペクトルアルゴリズム Spectral Algorithms for Community Detection in Directed Networks ( http://arxiv.org/abs/2008.03820v1 ) ライセンス: Link先を確認	Zhe Wang, Yingbin Liang and Pengsheng Ji	(参考訳) 大規模ソーシャルネットワークにおけるコミュニティ検出は,ノードの次数不均一性に影響される。有向ネットワークに対するD-SCOREアルゴリズムを導入し、クラスタリング前に隣接行列の特異ベクトルの要素ワイド比をとることにより、この効果を低減した。統計的引用ネットワークについて有意義な結果を得たが,その性能に関する厳密な分析は得られなかった。まず, 有向次数補正ブロックモデル (Directed-DCBM) のアルゴリズムとその変種に関する理論的保証を確立する。第2に,本論文は,D-SCOREアルゴリズムにおいて,特異ベクトルではなく,元のネットワークの情報を用いて,コミュニティコア外部のノードをアタッチすることで,大幅な改良を行う。 Community detection in large social networks is affected by degree heterogeneity of nodes. The D-SCORE algorithm for directed networks was introduced to reduce this effect by taking the element-wise ratios of the singular vectors of the adjacency matrix before clustering. Meaningful results were obtained for the statistician citation network, but rigorous analysis on its performance was missing. First, this paper establishes theoretical guarantee for this algorithm and its variants for the directed degree-corrected block model (Directed-DCBM). Second, this paper provides significant improvements for the original D-SCORE algorithms by attaching the nodes outside of the community cores using the information of the original network instead of the singular vectors.	翻訳日:2022-11-01 04:17:31 公開日:2020-08-09
# 高速かつ高精度なCRF構成解析 Fast and Accurate Neural CRF Constituency Parsing ( http://arxiv.org/abs/2008.03736v1 ) ライセンス: Link先を確認	Yu Zhang, Houquan Zhou, Zhenghua Li	(参考訳) 確率分布の推定はnlp分野の主要な問題の一つである。しかし、深層学習(DL)とプレDLの時代において、シーケンスラベリングタスクにおける線形鎖 CRF の広大な応用とは異なり、木構造 CRF を選挙区解析に適用する研究はほとんどなく、主に内向きアルゴリズムの複雑さと非効率性のためである。この研究は、高速で正確なCRF成分分析器を提示する。鍵となる考え方は、GPU上の大きなテンソル演算によって損失計算の内側のアルゴリズムをバッチ化し、一方、効率的なバックプロパゲーションによる勾配計算の外側のアルゴリズムを避けることである。また,より効率的な2段階ブラケットラベル解析手法を提案する。依存関係解析の最近の進歩に触発された解析性能を改善するため,境界表現と双対注意に基づく新たなスコアリングアーキテクチャを導入し,有益なドロップアウト戦略を提案する。 PTB, CTB5.1, CTB7 の実験では,2段階の CRF パーサが w/o と w/BERT の両設定で新たな最先端性能を実現し,毎秒1,000文以上を解析可能である。私たちはコードをhttps://github.com/yzhangcs/crfparでリリースします。 Estimating probability distribution is one of the core issues in the NLP field. However, in both deep learning (DL) and pre-DL eras, unlike the vast applications of linear-chain CRF in sequence labeling tasks, very few works have applied tree-structure CRF to constituency parsing, mainly due to the complexity and inefficiency of the inside-outside algorithm. This work presents a fast and accurate neural CRF constituency parser. The key idea is to batchify the inside algorithm for loss computation by direct large tensor operations on GPU, and meanwhile avoid the outside algorithm for gradient computation via efficient back-propagation. We also propose a simple two-stage bracketing-then-labeling parsing approach to improve efficiency further. To improve the parsing performance, inspired by recent progress in dependency parsing, we introduce a new scoring architecture based on boundary representation and biaffine attention, and a beneficial dropout strategy. Experiments on PTB, CTB5.1, and CTB7 show that our two-stage CRF parser achieves new state-of-the-art performance on both settings of w/o and w/ BERT, and can parse over 1,000 sentences per second. We release our code at https://github.com/yzhangcs/crfpar.	翻訳日:2022-11-01 04:17:20 公開日:2020-08-09
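
The inside algorithm mentioned in the abstract above computes the log-partition function of a tree-structured CRF. The sketch below is a plain, unbatched Python version over binary trees, assuming span scores stored in a dictionary keyed by `(i, j)`; the names are illustrative, and the paper's contribution (batching this recursion as large tensor operations on GPU) is not reproduced here.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def inside_log_partition(span_scores, n):
    """Return log Z over all binary trees of an n-word sentence.

    span_scores[(i, j)] is the score of the span covering words i..j-1,
    for 0 <= i < j <= n (an assumed data layout for illustration).
    """
    inside = {}
    # Width-1 spans are the leaves.
    for i in range(n):
        inside[(i, i + 1)] = span_scores[(i, i + 1)]
    # Build wider spans from every split point of narrower ones.
    for width in range(2, n + 1):
        for i in range(0, n - width + 1):
            j = i + width
            splits = [inside[(i, k)] + inside[(k, j)] for k in range(i + 1, j)]
            inside[(i, j)] = span_scores[(i, j)] + logsumexp(splits)
    return inside[(0, n)]
```

With all scores set to zero, `exp(log Z)` simply counts the binary trees over `n` leaves, which gives a quick sanity check. The training loss of such a model is `log Z` minus the score of the gold tree; in an autodiff framework, back-propagating through this recursion yields the span marginals, which is why the abstract can avoid an explicit outside pass.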
# 画像暗号化のための遺伝的アルゴリズムのランダム性評価:信号処理手法 Randomness Evaluation of a Genetic Algorithm for Image Encryption: A Signal Processing Approach ( http://arxiv.org/abs/2008.03681v1 ) ライセンス: Link先を確認	Zoubir Hamici	(参考訳) 本稿では,セキュアな画像通信のためのブロック暗号のランダム性評価を行う。 GFHT暗号(GFHT cipher)は、細菌の抗生物質耐性に触発された遺伝子融合(GF)と水平遺伝子導入(HGT)を組み合わせた遺伝的アルゴリズムである。対称暗号鍵は、多層ランダム配列を持つ4対の染色体によって生成される。暗号は1ブロックの主キーエージェントのGFから始まり、HGTは、遺伝子がピクセルであり、染色体が行と列である難読化を行う。画像ハッシュ値から抽出したソルトを用いてワンタイムパッド(OTP)方式を実装し、メインのパスフレーズやキーを変更することなく、1ピクセルの変更で異なる暗号化キーを生成する。これにより、99%の極端な雪崩効果が得られる。ランダムマトリクス理論,パワースペクトル密度,雪崩効果,2次元オートコリレーション,画素ランダムネステスト,チ二乗仮説テストに基づくランダムネス評価は,暗号化された画像が均一なホワイトノイズの統計的挙動を採用することを示す。さらに,カオス遺伝暗号との比較によりGFHTアルゴリズムの利点が示された。 In this paper a randomness evaluation of a block cipher for secure image communication is presented. The GFHT cipher is a genetic algorithm, that combines gene fusion (GF) and horizontal gene transfer (HGT) both inspired from antibiotic resistance in bacteria. The symmetric encryption key is generated by four pairs of chromosomes with multi-layer random sequences. The encryption starts by a GF of the principal key-agent in a single block, then HGT performs obfuscation where the genes are pixels and the chromosomes are the rows and columns. A Salt extracted from the image hash-value is used to implement one-time pad (OTP) scheme, hence a modification of one pixel generates a different encryption key without changing the main passphrase or key. Therefore, an extreme avalanche effect of 99% is achieved. Randomness evaluation based on random matrix theory, power spectral density, avalanche effect, 2D auto-correlation, pixels randomness tests and chi-square hypotheses testing show that encrypted images adopt the statistical behavior of uniform white noise; hence validating the theoretical model by experimental results. Moreover, performance comparison with chaos-genetic ciphers shows the merit of the GFHT algorithm.	翻訳日:2022-11-01 04:16:55 公開日:2020-08-09
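
One of the randomness checks listed in the abstract above, the chi-square test, can be sketched as Pearson's statistic of the cipher-image histogram against a uniform distribution. The abstract does not give the exact test setup, so the 256 gray levels and the function name below are assumptions.

```python
from collections import Counter


def chi_square_uniform(pixels, levels=256):
    """Pearson chi-square statistic of a pixel histogram against uniformity.

    pixels is a flat sequence of integers in range(levels); values outside
    that range are ignored by this sketch.
    """
    counts = Counter(pixels)
    expected = len(pixels) / levels
    return sum(
        (counts.get(value, 0) - expected) ** 2 / expected
        for value in range(levels)
    )
```

A perfectly flat histogram gives 0, while a constant image gives a very large value. For 256 levels the statistic is compared with a chi-square critical value at 255 degrees of freedom (roughly 293 at the 5% significance level) to decide whether uniformity is rejected.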
# コード構築遺伝的プログラミング Code Building Genetic Programming ( http://arxiv.org/abs/2008.03649v1 ) ライセンス: Link先を確認	Edward Pantridge, Lee Spector	(参考訳) 近年、遺伝的プログラミングの分野は自動プログラミングに多大な進歩を遂げている。 pushgpやグラマーガイド遺伝的プログラミングのような現代のプログラム合成手法の研究と開発は、導入的な学術的な設定で典型的に割り当てられる問題を解決するプログラムを作成できる。これらの問題は、単純なデータ構造、基本的な制御フローパターン、プリミティブで重複しないデータ型(継承や複合型などなしで)の狭いセットに焦点を当てている。プログラム合成のための遺伝的プログラミング手法が、任意のデータ型、データ構造、および既存のコードベースから引き出された仕様を使用するプログラムを合成する能力を説得力のある形で実証した例はほとんどない。本稿では,リフレクションやファーストクラス仕様などのプログラミング言語機能を活用することで,これを実現するためのフレームワークとしてcbgp(code building genetic programming)を提案する。 CBGPは、ホスト言語のソースコードに実行または変換できる計算グラフを生成する。 CBGPの新たな機能を示すために,非原始多型データ型といくつかの標準プログラム合成ベンチマークを用いた新しいベンチマーク結果を提案する。 In recent years the field of genetic programming has made significant advances towards automatic programming. Research and development of contemporary program synthesis methods, such as PushGP and Grammar Guided Genetic Programming, can produce programs that solve problems typically assigned in introductory academic settings. These problems focus on a narrow, predetermined set of simple data structures, basic control flow patterns, and primitive, non-overlapping data types (without, for example, inheritance or composite types). Few, if any, genetic programming methods for program synthesis have convincingly demonstrated the capability of synthesizing programs that use arbitrary data types, data structures, and specifications that are drawn from existing codebases. In this paper, we introduce Code Building Genetic Programming (CBGP) as a framework within which this can be done, by leveraging programming language features such as reflection and first-class specifications. CBGP produces a computational graph that can be executed or translated into source code of a host language. To demonstrate the novel capabilities of CBGP, we present results on new benchmarks that use non-primitive, polymorphic data types as well as some standard program synthesis benchmarks.	翻訳日:2022-11-01 04:16:34 公開日:2020-08-09
# 長尺データのための特徴空間拡張 Feature Space Augmentation for Long-Tailed Data ( http://arxiv.org/abs/2008.03673v1 ) ライセンス: Link先を確認	Peng Chu and Xiao Bian and Shaopeng Liu and Haibin Ling	(参考訳) 実世界のデータは、各クラスの頻度が通常異なるため、しばしばロングテールの分布に従う。例えば、データセットは、多数の未表現のクラスと、十分なデータを持つ少数のクラスを持つことができる。しかしながら、データセットを表現するモデルは通常、クラス間で合理的に均質なパフォーマンスを持つことが期待されている。データのアンバランス問題を緩和するためのベストプラクティスとして、クラスバランス損失の導入とデータ再サンプリングと拡張に関する高度な手法がある。しかし、未表示のクラスに関する他の問題は、欠落した情報を回復するために追加の知識に頼る必要がある。本研究では,特徴空間における表現不足のクラスを,十分なサンプルを持つクラスから学習した特徴量で拡張することで,長鎖問題に対処する新しい手法を提案する。特に,各クラスの特徴を,クラスアクティベーションマップを用いてクラスジェネリックコンポーネントとクラス固有のコンポーネントに分解する。未表現のクラスの新しいサンプルは、未表現のクラスからクラス固有の特徴を、混乱したクラスからクラスジェネリックな特徴に融合させることで、トレーニング段階のフライで生成される。 iNaturalist、ImageNet-LT、Places-LT、CIFARの長期バージョンなど、さまざまなデータセットで得られた結果から、アートパフォーマンスの現状が示されている。 Real-world data often follow a long-tailed distribution as the frequency of each class is typically different. For example, a dataset can have a large number of under-represented classes and a few classes with more than sufficient data. However, a model to represent the dataset is usually expected to have reasonably homogeneous performances across classes. Introducing class-balanced loss and advanced methods on data re-sampling and augmentation are among the best practices to alleviate the data imbalance problem. However, the other part of the problem about the under-represented classes will have to rely on additional knowledge to recover the missing information. In this work, we present a novel approach to address the long-tailed problem by augmenting the under-represented classes in the feature space with the features learned from the classes with ample samples. In particular, we decompose the features of each class into a class-generic component and a class-specific component using class activation maps. 
Novel samples of under-represented classes are then generated on the fly during training by fusing the class-specific features from the under-represented classes with the class-generic features from confusing classes. Our results on datasets such as iNaturalist, ImageNet-LT, Places-LT, and a long-tailed version of CIFAR show state-of-the-art performance.	翻訳日:2022-11-01 04:16:16 公開日:2020-08-09
# レーダに基づく動的占有グリッドマッピングと物体検出 Radar-based Dynamic Occupancy Grid Mapping and Object Detection ( http://arxiv.org/abs/2008.03696v1 ) ライセンス: Link先を確認	Christopher Diehl, Eduard Feicho, Alexander Schwambach, Thomas Dammeier, Eric Mares, Torsten Bertram	(参考訳) センサデータ融合と物体追跡を利用した環境モデリングは安全な自動運転に不可欠である。近年,静的な環境を想定した古典的占有グリッドマップは,低レベルのデータ融合の可能性を維持しつつ,動的局所環境の位置と速度分布を推定するダイナミック占有グリッドマップに拡張されている。本稿では,従来のアプローチのさらなる発展について述べる。著者の知識を最大限に活用するために,レーダデータのみに基づく動的占有グリッドマッピングとその後の解析に関する出版物は存在しない。そこで本研究では,複数のレーダセンサのデータを融合し,グリッドを用いた物体追跡・マッピング手法を適用した。その後、動的領域のクラスタリングは高レベルなオブジェクト情報を提供する。比較のためにlidarベースの手法も開発されている。本手法は都市環境における移動車からの実世界データと質的,定量的に評価する。この評価は、異なる比較指標を考慮して、レーダベースの動的占有グリッドマップの利点を示す。 Environment modeling utilizing sensor data fusion and object tracking is crucial for safe automated driving. In recent years, the classical occupancy grid map approach, which assumes a static environment, has been extended to dynamic occupancy grid maps, which maintain the possibility of a low-level data fusion while also estimating the position and velocity distribution of the dynamic local environment. This paper presents the further development of a previous approach. To the best of the author's knowledge, there is no publication about dynamic occupancy grid mapping with subsequent analysis based only on radar data. Therefore in this work, the data of multiple radar sensors are fused, and a grid-based object tracking and mapping method is applied. Subsequently, the clustering of dynamic areas provides high-level object information. For comparison, also a lidar-based method is developed. The approach is evaluated qualitatively and quantitatively with real-world data from a moving vehicle in urban environments. The evaluation illustrates the advantages of the radar-based dynamic occupancy grid map, considering different comparison metrics.	翻訳日:2022-11-01 04:15:40 公開日:2020-08-09
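
For context, the classical static occupancy grid that the abstract above takes as its starting point is commonly updated per cell in log-odds form. The sketch below shows only that textbook update, under the usual assumption of independent measurements; it is not the radar-based dynamic grid of the paper, and the function names are illustrative.

```python
import math


def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


def update_cell(log_odds, p_occupied_given_measurement, prior=0.5):
    """Bayesian log-odds update of one grid cell from one measurement."""
    return log_odds + logit(p_occupied_given_measurement) - logit(prior)


def probability(log_odds):
    """Convert log-odds back to an occupancy probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))
```

A measurement with probability 0.5 carries no information and leaves the cell unchanged, while repeated consistent measurements push the cell towards 0 or 1. A dynamic grid additionally has to estimate a velocity distribution per cell, which this sketch does not attempt.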
# 全自動フォトグラムデータセグメンテーションと物体情報抽出によるシミュレーション地形の作成 Fully Automated Photogrammetric Data Segmentation and Object Information Extraction Approach for Creating Simulation Terrain ( http://arxiv.org/abs/2008.03697v1 ) ライセンス: Link先を確認	Meida Chen, Andrew Feng, Kyle McCullough, Pratusha Bhuvana Prasad, Ryan McAlinden, Lucio Soibelman, Mike Enloe	(参考訳) これまでの研究では、視覚的にリアルな3Dメッシュを、有能なカメラと効率的な測光ソフトウェア技術を備えた安価な無人航空機システム(UAS)で、自動的に再構築できることが実証された。しかし、そのような生成されたデータは、オブジェクトのセマンティック情報や機能(例えば、人工物、植生、地面、オブジェクト材料など)を含まないため、洗練されたユーザレベルとシステムレベルの相互作用を許さない。トレーニングとシミュレーションのための現実的な仮想環境(ミッション計画、リハーサル、脅威検出など)を作成する際のデータのユースケースを考えると、データのセグメンテーションとオブジェクト情報の抽出が不可欠である。そこで本研究の目的は,完全に自動化されたフォトグラムデータセグメンテーションおよびオブジェクト情報抽出フレームワークを設計・開発することである。提案手法を検証するために, 著者らが設計したシミュレーションツールであるAerial Terrain Line of Sight Analysis System (ATLAS) の仮想環境構築に, セグメントデータと抽出した特徴を用いた。その結果,3次元メッシュツリーは,抽出した個々の木の位置を用いてジオタイプな3次元ツリーモデルに置き換えることができた。抽出された樹木の特徴(色、幅、高さ)は、適切な樹木種を選択し、視覚的品質を高めるのに有用である。また、同定された地材情報は、パスファインディングに考慮することができる。最も短い経路は、物理的距離だけでなく、異なる地上材料におけるオフロード車両の性能も考慮して計算することができる。 Our previous works have demonstrated that visually realistic 3D meshes can be automatically reconstructed with low-cost, off-the-shelf unmanned aerial systems (UAS) equipped with capable cameras, and efficient photogrammetric software techniques. However, such generated data do not contain semantic information/features of objects (i.e., man-made objects, vegetation, ground, object materials, etc.) and cannot allow the sophisticated user-level and system-level interaction. Considering the use case of the data in creating realistic virtual environments for training and simulations (i.e., mission planning, rehearsal, threat detection, etc.), segmenting the data and extracting object information are essential tasks. Thus, the objective of this research is to design and develop a fully automated photogrammetric data segmentation and object information extraction framework. 
To validate the proposed framework, the segmented data and extracted features were used to create virtual environments in the authors previously designed simulation tool i.e., Aerial Terrain Line of Sight Analysis System (ATLAS). The results showed that 3D mesh trees could be replaced with geo-typical 3D tree models using the extracted individual tree locations. The extracted tree features (i.e., color, width, height) are valuable for selecting the appropriate tree species and enhance visual quality. Furthermore, the identified ground material information can be taken into consideration for pathfinding. The shortest path can be computed not only considering the physical distance, but also considering the off-road vehicle performance capabilities on different ground surface materials.	翻訳日:2022-11-01 04:15:26 公開日:2020-08-09
# 隠れマルコフモデルとLSTMの比較分析 : シミュレーション的アプローチ Comparative Analysis of the Hidden Markov Model and LSTM: A Simulative Approach ( http://arxiv.org/abs/2008.03825v1 ) ライセンス: Link先を確認	Manie Tadayon, Greg Pottie	(参考訳) 近年,金融,教育,生物学,工学など,さまざまな分野の現実的なプロセスが時系列としてモデル化できることから,時系列データやシーケンシャルデータが注目されている。 kalmanフィルタ、隠れマルコフモデル、long short term memory (lstm)のような多くのアルゴリズムや手法がデータの推論や予測のために提案されているが、それらの利用はアプリケーション、問題の種類、利用可能なデータ、十分な正確さや損失に大きく依存している。本稿では,教師付きおよび教師なしマルコフモデルとLSTMを比較し,学習に必要なデータ量,複雑性,予測精度について比較する。さらに,定常および非定常状況下で,観測を識別し,個別のマルコフモデルに変換する様々な手法を提案する。その結果,大量のラベル付きデータが利用できない場合,教師なしマルコフモデルでさえLSTMより優れていることがわかった。さらに,1次マルコフ仮定が満たされていない場合でも,隠れマルコフモデルがシーケンスデータの処理に有効な方法であることを示す。 Time series and sequential data have gained significant attention recently since many real-world processes in various domains such as finance, education, biology, and engineering can be modeled as time series. Although many algorithms and methods such as the Kalman filter, hidden Markov model, and long short term memory (LSTM) are proposed to make inferences and predictions for the data, their usage significantly depends on the application, type of the problem, available data, and sufficient accuracy or loss. In this paper, we compare the supervised and unsupervised hidden Markov model to LSTM in terms of the amount of data needed for training, complexity, and forecasting accuracy. Moreover, we propose various techniques to discretize the observations and convert the problem to a discrete hidden Markov model under stationary and non-stationary situations. Our results indicate that even an unsupervised hidden Markov model can outperform LSTM when a massive amount of labeled data is not available. Furthermore, we show that the hidden Markov model can still be an effective method to process the sequence data even when the first-order Markov assumption is not satisfied.	翻訳日:2022-11-01 04:09:14 公開日:2020-08-09
# LRSpeech: 極低リソース音声合成と認識 LRSpeech: Extremely Low-Resource Speech Synthesis and Recognition ( http://arxiv.org/abs/2008.03687v1 ) ライセンス: Link先を確認	Jin Xu, Xu Tan, Yi Ren, Tao Qin, Jian Li, Sheng Zhao, Tie-Yan Liu	(参考訳) 音声合成(text to speech, TTS)と音声認識(automatic speech recognition, ASR)は重要な音声タスクであり、モデル学習のために大量のテキストと音声のペアを必要とする。しかし、世界には6,000以上の言語があり、ほとんどの言語は音声訓練データが不足しているため、極低リソース言語向けにTTSやASRシステムを構築する際には大きな課題が生じる。本稿では、低いデータコストでレア言語をサポート可能な、極低リソース環境下でのTTS/ASRシステムであるLRSpeechを開発する。 LRSpeechは3つの重要な技術から構成される。 1) リッチリソース言語での事前学習と低リソース言語での微調整。 2) TTSとASRの間の二重変換により、相互の精度を反復的に向上させる。 3) 知識蒸留により、TTSモデルを高品質な目標話者音声に合わせてカスタマイズし、複数の声に対するASRモデルを改善する。実験用言語(英語)と真の低リソース言語(リトアニア語)で実験を行い、LRSpeechの有効性を検証する。実験結果は、LRSpeechが 1) 合成音声の明瞭度(98%以上)と自然性(平均意見スコア(MOS)3.5以上)の両方においてTTSの高品質を実現し、産業展開の要件を満たすこと、 2) ASRで有望な認識精度を達成すること、 3) 極めて少ないリソースの訓練データしか使用しないことを示している。また、LRSpeechをさまざまな量のデータ資源で包括的に分析し、産業展開のための貴重な洞察とガイダンスを提供する。現在、より多くのレア言語でTTSをサポートするために、商用のクラウド音声サービスにLRSpeechをデプロイしている。 Speech synthesis (text to speech, TTS) and recognition (automatic speech recognition, ASR) are important speech tasks, and require a large amount of text and speech pairs for model training. However, there are more than 6,000 languages in the world and most languages lack speech training data, which poses significant challenges when building TTS and ASR systems for extremely low-resource languages. In this paper, we develop LRSpeech, a TTS and ASR system under the extremely low-resource setting, which can support rare languages with low data cost. LRSpeech consists of three key techniques: 1) pre-training on rich-resource languages and fine-tuning on low-resource languages; 2) dual transformation between TTS and ASR to iteratively boost the accuracy of each other; 3) knowledge distillation to customize the TTS model on a high-quality target-speaker voice and improve the ASR model on multiple voices. We conduct experiments on an experimental language (English) and a truly low-resource language (Lithuanian) to verify the effectiveness of LRSpeech. 
Experimental results show that LRSpeech 1) achieves high quality for TTS in terms of both intelligibility (more than 98% intelligibility rate) and naturalness (above 3.5 mean opinion score (MOS)) of the synthesized speech, which satisfy the requirements for industrial deployment, 2) achieves promising recognition accuracy for ASR, and 3) last but not least, uses extremely low-resource training data. We also conduct comprehensive analyses on LRSpeech with different amounts of data resources, and provide valuable insights and guidance for industrial deployment. We are currently deploying LRSpeech into a commercialized cloud speech service to support TTS on more rare languages.	翻訳日:2022-11-01 04:08:26 公開日:2020-08-09
# SpeedySpeech: 効率的なニューラル音声合成 SpeedySpeech: Efficient Neural Speech Synthesis ( http://arxiv.org/abs/2008.03802v1 ) ライセンス: Link先を確認	Jan Vainer, Ond\v{r}ej Du\v{s}ek	(参考訳) 最近のニューラルシーケンス・ツー・シーケンスモデルでは音声合成の質が大幅に改善されているが、高速な訓練、高速推論、高品質な音声合成を同時に行うシステムはない。本稿では,計算資源の要求が低く,学習時間も速い,高品質なリアルタイムスペクトログラム合成が可能な学生-教師ネットワークを提案する。高品質な音声を生成するには自己注意層は必要ないことを示す。教師ネットワークと教師ネットワークの両方に残存する単純な畳み込みブロックを活用し,教師モデルにおいて1つの注意層のみを使用する。 MelGANボコーダと組み合わせたモデルでは,声質はTacotron 2より有意に高かった。我々のモデルは1つのGPUで効率的にトレーニングでき、CPUでもリアルタイムで実行できる。ソースコードとオーディオサンプルの両方をgithubリポジトリで提供しています。 While recent neural sequence-to-sequence models have greatly improved the quality of speech synthesis, there has not been a system capable of fast training, fast inference and high-quality audio synthesis at the same time. We propose a student-teacher network capable of high-quality faster-than-real-time spectrogram synthesis, with low requirements on computational resources and fast training time. We show that self-attention layers are not necessary for generation of high quality audio. We utilize simple convolutional blocks with residual connections in both student and teacher networks and use only a single attention layer in the teacher model. Coupled with a MelGAN vocoder, our model's voice quality was rated significantly higher than Tacotron 2. Our model can be efficiently trained on a single GPU and can run in real time even on a CPU. We provide both our source code and audio samples in our GitHub repository.	翻訳日:2022-11-01 04:07:58 公開日:2020-08-09
# リスク感性マルコフ決定過程における平均と変動の組合せ Risk-Sensitive Markov Decision Processes with Combined Metrics of Mean and Variance ( http://arxiv.org/abs/2008.03707v1 ) ライセンス: Link先を確認	Li Xia	(参考訳) 本稿では,報酬の平均と分散を考慮した長期平均指標を用いた無限段階離散時間マルコフ決定過程(mdp)の最適化問題について検討する。平均は平均リターンを示し、分散はリスクまたは公正を示すので、このようなパフォーマンス指標は重要である。しかし、分散計量はすべての段階で報酬を結合し、伝統的な動的プログラミングは時間一貫性の原則が失敗するため適用できない。我々はこの問題を感度に基づく最適化理論と呼ばれる新しい視点から研究する。性能差公式が導出され、2つの異なるポリシーの下でmdpの平均分散結合指標の差を定量化することができる。差分公式は、厳密に平均分散性能が向上した新しいポリシーを生成するのに利用できる。最適政策の必要条件と決定論的政策の最適性が導出される。さらにポリシー反復の形で反復的アルゴリズムを開発し、混合およびランダム化されたポリシー空間において局所最適に収束することが証明された。特に、平均報酬がポリシーで一定であれば、アルゴリズムはグローバル最適に収束することが保証される。最後に,エネルギー貯蔵システムにおける風力発電のゆらぎ低減に関する研究に本手法を適用し,最適化手法の適用可能性を示す。 This paper investigates the optimization problem of an infinite stage discrete time Markov decision process (MDP) with a long-run average metric considering both mean and variance of rewards together. Such performance metric is important since the mean indicates average returns and the variance indicates risk or fairness. However, the variance metric couples the rewards at all stages, the traditional dynamic programming is inapplicable as the principle of time consistency fails. We study this problem from a new perspective called the sensitivity-based optimization theory. A performance difference formula is derived and it can quantify the difference of the mean-variance combined metrics of MDPs under any two different policies. The difference formula can be utilized to generate new policies with strictly improved mean-variance performance. A necessary condition of the optimal policy and the optimality of deterministic policies are derived. We further develop an iterative algorithm with a form of policy iteration, which is proved to converge to local optima both in the mixed and randomized policy space. Specially, when the mean reward is constant in policies, the algorithm is guaranteed to converge to the global optimum. 
Finally, we apply our approach to study the fluctuation reduction of wind power in an energy storage system, which demonstrates the potential applicability of our optimization method.	翻訳日:2022-11-01 04:07:45 公開日:2020-08-09
# 拘束多様体のニューラルマニピュレーション計画 Neural Manipulation Planning on Constraint Manifolds ( http://arxiv.org/abs/2008.03787v1 ) ライセンス: Link先を確認	Ahmed H. Qureshi, Jiangeng Dong, Austin Choe, and Michael C. Yip	(参考訳) タスク制約の存在は、モーションプランニングに重大な課題を課す。最近の進歩にもかかわらず、既存のアルゴリズムはほとんどの計画問題に対して依然として計算コストが高い。本稿では、マルチモーダルな運動学的制約に対する最初のニューラルプランナーであるConstrained Motion Planning Networks (CoMPNet)を提案する。我々のアプローチは以下の構成要素からなる。i) 制約および環境を認識するエンコーダ、ii) 制約多様体上またはその近傍のコンフィギュレーションを出力するニューラルロボットコンフィギュレーション生成器、iii) 生成されたコンフィギュレーションを用いて実現可能なロボットの運動軌道を作成する双方向計画アルゴリズム。CoMPNetは、制約なし問題と制約付き問題の両方を含む実用的なモーションプランニングタスクを解くことを示す。さらに、与えられた環境において、訓練中に見ていない新しい物体配置にも高い成功率で汎化する。最先端の制約付きモーションプランニングアルゴリズムと比較して、CoMPNetは計算速度で1桁上回り、分散も大幅に小さい。 The presence of task constraints imposes a significant challenge to motion planning. Despite all recent advancements, existing algorithms are still computationally expensive for most planning problems. In this paper, we present Constrained Motion Planning Networks (CoMPNet), the first neural planner for multimodal kinematic constraints. Our approach comprises the following components: i) constraint and environment perception encoders; ii) a neural robot configuration generator that outputs configurations on/near the constraint manifold(s), and iii) a bidirectional planning algorithm that takes the generated configurations to create a feasible robot motion trajectory. We show that CoMPNet solves practical motion planning tasks involving both unconstrained and constrained problems. Furthermore, it generalizes to new unseen locations of the objects, i.e., not seen during training, in the given environments with high success rates. When compared to the state-of-the-art constrained motion planning algorithms, CoMPNet outperforms them by an order of magnitude in computational speed, with significantly lower variance.	翻訳日:2022-11-01 04:07:27 公開日:2020-08-09
# 決定点プロセスのテスト Testing Determinantal Point Processes ( http://arxiv.org/abs/2008.03650v1 ) ライセンス: Link先を確認	Khashayar Gatmiry (1), Maryam Aliakbarpour (1), Stefanie Jegelka (1) ((1) Massachusetts Institute of Technology)	(参考訳) 決定点過程(DPP)は、多様性の確率モデルとして広く用いられている。本稿では、分布の性質検査という新たな視点からDPPを検討する。基底集合の部分集合上の未知の分布 $q$ へのサンプルアクセスが与えられたとき、$q$ がDPP分布であるか、それとも $\ell_1$ 距離においてすべてのDPP分布から $\epsilon$ だけ離れているかを区別することを目指す。本研究では、DPP検査のための最初のアルゴリズムを提案する。さらに、DPP検査のサンプル複雑度について、これに一致する下界を示す。この下界は、より一般的なクラスであるlog-submodular分布の検査問題に対する新たな困難性の結果にも拡張される。 Determinantal point processes (DPPs) are popular probabilistic models of diversity. In this paper, we investigate DPPs from a new perspective: property testing of distributions. Given sample access to an unknown distribution $q$ over the subsets of a ground set, we aim to distinguish whether $q$ is a DPP distribution, or $\epsilon$-far from all DPP distributions in $\ell_1$-distance. In this work, we propose the first algorithm for testing DPPs. Furthermore, we establish a matching lower bound on the sample complexity of DPP testing. This lower bound also extends to showing a new hardness result for the problem of testing the more general class of log-submodular distributions.	翻訳日:2022-11-01 04:06:59 公開日:2020-08-09
# 量子深層学習におけるグローバル最適探索 Global Optimum Search in Quantum Deep Learning ( http://arxiv.org/abs/2008.03655v1 ) ライセンス: Link先を確認	Lanston Hau Man Chu, Tejas Bhojraj, Rui Huang	(参考訳) 本稿では、量子回路を用いて機械学習の最適化問題を解くことを目的とする。2つの異なる目的関数のグローバル最小値/最大値を探索するために、平均的アプローチと部分スワップテストカットオフ法(PSTC)という2つの手法を提案する。現在のコストは $O(\sqrt{\|\Theta\|} N)$ であるが、チェックプロセスの強化によってPSTCをさらに $O(\sqrt{\|\Theta\|} \cdot sublinear \ N)$ に改善できる可能性がある。 This paper aims to solve machine learning optimization problems using quantum circuits. Two approaches, namely the average approach and the Partial Swap Test Cut-off method (PSTC), are proposed to search for the global minimum/maximum of two different objective functions. The current cost is $O(\sqrt{\|\Theta\|} N)$, but there is potential to improve PSTC further to $O(\sqrt{\|\Theta\|} \cdot sublinear \ N)$ by enhancing the checking process.	翻訳日:2022-11-01 04:06:46 公開日:2020-08-09
# 進化的深層学習における光と影:分類学、批判的方法論分析、学習事例、学習教訓、勧告と課題 Lights and Shadows in Evolutionary Deep Learning: Taxonomy, Critical Methodological Analysis, Cases of Study, Learned Lessons, Recommendations and Challenges ( http://arxiv.org/abs/2008.03620v1 ) ライセンス: Link先を確認	Aritz D. Martinez, Javier Del Ser, Esther Villar-Rodriguez, Eneko Osaba, Javier Poyatos, Siham Tabik, Daniel Molina, Francisco Herrera	(参考訳) バイオインスパイアされた最適化アルゴリズムとディープラーニングモデルの融合については、ネットワークトポロジの発見や、与えられたタスクのパフォーマンスを向上したハイパーパラメトリック構成、勾配に基づく解法に代わるモデルパラメータの最適化など、いくつかの目的で多くのことが述べられている。実際、文学は、これらのタスクに自然にインスパイアされたアプローチの適用を示す提案に富んでいる。この研究では、これまでの3つの軸に基づく貢献を包括的にレビューし、批判的に検討しています。 a) 歴史的視点,深層学習における最適化問題の定義,文献の深い分析に関連する分類を含む,最適化と分類法(なぜ?) b) 批判的方法論分析(ハウ?)は、2つのケーススタディとともに、文献の分析の後に、学習した教訓と良い実践に対する勧告に対処することができる。 c) 課題と研究の新たな方向性(何ができるか、何のためにできるか) まとめると、3つの軸(最適化と分類、批判的分析、挑戦)は、融合研究の領域におけるエキサイティングな未来を創り出す2つの技術の統合の完全なビジョンを概観している。 Much has been said about the fusion of bio-inspired optimization algorithms and Deep Learning models for several purposes: from the discovery of network topologies and hyper-parametric configurations with improved performance for a given task, to the optimization of the model's parameters as a replacement for gradient-based solvers. Indeed, the literature is rich in proposals showcasing the application of assorted nature-inspired approaches for these tasks. In this work we comprehensively review and critically examine contributions made so far based on three axes, each addressing a fundamental question in this research avenue: a) optimization and taxonomy (Why?), including a historical perspective, definitions of optimization problems in Deep Learning, and a taxonomy associated with an in-depth analysis of the literature, b) critical methodological analysis (How?), which together with two case studies, allows us to address learned lessons and recommendations for good practices following the analysis of the literature, and c) challenges and new directions of research (What can be done, and what for?). 
In summary, three axes - optimization and taxonomy, critical analysis, and challenges - which outline a complete vision of a merger of two technologies drawing up an exciting future for this area of fusion research.	翻訳日:2022-11-01 03:59:38 公開日:2020-08-09
# C1 at SemEval-2020 Task 9: SentiMix: Sentiment Analysis for Code-Mixed Social Media Text using Feature Engineering (英語) C1 at SemEval-2020 Task 9: SentiMix: Sentiment Analysis for Code-Mixed Social Media Text using Feature Engineering ( http://arxiv.org/abs/2008.13549v1 ) ライセンス: Link先を確認	Laksh Advani and Clement Lu and Suraj Maharjan	(参考訳) 今日の相互接続された多言語の世界では、ソーシャルメディア上での言語のコード混合が一般的である。感情分析のような多くの自然言語処理(NLP)タスクは成熟しており、モノリンガルテキスト用によく設計されているが、これらのタスクをコード混合テキストに適用するための技術はまだ探索を必要とする。本稿では、SemEval-2020 Task 9: SentiMixのコード混合ソーシャルメディアテキストにおける感情分析に対する特徴工学的アプローチについて述べる。我々は、「肯定的」「否定的」「中立的」の感情を判別できる分類器を設計するために、手作業で設計した語彙的、感情的、メタデータ的特徴を活用してこの問題に取り組む。このモデルにより、"Hinglish" タスクで 0.65、"Spanglish" タスクで 0.63 の重み付き F1 スコアを得ることができた。 In today's interconnected and multilingual world, code-mixing of languages on social media is a common occurrence. While many Natural Language Processing (NLP) tasks like sentiment analysis are mature and well designed for monolingual text, techniques to apply these tasks to code-mixed text still warrant exploration. This paper describes our feature engineering approach to sentiment analysis in code-mixed social media text for SemEval-2020 Task 9: SentiMix. We tackle this problem by leveraging a set of hand-engineered lexical, sentiment, and metadata features to design a classifier that can disambiguate between "positive", "negative" and "neutral" sentiment. With this model, we are able to obtain a weighted F1 score of 0.65 for the "Hinglish" task and 0.63 for the "Spanglish" task.	翻訳日:2022-11-01 03:58:53 公開日:2020-08-09
# 概念ドリフト検出:ファジィ距離推定による欠落値の扱い Concept Drift Detection: Dealing with MissingValues via Fuzzy Distance Estimations ( http://arxiv.org/abs/2008.03662v1 ) ライセンス: Link先を確認	Anjin Liu, Jie Lu, Guangquan Zhang	(参考訳) データストリームでは、到着した観測の異なる時点におけるデータ分布が変化する可能性がある - 概念ドリフトと呼ばれる現象だ。概念の漂流を検出することは比較的成熟した研究分野であるが、観測結果から得られた不確実性に対する解決法は孤立して研究されている。これらのソリューションがドリフト検出性能にどのように影響するかはまだ検討されていない。しかし、データ計算手法はデータを減らすのではなく、実際にデータの不確実性を増大させる可能性があると考えている。また,ドリフト検出時に分布変化を推定するプロセスにバイアスを生じさせる可能性があり,学習モデルの学習が困難になる可能性がある。本研究の目的は, 観測値の欠落を推定するよりも, 観測距離を推定することに集中し, 推定誤差に応じて観測値をヒストグラムビンに割り当てるメンバシップ関数を定義することである。本手法は,観測における各欠落値の反復推定による累積誤差を低減するための新しいマスク付き距離学習 (MDL) アルゴリズムと,データ分布の相違点を同定するためのファジィ重み付き周波数 (FWF) 法を備える。本論文で提案するコンセプトドリフト検出アルゴリズムは,不足値を扱うことができる特異かつ統一的なアルゴリズムであるが,概念ドリフト検出アルゴリズムと組み合わせた計算アルゴリズムではない。合成と実世界の両方のデータセットの実験は、この手法の利点を示し、欠落した値を持つデータのドリフトの検出における頑健さを示している。これらの結果から, 欠損値がコンセプトドリフト検出に多大な影響を及ぼすことが明らかとなったが, ファジィ・セット理論をモデル観測に用いると, 計算よりも信頼性の高い結果が得られることがわかった。 In data streams, the data distribution of arriving observations at different time points may change - a phenomenon called concept drift. While detecting concept drift is a relatively mature area of study, solutions to the uncertainty introduced by observations with missing values have only been studied in isolation. No one has yet explored whether or how these solutions might impact drift detection performance. We, however, believe that data imputation methods may actually increase uncertainty in the data rather than reducing it. We also conjecture that imputation can introduce bias into the process of estimating distribution changes during drift detection, which can make it more difficult to train a learning model. Our idea is to focus on estimating the distance between observations rather than estimating the missing values, and to define membership functions that allocate observations to histogram bins according to the estimation errors. 
Our solution comprises a novel masked distance learning (MDL) algorithm to reduce the cumulative errors caused by iteratively estimating each missing value in an observation and a fuzzy-weighted frequency (FWF) method for identifying discrepancies in the data distribution. The concept drift detection algorithm proposed in this paper is a singular and unified algorithm that can handle missing values, but not an imputation algorithm combined with a concept drift detection algorithm. Experiments on both synthetic and real-world data sets demonstrate the advantages of this method and show its robustness in detecting drift in data with missing values. These findings reveal that missing values exert a profound impact on concept drift detection, but using fuzzy set theory to model observations can produce more reliable results than imputation.	翻訳日:2022-11-01 03:58:17 公開日:2020-08-09
# ニューラルネットワークの記憶と理由:影響推定によるロングテールの発見 What Neural Networks Memorize and Why: Discovering the Long Tail via Influence Estimation ( http://arxiv.org/abs/2008.03703v1 ) ライセンス: Link先を確認	Vitaly Feldman and Chiyuan Zhang	(参考訳) ディープラーニングアルゴリズムは、トレーニングデータに非常に適しており、異常値や誤ったラベル付きデータポイントにも適していることがよく知られている。このような適合性は、重要な研究関心を惹きつけたが、今のところ説得力のある説明は与えられていない現象である、データラベルの訓練を暗記する必要がある。 Feldman (2019) の最近の研究は、2つの洞察の組み合わせに基づく理論的な説明を提唱している。まず、自然画像とデータ分布は(形式的には)長い尾を持つことが知られており、稀で非定型的な例のかなりの割合を持つ。第二に、単純な理論モデルでは、データ分布が長い場合の至近汎化誤差を達成するためには、このような記憶化が必要である。しかし、この説明の直接的な実証的証拠や、そのような証拠を得るためのアプローチは与えられなかった。この研究では、この理論の重要なアイデアをテストする実験をデザインします。実験では、各トレーニング例が各テスト例の精度およびトレーニング例の記憶値に与える影響を推定する必要がある。これらの量を直接推定することは計算的に禁止されるが、密接な関係にある部分サンプリングの影響や記憶値をより効率的に推定できることを示す。私たちの実験は、いくつかの標準ベンチマークにおける一般化のための記憶の大幅な利点を示しています。また、この理論の定量的かつ視覚的に説得力のある証拠も提示している(Feldman, 2019)。 Deep learning algorithms are well-known to have a propensity for fitting the training data very well and often fit even outliers and mislabeled data points. Such fitting requires memorization of training data labels, a phenomenon that has attracted significant research interest but has not been given a compelling explanation so far. A recent work of Feldman (2019) proposes a theoretical explanation for this phenomenon based on a combination of two insights. First, natural image and data distributions are (informally) known to be long-tailed, that is have a significant fraction of rare and atypical examples. Second, in a simple theoretical model such memorization is necessary for achieving close-to-optimal generalization error when the data distribution is long-tailed. However, no direct empirical evidence for this explanation or even an approach for obtaining such evidence were given. In this work we design experiments to test the key ideas in this theory. The experiments require estimation of the influence of each training example on the accuracy at each test example as well as memorization values of training examples. 
Estimating these quantities directly is computationally prohibitive but we show that closely-related subsampled influence and memorization values can be estimated much more efficiently. Our experiments demonstrate the significant benefits of memorization for generalization on several standard benchmarks. They also provide quantitative and visually compelling evidence for the theory put forth in (Feldman, 2019).	翻訳日:2022-11-01 03:57:45 公開日:2020-08-09
# 干渉生成対向ネットワーク Intervention Generative Adversarial Networks ( http://arxiv.org/abs/2008.03712v1 ) ライセンス: Link先を確認	Jiadong Liang, Liangyu Zhang, Cheng Zhang and Zhihua Zhang	(参考訳) 本稿では,生成型逆ネットワークの学習過程を安定化し,モード崩壊問題を緩和するための新しい手法を提案する。主なアイデアは、目的に介入損失と呼ぶ正規化用語を導入することです。得られた生成モデルを、IVGAN(Intervention Generative Adversarial Networks)と呼ぶ。ガウス不変干渉による補助エンコーダネットワークから得られた実画像の潜伏表現を摂動させ、生成した画像の分布の相違を罰することにより、干渉損失は生成元に対してより有益な勾配を与え、GANのトレーニング安定性を著しく向上させる。本研究では,本手法の有効性と有効性を示すため,標準実世界データセットとスタック型mnistデータセットの徹底的な評価を行った。 In this paper we propose a novel approach for stabilizing the training process of Generative Adversarial Networks as well as alleviating the mode collapse problem. The main idea is to introduce a regularization term that we call intervention loss into the objective. We refer to the resulting generative model as Intervention Generative Adversarial Networks (IVGAN). By perturbing the latent representations of real images obtained from an auxiliary encoder network with Gaussian invariant interventions and penalizing the dissimilarity of the distributions of the resulting generated images, the intervention loss provides more informative gradient for the generator, significantly improving GAN's training stability. We demonstrate the effectiveness and efficiency of our methods via solid theoretical analysis and thorough evaluation on standard real-world datasets as well as the stacked MNIST dataset.	翻訳日:2022-11-01 03:57:27 公開日:2020-08-09

# 量子論を超えた非古典性

Witnessing non-classicality beyond quantum theory ( http://arxiv.org/abs/2003.07974v3 )

ライセンス: Link先を確認

Chiara Marletto, Vlatko Vedral

(参考訳) 物理系が2つの量子系の間の絡み合いの生成を局所的に媒介できるならば、その物理系自身が非古典的でなければならないことを示す一般的な議論を提案する。注目すべきことに、媒介する物理系を記述するために古典的あるいは量子的な形式論を一切仮定しない。この結果は、最近提案された情報のコンストラクタ理論から導かれる一般的な情報理論的原理から従う。この議論は、量子プローブにおいて重力によって誘起される絡み合いを検証することに基づく、最近提案された重力の非古典性の検証実験に不可欠な理論的基盤を与える。

We propose a general argument to show that if a physical system can mediate locally the generation of entanglement between two quantum systems, then it itself must be non-classical. Remarkably, we do not assume any classical or quantum formalism to describe the mediating physical system: our result follows from general information-theoretic principles, drawn from the recently proposed constructor theory of information. This argument provides the indispensable theoretical basis for recently proposed tests of non-classicality in gravity, based on witnessing gravitationally-induced entanglement in quantum probes.

翻訳日:2023-05-28 22:11:16 公開日:2020-08-09

# 3量子系における絡み合いのベクトル特性

Vector Properties of Entanglement in a Three-Qubit System ( http://arxiv.org/abs/2003.14390v2 )

ライセンス: Link先を確認

Dmitry B. Uskov, Paul M. Alsing

(参考訳) 我々は、$su(4)$ と $so(6)$ のリー代数の間の同型性に基づく、3量子ビット系における絡み合いの動的ベクトルモデルを提案する。3量子ビットの局所不変量のPlücker型記述を一般化して、3対の実数値 $3D$ ベクトル(ここでは $A_{R,I}$、$B_{R,I}$、$C_{R,I}$ と表記する)を導入する。これらのベクトルの大きさは、系の2量子ビットおよび3量子ビットの絡み合いパラメータを決定する。局所的な $SU(2)$ 演算の下でのベクトル $A$、$B$、$C$ の時間発展は、それぞれ量子ビット $a$、$b$、$c$ の単一量子ビットブロッホベクトルの $SO(3)$ 発展と同一であることを示す。同時に、$a-b$、$a-c$、$b-c$ の2量子ビット結合項を含む一般的な2量子ビット $su(4)$ ハミルトニアンは、それぞれベクトル $A$ と $B$、$A$ と $C$、$B$ と $C$ の間の $SO(6)$ 結合を生成する。異なる2量子ビット結合項によって引き起こされる絡み合いのダイナミクスは、ベクトル $A$、$B$、$C$ の相互の向きによって完全に決定され、この向きは単一量子ビット変換によって制御できる。$W$ 状態、Greenberger-Horne-Zeilinger($GHZ$)状態、および双分離可能状態の間の変換を含む量子制御問題を解くことで、この絡み合いのベクトル記述の有用性を示す。

We suggest a dynamical vector model of entanglement in a three-qubit system based on the isomorphism between the $su(4)$ and $so(6)$ Lie algebras. Generalizing the Plücker-type description of three-qubit local invariants, we introduce three pairs of real-valued $3D$ vectors (denoted here as $A_{R,I}$, $B_{R,I}$ and $C_{R,I}$). Magnitudes of these vectors determine two- and three-qubit entanglement parameters of the system. We show that evolution of vectors $A$, $B$, $C$ under local $SU(2)$ operations is identical to $SO(3)$ evolution of single-qubit Bloch vectors of qubits $a$, $b$ and $c$, respectively. At the same time, general two-qubit $su(4)$ Hamiltonians incorporating $a-b$, $a-c$ and $b-c$ two-qubit coupling terms generate $SO(6)$ coupling between vectors $A$ and $B$, $A$ and $C$, and $B$ and $C$, respectively. It turns out that dynamics of entanglement induced by different two-qubit coupling terms is entirely determined by mutual orientation of vectors $A$, $B$, $C$, which can be controlled by single-qubit transformations. We illustrate the power of this vector description of entanglement by solving quantum control problems involving transformations between $W$, Greenberger-Horne-Zeilinger ($GHZ$) and biseparable states.

翻訳日:2023-05-27 07:32:03 公開日:2020-08-09

# 持続スピンヘリックスを求める臨界超電流と$\phi_0$状態

Critical supercurrent and $\phi_0$ state for probing a persistent spin helix ( http://arxiv.org/abs/2004.14586v3 )

ライセンス: Link先を確認

Mohammad Alidoust

(参考訳) ゼーマン場の存在下で、Rashba-Dresselhausスピン軌道相互作用(RDSOI)を持つ2次元ジョセフソン接合における超電流のプロファイルを理論的に研究する。自己バイアス超電流(いわゆる$\varphi_0$ジョセフソン状態)を調べることで、RDSOIパラメータ($\alpha,\beta$)および面内ゼーマン場成分($h_x,h_y$)に対する$\varphi_0$状態の依存性の明示的な式を得る。その結果、化学ポテンシャル($\mu$)が超伝導電極のエネルギーギャップ($\Delta$)に比べて十分大きい場合、すなわち$\mu \gg \Delta$では、等しい強さ($|\alpha|=|\beta|$)のRSOIとDSOIは、磁化やRDSOIの種類によらず$\varphi_0$状態を消失させることがわかった。しかし、$\mu\sim\Delta$では、成分の大きさが等しくないゼーマン場、すなわち$|h_x|\neq |h_y|$は、等強度のRDSOIの破壊的な影響を(一方の種類に対してのみ)打ち消すことができる。ただし、$|h_x|= |h_y|$の場合は依然として$\varphi_0$状態が消失しうる。注目すべきことに、$\mu\sim\Delta$の極限では、$\varphi_0$状態は面内ゼーマン場の両成分の積$h_xh_y$に比例し、この寄与は$\mu \gg \Delta$の極限には存在しない。さらに、臨界超電流の結果は、持続的スピンヘリックスが十分に高い化学ポテンシャルの領域$\mu\gg \Delta$で現れる一方、逆の領域$\mu\sim\Delta$では悪影響が生じることを示している。弾道領域では、臨界超電流の「最大」は$|\alpha|=|\beta|$で生じ、ゼーマン場はこの特徴を強めることができる。乱れや非磁性不純物が存在するとこの描像は劇的に変わり、臨界超電流の「最小」が対称線$|\alpha|=|\beta|$上およびその近傍に生じる。

We theoretically study the profile of a supercurrent in two-dimensional Josephson junctions with Rashba-Dresselhaus spin-orbit interaction (RDSOI) in the presence of a Zeeman field. Through investigating self-biased supercurrent (the so-called $\varphi_0$-Josephson state), we obtain explicit expressions for the functionality of the $\varphi_0$ state with respect to RDSOI parameters ($\alpha,\beta$) and in-plane Zeeman field components ($h_x,h_y$). Our findings reveal that, when the chemical potential ($\mu$) is high enough compared to the energy gap ($\Delta$) in superconducting electrodes, i.e., $\mu \gg \Delta$, RSOI and DSOI with equal strengths ($|\alpha|=|\beta|$) cause the $\varphi_0$ state to vanish, independent of magnetization and the type of RDSOI. A Zeeman field with unequal components, i.e., $|h_x|\neq |h_y|$, however, can counteract and nullify the destructive impact of equal-strength RDSOIs (for one type only), where $\mu\sim\Delta$, although $|h_x|= |h_y|$ can still eliminate the $\varphi_0$ state. Remarkably, in the $\mu\sim\Delta$ limit, the $\varphi_0$ state is proportional to the multiplication of both components of an in-plane Zeeman field, i.e., $h_xh_y$, which is absent in the $\mu \gg \Delta$ limit. Furthermore, our results of critical supercurrents demonstrate that the persistent spin helices can be revealed in a high enough chemical potential regime $\mu\gg \Delta$, while an opposite regime, i.e., $\mu\sim\Delta$, introduces an adverse effect. In the ballistic regime, the "maximum" of the critical supercurrent occurs at $|\alpha|=|\beta|$ and the Zeeman field can boost this feature. The presence of disorder and nonmagnetic impurities changes this picture drastically, so that the "minimum" of the critical supercurrent occurs at and around the symmetry lines $|\alpha|=|\beta|$.

翻訳日:2023-05-21 17:31:39 公開日:2020-08-09

# IsiAの構造に基づくハミルトニアンモデルによる高堅牢な色素タンパク質複合体の発見

Structure-based Hamiltonian model for IsiA uncovers a highly robust pigment protein complex ( http://arxiv.org/abs/2006.00947v2 )

ライセンス: Link先を確認

Hanan Schoffman, William M. Brown, Yossi Paltiel, Nir Keren and Erik M. Gauger

(参考訳) 鉄ストレス誘導タンパク質A(IsiA)は、生物学的研究における興味と議論の的である。200以上のクロロフィルを結合するIsiA超複合体は、光化学系I(PSI)の周りに多量体の環として集合する。近年、IsiA-PSI構造が3.48 Åの分解能で決定された。この構造に基づいて、IsiAモノマー内の単一励起事象をシミュレートするモデルを構築した。このモデルにより、IsiA構造における蛍光と励起の局在を計算できるようになった。この系をさらに調べるため、熱的ノイズと位置的ノイズの2つの形でモデルにノイズを導入した。ノイズの導入により、極低温と生物学的に意味のある温度との間での系の機能的な違いが浮き彫りになる。我々の結果は、IsiA色素タンパク質複合体のエネルギー特性が室温で非常に頑健であることを示している。それでも、特定のクロロフィルの位置のずれは、その光学的性質および蛍光特性に大きな変化をもたらす。これらの結果に基づき、状況に応じて異なる役割を果たしうる高度に頑健な構造が、光合成過程の機能と進化の理解に対して持つ含意について議論する。

The iron stress-induced protein A (IsiA) is a source of interest and debate in biological research. The IsiA super-complex, binding over 200 chlorophylls, assembles in multimeric rings around photosystem I (PSI). Recently, the IsiA-PSI structure was resolved to 3.48 Å. Based on this structure, we created a model simulating a single excitation event in an IsiA monomer. This model enabled us to calculate the fluorescence and the localisation of the excitation in the IsiA structure. To further examine this system, noise was introduced to the model in two forms -- thermal and positional. Introducing noise highlights the functional differences in the system between cryogenic temperatures and biologically relevant temperatures. Our results show that the energetics of the IsiA pigment-protein complex are very robust at room temperature. Nevertheless, shifts in the position of specific chlorophylls lead to large changes in their optical and fluorescence properties. Based on these results we discuss the implication of highly robust structures, with potential for serving different roles in a context dependent manner, on our understanding of the function and evolution of photosynthetic processes.

翻訳日:2023-05-17 11:28:16 公開日:2020-08-09
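
The link between pigment position and excitation localisation described in the abstract above can be illustrated with a deliberately small toy model. The sketch below is not the structure-based Hamiltonian of the paper: it assumes only two pigments, a point-dipole coupling that falls off as the inverse cube of their separation (orientation factors ignored), and hypothetical values for `j0`, `r0` and the site-energy gap. It uses the closed-form inverse participation ratio of a two-level Hamiltonian to show how a positional shift changes how strongly the excitation is shared.

```python
def dipole_coupling(distance, j0=100.0, r0=1.0):
    """Point-dipole coupling that falls off as 1/distance**3.

    j0 is the (hypothetical) coupling at the reference distance r0;
    orientation factors are ignored in this toy model.
    """
    return j0 * (r0 / distance) ** 3


def dimer_ipr(energy_gap, coupling):
    """Inverse participation ratio of either eigenstate of a two-site model.

    For H = [[e1, J], [J, e2]] with energy_gap = e1 - e2 and coupling = J,
    the site populations are cos^2(theta) and sin^2(theta) with
    tan(2*theta) = 2J / gap, which gives IPR = 1 - 2 J^2 / (gap^2 + 4 J^2).
    The value is 1.0 for a fully localised excitation and 0.5 for one
    shared equally between both pigments.
    """
    denom = energy_gap ** 2 + 4.0 * coupling ** 2
    if denom == 0.0:
        return 1.0  # uncoupled, degenerate sites: report the site basis
    return 1.0 - 2.0 * coupling ** 2 / denom


if __name__ == "__main__":
    gap = 50.0  # hypothetical site-energy difference, same units as j0
    for distance in (1.0, 1.2, 1.5):
        j = dipole_coupling(distance)
        print(f"distance={distance:.1f}  J={j:7.2f}  IPR={dimer_ipr(gap, j):.3f}")
```

Because the coupling scales with the inverse cube of the distance, even a modest displacement in this toy model moves the excitation noticeably toward a single pigment, which is the qualitative kind of sensitivity to positional noise that the abstract reports.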

# 量子デファスメントプローブによるオーミック熱浴の識別

Discrimination of Ohmic thermal baths by quantum dephasing probes ( http://arxiv.org/abs/2008.02526v2 )

ライセンス: Link先を確認

Alessandro Candeloro, Matteo G. A. Paris

(参考訳) 位相緩和(デフェージング)を受ける量子プローブを用いて、異なる温度にある構造化された熱浴の識別を扱う。厳密な縮約ダイナミクスを導出し、3種類の量子プローブ、すなわち量子ビット、量子トリット(qutrit)、および2つの量子ビットからなる量子レジスタによって達成可能な最小誤り確率を評価する。その結果、デフェージング量子プローブは低い温度値の識別に有用であり、相互作用時間が中間的な値のときに誤り確率が低くなることが示された。識別タスクでは量子トリットプローブが量子ビットプローブを上回る一方、2つの量子ビットからなるレジスタは、逐次的に使用される2つの単一量子ビットと比べて何の利点も与えない。

We address the discrimination of structured baths at different temperatures by dephasing quantum probes. We derive the exact reduced dynamics and evaluate the minimum error probability achievable by three different kinds of quantum probes, namely a qubit, a qutrit and a quantum register made of two qubits. Our results indicate that dephasing quantum probes are useful in discriminating low values of temperature, and that lower probabilities of error are achieved for intermediate values of the interaction time. A qutrit probe outperforms a qubit one in the discrimination task, whereas a register made of two qubits does not offer any advantage compared to two single qubits used sequentially.
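To make the notion of a minimum error probability concrete, the following is a minimal sketch rather than the authors' calculation. It assumes a single-qubit probe prepared in the state |+>, pure dephasing that multiplies the coherence by exp(-gamma), equal prior probabilities for the two baths, and the standard Helstrom bound. The function name and the values of `gamma_cold` and `gamma_hot` are hypothetical placeholders and are not derived from an Ohmic spectral density.

```python
import math


def helstrom_error_dephasing_qubit(gamma_a: float, gamma_b: float) -> float:
    """Helstrom minimum error probability (equal priors) for discriminating
    two dephased copies of |+> with coherences exp(-gamma_a) and exp(-gamma_b).

    Assumption: pure dephasing only, so the two states differ solely in
    their off-diagonal elements.
    """
    # rho_a - rho_b = 0.5 * (exp(-gamma_a) - exp(-gamma_b)) * sigma_x,
    # hence its trace norm equals |exp(-gamma_a) - exp(-gamma_b)|.
    trace_norm = abs(math.exp(-gamma_a) - math.exp(-gamma_b))
    # Helstrom bound for equal priors: P_e = (1 - 0.5 * ||rho_a - rho_b||_1) / 2.
    return 0.5 * (1.0 - 0.5 * trace_norm)


if __name__ == "__main__":
    gamma_cold, gamma_hot = 0.2, 1.5  # hypothetical decoherence exponents
    print(helstrom_error_dephasing_qubit(gamma_cold, gamma_hot))
```

In this toy setting the error probability equals 0.5 when both baths dephase the probe identically and cannot fall below 0.25, because the best case compares |+> with the maximally mixed state.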

翻訳日:2023-05-07 00:07:52 公開日:2020-08-09

# 量子場の幾何学的量子情報構造とその格子シミュレーション

Geometric Quantum Information Structure in Quantum Fields and their Lattice Simulation ( http://arxiv.org/abs/2008.03647v1 )

ライセンス: Link先を確認

Natalie Klco and Martin J. Savage

(参考訳) 質量を持たない非相互作用スカラー場理論の2つの切断領域の間の蒸留可能な絡み合いの上限は、幾何学的減衰定数によって定義される指数的減衰を持つ。空間格子で短い距離で制御されると、この絡み合いは突然無次元分離を越えて消滅し、ネガティビティ球面を定義する。 2つの空間次元において、一連の格子計算を通じて、円板対とネガティビティ球面の連続体への成長の間の幾何学的減衰定数を決定する。このような量子情報スケールが量子色力学(qcd)にも現れると仮定すると、核子と核の低エネルギーダイナミクスを記述する有効場理論に新しい相対スケールが存在するかもしれない。本稿では, 有効場理論, 格子qcd計算, 将来の量子シミュレーションにおける蒸留性絡み合い構造の影響について考察する。

An upper limit to distillable entanglement between two disconnected regions of massless non-interacting scalar field theory has an exponential decay defined by a geometric decay constant. When regulated at short distances with a spatial lattice, this entanglement abruptly vanishes beyond a dimensionless separation, defining a negativity sphere. In two spatial dimensions, we determine this geometric decay constant between a pair of disks and the growth of the negativity sphere toward the continuum through a series of lattice calculations. Connecting to quantum field theories in three spatial dimensions, and assuming such quantum information scales also appear in quantum chromodynamics (QCD), a new relative scale may be present in effective field theories describing the low-energy dynamics of nucleons and nuclei. We highlight potential impacts of the distillable entanglement structure on effective field theories, lattice QCD calculations and future quantum simulations.

翻訳日:2023-05-06 18:05:54 公開日:2020-08-09

# ファジィテストを用いた消費者UAVサイバーセキュリティ脆弱性評価

Consumer UAV Cybersecurity Vulnerability Assessment Using Fuzzing Tests ( http://arxiv.org/abs/2008.03621v1 )

ライセンス: Link先を確認

David Rudo and Dr. Kai Zeng

(参考訳) 無人航空機(uav)は、遠隔操作可能な飛行可能な車両であり、軍事活動から国内娯楽まで様々な環境に存在する。これらの車両は素晴らしい資産ですが、パイロットが遠隔操作できるのと同じように、サイバー攻撃も同様に実行できます。 UAVに対するサイバー攻撃は、物理的および仮想システムに多くの問題をもたらす可能性がある。このような誤動作は、攻撃者にデータを盗んだり、UAVを無効にしたり、UAVをハイジャックする能力を与える。このような攻撃を軽減するには、悪意ある悪用される可能性のある脆弱性を特定し、パッチを当てる必要がある。本稿では, 特定のポートに送信される大量のデータストリームを用いて, 悪用可能なUAVセキュリティプラクティスを識別し, 新たなUAV脆弱性を探索する。より詳細なモデルでは、UAVのFTPポートに送信されるFTP固有のキーワードを含むデータの文字列をファジングテストとして含み、UAV上の他のポートでも数千のパケットを起動する。これらのテストの間、仮想および物理的システムは、特定のパターンや脆弱性を特定するために広範囲に監視される。このモデルは、攻撃者がネットワークを侵害したuavを正確に描写し、多くのローエンドのuavモデルを家庭で使用するparrot bebop 2に適用される。テスト中、Parrot Bebop 2はGPS性能の低下、ビデオ速度、パイロットに対するUAVの反応性、モーター機能、UAVのセンサーデータの精度をモニターする。これらすべての監視ポイントは、個々のテストに対するUAVの反応を包括的に見ることができる。本稿では,この脆弱性の悪用に対処するための対策と,ファジングテストから分岐する可能性のある攻撃について述べる。

Unmanned Aerial Vehicles (UAVs) are remote-controlled vehicles capable of flight and are present in a variety of environments from military operations to domestic enjoyment. These vehicles are great assets, but just as their pilot can control them remotely, cyberattacks can be executed in a similar manner. Cyberattacks on UAVs can bring a plethora of issues to physical and virtual systems. Such attacks can give an attacker the ability to steal data, incapacitate the UAV, or hijack the UAV. To mitigate such attacks, it is necessary to identify and patch vulnerabilities that may be maliciously exploited. In this paper, a new UAV vulnerability is explored with related UAV security practices identified for possible exploitation using large streams of data sent at specific ports. The more in-depth model involves strings of data involving FTP-specific keywords sent to the UAV's FTP port in the form of a fuzzing test and launching thousands of packets at other ports on the UAV as well. During these tests, virtual and physical systems are monitored extensively to identify specific patterns and vulnerabilities. This model is applied to a Parrot Bebop 2, which accurately portrays a UAV that has had its network compromised by an attacker and is representative of many lower-end UAV models for domestic use. During testing, the Parrot Bebop 2 is monitored for degradation in GPS performance, video speed, the UAV's reactivity to the pilot, motor function, and the accuracy of the UAV's sensor data. All these points of monitoring give a comprehensive view of the UAV's reaction to each individual test. In this paper, countermeasures to combat the exploitation of this vulnerability will be discussed as well as possible attacks that can branch from the fuzzing tests.

翻訳日:2023-05-06 18:05:40 公開日:2020-08-09

# ハイブリッドシステムの近似進化:光機械式jaynes-cummingsモデル

Approximate evolution for a hybrid system: An optomechanical Jaynes-Cummings model ( http://arxiv.org/abs/2008.03839v1 )

ライセンス: Link先を確認

L. Medina-Dozal, I. Ramos-Prieto, J. Récamier

(参考訳) この研究は、2つの既知のシステムから構築された現象論的ハミルトニアンから始まります:ポンプ式光力学系のハミルトニアンとjaynes cummingsハミルトニアンです。代数的手法を用いて、(指数関数の積として)強制光学系に対して近似時間発展作用素 $\hat U_{opt}$ を構築し、相互作用として JC ハミルトニアンを取る。後者を$\hat u_{opt}$ で変換し、線形化でき、時間発展演算子が積形式で書かれる一般化された相互作用像ハミルトニアンを得る。解析結果はフルハミルトニアンを用いた純粋数値計算と比較され,両者の一致は顕著である。

In this work we start from a phenomenological Hamiltonian built from two known systems: the Hamiltonian of a pumped optomechanical system and the Jaynes-Cummings (JC) Hamiltonian. Using algebraic techniques we construct an approximate time evolution operator $\hat U_{opt}$ for the forced optomechanical system (as a product of exponentials) and take the JC Hamiltonian as an interaction. We transform the latter with $\hat U_{opt}$ to obtain a generalized interaction picture Hamiltonian which can be linearized and whose time evolution operator is written in a product form. The analytic results are compared with purely numerical calculations using the full Hamiltonian and the agreement between them is remarkable.

翻訳日:2023-05-06 18:04:16 公開日:2020-08-09

# モバイル受動センシングを利用した不安検出

Anxiety Detection Leveraging Mobile Passive Sensing ( http://arxiv.org/abs/2008.03810v1 )

ライセンス: Link先を確認

Lionel Levine, Migyeong Gwak, Kimmo Karkkainen, Shayan Fazeli, Bita Zadeh, Tara Peris, Alexander Young, Majid Sarrafzadeh

(参考訳) 不安障害は小児と成人の両方に影響する精神疾患の最も一般的な分類である。しかしながら、不安を効果的に監視し管理するためのツールは不足しており、不安に関するユニークな課題に対処するために比較的限られた研究が適用されている。スマートフォンから受動的で控えめなデータを集めることは、従来の方法の代替となり、リアルタイムのメンタルヘルス監視と疾病管理が可能になる。本稿では,センサとユーザログの完全適合性を連続的かつ受動的に追跡する実験用モバイルアプリケーションeWellnessを提案する。 1か月の間に10人を追跡し、受動的に監視された機能のみに基づいて、毎日の不安や抑うつレベルを予測するのに76%近い成功率を示した最初のパイロット研究を報告した。

Anxiety disorders are the most common class of psychiatric problems affecting both children and adults. However, tools to effectively monitor and manage anxiety are lacking, and comparatively limited research has been applied to addressing the unique challenges around anxiety. Leveraging passive and unobtrusive data collection from smartphones could be a viable alternative to classical methods, allowing for real-time mental health surveillance and disease management. This paper presents eWellness, an experimental mobile application designed to track a full suite of sensor and user-log data from an individual's device in a continuous and passive manner. We report on an initial pilot study tracking ten people over the course of a month that showed a nearly 76% success rate at predicting daily anxiety and depression levels based solely on the passively monitored features.

翻訳日:2023-05-06 18:03:59 公開日:2020-08-09

# 拡張不確かさ原理が相対論的クーロンポテンシャルに及ぼす影響

Effects of Extended Uncertainty Principle on the Relativistic Coulomb Potential ( http://arxiv.org/abs/2008.03807v1 )

ライセンス: Link先を確認

B. Hamil, M. Merad, T. Birkandan

(参考訳) 拡張不確実性原理の文脈において、相対論的境界状態エネルギースペクトルとクーロンポテンシャルの波動関数をド・ジッター空間と反ド・ジッター空間に対して研究した。クライン=ゴルドン方程式とディラック方程式を解析的に解いて結果を得る。水素様原子の電子エネルギーを数値的に研究する。

The relativistic bound-state energy spectrum and the wavefunctions for the Coulomb potential are studied for de Sitter and anti-de Sitter spaces in the context of the extended uncertainty principle. Klein-Gordon and Dirac equations are solved analytically to obtain the results. The electron energies of hydrogen-like atoms are studied numerically.

翻訳日:2023-05-06 18:03:48 公開日:2020-08-09

# 内部コンプライアンスのための戦術:文献レビュー

Tactics for Internal Compliance: A Literature Review ( http://arxiv.org/abs/2008.03775v1 )

ライセンス: Link先を確認

Ralph Foorthuis

(参考訳) 組織の内部および外部の規範へのコンプライアンスは、今日、実務家と研究者の双方にとって非常に重要なトピックである。しかし、組織が内部コンプライアンスを達成するために利用できる実質的かつ基本的なコンプライアンス戦術は、断片的に、かつ異なる学術分野の文献において記述されてきた。本研究は、134件の出版物を対象とした学際的な構造化文献レビューを用いて、3つの貢献を行う。第1に、45のコンプライアンス戦術の類型を提示する。これは、組織をコンプライアンスに適合させるための基本的方法の包括的で豊かな概観となる。第2に、コンプライアンス理論の基本概念の概要を提供する。これは、コンプライアンス戦術を位置づけ、コンプライアンス戦略を分析または開発するために我々が構築したフレームワークの基礎となる。第3に、コンプライアンス戦術からコンプライアンス戦略へ移行するための洞察を示す。この過程で、学際的な文献レビューを用いて俯瞰的な視点をとることにより、コンプライアンス戦略はこれまで認識されてきたよりも豊かな概念とみなす必要があることを示す。また、イノベーションの機会が存在することも示す。

Compliance of organizations with internal and external norms is a highly relevant topic for both practitioners and academics nowadays. However, the substantive, elementary compliance tactics that organizations can use for achieving internal compliance have been described in a fragmented manner and in the literatures of distinct academic disciplines. Using a multidisciplinary structured literature review of 134 publications, this study offers three contributions. First, we present a typology of 45 compliance tactics, which constitutes a comprehensive and rich overview of elementary ways for bringing the organization into compliance. Secondly, we provide an overview of fundamental concepts in the theory of compliance, which forms the basis for the framework we developed for positioning compliance tactics and for analyzing or developing compliance strategies. Thirdly, we present insights for moving from compliance tactics to compliance strategies. In the process, and using the multidisciplinary literature review to take a bird's-eye view, we demonstrate that compliance strategies need to be regarded as a richer concept than perceived hitherto. We also show that opportunities for innovation exist.

翻訳日:2023-05-06 18:03:15 公開日:2020-08-09

# バンドド行列構造を示す摂動による量子多体緩和の修正

Modification of quantum many-body relaxation by perturbations exhibiting a banded matrix structure ( http://arxiv.org/abs/2008.03745v1 )

ライセンス: Link先を確認

Lennart Dabelow, Patrick Vorndamme, and Peter Reimann

(参考訳) 分離量子多体系の可観測緩和挙動は,非摂動的典型的枠組み内での弱い摂動に応答してどのように修正されるかを検討する。鍵となる役割はいわゆる摂動プロファイル(perturbation profile)であり、摂動行列要素が対応するエネルギー固有値の差に対する非摂動ハミルトニアンの固有ベイシスにおける依存性を特徴付ける。特に、帯状マトリックス構造は、大きなエネルギー差のためにゼロに近づく摂動プロファイルによって定量的に捕獲される。緩和の時間的修正は、十分弱く強い摂動に対する近似解析解を許容する非線形積分方程式を介して摂動プロファイルと関連付けられ、一般的な場合において数値解スキームを考える。例として,可視なバンドド行列構造を持つスピン格子モデルについて考察し,自由適合パラメータを伴わない解析的予測と数値の相同性が極めて高いことを見出した。

We investigate how the observable relaxation behavior of an isolated quantum many-body system is modified in response to weak-to-moderate perturbations within a nonperturbative typicality framework. A key role is played by the so-called perturbation profile, which characterizes the dependence of the perturbation matrix elements in the eigenbasis of the unperturbed Hamiltonian on the difference of the corresponding energy eigenvalues. In particular, a banded matrix structure is quantitatively captured by a perturbation profile which approaches zero for large energy differences. The temporal modification of the relaxation is linked to the perturbation profile via a nonlinear integral equation, which admits approximate analytical solutions for sufficiently weak and strong perturbations, and for which we work out a numerical solution scheme in the general case. As an example, we consider a spin lattice model with a pronounced banded matrix structure, and we find very good agreement of the numerics with our analytical predictions without any free fit parameter.

翻訳日:2023-05-06 18:02:51 公開日:2020-08-09

# スマート音声メッセージングシステムを用いた農業知識管理 : 物理的・人的センサの組み合わせ

Agricultural Knowledge Management Using Smart Voice Messaging Systems: Combination of Physical and Human Sensors ( http://arxiv.org/abs/2008.03711v1 )

ライセンス: Link先を確認

Naoshi Uchihira and Masami Yoshida

(参考訳) 農業知識管理システムにおけるモノのインターネット(IoT)の利用は、農業の効率を高めるための最も有望なアプローチの1つである。しかし、農業における既存の物理的センサーは、作物の特性の変化をモニタリングするために限られており、平均的な農家にとっては高価である可能性がある。身体と人間のセンサー(五感)の組み合わせを提案する。農家は、自分の目、耳、鼻、舌、指を使って、作物や機器の特徴や状況(葉の色、病気、害虫、欠陥または機能不全)の様々な変化を確認し、その観察を口頭で表現し、スマートフォンのようなオーディオ録音装置で記述を捉えた。音声録音はwebサーバによってテキストに書き起こされる。物理的および人的センサ(音声メッセージ)が取得したデータは、データとテキストマイニングによって分析され、農業の知識を創造し、改善する。物理的および人的センサを用いた農業知識管理システムは、農業の効率と生産性を向上させる目的で、農家間で知識の共有と伝達を奨励する。北海道の温室野菜農場にこのような農業知識管理システム(スマート音声メッセージングシステム)を適用した。蓄積音声の質的分析と農家へのインタビューにより,本システムの有効性が示された。本研究の貢献は,「IoE(Agricultural Internet of Everything)」に対する新たな実践的アプローチと,実生野菜農場での試行実験の結果,その有効性を示すものである。

The use of the Internet of Things (IoT) in agricultural knowledge management systems is one of the most promising approaches to increasing the efficiency of agriculture. However, the existing physical sensors in agriculture are limited in their ability to monitor various changes in the characteristics of crops and may be expensive for the average farmer. We propose a combination of physical and human sensors (the five human senses). By using their own eyes, ears, noses, tongues, and fingers, farmers could check the various changes in the characteristics and conditions (colors of leaves, diseases, pests, faulty or malfunctioning equipment) of their crops and equipment, verbally describe their observations, and capture the descriptions with audio recording devices, such as smartphones. The voice recordings could be transcribed into text by web servers. The data captured by the physical and human sensors (voice messages) are analyzed by data and text mining to create and improve agricultural knowledge. An agricultural knowledge management system using physical and human sensors encourages farmers to share and transfer knowledge among themselves for the purpose of improving the efficiency and productivity of agriculture. We applied one such agricultural knowledge management system (smart voice messaging system) to a greenhouse vegetable farm in Hokkaido. A qualitative analysis of accumulated voice messages and an interview with the farmer demonstrated the effectiveness of this system. The contributions of this study include a new and practical approach to an "agricultural Internet of Everything (IoE)" and evidence of its effectiveness as a result of our trial experiment at a real vegetable farm.

翻訳日:2023-05-06 18:02:35 公開日:2020-08-09

# Schrödinger方程式の正確な離散化について

On the exact discretization of Schrödinger equation ( http://arxiv.org/abs/2008.03698v1 )

ライセンス: Link先を確認

Chih-Lung Chou

(参考訳) 運動量表現から位置表現へ演算子を変換する離散フーリエ変換を用いることで、Schrödinger方程式の厳密な離散アナログが、Schrödinger場の理論のハミルトニアン演算子から自然に導出されることを示す。離散化されたSchrödinger方程式としてよく用いられる標準的な中心差分方程式は、異なるハミルトニアン演算子から導かれるため、実際には別の理論を記述する。離散空間における位置演算子と運動量演算子の間の交換関係も導出され、連続空間における従来の交換関係とは異なることが分かる。2つの離散化公式の比較は、1次元空間の矩形ポテンシャル障壁を通過する波束の透過確率を数値的に調べることによって行う。どちらの離散化公式も、理論計算と比較して妥当で正確な数値結果を与えることが示されるが、厳密な離散化公式を使うとより多くの計算時間がかかる。標準的な中心差分公式を用いて正確な数値結果を得るには、入射波束の平均波数 $k_0$ が $|k_0\ell| < 0.35$ を満たす必要がある。ここで $\ell$ は位置空間における格子間隔である。

We show that the exact discrete analogue of the Schrödinger equation can be derived naturally from the Hamiltonian operator of a Schrödinger field theory by using the discrete Fourier transform that transforms the operator from momentum representation into position representation. The standard central difference equation that is often used as the discretized Schrödinger equation actually describes a different theory since it is derived from a different Hamiltonian operator. The commutator relation between the position and momentum operators in discrete space is also derived and found to be different from the conventional commutator relation in continuous space. A comparison between the two discretization formulas is made by numerically studying the transmission probability for a wave packet passing through a square potential barrier in one-dimensional space. Both discretization formulas are shown to give sensible and accurate numerical results as compared to theoretical calculation, though it takes more computation time when using the exact discretization formula. The average wave number $k_0$ of the incident wave packet must satisfy $|k_0\ell| < 0.35$, where $\ell$ is the lattice spacing in position space, in order to obtain an accurate numerical result by using the standard central difference formula.
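One way to get a feel for why the central difference formula needs a small $|k_0\ell|$ is to compare its kinetic-energy eigenvalue for a plane wave with the continuum value $k^2$. The sketch below does only that, in units where $\hbar^2/2m = 1$. It is an illustration under these assumptions and does not reproduce the paper's exact discretization or its barrier-transmission study; the function name is hypothetical.

```python
import math


def central_difference_ratio(k_times_ell: float) -> float:
    """Ratio of the central-difference eigenvalue to the continuum k^2.

    For psi_n = exp(i k n ell), the stencil -(psi_{n+1} - 2 psi_n + psi_{n-1}) / ell^2
    gives the eigenvalue (2 - 2 cos(k ell)) / ell^2; dividing by k^2 leaves a
    function of x = k * ell only.
    """
    x = k_times_ell
    if x == 0.0:
        return 1.0  # limit of (2 - 2 cos x) / x^2 as x -> 0
    return (2.0 - 2.0 * math.cos(x)) / (x * x)


if __name__ == "__main__":
    for x in (0.1, 0.35, 1.0):
        ratio = central_difference_ratio(x)
        print(f"k*ell = {x:.2f}: ratio = {ratio:.5f}, relative error = {1.0 - ratio:.2%}")
```

At $k\ell = 0.35$ the central-difference eigenvalue is about 1% below $k^2$, and the deviation grows roughly as $(k\ell)^2/12$. The threshold quoted in the abstract comes from the paper's own numerical comparison, so this estimate should be read only as an order-of-magnitude plausibility check.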

翻訳日:2023-05-06 18:02:10 公開日:2020-08-09

# ソーシャルメディアを使って自然災害に対する人口動態を測る:2019年のオーストラリア・ブッシュファイア後の大規模Facebook調査から

Using social media to measure demographic responses to natural disaster: Insights from a large-scale Facebook survey following the 2019 Australia Bushfires ( http://arxiv.org/abs/2008.03665v1 )

ライセンス: Link先を確認

Paige Maas and Zack Almquist and Eugenia Giraudy and JW Schneider

(参考訳) 本稿では,自然災害後の調査データを収集し,そのデータをデバイス由来の移動情報と組み合わせ,人口統計学的結果の探索を行う。ソーシャルメディアを人口統計調査のプラットフォームとして使うことは、特に困難で費用がかかる人口統計調査のプラットフォームとして、人口統計コミュニティの関心をますます高めている。 Schneider と Harknett (2019) による最近の研究は、米国の低所得労働者のデータ収集に Facebook をターゲットとした広告の利用を探求している。他の研究では、移民同化(stewart et al, 2019)、世界出生率(ribeiro et al, 2020)、世界移住株(zagheni et al, 2017)に対処している。われわれは、Facebookアプリ自体を通じて、ディスアスター後の人口統計と経済結果の迅速応答調査を導入することで、この取り組みを構築している。我々は、これらの調査回答を用いて、facebookの変位マップを含むアプリから派生したモビリティデータを強化し、観察された行動トレンドの妥当性とドライバーを評価する。この調査は、2019年のオーストラリアで起きた山火事の後に行われた。そうすることで、変位と人口動態に関するいくつかの重要な仮説を試すことができます。特に,転位決定やタイミング,喫煙マスクなどの保護具へのアクセスなど,重要な領域における男女差を明らかにした。研究と政策に関する簡単な議論で締めくくります。

In this paper we explore a novel method for collecting survey data following a natural disaster and then combine this data with device-derived mobility information to explore demographic outcomes. Using social media as a survey platform for measuring demographic outcomes, especially those that are challenging or expensive to field for, is increasingly of interest to the demographic community. Recent work by Schneider and Harknett (2019) explores the use of Facebook targeted advertisements to collect data on low-income shift workers in the United States. Other work has addressed immigrant assimilation (Stewart et al, 2019), world fertility (Ribeiro et al, 2020), and world migration stocks (Zagheni et al, 2017). We build on this work by introducing a rapid-response survey of post-disaster demographic and economic outcomes fielded through the Facebook app itself. We use these survey responses to augment app-derived mobility data that comprises Facebook Displacement Maps to assess the validity of and drivers underlying those observed behavioral trends. This survey was deployed following the 2019 Australia bushfires to better understand how these events displaced residents. In doing so we are able to test a number of key hypotheses around displacement and demographics. In particular, we uncover several gender differences in key areas, including in displacement decision-making and timing, and in access to protective equipment such as smoke masks. We conclude with a brief discussion of research and policy implications.

翻訳日:2023-05-06 18:01:49 公開日:2020-08-09

# InSAR位相フィルタリングとコヒーレンス推定のための教師なし生成ニューラルアプローチ

An Unsupervised Generative Neural Approach for InSAR Phase Filtering and Coherence Estimation ( http://arxiv.org/abs/2001.09631v3 )

ライセンス: Link先を確認

Subhayan Mukherjee, Aaron Zimmer, Xinyao Sun, Parwant Ghuman, Irene Cheng

(参考訳) 位相フィルタリングと画素品質(コヒーレンス)推定は、干渉合成開口レーダ(InSAR)画像からデジタル標高モデル(DEM)を作成する際に重要であり、空間的不整合(残差)を除去し、その後のアンラッピングを大幅に改善する。大量のInSARデータは、地理的領域にわたる広域モニタリング(WAM)を容易にする。並列コンピューティングの進歩は、畳み込みニューラルネットワーク(CNN)を加速し、視覚的パターン認識における人間のパフォーマンスよりも有利になった。しかし、この研究はほとんど未調査である。そこで我々は,共同位相フィルタリングとコヒーレンス推定のためのCNNに基づく生成モデルであるGenInSARを提案し,InSARのデータ分布を直接学習する。ゲニンサーの衛星とシミュレートされたノイズのinsar画像に関する教師なしの訓練は、他の5つの関連する方法(平均16.5%以上)を上回っており、分岐カット周辺の過剰なスムーシング/アーティファクトは少ない。ゲニンサーの位相とコヒーレンス根-平均二乗誤差と位相コサイン誤差はそれぞれ0.54, 0.07, 0.05であった。

Phase filtering and pixel quality (coherence) estimation are critical in producing Digital Elevation Models (DEMs) from Interferometric Synthetic Aperture Radar (InSAR) images, as they remove spatial inconsistencies (residues) and immensely improve the subsequent unwrapping. The large amount of InSAR data facilitates Wide Area Monitoring (WAM) over geographical regions. Advances in parallel computing have accelerated Convolutional Neural Networks (CNNs), giving them advantages over human performance on visual pattern recognition, which makes CNNs a good choice for WAM. Nevertheless, this research is largely unexplored. We thus propose "GenInSAR", a CNN-based generative model for joint phase filtering and coherence estimation, that directly learns the InSAR data distribution. GenInSAR's unsupervised training on satellite and simulated noisy InSAR images outperforms five other related methods in total residue reduction (over 16.5% better on average) with less over-smoothing/artefacts around branch cuts. GenInSAR's Phase and Coherence Root-Mean-Squared-Error and Phase Cosine Error have average improvements of 0.54, 0.07, and 0.05 respectively compared to the related methods.

翻訳日:2023-01-06 08:00:36 公開日:2020-08-09

# DAGの混合から生じる分布からの因果構造発見

Causal Structure Discovery from Distributions Arising from Mixtures of DAGs ( http://arxiv.org/abs/2001.11940v2 )

ライセンス: Link先を確認

Basil Saeed, Snigdha Panigrahi, Caroline Uhler

(参考訳) 本研究では,各モデルが有向非巡回グラフ(DAG)で表される因果モデルの混合から生じる分布について考察する。このような混合分布のグラフィカル表現を提供し、この表現が混合分布の条件付き独立関係を符号化することを示す。次に,このような分布からのサンプルに基づく構造学習の問題を考える。混合変数は潜時であるため、潜時変数に対処できるFCIなどの因果構造探索アルゴリズムを検討する。これらのアルゴリズムは, 成分DAGの「統一」を復元し, 成分DAG間の条件分布が異なる変数を同定可能であることを示す。本研究では,合成および実データを用いて,推定されたグラフが異なる混合成分間で異なるノードを識別することを示す。直近の応用として,各混合成分に応じてサンプルをクラスタリングするために,この因果情報の検索方法を示す。

We consider distributions arising from a mixture of causal models, where each model is represented by a directed acyclic graph (DAG). We provide a graphical representation of such mixture distributions and prove that this representation encodes the conditional independence relations of the mixture distribution. We then consider the problem of structure learning based on samples from such distributions. Since the mixing variable is latent, we consider causal structure discovery algorithms such as FCI that can deal with latent variables. We show that such algorithms recover a "union" of the component DAGs and can identify variables whose conditional distribution across the component DAGs vary. We demonstrate our results on synthetic and real data showing that the inferred graph identifies nodes that vary between the different mixture components. As an immediate application, we demonstrate how retrieval of this causal information can be used to cluster samples according to each mixture component.

翻訳日:2023-01-05 05:52:26 公開日:2020-08-09

# 敵対的色強調:カラーフィルタの最適化による無制限の敵対的画像の生成

Adversarial Color Enhancement: Generating Unrestricted Adversarial Images by Optimizing a Color Filter ( http://arxiv.org/abs/2002.01008v3 )

ライセンス: Link先を確認

Zhengyu Zhao, Zhuoran Liu, Martha Larson

(参考訳) 本稿では,ニューラルネットワークを誤分類する逆効果を生成するために,カラーフィルタを用いた画像拡張手法を提案する。提案手法であるACE(Adversarial Color Enhancement)は,勾配降下によるカラーフィルタの最適化により,非制限逆画像を生成する。 ACEの新規性は、透明な画像強調のための確立された実践の取り入れである。実験によりACEの白色箱対向強度と黒箱移動性について検証した。さまざまな例がACEが生成する画像の知覚的品質を示している。 ACEは、L_p$非受容性を超えた最近の研究に重要な貢献をし、大きな知覚的摂動をもたらすが、人間の目には目立たないような、制限のない敵の修正に焦点を当てている。フィルタベースの敵の今後の可能性についても,共通の拡張プラクティス(Instagramフィルタなど)でACEを特定の魅力的なイメージスタイルに導くこと,イメージセマンティクスにACEを適用すること,の2つの方向が検討されている。コードはhttps://github.com/zhengyuzhao/aceで入手できる。

We introduce an approach that enhances images using a color filter in order to create adversarial effects, which fool neural networks into misclassification. Our approach, Adversarial Color Enhancement (ACE), generates unrestricted adversarial images by optimizing the color filter via gradient descent. The novelty of ACE is its incorporation of established practice for image enhancement in a transparent manner. Experimental results validate the white-box adversarial strength and black-box transferability of ACE. A range of examples demonstrates the perceptual quality of images that ACE produces. ACE makes an important contribution to recent work that moves beyond $L_p$ imperceptibility and focuses on unrestricted adversarial modifications that yield large perceptible perturbations, but remain non-suspicious, to the human eye. The future potential of filter-based adversaries is also explored in two directions: guiding ACE with common enhancement practices (e.g., Instagram filters) towards specific attractive image styles and adapting ACE to image semantics. Code is available at https://github.com/ZhengyuZhao/ACE.

翻訳日:2023-01-04 09:16:06 公開日:2020-08-09

# AOL:ダイナミックビデオシーンにおける人間の軌道予測のための適応型オンライン学習

AOL: Adaptive Online Learning for Human Trajectory Prediction in Dynamic Video Scenes ( http://arxiv.org/abs/2002.06666v2 )

ライセンス: Link先を確認

Manh Huynh, Gita Alaghband

(参考訳) 本稿では,動的映像シーンにおける人間の運動軌跡を予測するための適応型オンライン学習(aol)フレームワークを提案する。我々のフレームワークはシーン環境の変化を学習し、適応し、異なるシナリオに対して最適なネットワーク重みを生成する。このフレームワークは予測モデルに適用でき、シーンの変化に遭遇すると動的に調整し、次の場所を予測するのに最適なトレーニング重みを適用できるため、パフォーマンスを向上させることができる。 LSTM[3]とFuture Person Location(FPL)[1]という2つの既存の予測モデルとフレームワークを統合することでこれを実証する。さらに,ネットワークの重み付け数を最適性能として分析し,最も最近トレーニングされたネットワーク重みを維持したlru(lru)戦略を用いて,一定数のネットワークでリアルタイムに実現可能であることを示す。大規模な実験により,我々のフレームワークはLSTMとFPLの予測精度を平均で17%,FPLで28%向上し,FPLでは最大50%向上し,リアルタイム(20fps)を実現した。

We present a novel adaptive online learning (AOL) framework to predict human movement trajectories in dynamic video scenes. Our framework learns and adapts to changes in the scene environment and generates the best network weights for different scenarios. The framework can be applied to prediction models and improve their performance as it dynamically adjusts when it encounters changes in the scene and can apply the best training weights for predicting the next locations. We demonstrate this by integrating our framework with two existing prediction models: LSTM [3] and Future Person Location (FPL) [1]. Furthermore, we analyze the number of network weights for optimal performance and show that we can achieve real-time performance with a fixed number of networks using the least recently used (LRU) strategy for maintaining the most recently trained network weights. With extensive experiments, we show that our framework increases prediction accuracies of LSTM and FPL by ~17% and 28% on average, and up to ~50% for FPL in the worst case while achieving real-time performance (20fps).
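The least recently used (LRU) strategy mentioned above can be sketched as a small fixed-capacity cache. This is a generic illustration and not the authors' implementation: the class name `WeightCache`, the scenario keys, and the use of plain Python objects as stand-ins for network weights are all assumptions.

```python
from collections import OrderedDict
from typing import Any, Hashable, Optional


class WeightCache:
    """Keeps at most `capacity` sets of network weights, evicting the entry
    that was used least recently (hypothetical helper, for illustration)."""

    def __init__(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._store: "OrderedDict[Hashable, Any]" = OrderedDict()

    def get(self, scenario: Hashable) -> Optional[Any]:
        """Return the weights for a scenario and mark them as recently used."""
        if scenario not in self._store:
            return None
        self._store.move_to_end(scenario)
        return self._store[scenario]

    def put(self, scenario: Hashable, weights: Any) -> None:
        """Store newly trained weights, evicting the least recently used set."""
        self._store[scenario] = weights
        self._store.move_to_end(scenario)
        if len(self._store) > self.capacity:
            self._store.popitem(last=False)  # drop the oldest entry


if __name__ == "__main__":
    cache = WeightCache(capacity=2)
    cache.put("crowded", [0.1, 0.2])
    cache.put("sparse", [0.3, 0.4])
    cache.get("crowded")             # "crowded" is now the most recent
    cache.put("night", [0.5, 0.6])   # evicts "sparse"
    print(cache.get("sparse"))
```

Bounding the number of stored weight sets keeps memory and lookup cost constant, which is one plausible reading of how a fixed number of networks supports real-time operation.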

翻訳日:2022-12-31 18:16:23 公開日:2020-08-09

# パリからベルリンへ:世界中のファッションスタイルの影響を発見

From Paris to Berlin: Discovering Fashion Style Influences Around the World ( http://arxiv.org/abs/2004.01316v2 )

ライセンス: Link先を確認

Ziad Al-Halah, Kristen Grauman

(参考訳) 服のスタイルの進化とその世界への移住は興味深いが、定量的に説明するのは難しい。着ている人の日常のイメージからファッションの影響を発見・定量化することを提案する。我々は,他の都市がどの都市に影響を及ぼすかを検出する手法を導入する。次に、発見された影響パターンを活用して予測モデルに通知し、任意の都市における任意のスタイルの人気を予測します。 44大都市を対象とする7.7M画像の大規模なデータセットであるGeoStyleを用いて、私たちのアイデアを実証し、都市が50の視覚的スタイルに対してどのようにしてファッションの影響を受け、受けているかを明らかにする。さらに,提案した予測モデルは,空間的および時間的に視覚的スタイルの進化の基盤となることの利点を示す,挑戦的なスタイル予測タスクの最先端結果を実現する。

The evolution of clothing styles and their migration across the world is intriguing, yet difficult to describe quantitatively. We propose to discover and quantify fashion influences from everyday images of people wearing clothes. We introduce an approach that detects which cities influence which other cities in terms of propagating their styles. We then leverage the discovered influence patterns to inform a forecasting model that predicts the popularity of any given style at any given city into the future. Demonstrating our idea with GeoStyle---a large-scale dataset of 7.7M images covering 44 major world cities, we present the discovered influence relationships, revealing how cities exert and receive fashion influence for an array of 50 observed visual styles. Furthermore, the proposed forecasting model achieves state-of-the-art results for a challenging style forecasting task, showing the advantage of grounding visual style evolution both spatially and temporally.

翻訳日:2022-12-17 04:53:50 公開日:2020-08-09

# tuigan: 2つの非ペア画像による多彩な画像から画像への翻訳を学ぶ

TuiGAN: Learning Versatile Image-to-Image Translation with Two Unpaired Images ( http://arxiv.org/abs/2004.04634v2 )

ライセンス: Link先を確認

Jianxin Lin, Yingxue Pang, Yingce Xia, Zhibo Chen, Jiebo Luo

(参考訳) unsupervised image-to-image translation (ui2i)タスクは、2つのドメイン間のマッピングをペアイメージなしで学習する。既存のui2iメソッドは通常、トレーニングのために異なるドメインからの多数の非ペア画像を必要とするが、トレーニングデータが非常に限られるシナリオはたくさんある。本稿では、各ドメインが1つのイメージを含んでいても、ui2iは依然として達成できると主張する。この目的のために,2つの未ペア画像のみをトレーニングし,ワンショットで教師なし学習を行う生成モデルTuiGANを提案する。 TuiGANでは、生成した画像がグローバルな構造から局所的な詳細へと徐々に洗練される粗い方法で変換される。幅広いui2iタスクにおいて,汎用性が強いベースラインを上回ることを検証するために,広範な実験を行った。さらに、TuiGANは十分なデータでトレーニングされた最先端のUI2Iモデルと同等のパフォーマンスを達成することができる。

An unsupervised image-to-image translation (UI2I) task deals with learning a mapping between two domains without paired images. While existing UI2I methods usually require numerous unpaired images from different domains for training, there are many scenarios where training data is quite limited. In this paper, we argue that even if each domain contains a single image, UI2I can still be achieved. To this end, we propose TuiGAN, a generative model that is trained on only two unpaired images and amounts to one-shot unsupervised learning. With TuiGAN, an image is translated in a coarse-to-fine manner where the generated image is gradually refined from global structures to local details. We conduct extensive experiments to verify that our versatile method can outperform strong baselines on a wide variety of UI2I tasks. Moreover, TuiGAN is capable of achieving comparable performance with the state-of-the-art UI2I models trained with sufficient data.

翻訳日:2022-12-15 02:36:07 公開日:2020-08-09

# COVID-19に対するドラッグ・リパーパシング機会を特定するためのネットワーク医学フレームワーク

Network Medicine Framework for Identifying Drug Repurposing Opportunities for COVID-19 ( http://arxiv.org/abs/2004.07229v2 )

ライセンス: Link先を確認

Deisy Morselli Gysi and Ítalo Do Valle and Marinka Zitnik and Asher Ameli and Xiao Gan and Onur Varol and Susan Dina Ghiassian and JJ Patten and Robert Davey and Joseph Loscalzo and Albert-László Barabási

(参考訳) 現在のパンデミックは、SARS-CoV-2感染の潜在的な効果のために、迅速かつ確実に臨床承認された化合物を優先順位付けできる方法の必要性を強調している。過去10年間で、ネットワークメディカルは、薬物の標的と疾患遺伝子の間の細胞内ネットワークに基づく関係を利用して、薬物の再利用のための複数の予測アルゴリズムを開発し、検証してきた。そこで我々は,人工知能,ネットワーク拡散,ネットワーク近接に基づくアルゴリズムをデプロイし,それぞれがSARS-CoV-2に対する効果を期待して6,340の薬物をランク付けするよう命じた。予測を検証するために,veroe6細胞で実験的にスクリーニングされた基底的真理918薬と,臨床試験中の薬物の一覧を用いて,covid-19の有効性を有する薬物に対する医療コミュニティの評価を捉えた。ほとんどのアルゴリズムは、これらの基底真理データに対して予測能力を提供しているが、すべてのデータセットとメトリクスに対して一貫した結果を提供する単一の方法はない。これにより、全てのアルゴリズムの予測を融合させるマルチモーダルアプローチを開発し、異なる予測手法間のコンセンサスが、最高のパイプラインの性能を常に上回ることを示した。ウイルス感染の抑制に成功している77薬のうち76薬はSARS-CoV-2を標的としたタンパク質に結合せず、これらの薬はドッキングベースの戦略では特定できないネットワークベースの作用に依存している。これらの進歩は、将来の病原体や、デ・ノボの薬物開発のコストと長期のスケジュールで守られていない疾患に対する再生可能な薬物を同定する方法を提供する。

The current pandemic has highlighted the need for methodologies that can quickly and reliably prioritize clinically approved compounds for their potential effectiveness for SARS-CoV-2 infections. In the past decade, network medicine has developed and validated multiple predictive algorithms for drug repurposing, exploiting the sub-cellular network-based relationship between a drug's targets and disease genes. Here, we deployed algorithms relying on artificial intelligence, network diffusion, and network proximity, tasking each of them to rank 6,340 drugs for their expected efficacy against SARS-CoV-2. To test the predictions, we used as ground truth 918 drugs that had been experimentally screened in VeroE6 cells, and the list of drugs under clinical trial, that capture the medical community's assessment of drugs with potential COVID-19 efficacy. We find that while most algorithms offer predictive power for these ground truth data, no single method offers consistently reliable outcomes across all datasets and metrics. This prompted us to develop a multimodal approach that fuses the predictions of all algorithms, showing that a consensus among the different predictive methods consistently exceeds the performance of the best individual pipelines. We find that 76 of the 77 drugs that successfully reduced viral infection do not bind the proteins targeted by SARS-CoV-2, indicating that these drugs rely on network-based actions that cannot be identified using docking-based strategies. These advances offer a methodological pathway to identify repurposable drugs for future pathogens and neglected diseases underserved by the costs and extended timeline of de novo drug development.
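The idea of fusing the rankings of several predictive pipelines into a consensus can be illustrated with a simple mean-rank aggregation. The abstract does not say which fusion rule the authors use, so mean rank is a stand-in chosen for clarity; the function name and the drug labels are hypothetical.

```python
from typing import Dict, List, Sequence


def consensus_ranking(rankings: Sequence[Sequence[str]]) -> List[str]:
    """Order items by their mean position across several rankings.

    Each ranking lists the same items, best first. Ties in mean position
    are broken alphabetically so the output is deterministic.
    """
    if not rankings:
        raise ValueError("at least one ranking is required")
    items = set(rankings[0])
    totals: Dict[str, int] = {item: 0 for item in items}
    for ranking in rankings:
        if set(ranking) != items or len(ranking) != len(items):
            raise ValueError("all rankings must contain the same items exactly once")
        for position, item in enumerate(ranking, start=1):
            totals[item] += position
    return sorted(items, key=lambda item: (totals[item] / len(rankings), item))


if __name__ == "__main__":
    pipelines = [
        ["drug_A", "drug_B", "drug_C"],  # e.g. an AI-based pipeline
        ["drug_B", "drug_A", "drug_C"],  # e.g. a diffusion-based pipeline
        ["drug_A", "drug_C", "drug_B"],  # e.g. a proximity-based pipeline
    ]
    print(consensus_ranking(pipelines))
```

An item ranked highly by most pipelines rises to the top even if one pipeline disagrees, which is the intuition behind preferring a consensus over any single method.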

翻訳日:2022-12-13 03:15:44 公開日:2020-08-09
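
The consensus idea in the entry above (fusing the rankings produced by several pipelines) can be illustrated with a small sketch. The abstract does not say which fusion rule was used, so the mean-rank (Borda-style) rule below is an assumption, and the pipeline and drug names are hypothetical placeholders.

```python
from statistics import mean


def consensus_ranking(rankings):
    """Fuse several ranked lists into one list ordered by mean rank.

    `rankings` maps a pipeline name to a list of drug names, best first.
    A drug missing from a pipeline's list receives that list's worst rank
    plus one (an assumption made for this sketch).
    """
    # Rank of every drug within every pipeline (1 = best).
    positions = {
        pipeline: {drug: rank for rank, drug in enumerate(ranked, start=1)}
        for pipeline, ranked in rankings.items()
    }
    drugs = sorted({drug for ranked in rankings.values() for drug in ranked})
    mean_rank = {
        drug: mean(
            positions[p].get(drug, len(rankings[p]) + 1) for p in rankings
        )
        for drug in drugs
    }
    # Lower mean rank is better; ties are broken alphabetically.
    return sorted(drugs, key=lambda d: (mean_rank[d], d))


# Hypothetical pipelines and drugs, for illustration only.
example = {
    "ai": ["drug_a", "drug_b", "drug_c"],
    "diffusion": ["drug_b", "drug_a", "drug_c"],
    "proximity": ["drug_b", "drug_c", "drug_a"],
}
fused = consensus_ranking(example)
```

Mean rank is only one of many possible fusion rules; it is used here because it needs no tuning and makes the effect of disagreement between pipelines easy to see.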

# 米国におけるcovid-19データレポジトリのキュレーションと郡レベルの死亡数予測

Curating a COVID-19 data repository and forecasting county-level death counts in the United States ( http://arxiv.org/abs/2005.07882v2 )

ライセンス: Link先を確認

Nick Altieri, Rebecca L. Barter, James Duncan, Raaz Dwivedi, Karl Kumbier, Xiao Li, Robert Netzorg, Briton Park, Chandan Singh, Yan Shuo Tan, Tiffany Tang, Yu Wang, Chao Zhang, Bin Yu

(参考訳) 新型コロナウイルス(COVID-19)の流行が進むにつれ、正確な予測は政策決定に極めて重要な役割を果たす。本稿では,さまざまな情報源からのCOVID-19情報を含む大規模データレポジトリの継続的なキュレーションについて述べる。このデータを用いて、米国内の郡レベルでの累積死亡数の短期的軌道の予測とそれに対応する予測区間を最大2週間先まで作成する。 2020年1月22日から6月20日までのデータを用いて、複数の予測をアンサンブル手法を用いて開発・統合し、Combined Linear and Exponential Predictors(CLEP)と呼ぶアンサンブルを作成する。個々の予測器には、郡固有の指数予測器と線形予測器、郡にまたがるデータをまとめる共有指数予測器、近隣の郡からのデータを利用する拡張共有指数予測器、人口統計に基づく共有指数予測器が含まれる。過去5日間の予測誤差を用いて死亡予測の不確実性を評価し、一般に適用可能な予測区間である最大(絶対)誤差予測区間(MEPI)を得た。 MEPIは、2週間先の累積死亡数の予測において、郡全体で平均して94%以上の被覆率を達成している。我々の予測は現在、非営利組織であるResponse4Lifeが個々の病院の医療用品の必要量を判断するために使用しており、全国の医療用品の流通に直接貢献している。 https://covidseverity.com の予測とデータリポジトリが、必要な郡固有の意思決定を導き、各郡が新型コロナウイルスとの闘いを続ける準備に役立つことを願っている。

As the COVID-19 outbreak evolves, accurate forecasting continues to play an extremely important role in informing policy decisions. In this paper, we present our continuous curation of a large data repository containing COVID-19 information from a range of sources. We use this data to develop predictions and corresponding prediction intervals for the short-term trajectory of COVID-19 cumulative death counts at the county-level in the United States up to two weeks ahead. Using data from January 22 to June 20, 2020, we develop and combine multiple forecasts using ensembling techniques, resulting in an ensemble we refer to as Combined Linear and Exponential Predictors (CLEP). Our individual predictors include county-specific exponential and linear predictors, a shared exponential predictor that pools data together across counties, an expanded shared exponential predictor that uses data from neighboring counties, and a demographics-based shared exponential predictor. We use prediction errors from the past five days to assess the uncertainty of our death predictions, resulting in generally-applicable prediction intervals, Maximum (absolute) Error Prediction Intervals (MEPI). MEPI achieves a coverage rate of more than 94% when averaged across counties for predicting cumulative recorded death counts two weeks in the future. Our forecasts are currently being used by the non-profit organization, Response4Life, to determine the medical supply need for individual hospitals and have directly contributed to the distribution of medical supplies across the country. We hope that our forecasts and data repository at https://covidseverity.com can help guide necessary county-specific decision-making and help counties prepare for their continued fight against COVID-19.

翻訳日:2022-12-02 14:00:50 公開日:2020-08-09
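
The MEPI idea described above (using the largest recent prediction error to size an interval) can be sketched as follows. The abstract only says that errors from the past five days are used; normalizing each error by the prediction and clipping the lower bound at zero are assumptions of this sketch, and the function name `mepi` is hypothetical.

```python
def mepi(recent_actual, recent_predicted, forecast):
    """Return a (lower, upper) interval around `forecast`.

    The half-width is the forecast scaled by the largest relative absolute
    error observed in the recent window (for example, the past five days).
    Predictions in the window are assumed to be strictly positive.
    """
    if not recent_actual or len(recent_actual) != len(recent_predicted):
        raise ValueError("need two non-empty windows of equal length")
    if any(p <= 0 for p in recent_predicted):
        raise ValueError("recent predictions must be positive")
    max_rel_error = max(
        abs(a - p) / p for a, p in zip(recent_actual, recent_predicted)
    )
    lower = forecast * (1 - max_rel_error)
    upper = forecast * (1 + max_rel_error)
    # Cumulative death counts cannot be negative.
    return max(lower, 0.0), upper


# Hypothetical five-day window of recorded vs. predicted cumulative deaths.
actual = [100, 110, 120, 130, 140]
predicted = [100, 100, 125, 130, 133]
low, high = mepi(actual, predicted, forecast=200)
```

Here the largest relative error in the window is 10%, so the interval is roughly 200 plus or minus 20.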

# 新型コロナウイルス(covid-19)抗パンデミック対策のインパクトスタディ--区画モデルと機械学習

Impact studies of nationwide measures COVID-19 anti-pandemic: compartmental model and machine learning ( http://arxiv.org/abs/2005.08395v2 )

ライセンス: Link先を確認

Mouhamadou A.M.T. Balde, Coura Balde, Babacar M. Ndiaye

(参考訳) 本稿では,全国的な新型コロナウイルス対策のパンデミック対策の効果について検討する。対策を考えると、covid-19のデータを分析するプロセスが2つある。我々は,全国的な尺度のレベルを,モデルの接触率に関連するパラメータの値と関連付ける。次に、パラメトリック・ソルバーは、これらの指標に関して、パンデミックの進化の異なる可能性を示している。パンデミックの進化を予測するために、2つの機械学習ツールが使用される。最後に、決定論的ツールと2つの機械学習ツールの比較を示す。

In this paper, we study the impact of nationwide anti-pandemic measures against COVID-19. We carry out two processes to analyze COVID-19 data in light of these measures. We associate the level of a nationwide measure with the values of the model parameters related to the contact rate. A parametric solve with respect to those parameters then shows different possible evolutions of the pandemic. Two machine learning tools are used to forecast the evolution of the pandemic. Finally, we compare the deterministic tool with the two machine learning tools.

翻訳日:2022-12-02 06:02:10 公開日:2020-08-09
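
To make the link between measure level and contact rate concrete, here is a minimal sketch. The abstract does not name the compartmental model, so a basic SIR model is assumed, and the mapping from measure level to contact rate `beta`, the recovery rate, and the population size are all hypothetical values.

```python
def simulate_sir(beta, gamma=0.1, population=1_000_000, infected0=10,
                 days=200, dt=0.1):
    """Integrate a basic SIR model with forward Euler steps.

    beta  : contact (transmission) rate per day
    gamma : recovery rate per day
    Returns the peak number of simultaneously infected people.
    """
    s = float(population - infected0)
    i = float(infected0)
    peak = i
    for _ in range(round(days / dt)):
        new_infections = beta * s * i / population * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        peak = max(peak, i)
    return peak


# Hypothetical mapping from the level of nationwide measures to contact rate.
measure_levels = {"none": 0.30, "moderate": 0.20, "strict": 0.12}
peaks = {level: simulate_sir(beta) for level, beta in measure_levels.items()}
```

Sweeping `beta` over several values is a simple stand-in for the parametric solve mentioned above: stricter measures correspond to a lower contact rate and a lower epidemic peak.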

# エンド・ツー・エンド注意によるカクテルパーティでの話者識別

Identify Speakers in Cocktail Parties with End-to-End Attention ( http://arxiv.org/abs/2005.11408v2 )

ライセンス: Link先を確認

Junzhe Zhu, Mark Hasegawa-Johnson, Leda Sari

(参考訳) 複数の話者が同時に話すシナリオでは、話者を正確に識別できることが重要である。本稿では、音源抽出と話者識別を統合したエンドツーエンドシステムを提示し、チャネル次元に沿って話者予測を最大プーリングすることで、これら2つの部分を同時に最適化する新しい方法を提案する。残差注意により、話者識別のために最適化されたスペクトログラムマスクを学習でき、残差フォワード接続は、十分に大きなコンテキストウインドウによる拡張畳み込みを可能にし、音節境界を越えた正しいストリーミングを保証する。エンドツーエンドトレーニングの結果、2話者の放送音声混合において1人の話者を99.9%の精度で、両方の話者を93.9%の精度で認識し、3話者シナリオでは全ての話者を81.2%の精度で認識するシステムが得られる。

In scenarios where multiple speakers talk at the same time, it is important to be able to identify the talkers accurately. This paper presents an end-to-end system that integrates speech source extraction and speaker identification, and proposes a new way to jointly optimize these two parts by max-pooling the speaker predictions along the channel dimension. Residual attention permits us to learn spectrogram masks that are optimized for the purpose of speaker identification, while residual forward connections permit dilated convolution with a sufficiently large context window to guarantee correct streaming across syllable boundaries. End-to-end training results in a system that recognizes one speaker in a two-speaker broadcast speech mixture with 99.9% accuracy and both speakers with 93.9% accuracy, and that recognizes all speakers in three-speaker scenarios with 81.2% accuracy.

翻訳日:2022-11-30 09:50:47 公開日:2020-08-09
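
The max-pooling step mentioned above can be sketched in a few lines. This assumes that each extracted channel yields one score per candidate speaker; the score values, the threshold, and the function names are hypothetical, and the real system operates on network outputs rather than hand-written lists.

```python
def pool_speaker_predictions(channel_scores):
    """Max-pool per-speaker scores along the channel dimension.

    channel_scores[c][k] is the score that extracted channel c assigns to
    speaker k. The result holds one pooled score per speaker.
    """
    if not channel_scores:
        raise ValueError("need at least one channel")
    num_speakers = len(channel_scores[0])
    return [
        max(channel[k] for channel in channel_scores)
        for k in range(num_speakers)
    ]


def active_speakers(channel_scores, threshold=0.5):
    """Return indices of speakers whose pooled score reaches the threshold."""
    pooled = pool_speaker_predictions(channel_scores)
    return [k for k, score in enumerate(pooled) if score >= threshold]


# Two extracted channels, four candidate speakers (hypothetical scores).
scores = [
    [0.90, 0.10, 0.05, 0.20],
    [0.10, 0.05, 0.80, 0.30],
]
pooled = pool_speaker_predictions(scores)
present = active_speakers(scores)
```

Pooling over channels makes the result independent of which channel a speaker ends up in, which is what allows extraction and identification to be trained with a single loss.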

# 識別的特徴アライメント:ガウス誘導潜在アライメントによる教師なし領域適応の伝達性の改善

Discriminative Feature Alignment: Improving Transferability of Unsupervised Domain Adaptation by Gaussian-guided Latent Alignment ( http://arxiv.org/abs/2006.12770v5 )

ライセンス: Link先を確認

Jing Wang, Jiahong Chen, Jianzhe Lin, Leonid Sigal, and Clarence W. de Silva

(参考訳) 本研究では,ラベル付きデータドメインから近似推論モデルを学習し,ラベルなしデータドメインへの一般化を期待する教師なし領域適応問題に焦点を当てた。教師なしドメイン適応の成功は、主にクロスドメインの特徴アライメントに依存している。従来の研究は、分類器によって引き起こされる相違により、潜在特徴を直接整列させようと試みてきた。それでも、特に大きなドメインギャップが存在する場合、この直接的特徴アライメントを通じて共通の特徴空間を常に学べるとは限らない。この問題を解決するために,ガウス誘導型潜在アライメント手法を導入し,事前分布の誘導の下で2つの領域の潜在特徴分布を整列させる。このような間接的な方法では、2つの領域からのサンプル上の分布は共通の特徴空間、すなわち事前分布の空間上に構築され、より優れた特徴アライメントが促進される。対象の潜在分布をこの事前分布に効果的に整合させるため,エンコーダデコーダの定式化を生かして,新しい非対(unpaired)L1距離も提案する。 9つのベンチマークデータセットの広範な評価は、最先端の手法を上回ることで優れた知識伝達可能性を、既存の研究を大幅に改善することで提案手法の汎用性を検証する。

In this study, we focus on the unsupervised domain adaptation problem where an approximate inference model is to be learned from a labeled data domain and expected to generalize well to an unlabeled data domain. The success of unsupervised domain adaptation largely relies on the cross-domain feature alignment. Previous work has attempted to directly align latent features by the classifier-induced discrepancies. Nevertheless, a common feature space cannot always be learned via this direct feature alignment especially when a large domain gap exists. To solve this problem, we introduce a Gaussian-guided latent alignment approach to align the latent feature distributions of the two domains under the guidance of the prior distribution. In such an indirect way, the distributions over the samples from the two domains will be constructed on a common feature space, i.e., the space of the prior, which promotes better feature alignment. To effectively align the target latent distribution with this prior distribution, we also propose a novel unpaired L1-distance by taking advantage of the formulation of the encoder-decoder. The extensive evaluations on nine benchmark datasets validate the superior knowledge transferability through outperforming state-of-the-art methods and the versatility of the proposed method by improving the existing work significantly.

翻訳日:2022-11-17 23:09:25 公開日:2020-08-09

# 空間指数構造におけるハンズオフモデルの統合

Hands-off Model Integration in Spatial Index Structures ( http://arxiv.org/abs/2006.16411v2 )

ライセンス: Link先を確認

Ali Hadian, Ankit Kumar, Thomas Heinis

(参考訳) 空間インデックスは、例えばIoTアプリケーションを通じて生成された空間データの増加量を分析するために不可欠である。近年開発されたインデックスの多さは、主にディスクに最適化されている。しかし、コモディティマシン上でもメモリ量が増加すると、メインメモリに移すことが選択肢となる。そうすることで、メインメモリにのみ対応可能な追加最適化を使用する機会が開かれる。本稿では,軽量機械学習モデルを用いて空間インデックスのクエリを高速化する機会について検討する。我々は、最も広く使われている空間指標であるR木に補間や同様の手法を用いる可能性を探究する。実験分析で示したように、クエリの実行時間は最大60%削減でき、同時にインデックスのメモリフットプリントを90%以上縮小できる。

Spatial indexes are crucial for the analysis of the increasing amounts of spatial data, for example generated through IoT applications. The plethora of indexes that has been developed in recent decades has primarily been optimised for disk. With increasing amounts of memory even on commodity machines, however, moving them to main memory is an option. Doing so opens up the opportunity to use additional optimizations that are only amenable to main memory. In this paper we thus explore the opportunity to use light-weight machine learning models to accelerate queries on spatial indexes. We do so by exploring the potential of using interpolation and similar techniques on the R-tree, arguably the most broadly used spatial index. As we show in our experimental analysis, the query execution time can be reduced by up to 60% while simultaneously shrinking the index's memory footprint by over 90%.

翻訳日:2022-11-15 15:25:51 公開日:2020-08-09
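
The abstract above mentions interpolation as a light-weight model for speeding up lookups. The sketch below is a one-dimensional illustration of that idea only: it predicts a key's position in a sorted list by linear interpolation and corrects the guess with a bounded local search. It is not the paper's R-tree integration, and the function names are hypothetical.

```python
from bisect import bisect_left


def predict_position(keys, key):
    """Guess the index of `key` in sorted `keys` by linear interpolation."""
    low, high = keys[0], keys[-1]
    if high == low:
        return 0
    fraction = (key - low) / (high - low)
    fraction = min(max(fraction, 0.0), 1.0)
    return int(fraction * (len(keys) - 1))


def lookup(keys, key):
    """Return an index of `key` in sorted `keys`, or -1 if it is absent."""
    if not keys:
        return -1
    guess = predict_position(keys, key)
    lo = hi = guess
    # Widen the window exponentially until it brackets the key.
    step = 1
    while lo > 0 and keys[lo] > key:
        lo = max(0, lo - step)
        step *= 2
    step = 1
    while hi < len(keys) - 1 and keys[hi] < key:
        hi = min(len(keys) - 1, hi + step)
        step *= 2
    # Binary search only inside the bracketing window.
    index = bisect_left(keys, key, lo, hi + 1)
    return index if index < len(keys) and keys[index] == key else -1


sorted_keys = [1, 3, 4, 10, 50, 51, 52, 100]
```

When keys are close to uniformly distributed the first guess lands near the target and the correction step touches only a few entries; skewed data widens the window, which is why such models usually carry an error bound.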

# 準周期顕微鏡画像の学習に基づく欠陥認識

Learning-based Defect Recognition for Quasi-Periodic Microscope Images ( http://arxiv.org/abs/2007.01309v2 )

ライセンス: Link先を確認

Nik Dennler, Antonio Foncubierta-Rodriguez, Titus Neupert, Marilyne Sousa

(参考訳) 結晶材料欠陥の制御は、デバイスの最終性能に有害または有益である材料の特性に影響を与えるため、非常に重要である。サブナノメータスケールの欠陥解析は高分解能(走査型)透過電子顕微鏡[HR(S)TEM]によって実現され、人間の専門知識に基づいて欠陥の同定が行われている。しかし、プロセスは退屈で非常に時間がかかり、時には曖昧な結果をもたらす。本稿では,原子分解能顕微鏡画像からの格子欠陥検出を支援する半教師付き機械学習手法を提案する。画像パッチを欠陥または非欠陥と分類する畳み込みニューラルネットワーク、モデルとして1つの非破壊パッチを選択するグラフベースのヒューリスティック、最終的に自動的に生成される畳み込みフィルタバンク、スタック障害、ツイン欠陥、粒界などの対称性の破れを強調する。さらに,アモルファス領域とビーム欠陥を分割するための分散フィルタを提案する。このアルゴリズムは、III-V/Si結晶材料上でテストされ、異なるメトリクスに対して評価し、非常に小さなトレーニングデータセットであっても有望な結果を示す。データ駆動型分類の一般性,頑健性,深層学習の速度を,画像フィルタの有効性と組み合わせることで,結晶材料の将来のHR(S)TEM解析を効率化できる,マイクロスコピストコミュニティに貴重なオープンソースツールを提供する。

Controlling crystalline material defects is crucial, as they affect properties of the material that may be detrimental or beneficial for the final performance of a device. Defect analysis on the sub-nanometer scale is enabled by high-resolution (scanning) transmission electron microscopy [HR(S)TEM], where the identification of defects is currently carried out based on human expertise. However, the process is tedious, highly time consuming and, in some cases, yields ambiguous results. Here we propose a semi-supervised machine learning method that assists in the detection of lattice defects from atomic resolution microscope images. It involves a convolutional neural network that classifies image patches as defective or non-defective, a graph-based heuristic that chooses one non-defective patch as a model, and finally an automatically generated convolutional filter bank, which highlights symmetry breaking such as stacking faults, twin defects and grain boundaries. Additionally, we suggest a variance filter to segment amorphous regions and beam defects. The algorithm is tested on III-V/Si crystalline materials and successfully evaluated against different metrics, showing promising results even for extremely small training data sets. By combining the data-driven classification generality, robustness and speed of deep learning with the effectiveness of image filters in segmenting faulty symmetry arrangements, we provide a valuable open-source tool to the microscopist community that can streamline future HR(S)TEM analyses of crystalline materials.

翻訳日:2022-11-14 14:46:42 公開日:2020-08-09
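
The variance filter mentioned above can be sketched as a sliding-window local variance. The window size, the border handling, and the toy image are assumptions of this sketch; the abstract does not give the filter's parameters or say how the resulting map is thresholded.

```python
def variance_filter(image, radius=1):
    """Return the local variance of `image` in a (2*radius+1)^2 window.

    `image` is a non-empty list of equal-length rows of numbers. Windows
    are clipped at the borders rather than padded.
    """
    rows, cols = len(image), len(image[0])
    result = [[0.0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            window = [
                image[j][i]
                for j in range(max(0, y - radius), min(rows, y + radius + 1))
                for i in range(max(0, x - radius), min(cols, x + radius + 1))
            ]
            mean_value = sum(window) / len(window)
            result[y][x] = sum(
                (value - mean_value) ** 2 for value in window
            ) / len(window)
    return result


# Toy image: flat intensity on the left, alternating contrast on the right.
toy = [
    [5, 5, 5] + [10 if (x + y) % 2 else 0 for x in range(3, 6)]
    for y in range(4)
]
variance_map = variance_filter(toy)
```

Thresholding `variance_map` splits the image into low-contrast and high-contrast regions; which side corresponds to amorphous material depends on the imaging conditions and is left open here.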

# 点群ノイズ除去のための微分可能な多様体再構成

Differentiable Manifold Reconstruction for Point Cloud Denoising ( http://arxiv.org/abs/2007.13551v2 )

ライセンス: Link先を確認

Shitong Luo, Wei Hu

(参考訳) 3次元点群は、取得装置の固有の制限のためにノイズによって乱されることが多く、表面再構成やレンダリングなどの下流のタスクを妨げる。従来の研究は、主に基礎となる表面からのノイズ点の変位を推測するが、表面を明示的に回復するようには設計されていないため、準最適なノイズ除去結果につながる可能性がある。そこで本論文では,わずかなノイズ摂動を持つ微分可能にサブサンプリングされた点とその近傍の埋め込み特徴から、ノイズを含む点群の基礎となる多様体を学習し,点群内の固有構造を捉えることを目的とする。具体的には,オートエンコーダライクなニューラルネットワークを提案する。エンコーダは各点の局所的特徴表現と非局所的特徴表現の両方を学習し、適応的微分可能プーリング操作を介して低ノイズの点をサンプリングする。その後、デコーダは、各サンプル点をその近傍の埋め込み特徴と共に、その点を中心とする局所曲面に変換することにより、基礎となる多様体を推定する。再構成多様体上の再サンプリングにより、ノイズ除去された点群が得られる。さらに、教師なしのトレーニング損失を設計し、教師なしまたは教師ありの方法でネットワークをトレーニングできるようにする。実験により、提案手法は合成雑音と実環境雑音の両方において、最先端のノイズ除去法を著しく上回ることを示す。コードとデータは https://github.com/luost26/DMRDenoise で入手できる。

3D point clouds are often perturbed by noise due to the inherent limitation of acquisition equipment, which obstructs downstream tasks such as surface reconstruction, rendering and so on. Previous works mostly infer the displacement of noisy points from the underlying surface, but they are not designed to recover the surface explicitly and may lead to sub-optimal denoising results. To this end, we propose to learn the underlying manifold of a noisy point cloud from differentiably subsampled points with trivial noise perturbation and their embedded neighborhood feature, aiming to capture intrinsic structures in point clouds. Specifically, we present an autoencoder-like neural network. The encoder learns both local and non-local feature representations of each point, and then samples points with low noise via an adaptive differentiable pooling operation. Afterwards, the decoder infers the underlying manifold by transforming each sampled point along with the embedded feature of its neighborhood to a local surface centered around the point. By resampling on the reconstructed manifold, we obtain a denoised point cloud. Further, we design an unsupervised training loss, so that our network can be trained in either an unsupervised or supervised fashion. Experiments show that our method significantly outperforms state-of-the-art denoising methods under both synthetic noise and real world noise. The code and data are available at https://github.com/luost26/DMRDenoise

翻訳日:2022-11-06 08:46:06 公開日:2020-08-09

# 自己監督学習による低解像度画像からの3次元人物形状と姿勢

3D Human Shape and Pose from a Single Low-Resolution Image with Self-Supervised Learning ( http://arxiv.org/abs/2007.13666v2 )

ライセンス: Link先を確認

Xiangyu Xu, Hao Chen, Francesc Moreno-Noguer, Laszlo A. Jeni, Fernando De la Torre

(参考訳) 単眼画像からの3次元人体形状と姿勢の推定はコンピュータビジョンにおける活発な研究領域であり、行動認識から仮想アバターの作成に至るまで、新しいアプリケーションの開発に大きな影響を与えている。既存の3次元人体形状と姿勢推定の深層学習手法は比較的高解像度な入力画像に依存しているが、ビデオ監視やスポーツ放送といったいくつかの現実的なシナリオでは高解像度の視覚コンテンツが必ずしも利用できない。実際のシナリオにおける低解像度の画像はサイズが幅広く異なり、1つの解像度で訓練されたモデルは、通常、解像度が変わると緩やかには劣化しない。低解像度入力の問題を解決するための2つの一般的なアプローチは、視覚的アーティファクトにつながる可能性のある超解像技術を入力画像に適用するか、あるいは単に解像度ごとに1つのモデルを訓練するかであるが、後者は多くの現実的な応用では非現実的である。上記の問題に対処するため,本研究では,解像度を考慮したネットワーク(Resolution-aware network)、自己教師あり損失(Self-supervision loss)、対照学習スキーム(Contrastive learning scheme)からなるRSC-Netという新しいアルゴリズムを提案する。提案したネットワークは、単一のモデルで異なる解像度にわたって3次元の身体形状と姿勢を学習できる。自己教師あり損失は出力のスケール一貫性を促進し、対照学習スキームは深い特徴のスケール一貫性を強制する。これら2つの新たなトレーニング損失は、弱教師ありで3次元形状と姿勢を学習する際に頑健性をもたらすことを示す。広範な実験により、RSC-Netは困難な低解像度画像に対して最先端の手法よりも一貫して優れた結果が得られることを示す。

3D human shape and pose estimation from monocular images has been an active area of research in computer vision, having a substantial impact on the development of new applications, from activity recognition to creating virtual avatars. Existing deep learning methods for 3D human shape and pose estimation rely on relatively high-resolution input images; however, high-resolution visual content is not always available in several practical scenarios such as video surveillance and sports broadcasting. Low-resolution images in real scenarios can vary in a wide range of sizes, and a model trained in one resolution does not typically degrade gracefully across resolutions. Two common approaches to solve the problem of low-resolution input are applying super-resolution techniques to the input images which may result in visual artifacts, or simply training one model for each resolution, which is impractical in many realistic applications. To address the above issues, this paper proposes a novel algorithm called RSC-Net, which consists of a Resolution-aware network, a Self-supervision loss, and a Contrastive learning scheme. The proposed network is able to learn the 3D body shape and pose across different resolutions with a single model. The self-supervision loss encourages scale-consistency of the output, and the contrastive learning scheme enforces scale-consistency of the deep features. We show that both these new training losses provide robustness when learning 3D shape and pose in a weakly-supervised manner. Extensive experiments demonstrate that the RSC-Net can achieve consistently better results than the state-of-the-art methods for challenging low-resolution images.

翻訳日:2022-11-06 08:19:06 公開日:2020-08-09

# より公平なバイナリサブマトリクス検出のための個人バイアスの除去

Denoising individual bias for a fairer binary submatrix detection ( http://arxiv.org/abs/2007.15816v2 )

ライセンス: Link先を確認

Changlin Wan, Wennan Chang, Tong Zhao, Sha Cao, Chi Zhang

(参考訳) バイナリマトリクスの低階表現は、スパース個人属性関係の分離において強力であり、広く応用されている。既存のbmf(binary matrix factorization)またはcc(co-clustering)メソッドは背景雑音を仮定することが多い。しかし、この仮定は実データでは容易に破られ、二進法の異質な行や列の確率は異なる要素の背景分布をもたらし、既存の方法の合理性を麻痺させる。本稿では,パターンの行または列単位での混合分布と異なる背景を推定し,背景からより可能性の高いバイナリ属性を除去し,真のパターンの検出を最適化するbindという2値化フレームワークを提案する。 BINDは行と列の混合分布の完全な数学的性質によって支えられている。 BINDは背景雑音を効果的に除去し,最先端のBMF法とCC法の妥当性と精度を大幅に向上させることを示した。

Low rank representation of a binary matrix is powerful in disentangling sparse individual-attribute associations, and has been widely applied. Existing binary matrix factorization (BMF) or co-clustering (CC) methods often assume i.i.d. background noise. However, this assumption could be easily violated in real data, where heterogeneous row- or column-wise probability of binary entries results in disparate element-wise background distribution, and undermines the validity of existing methods. We propose a binary data denoising framework, namely BIND, which optimizes the detection of true patterns by estimating the row- or column-wise mixture distribution of patterns and disparate background, and eliminating the binary attributes that are more likely from the background. BIND is supported by thoroughly derived mathematical properties of the row- and column-wise mixture distributions. Our experiments on synthetic and real-world data demonstrate that BIND effectively removes background noise and drastically increases the fairness and accuracy of state-of-the-art BMF and CC methods.

翻訳日:2022-11-04 05:53:36 公開日:2020-08-09

# CNNを用いたウェイクワード開始端の高精度検出

Accurate Detection of Wake Word Start and End Using a CNN ( http://arxiv.org/abs/2008.03790v1 )

ライセンス: Link先を確認

Christin Jose, Yuriy Mishchenko, Thibaud Senechal, Anish Shah, Alex Escott, Shiv Vitaladevuni

(参考訳) 小さなフットプリントの組み込みデバイスは、音声アシスタントを実現するために、モデルサイズと検出遅延の小さいキーワードスポッター(KWS)を必要とする。このようなキーワードは、音声アシスタント対応デバイスを起動するために使われるため、しばしばウェイクワード(wake word)と呼ばれる。ウェイクワード検出と合わせて、ウェイクワードのエンドポイント(開始と終了)の正確な推定はKWSの重要なタスクである。本稿では,単一段階の単語レベルニューラルネットワークを用いるニューラルKWSにおいて、ウェイクワードのエンドポイントを検出する2つの新しい手法を提案する。提案手法は、人手によるアノテーションに対して標準誤差50msec以内という高い精度でウェイクワードのエンドポイントを検出でき、従来の音響モデルとHMM強制アライメントの組み合わせと同等であることを示す。我々の知る限り、これは単一段階のニューラルKWSに対するウェイクワードのエンドポイント検出法の最初の研究である。

Small footprint embedded devices require keyword spotters (KWS) with small model size and detection latency for enabling voice assistants. Such a keyword is often referred to as a "wake word", as it is used to wake up voice assistant enabled devices. Together with wake word detection, accurate estimation of wake word endpoints (start and end) is an important task of KWS. In this paper, we propose two new methods for detecting the endpoints of wake words in neural KWS that use single-stage word-level neural networks. Our results show that the new techniques give superior accuracy for detecting wake words' endpoints of up to 50 msec standard error versus human annotations, on par with the conventional Acoustic Model plus HMM forced alignment. To our knowledge, this is the first study of wake word endpoints detection methods for single-stage neural KWS.

翻訳日:2022-11-01 04:57:39 公開日:2020-08-09

# 話者条件波RNN:未知話者と記録条件のためのユニバーサルニューラルボコーダを目指して

Speaker Conditional WaveRNN: Towards Universal Neural Vocoder for Unseen Speaker and Recording Conditions ( http://arxiv.org/abs/2008.05289v1 )

ライセンス: Link先を確認

Dipjyoti Paul, Yannis Pantazis, Yannis Stylianou

(参考訳) ディープラーニングの最近の進歩は、単一話者音声合成における人間レベルのパフォーマンスにつながった。しかし、これらのシステムを複数話者モデルに一般化する際には、特に未知の話者や未知の収録品質に対して、音声品質の面でまだ制限がある。例えば、従来のニューラルボコーダはトレーニング話者に合わせて調整されており、未知の話者への一般化能力が不足している。本研究では,話者条件付きWaveRNN(SC-WaveRNN)と呼ばれるWaveRNNの変種を提案する。我々は,未知の話者や収録条件であっても機能する、効率的なユニバーサルボコーダの開発を目指している。標準のWaveRNNとは対照的に、SC-WaveRNNは話者埋め込みという形で与えられる追加情報を利用する。 SC-WaveRNNは、公開データをトレーニングに使用して、主観的および客観的なメトリクスの両方でベースラインのWaveRNNよりも大幅に優れた性能を達成する。 MOSでは、SC-WaveRNNは、既知の話者・既知の収録条件で約23%、未知の話者・未知の条件で最大95%の改善を実現している。最後に,ゼロショット話者適応に類似したマルチ話者テキスト音声合成(TTS)を実装して本研究を拡張する。性能面では、我々のシステムはベースラインのTTSシステムよりも、既知の話者では60%対15.5%、未知の話者では60.9%対32.6%で好まれている。

Recent advancements in deep learning led to human-level performance in single-speaker speech synthesis. However, there are still limitations in terms of speech quality when generalizing those systems into multiple-speaker models especially for unseen speakers and unseen recording qualities. For instance, conventional neural vocoders are adjusted to the training speaker and have poor generalization capabilities to unseen speakers. In this work, we propose a variant of WaveRNN, referred to as speaker conditional WaveRNN (SC-WaveRNN). We target towards the development of an efficient universal vocoder even for unseen speakers and recording conditions. In contrast to standard WaveRNN, SC-WaveRNN exploits additional information given in the form of speaker embeddings. Using publicly-available data for training, SC-WaveRNN achieves significantly better performance over baseline WaveRNN on both subjective and objective metrics. In MOS, SC-WaveRNN achieves an improvement of about 23% for seen speaker and seen recording condition and up to 95% for unseen speaker and unseen condition. Finally, we extend our work by implementing a multi-speaker text-to-speech (TTS) synthesis similar to zero-shot speaker adaptation. In terms of performance, our system has been preferred over the baseline TTS system by 60% over 15.5% and by 60.9% over 32.6%, for seen and unseen speakers, respectively.

翻訳日:2022-11-01 04:57:27 公開日:2020-08-09

# 新型コロナウイルスのトレンド予測のためのディープラーニングアプローチ

A Deep Learning Approach for COVID-19 Trend Prediction ( http://arxiv.org/abs/2008.05644v1 )

ライセンス: Link先を確認

Tong Yang, Long Sha, Justin Li, Pengyu Hong

(参考訳) 本研究では,米国におけるSARS-CoV-2の感染拡大傾向を予測するためのディープラーニングモデルに基づくアプローチを開発した。我々は,米国の確定症例数と州の人口統計データを用いて設計したモデルを実装し,有望な傾向予測結果を得た。このモデルは、Gated Recurrent Unit構造を通して、人口統計情報と流行時系列データを組み込む。最後に、支配的な人口統計要因を特定する。

In this work, we developed a deep learning model-based approach to forecast the spreading trend of SARS-CoV-2 in the United States. We implemented the designed model using confirmed-case counts and state demographic data from the United States and achieved promising trend prediction results. The model incorporates demographic information and epidemic time-series data through a Gated Recurrent Unit structure. Finally, we identify the dominant demographic factors.

翻訳日:2022-11-01 04:57:07 公開日:2020-08-09

# MODEL: リンク予測のためのモチーフベースディープ特徴学習

MODEL: Motif-based Deep Feature Learning for Link Prediction ( http://arxiv.org/abs/2008.03637v1 )

ライセンス: Link先を確認

Lei Wang, Jing Ren, Bo Xu, Jianxin Li, Wei Luo, Feng Xia

(参考訳) リンク予測はネットワーク分析やアプリケーションにおいて重要な役割を果たす。近年、リンク予測のアプローチは、従来の類似性に基づくアルゴリズムから埋め込みに基づくアルゴリズムへと進化している。しかし、既存のアプローチの多くは、現実世界のネットワークがランダムなネットワークとは異なるという事実をうまく利用できない。特に、現実世界のネットワークにはモチーフ、基盤となるネットワーク生成プロセスを反映した自然なネットワーク構築ブロックが含まれていることが知られている。本稿では,ネットワーク内の高次構造をキャプチャするために,ネットワークモチーフを組み込んだ新しい埋め込みアルゴリズムを提案する。リンク予測の有効性を評価するために,ソーシャルネットワーク,生物ネットワーク,学術ネットワークの3種類のネットワークを用いて実験を行った。その結果,本アルゴリズムは従来の類似性に基づくアルゴリズムを20%,最先端の埋め込みベースアルゴリズムを19%上回った。

Link prediction plays an important role in network analysis and applications. Recently, approaches for link prediction have evolved from traditional similarity-based algorithms into embedding-based algorithms. However, most existing approaches fail to exploit the fact that real-world networks are different from random networks. In particular, real-world networks are known to contain motifs, natural network building blocks reflecting the underlying network-generating processes. In this paper, we propose a novel embedding algorithm that incorporates network motifs to capture higher-order structures in the network. To evaluate its effectiveness for link prediction, experiments were conducted on three types of networks: social networks, biological networks, and academic networks. The results demonstrate that our algorithm outperforms both the traditional similarity-based algorithms by 20% and the state-of-the-art embedding-based algorithms by 19%.

翻訳日:2022-11-01 04:53:55 公開日:2020-08-09

# ソーシャルネットワークにおける多変量関係集約学習

Multivariate Relations Aggregation Learning in Social Networks ( http://arxiv.org/abs/2008.03654v1 )

ライセンス: Link先を確認

Jin Xu, Shuo Yu, Ke Sun, Jing Ren, Ivan Lee, Shirui Pan, Feng Xia

(参考訳) 多変量関係は、生物ネットワーク、ソーシャルネットワーク、輸送ネットワーク、学術ネットワークなど、様々な種類のネットワークにおいて一般的である。三次閉鎖の原則とグループ形成の傾向により、ソーシャルネットワークにおける多変量関係は複雑で豊かなものである。したがって、ソーシャルネットワークのグラフ学習タスクでは、多変量関係情報の同定と活用がより重要である。既存のグラフ学習手法は近隣情報拡散機構に基づいており、これは多くの場合、部分的欠落や多変量関係情報の欠如を招き、最終的にタスクの正確性と実行効率に影響を及ぼす。これらの課題に対処するために,ネットワーク環境における多変量関係情報を効果的に把握できる多変量関係集約学習法(MORE)を提案する。 node属性と構造特徴を集約することで、より精度が高く、コンバージェンス速度が速くなる。 1つの引用ネットワークと5つのソーシャルネットワークで実験を行った。実験の結果,MOREモデルはノード分類タスクにおけるGCN(Graph Convolutional Network)モデルよりも精度が高く,時間コストを大幅に削減できることがわかった。

Multivariate relations are general in various types of networks, such as biological networks, social networks, transportation networks, and academic networks. Due to the principle of ternary closures and the trend of group formation, the multivariate relationships in social networks are complex and rich. Therefore, in graph learning tasks of social networks, the identification and utilization of multivariate relationship information are more important. Existing graph learning methods are based on the neighborhood information diffusion mechanism, which often leads to partial omission or even lack of multivariate relationship information, and ultimately affects the accuracy and execution efficiency of the task. To address these challenges, this paper proposes the multivariate relationship aggregation learning (MORE) method, which can effectively capture the multivariate relationship information in the network environment. By aggregating node attribute features and structural features, MORE achieves higher accuracy and faster convergence speed. We conducted experiments on one citation network and five social networks. The experimental results show that the MORE model has higher accuracy than the GCN (Graph Convolutional Network) model in node classification tasks, and can significantly reduce time cost.

翻訳日:2022-11-01 04:53:44 公開日:2020-08-09

# DINE: ディープ不完全なネットワーク埋め込みのためのフレームワーク

DINE: A Framework for Deep Incomplete Network Embedding ( http://arxiv.org/abs/2008.06311v1 )

ライセンス: Link先を確認

Ke Hou, Jiaying Liu, Yin Peng, Bo Xu, Ivan Lee, Feng Xia

(参考訳) ネットワーク表現学習(NRL)は,ノード分類やリンク予測など,さまざまなタスクにおいて重要な役割を果たす。ネットワーク構造やノード属性に基づいて,ノードの低次元ベクトル表現を学習することを目的とする。完全ネットワークへの埋め込み技術は集中的に研究されてきたが、実世界のアプリケーションでは、完全ネットワークの収集は依然として難しい課題である。本稿では,このギャップを埋めるために,ディープ不完全ネットワーク埋め込み法,すなわちDINEを提案する。具体的には、期待最大化フレームワークを用いて、部分的に観測可能なネットワーク内のノードとエッジの両方を含む欠落部分を最初に完成する。組込み性能を向上させるため,ノード表現の学習にはネットワーク構造とノード属性の両方を考慮する。実験により,マルチラベル分類およびリンク予測タスクにおいて,DINEを3つのネットワーク上で評価する。その結果,最先端のベースラインと比較して提案手法の優位性を示した。

Network representation learning (NRL) plays a vital role in a variety of tasks such as node classification and link prediction. It aims to learn low-dimensional vector representations for nodes based on network structures or node attributes. While embedding techniques on complete networks have been intensively studied, in real-world applications, it is still a challenging task to collect complete networks. To bridge the gap, in this paper, we propose a Deep Incomplete Network Embedding method, namely DINE. Specifically, we first complete the missing part including both nodes and edges in a partially observable network by using the expectation-maximization framework. To improve the embedding performance, we consider both network structures and node attributes to learn node representations. Empirically, we evaluate DINE over three networks on multi-label classification and link prediction tasks. The results demonstrate the superiority of our proposed approach compared against state-of-the-art baselines.

翻訳日:2022-11-01 04:53:08 公開日:2020-08-09

# 野生における教師なし迷路補正とアニメーションのためのデュアルインペイントモデル

Dual In-painting Model for Unsupervised Gaze Correction and Animation in the Wild ( http://arxiv.org/abs/2008.03834v1 )

ライセンス: Link先を確認

Jichao Zhang, Jingjing Chen, Hao Tang, Wei Wang, Yan Yan, Enver Sangineto, Nicu Sebe

(参考訳) 本稿では,実環境(in the wild)における教師なし視線補正の問題に対処し,視線角と頭部姿勢の正確な注釈を必要としない解決法を提示する。私たちはCelebAGazeという新しいデータセットを作成しました。このデータセットは、目がカメラを見ているか別の方向を見ているかに対応する2つのドメイン X, Y で構成されています。本手法は,Gaze Correction Module (GCM), Gaze Animation Module (GAM), Pretrained Autoencoder Module (PAM)の3つの新しいモジュールから構成される。具体的には、GCMとGAMは、視線補正のためのドメイン$X$のデータと、視線アニメーションのためのドメイン$Y$のデータを使用して、デュアルインペインティングネットワークを別々に訓練する。また、GAMのトレーニングにおいて、眼領域から符号化された特徴と角度情報との相関を促すSynthesis-As-Training法を提案し、潜在空間での補間によって視線アニメーションを実現できるようにする。識別情報(目の形、虹彩の色など)をさらに保存するために,自己教師ありミラー学習に基づくオートエンコーダを用いたPAMを提案する。そのボトルネック特徴は角度不変であり,デュアルインペインティングモデルへの追加入力として機能する。広範な実験により,実環境における視線補正と視線アニメーションに対する提案手法の有効性を検証し,本手法が最先端のベースラインよりも説得力のある結果を得られることを実証した。私たちのコード、事前訓練されたモデル、補足資料は、https://github.com/zhangqianhui/GazeAnimation で公開されています。

In this paper we address the problem of unsupervised gaze correction in the wild, presenting a solution that works without the need for precise annotations of the gaze angle and the head pose. We have created a new dataset called CelebAGaze, which consists of two domains $X$, $Y$, where the eyes are either staring at the camera or somewhere else. Our method consists of three novel modules: the Gaze Correction module (GCM), the Gaze Animation module (GAM), and the Pretrained Autoencoder module (PAM). Specifically, GCM and GAM separately train a dual in-painting network using data from the domain $X$ for gaze correction and data from the domain $Y$ for gaze animation. Additionally, a Synthesis-As-Training method is proposed when training GAM to encourage the features encoded from the eye region to be correlated with the angle information, resulting in a gaze animation which can be achieved by interpolation in the latent space. To further preserve the identity information (e.g., eye shape, iris color), we propose the PAM with an Autoencoder, which is based on Self-Supervised mirror learning where the bottleneck features are angle-invariant and which works as an extra input to the dual in-painting models. Extensive experiments validate the effectiveness of the proposed method for gaze correction and gaze animation in the wild and demonstrate the superiority of our approach in producing more compelling results than state-of-the-art baselines. Our code, the pretrained models and the supplementary material are available at: https://github.com/zhangqianhui/GazeAnimation.

翻訳日:2022-11-01 04:52:57 公開日:2020-08-09

# ネットワーク侵入検出のための多段階最適化機械学習フレームワーク

Multi-Stage Optimized Machine Learning Framework for Network Intrusion Detection ( http://arxiv.org/abs/2008.03297v1 )

ライセンス: Link先を確認

MohammadNoor Injadat, Abdallah Moubayed, Ali Bou Nassif, Abdallah Shami

(参考訳) サイバーセキュリティは、インターネット上の個人や組織の依存度の増加と、オンライン活動のセキュリティとプライバシに関する懸念から、大きな注目を集めた。従来の機械学習(ML)ベースのネットワーク侵入検知システム(NIDS)は、悪意のあるオンライン行動から保護するために開発された。本稿では,その検出性能を維持しつつ計算複雑性を低減できる多段最適化mlベースのnidsフレームワークを提案する。本研究は,オーバーサンプリング手法がモデルのトレーニングサンプルサイズに与える影響を調査し,最小のトレーニングサンプルサイズを決定する。さらに、情報ゲインと相関に基づく2つの特徴選択手法を比較し、検出性能と時間複雑性への影響について検討する。さらに、NIDSの性能を高めるため、異なるMLハイパーパラメータ最適化手法について検討した。提案フレームワークの性能は、CICIDS 2017とUNSW-NB 2015データセットの2つの最近の侵入検知データセットを用いて評価される。実験の結果,提案モデルでは,必要なトレーニングサンプルサイズ (最大74%) と特徴セットサイズ (最大50%) を著しく削減できることがわかった。さらに、モデル性能はハイパーパラメータ最適化により向上し、両方のデータセットに対して99%以上の精度で検出精度が向上し、最近の文献の精度が1-2%、誤警報率が1-2%向上した。

Cyber-security has garnered significant attention due to the increased dependency of individuals and organizations on the Internet and their concern about the security and privacy of their online activities. Several machine learning (ML)-based network intrusion detection systems (NIDSs) have previously been developed to protect against malicious online behavior. This paper proposes a novel multi-stage optimized ML-based NIDS framework that reduces computational complexity while maintaining its detection performance. This work studies the impact of oversampling techniques on the models' training sample size and determines the minimal suitable training sample size. Furthermore, it compares two feature selection techniques, information gain and correlation-based, and explores their effect on detection performance and time complexity. Moreover, different ML hyper-parameter optimization techniques are investigated to enhance the NIDS's performance. The performance of the proposed framework is evaluated using two recent intrusion detection datasets, the CICIDS 2017 and the UNSW-NB 2015 datasets. Experimental results show that the proposed model significantly reduces the required training sample size (up to 74%) and feature set size (up to 50%). Moreover, the model performance is enhanced with hyper-parameter optimization with detection accuracies over 99% for both datasets, outperforming recent literature works by 1-2% higher accuracy and 1-2% lower false alarm rate.

翻訳日:2022-11-01 04:52:29 公開日:2020-08-09
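
The abstract above names information gain as one of the two feature selection criteria it compares. As a rough illustration only, the sketch below ranks two discrete features by information gain. The toy flows, the feature names `flag` and `port`, and the helper functions are hypothetical and are not taken from the paper or its datasets.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of discrete labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(feature_values, labels):
    """Reduction in label entropy after splitting on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Hypothetical toy flows: 1 = attack, 0 = benign.
labels = [1, 1, 0, 0]
flag_feature = ["syn", "syn", "ack", "ack"]  # separates the classes perfectly
port_feature = ["80", "443", "80", "443"]    # carries no class information

# Rank features from most to least informative.
ranked = sorted(
    {"flag": flag_feature, "port": port_feature}.items(),
    key=lambda item: information_gain(item[1], labels),
    reverse=True,
)
print([name for name, _ in ranked])
```

A selection step would then keep only the top-ranked features. Continuous features would first need to be discretized, which this sketch does not cover.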

# Big Networks: 調査

Big Networks: A Survey ( http://arxiv.org/abs/2008.03638v1 )

ライセンス: Link先を確認

Hayat Dino Bedru, Shuo Yu, Xinru Xiao, Da Zhang, Liangtian Wan, He Guo, Feng Xia

(参考訳) ネットワークは、ネットワークの構成要素間の相互作用のパターンが複雑である頂点とリンクの観点から複雑なシステムを表現する典型的な表現形式である。ネットワークは、時間とともに変化しない静的でもよいし、時間とともに進化する動的でもよい。ネットワーク解析の複雑さは,ネットワークサイズの爆発が増加する新しい状況下で異なる。本稿では,big networkという新たなネットワーク科学概念を提案する。大きなネットワークは通常、複雑で高次の内部構造を持つ大規模である。本稿では,大規模ネットワークの観点から,ネットワーク科学の分野における主要なトピックを考察するガイドラインフレームワークを提案する。まず,マイクロレベル,メソレベル,マクロレベルという3段階の大規模ネットワークの構造特性を紹介する。次に,大規模ネットワーク解析の最先端のトピックについて論じる。ランク付け手法や分割手法,ネットワーク埋め込みアルゴリズムなど,ネットワークモデルと関連するアプローチが体系的に導入されている。ビッグネットワークの典型的なアプリケーションは、コミュニティ検出、リンク予測、レコメンデーションなど、レビューされる。さらに、さらに調査すべき重要なオープンな問題についても指摘します。

A network is a typical expressive form of representing complex systems in terms of vertices and links, in which the pattern of interactions amongst components of the network is intricate. A network can be static, meaning it does not change over time, or dynamic, meaning it evolves over time. Network analysis faces different complications now that network sizes are growing explosively. In this paper, we introduce a new network science concept called the big network. Big networks are generally large-scale, with a complicated, higher-order inner structure. This paper proposes a guideline framework that gives an insight into the major topics in the area of network science from the viewpoint of a big network. We first introduce the structural characteristics of big networks at three levels: micro-level, meso-level, and macro-level. We then discuss some state-of-the-art advanced topics of big network analysis. Big network models and related approaches, including ranking methods, partition approaches, and network embedding algorithms, are systematically introduced. Some typical applications in big networks are then reviewed, such as community detection, link prediction, and recommendation. Moreover, we also pinpoint some critical open issues that need to be investigated further.

翻訳日:2022-11-01 04:52:08 公開日:2020-08-09

# Random Walks: アルゴリズムと応用のレビュー

Random Walks: A Review of Algorithms and Applications ( http://arxiv.org/abs/2008.03639v1 )

ライセンス: Link先を確認

Feng Xia, Jiaying Liu, Hansong Nie, Yonghao Fu, Liangtian Wan, Xiangjie Kong

(参考訳) ランダムウォークは、数学空間におけるランダムなステップの連続を含む経路を記述するランダムプロセスとして知られている。数学や計算機科学などの様々な分野で人気を集めている。さらに量子力学では、量子ウォークは古典的ランダムウォークの量子アナログと見なすことができる。古典的なランダムウォークと量子ウォークは、ノード間の近接を計算し、ネットワーク内のトポロジーを抽出するために使用できる。様々なランダムウォーク関連モデルは、リンク予測、レコメンデーション、コンピュータビジョン、半教師付き学習、ネットワーク埋め込みといった下流タスクに非常に重要である。本稿では,古典的ランダムウォークと量子ウォークの総合的なレビューを行う。まず,古典的ランダムウォークと量子ウォークの知識,基本的な概念と一般的なアルゴリズムについて概説する。また,時間複雑性の観点から,量子ウォークと古典ランダムウォークに基づくアルゴリズムを比較する。次に,その応用を計算機科学の分野に導入する。最後に、効率性、主記憶容量、既存アルゴリズムの計算時間の観点から、オープンな問題について議論する。この研究は、ランダムウォークと量子ウォークを一緒に探索することで、この成長する研究領域に寄与することを目的としている。

A random walk is a random process that describes a path consisting of a succession of random steps in a mathematical space. It has become increasingly popular in various disciplines such as mathematics and computer science. Furthermore, in quantum mechanics, quantum walks can be regarded as quantum analogues of classical random walks. Classical random walks and quantum walks can be used to calculate the proximity between nodes and extract the topology in the network. Various random walk related models can be applied in different fields, which is of great significance to downstream tasks such as link prediction, recommendation, computer vision, semi-supervised learning, and network embedding. In this paper, we aim to provide a comprehensive review of classical random walks and quantum walks. We first review the knowledge of classical random walks and quantum walks, including basic concepts and some typical algorithms. We also compare the algorithms based on quantum walks and classical random walks from the perspective of time complexity. Then we introduce their applications in the field of computer science. Finally, we discuss the open issues from the perspectives of efficiency, main-memory volume, and computing time of existing algorithms. This study aims to contribute to this growing area of research by exploring random walks and quantum walks together.

翻訳日:2022-11-01 04:51:56 公開日:2020-08-09
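
The abstract above notes that classical random walks can be used to calculate the proximity between nodes. One common variant is a random walk with restart. The sketch below is a minimal illustration under stated assumptions: the path graph, the restart probability of 0.15, and the function name are all hypothetical choices, and the survey itself may present different algorithms.

```python
def random_walk_with_restart(adjacency, start, restart_prob=0.15, iterations=100):
    """Power iteration for the stationary distribution of a walk that
    jumps back to `start` with probability `restart_prob` at every step.

    Assumes every node has at least one neighbour.
    """
    nodes = list(adjacency)
    scores = {node: 0.0 for node in nodes}
    scores[start] = 1.0  # all probability mass begins at the start node
    for _ in range(iterations):
        updated = {node: 0.0 for node in nodes}
        for node, mass in scores.items():
            neighbours = adjacency[node]
            # Spread the non-restarting mass evenly over the neighbours.
            share = (1.0 - restart_prob) * mass / len(neighbours)
            for neighbour in neighbours:
                updated[neighbour] += share
        # The total mass is 1, so the restarting mass equals restart_prob.
        updated[start] += restart_prob
        scores = updated
    return scores


# Hypothetical undirected path graph: 0 - 1 - 2 - 3
graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
proximity = random_walk_with_restart(graph, start=0)
for node, score in sorted(proximity.items()):
    print(node, round(score, 4))
```

Higher scores mean the node is visited more often by walks that keep returning to node 0, which serves as a simple proximity measure relative to that node.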

# ネットワーク侵入検知システムにおける逆例に対するロバスト性向上

Enhancing Robustness Against Adversarial Examples in Network Intrusion Detection Systems ( http://arxiv.org/abs/2008.03677v1 )

ライセンス: Link先を確認

Mohammad J. Hashemi, Eric Keller

(参考訳) 近年のサイバー攻撃の数と種類の増加は、より洗練されたネットワーク侵入検知システム(NIDS)の構築を要求している。これらのNIDSは、SDN(Software-Defined Network)にデプロイされる場合のように、ネットワークを経由するすべてのトラフィックを監視することができれば、パフォーマンスが向上する。ゼロデイ攻撃を検出できないため、従来悪意のあるトラフィックを検出するために使われていたシグネチャベースのNIDSは、ニューラルネットワーク上に構築された異常ベースのNIDSに置き換えられ始めている。しかし、近年、このようなNIDSは独自の欠点、すなわち敵対的サンプル攻撃に弱いことが示されている。さらに、それらは主に、ネットワークシステムが最近直面する可能性のあるさまざまな攻撃を表現していない古いデータセットで評価されてきた。本稿では、デノイジングオートエンコーダを用いてNIDSを構築する新しいメカニズムとしてReconstruction from Partial Observation (RePO)を提案する。このNIDSは、敵対的サンプル攻撃に対する堅牢性を高めつつ、誤警報の少ない設定でさまざまな種類のネットワーク攻撃を検出できる。多様なネットワーク攻撃を含むデータセットを用いて行った評価の結果、デノイジングオートエンコーダは、最近提案された他の異常検出器と比べて、通常の設定では最大29%、敵対的設定では最大45%、悪意のあるトラフィックの検出を改善できることがわかった。

The increase in both the number and variety of cyber attacks in recent years demands more sophisticated network intrusion detection systems (NIDS). These NIDS perform better when they can monitor all the traffic traversing the network, as when deployed on a Software-Defined Network (SDN). Because of the inability to detect zero-day attacks, signature-based NIDS, which were traditionally used for detecting malicious traffic, are beginning to be replaced by anomaly-based NIDS built on neural networks. However, it has recently been shown that such NIDS have their own drawback, namely vulnerability to adversarial example attacks. Moreover, they were mostly evaluated on old datasets that do not represent the variety of attacks network systems might face these days. In this paper, we present Reconstruction from Partial Observation (RePO) as a new mechanism to build an NIDS with the help of denoising autoencoders, capable of detecting different types of network attacks in a low-false-alert setting with enhanced robustness against adversarial example attacks. Our evaluation, conducted on a dataset with a variety of network attacks, shows that denoising autoencoders can improve detection of malicious traffic by up to 29% in a normal setting and by up to 45% in an adversarial setting compared to other recently proposed anomaly detectors.

翻訳日:2022-11-01 04:51:38 公開日:2020-08-09

# クラスタベースモデリングによる合成音声の深部MOS予測

Deep MOS Predictor for Synthetic Speech Using Cluster-Based Modeling ( http://arxiv.org/abs/2008.03710v1 )

ライセンス: Link先を確認

Yeunju Choi, Youngmoon Jung, Hoirin Kim

(参考訳) 深層学習は音声合成と音声変換において顕著な進歩を遂げてきたが, 人工音声の評価はまだ人間の被験者によって行われている。近年, 深層学習に基づく評価モデルを提案し, 音声品質評価の自動化の可能性を示した。先述した評価モデルであるmosnetを改善するために,クラスタベースのモデリング手法を用いて,グローバル品質トークン(gqt)層の使用,エンコーディング層の使用,および両者の使用という3つのモデルを提案する。我々は、音声変換チャレンジ2018の評価結果を用いて、合成音声の平均意見スコアと合成音声と参照音声の類似度スコアを予測する実験を行った。その結果、gqt層はタスクの有用な品質トークンを自動的に学習することで、人間評価の予測に役立ち、符号化層はフレームレベルのスコアをより正確に活用するのに役立ちます。

While deep learning has made impressive progress in speech synthesis and voice conversion, the assessment of the synthesized speech is still carried out by human participants. Several recent papers have proposed deep-learning-based assessment models and shown the potential to automate the speech quality assessment. To improve the previously proposed assessment model, MOSNet, we propose three models using cluster-based modeling methods: using a global quality token (GQT) layer, using an Encoding Layer, and using both of them. We perform experiments using the evaluation results of the Voice Conversion Challenge 2018 to predict the mean opinion score of synthesized speech and similarity score between synthesized speech and reference speech. The results show that the GQT layer helps to predict human assessment better by automatically learning the useful quality tokens for the task and that the Encoding Layer helps to utilize frame-level scores more precisely.

翻訳日:2022-11-01 04:51:13 公開日:2020-08-09

# グラフ分割による進化的構成木抽出の高速化

Accelerating Evolutionary Construction Tree Extraction via Graph Partitioning ( http://arxiv.org/abs/2008.03669v1 )

ライセンス: Link先を確認

Markus Friedrich and Sebastian Feld and Thomy Phan and Pierre-Alain Fayolle

(参考訳) 潜在的にノイズの多い点雲から構築木を抽出することは、コンピュータ支援設計におけるリバースエンジニアリングタスクの重要な側面である。アルゴリズム幾何学に基づく解は、使用可能なモデル表現(例えば二次曲面のみ)と雑音ロバスト性に制約を課す。問題を組合せ最適化問題として再定式化し、進化的アルゴリズムで解くことで、計算複雑性の増大を犠牲にしてこれらの制約の一部を緩和することができる。本稿では,最新のCPUの並列化機能を活用しつつ,進化的構築木抽出を高速化するグラフベースの探索空間分割スキームを提案する。評価の結果、ベースラインアプローチと比較して最大46.6倍の高速化が示された一方、得られるツリーサイズは25.2%から88.6%増加した。

Extracting a Construction Tree from potentially noisy point clouds is an important aspect of Reverse Engineering tasks in Computer Aided Design. Solutions based on algorithmic geometry impose constraints on usable model representations (e.g. quadric surfaces only) and noise robustness. Re-formulating the problem as a combinatorial optimization problem and solving it with an Evolutionary Algorithm can mitigate some of these constraints at the cost of increased computational complexity. This paper proposes a graph-based search space partitioning scheme that is able to accelerate Evolutionary Construction Tree extraction while exploiting parallelization capabilities of modern CPUs. The evaluation indicates a speed-up up to a factor of $46.6$ compared to the baseline approach while resulting tree sizes increased by $25.2\%$ to $88.6\%$.

翻訳日:2022-11-01 04:45:14 公開日:2020-08-09

# ブロックシャッフル:メモリ制限のある高分解能高速スタイル転送方式

Block Shuffle: A Method for High-resolution Fast Style Transfer with Limited Memory ( http://arxiv.org/abs/2008.03706v1 )

ライセンス: Link先を確認

Weifeng Ma, Zhe Chen, Caoting Ji

(参考訳) Fast Style Transferは、フィードフォワードニューラルネットワークを使って入力画像をレンダリングする一連のNeural Style Transferアルゴリズムである。出力層の高次元のため、これらのネットワークは計算に多くのメモリを必要とする。したがって、高解像度画像の場合、ほとんどのモバイルデバイスやパーソナルコンピュータはそれらをスタイリングできないため、Fast Style Transferのアプリケーションシナリオは大幅に制限される。現在、既存の解決策はメモリの増設とフェザリングベースの手法の2つであるが、前者は追加コストが必要であり、後者は画質が劣っている。そこで本研究では,高メモリ消費の単一タスクを低メモリ消費の複数のサブタスクに変換する新しい画像合成手法である「block shuffle」を提案する。この手法は、ネットワークアーキテクチャを変更することなく、Fast Style Transferのプラグインとして機能することができる。ベースラインには、GitHubで最も人気のあるFast Style Transferリポジトリを使用する。実験により,本手法による高解像度画像の品質がフェザリングベースの手法より優れていることを示した。本手法はベースラインよりも桁違いに遅いが,限られたメモリで高解像度画像のスタイリングが可能であり,これはベースラインでは不可能である。コードとモデルは https://github.com/czczup/block-shuffle で利用可能になる。

Fast Style Transfer is a series of Neural Style Transfer algorithms that use feed-forward neural networks to render input images. Because of the high dimension of the output layer, these networks require much memory for computation. Therefore, for high-resolution images, most mobile devices and personal computers cannot stylize them, which greatly limits the application scenarios of Fast Style Transfer. At present, the two existing solutions are purchasing more memory and using the feathering-based method, but the former requires additional cost, and the latter has poor image quality. To solve this problem, we propose a novel image synthesis method named block shuffle, which converts a single task with high memory consumption to multiple subtasks with low memory consumption. This method can act as a plug-in for Fast Style Transfer without any modification to the network architecture. We use the most popular Fast Style Transfer repository on GitHub as the baseline. Experiments show that the quality of high-resolution images generated by our method is better than that of the feathering-based method. Although our method is an order of magnitude slower than the baseline, it can stylize high-resolution images with limited memory, which is impossible with the baseline. The code and models will be made available at https://github.com/czczup/block-shuffle.

翻訳日:2022-11-01 04:45:00 公開日:2020-08-09

# 病理組織における一般核検出のためのスイッチング損失

Switching Loss for Generalized Nucleus Detection in Histopathology ( http://arxiv.org/abs/2008.03750v1 )

ライセンス: Link先を確認

Deepak Anand, Gaurav Patel, Yaman Dang, Amit Sethi

(参考訳) 医用画像解析における2つの基礎的課題(検出とセグメンテーション)に対する深層学習手法の精度は、クラス不均衡に悩まされる可能性がある。本稿では,前景クラスと背景クラスの間で重点を適応的にシフトする「スイッチング損失」関数を提案する。この問題に対処する既存の損失関数は分類タスクによって動機付けられているが、スイッチング損失はDice損失に基づいており、セグメンテーションや検出により適している。さらに、トレーニングサンプルを最大限に活用するために、トレーニングセット全体に対して一度だけ適応する以前の提案とは異なり、ミニバッチごとに損失を適応させる。ソースデータセット上で提案損失関数を用いて訓練された核検出器は、クロスエントロピー損失、Dice損失、focal損失を用いて訓練されたものよりも優れていた。驚くべきことに、ターゲットデータセットで再訓練することなく、我々の事前訓練済みの核検出器は、ターゲットデータセットの少なくとも一部の画像で訓練された既存の核検出器よりも優れていた。提案した損失の幅広い有用性を確立するため,MRIにおいて他の損失関数と比較してより正確な心室セグメンテーションが得られることも確認した。我々のGPU対応の事前訓練済み核検出ソフトウェアは、そのままでスライド全体の画像を処理でき、実用的な速度で動作する。

The accuracy of deep learning methods for two foundational tasks in medical image analysis -- detection and segmentation -- can suffer from class imbalance. We propose a "switching loss" function that adaptively shifts the emphasis between foreground and background classes. While the existing loss functions to address this problem were motivated by the classification task, the switching loss is based on Dice loss, which is better suited for segmentation and detection. Furthermore, to get the most out of the training samples, we adapt the loss with each mini-batch, unlike previous proposals that adapt once for the entire training set. A nucleus detector trained using the proposed loss function on a source dataset outperformed those trained using cross-entropy, Dice, or focal losses. Remarkably, without retraining on target datasets, our pre-trained nucleus detector also outperformed existing nucleus detectors that were trained on at least some of the images from the target datasets. To establish a broad utility of the proposed loss, we also confirmed that it led to more accurate ventricle segmentation in MRI as compared to the other loss functions. Our GPU-enabled pre-trained nucleus detection software is also ready to process whole slide images right out-of-the-box and is usably fast.

翻訳日:2022-11-01 04:44:28 公開日:2020-08-09
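
The abstract above says the switching loss is based on Dice loss and is meant for class-imbalanced detection and segmentation. The sketch below shows only a plain soft Dice loss, not the switching loss itself, whose definition is not given in the abstract. The function name, the smoothing constant `eps`, and the toy mask are assumptions made for illustration.

```python
def soft_dice_loss(predictions, targets, eps=1e-7):
    """1 - Dice coefficient for flattened probability and binary mask lists."""
    intersection = sum(p * t for p, t in zip(predictions, targets))
    denominator = sum(predictions) + sum(targets)
    # eps avoids division by zero when both inputs are empty.
    return 1.0 - (2.0 * intersection + eps) / (denominator + eps)


# Hypothetical imbalanced mask: 2 foreground pixels out of 100.
mask = [1.0, 1.0] + [0.0] * 98
all_background = [0.0] * 100

# Pixel accuracy of the trivial "all background" prediction is 98%.
accuracy = sum(1 for p, t in zip(all_background, mask) if p == t) / len(mask)
print("accuracy:", accuracy)
print("dice loss, all background:", soft_dice_loss(all_background, mask))
print("dice loss, perfect:", soft_dice_loss(mask, mask))
```

The trivial prediction scores high on accuracy but incurs a Dice loss close to 1, which illustrates why an overlap-based loss is a natural starting point when foreground pixels are rare.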

# コンピュータビジョンと慣性センサを用いた軌道形状計測手法

A methodology for the measurement of track geometry based on computer vision and inertial sensors ( http://arxiv.org/abs/2008.03763v1 )

ライセンス: Link先を確認

José L. Escalona

(参考訳) 本論文は,鉄道車両に搭載される軌道形状測定システム(TGMS)における軌道形状の不規則性の計算に使用される理論について述べる。 TGMSは、データ取得と処理のためのコンピュータと、慣性測定ユニット(IMU、3Dジャイロスコープ、および3D加速度計)と、2つのビデオカメラとエンコーダを含む一連のセンサーを含む。提案システムの主な特徴は次のとおりである。 1.非接触技術を用いて、軌道アライメント、垂直プロファイル、クロスレベル、ゲージ、ツイスト、レールヘッドプロファイルを測定することができる。 2.鉄道車両に設置可能。コンパクトで低コストである。車両の移動時にレールヘッドを視認していれば、車輪セットレベル、プライマリサスペンション(ボディフレーム)上、またはセカンダリサスペンション(車体)上において、車両の任意の本体に設置することができる。

This document describes the theory used for the calculation of track geometric irregularities on a Track Geometry Measuring System (TGMS) to be installed in railway vehicles. The TGMS includes a computer for data acquisition and processing, a set of sensors including an inertial measuring unit (IMU, 3D gyroscope and 3D accelerometer), two video cameras and an encoder. The main features of the proposed system are: 1. It is capable of measuring track alignment, vertical profile, cross-level, gauge, twist and rail-head profile using non-contact technology. 2. It can be installed in line railway vehicles. It is compact and low cost. Provided that the equipment sees the rail heads when the vehicle is moving, it can be installed in any body of the vehicle: at the wheelset level, above the primary suspension (bogie frame) or above the secondary suspension (car body).

翻訳日:2022-11-01 04:44:06 公開日:2020-08-09

# 正則照明最適化と深部雑音抑圧による低光海洋画像強調

Low-Light Maritime Image Enhancement with Regularized Illumination Optimization and Deep Noise Suppression ( http://arxiv.org/abs/2008.03765v1 )

ライセンス: Link先を確認

Yu Guo, Yuxu Lu, Ryan Wen Liu, Meifang Yang, Kwok Tai Chui

(参考訳) 低光撮像条件下で撮影された海洋画像は視認性が低く、予期せぬノイズが発生しやすいため、海上交通の監督や管理に悪影響を及ぼす。画像性能向上のためには、劣化した低光度画像から重要な視覚情報を復元する必要がある。本稿では,正規化照明最適化とディープノイズ抑圧による低照度画像の高精細化を提案する。特に,L0-ノルム勾配間隔と構造認識正規化を併用したハイブリッド正規化変分モデルを示し,Max-RGBを用いて推定した粗い照明マップを改良する。次に、洗練された照明マップを調整するために適応ガンマ補正法を導入する。 Retinex理論の仮定に基づいて、リフレクションマップを最適化するために、ガイド付きフィルタに基づく詳細強化手法を導入する。調整された照明と最適化された反射マップを組み合わせて、拡張された海洋画像を生成する。望ましくないノイズが撮像性能に与える影響を抑制するため、深層学習に基づくブラインドデノイングフレームワークを更に導入し、強調画像の視覚的品質を向上する。特に、このフレームワークは2つのサブネットワーク(E-NetとD-Net)で構成されており、それぞれノイズレベル推定と非ブラインドノイズ低減に採用されている。画像強調手法の主な利点は、規則化された照明最適化と深い目隠しを最大限に活用できることである。人工海事画像と現実海事画像の総合的な実験を行い,提案手法と最先端画像との比較を行った。実験結果から,定量評価と定性評価の両面で優れた性能を示した。

Maritime images captured under low-light imaging conditions easily suffer from low visibility and unexpected noise, leading to negative effects on maritime traffic supervision and management. To promote imaging performance, it is necessary to restore the important visual information from degraded low-light images. In this paper, we propose to enhance the low-light images through regularized illumination optimization and deep noise suppression. In particular, a hybrid regularized variational model, which combines an L0-norm gradient sparsity prior with structure-aware regularization, is presented to refine the coarse illumination map originally estimated using Max-RGB. The adaptive gamma correction method is then introduced to adjust the refined illumination map. Based on the assumption of Retinex theory, a guided filter-based detail boosting method is introduced to optimize the reflection map. The adjusted illumination and optimized reflection maps are finally combined to generate the enhanced maritime images. To suppress the effect of unwanted noise on imaging performance, a deep learning-based blind denoising framework is further introduced to promote the visual quality of the enhanced image. In particular, this framework is composed of two sub-networks, i.e., E-Net and D-Net, adopted for noise level estimation and non-blind noise reduction, respectively. The main benefit of our image enhancement method is that it takes full advantage of the regularized illumination optimization and deep blind denoising. Comprehensive experiments have been conducted on both synthetic and realistic maritime images to compare our proposed method with several state-of-the-art imaging methods. Experimental results have illustrated its superior performance in terms of both quantitative and qualitative evaluations.

翻訳日:2022-11-01 04:43:49 公開日:2020-08-09
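
The abstract above describes a Retinex-style pipeline: estimate illumination with Max-RGB, adjust it by gamma correction, and recombine it with the reflection map. The sketch below illustrates only that skeleton on single pixels. It assumes a fixed `gamma` of 0.5 where the paper uses an adaptive one, and it omits the variational refinement, the guided filter, and the denoising networks. All names are hypothetical.

```python
def enhance_pixel(rgb, gamma=0.5, eps=1e-6):
    """Brighten one RGB pixel (channels in [0, 1]) via a Retinex-style split."""
    illumination = max(max(rgb), eps)              # Max-RGB estimate, kept above zero
    reflectance = [c / illumination for c in rgb]  # I = R * L  =>  R = I / L
    adjusted = illumination ** gamma               # gamma < 1 lifts dark values
    return [r * adjusted for r in reflectance]


def enhance_image(pixels, gamma=0.5):
    """Apply the per-pixel enhancement to a flat list of RGB pixels."""
    return [enhance_pixel(p, gamma) for p in pixels]


# Hypothetical dark pixels from a low-light scene.
dark_pixels = [(0.10, 0.05, 0.02), (0.04, 0.04, 0.04), (0.0, 0.0, 0.0)]
for before, after in zip(dark_pixels, enhance_image(dark_pixels)):
    print(before, "->", [round(c, 4) for c in after])
```

Because only the illumination is changed, the channel ratios of each pixel are preserved, so colours are brightened without being shifted. A real implementation would work on smoothed illumination maps over whole images rather than on independent pixels.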

# sequence-to-sequence asrにおけるbertの知識の蒸留

Distilling the Knowledge of BERT for Sequence-to-Sequence ASR ( http://arxiv.org/abs/2008.03822v1 )

ライセンス: Link先を確認

Hayato Futami, Hirofumi Inaguma, Sei Ueno, Masato Mimura, Shinsuke Sakai and Tatsuya Kawahara

(参考訳) 注意に基づくシーケンス・ツー・シーケンス(seq2seq)モデルは自動音声認識(ASR)において有望な結果を得た。しかし、これらのモデルは左から右にデコードするので、右のコンテキストにアクセスできない。我々は、知識蒸留を通じてBERTを外部言語モデルとしてseq2seq ASRに適用することで、左右両方の文脈を活用する。提案手法では,BERTがseq2seq ASRのトレーニングを導くソフトラベルを生成する。さらに,現在の発話を超えた文脈をBERTの入力として活用する。実験的評価により、本手法が日本語話し言葉コーパス(CSJ)においてseq2seqベースラインからASR性能を有意に向上させることを示した。 BERTからの知識蒸留は、左の文脈だけを見るTransformer LMからの知識蒸留よりも優れている。また,現在の発話を超えた文脈の活用の有効性を示す。提案手法は,n-best rescoringや浅層融合といった他のLM適用手法よりも優れており,しかも追加の推論コストは不要である。

Attention-based sequence-to-sequence (seq2seq) models have achieved promising results in automatic speech recognition (ASR). However, as these models decode in a left-to-right way, they do not have access to context on the right. We leverage both left and right context by applying BERT as an external language model to seq2seq ASR through knowledge distillation. In our proposed method, BERT generates soft labels to guide the training of seq2seq ASR. Furthermore, we leverage context beyond the current utterance as input to BERT. Experimental evaluations show that our method significantly improves the ASR performance from the seq2seq baseline on the Corpus of Spontaneous Japanese (CSJ). Knowledge distillation from BERT outperforms that from a transformer LM that only looks at left context. We also show the effectiveness of leveraging context beyond the current utterance. Our method outperforms other LM application approaches such as n-best rescoring and shallow fusion, while it does not require extra inference cost.

翻訳日:2022-11-01 04:35:53 公開日:2020-08-09
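
The abstract above states that BERT generates soft labels to guide the training of the seq2seq ASR model. A common way to use soft labels is to interpolate a hard-label cross-entropy with a cross-entropy against the teacher distribution. The sketch below shows that generic form for a single output token. The weight `alpha`, the two-word vocabulary, and the function names are assumptions, and the paper's exact objective may differ.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target_probs, predicted_probs):
    """Cross-entropy H(target, predicted), skipping zero-probability targets."""
    return -sum(t * math.log(p) for t, p in zip(target_probs, predicted_probs) if t > 0)


def distillation_loss(student_logits, hard_label, teacher_probs, alpha=0.5):
    """Interpolate the hard-label loss with the soft-label (teacher) loss."""
    student_probs = softmax(student_logits)
    one_hot = [1.0 if i == hard_label else 0.0 for i in range(len(student_logits))]
    hard_term = cross_entropy(one_hot, student_probs)
    soft_term = cross_entropy(teacher_probs, student_probs)
    return (1.0 - alpha) * hard_term + alpha * soft_term


# Hypothetical two-word vocabulary; the teacher prefers word 0.
teacher = [0.9, 0.1]
agreeing = distillation_loss([2.0, 0.0], hard_label=0, teacher_probs=teacher)
disagreeing = distillation_loss([0.0, 2.0], hard_label=0, teacher_probs=teacher)
print(round(agreeing, 4), round(disagreeing, 4))
```

Since the teacher is only needed to compute the training loss, this kind of objective adds no cost at inference time, which matches the claim in the abstract.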

# SVMモデルからのホワイトボックス誘導:ロジックプログラミングによる説明可能なAI

White-box Induction From SVM Models: Explainable AI with Logic Programming ( http://arxiv.org/abs/2008.03301v1 )

ライセンス: Link先を確認

Farhad Shakerin, Gopal Gupta

(参考訳) 本稿では,サポートベクトルマシン(SVM)アルゴリズムで学習したモデルを説明するロジックプログラムの誘導問題に焦点をあてる。トップダウンシーケンシャルカバーインダクティブ論理プログラミング(ILP)アルゴリズム(例えば、FOIL)は、情報理論からのヒューリスティックスを用いたヒルクライミング探索を適用する。このタイプのアルゴリズムの大きな問題は、局所最適解に陥ってしまうことだ。しかし,新たなアプローチでは,データ依存のヒルクライミング探索をモデル依存の探索に置き換え,まず大域的に最適なSVMモデルをトレーニングし,次に,そのモデルにおいて最も影響力のあるデータポイントとしてサポートベクトルを調べ,そのサポートベクトルとそれに最もよく似た点をカバーする節を誘導する。固定仮説探索空間を定義する代わりに、我々のアルゴリズムは、説明可能なAIの例固有のインタプリタであるSHAPを用いて、関連する特徴セットを決定する。このアプローチは、SVMモデルの基盤となるロジックを捉え、誘導される節の数と分類評価指標の観点で他のILPアルゴリズムを上回るアルゴリズムをもたらす。本論文は「Theory and Practice of Logic Programming」誌での出版に向けて検討中である。

We focus on the problem of inducing logic programs that explain models learned by the support vector machine (SVM) algorithm. The top-down sequential covering inductive logic programming (ILP) algorithms (e.g., FOIL) apply hill-climbing search using heuristics from information theory. A major issue with this class of algorithms is getting stuck in a local optimum. In our new approach, however, the data-dependent hill-climbing search is replaced with a model-dependent search where a globally optimal SVM model is trained first, then the algorithm looks into support vectors as the most influential data points in the model, and induces a clause that would cover the support vector and points that are most similar to that support vector. Instead of defining a fixed hypothesis search space, our algorithm makes use of SHAP, an example-specific interpreter in explainable AI, to determine a relevant set of features. This approach yields an algorithm that captures the SVM model's underlying logic and outperforms other ILP algorithms in terms of the number of induced clauses and classification evaluation metrics. This paper is under consideration for publication in the journal Theory and Practice of Logic Programming.

翻訳日:2022-11-01 04:35:39 公開日:2020-08-09

# 網膜画像における異常検出のためのp-netとの符号化構造-テキスト関係

Encoding Structure-Texture Relation with P-Net for Anomaly Detection in Retinal Images ( http://arxiv.org/abs/2008.03632v1 )

ライセンス: Link先を確認

Kang Zhou, Yuting Xiao, Jianlong Yang, Jun Cheng, Wen Liu, Weixin Luo, Zaiwang Gu, Jiang Liu, Shenghua Gao

(参考訳) 網膜画像における異常検出は、トレーニング段階において正常な画像のみを活用することにより、様々な網膜疾患/病変による異常の同定を指す。健康な被験者の正常な画像は、しばしば規則的な構造を持つ(例えば、眼底画像の血管構造、光コヒーレンス断層画像の解剖学的構造など)。対照的に、疾患や病変はしばしばこれらの構造を破壊する。そこで本研究では,画像テクスチャと構造の関係を利用して,異常検出のためのディープニューラルネットワークの設計を提案する。具体的には、まず網膜画像の構造を抽出し、次に、構造特徴と元の健康な画像から抽出された最終層特徴の両方を組み合わせて、元の入力された健康な画像の再構成を行う。画像特徴はテクスチャ情報を提供し、構造から復元された画像の一意性を保証する。最後に、再構成画像からも構造を抽出し、原画像から抽出した構造と再構成画像から抽出した構造との差を測定する。一方で、再構成差の最小化は正則化器のように振舞い、画像が正しく再構成されることを保証する。他方で、そのような構造差は正常度を測る指標としても用いられる。ネットワーク全体は「P」の形状であるため、P-Net と呼ばれる。 RESCデータセットとiSeeデータセットの大規模な実験により、網膜画像における異常検出に対するアプローチの有効性が検証された。さらに,本手法は,網膜画像における新たなクラス発見や実世界画像における異常検出にもよく汎化する。

Anomaly detection in retinal images refers to the identification of abnormality caused by various retinal diseases/lesions, by only leveraging normal images in the training phase. Normal images from healthy subjects often have regular structures (e.g., the structured blood vessels in the fundus image, or structured anatomy in optical coherence tomography image). In contrast, the diseases and lesions often destroy these structures. Motivated by this, we propose to leverage the relation between the image texture and structure to design a deep neural network for anomaly detection. Specifically, we first extract the structure of the retinal images, then we combine both the structure features and the last layer features extracted from the original healthy image to reconstruct the original input healthy image. The image feature provides the texture information and guarantees the uniqueness of the image recovered from the structure. In the end, we further utilize the reconstructed image to extract the structure and measure the difference between the structures extracted from the original and the reconstructed images. On the one hand, minimizing the reconstruction difference behaves like a regularizer to guarantee that the image is correctly reconstructed. On the other hand, such structure difference can also be used as a metric for normality measurement. The whole network is termed P-Net because it has a "P" shape. Extensive experiments on the RESC dataset and the iSee dataset validate the effectiveness of our approach for anomaly detection in retinal images. Further, our method also generalizes well to novel class discovery in retinal images and anomaly detection in real-world images.

翻訳日:2022-11-01 04:34:14 公開日:2020-08-09

# ベクター表現による分子画像の増強と薬物分類への応用

Augmenting Molecular Images with Vector Representations as a Featurization Technique for Drug Classification ( http://arxiv.org/abs/2008.03646v1 )

ライセンス: Link先を確認

Daniel de Marchi, Amarjit Budhiraja

(参考訳) 薬物の分類と生成のためのディープラーニングシステムを構築するための重要なステップの1つは、分子の特徴量化(featurization)の選択である。これまでの特徴量化手法には、分子画像、バイナリ文字列、グラフ、SMILES文字列などがあった。本稿では,分子画像だけには含まれない,あるいは分子画像だけからは容易に理解できない情報をエンコードする2進ベクトルをキャプションとして付けた分子画像の作成を提案する。具体的には、より高いレベルの構造情報をエンコードするMorgan fingerprintsと、分子の性質や構造に関するイエス・ノーの質問をエンコードするMACCS keysを使用する。本手法を、Pande研究室が公開しているHIVデータセットで検証した。このデータセットは、HIVを阻害するかどうかでラベル付けされた41,127個の分子からなる。我々の最終モデルは、HIVデータセット上で最先端のAUC ROCを達成し、他のすべての方法を上回った。さらに、モデルは他のほとんどの方法よりもはるかに高速に収束し、拡張なしの画像と比べて必要な計算量が劇的に少なかった。

One of the key steps in building deep learning systems for drug classification and generation is the choice of featurization for the molecules. Previous featurization methods have included molecular images, binary strings, graphs, and SMILES strings. This paper proposes the creation of molecular images captioned with binary vectors that encode information not contained in or easily understood from a molecular image alone. Specifically, we use Morgan fingerprints, which encode higher level structural information, and MACCS keys, which encode yes or no questions about a molecule's properties and structure. We tested our method on the HIV dataset published by the Pande lab, which consists of 41,127 molecules labeled by whether they inhibit HIV. Our final model achieved a state-of-the-art AUC ROC on the HIV dataset, outperforming all other methods. Moreover, the model converged significantly faster than most other methods, requiring dramatically less computational power than unaugmented images.

翻訳日:2022-11-01 04:33:17 公開日:2020-08-09

# リアルタイムuav追跡のための学習一貫性追従相関フィルタ

Learning Consistency Pursued Correlation Filters for Real-Time UAV Tracking ( http://arxiv.org/abs/2008.03704v1 )

ライセンス: Link先を確認

Changhong Fu, Xiaoxiao Yang, Fan Li, Juntao Xu, Changjing Liu, and Peng Lu

(参考訳) 相関フィルタ(CF)に基づく手法は、無人航空機(UAV)の視覚的物体追跡において例外的な性能を示すが、望ましくない境界効果に悩まされている。この問題を解決するために,空間正則化相関フィルタ(SRDCF)は,フィルタ係数にペナルティを課す空間正則化を提案し、追跡性能を大幅に向上させた。しかし、応答マップに隠された時間情報はSRDCFでは考慮されないため、正確な追跡のための識別能力とロバスト性が制限される。本研究は,動的な一貫性追求型相関フィルタ,すなわちCPCFトラッカーを用いた新しい手法を提案する。具体的には、隣接する応答マップ間の相関操作により、フレーム間の一貫性レベルを表す実際の一貫性マップを生成する。実際の一貫性マップとあらかじめ定めた理想的な一貫性マップとの差を最小化することにより、時間的滑らかさを維持するように一貫性レベルを制約し、応答マップに含まれる豊富な時間情報を導入する。さらに,複雑な状況下でのトラッカーの適応性を向上させるために,動的制約戦略を提案する。包括的な実験は、UAV123@10FPS、UAVDT、DTB70という3つの挑戦的なUAVベンチマークで行われている。実験結果によれば、提案したトラッカーは、単一のCPU上でリアルタイムの実行速度(約43FPS)を保ちつつ、他の25の最先端トラッカーを上回っている。

Correlation filter (CF)-based methods have demonstrated exceptional performance in visual object tracking for unmanned aerial vehicle (UAV) applications, but suffer from the undesirable boundary effect. To solve this issue, spatially regularized correlation filters (SRDCF) propose spatial regularization to penalize filter coefficients, thereby significantly improving the tracking performance. However, the temporal information hidden in the response maps is not considered in SRDCF, which limits the discriminative power and the robustness for accurate tracking. This work proposes a novel approach with dynamic consistency pursued correlation filters, i.e., the CPCF tracker. Specifically, through a correlation operation between adjacent response maps, a practical consistency map is generated to represent the consistency level across frames. By minimizing the difference between the practical and the scheduled ideal consistency map, the consistency level is constrained to maintain temporal smoothness, and rich temporal information contained in response maps is introduced. Besides, a dynamic constraint strategy is proposed to further improve the adaptability of the proposed tracker in complex situations. Comprehensive experiments are conducted on three challenging UAV benchmarks, i.e., UAV123@10FPS, UAVDT, and DTB70. Based on the experimental results, the proposed tracker favorably surpasses the other 25 state-of-the-art trackers with real-time running speed (~43 FPS) on a single CPU.

翻訳日:2022-11-01 04:27:28 公開日:2020-08-09

# レーン幅に先行した車線境界観測を用いた時間的一貫性のあるipmのためのオンラインextrinsicカメラキャリブレーション

Online Extrinsic Camera Calibration for Temporally Consistent IPM Using Lane Boundary Observations with a Lane Width Prior ( http://arxiv.org/abs/2008.03722v1 )

ライセンス: Link先を確認

Jeong-Kyun Lee and Young-Ki Baik and Hankyu Cho and Seungwoo Yoo

(参考訳) 本稿では,ピッチ角,ヨー角,ロール角,および道路面からのカメラ高さを連続する走行シーン画像から推定する,オンライン外部カメラキャリブレーション手法を提案する。提案手法では,外部カメラパラメータを2段階で推定する。 1)一組の車線境界観測から計算した消失点を用いてピッチ角とヨー角を同時に推定する。 2)車線幅の観測値と車線幅の事前値との差を最小化して、ロール角とカメラ高さを算出する。外部カメラパラメータは拡張カルマンフィルタリング(EKF)を用いて逐次更新され、最終的に逆透視マッピング(IPM)により時間的に一貫した鳥瞰図(BEV)画像を生成するために用いられる。合成および実世界のデータセットにおいて提案手法の優位性を示す。

In this paper, we propose a method for online extrinsic camera calibration, i.e., estimating pitch, yaw, roll angles and camera height from the road surface in sequential driving scene images. The proposed method estimates the extrinsic camera parameters in two steps: 1) pitch and yaw angles are estimated simultaneously using a vanishing point computed from a set of lane boundary observations, and then 2) roll angle and camera height are computed by minimizing the difference between lane width observations and a lane width prior. The extrinsic camera parameters are sequentially updated using extended Kalman filtering (EKF) and are finally used to generate a temporally consistent bird's-eye-view (BEV) image by inverse perspective mapping (IPM). We demonstrate the superiority of the proposed method in synthetic and real-world datasets.

翻訳日:2022-11-01 04:26:33 公開日:2020-08-09
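
Step 1 of the abstract above estimates pitch and yaw from a vanishing point computed from lane boundary observations. The sketch below illustrates that idea for two lane boundaries under a pinhole camera with zero roll. The sign conventions (positive pitch when the vanishing point lies above the principal point, positive yaw when it lies to the right), the intrinsics, and the image points are all assumptions. The paper's own formulation, its use of more than two boundaries, and its EKF update are not reproduced here.

```python
import math


def line_through(p, q):
    """Homogeneous line through two image points (cross product)."""
    (x1, y1), (x2, y2) = p, q
    return (y1 - y2, x2 - x1, x1 * y2 - x2 * y1)


def intersect(l1, l2):
    """Intersection of two homogeneous lines; fails if they are parallel."""
    a1, b1, c1 = l1
    a2, b2, c2 = l2
    w = a1 * b2 - a2 * b1
    if abs(w) < 1e-12:
        raise ValueError("lane boundaries are parallel in the image")
    return ((b1 * c2 - b2 * c1) / w, (c1 * a2 - c2 * a1) / w)


def pitch_yaw_from_vanishing_point(vp, fx, fy, cx, cy):
    """Recover pitch and yaw (radians) from the lanes' vanishing point."""
    u, v = vp
    yaw = math.atan((u - cx) / fx)
    # With zero roll, (cy - v) / fy equals tan(pitch) / cos(yaw).
    pitch = math.atan((cy - v) / fy * math.cos(yaw))
    return pitch, yaw


# Hypothetical lane boundaries, each given by two image points (pixels).
left = line_through((0.0, 100.0), (50.0, 50.0))
right = line_through((200.0, 100.0), (150.0, 50.0))
vanishing_point = intersect(left, right)
pitch, yaw = pitch_yaw_from_vanishing_point(
    vanishing_point, fx=100.0, fy=100.0, cx=100.0, cy=50.0
)
print(vanishing_point, round(math.degrees(pitch), 2), round(math.degrees(yaw), 2))
```

With noisy detections and more than two boundaries, the vanishing point would normally be fitted in a least-squares sense, and the per-frame angles would then be smoothed over time by a filter such as the EKF the abstract mentions.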

# SOFA-Net: クラウドカウントのための2次および1次アテンションネットワーク

SOFA-Net: Second-Order and First-order Attention Network for Crowd Counting ( http://arxiv.org/abs/2008.03723v1 )

ライセンス: Link先を確認

Haoran Duan, Shidong Wang, Yu Guan

(参考訳) 近年、スマートシティーに広範に応用されているため、画像や動画からの群衆自動カウントが注目されている。しかし、密集した群衆をモデル化することは困難であり、既存の作品のほとんどが信頼性が低下する。本研究で提案したSOFA-Net(Second-Order and First-order Attention Network)では,高密度頭部のチャネルワイド空間情報の選択性を維持するために2次統計を抽出し,頭部領域の特徴識別を強化する1次統計を補完情報として用いた。マルチストリームアーキテクチャにより,提案する2次/1次統計を学習し,ロバスト表現の洗練に注意を向けた。提案手法を4つの公開データセットで評価し,そのほとんどは最新技術に到達した。また,提案するsofa-netの構成成分について広範な実験を行い,課題シナリオにおけるモデル群における2次・1次統計の高機能化を示唆した。私たちの知る限りでは、クラウドカウントの2/1次統計を探求する最初の仕事です。

Automated crowd counting from images/videos has attracted more attention in recent years because of its wide application in smart cities. However, modelling dense crowd heads is challenging, and most existing works become less reliable in such scenes. To obtain the appropriate crowd representation, in this work we proposed SOFA-Net (Second-Order and First-order Attention Network): second-order statistics were extracted to retain selectivity of the channel-wise spatial information for dense heads while first-order statistics, which can enhance the feature discrimination for the heads' areas, were used as complementary information. Via a multi-stream architecture, the proposed second/first-order statistics were learned and transformed into attention for robust representation refinement. We evaluated our method on four public datasets and the performance reached state-of-the-art on most of them. Extensive experiments were also conducted to study the components in the proposed SOFA-Net, and the results suggested the high capability of second/first-order statistics in modelling crowds in challenging scenarios. To the best of our knowledge, this is the first work to explore second/first-order statistics for crowd counting.

翻訳日:2022-11-01 04:26:17 公開日:2020-08-09

# イメージインペインティングのための繰り返し特徴推論

Recurrent Feature Reasoning for Image Inpainting ( http://arxiv.org/abs/2008.03737v1 )

ライセンス: Link先を確認

Jingyuan Li, Ning Wang, Lefei Zhang, Bo Du, Dacheng Tao

(参考訳) 既存の塗装法は, 画像欠陥の回復に有望な性能を達成している。しかし,穴中心の制約の欠如により,大きな連続孔への充填は困難である。本稿では、主にプラグ・アンド・プレイのリカレント特徴推論モジュールと知識一貫性注意(kca)モジュールによって構築されるリカレント特徴推論(rfr)ネットワークを考案する。 RFRモジュールは、人間がパズルを解く方法(つまり、より簡単な部分を解き、難解な部分を解くための追加情報として結果を使用する)に似て、畳み込み特徴写像の穴の境界を反復的に推論し、さらに推論するための手がかりとして使用する。モジュールは徐々にホールセンターの制約を強化し、その結果は明確になる。 RFRの特徴マップ内の離れた場所からの情報を取得するため、我々はさらにKCAを開発し、RFRに組み込む。経験的に、提案したRFR-Netと既存のバックボーンを比較して、RFR-Netがより効率的であることを示す(例えば、同じモデルサイズで4\%のSSIMの改善)。次に、ネットワークを現在の最先端のコンテキストに配置し、パフォーマンスを向上させます。対応するソースコードは、https://github.com/jingyuanli001/RFR-Inpaintingで入手できる。

Existing inpainting methods have achieved promising performance for recovering regular or small image defects. However, filling in large continuous holes remains difficult due to the lack of constraints for the hole center. In this paper, we devise a Recurrent Feature Reasoning (RFR) network, which is mainly constructed from a plug-and-play Recurrent Feature Reasoning module and a Knowledge Consistent Attention (KCA) module. Analogous to how humans solve puzzles (i.e., first solve the easier parts and then use the results as additional information to solve difficult parts), the RFR module recurrently infers the hole boundaries of the convolutional feature maps and then uses them as clues for further inference. The module progressively strengthens the constraints for the hole center, and the results become explicit. To capture information from distant places in the feature map for RFR, we further develop KCA and incorporate it in RFR. Empirically, we first compare the proposed RFR-Net with existing backbones, demonstrating that RFR-Net is more efficient (e.g., a 4% SSIM improvement for the same model size). We then place the network in the context of the current state-of-the-art, where it exhibits improved performance. The corresponding source code is available at: https://github.com/jingyuanli001/RFR-Inpainting

翻訳日:2022-11-01 04:25:53 公開日:2020-08-09

# 核ノルムと学習グラフモデルを用いた奥行き画像の復調

Depth image denoising using nuclear norm and learning graph model ( http://arxiv.org/abs/2008.03741v1 )

ライセンス: Link先を確認

Chenggang Yan, Zhisheng Li, Yongbing Zhang, Yutao Liu, Xiangyang Ji, Yongdong Zhang

(参考訳) 3次元(3d)シーンを反映しており、コンピュータビジョンの様々な分野に適用できるため、近年では奥行き画像がホットな研究テーマになりつつある。しかし、深度カメラから得られた深度画像にはノイズなどの汚れが含まれており、深度関連のアプリケーションの性能を著しく損なう。本稿では,パッチ間の類似性収集にグループベース画像復元手法が有効であることを考慮し,グループベース核ノルム・学習グラフ(gnnlg)モデルを提案した。各パッチに対して、検索ウィンドウ内で最もよく似たパッチを見つけてグループ化する。本モデルでは,グループパッチの内在的低ランク特性を利用した。さらに,画像のトポロジ的構造を反映したグラフラプラシアン行列を探索し,よりスムーズな事前処理を行うことを目的として,多様体学習手法を検証し,効率的な学習戦略を考案した。高速で高速な収束を実現するために,GNNLG を解くために乗算器の交互方向法 (ADMM) を提案する。実験の結果,提案手法は主観的,客観的両基準において,他の最先端の復調法よりも優れていることがわかった。

Depth image denoising is becoming an increasingly active research topic because depth images reflect the three-dimensional (3D) scene and can be applied in various fields of computer vision. However, depth images obtained from depth cameras usually contain degradations such as noise, which greatly impair the performance of depth-related applications. In this paper, considering that group-based image restoration methods are more effective at gathering the similarity among patches, we propose a group-based nuclear norm and learning graph (GNNLG) model. For each patch, we find and group the most similar patches within a search window. The intrinsic low-rank property of the grouped patches is exploited in our model. In addition, we study the manifold learning method and devise an effective optimized learning strategy to obtain the graph Laplacian matrix, which reflects the topological structure of the image, to further impose smoothing priors on the denoised depth image. To achieve fast speed and good convergence, the alternating direction method of multipliers (ADMM) is used to solve our GNNLG model. The experimental results show that the proposed method is superior to other current state-of-the-art denoising methods in both subjective and objective criteria.

翻訳日:2022-11-01 04:25:29 公開日:2020-08-09

# SemEval-2020 Task 8: Memotion Analysis - The Visuo-Lingual Metaphor!

SemEval-2020 Task 8: Memotion Analysis -- The Visuo-Lingual Metaphor! ( http://arxiv.org/abs/2008.03781v1 )

ライセンス: Link先を確認

Chhavi Sharma and Deepesh Bhageria and William Scott and Srinivas PYKL and Amitava Das and Tanmoy Chakraborty and Viswanath Pulabaigari and Bjorn Gamback

(参考訳) ソーシャルメディア上の情報は、テキスト、ビジュアル、オーディオなどの様々なモダリティから構成される。 NLPとコンピュータビジョンのコミュニティは、ソーシャルメディアを研究するために単独で1つの顕著なモダリティしか利用していない。しかし、インターネットミームの計算処理にはハイブリッドアプローチが必要である。 Facebook、Instagram、Twitterなどのソーシャルメディアプラットフォームにおけるインターネットミームの普及はさらに、そのようなマルチモーダルコンテンツは無視できないことを示唆している。我々の知る限りでは、ミームの感情分析にはあまり注意が払われていない。本提案の目的は,インターネットミームの自動処理に研究コミュニティの注意を向けることである。 Memotion Analysisタスクは、約10kの注釈付きミームをリリースし、感情(ポジティブ、ネガティブ、ニュートラル)、感情の種類(皮肉、面白い、不快、モチベーション)、それに対応する強度という、人間の注釈付きラベルを付けた。課題は、ミームの感情分析(肯定的、否定的、中立的)、ミームの全体的な感情分類(ユーモア、皮肉、攻撃的、動機づけ的)、ミームの感情の強さの分類という3つのサブタスクで構成されていた。最高成績は3つのサブタスクごとにそれぞれ0.35, 0.51, 0.32のF1スコア(マクロ平均)であった。

Information on social media comprises various modalities such as textual, visual, and audio. NLP and Computer Vision communities often leverage only one prominent modality in isolation to study social media. However, the computational processing of Internet memes needs a hybrid approach. The growing ubiquity of Internet memes on social media platforms such as Facebook, Instagram, and Twitter further suggests that we cannot ignore such multimodal content anymore. To the best of our knowledge, meme emotion analysis has received little attention. The objective of this proposal is to bring the attention of the research community towards the automatic processing of Internet memes. The Memotion Analysis task released approximately 10K annotated memes with human-annotated labels, namely sentiment (positive, negative, neutral), type of emotion (sarcastic, funny, offensive, motivational), and their corresponding intensity. The challenge consisted of three subtasks: sentiment (positive, negative, and neutral) analysis of memes, overall emotion (humour, sarcasm, offensive, and motivational) classification of memes, and classification of the intensity of meme emotion. The best performances achieved were F1 (macro average) scores of 0.35, 0.51, and 0.32 for the three subtasks, respectively.

翻訳日:2022-11-01 04:25:08 公開日:2020-08-09
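
The scores reported above are macro-averaged F1 values. As a reminder of what that metric measures, here is a minimal sketch that computes per-class F1 and averages the classes with equal weight. The label strings and the toy predictions are made up for illustration, and the task's official scorer may differ in details such as how absent classes are handled.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # F1 = 2TP / (2TP + FP + FN); the denominator is positive because
        # every label in `labels` occurs at least once in y_true or y_pred.
        scores.append(2 * tp / (2 * tp + fp + fn))
    return sum(scores) / len(scores)


# Hypothetical toy data, not taken from the Memotion dataset.
gold = ["positive", "positive", "negative", "neutral"]
pred = ["positive", "negative", "negative", "neutral"]
print(macro_f1(gold, pred))
```

Because every class counts equally, a system that ignores a rare class is penalized heavily, which is one reason macro F1 values on imbalanced sentiment data tend to be low.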

# 高解像度画像に基づくディープラーニングによるLiDARデータ強化:低コストLiDARを用いた高性能LiDARSLAMの実現へのアプローチ

LiDAR Data Enrichment Using Deep Learning Based on High-Resolution Image: An Approach to Achieve High-Performance LiDAR SLAM Using Low-cost LiDAR ( http://arxiv.org/abs/2008.03694v1 )

ライセンス: Link先を確認

Jiang Yue, Weisong Wen, Jing Han, and Li-Ta Hsu

(参考訳) LiDARベースのSLAMアルゴリズムは、過去数十年で自動運転車(ADV)の堅牢で正確な位置決めを提供するために、広範囲に研究されている。 64チャンネルの高品位3dlidarで十分な性能を得ることができ、密集した点雲を提供することができる。残念ながら、高い価格がADVの広範な商業化を著しく妨げている。コスト効率のよい16チャンネルの3D LiDARは、有望な代替品だ。しかし、16チャンネルのlidarによってのみ、限定的かつスパースポイントの雲が提供され、動的環境におけるadvの十分な位置決め精度を保証することはできない。低コストカメラからの高解像度画像は、周囲についての豊富な情報を提供することができる。しかし、画像から明らかな深度情報は得られない。本稿では,3D LiDARとカメラの相補性に着想を得て,カメラからの高解像度画像を用いて,最先端のディープラーニングアルゴリズムに基づいて,低コストの16チャンネルLiDARから生の3D点雲を濃縮する手法を提案する。 ERFNetは、まず、生のスパース3D点雲の助けを借りて画像を分割するために使用される。一方、スパース畳み込みニューラルネットワークは、生のスパース3d点雲に基づいて密集した点雲を予測するために用いられる。そして、新たな多層畳み込みニューラルネットワークを用いてerfnetのセグメンテーション出力と予測された濃密点雲を融合させ、予測した3d点雲を精製する。最後に、高密度点雲を用いて、最先端の正規分布変換(NDT)に基づいてLiDAR SLAMを実行する。再編集されたkittiデータセットに対して,(1)スパース3dポイントの雲は,平均2乗誤差1.1mseで著しく濃縮されている。 2) LiDAR SLAM から生成された地図はより密集しており, 精度が著しく低下しない。

LiDAR-based SLAM algorithms have been extensively studied over the past decades to provide robust and accurate positioning for autonomous driving vehicles (ADV). Satisfactory performance can be obtained using a high-grade 3D LiDAR with 64 channels, which can provide dense point clouds. Unfortunately, its high price significantly hinders its extensive commercialization in ADV. The cost-effective 3D LiDAR with 16 channels is a promising replacement. However, a 16-channel LiDAR provides only limited and sparse point clouds, which cannot guarantee sufficient positioning accuracy for ADV in challenging dynamic environments. The high-resolution image from a low-cost camera can provide ample information about the surroundings. However, explicit depth information is not available from the image. Inspired by the complementarity of 3D LiDAR and camera, this paper proposes to use the high-resolution images from a camera to enrich the raw 3D point clouds from a low-cost 16-channel LiDAR, based on a state-of-the-art deep learning algorithm. An ERFNet is first employed to segment the image with the aid of the raw sparse 3D point clouds. Meanwhile, a sparse convolutional neural network is employed to predict dense point clouds from the raw sparse 3D point clouds. Then, the predicted dense point clouds are fused with the segmentation outputs from ERFNet using a novel multi-layer convolutional neural network to refine the predicted 3D point clouds. Finally, the enriched point clouds are employed to perform LiDAR SLAM based on the state-of-the-art normal distribution transform (NDT). We tested our approach on the re-edited KITTI datasets: (1) the sparse 3D point clouds are significantly enriched, with a mean square error (MSE) of 1.1 m; (2) the map generated from the LiDAR SLAM is denser and includes more details without significant accuracy loss.

翻訳日:2022-11-01 04:18:20 公開日:2020-08-09

# 新型コロナウイルス(covid-19)診断のための深層学習技術のレビュー

A Review on Deep Learning Techniques for the Diagnosis of Novel Coronavirus (COVID-19) ( http://arxiv.org/abs/2008.04815v1 )

ライセンス: Link先を確認

Md. Milon Islam, Fakhri Karray, Reda Alhajj, Jia Zeng

(参考訳) 新型コロナウイルス(COVID-19)の感染拡大は世界中で悲惨な状況となり、過去100年で最も急性で深刻な病気の1つとなっている。新型コロナウイルス(covid-19)の感染率は世界中で急速に増加している。このパンデミックのワクチンはまだ発見されていないが、深層学習技術は、新型コロナウイルスの自動診断に臨床医が使用する兵器の強力なツールであることが証明された。本稿では,CT (Computer Tomography) やX線といった様々な医用画像を用いた深層学習技術に基づく最近のシステムの概要を述べる。本稿では、深層学習技術を用いて新型コロナウイルスの診断のために開発されたシステムについて論じ、これらのネットワークのトレーニングに使用されるよく知られたデータセットについて考察する。また、この分野の研究者によって開発されたデータパーティショニング技術や様々なパフォーマンス対策についても強調する。分類学は、最近の著作を適切な洞察のために分類するために引き起こされる。最後に、新型コロナウイルス検出におけるディープラーニング手法の使用に伴う課題と、この研究領域における今後の展望について述べる。本論文は、この点において、深層学習技術がどのように使われているか、また新型コロナウイルスの感染拡大と戦うためにどのように機能するかについて、専門家や技術者に新たな洞察を提供することを目的としている。

The novel coronavirus (COVID-19) outbreak has created a calamitous situation all over the world and has become one of the most acute and severe ailments of the past hundred years. The prevalence rate of COVID-19 is rising rapidly every day throughout the globe. Although no vaccines for this pandemic have been discovered yet, deep learning techniques have proved to be a powerful tool in the arsenal used by clinicians for the automatic diagnosis of COVID-19. This paper aims to give an overview of recently developed systems based on deep learning techniques that use different medical imaging modalities such as Computed Tomography (CT) and X-ray. This review specifically discusses the systems developed for COVID-19 diagnosis using deep learning techniques and provides insights on well-known data sets used to train these networks. It also highlights the data partitioning techniques and various performance measures developed by researchers in this field. A taxonomy is drawn to categorize the recent works for proper insight. Finally, we conclude by addressing the challenges associated with the use of deep learning methods for COVID-19 detection and probable future trends in this research area. This paper is intended to provide experts (medical or otherwise) and technicians with new insights into the ways deep learning techniques are used in this regard and how they can potentially advance work on combating the outbreak of COVID-19.

翻訳日:2022-11-01 04:17:49 公開日:2020-08-09

# 有向ネットワークにおけるコミュニティ検出のためのスペクトルアルゴリズム

Spectral Algorithms for Community Detection in Directed Networks ( http://arxiv.org/abs/2008.03820v1 )

ライセンス: Link先を確認

Zhe Wang, Yingbin Liang and Pengsheng Ji

(参考訳) 大規模ソーシャルネットワークにおけるコミュニティ検出は,ノードの次数不均一性に影響される。有向ネットワークに対するD-SCOREアルゴリズムを導入し、クラスタリング前に隣接行列の特異ベクトルの要素ワイド比をとることにより、この効果を低減した。統計的引用ネットワークについて有意義な結果を得たが,その性能に関する厳密な分析は得られなかった。まず, 有向次数補正ブロックモデル (Directed-DCBM) のアルゴリズムとその変種に関する理論的保証を確立する。第2に,本論文は,D-SCOREアルゴリズムにおいて,特異ベクトルではなく,元のネットワークの情報を用いて,コミュニティコア外部のノードをアタッチすることで,大幅な改良を行う。

Community detection in large social networks is affected by degree heterogeneity of nodes. The D-SCORE algorithm for directed networks was introduced to reduce this effect by taking the element-wise ratios of the singular vectors of the adjacency matrix before clustering. Meaningful results were obtained for the statistician citation network, but a rigorous analysis of its performance was missing. First, this paper establishes theoretical guarantees for this algorithm and its variants for the directed degree-corrected block model (Directed-DCBM). Second, this paper provides significant improvements to the original D-SCORE algorithms by attaching the nodes outside of the community cores using the information of the original network instead of the singular vectors.

翻訳日:2022-11-01 04:17:31 公開日:2020-08-09

# 高速かつ高精度なCRF構成解析

Fast and Accurate Neural CRF Constituency Parsing ( http://arxiv.org/abs/2008.03736v1 )

ライセンス: Link先を確認

Yu Zhang, Houquan Zhou, Zhenghua Li

(参考訳) 確率分布の推定はnlp分野の主要な問題の一つである。しかし、深層学習(DL)とプレDLの時代において、シーケンスラベリングタスクにおける線形鎖 CRF の広大な応用とは異なり、木構造 CRF を選挙区解析に適用する研究はほとんどなく、主に内向きアルゴリズムの複雑さと非効率性のためである。この研究は、高速で正確なCRF成分分析器を提示する。鍵となる考え方は、GPU上の大きなテンソル演算によって損失計算の内側のアルゴリズムをバッチ化し、一方、効率的なバックプロパゲーションによる勾配計算の外側のアルゴリズムを避けることである。また,より効率的な2段階ブラケットラベル解析手法を提案する。依存関係解析の最近の進歩に触発された解析性能を改善するため,境界表現と双対注意に基づく新たなスコアリングアーキテクチャを導入し,有益なドロップアウト戦略を提案する。 PTB, CTB5.1, CTB7 の実験では,2段階の CRF パーサが w/o と w/BERT の両設定で新たな最先端性能を実現し,毎秒1,000文以上を解析可能である。私たちはコードをhttps://github.com/yzhangcs/crfparでリリースします。

Estimating probability distribution is one of the core issues in the NLP field. However, in both deep learning (DL) and pre-DL eras, unlike the vast applications of linear-chain CRF in sequence labeling tasks, very few works have applied tree-structure CRF to constituency parsing, mainly due to the complexity and inefficiency of the inside-outside algorithm. This work presents a fast and accurate neural CRF constituency parser. The key idea is to batchify the inside algorithm for loss computation by direct large tensor operations on GPU, and meanwhile avoid the outside algorithm for gradient computation via efficient back-propagation. We also propose a simple two-stage bracketing-then-labeling parsing approach to improve efficiency further. To improve the parsing performance, inspired by recent progress in dependency parsing, we introduce a new scoring architecture based on boundary representation and biaffine attention, and a beneficial dropout strategy. Experiments on PTB, CTB5.1, and CTB7 show that our two-stage CRF parser achieves new state-of-the-art performance on both settings of w/o and w/ BERT, and can parse over 1,000 sentences per second. We release our code at https://github.com/yzhangcs/crfpar.

翻訳日:2022-11-01 04:17:20 公開日:2020-08-09
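
The inside algorithm that the abstract above batchifies computes the log-partition function over all binary bracketings of a sentence. The following is a plain, unbatched sketch of that recursion. It assumes a tree's score is the sum of the scores of its spans; the name `span_score` and the half-open span indexing `[i, j)` are illustrative choices, not the authors' code.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def inside_log_partition(span_score, n):
    """Log-partition over all binary trees of an n-word sentence.

    span_score[i][j] is the score of a constituent covering words i..j-1.
    """
    inside = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(n):
        inside[i][i + 1] = span_score[i][i + 1]
    # Cells with the same width do not depend on each other, which is the
    # property a batched tensor implementation can exploit.
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            splits = [inside[i][k] + inside[k][j] for k in range(i + 1, j)]
            inside[i][j] = span_score[i][j] + logsumexp(splits)
    return inside[0][n]


# With all scores zero, exp(log Z) counts the binary trees over 4 words.
zeros = [[0.0] * 5 for _ in range(5)]
print(math.exp(inside_log_partition(zeros, 4)))
```

With the log-partition written as a differentiable computation, its gradient with respect to the span scores can be obtained by automatic differentiation, which is the sense in which back-propagation can stand in for an explicit outside pass.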

# 画像暗号化のための遺伝的アルゴリズムのランダム性評価:信号処理手法

Randomness Evaluation of a Genetic Algorithm for Image Encryption: A Signal Processing Approach ( http://arxiv.org/abs/2008.03681v1 )

ライセンス: Link先を確認

Zoubir Hamici

(参考訳) 本稿では,セキュアな画像通信のためのブロック暗号のランダム性評価を行う。 GFHT暗号(GFHT cipher)は、細菌の抗生物質耐性に触発された遺伝子融合(GF)と水平遺伝子導入(HGT)を組み合わせた遺伝的アルゴリズムである。対称暗号鍵は、多層ランダム配列を持つ4対の染色体によって生成される。暗号は1ブロックの主キーエージェントのGFから始まり、HGTは、遺伝子がピクセルであり、染色体が行と列である難読化を行う。画像ハッシュ値から抽出したソルトを用いてワンタイムパッド(OTP)方式を実装し、メインのパスフレーズやキーを変更することなく、1ピクセルの変更で異なる暗号化キーを生成する。これにより、99%の極端な雪崩効果が得られる。ランダムマトリクス理論,パワースペクトル密度,雪崩効果,2次元オートコリレーション,画素ランダムネステスト,チ二乗仮説テストに基づくランダムネス評価は,暗号化された画像が均一なホワイトノイズの統計的挙動を採用することを示す。さらに,カオス遺伝暗号との比較によりGFHTアルゴリズムの利点が示された。

In this paper, a randomness evaluation of a block cipher for secure image communication is presented. The GFHT cipher is a genetic algorithm that combines gene fusion (GF) and horizontal gene transfer (HGT), both inspired by antibiotic resistance in bacteria. The symmetric encryption key is generated by four pairs of chromosomes with multi-layer random sequences. The encryption starts with a GF of the principal key-agent in a single block; then HGT performs obfuscation, where the genes are pixels and the chromosomes are the rows and columns. A salt extracted from the image hash value is used to implement a one-time pad (OTP) scheme; hence, a modification of one pixel generates a different encryption key without changing the main passphrase or key. Therefore, an extreme avalanche effect of 99% is achieved. Randomness evaluation based on random matrix theory, power spectral density, avalanche effect, 2D auto-correlation, pixel randomness tests, and chi-square hypothesis testing shows that encrypted images adopt the statistical behavior of uniform white noise, hence validating the theoretical model by experimental results. Moreover, a performance comparison with chaos-genetic ciphers shows the merit of the GFHT algorithm.

翻訳日:2022-11-01 04:16:55 公開日:2020-08-09
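
Among the evaluations listed above, the chi-square test is simple to illustrate. The abstract does not specify the exact procedure, so the sketch below shows a common histogram-uniformity variant for 8-bit images: it compares the observed count of each gray level with the count expected under a uniform distribution. The function name and the flat pixel list are assumptions made for the example.

```python
def chi_square_uniform(pixels, levels=256):
    """Chi-square statistic of a pixel histogram against a uniform histogram.

    `pixels` is a flat sequence of integers in range(levels).
    """
    counts = [0] * levels
    for value in pixels:
        counts[value] += 1
    expected = len(pixels) / levels
    return sum((count - expected) ** 2 / expected for count in counts)


# A perfectly flat histogram gives 0; a constant image gives a huge value.
print(chi_square_uniform(list(range(256)) * 4))
print(chi_square_uniform([0] * 256))
```

For 256 levels the statistic has 255 degrees of freedom, and the critical value at the 5% significance level is about 293.25; a statistic below that value is consistent with a uniform histogram.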

# コード構築遺伝的プログラミング

Code Building Genetic Programming ( http://arxiv.org/abs/2008.03649v1 )

ライセンス: Link先を確認

Edward Pantridge, Lee Spector

(参考訳) 近年、遺伝的プログラミングの分野は自動プログラミングに多大な進歩を遂げている。 pushgpやグラマーガイド遺伝的プログラミングのような現代のプログラム合成手法の研究と開発は、導入的な学術的な設定で典型的に割り当てられる問題を解決するプログラムを作成できる。これらの問題は、単純なデータ構造、基本的な制御フローパターン、プリミティブで重複しないデータ型(継承や複合型などなしで)の狭いセットに焦点を当てている。プログラム合成のための遺伝的プログラミング手法が、任意のデータ型、データ構造、および既存のコードベースから引き出された仕様を使用するプログラムを合成する能力を説得力のある形で実証した例はほとんどない。本稿では,リフレクションやファーストクラス仕様などのプログラミング言語機能を活用することで,これを実現するためのフレームワークとしてcbgp(code building genetic programming)を提案する。 CBGPは、ホスト言語のソースコードに実行または変換できる計算グラフを生成する。 CBGPの新たな機能を示すために,非原始多型データ型といくつかの標準プログラム合成ベンチマークを用いた新しいベンチマーク結果を提案する。

In recent years the field of genetic programming has made significant advances towards automatic programming. Research and development of contemporary program synthesis methods, such as PushGP and Grammar Guided Genetic Programming, can produce programs that solve problems typically assigned in introductory academic settings. These problems focus on a narrow, predetermined set of simple data structures, basic control flow patterns, and primitive, non-overlapping data types (without, for example, inheritance or composite types). Few, if any, genetic programming methods for program synthesis have convincingly demonstrated the capability of synthesizing programs that use arbitrary data types, data structures, and specifications that are drawn from existing codebases. In this paper, we introduce Code Building Genetic Programming (CBGP) as a framework within which this can be done, by leveraging programming language features such as reflection and first-class specifications. CBGP produces a computational graph that can be executed or translated into source code of a host language. To demonstrate the novel capabilities of CBGP, we present results on new benchmarks that use non-primitive, polymorphic data types as well as some standard program synthesis benchmarks.

翻訳日:2022-11-01 04:16:34 公開日:2020-08-09

# 長尺データのための特徴空間拡張

Feature Space Augmentation for Long-Tailed Data ( http://arxiv.org/abs/2008.03673v1 )

ライセンス: Link先を確認

Peng Chu and Xiao Bian and Shaopeng Liu and Haibin Ling

(参考訳) 実世界のデータは、各クラスの頻度が通常異なるため、しばしばロングテールの分布に従う。例えば、データセットは、多数の未表現のクラスと、十分なデータを持つ少数のクラスを持つことができる。しかしながら、データセットを表現するモデルは通常、クラス間で合理的に均質なパフォーマンスを持つことが期待されている。データのアンバランス問題を緩和するためのベストプラクティスとして、クラスバランス損失の導入とデータ再サンプリングと拡張に関する高度な手法がある。しかし、未表示のクラスに関する他の問題は、欠落した情報を回復するために追加の知識に頼る必要がある。本研究では,特徴空間における表現不足のクラスを,十分なサンプルを持つクラスから学習した特徴量で拡張することで,長鎖問題に対処する新しい手法を提案する。特に,各クラスの特徴を,クラスアクティベーションマップを用いてクラスジェネリックコンポーネントとクラス固有のコンポーネントに分解する。未表現のクラスの新しいサンプルは、未表現のクラスからクラス固有の特徴を、混乱したクラスからクラスジェネリックな特徴に融合させることで、トレーニング段階のフライで生成される。 iNaturalist、ImageNet-LT、Places-LT、CIFARの長期バージョンなど、さまざまなデータセットで得られた結果から、アートパフォーマンスの現状が示されている。

Real-world data often follow a long-tailed distribution, as the frequency of each class is typically different. For example, a dataset can have a large number of under-represented classes and a few classes with more than sufficient data. However, a model to represent the dataset is usually expected to have reasonably homogeneous performance across classes. Introducing class-balanced loss and advanced methods for data re-sampling and augmentation are among the best practices to alleviate the data imbalance problem. However, the other part of the problem, concerning the under-represented classes, has to rely on additional knowledge to recover the missing information. In this work, we present a novel approach to address the long-tailed problem by augmenting the under-represented classes in the feature space with the features learned from the classes with ample samples. In particular, we decompose the features of each class into a class-generic component and a class-specific component using class activation maps. Novel samples of under-represented classes are then generated on the fly during training stages by fusing the class-specific features from the under-represented classes with the class-generic features from confusing classes. Our results on different datasets such as iNaturalist, ImageNet-LT, Places-LT, and a long-tailed version of CIFAR show state-of-the-art performance.

翻訳日:2022-11-01 04:16:16 公開日:2020-08-09

# レーダに基づく動的占有グリッドマッピングと物体検出

Radar-based Dynamic Occupancy Grid Mapping and Object Detection ( http://arxiv.org/abs/2008.03696v1 )

ライセンス: Link先を確認

Christopher Diehl, Eduard Feicho, Alexander Schwambach, Thomas Dammeier, Eric Mares, Torsten Bertram

(参考訳) センサデータ融合と物体追跡を利用した環境モデリングは安全な自動運転に不可欠である。近年,静的な環境を想定した古典的占有グリッドマップは,低レベルのデータ融合の可能性を維持しつつ,動的局所環境の位置と速度分布を推定するダイナミック占有グリッドマップに拡張されている。本稿では,従来のアプローチのさらなる発展について述べる。著者の知識を最大限に活用するために,レーダデータのみに基づく動的占有グリッドマッピングとその後の解析に関する出版物は存在しない。そこで本研究では,複数のレーダセンサのデータを融合し,グリッドを用いた物体追跡・マッピング手法を適用した。その後、動的領域のクラスタリングは高レベルなオブジェクト情報を提供する。比較のためにlidarベースの手法も開発されている。本手法は都市環境における移動車からの実世界データと質的,定量的に評価する。この評価は、異なる比較指標を考慮して、レーダベースの動的占有グリッドマップの利点を示す。

Environment modeling utilizing sensor data fusion and object tracking is crucial for safe automated driving. In recent years, the classical occupancy grid map approach, which assumes a static environment, has been extended to dynamic occupancy grid maps, which maintain the possibility of a low-level data fusion while also estimating the position and velocity distribution of the dynamic local environment. This paper presents the further development of a previous approach. To the best of the authors' knowledge, there is no publication about dynamic occupancy grid mapping with subsequent analysis based only on radar data. Therefore, in this work, the data of multiple radar sensors are fused, and a grid-based object tracking and mapping method is applied. Subsequently, the clustering of dynamic areas provides high-level object information. For comparison, a lidar-based method is also developed. The approach is evaluated qualitatively and quantitatively with real-world data from a moving vehicle in urban environments. The evaluation illustrates the advantages of the radar-based dynamic occupancy grid map, considering different comparison metrics.

翻訳日:2022-11-01 04:15:40 公開日:2020-08-09
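
For context, the classical static occupancy grid that the abstract above takes as its starting point is usually maintained with a per-cell log-odds update. The sketch below shows only that textbook update for a single cell, not the paper's dynamic, radar-based method; the inverse sensor model value 0.7 is a made-up example.

```python
import math


def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


def update_cell(log_odds, p_occupied_given_measurement, prior=0.5):
    """Binary Bayes filter update of one grid cell in log-odds form."""
    return log_odds + logit(p_occupied_given_measurement) - logit(prior)


def probability(log_odds):
    """Convert log-odds back to an occupancy probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))


# Two consistent "occupied" measurements raise the belief above a single one.
cell = 0.0  # log-odds 0 corresponds to probability 0.5
cell = update_cell(cell, 0.7)
cell = update_cell(cell, 0.7)
print(probability(cell))
```

Dynamic occupancy grids extend this idea by also estimating a velocity distribution per cell, which this sketch does not attempt.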

# 全自動フォトグラムデータセグメンテーションと物体情報抽出によるシミュレーション地形の作成

Fully Automated Photogrammetric Data Segmentation and Object Information Extraction Approach for Creating Simulation Terrain ( http://arxiv.org/abs/2008.03697v1 )

ライセンス: Link先を確認

Meida Chen, Andrew Feng, Kyle McCullough, Pratusha Bhuvana Prasad, Ryan McAlinden, Lucio Soibelman, Mike Enloe

(参考訳) これまでの研究では、視覚的にリアルな3Dメッシュを、有能なカメラと効率的な測光ソフトウェア技術を備えた安価な無人航空機システム(UAS)で、自動的に再構築できることが実証された。しかし、そのような生成されたデータは、オブジェクトのセマンティック情報や機能(例えば、人工物、植生、地面、オブジェクト材料など)を含まないため、洗練されたユーザレベルとシステムレベルの相互作用を許さない。トレーニングとシミュレーションのための現実的な仮想環境(ミッション計画、リハーサル、脅威検出など)を作成する際のデータのユースケースを考えると、データのセグメンテーションとオブジェクト情報の抽出が不可欠である。そこで本研究の目的は,完全に自動化されたフォトグラムデータセグメンテーションおよびオブジェクト情報抽出フレームワークを設計・開発することである。提案手法を検証するために, 著者らが設計したシミュレーションツールであるAerial Terrain Line of Sight Analysis System (ATLAS) の仮想環境構築に, セグメントデータと抽出した特徴を用いた。その結果,3次元メッシュツリーは,抽出した個々の木の位置を用いてジオタイプな3次元ツリーモデルに置き換えることができた。抽出された樹木の特徴(色、幅、高さ)は、適切な樹木種を選択し、視覚的品質を高めるのに有用である。また、同定された地材情報は、パスファインディングに考慮することができる。最も短い経路は、物理的距離だけでなく、異なる地上材料におけるオフロード車両の性能も考慮して計算することができる。

Our previous works have demonstrated that visually realistic 3D meshes can be automatically reconstructed with low-cost, off-the-shelf unmanned aerial systems (UAS) equipped with capable cameras and efficient photogrammetric software techniques. However, such generated data do not contain semantic information/features of objects (e.g., man-made objects, vegetation, ground, and object materials) and cannot support sophisticated user-level and system-level interaction. Considering the use case of the data in creating realistic virtual environments for training and simulations (e.g., mission planning, rehearsal, and threat detection), segmenting the data and extracting object information are essential tasks. Thus, the objective of this research is to design and develop a fully automated photogrammetric data segmentation and object information extraction framework. To validate the proposed framework, the segmented data and extracted features were used to create virtual environments in the authors' previously designed simulation tool, the Aerial Terrain Line of Sight Analysis System (ATLAS). The results showed that 3D mesh trees could be replaced with geo-typical 3D tree models using the extracted individual tree locations. The extracted tree features (i.e., color, width, height) are valuable for selecting the appropriate tree species and enhancing visual quality. Furthermore, the identified ground material information can be taken into consideration for pathfinding. The shortest path can be computed considering not only the physical distance but also the off-road vehicle performance capabilities on different ground surface materials.

翻訳日:2022-11-01 04:15:26 公開日:2020-08-09
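
The pathfinding idea in the last two sentences of the abstract above can be sketched with Dijkstra's algorithm on a grid whose cells carry a ground-material label. Everything specific here is hypothetical: the material names, the traversal costs, the 4-connected grid, and the rule that the cost of a step is the cost of the cell being entered. The abstract does not describe how ATLAS implements this.

```python
import heapq


def cheapest_path_cost(grid, cost_per_material, start, goal):
    """Minimum total traversal cost from start to goal on a 4-connected grid."""
    rows, cols = len(grid), len(grid[0])
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        cost, (r, c) = heapq.heappop(queue)
        if (r, c) == goal:
            return cost
        if cost > best.get((r, c), float("inf")):
            continue  # stale queue entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                # Entering a cell costs whatever its material costs.
                new_cost = cost + cost_per_material[grid[nr][nc]]
                if new_cost < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new_cost
                    heapq.heappush(queue, (new_cost, (nr, nc)))
    return float("inf")


# Hypothetical terrain: the direct route crosses mud, the detour stays on road.
terrain = [
    ["road", "mud", "road"],
    ["road", "mud", "road"],
    ["road", "road", "road"],
]
print(cheapest_path_cost(terrain, {"road": 1.0, "mud": 10.0}, (0, 0), (0, 2)))
```

With uniform costs the same function reduces to a plain shortest-distance search, so the material table is the only thing that changes the chosen route.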

# 隠れマルコフモデルとLSTMの比較分析 : シミュレーション的アプローチ

Comparative Analysis of the Hidden Markov Model and LSTM: A Simulative Approach ( http://arxiv.org/abs/2008.03825v1 )

ライセンス: Link先を確認

Manie Tadayon, Greg Pottie

(参考訳) 近年,金融,教育,生物学,工学など,さまざまな分野の現実的なプロセスが時系列としてモデル化できることから,時系列データやシーケンシャルデータが注目されている。 kalmanフィルタ、隠れマルコフモデル、long short term memory (lstm)のような多くのアルゴリズムや手法がデータの推論や予測のために提案されているが、それらの利用はアプリケーション、問題の種類、利用可能なデータ、十分な正確さや損失に大きく依存している。本稿では,教師付きおよび教師なしマルコフモデルとLSTMを比較し,学習に必要なデータ量,複雑性,予測精度について比較する。さらに,定常および非定常状況下で,観測を識別し,個別のマルコフモデルに変換する様々な手法を提案する。その結果,大量のラベル付きデータが利用できない場合,教師なしマルコフモデルでさえLSTMより優れていることがわかった。さらに,1次マルコフ仮定が満たされていない場合でも,隠れマルコフモデルがシーケンスデータの処理に有効な方法であることを示す。

Time series and sequential data have gained significant attention recently since many real-world processes in various domains such as finance, education, biology, and engineering can be modeled as time series. Although many algorithms and methods such as the Kalman filter, hidden Markov model, and long short term memory (LSTM) are proposed to make inferences and predictions for the data, their usage significantly depends on the application, type of the problem, available data, and sufficient accuracy or loss. In this paper, we compare the supervised and unsupervised hidden Markov model to LSTM in terms of the amount of data needed for training, complexity, and forecasting accuracy. Moreover, we propose various techniques to discretize the observations and convert the problem to a discrete hidden Markov model under stationary and non-stationary situations. Our results indicate that even an unsupervised hidden Markov model can outperform LSTM when a massive amount of labeled data is not available. Furthermore, we show that the hidden Markov model can still be an effective method to process the sequence data even when the first-order Markov assumption is not satisfied.

翻訳日:2022-11-01 04:09:14 公開日:2020-08-09
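
To make the discrete hidden Markov model side of the comparison above concrete, here is a minimal sketch of two building blocks: equal-width binning of real-valued observations, and the forward algorithm that computes the likelihood of a discrete observation sequence. Equal-width binning is only one possible discretization and is an assumption of this example; the abstract does not say which techniques the authors propose. All probabilities below are made-up toy values.

```python
def discretize(values, n_bins):
    """Map real values to bin indices 0..n_bins-1 using equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # avoid division by zero for constant input
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def forward_likelihood(initial, transition, emission, observations):
    """P(observations) under a discrete HMM, computed by the forward algorithm.

    initial[s], transition[prev][s], and emission[s][symbol] are probabilities.
    """
    states = range(len(initial))
    alpha = [initial[s] * emission[s][observations[0]] for s in states]
    for symbol in observations[1:]:
        alpha = [
            emission[s][symbol] * sum(alpha[prev] * transition[prev][s] for prev in states)
            for s in states
        ]
    return sum(alpha)


# Toy two-state, two-symbol model.
initial = [0.6, 0.4]
transition = [[0.7, 0.3], [0.4, 0.6]]
emission = [[0.5, 0.5], [0.1, 0.9]]
print(forward_likelihood(initial, transition, emission, discretize([0.0, 1.0], 2)))
```

For long sequences the products underflow, so practical implementations rescale `alpha` at each step or work in log space.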

# LRSpeech: 極低リソース音声合成と認識

LRSpeech: Extremely Low-Resource Speech Synthesis and Recognition ( http://arxiv.org/abs/2008.03687v1 )

ライセンス: Link先を確認

Jin Xu, Xu Tan, Yi Ren, Tao Qin, Jian Li, Sheng Zhao, Tie-Yan Liu

(参考訳) 音声合成(text to speech, TTS)と音声認識(automatic speech recognition, ASR)は重要な音声課題であり、モデル学習のために大量のテキストと音声ペアを必要とする。しかし、世界には6,000以上の言語があり、ほとんどの言語は音声訓練データが不足しているため、極低リソース言語向けにTTSやASRシステムを構築する際には大きな課題が生じる。本稿では,データコストの低いレア言語をサポート可能な,極低リソース環境下でのTTS/ASRシステムであるLRSpeechを開発する。 LRSpeechは3つの重要な技術から構成される。 1)リッチリソース言語の事前学習と低リソース言語の微調整。 2) TTS と ASR の二重変換により,相互の精度を反復的に向上させる。 3)TTSモデルを高品質な目標話者音声でカスタマイズし,複数話者の音声に対するASRモデルを改善するための知識蒸留。実験言語(英語)と真の低リソース言語(リトアニア語)で実験を行い,LRSpeechの有効性を検証する。実験の結果,LRSpeechは 1) 合成音声の明瞭度(98%以上)と自然性(平均オピニオン評点(MOS)3.5超)の両方においてTTSの高品質を実現し,産業展開の要件を満たす。 2)ASRの有望な認識精度を達成し、 3) 最後に、極めて少量のトレーニングデータしか使用しない。また,LRSpeechをさまざまな量のデータ資源で包括的に分析し,産業展開のための貴重な洞察とガイダンスを提供する。現在、より多くの稀な言語でTTSをサポートするために、商用のクラウド音声サービスにLRSpeechをデプロイしています。

Speech synthesis (text to speech, TTS) and recognition (automatic speech recognition, ASR) are important speech tasks and require a large amount of text and speech pairs for model training. However, there are more than 6,000 languages in the world, and most languages lack speech training data, which poses significant challenges when building TTS and ASR systems for extremely low-resource languages. In this paper, we develop LRSpeech, a TTS and ASR system under the extremely low-resource setting, which can support rare languages with low data cost. LRSpeech consists of three key techniques: 1) pre-training on rich-resource languages and fine-tuning on low-resource languages; 2) dual transformation between TTS and ASR to iteratively boost the accuracy of each other; 3) knowledge distillation to customize the TTS model on a high-quality target-speaker voice and improve the ASR model on multiple voices. We conduct experiments on an experimental language (English) and a truly low-resource language (Lithuanian) to verify the effectiveness of LRSpeech. Experimental results show that LRSpeech 1) achieves high quality for TTS in terms of both intelligibility (more than 98% intelligibility rate) and naturalness (above 3.5 mean opinion score (MOS)) of the synthesized speech, which satisfies the requirements for industrial deployment, 2) achieves promising recognition accuracy for ASR, and 3) last but not least, uses extremely low-resource training data. We also conduct comprehensive analyses of LRSpeech with different amounts of data resources and provide valuable insights and guidance for industrial deployment. We are currently deploying LRSpeech into a commercialized cloud speech service to support TTS for more rare languages.

翻訳日:2022-11-01 04:08:26 公開日:2020-08-09

# SpeedySpeech: 効率的なニューラル音声合成

SpeedySpeech: Efficient Neural Speech Synthesis ( http://arxiv.org/abs/2008.03802v1 )

ライセンス: Link先を確認

Jan Vainer, Ond\v{r}ej Du\v{s}ek

(参考訳) 最近のニューラルシーケンス・ツー・シーケンスモデルでは音声合成の質が大幅に改善されているが、高速な訓練、高速推論、高品質な音声合成を同時に行うシステムはない。本稿では,計算資源の要求が低く,学習時間も速い,高品質かつ実時間より高速なスペクトログラム合成が可能な学生-教師ネットワークを提案する。高品質な音声を生成するには自己注意層は必要ないことを示す。学生ネットワークと教師ネットワークの両方で残差接続を持つ単純な畳み込みブロックを活用し,教師モデルにおいて1つの注意層のみを使用する。 MelGANボコーダと組み合わせたモデルでは,声質はTacotron 2より有意に高く評価された。我々のモデルは1つのGPUで効率的にトレーニングでき、CPUでもリアルタイムで実行できる。ソースコードとオーディオサンプルの両方をGitHubリポジトリで提供しています。

While recent neural sequence-to-sequence models have greatly improved the quality of speech synthesis, there has not been a system capable of fast training, fast inference and high-quality audio synthesis at the same time. We propose a student-teacher network capable of high-quality faster-than-real-time spectrogram synthesis, with low requirements on computational resources and fast training time. We show that self-attention layers are not necessary for generation of high quality audio. We utilize simple convolutional blocks with residual connections in both student and teacher networks and use only a single attention layer in the teacher model. Coupled with a MelGAN vocoder, our model's voice quality was rated significantly higher than Tacotron 2. Our model can be efficiently trained on a single GPU and can run in real time even on a CPU. We provide both our source code and audio samples in our GitHub repository.

翻訳日:2022-11-01 04:07:58 公開日:2020-08-09

# リスク感性マルコフ決定過程における平均と変動の組合せ

Risk-Sensitive Markov Decision Processes with Combined Metrics of Mean and Variance ( http://arxiv.org/abs/2008.03707v1 )

ライセンス: Link先を確認

Li Xia

(参考訳) 本稿では,報酬の平均と分散を考慮した長期平均指標を用いた無限段階離散時間マルコフ決定過程(mdp)の最適化問題について検討する。平均は平均リターンを示し、分散はリスクまたは公正を示すので、このようなパフォーマンス指標は重要である。しかし、分散計量はすべての段階で報酬を結合し、伝統的な動的プログラミングは時間一貫性の原則が失敗するため適用できない。我々はこの問題を感度に基づく最適化理論と呼ばれる新しい視点から研究する。性能差公式が導出され、2つの異なるポリシーの下でmdpの平均分散結合指標の差を定量化することができる。差分公式は、厳密に平均分散性能が向上した新しいポリシーを生成するのに利用できる。最適政策の必要条件と決定論的政策の最適性が導出される。さらにポリシー反復の形で反復的アルゴリズムを開発し、混合およびランダム化されたポリシー空間において局所最適に収束することが証明された。特に、平均報酬がポリシーで一定であれば、アルゴリズムはグローバル最適に収束することが保証される。最後に,エネルギー貯蔵システムにおける風力発電のゆらぎ低減に関する研究に本手法を適用し,最適化手法の適用可能性を示す。

This paper investigates the optimization problem of an infinite-stage discrete-time Markov decision process (MDP) with a long-run average metric that considers both the mean and the variance of rewards. Such a performance metric is important since the mean indicates average returns and the variance indicates risk or fairness. However, because the variance metric couples the rewards at all stages, traditional dynamic programming is inapplicable, as the principle of time consistency fails. We study this problem from a new perspective called the sensitivity-based optimization theory. A performance difference formula is derived that quantifies the difference of the mean-variance combined metrics of MDPs under any two different policies. The difference formula can be utilized to generate new policies with strictly improved mean-variance performance. A necessary condition of the optimal policy and the optimality of deterministic policies are derived. We further develop an iterative algorithm in the form of policy iteration, which is proved to converge to local optima in both the mixed and randomized policy spaces. In particular, when the mean reward is constant across policies, the algorithm is guaranteed to converge to the global optimum. Finally, we apply our approach to study the fluctuation reduction of wind power in an energy storage system, which demonstrates the potential applicability of our optimization method.
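The kind of metric discussed in the abstract above can be illustrated with a minimal sketch. It assumes the combined metric has the form mean minus `beta` times variance, with both quantities taken under the stationary distribution of the Markov chain induced by a fixed policy. The form of the metric, the weight `beta`, and the two toy chains are illustrative assumptions and are not taken from the paper.

```python
def stationary_distribution(P, iters=1000):
    """Approximate the stationary distribution of a row-stochastic matrix P
    by repeatedly applying pi <- pi P (power iteration)."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


def mean_variance_metric(P, r, beta):
    """Return (mean, variance, mean - beta * variance) of the per-stage reward
    under the stationary distribution of the chain P with state rewards r."""
    pi = stationary_distribution(P)
    mean = sum(p * x for p, x in zip(pi, r))
    # The variance term depends on the mean, which itself depends on the
    # whole policy; this is the coupling that defeats stage-wise reasoning.
    var = sum(p * (x - mean) ** 2 for p, x in zip(pi, r))
    return mean, var, mean - beta * var


# Hypothetical chains and rewards induced by two policies on a 2-state MDP.
P_a, r_a = [[0.9, 0.1], [0.5, 0.5]], [1.0, 3.0]
P_b, r_b = [[0.5, 0.5], [0.5, 0.5]], [0.0, 4.0]

mean_a, var_a, score_a = mean_variance_metric(P_a, r_a, beta=1.0)
mean_b, var_b, score_b = mean_variance_metric(P_b, r_b, beta=1.0)
print(f"policy a: mean={mean_a:.3f} var={var_a:.3f} score={score_a:.3f}")
print(f"policy b: mean={mean_b:.3f} var={var_b:.3f} score={score_b:.3f}")
```

In this toy setting policy b has the higher mean reward, but policy a scores better once the variance penalty is applied, which is the trade-off a mean-variance metric is meant to capture.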

翻訳日:2022-11-01 04:07:45 公開日:2020-08-09

# 拘束多様体のニューラルマニピュレーション計画

Neural Manipulation Planning on Constraint Manifolds ( http://arxiv.org/abs/2008.03787v1 )

ライセンス: Link先を確認

Ahmed H. Qureshi, Jiangeng Dong, Austin Choe, and Michael C. Yip

(参考訳) タスク制約の存在は、モーションプランニングに重大な課題を課す。最近の進歩にもかかわらず、既存のアルゴリズムはほとんどの計画問題に対して計算コストがかかる。本稿では,マルチモーダルキネマティック制約に対する最初のニューラルプランナーであるConstrained Motion Planning Networks (CoMPNet)を提案する。我々のアプローチは以下の構成要素からなる。一制約及び環境認識エンコーダ二制約多様体の近傍における構成を出力するニューラルロボット構成生成装置、及び三実現可能なロボットの運動軌跡を作成するために生成した構成を取り込む双方向計画アルゴリズムコンプネットは制約のない問題と制約付き問題の両方を含む実用的なモーションプランニングタスクを解決している。さらに、トレーニング中に見えないような、高い成功率の環境において、オブジェクトの新しい見えない場所に一般化する。最先端の制約付き動作計画アルゴリズムと比較して、CoMPNetは計算速度の桁違いの改善により性能が著しく低下する。

The presence of task constraints poses a significant challenge to motion planning. Despite all recent advancements, existing algorithms are still computationally expensive for most planning problems. In this paper, we present Constrained Motion Planning Networks (CoMPNet), the first neural planner for multimodal kinematic constraints. Our approach comprises the following components: i) constraint and environment perception encoders; ii) a neural robot configuration generator that outputs configurations on/near the constraint manifold(s); and iii) a bidirectional planning algorithm that takes the generated configurations to create a feasible robot motion trajectory. We show that CoMPNet solves practical motion planning tasks involving both unconstrained and constrained problems. Furthermore, it generalizes with high success rates to new locations of the objects in the given environments, i.e., locations not seen during training. Compared to state-of-the-art constrained motion planning algorithms, CoMPNet is faster by an order of magnitude in computational speed, with significantly lower variance.

翻訳日:2022-11-01 04:07:27 公開日:2020-08-09

# 決定点プロセスのテスト

Testing Determinantal Point Processes ( http://arxiv.org/abs/2008.03650v1 )

ライセンス: Link先を確認

Khashayar Gatmiry (1), Maryam Aliakbarpour (1), Stefanie Jegelka (1) ((1) Massachusetts Institute of Technology)

(参考訳) 決定点過程(DPP)は多様性の確率論的モデルとして人気がある。本稿では,分布特性テストという新たな視点から,dppsについて検討する。基底集合の部分集合上の未知分布へのサンプルアクセス$q$を仮定すると、$q$が DPP 分布であるか、$\epsilon$-far が $\ell_1$-distance のすべての DPP 分布と区別することを目指している。本研究では, DPP テストのための最初のアルゴリズムを提案する。さらに, DPP テストのサンプルの複雑さに一致した低い境界を確立する。この下限はまた、より一般的なlog-submodular分布のクラスをテストする問題に対する新たなハードネス結果の提示にも拡張される。

Determinantal point processes (DPPs) are popular probabilistic models of diversity. In this paper, we investigate DPPs from a new perspective: property testing of distributions. Given sample access to an unknown distribution $q$ over the subsets of a ground set, we aim to distinguish whether $q$ is a DPP distribution, or $\epsilon$-far from all DPP distributions in $\ell_1$-distance. In this work, we propose the first algorithm for testing DPPs. Furthermore, we establish a matching lower bound on the sample complexity of DPP testing. This lower bound also extends to showing a new hardness result for the problem of testing the more general class of log-submodular distributions.
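To make the object being tested concrete, the following sketch enumerates a small DPP in its L-ensemble form, where a subset $S$ has probability $\det(L_S)/\det(L+I)$, and measures its $\ell_1$-distance to another distribution. The kernel `L` and the uniform comparison distribution are hypothetical examples; this is not the testing algorithm from the paper, only an illustration of the distributions and the distance it refers to.

```python
from itertools import combinations


def det(M):
    """Determinant by Laplace expansion; adequate for tiny matrices."""
    n = len(M)
    if n == 0:
        return 1.0  # determinant of the empty matrix, used for the empty set
    if n == 1:
        return M[0][0]
    total = 0.0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total


def dpp_probability(L, subset):
    """P(S) = det(L_S) / det(L + I) for an L-ensemble DPP."""
    n = len(L)
    L_S = [[L[i][j] for j in subset] for i in subset]
    L_plus_I = [[L[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
                for i in range(n)]
    return det(L_S) / det(L_plus_I)


def l1_distance(p, q):
    """l1 distance between two distributions given as dicts over subsets."""
    return sum(abs(p[s] - q[s]) for s in p)


# Hypothetical kernel: items 0 and 1 are similar, item 2 is unrelated.
L = [
    [1.0, 0.9, 0.0],
    [0.9, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

subsets = [s for k in range(len(L) + 1)
           for s in combinations(range(len(L)), k)]
probs = {s: dpp_probability(L, s) for s in subsets}
uniform = {s: 1.0 / len(subsets) for s in subsets}

print("total probability:", sum(probs.values()))
print("P({0,1}) =", probs[(0, 1)], " P({0,2}) =", probs[(0, 2)])
print("l1 distance to uniform:", l1_distance(probs, uniform))
```

The similar pair `{0, 1}` receives much less probability than the dissimilar pair `{0, 2}`, which is the diversity-promoting behaviour that makes DPPs a distinctive class to test for.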

翻訳日:2022-11-01 04:06:59 公開日:2020-08-09

# 量子深層学習におけるグローバル最適探索

Global Optimum Search in Quantum Deep Learning ( http://arxiv.org/abs/2008.03655v1 )

ライセンス: Link先を確認

Lanston Hau Man Chu, Tejas Bhojraj, Rui Huang

(参考訳) 本稿では,量子回路を用いた機械学習最適化問題を解くことを目的とする。 2つの目的関数のグローバル最小/最大値を求めるために, 平均的アプローチと部分スワップテストカットオフ法(pstc)という2つの手法を提案した。現在のコストは$o(\sqrt{|\theta|} n)$であるが、チェックプロセスの強化によってさらに$o(\sqrt{|\theta|} \cdot sublinear \ n)$に改善される可能性がある。

This paper aims to solve machine learning optimization problems using quantum circuits. Two approaches, namely the average approach and the Partial Swap Test Cut-off method (PSTC), are proposed to search for the global minimum/maximum of two different objective functions. The current cost is $O(\sqrt{|\Theta|} N)$, but there is potential to improve PSTC further to $O(\sqrt{|\Theta|} \cdot \mathrm{sublinear}(N))$ by enhancing the checking process.
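The abstract above names a Partial Swap Test Cut-off method without describing it. As background only, the sketch below classically computes the outcome probability of the standard swap test, in which the ancilla is measured as 0 with probability $\tfrac{1}{2} + \tfrac{1}{2}|\langle a|b\rangle|^2$. It is not the paper's PSTC procedure, and the function names are illustrative.

```python
import math


def normalize(v):
    """Scale a state vector to unit norm."""
    norm = math.sqrt(sum(abs(x) ** 2 for x in v))
    return [x / norm for x in v]


def inner_product(a, b):
    """Complex inner product <a|b> (conjugate-linear in the first argument)."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def swap_test_p0(a, b):
    """Probability that the ancilla of a standard swap test reads 0."""
    overlap = abs(inner_product(normalize(a), normalize(b))) ** 2
    return 0.5 + 0.5 * overlap


print(swap_test_p0([1, 0], [1, 0]))   # identical states
print(swap_test_p0([1, 0], [0, 1]))   # orthogonal states
print(swap_test_p0([1, 0], [1, 1j]))  # partial overlap
```

Identical states give probability 1 and orthogonal states give 1/2, so repeated runs of the test estimate how close two states are.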

翻訳日:2022-11-01 04:06:46 公開日:2020-08-09

# 進化的深層学習における光と影:分類学、批判的方法論分析、学習事例、学習教訓、勧告と課題

Lights and Shadows in Evolutionary Deep Learning: Taxonomy, Critical Methodological Analysis, Cases of Study, Learned Lessons, Recommendations and Challenges ( http://arxiv.org/abs/2008.03620v1 )

ライセンス: Link先を確認

Aritz D. Martinez, Javier Del Ser, Esther Villar-Rodriguez, Eneko Osaba, Javier Poyatos, Siham Tabik, Daniel Molina, Francisco Herrera

(参考訳) バイオインスパイアされた最適化アルゴリズムとディープラーニングモデルの融合については、ネットワークトポロジの発見や、与えられたタスクのパフォーマンスを向上したハイパーパラメトリック構成、勾配に基づく解法に代わるモデルパラメータの最適化など、いくつかの目的で多くのことが述べられている。実際、文学は、これらのタスクに自然にインスパイアされたアプローチの適用を示す提案に富んでいる。この研究では、これまでの3つの軸に基づく貢献を包括的にレビューし、批判的に検討しています。 a) 歴史的視点,深層学習における最適化問題の定義,文献の深い分析に関連する分類を含む,最適化と分類法(なぜ?) b) 批判的方法論分析(ハウ?)は、2つのケーススタディとともに、文献の分析の後に、学習した教訓と良い実践に対する勧告に対処することができる。 c) 課題と研究の新たな方向性(何ができるか、何のためにできるか) まとめると、3つの軸(最適化と分類、批判的分析、挑戦)は、融合研究の領域におけるエキサイティングな未来を創り出す2つの技術の統合の完全なビジョンを概観している。

Much has been said about the fusion of bio-inspired optimization algorithms and Deep Learning models for several purposes: from the discovery of network topologies and hyper-parametric configurations with improved performance for a given task, to the optimization of the model's parameters as a replacement for gradient-based solvers. Indeed, the literature is rich in proposals showcasing the application of assorted nature-inspired approaches for these tasks. In this work we comprehensively review and critically examine contributions made so far based on three axes, each addressing a fundamental question in this research avenue: a) optimization and taxonomy (Why?), including a historical perspective, definitions of optimization problems in Deep Learning, and a taxonomy associated with an in-depth analysis of the literature, b) critical methodological analysis (How?), which together with two case studies, allows us to address learned lessons and recommendations for good practices following the analysis of the literature, and c) challenges and new directions of research (What can be done, and what for?). In summary, the three axes (optimization and taxonomy, critical analysis, and challenges) outline a complete vision of the merger of two technologies, drawing up an exciting future for this area of fusion research.

翻訳日:2022-11-01 03:59:38 公開日:2020-08-09

# C1 at SemEval-2020 Task 9: SentiMix: Sentiment Analysis for Code-Mixed Social Media Text using Feature Engineering (英語)

C1 at SemEval-2020 Task 9: SentiMix: Sentiment Analysis for Code-Mixed Social Media Text using Feature Engineering ( http://arxiv.org/abs/2008.13549v1 )

ライセンス: Link先を確認

Laksh Advani and Clement Lu and Suraj Maharjan

(参考訳) 今日の相互接続された多言語の世界では、ソーシャルメディア上での言語コード混合が一般的である。感情分析のような多くの自然言語処理(nlp)タスクは成熟しており、モノリンガルテキスト用によく設計されているが、これらのタスクをコード混合テキストに適用するための技術はまだ探索を必要とする。本稿では,SemEval-2020 Task 9: SentiMixのコード混合ソーシャルメディアテキストにおける感情分析における特徴工学的アプローチについて述べる。我々は,「肯定的」「否定的」感情と「中立的」感情の曖昧さを解消できる分類器を設計するために,手書きの語彙的,感情的,メタデータ的特徴を駆使してこの問題に取り組む。このモデルでは, "hinglish" タスクで 0.65 の重み付き f1 スコアと "spanglish" タスクで 0.63 のスコアを得ることができた。

In today's interconnected and multilingual world, code-mixing of languages on social media is a common occurrence. While many Natural Language Processing (NLP) tasks like sentiment analysis are mature and well designed for monolingual text, techniques to apply these tasks to code-mixed text still warrant exploration. This paper describes our feature engineering approach to sentiment analysis in code-mixed social media text for SemEval-2020 Task 9: SentiMix. We tackle this problem by leveraging a set of hand-engineered lexical, sentiment, and metadata features to design a classifier that can disambiguate between "positive", "negative" and "neutral" sentiment. With this model, we are able to obtain a weighted F1 score of 0.65 for the "Hinglish" task and 0.63 for the "Spanglish" task.
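The weighted F1 score reported above averages the per-class F1 scores, weighting each class by its number of true examples. A minimal sketch of that computation follows; the labels and predictions are made up for illustration and have no connection to the SentiMix data.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-class F1 scores."""
    support = Counter(y_true)
    pairs = list(zip(y_true, y_pred))
    total = 0.0
    for label in sorted(support):
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        total += support[label] * f1
    return total / len(y_true)


# Made-up three-class example.
y_true = ["positive", "positive", "negative", "neutral", "neutral", "neutral"]
y_pred = ["positive", "neutral", "negative", "neutral", "neutral", "positive"]

score = weighted_f1(y_true, y_pred)
print(f"weighted F1 = {score:.4f}")
```

Because classes are weighted by support, a frequent class such as "neutral" in this example contributes more to the final score than a rare one.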

翻訳日:2022-11-01 03:58:53 公開日:2020-08-09

# 概念ドリフト検出:ファジィ距離推定による欠落値の扱い

Concept Drift Detection: Dealing with Missing Values via Fuzzy Distance Estimations ( http://arxiv.org/abs/2008.03662v1 )

ライセンス: Link先を確認

Anjin Liu, Jie Lu, Guangquan Zhang

(参考訳) データストリームでは、到着した観測の異なる時点におけるデータ分布が変化する可能性がある - 概念ドリフトと呼ばれる現象だ。概念の漂流を検出することは比較的成熟した研究分野であるが、観測結果から得られた不確実性に対する解決法は孤立して研究されている。これらのソリューションがドリフト検出性能にどのように影響するかはまだ検討されていない。しかし、データ計算手法はデータを減らすのではなく、実際にデータの不確実性を増大させる可能性があると考えている。また,ドリフト検出時に分布変化を推定するプロセスにバイアスを生じさせる可能性があり,学習モデルの学習が困難になる可能性がある。本研究の目的は, 観測値の欠落を推定するよりも, 観測距離を推定することに集中し, 推定誤差に応じて観測値をヒストグラムビンに割り当てるメンバシップ関数を定義することである。本手法は,観測における各欠落値の反復推定による累積誤差を低減するための新しいマスク付き距離学習 (MDL) アルゴリズムと,データ分布の相違点を同定するためのファジィ重み付き周波数 (FWF) 法を備える。本論文で提案するコンセプトドリフト検出アルゴリズムは,不足値を扱うことができる特異かつ統一的なアルゴリズムであるが,概念ドリフト検出アルゴリズムと組み合わせた計算アルゴリズムではない。合成と実世界の両方のデータセットの実験は、この手法の利点を示し、欠落した値を持つデータのドリフトの検出における頑健さを示している。これらの結果から, 欠損値がコンセプトドリフト検出に多大な影響を及ぼすことが明らかとなったが, ファジィ・セット理論をモデル観測に用いると, 計算よりも信頼性の高い結果が得られることがわかった。

In data streams, the data distribution of arriving observations at different time points may change, a phenomenon called concept drift. While detecting concept drift is a relatively mature area of study, solutions to the uncertainty introduced by observations with missing values have only been studied in isolation. No one has yet explored whether or how these solutions might impact drift detection performance. We, however, believe that data imputation methods may actually increase uncertainty in the data rather than reducing it. We also conjecture that imputation can introduce bias into the process of estimating distribution changes during drift detection, which can make it more difficult to train a learning model. Our idea is to focus on estimating the distance between observations rather than estimating the missing values, and to define membership functions that allocate observations to histogram bins according to the estimation errors. Our solution comprises a novel masked distance learning (MDL) algorithm to reduce the cumulative errors caused by iteratively estimating each missing value in an observation and a fuzzy-weighted frequency (FWF) method for identifying discrepancies in the data distribution. The concept drift detection algorithm proposed in this paper is a single, unified algorithm that handles missing values, rather than an imputation algorithm combined with a concept drift detection algorithm. Experiments on both synthetic and real-world data sets demonstrate the advantages of this method and show its robustness in detecting drift in data with missing values. These findings reveal that missing values exert a profound impact on concept drift detection, but using fuzzy set theory to model observations can produce more reliable results than imputation.

翻訳日:2022-11-01 03:58:17 公開日:2020-08-09

# ニューラルネットワークの記憶と理由:影響推定によるロングテールの発見

What Neural Networks Memorize and Why: Discovering the Long Tail via Influence Estimation ( http://arxiv.org/abs/2008.03703v1 )

ライセンス: Link先を確認

Vitaly Feldman and Chiyuan Zhang

(参考訳) ディープラーニングアルゴリズムは、トレーニングデータに非常に適しており、異常値や誤ったラベル付きデータポイントにも適していることがよく知られている。このような適合性は、重要な研究関心を惹きつけたが、今のところ説得力のある説明は与えられていない現象である、データラベルの訓練を暗記する必要がある。 Feldman (2019) の最近の研究は、2つの洞察の組み合わせに基づく理論的な説明を提唱している。まず、自然画像とデータ分布は(形式的には)長い尾を持つことが知られており、稀で非定型的な例のかなりの割合を持つ。第二に、単純な理論モデルでは、データ分布が長い場合の至近汎化誤差を達成するためには、このような記憶化が必要である。しかし、この説明の直接的な実証的証拠や、そのような証拠を得るためのアプローチは与えられなかった。この研究では、この理論の重要なアイデアをテストする実験をデザインします。実験では、各トレーニング例が各テスト例の精度およびトレーニング例の記憶値に与える影響を推定する必要がある。これらの量を直接推定することは計算的に禁止されるが、密接な関係にある部分サンプリングの影響や記憶値をより効率的に推定できることを示す。私たちの実験は、いくつかの標準ベンチマークにおける一般化のための記憶の大幅な利点を示しています。また、この理論の定量的かつ視覚的に説得力のある証拠も提示している(Feldman, 2019)。

Deep learning algorithms are well known to have a propensity for fitting the training data very well, and they often fit even outliers and mislabeled data points. Such fitting requires memorization of training data labels, a phenomenon that has attracted significant research interest but has not been given a compelling explanation so far. A recent work of Feldman (2019) proposes a theoretical explanation for this phenomenon based on a combination of two insights. First, natural image and data distributions are (informally) known to be long-tailed, that is, they have a significant fraction of rare and atypical examples. Second, in a simple theoretical model such memorization is necessary for achieving close-to-optimal generalization error when the data distribution is long-tailed. However, no direct empirical evidence for this explanation, or even an approach for obtaining such evidence, was given. In this work we design experiments to test the key ideas in this theory. The experiments require estimation of the influence of each training example on the accuracy at each test example as well as memorization values of training examples. Estimating these quantities directly is computationally prohibitive, but we show that closely-related subsampled influence and memorization values can be estimated much more efficiently. Our experiments demonstrate the significant benefits of memorization for generalization on several standard benchmarks. They also provide quantitative and visually compelling evidence for the theory put forth in (Feldman, 2019).
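The subsampling idea in the abstract above can be sketched on a toy problem. Here the memorization value of a training example is estimated as its accuracy under models trained on random subsets that contain it, minus its accuracy under models trained on subsets that leave it out. The 1-nearest-neighbour learner, the 1-D data, and all parameter values are stand-in assumptions chosen to keep the sketch self-contained; the estimator used in the paper may differ in its details.

```python
import random


def predict_1nn(train, x):
    """Predict the label of x with a 1-nearest-neighbour rule on 1-D points."""
    nearest = min(train, key=lambda point: abs(point[0] - x))
    return nearest[1]


def estimate_memorization(data, trials=400, keep_prob=0.7, seed=0):
    """Subsampled estimate: accuracy on example i when i is in the training
    subset minus accuracy on i when it is left out."""
    rng = random.Random(seed)
    n = len(data)
    hits_in, count_in = [0] * n, [0] * n
    hits_out, count_out = [0] * n, [0] * n
    for _ in range(trials):
        included = [rng.random() < keep_prob for _ in range(n)]
        train = [data[i] for i in range(n) if included[i]]
        if not train:
            continue  # nothing to train on in this trial
        for i, (x, y) in enumerate(data):
            correct = int(predict_1nn(train, x) == y)
            if included[i]:
                hits_in[i] += correct
                count_in[i] += 1
            else:
                hits_out[i] += correct
                count_out[i] += 1
    return [
        hits_in[i] / max(count_in[i], 1) - hits_out[i] / max(count_out[i], 1)
        for i in range(n)
    ]


# Two well-separated classes plus one atypical example (index 10) whose
# label disagrees with its neighbourhood.
data = [(i / 10, 0) for i in range(5)]
data += [(1 + i / 10, 1) for i in range(5)]
data.append((0.25, 1))

mem = estimate_memorization(data)
print("atypical example:", round(mem[10], 3))
print("typical example :", round(mem[7], 3))
```

The atypical example is only classified correctly when it is itself in the training subset, so its estimate is close to 1, while a typical example deep inside its class scores near 0. With this toy learner the immediate neighbours of the atypical point also receive elevated estimates, because a 1-nearest-neighbour rule lets that point override their labels whenever they are left out.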

翻訳日:2022-11-01 03:57:45 公開日:2020-08-09

# 干渉生成対向ネットワーク

Intervention Generative Adversarial Networks ( http://arxiv.org/abs/2008.03712v1 )

ライセンス: Link先を確認

Jiadong Liang, Liangyu Zhang, Cheng Zhang and Zhihua Zhang

(参考訳) 本稿では,生成型逆ネットワークの学習過程を安定化し,モード崩壊問題を緩和するための新しい手法を提案する。主なアイデアは、目的に介入損失と呼ぶ正規化用語を導入することです。得られた生成モデルを、IVGAN(Intervention Generative Adversarial Networks)と呼ぶ。ガウス不変干渉による補助エンコーダネットワークから得られた実画像の潜伏表現を摂動させ、生成した画像の分布の相違を罰することにより、干渉損失は生成元に対してより有益な勾配を与え、GANのトレーニング安定性を著しく向上させる。本研究では,本手法の有効性と有効性を示すため,標準実世界データセットとスタック型mnistデータセットの徹底的な評価を行った。

In this paper we propose a novel approach for stabilizing the training process of Generative Adversarial Networks as well as alleviating the mode collapse problem. The main idea is to introduce a regularization term that we call intervention loss into the objective. We refer to the resulting generative model as Intervention Generative Adversarial Networks (IVGAN). By perturbing the latent representations of real images obtained from an auxiliary encoder network with Gaussian invariant interventions and penalizing the dissimilarity of the distributions of the resulting generated images, the intervention loss provides more informative gradient for the generator, significantly improving GAN's training stability. We demonstrate the effectiveness and efficiency of our methods via solid theoretical analysis and thorough evaluation on standard real-world datasets as well as the stacked MNIST dataset.

翻訳日:2022-11-01 03:57:27 公開日:2020-08-09

PDF登録状況（公開日: 20200809）