Fugu-MT: arXiv Paper Translations

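This site provides Japanese translations of arXiv papers that are 30 pages or fewer and released under a Creative Commons license (CC 0, CC BY, CC BY-SA). For papers whose body text is not CC-licensed, or that are too long, only the metadata is translated. (arXiv metadata is CC 0.) The translations are licensed under CC BY-SA 4.0. Translation is done with Fugu-Machine Translator.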

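The operator of this site does not guarantee the quality of this site (including all information and translations) and accepts no responsibility for any consequences arising from the use of this site (including all information and translations).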

Papers with a publication date of 2021-06-23.

Title	Authors	Abstract	Publication date / Translation date
# QuNetSim: A Software Framework for Quantum Networks ( http://arxiv.org/abs/2003.06397v5 ) License: see link	Stephen DiAdamo, Janis Nötzel, Benjamin Zanger, Mehmet Mert Beşe	As quantum internet technologies develop, the need for simulation software and education for quantum internet rises. QuNetSim aims to fill this need. QuNetSim is a Python software framework that can be used to simulate quantum networks up to the network layer. The goal of QuNetSim is to make it easier to investigate and test quantum networking protocols over various quantum network configurations and parameters. The framework incorporates many known quantum network protocols so that users can quickly build simulations and beginners can easily learn to implement their own quantum networking protocols.	Translation date: 2023-05-29 06:13:38 Publication date: 2021-06-23
# Quantum Rayleigh problem and thermocoherent Onsager relations ( http://arxiv.org/abs/2006.03186v4 ) License: see link	Onur Pusuluk and Özgür E. Müstecaplıoğlu	The role of quantum coherence and correlations in heat flow and equilibration is investigated by exploring Rayleigh's dynamical problem of equilibration in the quantum regime and following Onsager's approach to thermoelectricity. Specifically, we consider a qubit bombarded by two-qubit projectiles from a side. For arbitrary collision times and initial states, we develop the master equation for sequential and collective collisions. By deriving the Fokker-Planck equation from the master equation, we identify the quantum version of Rayleigh's heat conduction equation. We find that quantum discord and entanglement shared between the projectiles can contribute to genuine heat flow only when they are associated with so-called heat-exchange coherences. Analogous to Onsager's use of Rayleigh's principle of least dissipation of energy, we use the entropy production rate to identify the coherence current. Both coherence and heat flows can be written in the form of quantum Onsager relations, from which we predict coherent Peltier and coherent Seebeck effects. The effects can be optimized by the collision times and collectivity. Finally, we discuss some of the possible experimental realizations and technological applications of the thermocoherent phenomena in different platforms.	Translation date: 2023-05-17 02:17:14 Publication date: 2021-06-23
# Sub-bosonic (deformed) ladder operators ( http://arxiv.org/abs/2009.06392v2 ) License: see link	J. Damastor Serafim, Ricardo Ximenes, and Fernando Parisio	The canonical operator $\hat{a}^{\dagger}$ ($\hat{a}$) represents the ideal process of adding (subtracting) an exact amount of energy $E$ to (from) a physical system in both elementary quantum mechanics and quantum field theory. This is a "sharp" notion in the sense that no variability around $E$ is possible at the operator level. In this work, we present a class of deformed creation and annihilation operators that originates from a rigorous notion of fuzziness. This leads to deformed, sub-bosonic commutation relations inducing a simple algebraic structure with modified eigenenergies and Fock states. In addition, we investigate possible consequences of the introduced formalism in quantum field theories, for instance deviations from linearity in the dispersion relation for free quasibosons.	Translation date: 2023-05-03 00:29:08 Publication date: 2021-06-23
# Quantum simulation of three-body interactions in weakly driven quantum systems ( http://arxiv.org/abs/2011.03399v2 ) License: see link	Francesco Petiziol, Mahdi Sameti, Stefano Carretta, Sandro Wimberger, Florian Mintert	The realization of effective Hamiltonians featuring many-body interactions beyond pairwise coupling would enable the quantum simulation of central models underpinning topological physics and quantum computation. We overcome crucial limitations of perturbative Floquet engineering and discuss the highly accurate realization of a purely three-body Hamiltonian in superconducting circuits and molecular nanomagnets.	Translation date: 2023-04-25 03:13:57 Publication date: 2021-06-23
# Lattice Quantum Electrodynamics in (3+1)-dimensions at finite density with Tensor Networks ( http://arxiv.org/abs/2011.10658v2 ) License: see link	Giuseppe Magnifico, Timo Felser, Pietro Silvi, Simone Montangero	Gauge theories are of paramount importance in our understanding of fundamental constituents of matter and their interactions. However, the complete characterization of their phase diagrams and the full understanding of non-perturbative effects are still debated, especially at finite charge density, mostly due to the sign problem affecting Monte Carlo numerical simulations. Here, we report the tensor network simulation of a three-dimensional lattice gauge theory in the Hamiltonian formulation, including dynamical matter. Using this sign-problem-free method, we simulate the ground states of compact quantum electrodynamics at zero and finite charge densities, and address fundamental questions such as the characterization of collective phases of the model, the presence of a confining phase at large gauge coupling, and the study of charge-screening effects.	Translation date: 2023-04-23 19:09:49 Publication date: 2021-06-23
# Proposal for entangling gates on fluxonium qubits via a two-photon transition ( http://arxiv.org/abs/2011.10011v2 ) License: see link	Konstantin N. Nesterov, Quentin Ficheux, Vladimir E. Manucharyan, Maxim G. Vavilov	We propose a family of microwave-activated entangling gates on two capacitively coupled fluxonium qubits. A microwave pulse applied to either qubit at a frequency near the half-frequency of the $|00\rangle - |11\rangle$ transition induces two-photon Rabi oscillations with negligible leakage outside the computational subspace, owing to the strong anharmonicity of fluxoniums. By adjusting the drive frequency, amplitude, and duration, we obtain a gate family that is locally equivalent to the fermionic-simulation gates, such as $\sqrt{\rm SWAP}$-like and controlled-phase gates. The gate error can be tuned below $10^{-4}$ for a pulse duration under 100 ns without excessive circuit parameter matching. Given that the fluxonium coherence time can exceed 1 ms, our gate scheme is promising for large-scale quantum processors.	Translation date: 2023-04-23 17:07:52 Publication date: 2021-06-23
# Sideband transitions in a two-mode Josephson circuit driven beyond the rotating wave approximation ( http://arxiv.org/abs/2011.14600v2 ) License: see link	Byoung-moo Ann, Wouter Kessels, and Gary A. Steele	Driving quantum systems periodically in time plays an essential role in the coherent control of quantum states. The rotating wave approximation (RWA) is a good approximation technique for weak and nearly resonant driving fields. However, experiments sometimes require large detuning and strong driving fields, for which the RWA may not hold. In this work, we experimentally, numerically, and analytically explore strongly driven two-mode Josephson circuits in the regime of strong driving and large detuning. Specifically, we investigate beam-splitter and two-mode squeezing interactions between the two modes induced by driving a two-photon sideband transition. Using numerical simulations, we observe that the RWA is unable to correctly capture the amplitude of the sideband transition rates. We verify this finding using an analytical model that is based on perturbative corrections. We find that the breakdown of the RWA in the regime studied does not lead to qualitatively different dynamics, but gives the same results as the RWA theory at higher drive strengths, enhancing the coupling rates compared to what one would predict. This is an interesting consequence compared to the carrier transition case, where the breakdown of the RWA results in qualitatively different time evolution of the quantum state. Our work provides insight into the behavior of time-periodically driven systems beyond the RWA. We also provide a robust theoretical framework for including these findings in the calculation and calibration of quantum protocols in circuit quantum electrodynamics.	Translation date: 2023-04-22 14:49:37 Publication date: 2021-06-23
# Quantum repeaters based on concatenated bosonic and discrete-variable quantum codes ( http://arxiv.org/abs/2011.15076v2 ) License: see link	Filip Rozpędek, Kyungjoo Noh, Qian Xu, Saikat Guha, Liang Jiang	We propose an architecture of quantum-error-correction-based quantum repeaters that combines techniques used in discrete- and continuous-variable quantum information. Specifically, we propose to encode the transmitted qubits in a concatenated code consisting of two levels. On the first level we use a continuous-variable GKP code encoding the qubit in a single bosonic mode. On the second level we use a small discrete-variable code. Such an architecture has two important features. Firstly, errors on each of the two levels are corrected in repeaters of two different types. This makes it possible to achieve the performance needed in practical scenarios at a reduced cost compared with an architecture in which all repeaters are the same. Secondly, the use of the continuous-variable GKP code on the lower level generates additional analog information which enhances the error-correcting capabilities of the second-level code, such that long-distance communication becomes possible with encodings consisting of only four or seven optical modes.	Translation date: 2023-04-22 14:19:30 Publication date: 2021-06-23
# Environmentally improved coherent light harvesting ( http://arxiv.org/abs/2012.11864v2 ) License: see link	Stefano Tomasi, Dominic M. Rouse, Erik M. Gauger, Brendon W. Lovett, Ivan Kassal	Coherence-enhanced light harvesting has not been directly observed experimentally, despite theoretical evidence that coherence can significantly enhance light-harvesting performance. The main experimental obstacle has been the difficulty in isolating the effect of coherence in the presence of confounding variables. Recent proposals for externally controlling coherence by manipulating the light's degree of polarization showed that coherent efficiency enhancements would be possible, but were restricted to light-harvesting systems weakly coupled to their environment. Here, we show that increases in system-bath coupling strength can amplify coherent efficiency enhancements, rather than suppress them. This result dramatically broadens the range of systems that could be used to conclusively demonstrate coherence-enhanced light harvesting or to engineer coherent effects into artificial light-harvesting devices.	Translation date: 2023-04-19 22:24:17 Publication date: 2021-06-23
# Identifying nonclassicality from experimental data using artificial neural networks ( http://arxiv.org/abs/2101.07112v2 ) License: see link	Valentin Gebhart, Martin Bohmann, Karsten Weiher, Nicola Biagi, Alessandro Zavatta, Marco Bellini, Elizabeth Agudelo	The fast and accessible verification of nonclassical resources is an indispensable step towards a broad utilization of continuous-variable quantum technologies. Here, we use machine learning methods for the identification of nonclassicality of quantum states of light by processing experimental data obtained via homodyne detection. For this purpose, we train an artificial neural network to classify classical and nonclassical states from their quadrature-measurement distributions. We demonstrate that the network is able to correctly identify classical and nonclassical features from real experimental quadrature data for different states of light. Furthermore, we show that nonclassicality of some states that were not used in the training phase is also recognized. Circumventing the requirement of the large sample sizes needed to perform homodyne tomography, our approach presents a promising alternative for the identification of nonclassicality for small sample sizes, indicating applicability for fast sorting or direct monitoring of experimental data.	Translation date: 2023-04-14 21:09:29 Publication date: 2021-06-23
# Role of thermal field in entanglement harvesting between two accelerated Unruh-DeWitt detectors ( http://arxiv.org/abs/2104.11269v2 ) License: see link	Dipankar Barman, Subhajit Barman, Bibhas Ranjan Majhi	We investigate the effects of field temperature $T^{(f)}$ on the entanglement harvesting between two uniformly accelerated detectors. For their parallel motion, the thermal nature of fields does not produce any entanglement, and therefore, the outcome is the same as the non-thermal situation. On the contrary, $T^{(f)}$ affects entanglement harvesting when the detectors are in anti-parallel motion, i.e., when detectors $A$ and $B$ are in the right and left Rindler wedges, respectively. While for $T^{(f)}=0$ entanglement harvesting is possible for all values of $A$'s acceleration $a_A$, in the presence of temperature, it is possible only within a narrow range of $a_A$. In $(1+1)$ dimensions, the range starts from specific values and extends to infinity, and as we increase $T^{(f)}$, the minimum required value of $a_A$ for entanglement harvesting increases. Moreover, above a critical value $a_A=a_c$ harvesting increases as we increase $T^{(f)}$, which is just the opposite of the behavior for accelerations below it. There are several critical values in $(1+3)$ dimensions when the detectors have different accelerations. Contrary to the single range in $(1+1)$ dimensions, here harvesting is possible within several discrete ranges of $a_A$. Interestingly, for equal accelerations, one has a single critical point, with behavior quite similar to the $(1+1)$-dimensional results. We also discuss the dependence of mutual information among these detectors on $a_A$ and $T^{(f)}$.	Translation date: 2023-04-02 19:59:08 Publication date: 2021-06-23
# Towards security recommendations for public-key infrastructures for production environments in the post-quantum era ( http://arxiv.org/abs/2105.01324v2 ) License: see link	S.E. Yunakovsky, M. Kot, N.O. Pozhar, D. Nabokov, M.A. Kudinov, A. Guglya, E.O. Kiktenko, E. Kolycheva, A. Borisov, and A.K. Fedorov	Quantum computing technologies pose a significant threat to the currently employed public-key cryptography protocols. In this paper, we discuss the impact of the quantum threat on public key infrastructures (PKIs), which are used as a part of security systems for protecting production environments. We analyze security issues of existing models with a focus on requirements for a fast transition to post-quantum solutions. Although our primary focus is on attacks with quantum computing, we also discuss some security issues that are not directly related to the cryptographic algorithms used but are essential for the overall security of the PKI. We attempt to provide a set of security recommendations regarding the PKI from the viewpoint of attacks with quantum computers.	Translation date: 2023-04-01 15:42:17 Publication date: 2021-06-23
# Butterfly velocity and chaos suppression in de Sitter space ( http://arxiv.org/abs/2105.02258v2 ) License: see link	Dmitry S. Ageev	In this note, we study the holographic CFT in the de Sitter static patch at finite temperature $T$ and chemical potential. We find that the butterfly velocity $v_B$ in such a field theory degenerates for all values of the Hubble parameter $H$ and $T$. We interpret this as a chaos disruption caused by the interplay between the expansion of chaotic correlations constrained by $v_B$ and effects caused by de Sitter curvature. The chemical potential restores a healthy butterfly velocity for some range of temperatures. We also draw an analogy between this chaos suppression and both the Schwinger effect in de Sitter space and black hole formation from shock wave collisions.	Translation date: 2023-04-01 13:06:33 Publication date: 2021-06-23
# Strong nonlocal sets of UPB ( http://arxiv.org/abs/2106.08699v2 ) License: see link	Bichen Che, Zhao Dou, Min Lei, Yixian Yang	The unextendible product bases (UPBs) are interesting members of the family of orthogonal product states. In this paper, we investigate the construction of 3-qubit UPBs of different sizes with strong nonlocality. First, a UPB set in $C^{3}\otimes C^{3}\otimes C^{3}$ of size 12 is presented based on the Shifts UPB, the structure of which is described by mapping the system to a $3\times 3\times 3$ Rubik's Cube. After observing the orthogonal graph of each qubit, we provide a general method of constructing a UPB in $C^{d}\otimes C^{d}\otimes C^{d}$ of size $(d-1)^{3}+3(d-2)+1$. Second, for the more general case where the dimensions of the qubits are different, we extend the tile structure to the 3-qubit system and propose a Tri-tile structure for 3-qubit UPBs. Then, by means of this structure, a UPB of size 30 in a $C^{4}\otimes C^{4}\otimes C^{5}$ system is obtained based on a $C^{3}\otimes C^{3}\otimes C^{4}$ system. Similarly, we generalize this approach to a $C^{d_{1}}\otimes C^{d_{2}}\otimes C^{d_{3}}$ system, which has a similar composition to $C^{d}\otimes C^{d}\otimes C^{d}$. Our research provides a positive answer to the open questions raised in [Halder, et al., PRL, 122, 040403 (2019)], indicating that there do exist multi-qubit UPBs that can exhibit strong quantum nonlocality without entanglement.	Translation date: 2023-03-26 13:18:42 Publication date: 2021-06-23
# Resolution limit in quantum imaging with undetected photons using position correlations ( http://arxiv.org/abs/2106.11358v2 ) License: see link	Balakrishnan Viswanathan, Gabriela Barreto Lemos and Mayukh Lahiri	Quantum imaging with undetected photons (QIUP) is a unique method of image acquisition where the photons illuminating the object are not detected. This method relies on quantum interference and spatial correlations between the twin photons to form an image. Here we present a detailed study of the resolution limits of position-correlation-enabled QIUP. We establish a quantitative relation between the spatial resolution and the twin-photon position correlation in the spontaneous parametric down-conversion (SPDC) process. Furthermore, we also quantitatively establish the roles that the wavelength of the undetected illumination field and the wavelength of the detected field play in the resolution. Like ghost imaging and unlike conventional imaging, the resolution limit imposed by the spatial correlation between twin photons in QIUP cannot be further improved by conventional optical techniques.	Translation date: 2023-03-25 22:54:56 Publication date: 2021-06-23
# Aharonov-Bohm effect on the generalized Duffin-Kemmer-Petiau oscillator in the Som-Raychaudhuri space-time ( http://arxiv.org/abs/2106.12192v1 ) License: see link	Yi Yang, Zheng-Wen Long, Hao Chen, Zi-Long Zhao and Chao-Yun Long	The generalized Duffin-Kemmer-Petiau (DKP) oscillator with electromagnetic interactions in curved space-time is investigated. We first introduce the generalized DKP oscillator in Som-Raychaudhuri space-time with a Cornell potential. Then, we incorporate electromagnetic interactions into the generalized DKP oscillator. The energy eigenvalues and eigenfunctions of the problem are obtained. The effects of the space-time parameters, the oscillator frequency, the Cornell potential, and the magnetic flux on the energy eigenvalues are analyzed. We find an analogue of the Aharonov-Bohm effect for the bound states in the system considered.	Translation date: 2023-03-25 18:45:41 Publication date: 2021-06-23
# Enlaces de rádio de longa distância utilizando a banda de HF ( http://arxiv.org/abs/2106.12187v1 ) License: see link	Rafael Diniz, Mylène C. Q. Farias	The interest in the use of the HF band in telecommunication has increased significantly in the last decade, mainly due to the development of new standards for military telecommunications in HF, as well as the expansion of digital broadcasting in the HF band. More specifically, these new standards allow the implementation of links of hundreds or thousands of kilometers at a low cost, which suggests that widespread adoption can occur. In Brazil, this type of communication can be used in remote regions or regions that are difficult to access, such as the Amazon rainforest region. In addition to the evolution of technologies concerning the physical layer of HF telecommunication systems, there has been considerable development of techniques that use machine learning algorithms for audio and image coding. It is believed that all these advances will enable the use of the HF band for communication services in places without telecommunication infrastructure. This work presents recent applications of HF radio for digital links in Brazil, describing the challenges involved in developing telecommunication systems in the HF band.	Translation date: 2023-03-25 18:45:32 Publication date: 2021-06-23
# Diversity and density of urban functions in station areas ( http://arxiv.org/abs/2106.12107v1 ) License: see link	Yusuke Kumakoshi, Hideki Koizumi, Yuji Yoshimura	The diversity and density of urban functions have been known to affect urban vibrancy positively, but the relation between the two has not been empirically examined; if high density is associated with low diversity in an area, its vibrancy may not increase. To obtain a better understanding of the metabolism of cities and directions for urban planning interventions, this paper offers empirical evidence on the association between the diversity and density of urban functions in the Tokyo Metropolitan Area, using a robust density index that was determined via a Monte Carlo simulation. By conducting association analyses, it was found that highly dense station areas tended to display low diversity at multiple scales. Further investigation indicated that this negative correlation was due to different spatial characteristics of functions and complementary functioning among highly accessible station areas. This paper argues for considering both diversity and density in urban planning to make station areas vibrant and resilient.	Translation date: 2023-03-25 18:44:42 Publication date: 2021-06-23
# High-Dimensional Methods for Quantum Homodyne Tomography ( http://arxiv.org/abs/2106.12353v1 ) License: see link	Nicola Mosco, Lorenzo Maccone	We provide optimized recursion relations for homodyne tomography. We improve previous methods by mitigating the divergences intrinsic in the calculation of the pattern functions used previously, and detail how to implement the data analysis through Monte Carlo simulations. Our refinements are necessary for the reconstruction of excited quantum states which populate a high-dimensional subspace of the electromagnetic field Hilbert space. We also present a Julia package for the analysis and the reconstruction method.	Translation date: 2023-03-25 18:40:36 Publication date: 2021-06-23
# 圧縮熱浴における量子パラメトリック発振器:理論的基礎問題 Quantum Parametric Oscillator Heat Engines in Squeezed Thermal Baths: Foundational Theoretical Issues ( http://arxiv.org/abs/2106.12325v1 ) ライセンス: Link先を確認	Onat Ar{\i}soy, Jen-Tsung Hsiang and Bei-Lok Hu	(参考訳) In this paper we examine some foundational issues of a class of quantum engines where the system consists of a single quantum parametric oscillator, operating in an Otto cycle consisting of 4 stages of two alternating phases: the isentropic phase is detached from any bath (thus a closed system) where the natural frequency of the oscillator is changed from one value to another, and the isothermal phase where the system (now rendered open) is put in contact with one or two squeezed baths of different temperatures, whose nonequilibrium dynamics follows the Hu-Paz-Zhang (HPZ) master equation for quantum Brownian motion. hpz方程式は密度作用素の正則性を保つ完全非マルコフ方程式であり、有効である。 a) すべての温度 b)浴槽の任意のスペクトル密度,及び c) システムと浴槽との間の任意の結合強度。これらの性質を生かして、量子オットーエンジンのこれら2つの相に対する量子オープン・スクイーズド系の理論の重要な基礎的問題について検討する。以下を含む。一非マルコフ政権の非正統で低温の浴場二非断熱周波数変調に期待するもの三強固なシステムバス結合及び強固なシステムバスカップリング四この二つの相の間の適切な接合条件ここでの目標は、より高い効率を実現する方法を示すのではなく、より広い範囲のパラメータ空間をカバーする連続変数の量子エンジンのより堅実な理論的基礎を構築することである。 In this paper we examine some foundational issues of a class of quantum engines where the system consists of a single quantum parametric oscillator, operating in an Otto cycle consisting of 4 stages of two alternating phases: the isentropic phase is detached from any bath (thus a closed system) where the natural frequency of the oscillator is changed from one value to another, and the isothermal phase where the system (now rendered open) is put in contact with one or two squeezed baths of different temperatures, whose nonequilibrium dynamics follows the Hu-Paz-Zhang (HPZ) master equation for quantum Brownian motion. 
The HPZ equation is an exact non-Markovian equation that preserves the positivity of the density operator and is valid for a) all temperatures, b) arbitrary spectral density of the bath, and c) arbitrary coupling strength between the system and the bath. Taking advantage of these properties, we examine some key foundational issues of theories of quantum open and squeezed systems for these two phases of the quantum Otto engine. These include i) the non-Markovian regimes for non-Ohmic, low-temperature baths, ii) what to expect in nonadiabatic frequency modulations, iii) strong system-bath coupling, and iv) the proper junction conditions between these two phases. Our aim here is not to present ways of attaining higher efficiency but to build a more solid theoretical foundation for quantum engines of continuous variables covering a broader range of parameter spaces, hopefully of use for exploring such possibilities.	翻訳日:2023-03-25 18:40:29 公開日:2021-06-23
# 量子脳ネットワークの展望 Quantum Brain Networks: a Perspective ( http://arxiv.org/abs/2106.12295v1 ) ライセンス: Link先を確認	E. R. Miranda, S. Venkatesh, C. Hernani-Morales, L. Lamata, J. D. Martín-Guerrero, and E. Solano	(参考訳) 我々はニューロテクノロジー、人工知能、量子コンピューティングの知識と手法を統合する新たな分野として量子脳ネットワーク(QBraiNs)を提案する。目標は、さまざまな破壊的応用のために、人間の脳と量子コンピュータの接続性を高めることである。我々は、ウェットウェアとハードウェアノードのハイブリッド古典量子ネットワークの出現を予測し、機械学習技術とブレイン・マシン・インタフェースを媒介とする。 QBraiNsは前例のない方法で芸術、科学、技術、起業家精神、特に医学、人間のインターネット、インテリジェントデバイス、感覚体験、ゲーム、物のインターネット、暗号取引、ビジネスに関連する活動を活用し、変革する。 We propose Quantum Brain Networks (QBraiNs) as a new interdisciplinary field integrating knowledge and methods from neurotechnology, artificial intelligence, and quantum computing. The objective is to develop an enhanced connectivity between the human brain and quantum computers for a variety of disruptive applications. We foresee the emergence of hybrid classical-quantum networks of wetware and hardware nodes, mediated by machine learning techniques and brain-machine interfaces. QBraiNs will harness and transform in unprecedented ways arts, science, technologies, and entrepreneurship, in particular activities related to medicine, Internet of humans, intelligent devices, sensorial experience, gaming, Internet of things, crypto trading, and business.	翻訳日:2023-03-25 18:40:08 公開日:2021-06-23
# 強相互作用原子の駆動非平衡系におけるエピデミック拡散と群免疫 Epidemic spreading and herd immunity in a driven non-equilibrium system of strongly-interacting atoms ( http://arxiv.org/abs/2106.12290v1 ) ライセンス: Link先を確認	Dong-Sheng Ding, Zong-Kai Liu, Hannes Busche, Bao-Sen Shi, Guang-Can Guo, Charles S. Adams, and Franco Nori	(参考訳) 疫病の空間的ダイナミクスを理解することがますます重要である。流行の数学的モデルが数多く存在するが、量的モデルテストを可能にする十分な制御パラメータを持つ物理システムが不足している。また、顕微鏡系における複雑なモデルのマクロ非平衡効果の再現も困難である。本研究では, 強い相互作用を持つリドバーグ原子における光学的非平衡相転移を利用した拡散拡散の物理アナログを実験的に示す。複数のレーザービームを使用することで、任意の所望の空間構造を課すことができる。サンプルの異なる部分で空間的局所化相転移とその相互作用を観察する。これらの相転移は、複数の場所での感染症の発生をシミュレートし、異なる体制下での免疫と疫病状態へのダイナミクスをシミュレートする。報告された結果は、Rydberg系は複雑な時空間力学をモデル化するのに十分な万能性を持っていることを示している。 It is increasingly important to understand the spatial dynamics of epidemics. While there are numerous mathematical models of epidemics, there is a scarcity of physical systems with sufficiently well-controlled parameters to allow quantitative model testing. It is also challenging to replicate the macro non-equilibrium effects of complex models in microscopic systems. In this work, we demonstrate experimentally a physics analog of epidemic spreading using optically driven non-equilibrium phase transitions in strongly interacting Rydberg atoms. Using multiple laser beams we can impose any desired spatial structure. We observe spatially localized phase transitions and their interplay in different parts of the sample. These phase transitions simulate the outbreak of an infectious disease in multiple locations, as well as the dynamics towards herd immunity and endemic state in different regimes. The reported results indicate that Rydberg systems are versatile enough to model complex spatial-temporal dynamics.	翻訳日:2023-03-25 18:39:57 公開日:2021-06-23
# 連続可変量子状態の数量子ビットへの普遍的ユニタリ移動 Universal unitary transfer of continuous-variable quantum states into a few qubits ( http://arxiv.org/abs/2106.12272v1 ) ライセンス: Link先を確認	Jacob Hastrup, Kimin Park, Jonatan Bohr Brask, Radim Filip and Ulrik Lund Andersen	(参考訳) 任意の連続変数量子状態を数個の離散変数量子ビットに転送するためのプロトコルを提案する。このプロトコルは決定論的であり、トラップイオンおよび超伝導回路プラットフォームで容易に利用できる2モードのラビ型相互作用のみを利用する。無限次元の状態を有限次元レジスタに転送することで生じる避けられない誤差は、指数関数的に量子ビットの数で抑制される。さらに、エンコードされた状態は、量子ビットに作用するデファスメントや振幅減衰などのノイズに対して頑健性を示す。このプロトコルは、離散連続型ハイブリッド量子システムのための強力で柔軟なツールを提供する。 We present a protocol for transferring arbitrary continuous-variable quantum states into a few discrete-variable qubits and back. The protocol is deterministic and utilizes only two-mode Rabi-type interactions which are readily available in trapped-ion and superconducting circuit platforms. The inevitable errors caused by transferring an infinite-dimensional state into a finite-dimensional register are suppressed exponentially with the number of qubits. Furthermore, the encoded states exhibit robustness against noise, such as dephasing and amplitude damping, acting on the qubits. Our protocol thus provides a powerful and flexible tool for discrete-continuous hybrid quantum systems.	翻訳日:2023-03-25 18:39:43 公開日:2021-06-23
# 軌道光学格子におけるフェルミガスの量子退化 Quantum degenerate Fermi gas in an orbital optical lattice ( http://arxiv.org/abs/2106.12241v1 ) ライセンス: Link先を確認	M. Hachmann, Y. Kiefer, J. Riebesehl, R. Eichberger, A. Hemmerich	(参考訳) 光学式チェッカーボード正方形格子の励起ブロッホ帯において, スピン偏極試料と量子退化フェルミオン原子のスピン混合物を調製した。スピン偏極の場合、パウリの排除原理による衝突の抑制を反映して、10 s以上の極端帯域寿命が観測される。スピン混合物の場合、寿命は異なるスピン成分間の2体衝突によって桁違いに減少するが、それでも約1秒の顕著な大きな値が見つかる。運動量スペクトルを分析することで、光学格子の軌道特性を直接観測することができる。ここで実証された観測は、ユニタリティの体制を含む軌道光学格子における2対のスピン成分を持つフェルミ気体の物理を探索する基礎となる。 Spin-polarized samples and spin mixtures of quantum degenerate fermionic atoms are prepared in selected excited Bloch bands of an optical chequerboard square lattice. For the spin-polarized case, extreme band lifetimes above 10 s are observed, reflecting the suppression of collisions by Pauli's exclusion principle. For spin mixtures, lifetimes are reduced by an order of magnitude by two-body collisions between different spin components, but still remarkably large values of about one second are found. By analyzing momentum spectra, we can directly observe the orbital character of the optical lattice. The observations demonstrated here form the basis for exploring the physics of Fermi gases with two paired spin components in orbital optical lattices, including the regime of unitarity.	翻訳日:2023-03-25 18:39:32 公開日:2021-06-23
# Revenge Porn: 予備的専門家分析 Reporting Revenge Porn: a Preliminary Expert Analysis ( http://arxiv.org/abs/2106.12223v1 ) ライセンス: Link先を確認	A. De Angeli, M. Falduti, M. Menendez Blanco, S. Tessaris	(参考訳) 本研究では,リベンジポルノ(リベンジポルノ)と呼ばれる成人の親密・性的に露骨なデジタル画像の非コンセンサス分布に対する,被害者の視点からの対応に焦点を当てた。本稿では,選択したコンテンツ共有プラットフォームにおけるリベンジポルノ乱用を報告するためのプロセスに関する予備的専門家分析を行う。その中には、ソーシャルネットワーク、画像ホスティングサイト、ビデオホスティングプラットフォーム、フォーラム、ポルノサイトが含まれていました。性的行為の文脈における被害者の描写を目的とし、本人の顔を元の視覚的内容に置き換えるディープフェイク技術(ディープフェイク技術)の活用と、非合意による性的イメージやビデオのオンライン配信(リベンジポルノ)について、虐待を報告する方法について検討した。この予備分析は、これらの乱用を報告するためにプロバイダが設計した手順における、現在のプラクティスと潜在的な問題を理解することを目的としている。 In our research, we focus on the response to the non-consensual distribution of intimate or sexually explicit digital images of adults, also referred as revenge porn, from the point of view of the victims. In this paper, we present a preliminary expert analysis of the process for reporting revenge porn abuses in selected content sharing platforms. Among these, we included social networks, image hosting websites, video hosting platforms, forums, and pornographic sites. We looked at the way to report abuse, concerning both the non-consensual online distribution of private sexual image or video (revenge pornography), as well as the use of deepfake techniques, where the face of a person can be replaced on original visual content with the aim of portraying the victim in the context of sexual behaviours. This preliminary analysis is directed to understand the current practices and potential issues in the procedures designed by the providers for reporting these abuses.	翻訳日:2023-03-25 18:39:11 公開日:2021-06-23
# 平面k-一様状態:平面最大絡み合い状態の一般化 Planar k-Uniform States: a Generalization of Planar Maximally Entangled States ( http://arxiv.org/abs/2106.12209v1 ) ライセンス: Link先を確認	Yan-Ling Wang	(参考訳) 最近,ドローディアーニとカリミポーリ [Phys. Rev. A 102, 012427 (2020)] は、絶対最大絡み合い (AME) 状態よりもより広い多成分の絡み合った状態のクラスである平面的最大絡み合い (PME) 状態の概念を提案した。そこで彼らは多成分系でその構成を示したが、粒子の数は偶数に制限されている。ここでは、まず残りのケース、すなわち、奇数の粒子を持つ系の平面的最大絡み合った状態の構成を解く。さらに、PME を平面的 $k$-一様状態に一般化し、$N$ パーティの円に沿って隣接する任意の $k$ パーティへの縮約が最大に混合されるようにした。我々は最小のサポートを持つ平面$k$-一様状態の集合を構築する方法を示した。 Recently, Doroudiani and Karimipour [Phys. Rev. A 102, 012427 (2020)] proposed the notion of planar maximally entangled (PME) states, which are a wider class of multipartite entangled states than absolutely maximally entangled (AME) states. There they presented their constructions in multipartite systems, but the number of particles is restricted to be even. Here we first solve the remaining cases, i.e., constructions of planar maximally entangled states on systems with an odd number of particles. In addition, we generalize the PME states to the planar $k$-uniform states, whose reductions to any adjacent $k$ parties along a circle of $N$ parties are maximally mixed. We present a method to construct sets of planar $k$-uniform states which have minimal support.	翻訳日:2023-03-25 18:38:53 公開日:2021-06-23
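As a rough illustration of the definition in the entry above (every reduction to $k$ adjacent parties along a circle of $N$ parties is maximally mixed), the following Python sketch checks the property numerically for a given pure state. The function names `reduced_density_matrix` and `is_planar_k_uniform` and the two example states are assumptions chosen for this illustration. They are not taken from the paper, and the check says nothing about the paper's minimal-support constructions.

```python
import numpy as np

def reduced_density_matrix(psi, keep):
    """Reduced state on the parties in `keep` for a pure state tensor `psi`.

    `psi` has one axis per party; all other parties are traced out.
    """
    rest = [i for i in range(psi.ndim) if i not in keep]
    dim_keep = int(np.prod([psi.shape[i] for i in keep]))
    # Group kept parties as rows and traced-out parties as columns.
    m = np.transpose(psi, list(keep) + rest).reshape(dim_keep, -1)
    return m @ m.conj().T

def is_planar_k_uniform(psi, k, tol=1e-10):
    """True if every block of k adjacent parties on the circle is maximally mixed."""
    n = psi.ndim
    for start in range(n):
        keep = [(start + j) % n for j in range(k)]  # wraps around the circle
        rho = reduced_density_matrix(psi, keep)
        dim = rho.shape[0]
        if not np.allclose(rho, np.eye(dim) / dim, atol=tol):
            return False
    return True

# Three-qubit GHZ state: planar 1-uniform but not planar 2-uniform.
ghz = np.zeros((2, 2, 2))
ghz[0, 0, 0] = ghz[1, 1, 1] = 1 / np.sqrt(2)

# Four qubits on a circle with Bell pairs on the opposite parties (0, 2) and (1, 3).
crossed_bell = np.einsum("ac,bd->abcd", np.eye(2), np.eye(2)) / 2

print(is_planar_k_uniform(ghz, 1))           # True
print(is_planar_k_uniform(ghz, 2))           # False
print(is_planar_k_uniform(crossed_bell, 2))  # True
```

The crossed-Bell example shows why the planar condition is weaker than the AME condition: all four adjacent pairs are maximally mixed, while the non-adjacent pair (0, 2) is in a pure Bell state.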
# 一貫した質量を持たない粒子論の群理論的導出 Group theoretical derivation of consistent massless particle theories ( http://arxiv.org/abs/2106.12206v1 ) ライセンス: Link先を確認	Giuseppe Nisticò	(参考訳) 質量のない自由粒子の現在の理論は、空間反転と反ユニタリ作用素を仮定する。したがって、可能な理論の強固なクラスは破棄される。現在の無質量系の研究理論は、相対論的不変性の原理から厳密に推論的発展を通じて導かれるため、空間反転や時間反転作用素の一種が不整合を引き起こす場合にのみ排除される。その結果、質量のない孤立系に対する新しい一貫した理論のクラスが明確に決定される。一方、このアプローチは不変原理が示唆する一定の制約を定めており、過去のいくつかの調査で無視された結果、結果として不変原理と一致しないことがわかった。また、マスレスシステムのローカライズ可能性の問題は、新しい理論枠組みの中で再考され、一般化と過去の結果のより詳細な情報が得られる。 Current theories of massless free particles assume unitary space inversion and anti-unitary time reversal operators. In so doing, robust classes of possible theories are discarded. In the present work, theories of massless systems are derived through a strictly deductive development from the principle of relativistic invariance, so that a kind of space inversion or time reversal operator is ruled out only if it causes inconsistencies. As a result, new classes of consistent theories for massless isolated systems are explicitly determined. On the other hand, the approach determines definite constraints implied by the invariance principle; they were ignored by some past investigations that, as a consequence, turn out to be not consistent with the invariance principle. The problem of localizability for massless systems is also reconsidered within the new theoretical framework, yielding a generalization and a more detailed account of previous results.	翻訳日:2023-03-25 18:38:36 公開日:2021-06-23
# MEMS単点磁気格子計用カシミールプルイン抵抗形パラメトリック増幅器の解析 Analysis of a Casimir-driven Parametric Amplifier with Resilience to Casimir Pull-in for MEMS Single-Point Magnetic Gradiometry ( http://arxiv.org/abs/2106.12477v1 ) ライセンス: Link先を確認	Josh Javor, Zhancheng Yao, Matthias Imboden, David K. Campbell and David J. Bishop	(参考訳) 量子力学的効果であるカシミール力は、いくつかのマイクロエレクトロメカニカル・システム(MEMS)プラットフォームで観測されている。 2つの物体の分離に対する極度な感度のため、カシミール力は量子力学の優れた道として提案されている。しかし、実用的応用はカシミールプルイン(casimir pull-in)と呼ばれる装置の消耗や故障に繋がる魅力的な力によって困難である。本研究では,時間遅延に基づくパラメトリック増幅手法を開発し,定常状態を実現し,プルインを回避するカシミール駆動型メトロロジープラットフォームの設計とシミュレーションを行う。この設計は、心臓と脳のイオン電流から発生するものと類似した弱い、低周波、勾配磁場の検出に応用する。シミュレーションパラメータは、MEMSプラットフォーム上でカシミール気象学および磁気グラディオメトリーのために開発された最近の実験プラットフォームから選択される。 MEMSはそのような用途に多くの利点を提供するが、検出された信号は通常、生体磁場の低周波状態において感度を低下させるため、デバイスの共振周波数でなければならない。カシミール駆動パラメトリック増幅器を用いて,MEMS単点勾配計の最適分解能が1万倍向上し,最大感度は1Hzで6Hz/(pT/cm)であった。提案した設計は、気象学に革命をもたらす可能性があり、特に環境条件下での生体磁場の非シールドモニタリングを可能にする可能性がある。 The Casimir Force, a quantum mechanical effect, has been observed in several microelectromechanical systems (MEMS) platforms. Due to its extreme sensitivity to the separation of two objects, the Casimir Force has been proposed as an excellent avenue for quantum metrology. Practical application, however, is challenging due to attractive forces leading to stiction and failure of the device, called Casimir pull-in. In this work, we design and simulate a Casimir-driven metrology platform, where a time-delay based parametric amplification technique is developed to achieve a steady state and avoid pull-in. We apply the design to the detection of weak, low frequency, gradient magnetic fields, similar to those emanating from ionic currents in the heart and brain. Simulation parameters are selected from recent experimental platforms developed for Casimir metrology and magnetic gradiometry, both on MEMS platforms. 
While MEMS offer many advantages for such an application, the detected signal must typically be at the resonant frequency of the device, with diminished sensitivity in the low-frequency regime of biomagnetic fields. Using a Casimir-driven parametric amplifier, we report a 10,000-fold improvement in the best-case resolution of MEMS single-point gradiometers, with a maximum sensitivity of 6 Hz/(pT/cm) at 1 Hz. The development of the proposed design has the potential to revolutionize metrology, and specifically may enable unshielded monitoring of biomagnetic fields in ambient conditions.	翻訳日:2023-03-25 18:31:30 公開日:2021-06-23
# 光遠心分離型ガス混合系における状態及び分子選択的回転制御 State- and molecule-selective rotational control in gas mixtures with a shaped optical centrifuge ( http://arxiv.org/abs/2106.12468v1 ) ライセンス: Link先を確認	P. Amani, A. A. Milner, V. Milner	(参考訳) ガス混合系におけるオールオプティカル選択的回転制御法を実験的に示す。線形偏光が加速速度で回転する強いレーザーパルスである光遠心子を用いて、2つの異なる分子種を同時に2つの異なる回転周波数に励起する。新しいレベルの制御は、遠心分離分子の回転スペクトルに従って遠心スペクトルを形成することで達成される。形状の光学遠心分離機は、1つの分子種を他の分子よりも早く放出し、ターゲットの回転周波数と対応する回転状態とを分離する。この技術は、分子回転が衝突や化学反応に与える影響の研究において、回転制御の有用性を拡大する。 We demonstrate experimentally a method of all-optical selective rotational control in gas mixtures. Using an optical centrifuge - an intense laser pulse whose linear polarization rotates at an accelerated rate, we simultaneously excite two different molecular species to two different rotational frequencies of choice. The new level of control is achieved by shaping the centrifuge spectrum according to the rotational spectra of the centrifuged molecules. The shaped optical centrifuge releases one molecular species earlier than the other, therefore separating their target rotational frequencies and corresponding rotational states. The technique will expand the utility of rotational control in the studies of the effects of molecular rotation on collisions and chemical reactions.	翻訳日:2023-03-25 18:31:04 公開日:2021-06-23
# 深部ネットワークにおけるアナログ回路の展望 Prospects for Analog Circuits in Deep Networks ( http://arxiv.org/abs/2106.12444v1 ) ライセンス: Link先を確認	Shih-Chii Liu, John Paul Strachan, Arindam Basu	(参考訳) 機械学習のアルゴリズム(例:addsとsoft max)で一般的に使用される演算は、コンパクトなアナログ回路で実装できる。 Analog Application-Specific Integrated Circuit (ASIC) は、電荷共有回路やサブスレッショルドトランジスタなどの技術を用いてこれらのアルゴリズムを実装し、非常に高い電力効率を実現する。近年のディープラーニングアルゴリズムの進歩により、一般的な行列ベクトル乗算処理を実装するハードウェアデジタルアクセラレータの設計に焦点が移った。これらの設計の電力は通常、ネットワークの重みとアクティベーションを保持するのに必要なオフチップDRAMのメモリアクセスパワーによって支配される。複雑な非揮発性メモリ技術はオンチップメモリを提供するのに役立ち、アナログ回路はインコンピュータメモリアプローチと組み合わせて必要な乗算ベクトル演算を実装するのに適している。本稿では,様々な機械学習アルゴリズムを実装したアナログ設計について概説する。そして、エッジや小さな機械学習アプリケーションに適した低消費電力ディープネットワークアクセラレータでアナログ回路を使用するための展望を示す。 Operations typically used in machine learning algorithms (e.g. adds and soft max) can be implemented by compact analog circuits. Analog Application-Specific Integrated Circuit (ASIC) designs that implement these algorithms using techniques such as charge sharing circuits and subthreshold transistors, achieve very high power efficiencies. With the recent advances in deep learning algorithms, focus has shifted to hardware digital accelerator designs that implement the prevalent matrix-vector multiplication operations. Power in these designs is usually dominated by the memory access power of off-chip DRAM needed for storing the network weights and activations. Emerging dense non-volatile memory technologies can help to provide on-chip memory and analog circuits can be well suited to implement the needed multiplication-vector operations coupled with in-computing memory approaches. This paper presents a brief review of analog designs that implement various machine learning algorithms. It then presents an outlook for the use of analog circuits in low-power deep network accelerators suitable for edge or tiny machine learning applications.	翻訳日:2023-03-25 18:30:10 公開日:2021-06-23
# 量子計量に基づく普遍的半古典方程式 Universal semiclassical equations based on the quantum metric ( http://arxiv.org/abs/2106.12383v1 ) ライセンス: Link先を確認	C. Leblanc and G. Malpuech and D. D. Solnyshkov	(参考訳) 二バンド系における加速波束に対する半古典的運動方程式を導出する。これらの方程式は、量子計量によって記述される静的バンド幾何を用いて定式化できることを示す。我々は、ゼーマン項の有無にかかわらず、ラシュバ・ハミルトニアンの特定の場合を考える。半古典的軌道はシュレーディンガー方程式の解法から得られるものと完全に一致している。この形式主義は、伝統的なベリー曲率による断熱限界と異常ホール効果をうまく記述した。また、コヒーレントバンド重ね合わせの反対の極限を記述し、空間的に振動するZitterbewegung運動を引き起こす。 $k=0$で、そのような波束は実空間において円軌道を示し、その半径は量子計量の平方根によって与えられる。この量は普遍長スケールとして現れ、コンプトン波長の幾何学的起源を与える。 We derive semiclassical equations of motion for an accelerated wavepacket in a two-band system. We show that these equations can be formulated in terms of the static band geometry described by the quantum metric. We consider the specific cases of the Rashba Hamiltonian with and without a Zeeman term. The semiclassical trajectories are in full agreement with the ones found by solving the Schrödinger equation. This formalism successfully describes the adiabatic limit and the anomalous Hall effect traditionally attributed to Berry curvature. It also describes the opposite limit of coherent band superposition giving rise to a spatially oscillating Zitterbewegung motion. At $k=0$, such a wavepacket exhibits a circular trajectory in real space, with its radius given by the square root of the quantum metric. This quantity appears as a universal length scale, providing a geometrical origin of the Compton wavelength.	翻訳日:2023-03-25 18:28:56 公開日:2021-06-23
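A small numerical sketch may help make the length scale in the entry above concrete. It assumes a two-band Hamiltonian of the form $H(k) = d(k)\cdot\sigma$ with the Rashba-plus-Zeeman choice $d = (\alpha k_y, -\alpha k_x, \Delta)$, and uses the standard two-level expression $g_{ij} = \tfrac{1}{4}\,\partial_i \hat n \cdot \partial_j \hat n$ for the quantum metric, where $\hat n = d/|d|$. The parameter names `alpha` and `delta`, the finite-difference step, and the omission of the spin-independent kinetic term (which does not change the eigenvectors) are assumptions of this illustration, not details taken from the paper. The sketch evaluates only the metric and does not integrate any trajectory.

```python
import numpy as np

def bloch_vector(kx, ky, alpha, delta):
    """Unit vector n = d/|d| for d = (alpha*ky, -alpha*kx, delta); needs |d| > 0."""
    d = np.array([alpha * ky, -alpha * kx, delta], dtype=float)
    return d / np.linalg.norm(d)

def quantum_metric(kx, ky, alpha, delta, h=1e-5):
    """2x2 quantum metric g_ij = (1/4) d_i n . d_j n via central differences."""
    dn_dkx = (bloch_vector(kx + h, ky, alpha, delta)
              - bloch_vector(kx - h, ky, alpha, delta)) / (2 * h)
    dn_dky = (bloch_vector(kx, ky + h, alpha, delta)
              - bloch_vector(kx, ky - h, alpha, delta)) / (2 * h)
    grads = np.stack([dn_dkx, dn_dky])   # shape (2, 3)
    return 0.25 * grads @ grads.T        # shape (2, 2)

# Illustrative parameters (arbitrary units); delta must be nonzero at k = 0.
alpha, delta = 1.0, 0.5
g = quantum_metric(0.0, 0.0, alpha, delta)

# For this model the analytic value at k = 0 is sqrt(g_xx) = alpha / (2 * delta).
print(np.sqrt(g[0, 0]), alpha / (2 * delta))
```

With the Zeeman term switched off (`delta = 0`) the two bands touch at $k=0$ and the Bloch vector is undefined there, which is why the sketch requires a nonzero gap.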
# 絡み合った状態からのマジック状態蒸留 Magic State Distillation from Entangled States ( http://arxiv.org/abs/2106.12591v1 ) ライセンス: Link先を確認	Ning Bao, ChunJun Cao, Vincent Paul Su	(参考訳) マジックは、凝縮物系の低エネルギー状態のような多体絡み合い状態において非局所的に分散することができる。ブラヴィイ・キタエフ・マジックステート蒸留プロトコルを用いて、非局所魔法は蒸留可能であり、蒸留結果を改善することができる。いくつかの明確な例を分析し、スピンスクイージングにより蒸留不能状態が蒸留可能状態に変換することができることを示した。また, 魔法蒸留プロトコルによって仮定される従来の製品入力状態は, 蒸留可能な魔法を持つ一般の州では非常に非定型的であることも示唆した。さらに、高い確率でマジック状態を与える様々なエンタングル入力を研究する必要性を正当化している。 Magic can be distributed non-locally in many-body entangled states, such as the low energy states of condensed matter systems. Using the Bravyi-Kitaev magic state distillation protocol, we find that non-local magic is distillable and can improve the distillation outcome. We analyze a few explicit examples and show that spin squeezing can be used to convert non-distillable states into distillable ones. Our analysis also suggests that the conventional product input states assumed by magic distillation protocols are extremely atypical among general states with distillable magic. It further justifies the need for studying a diverse range of entangled inputs that yield magic states with high probability.	翻訳日:2023-03-25 18:22:08 公開日:2021-06-23
# 演算子のユニタリ分解によるオープン量子系の量子シミュレーション Quantum Simulation of Open Quantum Systems Using a Unitary Decomposition of Operators ( http://arxiv.org/abs/2106.12588v1 ) ライセンス: Link先を確認	Anthony W. Schlimgen, Kade Head-Marsden, LeeAnn M. Sager, Prineha Narang, and David A. Mazziotti	(参考訳) 現実の物理系と化学系における電子輸送は、しばしば大きな環境と非自明なエネルギー交換を伴い、開量子系の定義と処理を必要とする。オープン量子系の時間進化は非ユニタリ演算子を用いるため、オープン量子系のシミュレーションはユニタリ演算子やゲートのみから構築される普遍量子コンピュータの課題を示す。本稿では、量子デバイス上の任意の状態に対する非ユニタリ作用素の作用を実装するための一般的なアルゴリズムを提案する。任意の量子作用素は、少なくとも4つのユニタリ作用素の線型結合として正確に分解できることを示す。本手法を,ゼロおよび有限温度振幅減衰チャネルの2レベル系で実証する。結果は古典計算と一致しており、中間および将来の量子デバイス上での非ユニタリ操作をシミュレートする可能性を示している。 Electron transport in realistic physical and chemical systems often involves the non-trivial exchange of energy with a large environment, requiring the definition and treatment of open quantum systems. Because the time evolution of an open quantum system employs a non-unitary operator, the simulation of open quantum systems presents a challenge for universal quantum computers constructed from only unitary operators or gates. Here we present a general algorithm for implementing the action of any non-unitary operator on an arbitrary state on a quantum device. We show that any quantum operator can be exactly decomposed as a linear combination of at most four unitary operators. We demonstrate this method on a two-level system in both zero and finite temperature amplitude damping channels. The results are in agreement with classical calculations, showing promise in simulating non-unitary operations on intermediate-term and future quantum devices.	翻訳日:2023-03-25 18:21:55 公開日:2021-06-23
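To make the four-unitary statement in the entry above concrete, here is a minimal NumPy sketch of one textbook construction: rescale the operator to unit spectral norm, split it into Hermitian and anti-Hermitian parts, and write each Hermitian part $H$ with $\|H\| \le 1$ as $(U + U^\dagger)/2$ with $U = H + i\sqrt{1 - H^2}$. This is not claimed to be the construction used in the paper. The function names and the amplitude-damping example with $\gamma = 0.3$ are assumptions made for illustration only.

```python
import numpy as np

def hermitian_as_unitary(h):
    """Return a unitary U with (U + U^dagger)/2 == h, for Hermitian h with norm <= 1."""
    vals, vecs = np.linalg.eigh(h)
    vals = np.clip(vals, -1.0, 1.0)                # guard against rounding error
    phases = vals + 1j * np.sqrt(1.0 - vals**2)    # unit-modulus eigenvalues of U
    return (vecs * phases) @ vecs.conj().T

def four_unitary_decomposition(m):
    """Return four (coefficient, unitary) pairs whose weighted sum equals m."""
    m = np.asarray(m, dtype=complex)
    scale = np.linalg.norm(m, 2) or 1.0            # spectral norm; 1.0 if m is zero
    m_unit = m / scale
    herm = (m_unit + m_unit.conj().T) / 2          # Hermitian part
    anti = (m_unit - m_unit.conj().T) / 2j         # Hermitian; m_unit = herm + 1j*anti
    u_h = hermitian_as_unitary(herm)
    u_a = hermitian_as_unitary(anti)
    return [
        (scale / 2, u_h),
        (scale / 2, u_h.conj().T),
        (1j * scale / 2, u_a),
        (1j * scale / 2, u_a.conj().T),
    ]

# Example: the non-unitary Kraus operator of an amplitude damping channel.
gamma = 0.3
k1 = np.array([[0.0, np.sqrt(gamma)], [0.0, 0.0]])
terms = four_unitary_decomposition(k1)
rebuilt = sum(c * u for c, u in terms)
print(np.allclose(rebuilt, k1))  # True
```

On hardware, a weighted sum of unitaries like this is typically applied with ancilla qubits and postselection; that step is outside the scope of this sketch.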
# 弱重力場における相対論的粒子運動と量子光学 Relativistic Particle Motion and Quantum Optics in a Weak Gravitational Field ( http://arxiv.org/abs/2106.12514v1 ) ライセンス: Link先を確認	Charis Anastopoulos and Bei-Lok Hu	(参考訳) 宇宙における長いベースライン量子実験の可能性は、弱い重力場における相対論的量子粒子の時間的進化をよりよく理解する必要がある。従来の量子光学と量子力学に基づく原子物理学による従来の処理が、局所性、同時性、シグナリング、因果性などの問題に直面したときに不適切になる理由を説明する。量子場理論が必要である。重力効果を加えると、曲線時空(qftcst)における場の量子論に導かれる。この確立された理論は、重力と量子理論の基礎、および相対論的設定における量子情報理論の基礎概念をテストする、提案された宇宙実験の大規模なクラスに対する標準参照理論として機能するべきである。これは、qftcstの観点から宇宙空間における近距離量子光学および物質波実験を扱う一連の論文の最初のものである。我々はQFTCSTを用いた光子及びスカラー粒子の量子運動と干渉計実験への応用について分析した。我々の主な結果は、光子の場合、弱い重力場は不均質誘電体と完全に等しい順序に導かれるため、光媒体の理論からよく知られた概念を用いて、曲面空間における量子光学実験を記述できるということである。また、一階の量子コヒーレンスをプローブする干渉実験、共変粒子検出理論の重要性、到着時刻の関連性についても論じる。内部構造を持つ大質量粒子に対しては、励起内部状態に起因する異なる重力質量に由来する新しい重力誘起相転移を同定する。この位相シフトは、宇宙実験で原理的に測定することができる。 The possibility of long-baseline quantum experiments in space makes it necessary to better understand the time evolution of relativistic quantum particles in a weakly varying gravitational field. We explain why conventional treatments by traditional quantum optics and atomic physics based on quantum mechanics may become inadequate when faced with issues related to locality, simultaneity, signaling, causality, etc. Quantum field theory is needed. Adding the effects of gravitation, we are led to Quantum Field Theory in Curved Spacetime (QFTCST). This well-established theory should serve as the canonical reference theory to a large class of proposed space experiments testing the foundations of gravitation and quantum theory, and the basic notions of quantum information theory in relativistic settings. This is the first in a series of papers treating near-term quantum optics and matter waves experiments in space from the perspective of QFTCST. We analyze the quantum motion of photons and of scalar massive particles using QFTCST with application to interferometer experiments. 
Our main result is that, for photons, the weak gravitational field is to leading order completely equivalent to an inhomogeneous dielectric, thus allowing for a description of quantum optics experiments in curved space using familiar notions from the theory of optical media. We also discuss interference experiments that probe first-order quantum coherence, the importance of a covariant particle detection theory, and the relevance of time of arrival measurements. For massive particles with internal structure, we identify a novel gravity-induced phase shift that originates from the different gravitational masses attributed to the excited internal states. This phase shift can in principle be measured in space experiments.	翻訳日:2023-03-25 18:20:32 公開日:2021-06-23
# ソーシャルエンジニアリング:概念,技術,セキュリティ対策 Social engineering: Concepts, Techniques and Security Countermeasures ( http://arxiv.org/abs/2107.14082v1 ) ライセンス: Link先を確認	Adib Mohammed Syed	(参考訳) 本報告の目的は,サイバーセキュリティにおける社会工学と呼ばれるトピックを調査し,実社会工学の意味,概念,技術,セキュリティ対策について,事実的な学術研究に基づいて解説することである。 The purpose of this report is to research the topic called Social Engineering in Cyber Security and present the explanation of the meaning, concepts, techniques, and security countermeasures of Social Engineering based on factual academic research.	翻訳日:2023-03-25 18:13:09 公開日:2021-06-23
# 気候変動のための量子技術:予備評価 Quantum technologies for climate change: Preliminary assessment ( http://arxiv.org/abs/2107.05362v1 ) ライセンス: Link先を確認	Casey Berger, Agustin Di Paolo, Tracey Forrest, Stuart Hadfield, Nicolas Sawaya, Michał Stęchły and Karl Thibault	(参考訳) 気候変動は、人間社会や地球の生態系にもっと一般的な脅威をもたらす。緩和戦略は自然に科学、工学、経済学の幅広い課題を解決する必要がある。この文脈では、コンピュータ、センシング、通信における量子技術が急速に発展し、気候変動の影響を診断し緩和するための有用なツールになり得る。しかし、気候と量子科学の交わりはほとんど解明されていない。本報告は, 物理システムのシミュレーション, 組合せ最適化, センシング, エネルギー効率という4つの分野に着目し, 気候変動における量子技術の潜在的高インパクト利用事例を明らかにすることを目的としている。このレポートは、気候と量子科学のコミュニティを結びつける上で有用なリソースを提供してくれることを願っています。 Climate change presents an existential threat to human societies and the Earth's ecosystems more generally. Mitigation strategies naturally require solving a wide range of challenging problems in science, engineering, and economics. In this context, rapidly developing quantum technologies in computing, sensing, and communication could become useful tools to diagnose and help mitigate the effects of climate change. However, the intersection between climate and quantum sciences remains largely unexplored. This preliminary report aims to identify potential high-impact use-cases of quantum technologies for climate change with a focus on four main areas: simulating physical systems, combinatorial optimization, sensing, and energy efficiency. We hope this report provides a useful resource towards connecting the climate and quantum science communities, and to this end we identify relevant research questions and next steps.	翻訳日:2023-03-25 18:13:04 公開日:2021-06-23
# Analisis Kualitas Layanan website E-Commerce Bukalapak Terhadap Kepuasan Pengguna Mahasiswa Universitas Bina Darma Menggunakan Metode Webqual 4.0 Analisis Kualitas Layanan Website E-Commerce Bukalapak Terhadap Kepuasan Pengguna Mahasiswa Universitas Bina Darma Menggunakan Metode Webqual 4.0 ( http://arxiv.org/abs/2106.15342v1 ) ライセンス: Link先を確認	Adellia, Leon Andretti Abdillah	(参考訳) 新しいテクノロジーの成長は、オンラインで行うプロダクトマーケティングを動機付けている。オンライン開発をサポートする要因の1つは、オンラインの売買サイトやElectronic Commerceである。電子商取引の支持要因の1つはウェブサイトの利用である。 Webサイト(Webサイト、英: web)は、様々なテキスト情報、データ、静止画像、アニメーションデータ、サウンド、ビデオ、静的および動的の両方を表示するページの集合として解釈できるメディアの一種である。電子商取引企業はWebを通じて消費者と対話し、そのうちの1つはBukalapakのウェブサイトである。ウェブサイトの品質を決定するためには、測定する必要がある。 Webサイトの品質を測定することで、Webサイトに対するユーザの認識を見ることができる。本研究では,ユーザビリティ,情報品質,ユーザ満足度に関するインタラクション品質という3次元からなる webqual 4.0 を用いた。使用するデータは、アンケートを配布して元の情報源から直接得られるデータソースである一次データである。収集したデータは104人。本研究の回答者はbina darma大学生で,webサイトを客観的に評価することが期待された。 The growth of new technology, motivates some product marketing to be done online. One of the factors that support online development is online buying and selling sites or Electronic Commerce. One of the supporting factors for Electronic Commerce is using a website. Website or also commonly called the web is a form of media that can be interpreted as a collection of pages that display various kinds of text information, data, still or moving images, animation data, sound, video, both static and dynamic. Electronic Commerce companies interact with consumers through the web, one of which is the Bukalapak website, which is an online site provider for buying and selling products to be marketed. To determine the quality of a website, it is necessary to measure. By measuring the quality of a website, it can be seen the user's perception of the website. In this study using the Webqual 4.0 method which consists of 3 dimensions, namely usability, information quality and interaction quality on user satisfaction. 
The data are primary data, obtained directly from the original source by distributing questionnaires. Responses were collected from 104 respondents. The respondents were Bina Darma University students, who were expected to provide an objective assessment of the website under analysis.	翻訳日:2023-03-25 18:12:13 公開日:2021-06-23
# マルチタスク強化学習における階層型メモリ予測マシンの進化 Evolving Hierarchical Memory-Prediction Machines in Multi-Task Reinforcement Learning ( http://arxiv.org/abs/2106.12659v1 ) ライセンス: Link先を確認	Stephen Kelly, Tatiana Voegerl, Wolfgang Banzhaf, Cedric Gondro	(参考訳) 行動の基本的な側面は、記憶における経験の突出した特徴をエンコードし、これらの記憶を現在の感覚情報と組み合わせて、長期的な目標を最大化するような各状況に対する最善の行動を予測する能力である。世界は非常にダイナミックで、行動エージェントは時間とともに様々な環境や目的にまたがって一般化する必要がある。このシナリオは、部分的に観測可能なマルチタスク強化学習問題としてモデル化することができる。遺伝的プログラミングを用いて、OpenAIのClassic Controlスイートを含む6つのユニークな環境で動作可能な、高度に一般化されたエージェントを進化させる。これはエージェントが離散的および連続的なアクションを同時にサポートする必要がある。タスク識別センサーの入力は提供されないため、エージェントは状態変数のダイナミクスからタスクを識別し、各タスクの制御ポリシーを定義する必要がある。進化するプログラムにおける創発的階層構造は、時間分解とメモリ上の問題環境の符号化を成功させるマルチタスクエージェントをもたらすことを示す。結果として得られるエージェントは、6つの環境すべてにおいてタスク固有のエージェントと競合する。さらに、プログラムの階層構造は動的実行時の複雑さを許容し、これは比較的効率的な操作をもたらす。 A fundamental aspect of behaviour is the ability to encode salient features of experience in memory and use these memories, in combination with current sensory information, to predict the best action for each situation such that long-term objectives are maximized. The world is highly dynamic, and behavioural agents must generalize across a variety of environments and objectives over time. This scenario can be modeled as a partially-observable multi-task reinforcement learning problem. We use genetic programming to evolve highly-generalized agents capable of operating in six unique environments from the control literature, including OpenAI's entire Classic Control suite. This requires the agent to support discrete and continuous actions simultaneously. No task-identification sensor inputs are provided, thus agents must identify tasks from the dynamics of state variables alone and define control policies for each task. We show that emergent hierarchical structure in the evolving programs leads to multi-task agents that succeed by performing a temporal decomposition and encoding of the problem environments in memory. 
The resulting agents are competitive with task-specific agents in all six environments. Furthermore, the hierarchical structure of programs allows for dynamic run-time complexity, which results in relatively efficient operation.	翻訳日:2023-03-25 18:11:48 公開日:2021-06-23
# ブリルアン地域のトンネル:バレー・ホール・エッジ・チャンネルにおける後方散乱の理論 Tunneling in the Brillouin Zone: Theory of Backscattering in Valley Hall Edge Channels ( http://arxiv.org/abs/2106.12646v1 ) ライセンス: Link先を確認	Tirth Shah, Florian Marquardt, and Vittorio Peano	(参考訳) 最近の大規模な実験では、光子やフォノンなどのボソニック系のトポロジカル輸送を探索している。大部分の場合、時間反転対称性は保存され、バンド構造は幾何の適切な選択によって設計され、高対称性点近傍で位相的に非自明なバンドギャップを生成する。しかし、これは大きなクアシモメンタムの後方散乱の可能性を開き、トポロジカル保護を破壊した。これまでのところ、この効果を十分に抑制できる条件が何であるかははっきりしていない。本研究では,運動量空間におけるトンネル遷移の包括的半古典理論を導入し,バレーホール効果に基づく最も重要なシステムクラスの後方散乱について述べる。平滑な領域壁においても,局所的な壁面傾斜とエネルギーの双方で,有効散乱中心が形成されると予測する。さらに,本理論は,領域壁の滑らかさの増加に伴う反射振幅の指数関数的抑制の定量的解析を提供する。 A large set of recent experiments has been exploring topological transport in bosonic systems, e.g. of photons or phonons. In the vast majority, time-reversal symmetry is preserved, and band structures are engineered by a suitable choice of geometry, to produce topologically nontrivial bandgaps in the vicinity of high-symmetry points. However, this leaves open the possibility of large-quasimomentum backscattering, destroying the topological protection. Up to now, it has been unclear what precisely are the conditions where this effect can be sufficiently suppressed. In the present work, we introduce a comprehensive semiclassical theory of tunneling transitions in momentum space, describing backscattering for one of the most important system classes, based on the valley Hall effect. We predict that even for a smooth domain wall effective scattering centres develop at locations determined by both the local slope of the wall and the energy. Moreover, our theory provides a quantitative analysis of the exponential suppression of the overall reflection amplitude with increasing domain wall smoothness.	翻訳日:2023-03-25 18:10:17 公開日:2021-06-23
# アナログ量子シミュレータにおける電流の非侵襲計測 Non-invasive measurement of currents in analog quantum simulators ( http://arxiv.org/abs/2106.12599v1 ) ライセンス: Link先を確認	Kevin T. Geier, Janika Reichstetter, Philipp Hauke	(参考訳) アナログ量子シミュレータによる量子力学の研究能力にもかかわらず、電流を検出する可能性は低い。本稿では, 量子多体系の電流をアンシラに弱い結合で測定し, 続いてアンシラ集団を測定する柔軟な非侵襲的手法を提案する。ハーパー・ホフシュタットラー光格子ラダーにおける相互作用ボソンの例として,このスキームを数値的に評価し,実験誤差源について考察する。非常にフレキシブルなプロトコルは、ハードコアとソフトコアのボソンとフェルミオンの両方で使用することができ、現在の相関のようなより一般的な観測可能なものに容易に拡張可能であり、閉じ込められたイオンプラットフォームのために例示しているように、冷たい原子以外の設定にも適用できる。 Despite the pristine abilities of analog quantum simulators to study quantum dynamics, possibilities to detect currents are sparse. Here, we propose a flexible non-invasive technique to measure currents in quantum many-body systems by weakly coupling the system to an ancilla, followed by a measurement of the ancilla population. We numerically benchmark the scheme at the example of interacting bosons in a Harper-Hofstadter optical-lattice ladder, and discuss potential experimental error sources. The highly flexible protocol can be used with both hard-core and soft-core bosons as well as fermions, is easily extendable to more general observables like current-current correlations, and applies to other setups beyond cold atoms as we exemplify for the trapped-ion platform.	翻訳日:2023-03-25 18:10:00 公開日:2021-06-23
# graph universal adversarial attack: 悪役がグラフ学習モデルを台無しにする Graph Universal Adversarial Attacks: A Few Bad Actors Ruin Graph Learning Models ( http://arxiv.org/abs/2002.04784v2 ) ライセンス: Link先を確認	Xiao Zang, Yi Xie, Jie Chen, Bo Yuan	(参考訳) ディープニューラルネットワークは一般化されているものの、小さな対向摂動に敏感であることが知られている。この現象は深刻なセキュリティの脅威をもたらし、ディープラーニングモデルの堅牢性について深く調査する必要がある。グラフ構造化データのためのニューラルネットワークが出現すると、同様の調査が彼らの堅牢性を理解するよう促される。グラフ構造やノードの特徴を逆向きに摂動すると、モデルの性能が著しく低下する可能性があることが判明した。本研究では,対象とする被害者との接続を反転させることで,訓練済みのグラフニューラルネットワークを侵害する悪役ノードをグラフに含む場合,このような脆弱性が発生することを異なる角度から示す。さらに悪いことに、あるグラフモデルで見つかった悪いアクターは、他のモデルもひどく侵害している。我々はバッドアクタを「アンカーノード」と呼び、それらを識別するために GUA というアルゴリズムを提案する。徹底的な実証調査は、アンカーノードがしばしば同じクラスに属することの興味深い発見であり、アンカーノードの数と攻撃成功率の間の直感的なトレードオフの相関も示している。 2708ノードを含むデータセットCoraでは、わずか6つのアンカーノードで、GCNや他の3つのモデルに対する攻撃成功率が80%を超える。 Deep neural networks, while generalizing well, are known to be sensitive to small adversarial perturbations. This phenomenon poses a severe security threat and calls for in-depth investigation of the robustness of deep learning models. With the emergence of neural networks for graph structured data, similar investigations are urged to understand their robustness. It has been found that adversarially perturbing the graph structure and/or node features may result in a significant degradation of the model performance. In this work, we show from a different angle that such fragility similarly occurs if the graph contains a few bad-actor nodes, which compromise a trained graph neural network through flipping the connections to any targeted victim. Worse, the bad actors found for one graph model severely compromise other models as well. We call the bad actors "anchor nodes" and propose an algorithm, named GUA, to identify them. Thorough empirical investigations suggest an interesting finding that the anchor nodes often belong to the same class; and they also corroborate the intuitive trade-off between the number of anchor nodes and the attack success rate. 
For the dataset Cora, which contains 2708 nodes, as few as six anchor nodes will result in an attack success rate higher than 80% for GCN and three other models.	翻訳日:2023-01-01 19:37:33 公開日:2021-06-23
# クロスドメインオブジェクト検出のための無バイアス平均教師 Unbiased Mean Teacher for Cross-domain Object Detection ( http://arxiv.org/abs/2003.00707v2 ) ライセンス: Link先を確認	Jinhong Deng, Wen Li, Yuhua Chen, Lixin Duan	(参考訳) オブジェクト検出モデルはデータ分散、特に2つの異なるドメイン間のかなりの領域シフトに対して脆弱であることが多いため、ドメイン間のオブジェクト検出は困難である。本稿では,ドメイン間オブジェクト検出のためのUnbiased Mean Teacher (UMT)モデルを提案する。我々は、ドメイン横断シナリオにおいて、単純な平均教師(MT)モデルに対してかなりのモデルバイアスが存在することを明らかにする。特に,教師モデルにおいて,教師モデルの専門知識を最大限活用するためのMTのクロスドメイン蒸留法を提案する。さらに,学生モデルでは,画素レベルの適応でトレーニングサンプルを増強することにより,バイアスを軽減する。最後に, 現状モデルに最も適合する試料を選別し, クロスドメイン蒸留プロセスをさらに強化するために, アウト・オブ・ディストリビューション推定手法を用いる。これらの戦略でモデルバイアスの問題に取り組むことで、我々のumtモデルは、ベンチマークデータセットであるclipart1k、watercolor2k、fogggy cityscapes、cityscapes上で44.1%、58.1%、41.7%、43.1%のマップをそれぞれ達成し、既存の最先端の成果を上回っている。私たちの実装はhttps://github.com/kinredon/umtで利用可能です。 Cross-domain object detection is challenging, because object detection model is often vulnerable to data variance, especially to the considerable domain shift between two distinctive domains. In this paper, we propose a new Unbiased Mean Teacher (UMT) model for cross-domain object detection. We reveal that there often exists a considerable model bias for the simple mean teacher (MT) model in cross-domain scenarios, and eliminate the model bias with several simple yet highly effective strategies. In particular, for the teacher model, we propose a cross-domain distillation method for MT to maximally exploit the expertise of the teacher model. Moreover, for the student model, we alleviate its bias by augmenting training samples with pixel-level adaptation. Finally, for the teaching process, we employ an out-of-distribution estimation strategy to select samples that most fit the current model to further enhance the cross-domain distillation process. 
By tackling the model bias issue with these strategies, our UMT model achieves mAPs of 44.1%, 58.1%, 41.7%, and 43.1% on benchmark datasets Clipart1k, Watercolor2k, Foggy Cityscapes, and Cityscapes, respectively, outperforming the existing state-of-the-art results by notable margins. Our implementation is available at https://github.com/kinredon/umt.	翻訳日:2022-12-27 05:25:20 公開日:2021-06-23
# 人間デモから長距離タスクを一般化する学習 Learning to Generalize Across Long-Horizon Tasks from Human Demonstrations ( http://arxiv.org/abs/2003.06085v2 ) ライセンス: Link先を確認	Ajay Mandlekar, Danfei Xu, Roberto Martín-Martín, Silvio Savarese, Li Fei-Fei	(参考訳) 模倣学習は、高価なランダム探索プロセスに依存しないため、現実世界でロボットポリシーを訓練するための効果的で安全な手法である。しかし、探索の欠如により、実証された行動を超えて一般化する学習方針は依然としてオープンな課題である。本稿では,ロボットの模倣学習の枠組みを提案する。 1)少数の人間のデモンストレーションから複雑な実世界の操作タスクを効率的に学習し、 2) 収集した実演に含まれない新たな行動の合成。我々の重要な洞察は、多タスク領域がしばしば潜在構造を持ち、状態空間の共通領域で異なるタスクの軌道が交差することを示すことである。本稿では,この間欠的構造を利用した2段階のオフライン模倣学習アルゴリズムであるimitation(gti)による一般化について述べる。 GTIの第1段階では、異なる実演軌跡から行動を構成する能力を持つために軌道交叉を利用する確率的ポリシーを訓練する。 GTIの第2段階では、第1段階の無条件確率ポリシーからロールアウトの小さなセットを収集し、ゴール指向エージェントをトレーニングして、新規なスタートおよびゴール設定を一般化する。我々は,実世界におけるGTIのシミュレーション領域と長距離ロボット操作領域の両面での検証を行った。追加の結果とビデオはhttps://sites.google.com/view/gti2020/で見ることができる。 Imitation learning is an effective and safe technique to train robot policies in the real world because it does not depend on an expensive random exploration process. However, due to the lack of exploration, learning policies that generalize beyond the demonstrated behaviors is still an open challenge. We present a novel imitation learning framework to enable robots to 1) learn complex real world manipulation tasks efficiently from a small number of human demonstrations, and 2) synthesize new behaviors not contained in the collected demonstrations. Our key insight is that multi-task domains often present a latent structure, where demonstrated trajectories for different tasks intersect at common regions of the state space. We present Generalization Through Imitation (GTI), a two-stage offline imitation learning algorithm that exploits this intersecting structure to train goal-directed policies that generalize to unseen start and goal state combinations. In the first stage of GTI, we train a stochastic policy that leverages trajectory intersections to have the capacity to compose behaviors from different demonstration trajectories together. 
In the second stage of GTI, we collect a small set of rollouts from the unconditioned stochastic policy of the first stage, and train a goal-directed agent to generalize to novel start and goal configurations. We validate GTI in both simulated domains and a challenging long-horizon robotic manipulation domain in the real world. Additional results and videos are available at https://sites.google.com/view/gti2020/ .	翻訳日:2022-12-24 01:23:44 公開日:2021-06-23
# 訴訟手続の状況予測:逐次的テキストデータに基づくアプローチ Predicting Legal Proceedings Status: Approaches Based on Sequential Text Data ( http://arxiv.org/abs/2003.11561v4 ) ライセンス: Link先を確認	Felipe Maia Polo, Itamar Ciochetti, Emerson Bertolo	(参考訳) 本研究の目的は,ブラジルの法的手続を3段階に分類する予測モデルを開発することである。 (i)アーカイブされた手続 (ii)積極的な手続、及び (iii)停止。この問題の解決は、公共機関や民間機関が大規模な法的手続きのポートフォリオを管理し、規模と効率性を高めることを目的としている。本論文では,「運動」と呼ばれる短文の系列からなる訴訟手続について述べる。自然言語処理(NLP)と機械学習技術を組み合わせて問題解決を行った。資源不足のため、ポルトガルのNLPで作業することは難しいが、我々のアプローチは分類作業において非常にうまく行っており、最大精度は.93、最高スコアは.89(マクロ)と.93(重み)である。さらに,モデルの1つで学習したパターンを抽出・解釈し,そのパターンが分類タスクとどのように関連しているかを定量化することができた。解釈可能性のステップは、マシンラーニングの法的アプリケーションにおいて重要であり、ブラックボックスモデルがどのように意思決定を行うかに関するエキサイティングな洞察を与えてくれます。 The objective of this paper is to develop predictive models to classify Brazilian legal proceedings in three possible classes of status: (i) archived proceedings, (ii) active proceedings, and (iii) suspended proceedings. This problem's resolution is intended to assist public and private institutions in managing large portfolios of legal proceedings, providing gains in scale and efficiency. In this paper, legal proceedings are made up of sequences of short texts called "motions." We combined several natural language processing (NLP) and machine learning techniques to solve the problem. Although working with Portuguese NLP, which can be challenging due to lack of resources, our approaches performed remarkably well in the classification task, achieving maximum accuracy of .93 and top average F1 Scores of .89 (macro) and .93 (weighted). Furthermore, we could extract and interpret the patterns learned by one of our models besides quantifying how those patterns relate to the classification task. The interpretability step is important among machine learning legal applications and gives us an exciting insight into how black-box models make decisions.	翻訳日:2022-12-24 00:55:54 公開日:2021-06-23
# PO-EMO:ドイツ語・英語詩における美的感情の概念化・注釈・モデル化 PO-EMO: Conceptualization, Annotation, and Modeling of Aesthetic Emotions in German and English Poetry ( http://arxiv.org/abs/2003.07723v3 ) ライセンス: Link先を確認	Thomas Haider, Steffen Eger, Evgeny Kim, Roman Klinger, Winfried Menninghaus	(参考訳) ソーシャルメディア、文学、ニュース、その他のドメインの感情分析へのほとんどのアプローチは、ekmanやplutchikが定義する基本的な感情カテゴリのみに焦点を当てている。しかし、芸術(文学など)はより複雑で微妙な感情の幅広い範囲への関与を可能にする。それらには感情的な反応も混ざり合っていることが示されている。詩の感情は、著者がテキストで表現したものや意図したものではなく、読者によって引き起こされるものだと考える。そこで我々は,読者の審美的評価を予測可能な審美感情の集合を概念化し,複数のラベルの注釈を1行にまとめることで,その文脈内での混合感情を捉える。注意深い訓練を受けた専門家とクラウドソーシングによるアノテーション実験において,この新しい設定を評価した。専門家とのアノテーションは、kappa = .70の許容可能な一致をもたらし、将来の大規模分析のための一貫したデータセットをもたらす。最後に、BERTに基づく最初の感情分類実験を行い、ドイツのサブセットで最大.52F1-microの美的感情の識別が困難であることを示す。データとリソースはhttps://github.com/tnhaider/poetry-emotionで入手できる。 Most approaches to emotion analysis of social media, literature, news, and other domains focus exclusively on basic emotion categories as defined by Ekman or Plutchik. However, art (such as literature) enables engagement in a broader range of more complex and subtle emotions. These have been shown to also include mixed emotional responses. We consider emotions in poetry as they are elicited in the reader, rather than what is expressed in the text or intended by the author. Thus, we conceptualize a set of aesthetic emotions that are predictive of aesthetic appreciation in the reader, and allow the annotation of multiple labels per line to capture mixed emotions within their context. We evaluate this novel setting in an annotation experiment both with carefully trained experts and via crowdsourcing. Our annotation with experts leads to an acceptable agreement of kappa = .70, resulting in a consistent dataset for future large scale analysis. Finally, we conduct first emotion classification experiments based on BERT, showing that identifying aesthetic emotions is challenging in our data, with up to .52 F1-micro on the German subset. 
Data and resources are available at https://github.com/tnhaider/poetry-emotion	翻訳日:2022-12-22 21:11:38 公開日:2021-06-23
# ぼやけ、ノイズ、圧縮ロバストな生成型逆ネットワーク Blur, Noise, and Compression Robust Generative Adversarial Networks ( http://arxiv.org/abs/2003.07849v2 ) ライセンス: Link先を確認	Takuhiro Kaneko, Tatsuya Harada	(参考訳) generative adversarial networks (gans) は、画像の再現能力によってかなりの注目を集めている。しかし、画像がぼやけ、ノイズ、圧縮という形で劣化しているにもかかわらず、トレーニング画像を忠実に再現することができ、同様に劣化した画像を生成する。この問題を解決するために、最近提案されたノイズロバストGAN(NR-GAN)は、画像とノイズジェネレータからなる2世代モデルを用いて、ノイズの多い画像から直接クリーンな画像ジェネレータを学習できることを示し、部分解を提供する。しかし、その応用はノイズに限定されており、その付加的・可逆的特性により比較的分解が容易であり、ぼかし、圧縮、そしてすべての組み合わせという形で可逆的画像劣化への応用は依然として課題である。これらの問題に対処するために,劣化パラメータ(ぼかしカーネルタイプ,ノイズ量,品質係数値など)を知らずに,劣化画像から直接クリーン画像生成器を学習できる,ぼかし,ノイズ,圧縮頑健 gan (bncr-gan) を提案する。 NR-GANにインスパイアされたBNCR-GANは、画像、ボケカーネル、ノイズ、品質要素ジェネレータで構成される多重ジェネレータモデルを使用する。しかし,nr-ganとは対照的に,非可逆的な特性に対処するために,劣化前後のバイパスを用いてデータ駆動方式で劣化強度値を調整するマスキングアーキテクチャを導入する。さらに, ボケ, ノイズ, 圧縮の組み合わせによる不確実性を抑制するため, 劣化強度に応じて可逆的劣化過程間の一貫性を規定する適応的一貫性損失を導入する。 CIFAR-10の大規模比較とFFHQの一般性解析によるBNCR-GANの有効性を示す。さらに,画像復元におけるBNCR-GANの適用性を示す。 Generative adversarial networks (GANs) have gained considerable attention owing to their ability to reproduce images. However, they can recreate training images faithfully despite image degradation in the form of blur, noise, and compression, generating similarly degraded images. To solve this problem, the recently proposed noise robust GAN (NR-GAN) provides a partial solution by demonstrating the ability to learn a clean image generator directly from noisy images using a two-generator model comprising image and noise generators. However, its application is limited to noise, which is relatively easy to decompose owing to its additive and reversible characteristics, and its application to irreversible image degradation, in the form of blur, compression, and combination of all, remains a challenge. 
To address these problems, we propose blur, noise, and compression robust GAN (BNCR-GAN) that can learn a clean image generator directly from degraded images without knowledge of degradation parameters (e.g., blur kernel types, noise amounts, or quality factor values). Inspired by NR-GAN, BNCR-GAN uses a multiple-generator model composed of image, blur-kernel, noise, and quality-factor generators. However, in contrast to NR-GAN, to address irreversible characteristics, we introduce masking architectures adjusting degradation strength values in a data-driven manner using bypasses before and after degradation. Furthermore, to suppress uncertainty caused by the combination of blur, noise, and compression, we introduce adaptive consistency losses imposing consistency between irreversible degradation processes according to the degradation strengths. We demonstrate the effectiveness of BNCR-GAN through large-scale comparative studies on CIFAR-10 and a generality analysis on FFHQ. In addition, we demonstrate the applicability of BNCR-GAN in image restoration.	翻訳日:2022-12-22 20:18:33 公開日:2021-06-23
# 自然言語処理のための事前学習モデル:調査 Pre-trained Models for Natural Language Processing: A Survey ( http://arxiv.org/abs/2003.08271v4 ) ライセンス: Link先を確認	Xipeng Qiu, Tianxiang Sun, Yige Xu, Yunfan Shao, Ning Dai, and Xuanjing Huang	(参考訳) 近年,事前学習モデル(PTM)の出現により,自然言語処理(NLP)が新たな時代を迎えている。本調査では,NLP 用 PTM について概説する。まず,言語表現学習とその研究の進展について紹介する。そして,4つの観点から,既存のPTMを分類的に分類する。次に,PTMの知識を下流タスクに適応させる方法について述べる。最後に,今後の研究に向けた PTM の可能性について概説する。この調査は、様々なNLPタスクに対するPTMの理解、利用、開発のためのハンズオンガイドになることを目的としている。 Recently, the emergence of pre-trained models (PTMs) has brought natural language processing (NLP) to a new era. In this survey, we provide a comprehensive review of PTMs for NLP. We first briefly introduce language representation learning and its research progress. Then we systematically categorize existing PTMs based on a taxonomy with four perspectives. Next, we describe how to adapt the knowledge of PTMs to the downstream tasks. Finally, we outline some potential directions of PTMs for future research. This survey is intended to be a hands-on guide for understanding, using, and developing PTMs for various NLP tasks.	翻訳日:2022-12-22 09:31:44 公開日:2021-06-23
# 集中効果を持つマクロアクションを用いた効率的なブラックボックス計画 Efficient Black-Box Planning Using Macro-Actions with Focused Effects ( http://arxiv.org/abs/2004.13242v3 ) ライセンス: Link先を確認	Cameron Allen, Michael Katz, Tim Klinger, George Konidaris, Matthew Riemer, Gerald Tesauro	(参考訳) 決定論的計画の難しさは探索木深度とともに指数関数的に増加する。ブラックボックスプランニングは、プランナーがドメインの明示的なモデルなしで運用する必要があるため、さらに大きな課題となる。ヒューリスティックは検索をより効率的にするが、ブラックボックス計画のための目標認識ヒューリスティックは通常ゴールカウントに依存している。本稿では,目標数ヒューリスティックをより正確にするマクロアクションの発見によって,この制限を克服する方法を示す。提案手法は,目標数ヒューリスティックによる仮定とよく一致した,集中した効果(つまり少数の状態変数のみを修飾するマクロ)を持つマクロアクションを探索する。フォーカスされたマクロは、幅広い計画領域におけるブラックボックス計画効率を劇的に改善し、時には完全なドメインモデルへのアクセスで最先端のプランナーを圧倒する。 The difficulty of deterministic planning increases exponentially with search-tree depth. Black-box planning presents an even greater challenge, since planners must operate without an explicit model of the domain. Heuristics can make search more efficient, but goal-aware heuristics for black-box planning usually rely on goal counting, which is often quite uninformative. In this work, we show how to overcome this limitation by discovering macro-actions that make the goal-count heuristic more accurate. Our approach searches for macro-actions with focused effects (i.e. macros that modify only a small number of state variables), which align well with the assumptions made by the goal-count heuristic. Focused macros dramatically improve black-box planning efficiency across a wide range of planning domains, sometimes beating even state-of-the-art planners with access to a full domain model.	翻訳日:2022-12-08 21:59:49 公開日:2021-06-23
# 分類データセットのクラスタリングのための効率的な$k$-modesアルゴリズム An Efficient $k$-modes Algorithm for Clustering Categorical Datasets ( http://arxiv.org/abs/2006.03936v3 ) ライセンス: Link先を確認	Karin S. Dorman and Ranjan Maitra	(参考訳) データからクラスタをマイニングすることは、多くのアプリケーションにおいて重要な取り組みです。 k$-means法は、数値データをクラスタリングするための一般的で効率的で分散のないアプローチであるが、分類値の観測には適用されない。 k$-modes メソッドは、ユークリッドをハミング距離と平均とを $k$-means 目的関数のモードに置き換えることで、この lacuna に対処する。我々は, OTQT と呼ばれる$k$-modes の斬新で効率的な実装を提供する。 OTQTは既存の$k$-modesアルゴリズムでは検出不可能な目的関数を改善するために更新を見つける。アルゴリズムの複雑さのため、イテレーション毎に若干遅いが、otqtは常にイテレーションごとに正確であり、ほぼ常に(一部のデータセットではわずかに遅い)最終最適化まで高速である。したがって、$k$-modes最適化のためのデフォルトアルゴリズムとしてOTQTを推奨する。 Mining clusters from data is an important endeavor in many applications. The $k$-means method is a popular, efficient, and distribution-free approach for clustering numerical-valued data, but does not apply for categorical-valued observations. The $k$-modes method addresses this lacuna by replacing the Euclidean with the Hamming distance and the means with the modes in the $k$-means objective function. We provide a novel, computationally efficient implementation of $k$-modes, called OTQT. We prove that OTQT finds updates to improve the objective function that are undetectable to existing $k$-modes algorithms. Although slightly slower per iteration due to algorithmic complexity, OTQT is always more accurate per iteration and almost always faster (and only barely slower on some datasets) to the final optimum. Thus, we recommend OTQT as the preferred, default algorithm for $k$-modes optimization.	翻訳日:2022-11-24 21:32:05 公開日:2021-06-23
# 複雑ネットワーク上の感染ダイナミクスの深層学習 Deep learning of contagion dynamics on complex networks ( http://arxiv.org/abs/2006.05410v5 ) ライセンス: Link先を確認	Charles Murphy, Edward Laurence, Antoine Allard	(参考訳) 感染力学の進化を予測することは、力学モデルが部分解のみを与えるようなオープンな問題である。数学的または計算的に計算可能となるためには、これらのモデルは仮定を単純化し、予測の量的精度とモデル化できる力学の複雑さを制限する必要がある。本稿では,ネットワーク上で動的に制御する効果的な局所機構を時系列データから学習する深層学習に基づく補完的アプローチを提案する。当社のグラフニューラルネットワークアーキテクチャは,そのダイナミクスに関する仮定をほとんど行わず,複雑化に伴う異なる伝染ダイナミクスを用いてその正確さを実証する。任意のネットワーク構造をシミュレーションすることで,学習したダイナミックスの性質を学習データを超えて探索することが可能になる。最後に,スペインにおけるcovid-19流行の実データを用いて,このアプローチの適用性を示す。この結果は,ネットワーク上での感染動態の効果的なモデルを構築するために,ディープラーニングが新たな補完的な視点を提供することを示す。 Forecasting the evolution of contagion dynamics is still an open problem to which mechanistic models only offer a partial answer. To remain mathematically or computationally tractable, these models must rely on simplifying assumptions, thereby limiting the quantitative accuracy of their predictions and the complexity of the dynamics they can model. Here, we propose a complementary approach based on deep learning where the effective local mechanisms governing a dynamic on a network are learned from time series data. Our graph neural network architecture makes very few assumptions about the dynamics, and we demonstrate its accuracy using different contagion dynamics of increasing complexity. By allowing simulations on arbitrary network structures, our approach makes it possible to explore the properties of the learned dynamics beyond the training data. Finally, we illustrate the applicability of our approach using real data of the COVID-19 outbreak in Spain. Our results demonstrate how deep learning offers a new and complementary perspective to build effective models of contagion dynamics on networks.	翻訳日:2022-11-23 15:39:55 公開日:2021-06-23
# ShapeFlow: 3次元形状の学習可能な変形 ShapeFlow: Learnable Deformations Among 3D Shapes ( http://arxiv.org/abs/2006.07982v2 ) ライセンス: Link先を確認	Chiyu "Max" Jiang, Jingwei Huang, Andrea Tagliasacchi, Leonidas Guibas	(参考訳) 本稿では,3次元形状全体の変形空間を学習するためのフローベースモデルであるShapeFlowを提案する。 ShapeFlowは、形状トポロジーに非依存なマルチテンプレートの変形空間を学習できるが、微妙な幾何学的詳細を保存できる。遅延ベクトルが直接形状にデコードされる生成空間と異なり、変形空間はベクトルを連続流れにデコードし、ソース形状を目標に向けて対流させることができる。このような空間は、自然に幾何学的スタイル(元から来る)と構造的ポーズ(ターゲットに変形する)の切り離しを許す。ニューラルネットワークによって学習された連続的流れ場としてジオメトリ間の変形をパラメトリ化し、そのような変形が単射性、自己切断の自由、体積保存といった望ましい特性を持つことを保証できることを示す。本研究は, 変形による形状生成, 幾何学的様相転移, 形状のクラス全体に対する一貫したパラメータ化の教師なし学習, 形状補間など, 下流の様々な応用において学習された変形空間の有効性を示す。 We present ShapeFlow, a flow-based model for learning a deformation space for entire classes of 3D shapes with large intra-class variations. ShapeFlow allows learning a multi-template deformation space that is agnostic to shape topology, yet preserves fine geometric details. Different from a generative space where a latent vector is directly decoded into a shape, a deformation space decodes a vector into a continuous flow that can advect a source shape towards a target. Such a space naturally allows the disentanglement of geometric style (coming from the source) and structural pose (conforming to the target). We parametrize the deformation between geometries as a learned continuous flow field via a neural network and show that such deformations can be guaranteed to have desirable properties, such as bijectivity, freedom from self-intersections, or volume preservation. We illustrate the effectiveness of this learned deformation space for various downstream applications, including shape generation via deformation, geometric style transfer, unsupervised learning of a consistent parameterization for entire classes of shapes, and shape interpolation.	翻訳日:2022-11-21 13:32:46 公開日:2021-06-23
# dual t: ラベルノイズ学習における遷移行列の推定誤差の低減 Dual T: Reducing Estimation Error for Transition Matrix in Label-noise Learning ( http://arxiv.org/abs/2006.07805v3 ) ライセンス: Link先を確認	Yu Yao, Tongliang Liu, Bo Han, Mingming Gong, Jiankang Deng, Gang Niu, Masashi Sugiyama	(参考訳) クリーンラベルからノイズラベルへの遷移関係を示す遷移行列は、ラベルノイズ学習において統計的に一貫性のある分類器を構築するために必須である。遷移行列を推定するための既存の手法は、後方の騒がしいクラスの推定に大きく依存している。しかし, ラベルノイズのランダム性により, 騒音クラス後方推定誤差が大きくなり, 遷移行列の精度が低下する可能性が示唆された。そこで本稿では,分割・分割パラダイムを活用し,この問題を解決しようとする。具体的には,雑音のクラス後部を直接推定しないように中間クラスを導入する。この中間クラスにより、元の遷移行列は2つの容易に推定できる遷移行列の積に分解できる。提案手法を双対T推定器と呼ぶ。理論的解析と実証結果は、遷移行列を推定するための双対T推定器の有効性を示し、より良い分類性能をもたらす。 The transition matrix, denoting the transition relationship from clean labels to noisy labels, is essential to build statistically consistent classifiers in label-noise learning. Existing methods for estimating the transition matrix rely heavily on estimating the noisy class posterior. However, the estimation error for noisy class posterior could be large due to the randomness of label noise, which would lead the transition matrix to be poorly estimated. Therefore, in this paper, we aim to solve this problem by exploiting the divide-and-conquer paradigm. Specifically, we introduce an intermediate class to avoid directly estimating the noisy class posterior. By this intermediate class, the original transition matrix can then be factorized into the product of two easy-to-estimate transition matrices. We term the proposed method the dual-T estimator. Both theoretical analyses and empirical results illustrate the effectiveness of the dual-T estimator for estimating transition matrices, leading to better classification performances.	翻訳日:2022-11-21 09:50:26 公開日:2021-06-23
# パンデミックの初期段階における米国と英国における新型コロナウイルスワクチンの受容: 未成年者に対するai生成ワクチンの緩和と政府の役割 COVID-19 Vaccine Acceptance in the US and UK in the Early Phase of the Pandemic: AI-Generated Vaccines Hesitancy for Minors, and the Role of Governments ( http://arxiv.org/abs/2006.08164v3 ) ライセンス: Link先を確認	Gabriel Lima, Meeyoung Cha, Chiyoung Cha, Hyeyoung Hwang	(参考訳) 本研究は、新型コロナウイルス(COVID-19)の早期にワクチン接種を受けたいという国民の意思を調査し、対象間のデザインに基づいてワクチンの受け入れに影響を与える可能性のある要因について検討する。米国と英国の成人572人がオンライン調査に参加した。まず、参加者の医療利用傾向と初期ワクチンの受け入れを評価し、その後、新型コロナウイルスワクチンに対する態度の変化を評価するための短いビグネットを提供した。データ解析にはANOVAとポストホックのペアワイド比較が用いられた。参加者は自分の子供や高齢者よりも予防接種に消極的だった。ワクチン開発における人工知能(ai)の使用はワクチンの受容に影響を与えなかった。ワクチンの有効性を明示したヴィグネットは、ワクチンの受け入れを増加させた。本研究は、ウイルスに対するワクチンの有効性を強調する公共政策がワクチン接種率の向上につながることを示唆している。また、ワクチンの安全性に関する国民の期待についても論じ、その結果に基づく一連の影響を提示する。 This study presents survey results of the public's willingness to get vaccinated against COVID-19 during an early phase of the pandemic and examines factors that could influence vaccine acceptance based on a between-subjects design. A representative quota sample of 572 adults in the US and UK participated in an online survey. First, the participants' medical use tendencies and initial vaccine acceptance were assessed; then, short vignettes were provided to evaluate their changes in attitude towards COVID-19 vaccines. For data analysis, ANOVA and post hoc pairwise comparisons were used. The participants were more reluctant to vaccinate their children than themselves and the elderly. The use of artificial intelligence (AI) in vaccine development did not influence vaccine acceptance. Vignettes that explicitly stated the high effectiveness of vaccines led to an increase in vaccine acceptance. Our study suggests public policies emphasizing the vaccine effectiveness against the virus could lead to higher vaccination rates. We also discuss the public's expectations of governments concerning vaccine safety and present a series of implications based on our findings.	翻訳日:2022-11-21 04:44:48 公開日:2021-06-23
# 生成モデルを用いたロバスト圧縮センシング Robust Compressed Sensing using Generative Models ( http://arxiv.org/abs/2006.09461v3 ) ライセンス: Link先を確認	Ajil Jalal, Liu Liu, Alexandros G. Dimakis, Constantine Caramanis	(参考訳) 圧縮センシングの目標は、ノイズ線形方程式の未決定系から高次元ベクトルを推定することである。古典的な圧縮センシングと類似して、ここでは生成モデルが先行する、つまりベクトルは深い生成モデル $g: \mathbb{r}^k \rightarrow \mathbb{r}^n$ で表現されると仮定する。経験的リスク最小化(ERM)のような古典的回復アプローチは、測定行列がガウス以下である場合に成功することが保証される。しかし、測定行列と測定値が重く、または外れ値がある場合、回復は劇的に失敗する可能性がある。本稿では,Median-of-Means (MOM) にヒントを得たアルゴリズムを提案する。我々のアルゴリズムは、外れ値が存在する場合でも、重み付きデータの回復を保証する。理論的には,本手法はサブガウシアン仮定下でのermと同等のサンプル複雑性を満足することを示す。我々の実験は、我々の主張の両面を検証している: 他のアルゴリズムは、実際は脆弱で、重み付けや破損したデータの下で失敗する。 The goal of compressed sensing is to estimate a high dimensional vector from an underdetermined system of noisy linear equations. In analogy to classical compressed sensing, here we assume a generative model as a prior, that is, we assume the vector is represented by a deep generative model $G: \mathbb{R}^k \rightarrow \mathbb{R}^n$. Classical recovery approaches such as empirical risk minimization (ERM) are guaranteed to succeed when the measurement matrix is sub-Gaussian. However, when the measurement matrix and measurements are heavy-tailed or have outliers, recovery may fail dramatically. In this paper we propose an algorithm inspired by the Median-of-Means (MOM). Our algorithm guarantees recovery for heavy-tailed data, even in the presence of outliers. Theoretically, our results show our novel MOM-based algorithm enjoys the same sample complexity guarantees as ERM under sub-Gaussian assumptions. Our experiments validate both aspects of our claims: other algorithms are indeed fragile and fail under heavy-tailed and/or corrupted data, while our approach exhibits the predicted robustness.	翻訳日:2022-11-20 20:19:59 公開日:2021-06-23
# Lookahead-MinmaxによるGANの処理 Taming GANs with Lookahead-Minmax ( http://arxiv.org/abs/2006.14567v3 ) ライセンス: Link先を確認	Tatjana Chavdarova, Matteo Pagliardini, Sebastian U. Stich, Francois Fleuret, Martin Jaggi	(参考訳) ジェネレーティブ・Adversarial Networksはトレーニングが難しいことで有名だ。基礎となるminmax最適化は、確率勾配と関連するゲームベクトル場の回転成分の分散に非常に影響を受けやすい。これらの課題に取り組むため,我々は,単一目的最小化専用に開発されたminmax最適化のためのlookaheadアルゴリズムを提案する。 Lookahead-minmaxのバックトラックステップは自然に回転ゲームダイナミクスを処理します。この特性は、文献でしばしば分析される挑戦的な例に基づいて勾配上昇降下法を収束させる鍵であると考えられていました。さらに、大きなミニバッチを使用せずに、暗黙のうちに高い分散を処理する。 mnist、svhn、cifar-10、imagenetの実験結果は、性能と安定性の向上、メモリと計算コストの面で、lookahead-minmaxとadamまたはextragradientを組み合わせるという明確な利点を示している。 CIFAR-10のクラス依存型BigGANでは,30倍のパラメータと16倍のミニバッチを用いることで,クラスラベルを使わずに12.19のFIDを取得し,一般的な計算資源の範囲内で最先端のGANトレーニングを行う。 Generative Adversarial Networks are notoriously challenging to train. The underlying minmax optimization is highly susceptible to the variance of the stochastic gradient and the rotational component of the associated game vector field. To tackle these challenges, we propose the Lookahead algorithm for minmax optimization, originally developed for single objective minimization only. The backtracking step of our Lookahead-minmax naturally handles the rotational game dynamics, a property which was identified to be key for enabling gradient ascent descent methods to converge on challenging examples often analyzed in the literature. Moreover, it implicitly handles high variance without using large mini-batches, known to be essential for reaching state of the art performance. Experimental results on MNIST, SVHN, CIFAR-10, and ImageNet demonstrate a clear advantage of combining Lookahead-minmax with Adam or extragradient, in terms of performance and improved stability, for negligible memory and computational cost. 
Using 30-fold fewer parameters and 16-fold smaller minibatches we outperform the reported performance of the class-dependent BigGAN on CIFAR-10 by obtaining FID of 12.19 without using the class labels, bringing state-of-the-art GAN training within reach of common computational resources.	翻訳日:2022-11-17 03:23:43 公開日:2021-06-23
# ドメインに依存しない内部分布を用いた逐次モデル適応 Sequential Model Adaptation Using Domain Agnostic Internal Distributions ( http://arxiv.org/abs/2007.00197v4 ) ライセンス: Link先を確認	Mohammad Rostami, Aram Galstyan	(参考訳) 分類器の逐次適応アルゴリズムを開発し, 対象領域の非注釈領域に一般化するために, ソース領域を訓練した。このモデルは、ソースドメインアノテートされたデータに基づいてトレーニングされており、ソースドメインデータがアクセスできない場合には、ターゲットドメインアンアノテートされたデータを使用して適用する必要があると考えている。我々は、中間内部分布を介して、識別的埋め込み空間におけるソースとターゲットドメインの分布を整列する。この分布は埋め込みのソースデータ表現を用いて推定される。提案手法の有効性を実証する4つのベンチマーク実験を行い,既存手法と比較した。 We develop an algorithm for sequential adaptation of a classifier that is trained for a source domain to generalize in an unannotated target domain. We consider that the model has been trained on the source domain annotated data and then it needs to be adapted using the target domain unannotated data when the source domain data is not accessible. We align the distributions of the source and the target domains in a discriminative embedding space via an intermediate internal distribution. This distribution is estimated using the source data representations in the embedding. We conduct experiments on four benchmarks to demonstrate the method is effective and compares favorably against existing methods.	翻訳日:2022-11-14 22:35:52 公開日:2021-06-23
# 循環グラフ上の量子ウォークにおけるカオスからの秩序 Order from chaos in quantum walks on cyclic graphs ( http://arxiv.org/abs/2008.00316v3 ) ライセンス: Link先を確認	Abhisek Panda, Colin Benjamin	(参考訳) 2つのカオスランダムウォークを組み合わせることで、順序付けられた(周期的な)ウォークが得られることが古典的に示されている。本論文の目的は,この非直観的な結果に対する量子アナログを見つけることである。循環型量子ウォークのカオス的および周期的性質を考察し,3サイクルグラフ上の周期的量子ウォークが同じグラフ上の2つのカオス的量子ウォークの決定論的組み合わせによって生成されるユニークな状況に着目した。結果は偶数巡回グラフ、特に4サイクルグラフにも拡張します。私たちの結果は量子暗号と量子カオス制御に関係します。 It has been shown classically that combining two chaotic random walks can yield an ordered (periodic) walk. Our aim in this paper is to find a quantum analog for this rather counter-intuitive result. We study the chaotic and periodic nature of cyclic quantum walks and focus on a unique situation wherein a periodic quantum walk on a 3-cycle graph is generated via a deterministic combination of two chaotic quantum walks on the same graph. We also extend our results to even-numbered cyclic graphs, specifically a 4-cycle graph. Our results will be relevant in quantum cryptography and quantum chaos control.	翻訳日:2022-11-04 00:55:20 公開日:2021-06-23
# 1つの単眼映像による高妥当性・信頼性歩行パラメータのアルゴリズム Algorithm Based on One Monocular Video Delivers Highly Valid and Reliable Gait Parameters ( http://arxiv.org/abs/2008.08045v5 ) ライセンス: Link先を確認	Dr. Arash Azhand, Dr. Sophie Rabe, Dr. Swantje M\"uller, Igor Sattler, Dr. Anika Steinert	(参考訳) 多様体のユースケース(例えば、医療産業、スポーツ、リハビリテーション、フィットネスアセスメントなど)において最重要でありながら、十分な有効で信頼性の高い歩行パラメータの測定は依然としてハイテク歩行研究所に限られている。本稿では,現代の畳み込みニューラルネットワークを基盤とし,歩行者の単眼前頭視映像から三次元骨格関節を抽出する,新たな歩行評価システムの有効性とテスト・テストの再現性を示す。この妥当性は, GAITRite の圧力感受性歩行システムとの比較に基づく。すべての歩行パラメータ(歩行速度、ケイデンス、歩幅、歩幅)は、通常の歩行と速い歩行速度で複数の歩行試行において優れた同時妥当性を示した。テスト-再テスト-リピータビリティは、GAITRiteシステムと同じレベルである。結論として,本研究の結果は,幅広い主流アプリケーションにおいて,コスト,空間,運用上有効な歩容解析への道を開くことができると確信している。ほとんどのセンサーベースのシステムはコストがかかり、広範囲に訓練された人員(例えばモーションキャプチャシステム)によって運用されなければならない。対照的に、ここで提示する評価方法に十分なビデオは、多くのトレーニングなしで、スマートフォンのカメラで誰でも入手することができる。 Despite its paramount importance for manifold use cases (e.g., in the health care industry, sports, rehabilitation and fitness assessment), sufficiently valid and reliable gait parameter measurement is still limited to high-tech gait laboratories mostly. Here, we demonstrate the excellent validity and test-retest repeatability of a novel gait assessment system which is built upon modern convolutional neural networks to extract three-dimensional skeleton joints from monocular frontal-view videos of walking humans. The validity study is based on a comparison to the GAITRite pressure-sensitive walkway system. All measured gait parameters (gait speed, cadence, step length and step time) showed excellent concurrent validity for multiple walk trials at normal and fast gait speeds. The test-retest-repeatability is on the same level as the GAITRite system. In conclusion, we are convinced that our results can pave the way for cost, space and operationally effective gait analysis in broad mainstream applications. 
Most sensor-based systems are costly, must be operated by extensively trained personnel (e.g., motion capture systems) or - even if not quite as costly - still possess considerable complexity (e.g., wearable sensors). In contrast, a video sufficient for the assessment method presented here can be obtained by anyone, without much training, via a smartphone camera.	翻訳日:2022-11-02 18:56:36 公開日:2021-06-23
# 古典密度汎関数理論における状態関数の物理制約ベイズ推論 Physics-constrained Bayesian inference of state functions in classical density-functional theory ( http://arxiv.org/abs/2010.03374v4 ) ライセンス: Link先を確認	Peter Yatsyshin, Serafim Kalliadasis and Andrew B. Duncan	(参考訳) 古典統計力学の逆問題に対する新しいデータ駆動型アプローチを開発し、古典的な多体系の集合運動に関する実験データから、その系の自由エネルギー景観をどう特徴づけるか。非パラメトリックベイズ推論と物理的動機付け制約を組み合わせることで,近似自由エネルギー汎関数の構成を自動化する効率的な学習アルゴリズムを開発した。コスト関数を最小化しようとする最適化ベースの機械学習アプローチとは対照的に、ベイズ推論の中心となる考え方は、物理原理から導かれるモデルを通じて事前仮定の集合を伝播させることである。実験データは、可能なモデル予測を確率的に評価するために使用される。これは自然に予測の完全不確実な定量化を伴う人間の解釈可能なアルゴリズムにつながる。この場合、学習アルゴリズムの出力は、観測された粒子データと一致する自由エネルギー汎関数の族上の確率分布である。驚くほど小さなデータサンプルは、基礎となる自由エネルギー関数の高精度な解析式を推測するのに十分な情報を含んでおり、アルゴリズムを高度にデータ効率良くする。自由エネルギーの観点からのモデリングにおいて非常に困難である一方, 自然界においてユビキタスである体積粒子相互作用の排除を考える。このアプローチを検証するために, 1次元流体のパラダイム的場合を考察し, 標準的および大カノニカル統計力学的アンサンブルの推論アルゴリズムを開発した。高次元システムの拡張は概念的には単純であるが、標準的な粗粒化技術では魅力的な相互作用を容易に取り入れることができる。 We develop a novel data-driven approach to the inverse problem of classical statistical mechanics: given experimental data on the collective motion of a classical many-body system, how does one characterise the free energy landscape of that system? By combining non-parametric Bayesian inference with physically-motivated constraints, we develop an efficient learning algorithm which automates the construction of approximate free energy functionals. In contrast to optimisation-based machine learning approaches, which seek to minimise a cost function, the central idea of the proposed Bayesian inference is to propagate a set of prior assumptions through the model, derived from physical principles. The experimental data is used to probabilistically weigh the possible model predictions. This naturally leads to humanly interpretable algorithms with full uncertainty quantification of predictions. In our case, the output of the learning algorithm is a probability distribution over a family of free energy functionals, consistent with the observed particle data. 
We find that surprisingly small data samples contain sufficient information for inferring highly accurate analytic expressions of the underlying free energy functionals, making our algorithm highly data efficient. We consider excluded volume particle interactions, which are ubiquitous in nature, whilst being highly challenging for modelling in terms of free energy. To validate our approach we consider the paradigmatic case of one-dimensional fluid and develop inference algorithms for the canonical and grand-canonical statistical-mechanical ensembles. Extensions to higher-dimensional systems are conceptually straightforward, whilst standard coarse-graining techniques allow one to easily incorporate attractive interactions.	翻訳日:2022-10-10 00:05:09 公開日:2021-06-23
# textsettr: 最小限のテキストスタイル抽出とチューニング可能なターゲットレスタイリング TextSETTR: Few-Shot Text Style Extraction and Tunable Targeted Restyling ( http://arxiv.org/abs/2010.03802v3 ) ライセンス: Link先を確認	Parker Riley, Noah Constant, Mandy Guo, Girish Kumar, David Uthus, Zarana Parekh	(参考訳) 本稿では,テキストスタイル転送問題に対する新しいアプローチを提案する。スタイルラベル付き学習データを必要とする従来の手法とは異なり,提案手法は隣接した文間のスタイルの暗黙的な接続に依存し,推論時にのみラベル付きデータを使用する。我々は、強い事前訓練されたテキスト-テキストモデルであるT5(Raffel et al., 2020)に適応し、テキストからスタイルベクトルを抽出し、デコーダを用いてスタイル転送を行う。ラベルなしのトレーニングでは,多くのスタイルのファセットを符号化したスタイルベクトル空間が生成されるので,入力の特定の属性を調整し,他の属性を保存しながら,転送を"ターゲット復元"ベクター操作として再キャストする。ラベルなしのamazon reviewsデータに対するトレーニングの結果、ラベル付きデータで完全にトレーニングされたモデルと比較しても、感情伝達に競争力のあるモデルが得られることを実証する。さらに,ラベルのないwebテキストの多種多様なコーパスに適用することで,追加のトレーニングを受けず,推論時にほんの一握りの例を用いても,多次元のスタイル(発話性,動機づけ性,形式性,礼儀正しく,感情)を伝達できる単一モデルが得られた。 We present a novel approach to the problem of text style transfer. Unlike previous approaches requiring style-labeled training data, our method makes use of readily-available unlabeled text by relying on the implicit connection in style between adjacent sentences, and uses labeled data only at inference time. We adapt T5 (Raffel et al., 2020), a strong pretrained text-to-text model, to extract a style vector from text and use it to condition the decoder to perform style transfer. As our label-free training results in a style vector space encoding many facets of style, we recast transfers as "targeted restyling" vector operations that adjust specific attributes of the input while preserving others. We demonstrate that training on unlabeled Amazon reviews data results in a model that is competitive on sentiment transfer, even compared to models trained fully on labeled data. Furthermore, applying our novel method to a diverse corpus of unlabeled web text results in a single model capable of transferring along multiple dimensions of style (dialect, emotiveness, formality, politeness, sentiment) despite no additional training and using only a handful of exemplars at inference time.	
翻訳日:2022-10-09 11:14:31 公開日:2021-06-23
# Permuted AdaIN: 画像分類における世界統計へのバイアス削減 Permuted AdaIN: Reducing the Bias Towards Global Statistics in Image Classification ( http://arxiv.org/abs/2010.05785v3 ) ライセンス: Link先を確認	Oren Nuriel, Sagie Benaim, Lior Wolf	(参考訳) 近年の研究では、畳み込みニューラルネットワーク分類器は形状を犠牲にしてテクスチャに依存することが示されている。一方、形状と局所像の区別は類似しているが異なるが、一方、グローバル画像統計は異なる。提案手法は,pAdaIN(Permuted Adaptive Instance Normalization)と呼ばれ,画像分類器の隠蔽層におけるグローバル統計の表現を低減する。 padainは、与えられたバッチ内のサンプルを並べ替えるランダムな置換$\pi$をサンプリングする。適応インスタンス正規化(adain)は、各(置換されていない)サンプル$i$のアクティベーションと、対応するサンプル$\pi(i)$のアクティベーションの間に適用される。グローバル画像統計は歪んでいるため、この交換手順により、ネットワークは形状やテクスチャなどの手がかりに依存することになる。確率 $p$ のランダム置換とそれ以外は恒等置換を選択することで、効果の強さを制御できる。すべての実験で$p$と固定 aprioriを正しく選択し、テストデータを考慮せずに選択することで、複数の設定でベースラインを一貫して上回っています。画像分類では,複数のアーキテクチャを用いてCIFAR100とImageNetの両方を改良する。堅牢性の設定では、複数のアーキテクチャに対して ImageNet-C と Cifar-100-C の両方を改良する。ドメイン適応とドメイン一般化の設定において,本手法はGTAVからCityscapesおよびPACSベンチマークへの変換学習タスクにおける技術結果の状態を達成している。 Recent work has shown that convolutional neural network classifiers overly rely on texture at the expense of shape cues. We make a similar but different distinction between shape and local image cues, on the one hand, and global image statistics, on the other. Our method, called Permuted Adaptive Instance Normalization (pAdaIN), reduces the representation of global statistics in the hidden layers of image classifiers. pAdaIN samples a random permutation $\pi$ that rearranges the samples in a given batch. Adaptive Instance Normalization (AdaIN) is then applied between the activations of each (non-permuted) sample $i$ and the corresponding activations of the sample $\pi(i)$, thus swapping statistics between the samples of the batch. Since the global image statistics are distorted, this swapping procedure causes the network to rely on cues, such as shape or texture. By choosing the random permutation with probability $p$ and the identity permutation otherwise, one can control the effect's strength. 
With the correct choice of $p$, fixed a priori for all experiments and selected without considering test data, our method consistently outperforms baselines in multiple settings. In image classification, our method improves on both CIFAR100 and ImageNet using multiple architectures. In the setting of robustness, our method improves on both ImageNet-C and CIFAR-100-C for multiple architectures. In the setting of domain adaptation and domain generalization, our method achieves state-of-the-art results on the transfer learning task from GTAV to Cityscapes and on the PACS benchmark.	翻訳日:2022-10-09 05:59:22 公開日:2021-06-23
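The pAdaIN procedure described in the entry above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: activations are nested lists indexed as `batch[sample][channel][position]`, the names `channel_stats`, `adain`, `permuted_adain` and the constant `EPS` are hypothetical, and AdaIN is taken in its usual form $\sigma(b)\,(a-\mu(a))/\sigma(a)+\mu(b)$.

```python
import math
import random

EPS = 1e-5  # assumed small constant to avoid division by zero


def channel_stats(values):
    """Mean and (stabilised) standard deviation of one channel."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(var + EPS)


def adain(content, style):
    """Normalise `content`, then give it the mean and std of `style`."""
    mu_c, sigma_c = channel_stats(content)
    mu_s, sigma_s = channel_stats(style)
    return [sigma_s * (v - mu_c) / sigma_c + mu_s for v in content]


def permuted_adain(batch, p, rng):
    """With probability p, swap channel statistics along a random permutation."""
    if rng.random() >= p:
        # Identity permutation: return an unchanged copy of the batch.
        return [[list(channel) for channel in sample] for sample in batch]
    perm = list(range(len(batch)))
    rng.shuffle(perm)  # the random permutation pi over the batch
    return [
        [adain(batch[i][c], batch[perm[i]][c]) for c in range(len(batch[i]))]
        for i in range(len(batch))
    ]
```

Because the permutation is a bijection, the set of per-channel means across the batch is preserved; only their assignment to samples changes, which is the "swapping statistics" effect the abstract refers to.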
# スマートビルにおける異常検出のための連合学習手法 A Federated Learning Approach to Anomaly Detection in Smart Buildings ( http://arxiv.org/abs/2010.10293v3 ) ライセンス: Link先を確認	Raed Abdel Sater and A. Ben Hamza	(参考訳) スマートな建物におけるIoT(Internet of Things)センサーはますます普及しており、建物をより生き生きとエネルギー効率を良くし、持続可能なものにしている。これらの装置は環境を感知し、スマートビルにおける異常の検出とエネルギー使用量の予測を改善するため、最重要度の多変量時間データを生成する。しかしながら、中央システムにおけるこれらの異常の検出は、応答時間の大幅な遅延によってしばしば悩まされる。本研究では,タスク間の類似性と差異を生かしつつ,複数のタスクを同時に解決することを目的としたマルチタスク学習パラダイムを活用して,連合学習環境における異常検出問題を定式化する。本論文では,lstm(stacked long short-time memory)モデルを用いた,新しいプライバシ・バイ・デザインのフェデレーション学習モデルを提案する。当社のフェデレーション学習手法の有効性を,一般電流スマートビルディングにおけるiot生産システムによって生成された3つの実世界データセットを用いて実証した。本研究は,予測性能を損なうことなく,総合的なトレーニングコストを削減するためのフレームワークの有効性を示す。 Internet of Things (IoT) sensors in smart buildings are becoming increasingly ubiquitous, making buildings more livable, energy efficient, and sustainable. These devices sense the environment and generate multivariate temporal data of paramount importance for detecting anomalies and improving the prediction of energy usage in smart buildings. However, detecting these anomalies in centralized systems is often plagued by a huge delay in response time. To overcome this issue, we formulate the anomaly detection problem in a federated learning setting by leveraging the multi-task learning paradigm, which aims at solving multiple tasks simultaneously while taking advantage of the similarities and differences across tasks. We propose a novel privacy-by-design federated learning model using a stacked long short-time memory (LSTM) model, and we demonstrate that it is more than twice as fast during training convergence compared to the centralized LSTM. The effectiveness of our federated learning approach is demonstrated on three real-world datasets generated by the IoT production system at General Electric Current smart building, achieving state-of-the-art performance compared to baseline methods in both classification and regression tasks. 
Our experimental results demonstrate the effectiveness of the proposed framework in reducing the overall training cost without compromising the prediction performance.	翻訳日:2022-10-05 07:21:34 公開日:2021-06-23
# PHEW: トレーニングデータなしで学習し、より良く一般化するスパースネットワークの構築 PHEW: Constructing Sparse Networks that Learn Fast and Generalize Well without Training Data ( http://arxiv.org/abs/2010.11354v2 ) ライセンス: Link先を確認	Shreyas Malakarjun Patil, Constantine Dovrolis	(参考訳) 初期化時にネットワークをスパース化する手法は、学習と推論の両方の効率を大幅に改善するため、実際に重要である。我々の研究は、最近提案されたNeural Tangent Kernel(NTK)の分解に基づいており、トレーニングプロセスのダイナミクスをデータ依存コンポーネントとアーキテクチャ依存カーネル(後者はPath Kernelと呼ばれる)に分離した。この研究は、Synflow-L2アルゴリズムを使用して、トレーニングデータなしで、より高速な収束のためにスパースニューラルネットワークを設計する方法を示した。我々はまず、Synflow-L2が収束の点で最適であるにもかかわらず、ネットワーク密度が与えられた場合、ネットワークのサブネットワークに"bottleneck"層(狭い層)が生じることを示し、同じ数のパラメータを使用する他のデータに依存しない手法と比べてパフォーマンスが劣ることを示した。そこで本稿では,PHEW(Paths with Higher-Edge Weights)と呼ばれるトレーニングデータなしでスパースネットワークを構築する手法を提案する。 phewは、初期重みのみに依存するバイアス付きランダムウォークに基づく確率的ネットワーク形成手法である。 Synflow-L2と同様のパスカーネル特性を持つが、より広い層を生成するため、より一般化と性能が向上する。 PHEWは、幅広いネットワーク密度で、データ非依存のSynFlowとSynFlow-L2メソッドよりも大幅に改善されている。 Methods that sparsify a network at initialization are important in practice because they greatly improve the efficiency of both learning and inference. Our work is based on a recently proposed decomposition of the Neural Tangent Kernel (NTK) that has decoupled the dynamics of the training process into a data-dependent component and an architecture-dependent kernel - the latter referred to as Path Kernel. That work has shown how to design sparse neural networks for faster convergence, without any training data, using the Synflow-L2 algorithm. We first show that even though Synflow-L2 is optimal in terms of convergence, for a given network density, it results in sub-networks with "bottleneck" (narrow) layers - leading to poor performance as compared to other data-agnostic methods that use the same number of parameters. Then we propose a new method to construct sparse networks, without any training data, referred to as Paths with Higher-Edge Weights (PHEW). PHEW is a probabilistic network formation method based on biased random walks that only depends on the initial weights. 
It has similar path kernel properties as Synflow-L2 but it generates much wider layers, resulting in better generalization and performance. PHEW achieves significant improvements over the data-independent SynFlow and SynFlow-L2 methods at a wide range of network densities.	翻訳日:2022-10-04 07:18:31 公開日:2021-06-23
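The idea of forming a sparse network from biased random walks over the initial weights can be illustrated as follows. This is only a sketch under assumptions that the abstract does not spell out: the walk moves to the next layer with probability proportional to the weight magnitude, walks start from input units in round-robin order, and the function name `biased_walk_mask` and the `weights[layer][i][j]` layout are hypothetical.

```python
import random


def biased_walk_mask(weights, num_walks, rng):
    """Return, per layer, the set of (from_unit, to_unit) edges kept.

    weights[l][i][j] is the initial weight from unit i in layer l
    to unit j in layer l + 1. No training data is used.
    """
    kept = [set() for _ in weights]
    n_inputs = len(weights[0])
    for walk in range(num_walks):
        unit = walk % n_inputs  # assumed round-robin choice of start unit
        for layer, matrix in enumerate(weights):
            magnitudes = [abs(w) for w in matrix[unit]]
            # Biased step: larger |weight| means higher transition probability.
            nxt = rng.choices(range(len(magnitudes)), weights=magnitudes, k=1)[0]
            kept[layer].add((unit, nxt))
            unit = nxt
    return kept
```

Every kept edge lies on a complete input-to-output path, so the resulting mask never contains dead-end connections; the number of walks controls the final density.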
# fdrn:医療画像のための高速変形可能な登録ネットワーク FDRN: A Fast Deformable Registration Network for Medical Images ( http://arxiv.org/abs/2011.02307v4 ) ライセンス: Link先を確認	Kaicong Sun and Sven Simon	(参考訳) 変形可能な画像登録は医療画像の基本的な課題である。ボリューム画像の変形可能な登録の計算複雑性が大きいため、従来の反復法は通常、登録精度と実際の計算時間とのトレードオフに直面している。精度と実行時間の両方で登録性能を向上させるため,高速畳み込みニューラルネットワークを提案する。特に、メモリ資源を効率的に活用し、モデル容量を拡大するために、各エンコーダおよびデコーダステージにおいて、チャネル結合の代わりに付加フォワードを採用し、ネットワークを深くする。学習効率を高めるため,エンコーダおよびデコーダ段内のスキップ接続を活用し,残差学習を可能にし,下位層の補助損失を最小の分解能で活用し,深い監督を行う。特に、トレーニングフェーズ中に指数減衰パラメータによって低分解能補助損失を重み付けする。高解像度グリッドの主な損失と合わせて、粗大な学習戦略が達成される。最後に, Dice スコアの登録性能を改善するために, セグメンテーションに基づく補助的損失を導入する。平均diceスコアを用いた補助損失と比較すると,提案するマルチラベルセグメンテーション損失はトレーニング段階で追加のメモリコストを生じさせず,任意の量のカテゴリを持つ画像に適用できる。実験では,fdrnが,コンパクトネットワーク構造と効率的な学習を駆使して,既存の脳mr画像の最先端登録手法よりも優れていることを示す。さらに、FDRNは画像登録のための一般的なフレームワークであり、特定の種類の医療画像や解剖に制限されない。 Deformable image registration is a fundamental task in medical imaging. Due to the large computational complexity of deformable registration of volumetric images, conventional iterative methods usually face the tradeoff between the registration accuracy and the computation time in practice. In order to boost the registration performance in both accuracy and runtime, we propose a fast convolutional neural network. Specially, to efficiently utilize the memory resources and enlarge the model capacity, we adopt additive forwarding instead of channel concatenation and deepen the network in each encoder and decoder stage. To facilitate the learning efficiency, we leverage skip connection within the encoder and decoder stages to enable residual learning and employ an auxiliary loss at the bottom layer with lowest resolution to involve deep supervision. Particularly, the low-resolution auxiliary loss is weighted by an exponentially decayed parameter during the training phase. In conjunction with the main loss in high-resolution grid, a coarse-to-fine learning strategy is achieved. 
Last but not least, we introduce an auxiliary loss based on the segmentation prior to improve the registration performance in terms of Dice score. Compared to the auxiliary loss using the average Dice score, the proposed multi-label segmentation loss does not induce additional memory cost in the training phase and can be employed on images with an arbitrary number of categories. In the experiments, we show that FDRN outperforms the existing state-of-the-art registration methods for brain MR images by resorting to the compact network structure and efficient learning. Besides, FDRN is a generalized framework for image registration which is not confined to a particular type of medical image or anatomy.	翻訳日:2022-09-29 23:00:00 公開日:2021-06-23
# HILONet:非アライン観測による階層的模倣学習 HILONet: Hierarchical Imitation Learning from Non-Aligned Observations ( http://arxiv.org/abs/2011.02671v2 ) ライセンス: Link先を確認	Shanqi Liu, Junjie Cao, Wenzhou Chen, Licheng Wen, Yong Liu	(参考訳) 実演を段階的に追従して専門家を模倣することを目的とした模倣学習手法が多いため,非時間連携環境において実演のみの軌跡から学ぶことは困難である。しかし、実世界でのデモはほとんど得られない。本研究では,ハイロネット(Hierarchical Imitation Learning from Observation, HiLONet)と呼ばれる新しい模倣学習手法を提案する。本手法は,1つのゴール位置の有無に関わらず,これらのサブゴールを達成することで,あらゆる種類のタスクを解決できる。また, 階層構造における試料効率を向上させる3つの方法を提案する。いくつかの環境を用いて広範な実験を行う。その結果,性能と学習効率の両面で改善が見られた。 Learning from demonstrated observation-only trajectories in a non-time-aligned environment is challenging because most imitation learning methods aim to imitate experts by following the demonstration step-by-step. However, aligned demonstrations are seldom obtainable in real-world scenarios. In this work, we propose a new imitation learning approach called Hierarchical Imitation Learning from Observation (HILONet), which adopts a hierarchical structure to choose feasible sub-goals from demonstrated observations dynamically. Our method can solve all kinds of tasks by achieving these sub-goals, whether or not the task has a single goal position. We also present three different ways to increase sample efficiency in the hierarchical structure. We conduct extensive experiments using several environments. The results show improvements in both performance and learning efficiency.	翻訳日:2022-09-29 12:25:23 公開日:2021-06-23
# GANMEX: 1-vs-one属性をGANベースの対実説明ベースラインでガイドする GANMEX: One-vs-One Attributions Guided by GAN-based Counterfactual Explanation Baselines ( http://arxiv.org/abs/2011.06015v4 ) ライセンス: Link先を確認	Sheng-Min Shih, Pin-Ju Tien, Zohar Karnin	(参考訳) 帰属法は学習モデル予測に繋がる重要な特徴を特定するための有望な手法として示されてきた。既存の帰属法の多くは特徴摂動を行うためのベースライン入力に依存しているが、ベースライン選択問題に対処するための限定的な研究がなされている。ベースラインの貧弱な選択は、マルチクラス分類器に対する1-vs-one (1-vs-1)説明の能力を制限する。 1-vs-1の説明は、あるクラスが他のクラスと類似している場合、例えば、複数の動物の間での2種類の鳥のタイプは、クラス間での共有機能よりも重要な識別機能に焦点を当てることによって重要である。本稿では,GAN(Generative Adversarial Networks)を用いた新しい手法であるGANMEX(GAN-based Model Explainability)を提案する。提案手法は, 対象クラスに最も近い実写的なサンプルとして, 対物的ベースラインを効果的に選択することで, 真の1-vs-1説明を提供する属性法を実現する。我々は,GANMEXベースラインがサリエンシマップを改善し,既存のベースラインよりも摂動に基づく評価指標の性能が向上したことを示した。既存の帰属結果はモデルランダム化に敏感であることが知られており、GANMEXベースラインがモデルのカスケードランダム化の下でより良い結果をもたらすことを示した。 Attribution methods have been shown as promising approaches for identifying key features that led to learned model predictions. While most existing attribution methods rely on a baseline input for performing feature perturbations, limited research has been conducted to address the baseline selection issues. Poor choices of baselines limit the ability of one-vs-one (1-vs-1) explanations for multi-class classifiers, which means the attribution methods were not able to explain why an input belongs to its original class but not the other specified target class. 1-vs-1 explanation is crucial when certain classes are more similar than others, e.g. two bird types among multiple animals, by focusing on key differentiating features rather than shared features across classes. In this paper, we present GAN-based Model EXplainability (GANMEX), a novel approach applying Generative Adversarial Networks (GAN) by incorporating the to-be-explained classifier as part of the adversarial networks. 
Our approach effectively selects the counterfactual baseline as the closest realistic sample belonging to the target class, which allows attribution methods to provide true 1-vs-1 explanations. We showed that GANMEX baselines improved the saliency maps and led to stronger performance on perturbation-based evaluation metrics over the existing baselines. Existing attribution results are known for being insensitive to model randomization, and we demonstrated that GANMEX baselines led to better outcomes under the cascading randomization of the model.	翻訳日:2022-09-27 00:33:58 公開日:2021-06-23
# hebbian meta-learningにおけるゲノムボトルネック仮説の検証 Testing the Genomic Bottleneck Hypothesis in Hebbian Meta-Learning ( http://arxiv.org/abs/2011.06811v2 ) ライセンス: Link先を確認	Rasmus Berg Palm, Elias Najarro, Sebastian Risi	(参考訳) hebbian meta-learningは最近、厳しい強化学習問題を解決する約束を示しており、エージェントが環境の変化にある程度適応できるようにしている。しかしながら、これらの手法のシナプスは、非常に特定の学習規則を学習できるため、非常に異なる状況に一般化する能力は減少する可能性が高い。我々は、ヘビアン学習規則の数を「ゲノムボトルネック」によって制限することは、環境の変化をまたいだより良い一般化につながると仮定する。本仮説は,ヘッブの学習規則数をシナプス数から分離し,体系的にヘッブの学習規則の数を変化させることで検証する。本稿では,ヘビアン学習規則の同時学習とシナプスへの割り当てが困難な最適化問題であり,テスト環境における性能の低下につながることを示唆する。しかし,並列研究の結果,類似したルールをクラスタ化することで,学習ルールの数を減らすことが可能であることが判明した。ゲノムボトルネック」アルゴリズムを最もうまく実装する方法は、さらなる調査を保証する重要な研究方向である。 Hebbian meta-learning has recently shown promise to solve hard reinforcement learning problems, allowing agents to adapt to some degree to changes in the environment. However, because each synapse in these approaches can learn a very specific learning rule, the ability to generalize to very different situations is likely reduced. We hypothesize that limiting the number of Hebbian learning rules through a "genomic bottleneck" can act as a regularizer leading to better generalization across changes to the environment. We test this hypothesis by decoupling the number of Hebbian learning rules from the number of synapses and systematically varying the number of Hebbian learning rules. The results in this paper suggest that simultaneously learning the Hebbian learning rules and their assignment to synapses is a difficult optimization problem, leading to poor performance in the environments tested. However, parallel research to ours finds that it is indeed possible to reduce the number of learning rules by clustering similar rules together. How to best implement a "genomic bottleneck" algorithm is thus an important research direction that warrants further investigation.	翻訳日:2022-09-25 23:44:26 公開日:2021-06-23
# 有限地平線上の騒音線形二次レギュレータのポリシー勾配法 Policy Gradient Methods for the Noisy Linear Quadratic Regulator over a Finite Horizon ( http://arxiv.org/abs/2011.10300v2 ) ライセンス: Link先を確認	Ben Hambly, Renyuan Xu and Huining Yang	(参考訳) 線形二次レギュレータ(lqr)問題における最適方針を求めるための強化学習法について検討する。特に、既知のパラメータと未知パラメータの設定におけるポリシー勾配法の収束について考察する。弱仮定下での有限時間地平線と確率状態ダイナミクスの設定において、このアプローチに対する大域的線形収束保証を作成できる。また,制約問題に対処するために,計画された方針勾配法の収束性も確立した。アルゴリズムの性能を2つの例で説明する。最初の例は、資産の持ち株の最適清算である。基礎となるダイナミクスのモデルを仮定し、その手法をデータに直接適用する場合の結果を示す。実証的な証拠は、政策勾配法がLQRフレームワークを含むより大規模な確率系の大域的最適解を学習し、モデルベースアプローチと比較してモデルミス特定に関してより堅牢であることを示唆している。第二の例は合成データを用いた高次元設定におけるLQRシステムである。 We explore reinforcement learning methods for finding the optimal policy in the linear quadratic regulator (LQR) problem. In particular, we consider the convergence of policy gradient methods in the setting of known and unknown parameters. We are able to produce a global linear convergence guarantee for this approach in the setting of finite time horizon and stochastic state dynamics under weak assumptions. The convergence of a projected policy gradient method is also established in order to handle problems with constraints. We illustrate the performance of the algorithm with two examples. The first example is the optimal liquidation of a holding in an asset. We show results for the case where we assume a model for the underlying dynamics and where we apply the method to the data directly. The empirical evidence suggests that the policy gradient method can learn the global optimal solution for a larger class of stochastic systems containing the LQR framework and that it is more robust with respect to model mis-specification when compared to a model-based approach. The second example is an LQR system in a higher dimensional setting with synthetic data.	翻訳日:2022-09-23 06:32:53 公開日:2021-06-23
# HAWQV3: Dyadic Neural Network Quantization HAWQV3: Dyadic Neural Network Quantization ( http://arxiv.org/abs/2011.10680v3 ) ライセンス: Link先を確認	Zhewei Yao, Zhen Dong, Zhangcheng Zheng, Amir Gholami, Jiali Yu, Eric Tan, Leyuan Wang, Qijing Huang, Yida Wang, Michael W. Mahoney, Kurt Keutzer	(参考訳) 現在の低精度量子化アルゴリズムは浮動小数点から量子化された整数値への変換の隠れたコストを持つことが多い。この隠れたコストは、ニューラルネットワークの量子化によって実現されるレイテンシの改善を制限する。そこで本研究では,新しい混合精度整数専用量子化フレームワークHAWQV3を提案する。 HAWQV3の貢献は以下のとおりである。 (i) 浮動小数点演算や整数除算なしで、整数乗算、加算、ビットシフトのみで計算グラフ全体が実行される整数専用推論 (ii) モデル摂動とその他の制約(例えばメモリフットプリントと遅延)のトレードオフをバランスさせる整数線形計画問題の解法により、ビット精度を計算したハードウェア対応混合精度量子化法 (iii) TVMにおける4ビットの均一/混合精度量子化のための直接ハードウェア展開とオープンソースコントリビューションで、T4 GPU上のResNet50の均一8ビットと比較して平均速度が$1.45\times$に達する。 (iv) ResNet18/50とInceptionV3の混合精度の異なるモデル圧縮レベルに対する提案手法の広範な評価 ResNet50では、INT8量子化は$77.58\%$(以前の整数のみの仕事よりも$2.68\%$高い)の精度を達成し、混合精度のINT4/8量子化はINT8のレイテンシを$23\%$削減し、それでも$76.73\%$の精度を達成します。私たちのフレームワークとTVMの実装はオープンソースです。 Current low-precision quantization algorithms often have the hidden cost of conversion back and forth from floating point to quantized integer values. This hidden cost limits the latency improvement realized by quantizing Neural Networks. To address this, we present HAWQV3, a novel mixed-precision integer-only quantization framework. 
The contributions of HAWQV3 are the following: (i) An integer-only inference where the entire computational graph is performed only with integer multiplication, addition, and bit shifting, without any floating point operations or even integer division; (ii) A novel hardware-aware mixed-precision quantization method where the bit-precision is calculated by solving an integer linear programming problem that balances the trade-off between model perturbation and other constraints, e.g., memory footprint and latency; (iii) Direct hardware deployment and open source contribution for 4-bit uniform/mixed-precision quantization in TVM, achieving an average speed up of $1.45\times$ for uniform 4-bit, as compared to uniform 8-bit for ResNet50 on T4 GPUs; and (iv) extensive evaluation of the proposed methods on ResNet18/50 and InceptionV3, for various model compression levels with/without mixed precision. For ResNet50, our INT8 quantization achieves an accuracy of $77.58\%$, which is $2.68\%$ higher than prior integer-only work, and our mixed-precision INT4/8 quantization can reduce INT8 latency by $23\%$ and still achieve $76.73\%$ accuracy. Our framework and the TVM implementation have been open sourced.	翻訳日:2022-09-23 06:06:01 公開日:2021-06-23
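The integer-only inference claim in the entry above rests on replacing a real-valued rescaling factor by a dyadic number $b/2^c$, so that rescaling needs only an integer multiplication and a bit shift. The sketch below illustrates that general technique; it is not taken from the HAWQV3 code base, and the names `to_dyadic` and `requantize` as well as the default of 16 shift bits are assumptions made for illustration.

```python
def to_dyadic(scale, shift_bits=16):
    """Approximate a positive real scale by b / 2**shift_bits.

    This runs once, offline; floating point is allowed here.
    """
    b = round(scale * (1 << shift_bits))
    return b, shift_bits


def requantize(acc, b, c):
    """Rescale an integer accumulator by b / 2**c using integers only.

    One multiplication, one addition (for round-to-nearest) and one
    right shift; no floating point and no integer division.
    """
    return (acc * b + (1 << (c - 1))) >> c
```

For example, a scale of 0.0123 becomes `b = 806` with `c = 16`, and an accumulator value of 1000 is rescaled to 12, matching `round(1000 * 0.0123)`.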
# チャットボットを用いた水稲画像の自動水稲病検出システム A System for Automatic Rice Disease Detection from Rice Paddy Images Serviced via a Chatbot ( http://arxiv.org/abs/2011.10823v2 ) ライセンス: Link先を確認	Pitchayagan Temniranrat, Kantip Kiratiratanapruk, Apichon Kitvimonrat, Wasin Sinthupinyo and Sujin Patarapuwadol	(参考訳) 実際の水田画像からイネの病気を診断するLINEボットシステムを開発し,本論文で紹介した。稲作農家の収量・品質向上に資する、使い易く自動的な制度であった。対象画像は,水田環境から特別に試料を採取することなく撮影した。画像からイネ病を検出するために深層学習ニューラルネットワークを用いた。水稲病検出に関するこれまでの研究の成果を改善するために,オブジェクト検出モデルのトレーニングと改良プロセスを開発した。このプロセスはモデルの予測結果の分析に基づいており、モデルの次のトレーニングでデータベースの品質を改善するために繰り返し使用される。 LINE Bot システムのデプロイモデルは,前回の論文 YOLOv3 で選択された最高のパフォーマンス技術を用いて,洗練されたトレーニングデータセットによってトレーニングされた。配置モデルの性能を5つの対象クラスで測定した結果, 前回の論文では91.1%から95.6%に改善した。そこで,この展開モデルをイネ病線ボットシステムに適用した。当システムでは, 稲作農家やイネ病専門医を含むLINEグループ利用者に対して, 初診結果を自動で提示する。彼らはチャットを通じて自由にコミュニケーションできる。実ラインボットのデプロイメントでは、モデルのパフォーマンスは、我々の定義した測定平均であるtrue positive pointで測定され、平均78.86%であることが判明した。システムは高速で,検出処理に2～3秒しかかからなかった。 A LINE Bot system to diagnose rice diseases from actual paddy field images was developed and is presented in this paper. It is an easy-to-use, automatic system designed to help rice farmers improve rice yield and quality. The targeted images were taken from the actual paddy environment without special sample preparation. We used a deep learning neural network technique to detect rice diseases from the images. We developed an object detection model training and refinement process to improve on the performance of our previous research on rice leaf disease detection. The process was based on analyzing the model's predictive results and could be used repeatedly to improve the quality of the database for the next training of the model. The deployment model for our LINE Bot system was created with the best-performing technique selected in our previous paper, YOLOv3, trained on the refined training data set. 
The performance of the deployment model was measured on 5 target classes, and the Average True Positive Point was found to improve from 91.1% in the previous paper to 95.6% in this study. Therefore, we used this deployment model for the Rice Disease LINE Bot system. Our system worked automatically in real time to suggest primary diagnosis results to the users in the LINE group, which included rice farmers and rice disease specialists. They could communicate freely via chat. In the real LINE Bot deployment, the model's performance was measured by our own measure, Average True Positive Point, and was found to be 78.86% on average. The system was fast and took only 2-3 s for the detection process on our system server.	翻訳日:2022-09-22 23:41:52 公開日:2021-06-23
# (参考訳) オートオルガニサドのスーパーメルカドにおけるカナスタ・デ・メルカドの考察 An\'alisis de Canasta de mercado en supermercados mediante mapas auto-organizados ( http://arxiv.org/abs/2107.10647v1 ) ライセンス: CC BY 4.0	Joaqu\'in Cordero, Alfredo Bolt and Mauricio Valle	(参考訳) 導入:チリの首都の西部地域で重要なスーパーマーケットチェーンは、決定を行う上で重要な情報を得る必要があり、この情報はデータベースで利用可能であるが、可視化が困難になる情報の複雑さと量のために処理する必要がある。方法: この目的のために, 人工ニューラルネットワークを用いて, コホーネンのSOM法を用いたアルゴリズムを開発した。これを実行するには、特定の重要な手順に従う必要がある。例えば、データマイニングはフィルタリングに責任を持ち、関連するデータのみをマーケットバスケット分析に使用する。情報をフィルタリングした後、データは準備されなければならない。データ準備の後、サンプルデータに適応するためにPythonプログラミング環境を用意し、テスト結果の後にパラメータをセットしてSOMのトレーニングを進めました。結果:SOMの成果は,SOMのトレーニングと実際の取引の結果として得られたことから,店主が考慮すべきプロモーション,パック,バンドルを形成するために,トポロジカルに近接して配置して購入した商品間の関係が得られた。結論:これに基づいて,調査で使用したデータを提供するスーパーマーケットチェーンに対して,頻繁な買い物かごの推薦がなされている。 Introduction: An important chain of supermarkets in the western zone of the capital of Chile needs to obtain key information to make decisions. This information is available in the databases but needs to be processed because its complexity and quantity make it difficult to visualize. Method: For this purpose, an algorithm was developed using artificial neural networks applying Kohonen's SOM method. To carry it out, certain key procedures must be followed, such as data mining, which is responsible for filtering so that only the relevant data are used for market basket analysis. After filtering the information, the data must be prepared. After data preparation, we prepared the Python programming environment to adapt it to the sample data, then proceeded to train the SOM with its parameters set after test results. Result: The SOM yields the relationships between the most-purchased products by positioning them topologically close together, so that the retail manager can consider them when forming promotions, packs and bundles, because these relationships were obtained as a result of training the SOM on the clients' real transactions. 
Conclusion: Based on this, recommendations on frequent shopping baskets have been made to the supermarket chain that provided the data used in the research.	翻訳日:2021-07-25 15:03:07 公開日:2021-06-23
# (参考訳) MegazordNet:時系列予測のための統計と機械学習の視点を組み合わせる MegazordNet: combining statistical and machine learning standpoints for time series forecasting ( http://arxiv.org/abs/2107.01017v1 ) ライセンス: CC BY 4.0	Angelo Garangau Menezes and Saulo Martiello Mastelini	(参考訳) 金融時系列の予測は、シリーズのカオス的特徴のために難しい課題であると考えられている。統計学的アプローチは、市場方向の予測や株価の単価など、いくつかの特定の問題において確固たる結果を示しているが、近年のディープラーニングとビッグデータ技術の進歩により、金融時系列予測に新たな有望な選択肢が生まれている。さらに,近年の文献では,統計と機械学習を組み合わせることで,単一解と比較して予測精度が向上する可能性が示唆されている。そこで本研究では,時系列予測のための構造化深層学習モデルと組み合わせて,金融時系列内の統計的特徴を探索するフレームワークであるMegazordNetを提案する。我々は、s&p500種株価の終値予測手法を異なる指標を用いて評価し、単一統計および機械学習手法を上回った。 Forecasting financial time series is considered to be a difficult task due to the chaotic feature of the series. Statistical approaches have shown solid results in some specific problems such as predicting market direction and single-price of stocks; however, with the recent advances in deep learning and big data techniques, new promising options have arisen to tackle financial time series forecasting. Moreover, recent literature has shown that employing a combination of statistics and machine learning may improve accuracy in the forecasts in comparison to single solutions. Taking into consideration the mentioned aspects, in this work, we propose MegazordNet, a framework that explores statistical features within a financial series combined with a structured deep learning model for time series forecasting. We evaluated our approach by predicting the closing price of stocks in the S&P 500 using different metrics, and we were able to beat single statistical and machine learning methods.	翻訳日:2021-07-11 13:00:20 公開日:2021-06-23
# トランスファーラーニングによるインフォーマル・フォーマル言語シナリオにおけるジェンダー認識 Gender Recognition in Informal and Formal Language Scenarios via Transfer Learning ( http://arxiv.org/abs/2107.02759v1 ) ライセンス: Link先を確認	Daniel Escobar-Grisales, Juan Camilo Vasquez-Correa, Juan Rafael Orozco-Arroyave	(参考訳) テキストデータに基づく人口統計情報検索への関心は,セキュリティ,マーケティング,ヘルスケアなどさまざまな分野において,アプリケーションが成功を収めていることから,研究コミュニティで高まっている。テキストデータに基づく性別、年齢、場所、性格などの人口統計特性の認識と識別は、異なるマーケティング戦略を改善するのに役立つ。例えば、オファーのセグメンテーションとパーソナライズを可能にすることで、製品やサービスを最も関心のあるグループに公開することができる。この種の技術は、ソーシャルメディアの文書で広く議論されている。しかし、これらの手法は、ソーシャルメディアにしか存在しないエモティコン、言及、その他の言語現象へのアクセスがない、より形式的な構造を持つデータで研究されていない。本稿では,再帰的・畳み込み型ニューラルネットワークと,非公式言語と形式言語で書かれた文書における性別認識のための伝達学習戦略を提案する。モデルは、ツイートとコールセンター会話からなる2つの異なるデータベースでテストされる。両方のデータベースで最大75%のアキュラティが達成される。また、ソーシャルメディアで一般的に使用されるような特定の表現やイディオムに基づいて訓練されたシステムから、より形式的なテキストデータに知識を移すことも可能であり、データ量が少なく、構造が完全に異なることを示している。 The interest in demographic information retrieval based on text data has increased in the research community because applications have shown success in different sectors such as security, marketing, health care, and others. Recognition and identification of demographic traits such as gender, age, location, or personality based on text data can help to improve different marketing strategies. For instance, it makes it possible to segment and personalize offers, so that products and services are exposed to the group of greatest interest. This type of technology has been discussed widely for documents from social media. However, the methods have been poorly studied in data with a more formal structure, where there is no access to emoticons, mentions, and other linguistic phenomena that are only present in social media. This paper proposes the use of recurrent and convolutional neural networks, and a transfer learning strategy, for gender recognition in documents that are written in informal and formal languages. Models are tested on two different databases consisting of Tweets and call-center conversations. Accuracies of up to 75% are achieved for both databases. The results also indicate that it is possible to transfer the knowledge from a system trained on a specific type of expressions or idioms, such as those typically used in social media, to a more formal type of text data, where data are scarcer and the structure is completely different.	翻訳日:2021-07-11 11:34:03 公開日:2021-06-23
# (参考訳) 対話型セグメンテーションのための確率的注意 Probabilistic Attention for Interactive Segmentation ( http://arxiv.org/abs/2106.15338v1 ) ライセンス: CC BY 4.0	Prasad Gabbur and Manjot Bilkhu and Javier Movellan	(参考訳) 我々は注意の確率論的解釈を提供し、トランスフォーマーにおける標準ドット生産注意は最大後方推定(map)の特別な場合であることを示す。提案手法は,キーおよび値モデルパラメータのオンライン適応に期待最大化アルゴリズムを用いることを提案する。このアプローチは、外部エージェント、例えば注釈器が、いくつかのトークンの正しい値、例えば、いくつかのピクセルの意味圏に関する推論時間情報を提供する場合に有用であり、この新しい情報は、原則的に他のトークンに伝播する必要がある。本稿では,アノテーションの効率を向上させるために,アノテーションとモデルがオンラインで協調する対話型意味セグメンテーションタスクのアプローチについて述べる。標準ベンチマークを用いて、キー適応は低フィードバック方式におけるモデル性能を向上し(\sim10\%$ mIoU)、高フィードバック方式における値伝搬はモデル応答性を向上させる。確率的注意モデルのpytorch層の実装が公開される予定だ。 We provide a probabilistic interpretation of attention and show that the standard dot-product attention in transformers is a special case of Maximum A Posteriori (MAP) inference. The proposed approach suggests the use of Expectation Maximization algorithms for online adaptation of key and value model parameters. This approach is useful for cases in which external agents, e.g., annotators, provide inference-time information about the correct values of some tokens, e.g, the semantic category of some pixels, and we need for this new information to propagate to other tokens in a principled manner. We illustrate the approach on an interactive semantic segmentation task in which annotators and models collaborate online to improve annotation efficiency. Using standard benchmarks, we observe that key adaptation boosts model performance ($\sim10\%$ mIoU) in the low feedback regime and value propagation improves model responsiveness in the high feedback regime. A PyTorch layer implementation of our probabilistic attention model will be made publicly available.	翻訳日:2021-07-04 21:08:49 公開日:2021-06-23
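As a reading aid for the entry above (Probabilistic Attention for Interactive Segmentation), here is a minimal sketch of standard dot-product attention for a single query, the special case the abstract refers to. It is not the authors' probabilistic layer and does not implement their EM-based adaptation; the function name `dot_product_attention` and the default `1/sqrt(d)` scaling are assumptions chosen for illustration. The softmax weights form a probability distribution over tokens, which is what makes a probabilistic reading possible.

```python
import math


def dot_product_attention(query, keys, values, scale=None):
    """Single-query dot-product attention over plain Python lists.

    query:  list of d floats
    keys:   list of n keys, each a list of d floats
    values: list of n values, each a list of floats
    scale:  multiplier applied to the scores (assumed default: 1/sqrt(d))
    Returns (weights, output).
    """
    d = len(query)
    if scale is None:
        scale = 1.0 / math.sqrt(d)
    # Similarity score between the query and every key.
    scores = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]
    # Numerically stable softmax: subtract the maximum before exponentiating.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # The output is the weighted average of the values.
    dim_v = len(values[0])
    output = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim_v)]
    return weights, output


weights, output = dot_product_attention(
    [1.0, 0.0], [[1.0, 0.0], [1.0, 0.0]], [[2.0], [4.0]]
)
```

With two identical keys the weights are equal, so the output is simply the mean of the two values.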
# (参考訳) ScanBank: Scanned Electronic Theses and Dissertationsから図を抽出するためのベンチマークデータセット ScanBank: A Benchmark Dataset for Figure Extraction from Scanned Electronic Theses and Dissertations ( http://arxiv.org/abs/2106.15320v1 ) ライセンス: CC BY 4.0	Sampanna Yashwant Kahu, William A. Ingram, Edward A. Fox, Jian Wu	(参考訳) 我々は,600万人以上が公開されており,アクセスの向上と実用性の拡大をめざして,電子製図・論文(ETDs)に焦点を合わせ,研究・教育を専門分野にわたって支援するための重要なコーパスを構成している。新たなデジタル文書が含まれるにつれてコーパスは成長しており、何百万もの古い論文や論文がデジタル形式に変換され、機関リポジトリに電子的に配布されている。 ETDでは、他の学術作品と同様に、数字や表は簡潔な方法で大量の情報を伝達することができる。デジタルPDFから図形や表を抽出する手法が提案されているが、スキャンされたETDではうまく機能しない。この問題を考慮し,本研究では,スキャンしたPDFでうまく機能しない理由として,デジタル文書でのみトレーニングを行ったことが挙げられる。この制限に対処するため、ScanBankは1万ページの画像をスキャンし、人間が手動でラベル付けした新しいデータセットである。このデータセットを用いて、YOLOv5に基づくディープニューラルネットワークモデルをトレーニングし、スキャンされたETDから数値とテーブルを正確に抽出する。我々は,スキャンされた文書から図形を抽出するためのより良い方法を見つけることを目的とした,重要な研究課題を提起し,回答する。そのうちの1つは、スキャンされたドキュメントからの図形抽出に適したモデルをトレーニングするために使用される、生まれながらのデジタルドキュメントに適用されるデータ拡張技術である。我々の知る限りでは、ScanBankはスキャンされたETDのフィギュアとテーブル抽出のための最初の手動アノテートデータセットである。 ScanBankでトレーニングされたYOLOv5ベースのモデルでは、既存の同等のオープンソースおよび無償のベースラインメソッドよりも大幅にパフォーマンスが向上している。 We focus on electronic theses and dissertations (ETDs), aiming to improve access and expand their utility, since more than 6 million are publicly available, and they constitute an important corpus to aid research and education across disciplines. The corpus is growing as new born-digital documents are included, and since millions of older theses and dissertations have been converted to digital form to be disseminated electronically in institutional repositories. In ETDs, as with other scholarly works, figures and tables can communicate a large amount of information in a concise way. Although methods have been proposed for extracting figures and tables from born-digital PDFs, they do not work well with scanned ETDs. 
Considering this problem, our assessment of state-of-the-art figure extraction systems is that the reason they do not function well on scanned PDFs is that they have only been trained on born-digital documents. To address this limitation, we present ScanBank, a new dataset containing 10 thousand scanned page images, manually labeled by humans as to the presence of the 3.3 thousand figures or tables found therein. We use this dataset to train a deep neural network model based on YOLOv5 to accurately extract figures and tables from scanned ETDs. We pose and answer important research questions aimed at finding better methods for figure extraction from scanned documents. One of those concerns the value for training, of data augmentation techniques applied to born-digital documents which are used to train models better suited for figure extraction from scanned documents. To the best of our knowledge, ScanBank is the first manually annotated dataset for figure and table extraction for scanned ETDs. A YOLOv5-based model, trained on ScanBank, outperforms existing comparable open-source and freely available baseline methods by a considerable margin.	翻訳日:2021-07-04 20:49:55 公開日:2021-06-23
# (参考訳) Wasserstein生成逆数インプットネットワークを用いた画像インパインティング Image Inpainting Using Wasserstein Generative Adversarial Imputation Network ( http://arxiv.org/abs/2106.15341v1 ) ライセンス: CC BY 4.0	Daniel Va\v{s}ata, Tom\'a\v{s} Halama, Magda Friedjungov\'a	(参考訳) 画像インペイントは、画像内の欠落した領域の再構築に焦点を当てたコンピュータビジョンにおける重要なタスクの1つである。本研究の目的は,Wasserstein Generative Adversarial Imputation Networkに基づく画像インペイントモデルの導入である。モデルのジェネレータネットワークは、異なるダイレーションレートの畳み込み層の構築ブロックと、モデルが出力の詳細を再現するのに役立つスキップ接続を使用する。この組み合わせは、不足する様々なシナリオを十分な品質で扱える普遍的な計算モデルをもたらす。これを実験的に示すために、このモデルはランダムなピクセルの欠落、様々な小さな平方領域の欠落、画像の中心に1つの欠落した四角の欠落という3つのシナリオを同時に扱うように訓練されている。私たちのモデルはすべてのシナリオで高品質なインペインティング結果を達成しています。 2つの実世界のベンチマークデータセット、celeba facesとparis streetviewにおけるピーク信号対雑音比と構造類似性指数を用いて性能評価を行う。本モデルの結果は,バイハーモニック・インパクション法や,他の最先端画像インパインティング法と比較された。 Image inpainting is one of the important tasks in computer vision which focuses on the reconstruction of missing regions in an image. The aim of this paper is to introduce an image inpainting model based on Wasserstein Generative Adversarial Imputation Network. The generator network of the model uses building blocks of convolutional layers with different dilation rates, together with skip connections that help the model reproduce fine details of the output. This combination yields a universal imputation model that is able to handle various scenarios of missingness with sufficient quality. To show this experimentally, the model is simultaneously trained to deal with three scenarios given by missing pixels at random, missing various smaller square regions, and one missing square placed in the center of the image. It turns out that our model achieves high-quality inpainting results on all scenarios. Performance is evaluated using peak signal-to-noise ratio and structural similarity index on two real-world benchmark datasets, CelebA faces and Paris StreetView. The results of our model are compared to biharmonic imputation and to some of the other state-of-the-art image inpainting methods.	
翻訳日:2021-07-04 20:31:39 公開日:2021-06-23
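The inpainting entry above reports results in terms of peak signal-to-noise ratio (PSNR). As a small illustration, the sketch below computes PSNR from the usual definition, 10 * log10(MAX^2 / MSE), on flattened pixel lists. The function name `psnr` and the 8-bit default `max_value=255.0` are assumptions; the paper's exact evaluation protocol is not given in the abstract.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists."""
    if not reference or len(reference) != len(restored):
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(reference, restored)) / len(reference)
    if mse == 0:
        # Identical images: PSNR is unbounded.
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)


example_db = psnr([10.0, 20.0], [12.0, 18.0])
```

Higher values indicate a reconstruction closer to the reference.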
# (参考訳) ワトソン博士型人工知性(ai)システム Dr. Watson type Artificial Intellect (AI) Systems ( http://arxiv.org/abs/2106.13322v1 ) ライセンス: CC BY 4.0	Saveli Goldberg (1), Stanislav Belyaev (2), Vladimir Sluchak ((1) MGH Radiation Oncology Department, (2) Eastern New Mexico Medical Center)	(参考訳) この記事では、ソリューションを直接提供せず、むしろその方向を指して、ユーザーに質問やメッセージの調整を促す新しいタイプのAIシステムを提案する。 aiヒューマンコラボレーションのモデルは、コナン・ドイルの物語からホームズ氏とワトソン博士の相互作用の古典的な文学的例から導き出され、高度に資格のあるホームズ氏はワトソン博士の問いに答える。ここでMr. Holmesは、ルールベースの計算、ロジック、メモリ管理と共に、明らかにAIシステムの役割を担っており、Watson博士がユーザである。同じホームズとワトソンのインタラクションを調べて、Watson博士のようなAIが行動する別のモデルを見つけ、促進します。この原理に基づいて、これらのシステムを「ワトソン博士型システム」と呼ぶ。本稿では,これらのシステムの特徴について述べ,集中治療医のための患者管理システムとデータエラー防止システムについて紹介する。 The article proposes a new type of AI system that does not give solutions directly but rather points toward it, friendly prompting the user with questions and adjusting messages. Models of AI human collaboration can be deduced from the classic literary example of interaction between Mr. Holmes and Dr. Watson from the stories by Conan Doyle, where the highly qualified expert Mr. Holmes answers questions posed by Dr. Watson. Here Mr. Holmes, with his rule-based calculations, logic, and memory management, apparently plays the role of an AI system, and Dr. Watson is the user. Looking into the same Holmes-Watson interaction, we find and promote another model in which the AI behaves like Dr. Watson, who, by asking questions and acting in a particular way, helps Holmes (the AI user) make the right decisions. We call the systems based on this principle "Dr. Watson-type systems." The article describes the properties of such systems and introduces two particular: Patient Management System for intensive care physicians and Data Error Prevention System.	翻訳日:2021-06-29 06:15:42 公開日:2021-06-23
# 逐次文書上での反復結合トピックモデリング Recurrent Coupled Topic Modeling over Sequential Documents ( http://arxiv.org/abs/2106.13732v1 ) ライセンス: Link先を確認	Jinjin Guo, Longbing Cao and Zhiguo Gong	(参考訳) オンラインアーカイブ、ソーシャルメディア、ニュースフィードなどの豊富なシーケンシャルなドキュメントはストリーミング更新され、各ドキュメントはスムーズに進化するが依存するトピックに組み込まれる。このようなデジタルテキストは、隠れた進化するトピックとその時間的依存性を推測するために、動的トピックモデリングに関する広範な研究を惹きつけている。しかし、既存のアプローチのほとんどはシングルトピックとスレッドの進化に焦点を当てており、現在のトピックが複数の関連する先行トピックと結合される可能性があるという事実を無視している。さらに、これらの手法は遅延パラメータを推論する際の難解な推論問題も引き起こし、高い計算コストと性能劣化をもたらす。この研究では、現在のトピックが対応する結合重み付き以前のトピックから進化し、マルチトピック・スレッドの進化が形成されると仮定する。我々の手法は、進化するトピック間の依存関係をモデル化し、時間ステップで複雑なマルチカップリングを徹底的にエンコードする。難解な推論課題を克服するために,新しいデータ拡張手法のセットを用いた新しい解を提案し,進化するトピック間の多重結合をうまく分解する。これにより、完全な共役モデルが得られ、推論手法の有効性と効率が保証される。後方フィルタアルゴリズムを備えた新しいギブスサンプリング器は、閉形式の潜時時間パラメータを効率的に学習する。さらに、潜在インディアンバッファプロセス(IBP)複合分布を利用して、全体のトピック番号を自動的に推測し、バイアスのない各シーケンシャル文書のスパーストピック比をカスタマイズする。提案手法は, 競合するベースラインに対する合成データセットと実世界のデータセットの両方で評価され, 単語ごとのパープレキシティの低さ, 一貫性の高いトピック, 文書時間予測の精度が向上した。 The abundant sequential documents such as online archival, social media and news feeds are streamingly updated, where each chunk of documents is incorporated with smoothly evolving yet dependent topics. Such digital texts have attracted extensive research on dynamic topic modeling to infer hidden evolving topics and their temporal dependencies. However, most of the existing approaches focus on single-topic-thread evolution and ignore the fact that a current topic may be coupled with multiple relevant prior topics. In addition, these approaches also incur the intractable inference problem when inferring latent parameters, resulting in a high computational cost and performance degradation. In this work, we assume that a current topic evolves from all prior topics with corresponding coupling weights, forming the multi-topic-thread evolution. Our method models the dependencies between evolving topics and thoroughly encodes their complex multi-couplings across time steps. 
To overcome the intractable inference challenge, a new solution with a set of novel data augmentation techniques is proposed, which successfully decomposes the multi-couplings between evolving topics. A fully conjugate model is thus obtained to guarantee the effectiveness and efficiency of the inference technique. A novel Gibbs sampler with a backward-forward filter algorithm efficiently learns latent time-evolving parameters in closed form. In addition, the latent Indian Buffet Process (IBP) compound distribution is exploited to automatically infer the overall topic number and customize the sparse topic proportions for each sequential document without bias. The proposed method is evaluated on both synthetic and real-world datasets against competitive baselines, demonstrating its superiority in terms of lower per-word perplexity, more coherent topics, and better document time prediction.	翻訳日:2021-06-28 12:56:32 公開日:2021-06-23
# (参考訳) オンラインハンドブック of argumentation for ai: volume 2 Online Handbook of Argumentation for AI: Volume 2 ( http://arxiv.org/abs/2106.10832v2 ) ライセンス: CC BY 4.0	OHAAI Collaboration: Andreas Brannstrom, Federico Castagna, Theo Duchatelle, Matt Foulis, Timotheus Kampik, Isabelle Kuhlmann, Lars Malmqvist, Mariela Morveli-Espinoza, Jack Mumford, Stipe Pandzic, Robin Schaefer, Luke Thorburn, Andreas Xydis, Antonio Yuste-Ginel, Heng Zheng	(参考訳) 本巻は、OHAAI(Online Handbook of Argumentation for AI)の第2巻に選択された論文の改訂版を含む。従来、議論と議論の相互作用の形式理論が提案され研究され、近年では議論の計算モデルが研究されている。人工知能(AI)の分野としての論証は、知識の象徴的表現や実現不可能な推論に関心を持つ研究者にとって非常に重要である。このハンドブックの目的は、議論研究コミュニティにオープンアクセスとキュレートされたアンソロジーを提供することである。 OHAAIは、AIに関連するあらゆる分野における議論の理論と応用に関する、最新のおよび今後の博士主導の研究を追跡するための研究ハブとして設計されている。 This volume contains revised versions of the papers selected for the second volume of the Online Handbook of Argumentation for AI (OHAAI). Previously, formal theories of argument and argument interaction have been proposed and studied, and this has led to the more recent study of computational models of argument. Argumentation, as a field within artificial intelligence (AI), is highly relevant for researchers interested in symbolic representations of knowledge and defeasible reasoning. The purpose of this handbook is to provide an open access and curated anthology for the argumentation research community. OHAAI is designed to serve as a research hub to keep track of the latest and upcoming PhD-driven research on the theory and application of argumentation in all areas related to AI.	翻訳日:2021-06-27 09:52:53 公開日:2021-06-23
# (参考訳) 硬膜外電図信号からのロバスト軌道復号のための部分的最大コレントロピー回帰 Partial Maximum Correntropy Regression for Robust Trajectory Decoding from Noisy Epidural Electrocorticographic Signals ( http://arxiv.org/abs/2106.13086v1 ) ライセンス: CC BY 4.0	Yuanhao Li, Badong Chen, Gang Wang, Natsue Yoshimura, Yasuharu Koike	(参考訳) PLSR(Partial Least Square Regression)アルゴリズムは、脳-コンピュータインタフェースにおける相関脳記録から連続変数を予測する特別な能力を示し、近年のマカクの硬膜外電図から3次元連続ハンドトラジェクトリへの予測に成功した。それにもかかわらず、PLSRは本質的に最小二乗基準に基づいて定式化されており、結果として複雑な雑音に関して損なわれない。本研究の目的は,PLSRの堅牢なバージョンを提案することである。この目的のために、最大コレントロピー基準は、PLSRの新しい頑健な変種であるPartial Maximum Correntropy Regression (PMCR)を構築するために採用されている。半量子最適化手法を用いて頑健な潜在変数を計算する。提案するPMCRを合成例と公開Neurotychoデータセットを用いて評価した。従来のPLSRと最先端の変種と比較して、PMCRは、汚染されたトレーニングセットを持つ3つの異なるパフォーマンス指標に対して優れた予測能力を実現した。提案するpmcrは雑音下脳測定からのロバスト復号化に有効な手法として実証され,ノイズによる性能劣化を低減し,脳-コンピュータ界面の復号ロバスト性を向上させることができた。 The Partial Least Square Regression (PLSR) algorithm exhibits exceptional competence for predicting continuous variables from inter-correlated brain recordings in brain-computer interfaces, which achieved successful prediction from epidural electrocorticography of macaques to three-dimensional continuous hand trajectories recently. Nevertheless, PLSR is in essence formulated based on the least square criterion, thus, being non-robust with respect to complicated noises consequently. The aim of the present study is to propose a robust version of PLSR. To this end, the maximum correntropy criterion is adopted to structure a new robust variant of PLSR, namely Partial Maximum Correntropy Regression (PMCR). Half-quadratic optimization technique is utilized to calculate the robust latent variables. We assess the proposed PMCR on a synthetic example and the public Neurotycho dataset. Compared with the conventional PLSR and the state-of-the-art variant, PMCR realized superior prediction competence on three different performance indicators with contaminated training set. 
The proposed PMCR was demonstrated as an effective approach for robust decoding from noisy brain measurements, which could reduce the performance degradation resulting from adverse noises, thus, improving the decoding robustness of brain-computer interfaces.	翻訳日:2021-06-26 12:00:07 公開日:2021-06-23
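To make the contrast between the least square criterion and the maximum correntropy criterion in the entry above concrete, the sketch below compares the two on a set of residuals with one large outlier. It uses an unnormalized Gaussian kernel; the function names and the kernel width `sigma=1.0` are assumptions for illustration, and this is not the PMCR algorithm itself.

```python
import math


def mean_squared_error(errors):
    """Least-squares criterion: grows without bound as residuals grow."""
    return sum(e * e for e in errors) / len(errors)


def correntropy(errors, sigma=1.0):
    """Empirical correntropy of residuals with an unnormalized Gaussian kernel.

    Each residual contributes a value in (0, 1], so a single outlier can
    lower the average by at most 1/len(errors).
    """
    return sum(math.exp(-(e * e) / (2.0 * sigma * sigma)) for e in errors) / len(errors)


clean = [0.1, -0.2, 0.05]
contaminated = clean + [50.0]  # one gross outlier, purely illustrative

mse_clean = mean_squared_error(clean)
mse_contaminated = mean_squared_error(contaminated)
cor_clean = correntropy(clean)
cor_contaminated = correntropy(contaminated)
```

The outlier inflates the squared error by several orders of magnitude, while the correntropy value stays within its bounded range, which is the intuition behind using it as a robust objective.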
# (参考訳) ジオタグ付きソーシャルメディアにおけるリアルタイム時空間イベント検出 Real-time Spatio-temporal Event Detection on Geotagged Social Media ( http://arxiv.org/abs/2106.13121v1 ) ライセンス: CC BY 4.0	Yasmeen George, Shanika Karunasekera, Aaron Harwood and Kwan Hui Lim	(参考訳) ソーシャルメディアデータストリームのマイニングにおける重要な課題は、特定の地域またはグローバルな地域の人々のグループによって活発に議論されるイベントを特定することである。このような出来事は、事故、抗議、選挙、突破ニュースの早期警告に有用である。しかし、イベントのリストやイベント時間と空間の解決は事前に固定または既知のものではない。本研究では,異なる時間と空間解像度のイベントを検出可能なソーシャルメディアを用いたオンライン時空間イベント検出システムを提案する。まず, イベントの空間分解に関する課題に対処するため, ソーシャルメディアデータの密度に基づいて, 地理的空間をマルチスケール領域に分割するために, クワッドツリー法を用いる。次に,ポアソン分布とソーシャルポストの予期せぬ密度の領域を強調する平滑化を含む統計的非教師なしアプローチを行う。さらに、連続した時間間隔で同じ領域で発生した事象をマージすることにより、イベント期間を正確に推定する。ポスト処理ステージは、スパム、フェイク、不正なイベントをフィルタリングするために導入される。最後に,ソーシャルメディアを利用した単純な意味論を取り入れ,検出された事象の完全性や正確性を評価する。提案手法は,メルボルン,ロンドン,パリ,ニューヨークの各都市を対象としたtwitterとflickrのソーシャルメディアデータセットを用いて評価される。提案手法の有効性を検証するため,地理的空間の固定分割とクラスタリング法に基づく2つのベースラインアルゴリズムとの比較を行った。性能評価のために,手動でリコールと精度を計算する。また,報告された事象の正確性を自動的に測定する「強度指標」という新しい品質指標を提案する。 A key challenge in mining social media data streams is to identify events which are actively discussed by a group of people in a specific local or global area. Such events are useful for early warning for accident, protest, election or breaking news. However, neither the list of events nor the resolution of both event time and space is fixed or known beforehand. In this work, we propose an online spatio-temporal event detection system using social media that is able to detect events at different time and space resolutions. First, to address the challenge related to the unknown spatial resolution of events, a quad-tree method is exploited in order to split the geographical space into multiscale regions based on the density of social media data. Then, a statistical unsupervised approach is performed that involves Poisson distribution and a smoothing method for highlighting regions with unexpected density of social posts. 
Further, event duration is precisely estimated by merging events happening in the same region at consecutive time intervals. A post processing stage is introduced to filter out events that are spam, fake or wrong. Finally, we incorporate simple semantics by using social media entities to assess the integrity, and accuracy of detected events. The proposed method is evaluated using different social media datasets: Twitter and Flickr for different cities: Melbourne, London, Paris and New York. To verify the effectiveness of the proposed method, we compare our results with two baseline algorithms based on fixed split of geographical space and clustering method. For performance evaluation, we manually compute recall and precision. We also propose a new quality measure named strength index, which automatically measures how accurate the reported event is.	翻訳日:2021-06-26 11:16:32 公開日:2021-06-23
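The event detection entry above mentions a statistical approach involving the Poisson distribution to highlight regions with an unexpected density of posts. A minimal sketch of that idea follows: given a baseline rate for a region, flag a time window whose post count is improbably high under a Poisson model. The function names, the baseline rate, and the threshold `alpha` are hypothetical; the paper's smoothing step and quad-tree partitioning are not reproduced here.

```python
import math


def poisson_tail(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam)."""
    if k <= 0:
        return 1.0
    # Accumulate P(X = 0) + ... + P(X = k - 1) iteratively.
    term = math.exp(-lam)
    cdf = term
    for i in range(1, k):
        term *= lam / i
        cdf += term
    return max(0.0, 1.0 - cdf)


def is_unexpected(count, expected_rate, alpha=0.001):
    """Flag a region/time window whose post count is unlikely under the baseline."""
    return poisson_tail(count, expected_rate) < alpha


# A region that usually sees about 5 posts per interval (illustrative numbers).
burst = is_unexpected(30, 5.0)
ordinary = is_unexpected(6, 5.0)
```

In this toy setting, 30 posts against a baseline of 5 is flagged, while 6 posts is not.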
# (参考訳) 芸術解釈と意味のモデル化。イコノロジーとイコノグラフィーを記述するためのデータモデル Modelling Art Interpretation and Meaning. A Data Model for Describing Iconology and Iconography ( http://arxiv.org/abs/2106.12967v1 ) ライセンス: CC BY 4.0	S. Baroncini (1), M. Daquino (1), F. Tomasi (1) ((1) Department of Classical Philology and Italian Studies, University of Bologna)	(参考訳) イコノロジー(Iconology)は、美術史の分野の一つで、芸術の社会的・文化的背景に関する意味を研究する。今日、いくつかの学際研究分野は、データサイエンスの手法とセマンティックウェブ技術を用いて定量的美術史を追求するために、イコノロジーに近い理論的な枠組みを利用している。しかし、近年ではイコノグラフィー研究がオントロジーで取り上げられているが、イコノロジー研究に関連する側面の完全な記述はいまだに欠落している。本稿では,本論文から選択した11の事例研究について予備研究を行い,既存のオントロジーを拡張するための新たな用語を提案する。我々は,新しい用語を共通の評価手法で検証し,デジタル美術史のコミュニティにおいて,このような拡張オントロジーが生まれる可能性に照らして,その結果について考察する。 Iconology is a branch of art history that investigates the meaning of artworks in relation to their social and cultural background. Nowadays, several interdisciplinary research fields leverage theoretical frameworks close to iconology to pursue quantitative art history with data science methods and Semantic Web technologies. However, while iconographic studies have recently been addressed in ontologies, a complete description of the aspects relevant to iconological studies is still missing. In this article, we present a preliminary study of eleven case studies selected from the literature and we envision new terms for extending existing ontologies. We validate the new terms according to a common evaluation method and we discuss our results in the light of the opportunities that such an extended ontology would create in the community of Digital Art History.	翻訳日:2021-06-26 10:45:24 公開日:2021-06-23
# (参考訳) 連続時間深部グリオーマ成長モデル Continuous-Time Deep Glioma Growth Models ( http://arxiv.org/abs/2106.12917v1 ) ライセンス: CC BY-SA 4.0	Jens Petersen and Fabian Isensee and Gregor K\"ohler and Paul F. J\"ager and David Zimmerer and Ulf Neuberger and Wolfgang Wick and J\"urgen Debus and Sabine Heiland and Martin Bendszus and Philipp Vollmuth and Klaus H. Maier-Hein	(参考訳) 将来、腫瘍がどのように進化するかを推定できる能力は、治療決定の改善から放射線治療における線量分布の改善まで、大きな臨床効果をもたらす可能性がある。最近の研究は、深層学習と変分推論を通じてグリオーマ成長モデル問題にアプローチし、実際の患者データ分布から完全に学習する。これまでのところ、このアプローチは画像取得間隔と固定長のシーケンスに制約されており、より現実的なシナリオにおける適用性を制限する。本稿では,確率的時系列の条件生成モデルであるNeural Processesを拡張し,時空間の注意機構を含む階層的マルチスケール表現符号化を行う。その結果、任意の数の観測で条件付けできる学習的成長モデルとなり、連続時間軸上で時間的に一貫した成長軌道の分布を生成することができる。 379人の患者のデータセット上で、この手法は画像のグローバルおよびよりきめ細かなバリエーションを捉え、他の学習された成長モデルよりも優れたパフォーマンスを示す。 The ability to estimate how a tumor might evolve in the future could have tremendous clinical benefits, from improved treatment decisions to better dose distribution in radiation therapy. Recent work has approached the glioma growth modeling problem via deep learning and variational inference, thus learning growth dynamics entirely from a real patient data distribution. So far, this approach was constrained to predefined image acquisition intervals and sequences of fixed length, which limits its applicability in more realistic scenarios. We overcome these limitations by extending Neural Processes, a class of conditional generative models for stochastic time series, with a hierarchical multi-scale representation encoding including a spatio-temporal attention mechanism. The result is a learned growth model that can be conditioned on an arbitrary number of observations, and that can produce a distribution of temporally consistent growth trajectories on a continuous time axis. On a dataset of 379 patients, the approach successfully captures both global and finer-grained variations in the images, exhibiting superior performance compared to other learned growth models.	
翻訳日:2021-06-26 10:24:29 公開日:2021-06-23
# (参考訳) 解釈可能なグラフニューラルネットワークのための学習スパーシフィケーション Learnt Sparsification for Interpretable Graph Neural Networks ( http://arxiv.org/abs/2106.12920v1 ) ライセンス: CC BY 4.0	Mandeep Rathee, Zijian Zhang, Thorben Funke, Megha Khosla, and Avishek Anand	(参考訳) グラフニューラルネットワーク(GNN)は、リレーショナルモデリングを必要とするさまざまなタスクや分野において大きな成功を収めている。 GNNは、グラフ構造を帰納バイアスとして利用し、柔軟性と強力なモデルを生成する。しかし、ノード特徴とグラフ構造の間の相互作用が暗黙的にのみ学習されるため、GNNの解釈は困難である。本稿では,不要な近傍を除去し,基礎となるグラフを明示的にスパースする手法であるkedgeを提案する。提案手法は,任意のgnnモデルと組み合わせて使用可能な硬いkumaraswamy分布を用いた,扱いやすいスパーシフィケーション法に基づいている。 Kedgeは、任意のGNNでトレーニングされたモジュール方式でエッジマスクを学び、エンドツーエンドで勾配ベースの最適化を実現する。実験では,実験精度に小さな影響を及ぼさずに,エッジのかなりの割合をプルーピングできることを実証した。具体的には、pubmedデータセットでkedgeは、エッジの80%以上をドロップし、わずか2%の精度低下でグラフ構造がノードの機能に対して小さな貢献しか持たないことを学んでいる。最後に、Kedgeは、GNN層の増加とともにタスク性能を向上し、深いGNNにおいて過度にスムースな現象に効果的に対処することを示した。 Graph neural networks (GNNs) have achieved great success on various tasks and fields that require relational modeling. GNNs aggregate node features using the graph structure as an inductive bias, resulting in flexible and powerful models. However, GNNs remain hard to interpret as the interplay between node features and graph structure is only implicitly learned. In this paper, we propose a novel method called Kedge for explicitly sparsifying the underlying graph by removing unnecessary neighbors. Our key idea is based on a tractable method for sparsification using the Hard Kumaraswamy distribution that can be used in conjunction with any GNN model. Kedge learns edge masks in a modular fashion and can be trained with any GNN, allowing for gradient-based optimization in an end-to-end fashion. We demonstrate through extensive experiments that our model Kedge can prune a large proportion of the edges with only a minor effect on the test accuracy. Specifically, on the PubMed dataset, Kedge learns to drop more than 80% of the edges with an accuracy drop of merely 2%, showing that graph structure has only a small contribution in comparison to node features. 
Finally, we also show that Kedge effectively counters the over-smoothing phenomena in deep GNNs by maintaining good task performance with increasing GNN layers.	翻訳日:2021-06-26 10:11:42 公開日:2021-06-23
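The Kedge entry above relies on the Hard Kumaraswamy distribution for edge masks. The sketch below shows one common way such samples are drawn: sample a Kumaraswamy variable by inverting its CDF, stretch it to an interval slightly wider than [0, 1], and clamp the result, which places non-zero probability on exactly 0 and exactly 1. The stretch bounds `lower=-0.1` and `upper=1.1`, the shape parameters, and the function names are assumptions; whether Kedge uses these exact choices is not stated in the abstract.

```python
import random


def sample_kumaraswamy(a, b, rng):
    """Draw from Kumaraswamy(a, b) by inverting the CDF F(x) = 1 - (1 - x**a)**b."""
    u = rng.random()  # uniform in [0, 1)
    return (1.0 - (1.0 - u) ** (1.0 / b)) ** (1.0 / a)


def sample_hard_kumaraswamy(a, b, rng, lower=-0.1, upper=1.1):
    """Stretch a Kumaraswamy sample to (lower, upper), then clamp to [0, 1].

    Samples that land below 0 or above 1 after stretching collapse to
    exactly 0 or exactly 1, so an edge mask can be fully off or fully on.
    """
    x = sample_kumaraswamy(a, b, rng)
    stretched = lower + (upper - lower) * x
    return min(1.0, max(0.0, stretched))


rng = random.Random(0)
# One mask value per edge of a hypothetical graph with 1000 edges.
edge_mask = [sample_hard_kumaraswamy(0.5, 0.5, rng) for _ in range(1000)]
dropped = sum(1 for m in edge_mask if m == 0.0)
```

In a trainable setting the shape parameters would be learned per edge; here they are fixed constants purely to show the sampling step.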
# (参考訳) 機械学習を用いた救急医療における入院予測 Using machine learning techniques to predict hospital admission at the emergency department ( http://arxiv.org/abs/2106.12921v1 ) ライセンス: CC BY 4.0	Georgios Feretzakis, George Karlis, Evangelos Loupelis, Dimitris Kalles, Rea Chatzikyriakou, Nikolaos Trakas, Eugenia Karakou, Aikaterini Sakagianni, Lazaros Tzelves, Stavroula Petropoulou, Aikaterini Tika, Ilias Dalainas and Vasileios Kaldis	(参考訳) 紹介:救急部門(ED)における最も重要な課題の1つは、病院入院の恩恵を受ける患者を迅速に特定することである。機械学習(ML)技術は、医療における診断支援として有望であることを示している。材料と方法: 尿素, クレアチニン, 乳酸脱水素酵素, クレアチンキナーゼ, c-反応性蛋白, 血液計数, 活性化部分トロンボプラスチン時間, dダイマー, 国際正規化比, 年齢, 性別, edユニットへのトリアージ配置, 救急車の使用率など, 入院率の予測における成績について検討した。合計3,204回のED訪問が分析された。結果:提案アルゴリズムは,ED患者の入院予測における許容性能を示すモデルを生成する。 8つの評価アルゴリズムのF値とROC値の範囲はそれぞれ [0.679-0.708] と [0.734-0.774] であった。議論: このツールの主な利点は、簡単アクセス、可用性、イエス/ノー結果、低コストである。本手法の臨床的意義は,従来の臨床的意思決定からより洗練されたモデルへの移行を促進する可能性がある。結論: 共通のバイオマーカーを利用したロバストな予後モデルの開発は, 救急医療の将来を形作るかもしれない。本研究は,実用的ED試験の実施を保証している。 Introduction: One of the most important tasks in the Emergency Department (ED) is to promptly identify the patients who will benefit from hospital admission. Machine Learning (ML) techniques show promise as diagnostic aids in healthcare. Material and methods: We investigated the following features seeking to investigate their performance in predicting hospital admission: serum levels of Urea, Creatinine, Lactate Dehydrogenase, Creatine Kinase, C-Reactive Protein, Complete Blood Count with differential, Activated Partial Thromboplastin Time, D Dimer, International Normalized Ratio, age, gender, triage disposition to ED unit and ambulance utilization. A total of 3,204 ED visits were analyzed. Results: The proposed algorithms generated models which demonstrated acceptable performance in predicting hospital admission of ED patients. The range of F-measure and ROC Area values of all eight evaluated algorithms were [0.679-0.708] and [0.734-0.774], respectively. 
Discussion: The main advantages of this tool include easy access, availability, yes/no result, and low cost. The clinical implications of our approach might facilitate a shift from traditional clinical decision-making to a more sophisticated model. Conclusion: Developing robust prognostic models with the utilization of common biomarkers is a project that might shape the future of emergency medicine. Our findings warrant confirmation with implementation in pragmatic ED trials.	翻訳日:2021-06-26 09:58:28 公開日:2021-06-23
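The hospital admission entry above summarizes model quality with F-measure and ROC Area. For readers unfamiliar with these metrics, here is a small sketch that computes both from their standard definitions; the function names and the toy labels and scores are illustrative only and are unrelated to the study's data.

```python
def f_measure(tp, fp, fn):
    """F1 score: harmonic mean of precision and recall, from confusion counts."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2.0 * precision * recall / (precision + recall)


def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise ranking.

    Equals the probability that a randomly chosen positive case receives a
    higher score than a randomly chosen negative case (ties count as 0.5).
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example: 1 = admitted, 0 = discharged; scores are predicted probabilities.
toy_auc = roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.3, 0.1])
toy_f1 = f_measure(tp=8, fp=2, fn=2)
```

The pairwise formulation is quadratic in the number of cases, which is fine for a few thousand visits but would be replaced by a sort-based method on larger datasets.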
# (参考訳) contextized token representationsを用いた臨床名付きエンティティ認識 Clinical Named Entity Recognition using Contextualized Token Representations ( http://arxiv.org/abs/2106.12608v1 ) ライセンス: CC BY 4.0	Yichao Zhou, Chelsea Ju, J. Harry Caufield, Kevin Shih, Calvin Chen, Yizhou Sun, Kai-Wei Chang, Peipei Ping, Wei Wang	(参考訳) clinical named entity recognition (cner) タスクは、診断手順、疾患障害、重症度、薬物、薬物量、徴候などの予め定義されたカテゴリに臨床用語を分類することを目的としている。 CNERは、新しい現象の同定や人為的な情報抽出を含む薬物に対する副作用の研究を促進する。関心の実体を抽出する既存のアプローチは、各単語を表現するために静的な単語埋め込みを使うことに焦点を当てている。しかし、1つの単語は、文の文脈に依存する異なる解釈を持つことができる。静的な単語埋め込みは、単語の多様な解釈を統合するには不十分である。この課題を克服するために,各単語の意味的意味をより正確に把握するために,文脈的単語埋め込み技術が導入された。これら2つの言語モデルであるelmoとflairは、自然言語処理の分野で広く使われ、ドメインジェネリックドキュメントにコンテキスト化された単語埋め込みを生成する。しかし、これらの埋め込みは通常、特定のドメインの語彙間の近接を捉えるには一般的すぎる。臨床症例報告 (CCR) を用いた下流の様々な応用を容易にするため, PubMed Central による臨床関連コーパスを用いて, 深層文脈言語モデル (C-ELMo) と臨床コンテキスト文字列埋め込み (C-Flair) を事前訓練した。明示的な実験により、私たちのモデルは静的な単語埋め込みとドメイン固有言語モデルの両方と比較して劇的な改善が得られます。 The clinical named entity recognition (CNER) task seeks to locate and classify clinical terminologies into predefined categories, such as diagnostic procedure, disease disorder, severity, medication, medication dosage, and sign symptom. CNER facilitates the study of side-effect on medications including identification of novel phenomena and human-focused information extraction. Existing approaches in extracting the entities of interests focus on using static word embeddings to represent each word. However, one word can have different interpretations that depend on the context of the sentences. Evidently, static word embeddings are insufficient to integrate the diverse interpretation of a word. To overcome this challenge, the technique of contextualized word embedding has been introduced to better capture the semantic meaning of each word based on its context. 
Two of these language models, ELMo and Flair, have been widely used in the field of Natural Language Processing to generate the contextualized word embeddings on domain-generic documents. However, these embeddings are usually too general to capture the proximity among vocabularies of specific domains. To facilitate various downstream applications using clinical case reports (CCRs), we pre-train two deep contextualized language models, Clinical Embeddings from Language Model (C-ELMo) and Clinical Contextual String Embeddings (C-Flair) using the clinical-related corpus from the PubMed Central. Explicit experiments show that our models gain dramatic improvements compared to both static word embeddings and domain-generic language models.	翻訳日:2021-06-26 09:52:02 公開日:2021-06-23
# (参考訳) 機械学習とディープラーニングを用いた手書き文字認識 Handwritten Digit Recognition using Machine and Deep Learning Algorithms ( http://arxiv.org/abs/2106.12614v1 ) ライセンス: CC0 1.0	Samay Pashine, Ritik Dixit, and Rishika Kushwah	(参考訳) 人間のマシンへの依存度は、写真のオブジェクト分類からサイレント映画への音の追加まで、ディープラーニングと機械学習アルゴリズムの助けを借りて、すべてを実行することができるほど高くはない。同様に、手書きのテキスト認識は研究と開発において重要な分野の1つであり、多くの可能性があり得る。手書き文字認識 (HWR) は、手書き文字認識 (HTR) とも呼ばれ、紙文書、写真、タッチスクリーン、その他の装置から手書き入力を受信し、解釈するコンピュータの能力である。本稿では,MNISTデータセットを用いて,Support Vector Machines (SVM), Multi-Layer Perceptron (MLP), Convolution Neural Network (CNN)モデルを用いて手書き桁認識を行った。我々の主な目的は、上述したモデルの精度と実行時間を比較して、桁認識に最適なモデルを得ることである。 Human reliance on machines has never been higher: from object classification in photographs to adding sound to silent movies, everything can be performed with the help of deep learning and machine learning algorithms. Likewise, handwritten text recognition is one of the significant areas of research and development, with many possibilities that could be attained. Handwriting recognition (HWR), also known as Handwritten Text Recognition (HTR), is the ability of a computer to receive and interpret intelligible handwritten input from sources such as paper documents, photographs, touch-screens and other devices [1]. In this paper, we have performed handwritten digit recognition on the MNIST dataset using Support Vector Machines (SVM), Multi-Layer Perceptron (MLP) and Convolutional Neural Network (CNN) models. Our main objective is to compare the accuracy of the models stated above, along with their execution time, to get the best possible model for digit recognition.	翻訳日:2021-06-26 09:40:17 公開日:2021-06-23
# (参考訳) 量子多体問題に対する確率的機械学習 Provably efficient machine learning for quantum many-body problems ( http://arxiv.org/abs/2106.12627v1 ) ライセンス: CC BY 4.0	Hsin-Yuan Huang, Richard Kueng, Giacomo Torlai, Victor V. Albert, John Preskill	(参考訳) 古典機械学習(ML)は、物理学と化学における量子多体問題の解決に潜在的に強力なアプローチを提供する。しかし,従来の手法に比べてMLの優位性は確立されていない。本研究では, 古典的mlアルゴリズムを用いて, ガッピングハミルトニアンの有限次元における基底状態特性を, 同じ量子相で他のハミルトニアンを測定した結果から効率的に予測できることを実証する。対照的に、広く受け入れられている複雑性理論の仮定の下では、データから学ばない古典的アルゴリズムは同じ保証を達成できない。また、古典的MLアルゴリズムは、幅広い量子位相の物質を効率的に分類できることを示す。我々の議論は古典的な影の概念に基づいており、これは多体量子状態の簡潔な古典的な記述であり、実現可能な量子実験で構築でき、状態の多くの特性を予測できる。大規模数値実験は、Rydberg原子系、2次元ランダムハイゼンベルクモデル、対称性保護位相、位相秩序相などの様々なシナリオにおいて、我々の理論結果を裏付ける。 Classical machine learning (ML) provides a potentially powerful approach to solving challenging quantum many-body problems in physics and chemistry. However, the advantages of ML over more traditional methods have not been firmly established. In this work, we prove that classical ML algorithms can efficiently predict ground state properties of gapped Hamiltonians in finite spatial dimensions, after learning from data obtained by measuring other Hamiltonians in the same quantum phase of matter. In contrast, under widely accepted complexity theory assumptions, classical algorithms that do not learn from data cannot achieve the same guarantee. We also prove that classical ML algorithms can efficiently classify a wide range of quantum phases of matter. Our arguments are based on the concept of a classical shadow, a succinct classical description of a many-body quantum state that can be constructed in feasible quantum experiments and be used to predict many properties of the state. Extensive numerical experiments corroborate our theoretical results in a variety of scenarios, including Rydberg atom systems, 2D random Heisenberg models, symmetry-protected topological phases, and topologically ordered phases.	翻訳日:2021-06-26 09:31:18 公開日:2021-06-23
# (参考訳) リスク階層化と分析のための医療的主張に基づくトランスフォーマーに基づく教師なし患者表現学習 Transformer-based unsupervised patient representation learning based on medical claims for risk stratification and analysis ( http://arxiv.org/abs/2106.12658v1 ) ライセンス: CC BY 4.0	Xianlong Zeng, Simon Lin, Chang Liu	(参考訳) クレームデータは、医療コード、サービス情報、および発生した支出を含むものであり、個人の健康状態と医療リスクレベルを推定するのによい資源である。本研究では,マルチモーダルオートエンコーダ(tmae,transformer-based multimodal autoencoder)を開発した。 TMAEは、医療の実践的なニーズにより、患者を異なるリスクレベルに階層化し、ケア提供と管理を改善する。従来のアプローチと比較して, TMAEは, 1) 入院者, 外来患者, 服薬請求を総合的にモデル化し, 2) 医療イベント間の不規則な時間間隔を処理し, 3) まれな医療基準の空白問題を緩和し, 4) 医療費情報を組み込むことができる。我々は,60万人以上の患者を含む実世界の小児クレームデータセットを用いてtmaeを訓練し,その性能を2つのクラスタリングタスクにおける様々なアプローチと比較した。実験の結果, TMAEは全てのベースラインに比べて優れた性能を示した。フレームワークの有効性を説明するために、複数のダウンストリームアプリケーションも実施する。有望な結果は,TMAEフレームワークが大規模クレームデータに対してスケーラブルであり,リスク階層化と分析のために効率的な患者埋め込みを生成することができることを確認した。 The claims data, containing medical codes, services information, and incurred expenditure, can be a good resource for estimating an individual's health condition and medical risk level. In this study, we developed Transformer-based Multimodal AutoEncoder (TMAE), an unsupervised learning framework that can learn efficient patient representation by encoding meaningful information from the claims data. TMAE is motivated by the practical needs in healthcare to stratify patients into different risk levels for improving care delivery and management. Compared to previous approaches, TMAE is able to 1) model inpatient, outpatient, and medication claims collectively, 2) handle irregular time intervals between medical events, 3) alleviate the sparsity issue of the rare medical codes, and 4) incorporate medical expenditure information. We trained TMAE using a real-world pediatric claims dataset containing more than 600,000 patients and compared its performance with various approaches in two clustering tasks. Experimental results demonstrate that TMAE has superior performance compared to all baselines. 
Multiple downstream applications are also conducted to illustrate the effectiveness of our framework. The promising results confirm that the TMAE framework is scalable to large claims data and is able to generate efficient patient embeddings for risk stratification and analysis.	翻訳日:2021-06-26 09:29:32 公開日:2021-06-23
# (参考訳) 畳み込みニューラルネットワークを用いた高速で忠実なlyman$\alpha$ forests Fast, high-fidelity Lyman $\alpha$ forests with convolutional neural networks ( http://arxiv.org/abs/2106.12662v1 ) ライセンス: CC BY 4.0	Peter Harrington, Mustafa Mustafa, Max Dornfest, Benjamin Horowitz, Zarija Luki\'c	(参考訳) フル物理宇宙学シミュレーションは宇宙の構造の形成と進化を研究する強力なツールであるが、極端な計算資源を必要とする。そこで我々は,Nyxシミュレーションのデータを用いて,Lyman-$\alpha$(Ly$\alpha$)森林のバリオン流体力学変数(密度,温度,速度)を復元するために,より安価なN-body-onlyシミュレーションを使用するように畳み込みニューラルネットワークを訓練する。本手法は20kpcの解像度でこれらのフィールドの迅速な推定を可能にし,既存の近似値よりもはるかに精度の高いly$\alpha$ forestの統計値を取得する。私たちのモデルは完全なコンボリューションであるため、より小さなシミュレーションボックスでトレーニングし、より大きなモデルにデプロイすることが可能です。さらに, この手法は, ly$\alpha$ flux ではなく, 流体力学場の近似を生成するので, 電離背景や平均透過流束の特定の選択に限定されない。 Full-physics cosmological simulations are powerful tools for studying the formation and evolution of structure in the universe but require extreme computational resources. Here, we train a convolutional neural network to use a cheaper N-body-only simulation to reconstruct the baryon hydrodynamic variables (density, temperature, and velocity) on scales relevant to the Lyman-$\alpha$ (Ly$\alpha$) forest, using data from Nyx simulations. We show that our method enables rapid estimation of these fields at a resolution of $\sim$20kpc, and captures the statistics of the Ly$\alpha$ forest with much greater accuracy than existing approximations. Because our model is fully-convolutional, we can train on smaller simulation boxes and deploy on much larger ones, enabling substantial computational savings. Furthermore, as our method produces an approximation for the hydrodynamic fields instead of Ly$\alpha$ flux directly, it is not limited to a particular choice of ionizing background or mean transmitted flux.	翻訳日:2021-06-26 09:16:14 公開日:2021-06-23
# (参考訳) 連続ウェーブレット変換と畳み込みニューラルネットワークを用いたヒューマンアクティビティ認識 Human Activity Recognition using Continuous Wavelet Transform and Convolutional Neural Networks ( http://arxiv.org/abs/2106.12666v1 ) ライセンス: CC BY 4.0	Anna Nedorubova, Alena Kadyrova, Aleksey Khlyupin	(参考訳) 糖尿病患者やその他の慢性疾患のある人、高齢者、障害者など、健康上の理由から常時監視下に置かれなければならない人は世界に少なくない。これらの集団は、生命を脅かすような転倒や失神に見舞われるリスクが高い可能性がある。資源が限られているため、リスクのある人の大部分は必要な監視を受けられず、過度の危険にさらされる。現在、この問題は通常HAR(Human Activity Recognition)手法を用いて解決されている。 HARは、医療、スポーツ、セキュリティなど幅広い応用分野を持つ、有望で進展の速いデータサイエンス分野である。しかし、現在の認識技術は精度が著しく不足しているため、本稿では人間の行動分類のための高精度な手法を提案する。我々は、HAR問題に対処する新しいワークフローを提案し、加速度センサ信号からなるUniMiB SHARデータセット上で評価する。提案するモデルは連続ウェーブレット変換(CWT)と畳み込みニューラルネットワーク(CNN)に基づいている。ウェーブレット変換は信号特徴を時間領域と周波数領域の両方にローカライズし、その後、CNNはこれらの特徴を抽出して活動を認識する。また、CWTは1D加速度計信号を2D画像に変換するため、2Dネットワークの予測能力が著しく高いことから、より良い結果が得られる。研究の過程で、畳み込みニューラルネットワークを構築し、空間軸の数、層数、各層内のニューロン数、画像サイズ、マザーウェーブレットの種類、マザーウェーブレットのゼロモーメントの次数など、モデルパラメータを変化させる。さらに,残差ブロックを持つモデルを適用することで,評価指標の値が大きく向上する。最後に、99.26%の精度を達成し、これはこの問題に対して価値のある性能である。 Quite a few people in the world have to stay under permanent surveillance for health reasons; they include diabetic people or people with some other chronic conditions, the elderly and the disabled. These groups may face a heightened risk of life-threatening falls or of syncope. Due to limited availability of resources, a substantial part of the people at risk cannot receive the necessary monitoring and thus are exposed to excessive danger. Nowadays, this problem is usually solved by applying Human Activity Recognition (HAR) methods. HAR is a promising and fast-paced Data Science field, which has a wide range of application areas such as healthcare, sport, security etc. However, current recognition techniques are markedly lacking in accuracy; hence, the present paper suggests a highly accurate method for human activity classification. We propose a new workflow to address the HAR problem and evaluate it on the UniMiB SHAR dataset, which consists of accelerometer signals. The model we suggest is based on continuous wavelet transform (CWT) and convolutional neural networks (CNNs). Wavelet transform localizes signal features both in time and frequency domains, and after that a CNN extracts these features and recognizes activity. It is also worth noting that CWT converts a 1D accelerometer signal into 2D images and thus makes it possible to obtain better results, as 2D networks have a significantly higher predictive capacity. In the course of the work we build a convolutional neural network and vary such model parameters as number of spatial axes, number of layers, number of neurons in each layer, image size, type of mother wavelet, the order of zero moment of mother wavelet etc. Besides, we also apply models with residual blocks, which resulted in significantly higher metric values. Finally, we reach 99.26% accuracy, which is a worthy performance for this problem.	翻訳日:2021-06-26 09:01:31 公開日:2021-06-23
# (参考訳) 表現中立化による公正性 Fairness via Representation Neutralization ( http://arxiv.org/abs/2106.12674v1 ) ライセンス: CC BY 4.0	Mengnan Du, Subhabrata Mukherjee, Guanchu Wang, Ruixiang Tang, Ahmed Hassan Awadallah, Xia Hu	(参考訳) DNNモデルの既存のバイアス軽減手法は、主にデバイアスエンコーダの学習に取り組んでいる。このプロセスは、センシティブな属性に対して多くのインスタンスレベルのアノテーションを必要とするだけでなく、すべての公平さに敏感な情報がエンコーダから削除されたことを保証しません。これらの制限に対処するために、我々は以下の研究課題を探求する: DNNモデルの識別は、入力として偏りのある表現であっても、分類ヘッドを乱すだけで抑えられるか? そこで本研究では,DNNモデルのタスク固有分類先頭のみを曖昧にすることで,公平性を実現するための表現中立化(Representation Neutralization for Fairness, RNF)を提案する。そこで我々は,DNNモデルの分類ヘッドをトレーニングするために,同一の地下構造ラベルを持つサンプルを,異なる感度特性で利用し,その中性表現を用いて評価する。 RNFの鍵となる考え方は、特定のクラスラベルを持つエンコーダ表現において、公平さに敏感な情報間の素早い相関を捉えないようにすることである。機密属性アノテーションにアクセスせずに低リソース設定に対処するため、バイアス増幅モデルを用いて機密属性のプロキシアノテーションを生成する。複数のベンチマークデータセットに対する実験結果は、タスク固有の性能の低下を最小限に抑えつつ、DNNモデルの識別を効果的に削減するRNFフレームワークを実証している。 Existing bias mitigation methods for DNN models primarily work on learning debiased encoders. This process not only requires a lot of instance-level annotations for sensitive attributes, it also does not guarantee that all fairness sensitive information has been removed from the encoder. To address these limitations, we explore the following research question: Can we reduce the discrimination of DNN models by only debiasing the classification head, even with biased representations as inputs? To this end, we propose a new mitigation technique, namely, Representation Neutralization for Fairness (RNF) that achieves fairness by debiasing only the task-specific classification head of DNN models. To this end, we leverage samples with the same ground-truth label but different sensitive attributes, and use their neutralized representations to train the classification head of the DNN model. The key idea of RNF is to discourage the classification head from capturing spurious correlation between fairness sensitive information in encoder representations with specific class labels. 
To address low-resource settings with no access to sensitive attribute annotations, we leverage a bias-amplified model to generate proxy annotations for sensitive attributes. Experimental results over several benchmark datasets demonstrate our RNF framework to effectively reduce discrimination of DNN models with minimal degradation in task-specific performance.	翻訳日:2021-06-26 08:46:05 公開日:2021-06-23
# Charformer: Gradient-based Subword Tokenizationによる高速文字変換器 Charformer: Fast Character Transformers via Gradient-based Subword Tokenization ( http://arxiv.org/abs/2106.12672v1 ) ライセンス: Link先を確認	Yi Tay, Vinh Q. Tran, Sebastian Ruder, Jai Gupta, Hyung Won Chung, Dara Bahri, Zhen Qin, Simon Baumgartner, Cong Yu, Donald Metzler	(参考訳) 自然言語処理における最先端モデルは、その一般化能力と新しい設定への適応を制限する、別個の厳密なサブワードトークン化アルゴリズムに依存している。本稿では,モデルの一部としてサブワードトークン化をエンドツーエンドで学習する新しいモデルインダクティブバイアスを提案する。そこで本研究では,データ駆動方式で文字から潜在サブワード表現を自動的に学習する,ソフトな勾配ベースのサブワードトークン化モジュール(GBST)を提案する。具体的には、GBSTは候補のサブワードブロックを列挙し、ブロックスコアリングネットワークを用いて位置ごとにスコア付けすることを学習する。また、GBSTを統合し、バイトレベルで動作する深層トランスフォーマーモデルであるCharformerを紹介する。英語のGLUE、多言語、ノイズの多いテキストデータセットに関する広範な実験を通じて、Charformerは一連の競合するバイトレベルのベースラインを上回り、サブワードベースのモデルに対しても概ね同等、時にはそれを上回る性能を示す。さらにCharformerは高速で、バニラバイトレベルのトランスフォーマーとサブワードレベルのトランスフォーマーの両方のスピードを28%-100%向上し、競争力のある品質を維持している。この作業は、エンドツーエンドで完全にトレーニングされた高性能なトークンフリーモデルの道を開くものだと考えています。 State-of-the-art models in natural language processing rely on separate rigid subword tokenization algorithms, which limit their generalization ability and adaptation to new settings. In this paper, we propose a new model inductive bias that learns a subword tokenization end-to-end as part of the model. To this end, we introduce a soft gradient-based subword tokenization module (GBST) that automatically learns latent subword representations from characters in a data-driven fashion. Concretely, GBST enumerates candidate subword blocks and learns to score them in a position-wise fashion using a block scoring network. We additionally introduce Charformer, a deep Transformer model that integrates GBST and operates on the byte level. Via extensive experiments on English GLUE, multilingual, and noisy text datasets, we show that Charformer outperforms a series of competitive byte-level baselines while generally performing on par and sometimes outperforming subword-based models. Additionally, Charformer is fast, improving the speed of both vanilla byte-level and subword-level Transformers by 28%-100% while maintaining competitive quality. We believe this work paves the way for highly performant token-free models that are trained completely end-to-end.	翻訳日:2021-06-25 15:22:24 公開日:2021-06-23
# 多層ランダムreluネットワークにおける逆例 Adversarial Examples in Multi-Layer Random ReLU Networks ( http://arxiv.org/abs/2106.12611v1 ) ライセンス: Link先を確認	Peter L. Bartlett, S\'ebastien Bubeck and Yeshwanth Cherapanamjeri	(参考訳) 独立ガウスパラメータを持つReLUネットワークにおける逆例の現象を考察する。一定の深さと広い幅のネットワーク(例えば、各層の幅が他の層の多項式であれば十分)に対して、入力ベクトルの小さな摂動は出力の大きな変化をもたらす。これにより、急速に幅を減らしたネットワークに対する Daniely と Schacham (2020) の結果と、2層ネットワークに対する Bubeck et al (2021) の結果が一般化される。この証明は、それらが計算する関数が線形に非常に近いため、これらのネットワークで逆例が生じることを示している。ネットワーク内のいくつかのポイントまでの最小の幅は、そのポイントまで計算されたマッピングのスケールと感度を決定する。主な結果は、深さが一定であるネットワークに対してであるが、この種の結果には深さの制約が必要であり、それは、一定の確率で定数に近い関数を計算する、適切なディープネットワークが存在するためである。 We consider the phenomenon of adversarial examples in ReLU networks with independent gaussian parameters. For networks of constant depth and with a large range of widths (for instance, it suffices if the width of each layer is polynomial in that of any other layer), small perturbations of input vectors lead to large changes of outputs. This generalizes results of Daniely and Schacham (2020) for networks of rapidly decreasing width and of Bubeck et al (2021) for two-layer networks. The proof shows that adversarial examples arise in these networks because the functions that they compute are very close to linear. Bottleneck layers in the network play a key role: the minimal width up to some point in the network determines scales and sensitivities of mappings computed up to that point. The main result is for networks with constant depth, but we also show that some constraint on depth is necessary for a result of this kind, because there are suitably deep networks that, with constant probability, compute a function that is close to constant.	翻訳日:2021-06-25 15:21:28 公開日:2021-06-23
# タブラルデータからのアイデアによるGNN説明の再検討 Reimagining GNN Explanations with ideas from Tabular Data ( http://arxiv.org/abs/2106.12665v1 ) ライセンス: Link先を確認	Anjali Singh, Shamanth R Nayak K, Balaji Ganesan	(参考訳) グラフニューラルネットワークの説明可能性技術は、表データに基づいてトレーニングされたニューラルネットワークと決定木ベースのモデルの両方で利用可能な説明と比較して、まだ長い道のりがある。グラフと表データの両方にまたがるタスク、すなわちEntity Matchingを使って、GNNモデル説明に欠けている説明可能性の重要な側面についてコメントする。 Explainability techniques for Graph Neural Networks still have a long way to go compared to explanations available for both neural and decision tree-based models trained on tabular data. Using a task that straddles both graphs and tabular data, namely Entity Matching, we comment on key aspects of explainability that are missing in GNN model explanations.	翻訳日:2021-06-25 15:19:13 公開日:2021-06-23
# 多目的非同期逐次Halving Multi-objective Asynchronous Successive Halving ( http://arxiv.org/abs/2106.12639v1 ) ライセンス: Link先を確認	Robin Schmucker, Michele Donini, Muhammad Bilal Zafar, David Salinas, C\'edric Archambeau	(参考訳) ハイパーパラメータ最適化(HPO)は、機械学習モデルの予測性能(例えば精度)を自動調整するために、ますます使われている。しかし、現実世界のアプリケーションでは、精度は複数の(しばしば矛盾する)パフォーマンス基準の1つに過ぎず、多目的(MO)の観点を採用する必要がある。 MO最適化に関する文献は豊富だが、HPOに焦点を当てた先行研究はほとんどない。本稿では,非同期逐次半減法(ASHA)をMO設定に拡張するアルゴリズムを提案する。複数の評価指標を考慮して,3つの実世界課題,すなわち(i)ニューラルアーキテクチャ探索,(ii)アルゴリズム的公平性,(iii)言語モデル最適化の性能評価を行った。実験分析の結果,MO ASHAはMO HPOを大規模に実行可能にすることがわかった。さらに,パレートフロント全体を候補選択の考慮に入れることで,実時間(wall-clock time)の観点からMOスカラー化に基づくマルチ忠実度HPOを一貫して上回ることを観察する。私たちのアルゴリズム(オープンソース化予定)は、この分野における今後の研究のための新しいベースラインを確立します。 Hyperparameter optimization (HPO) is increasingly used to automatically tune the predictive performance (e.g., accuracy) of machine learning models. However, in a plethora of real-world applications, accuracy is only one of the multiple -- often conflicting -- performance criteria, necessitating the adoption of a multi-objective (MO) perspective. While the literature on MO optimization is rich, few prior studies have focused on HPO. In this paper, we propose algorithms that extend asynchronous successive halving (ASHA) to the MO setting. Considering multiple evaluation metrics, we assess the performance of these methods on three real world tasks: (i) neural architecture search, (ii) algorithmic fairness and (iii) language model optimization. Our empirical analysis shows that MO ASHA makes it possible to perform MO HPO at scale. Further, we observe that taking the entire Pareto front into account for candidate selection consistently outperforms multi-fidelity HPO based on MO scalarization in terms of wall-clock time. Our algorithms (to be open-sourced) establish new baselines for future research in the area.	翻訳日:2021-06-25 15:18:31 公開日:2021-06-23
# ディープフェイク検出:顔面マニピュレーション検出ソリューションの調査 Deep Fake Detection: Survey of Facial Manipulation Detection Solutions ( http://arxiv.org/abs/2106.12605v1 ) ライセンス: Link先を確認	Samay Pashine, Sagar Mandiya, Praveen Gupta, and Rashid Sheikh	(参考訳) 分野としてのディープラーニングは、数十年前に想像できなかったような、多くの複雑な問題の解決に成功しています。しかし、それがもたらす多くの利益は、社会に害をもたらすのに使える方法がまだ残っています。ディープフェイクはそのような問題のひとつであることが証明されており、スマートフォン上で単にアプリケーションを使って偽の画像やビデオを作成できる場合には、画像や動画が偽物なのか本物なのかを検知し、オンライン情報の信頼性を脅かす問題を処分する、何らかの対策が必要になります。ニューラルネットワークによって作成されたディープフェイクは、実際の画像やビデオと同じくらいリアルに思えるかもしれないが、モデレーション後の空間的および時間的痕跡やシグネチャは残っており、人間の目に見えないシグネチャは、ディープフェイク検出を専門に訓練されたニューラルネットワークによって検出することができる。本稿では,アートニューラルネット(mesonet,resnet-50,vgg-19,xception net)のいくつかの状態を分析し,それらを比較することで,分類をできるだけ早く行うべきオンラインソーシャルメディアプラットフォームや,その分類をリアルタイムに必要とせず,かつ最も正確性を要する小さなニュース機関において,リアルタイムのディープフェイク検出のような様々なシナリオに対して最適な解決策を見出す。 Deep Learning as a field has been successfully used to solve a plethora of complex problems, the likes of which we could not have imagined a few decades back. But as many benefits as it brings, there are still ways in which it can be used to bring harm to our society. Deep fakes have been proven to be one such problem, and now more than ever, when any individual can create a fake image or video simply using an application on the smartphone, there need to be some countermeasures, with which we can detect if the image or video is a fake or real and dispose of the problem threatening the trustworthiness of online information. Although the Deep fakes created by neural networks, may seem to be as real as a real image or video, it still leaves behind spatial and temporal traces or signatures after moderation, these signatures while being invisible to a human eye can be detected with the help of a neural network trained to specialize in Deep fake detection. 
In this paper, we analyze several such states of the art neural networks (MesoNet, ResNet-50, VGG-19, and Xception Net) and compare them against each other, to find an optimal solution for various scenarios like real-time deep fake detection to be deployed in online social media platforms where the classification should be made as fast as possible or for a small news agency where the classification need not be in real-time but requires utmost accuracy.	翻訳日:2021-06-25 15:16:29 公開日:2021-06-23
# IA-RED$^2$:視覚変換器の解釈可能性を考慮した冗長性低減 IA-RED$^2$: Interpretability-Aware Redundancy Reduction for Vision Transformers ( http://arxiv.org/abs/2106.12620v1 ) ライセンス: Link先を確認	Bowen Pan, Yifan Jiang, Rameswar Panda, Zhangyang Wang, Rogerio Feris, Aude Oliva	(参考訳) 自己注意型モデルであるTransformerは最近、コンピュータビジョン分野における主要なバックボーンになりつつある。様々なビジョンタスクでトランスフォーマーが素晴らしい成功をおさめたにもかかわらず、計算量と集中的なメモリコストに苦しめられている。本稿では,この制限に対処するため,解釈可能性を考慮したredundancy REDuction framework(IA-RED$^2$)を提案する。まず,非相関な入力パッチに主に費やされる大量の冗長な計算を観察し,その冗長なパッチを動的かつ優雅に削除するための解釈可能なモジュールを導入する。この新たなフレームワークは階層構造に拡張され、異なるステージで無相関なトークンが徐々に削除され、計算コストが大幅に削減される。 DeiTやTimeSformerのような最先端モデルの1.4倍のスピードアップを実現するために、画像タスクとビデオタスクの両方で広範な実験を行いました。さらに重要なことは、他の加速手法とは対照的に、我々の手法は本質的にかなりの視覚的証拠で解釈可能であり、より軽量でありながら、より人間に理解可能なアーキテクチャに近づきます。筆者らは,本フレームワークで自然に現れる解釈可能性について,本来の視覚変換器で学習した生の注意力,および既成の解釈法で生成されたものより質的かつ定量的な結果よりも優れていることを示した。プロジェクトページ: http://people.csail.mit.edu/bpan/ia-red/ The self-attention-based model, transformer, is recently becoming the leading backbone in the field of computer vision. In spite of the impressive success made by transformers in a variety of vision tasks, it still suffers from heavy computation and intensive memory cost. To address this limitation, this paper presents an Interpretability-Aware REDundancy REDuction framework (IA-RED$^2$). We start by observing a large amount of redundant computation, mainly spent on uncorrelated input patches, and then introduce an interpretable module to dynamically and gracefully drop these redundant patches. This novel framework is then extended to a hierarchical structure, where uncorrelated tokens at different stages are gradually removed, resulting in a considerable shrinkage of computational cost. We include extensive experiments on both image and video tasks, where our method could deliver up to 1.4X speed-up for state-of-the-art models like DeiT and TimeSformer, by only sacrificing less than 0.7% accuracy. 
More importantly, contrary to other acceleration approaches, our method is inherently interpretable with substantial visual evidence, making vision transformer closer to a more human-understandable architecture while being lighter. We demonstrate that the interpretability that naturally emerged in our framework can outperform the raw attention learned by the original visual transformer, as well as those generated by off-the-shelf interpretation methods, with both qualitative and quantitative results. Project Page: http://people.csail.mit.edu/bpan/ia-red/.	翻訳日:2021-06-25 15:09:27 公開日:2021-06-23
# 最小シャープネス:ニューラルネットワークのスケール不変パラメータロバストネス Minimum sharpness: Scale-invariant parameter-robustness of neural networks ( http://arxiv.org/abs/2106.12612v1 ) ライセンス: Link先を確認	Hikaru Ibayashi, Takuo Hamaguchi, Masaaki Imaizum	(参考訳) 堅牢で防御的なニューラルネットワークの実現に向けて、重量パラメータ摂動(シャープネス)に対する堅牢性は近年注目を集めている(Sun et al., 2020)。しかし、鋭さは「スケール感度」という重要な問題のままである。本稿では,新しいシャープネス尺度,Minimum Sharpnessを提案する。 NNは、機能的特性が完全に同一である同値なクラスを構成する特定のスケール変換を持ち、同時にそのシャープさは無限に変化することが知られている。我々は、スケール変換に不変な等価NNに対する最小化問題を通じて、シャープさを定義する。また, 研削性を実現するための効率的かつ精密な手法を開発し, ヘシアンの計算コストを低減した。実験の結果,我々のシャープネスはNNの一般化と有効に相関しており,既存のシャープネス対策よりも計算コストが低いことがわかった。 Toward achieving robust and defensive neural networks, the robustness against the weight parameters perturbations, i.e., sharpness, attracts attention in recent years (Sun et al., 2020). However, sharpness is known to remain a critical issue, "scale-sensitivity." In this paper, we propose a novel sharpness measure, Minimum Sharpness. It is known that NNs have a specific scale transformation that constitutes equivalent classes where functional properties are completely identical, and at the same time, their sharpness could change unlimitedly. We define our sharpness through a minimization problem over the equivalent NNs being invariant to the scale transformation. We also develop an efficient and exact technique to make the sharpness tractable, which reduces the heavy computational costs involved with Hessian. In the experiment, we observed that our sharpness has a valid correlation with the generalization of NNs and runs with less computational cost than existing sharpness measures.	翻訳日:2021-06-25 15:03:51 公開日:2021-06-23
# オンライン学習におけるベストケースローワーバウンダリ Best-Case Lower Bounds in Online Learning ( http://arxiv.org/abs/2106.12688v1 ) ライセンス: Link先を確認	Crist\'obal Guzm\'an and Nishant A. Mehta and Ali Mortazavi	(参考訳) オンライン学習における研究の多くは、後悔に対する下線上界の研究に焦点を当てている。本研究は,オンライン凸最適化における最良ケース下界の研究を開始し,アルゴリズムが後から得られる最良動作に対する最大の改善点を定めている。この問題は、学習アルゴリズムの適応性をよりよく理解することを目的としている。もうひとつのモチベーションは、グループフェアネスの概念を満たす決定理論オンライン学習(DTOL)のアルゴリズムを得る上で、ベストケースの下位境界が有効であることが知られていることである。我々のコントリビューションは、Follow The Regularized Leader (FTRL)アルゴリズムに時間変化レギュラーライザを付加する一般的な方法であり、このアルゴリズムは、最良のケースの下位境界が既存の上位の後悔境界と同じ順序であることを示すために使われる。対照的に、FTRLの線形化バージョンは負の線形後悔を達成できることを示す。最後に、2人の専門家とバイナリ予測を持つdtolでは、ベストケースシーケンスを完全に特徴付けし、ベストケース下限をより詳細に理解します。 Much of the work in online learning focuses on the study of sublinear upper bounds on the regret. In this work, we initiate the study of best-case lower bounds in online convex optimization, wherein we bound the largest improvement an algorithm can obtain relative to the single best action in hindsight. This problem is motivated by the goal of better understanding the adaptivity of a learning algorithm. Another motivation comes from fairness: it is known that best-case lower bounds are instrumental in obtaining algorithms for decision-theoretic online learning (DTOL) that satisfy a notion of group fairness. Our contributions are a general method to provide best-case lower bounds in Follow The Regularized Leader (FTRL) algorithms with time-varying regularizers, which we use to show that best-case lower bounds are of the same order as existing upper regret bounds: this includes situations with a fixed learning rate, decreasing learning rates, timeless methods, and adaptive gradient methods. In stark contrast, we show that the linearized version of FTRL can attain negative linear regret. Finally, in DTOL with two experts and binary predictions, we fully characterize the best-case sequences, which provides a finer understanding of the best-case lower bounds.	翻訳日:2021-06-25 15:03:36 公開日:2021-06-23
# バックプロパゲーションにおけるReLU'(0)の数値解析効果 Numerical influence of ReLU'(0) on backpropagation ( http://arxiv.org/abs/2106.12915v1 ) ライセンス: Link先を確認	David Bertoin (ISAE-SUPAERO), J\'er\^ome Bolte (UT1, TSE), S\'ebastien Gerchinovitz (IMT), Edouard Pauwels (CNRS, IRIT)	(参考訳) 理論上、ニューラルネットワークにおけるReLU'(0)の値を[0, 1]の範囲でどう選んでも、バックプロパゲーションとトレーニングの両方への影響は無視できる。しかし、現実世界では、32ビットのデフォルト精度とディープラーニングの問題のサイズが組み合わさって、この値はトレーニング手法のハイパーパラメータとなる。各種ネットワーク(全結合, VGG, ResNet)およびデータセット(MNIST, CIFAR10, SVHN)における複数の精度レベル(16, 32, 64ビット)に対するReLU'(0)の値の重要性について検討する。32ビット精度では、約半分の割合でバックプロパゲーション出力にかなりの変動が生じることを観測する。この効果は倍精度で消失するが、16ビットでは系統的に現れる。バニラSGDトレーニングでは、ReLU'(0) = 0の選択が最も効率的と思われる。また、バッチノルムやADAMのようなリコンディショニングアプローチは、ReLU'(0)の値の影響を緩衝する傾向にあることを示す。全体として、我々が伝えたいメッセージは、非滑らかな問題のアルゴリズム的微分が、有利に調整できるパラメータを隠している可能性があるということだ。 In theory, the choice of ReLU'(0) in [0, 1] for a neural network has a negligible influence both on backpropagation and training. Yet, in the real world, 32 bits default precision combined with the size of deep learning problems makes it a hyperparameter of training methods. We investigate the importance of the value of ReLU'(0) for several precision levels (16, 32, 64 bits), on various networks (fully connected, VGG, ResNet) and datasets (MNIST, CIFAR10, SVHN). We observe considerable variations of backpropagation outputs which occur around half of the time in 32 bits precision. The effect disappears with double precision, while it is systematic at 16 bits. For vanilla SGD training, the choice ReLU'(0) = 0 seems to be the most efficient. We also show that reconditioning approaches such as batch-norm or ADAM tend to buffer the influence of ReLU'(0)'s value. Overall, the message we want to convey is that algorithmic differentiation of nonsmooth problems potentially hides parameters that could be tuned advantageously.	翻訳日:2021-06-25 15:02:01 公開日:2021-06-23
# L'Apprentissage Automatique dans la planification et le contr{\^o}le de la production : un {\'e}tat de l'art ( http://arxiv.org/abs/2106.12916v1 ) ライセンス: Link先を確認	Juan Pablo Usuga Cadavid (LAMIH, ENSAM), Samir Lamouri (LAMIH, ENSAM), Bernard Grabot (LGP, ENIT), Arnaud Fortin	(参考訳) 適切な生産計画・管理(PPC: Production Planning and Control)は、競合他社に対して優位に立ち、コストを削減し、納期を守るうえで極めて重要である。 PPCに関しては、機械学習(ML)がデータに基づいてインテリジェントな意思決定を行う新たな機会を提供する。したがって、本稿は、PPCに適用されたMLに関する出版物の初期の体系的なレビューを提供する。本研究の目的は2つある:第1に、PPCにMLを適用可能にする技術やツールを特定し、第2に、最近の研究論文におけるインダストリー4.0(I4.0)の特徴をレビューすることである。第2の目的については、I4.0の7つの特徴が分析フレームワークで使われ、そのうちの2つは著者によって提案されている。さらに、科学文献においてML支援PPCが扱っている領域を同定した。最後に、結果を分析し、さらなる研究の動機となりうるギャップを強調する。 Proper Production Planning and Control (PPC) is crucial to gaining an edge over competitors, reducing costs and respecting delivery dates. With regard to PPC, Machine Learning (ML) provides new opportunities to make intelligent decisions based on data. Therefore, this communication provides an initial systematic review of publications on ML applied in PPC. The research objective of this study is twofold: firstly, it aims to identify techniques and tools that allow ML to be applied in PPC, and secondly, it reviews the characteristics of Industry 4.0 (I4.0) in recent research papers. Concerning the second objective, seven characteristics of I4.0 are used in the analysis framework, two of which are proposed by the authors. Additionally, the addressed domains of ML-aided PPC in scientific literature are identified. Finally, results are analyzed and gaps that may motivate further research are highlighted.	翻訳日:2021-06-25 15:01:42 公開日:2021-06-23
# トレーニングとテストのセグメンテーションミスマッチへの対処: FBK@IWSLT2021 Dealing with training and test segmentation mismatch: FBK@IWSLT2021 ( http://arxiv.org/abs/2106.12607v1 ) ライセンス: Link先を確認	Sara Papi, Marco Gaido, Matteo Negri, Marco Turchi	(参考訳) 本稿では,IWSLT 2021オフライン音声翻訳タスクに対するFBKのシステム提出について述べる。我々は、英語の音声データをドイツ語テキストに翻訳するように訓練されたTransformerベースのアーキテクチャである直接(direct)モデルで参加した。訓練パイプラインは、知識蒸留と2段階の微調整手順により特徴づけられる。知識蒸留と第1の微調整工程の両方を手作業で分割した実データと合成データで行い、後者は利用可能なコーパスで訓練されたMTシステムで生成する。一方、第2の微調整ステップは、MuST-C v2 En-Deデータセットのランダムセグメンテーションで実行される。その主な目的は、手動でセグメンテーションされたデータ(すなわち理想的な文単位のセグメンテーション)で訓練された音声翻訳モデルを、自動セグメンテーションされた音声(すなわち実際のより現実的なテスト条件)で評価する際に生じる性能低下を減らすことである。同じ目的のために、システムに渡す前に、オーディオコンテンツ(ポーズ)と生成されたセグメントの長さの両方を考慮に入れた独自のハイブリッドセグメンテーション手順をテストデータに適用する。推論時には、この手順をVoice Activity Detection (VAD) に基づくベースラインセグメンテーション法と比較した。その結果、手動セグメンテーションとの差が8.3から1.4 BLEUポイントに縮小し、提案するハイブリッド手法の有効性が示された。 This paper describes FBK's system submission to the IWSLT 2021 Offline Speech Translation task. We participated with a direct model, which is a Transformer-based architecture trained to translate English speech audio data into German texts. The training pipeline is characterized by knowledge distillation and a two-step fine-tuning procedure. Both knowledge distillation and the first fine-tuning step are carried out on manually segmented real and synthetic data, the latter being generated with an MT system trained on the available corpora. Differently, the second fine-tuning step is carried out on a random segmentation of the MuST-C v2 En-De dataset. Its main goal is to reduce the performance drops occurring when a speech translation model trained on manually segmented data (i.e. an ideal, sentence-like segmentation) is evaluated on automatically segmented audio (i.e. actual, more realistic testing conditions). For the same purpose, a custom hybrid segmentation procedure that accounts for both audio content (pauses) and for the length of the produced segments is applied to the test data before passing them to the system. At inference time, we compared this procedure with a baseline segmentation method based on Voice Activity Detection (VAD). Our results indicate the effectiveness of the proposed hybrid approach, shown by a reduction of the gap with manual segmentation from 8.3 to 1.4 BLEU points.	翻訳日:2021-06-25 14:59:18 公開日:2021-06-23
# フロリダ野生生物カメラトラップデータセット Florida Wildlife Camera Trap Dataset ( http://arxiv.org/abs/2106.12628v1 ) ライセンス: Link先を確認	Crystal Gagne, Jyoti Kini, Daniel Smith, Mubarak Shah	(参考訳) トレイルカメラの画像は、保全と生態研究のために生物学者の間で人気が高まっている。カメラトラップの操作に必要な人間の干渉は最小限であるため、偏りのない種の活動を捉えることができる。人間と野生生物の相互作用、様々な種の移動パターン、絶滅危惧個体群の絶滅リスクなどに関するいくつかの研究は、豊富なデータの不足と、トレイルカメラ画像に手動で注釈を付ける作業に時間がかかることによって制限されている。フロリダ州南西部の2つの異なる場所で収集された、難易度の高い野生生物カメラトラップ分類データセットを紹介する。このデータセットは104,495枚の画像からなり、視覚的に類似した種、変化する照明条件、偏ったクラス分布を特徴とし、絶滅危惧種(フロリダパンサー)のサンプルも含む。ResNet-50アーキテクチャによる実験的評価は、この画像分類ベースのデータセットが野生生物の統計モデリングをさらに前進させうることを示している。私たちはデータセットを公開する予定である。 Trail camera imagery has increasingly gained popularity amongst biologists for conservation and ecological research. Minimal human interference required to operate camera traps allows capturing unbiased species activities. Several studies - based on human and wildlife interactions, migratory patterns of various species, risk of extinction in endangered populations - are limited by the lack of rich data and the time-consuming nature of manually annotating trail camera imagery. We introduce a challenging wildlife camera trap classification dataset collected from two different locations in Southwestern Florida, consisting of 104,495 images featuring visually similar species, varying illumination conditions, skewed class distribution, and including samples of endangered species, i.e. Florida panthers. Experimental evaluations with ResNet-50 architecture indicate that this image classification-based dataset can further push the advancements in wildlife statistical modeling. We will make the dataset publicly available.	翻訳日:2021-06-25 14:58:05 公開日:2021-06-23
# 視覚的な場所認識を簡単または困難にするものは何か? What makes visual place recognition easy or hard? ( http://arxiv.org/abs/2106.12671v1 ) ライセンス: Link先を確認	Stefan Schubert and Peer Neubert	(参考訳) 視覚的位置認識は移動ロボットの自己位置推定の基本的な機能である。これは、物理的世界で動作する物理エージェントという実践的な文脈に画像検索を位置づけるものである。これは活発な研究分野であり、多くの異なるアプローチが提案され、多くの異なる実験で評価されている。本稿では、この実践的文脈と個々の設計判断のばらつきにより、場所認識実験は異なる論文間でほとんど比較できず、ある実験から別の実験へと変化しうる様々な特性が存在すると論じる。このような特性の広範なリストを提供し、それらを用いて場所認識実験をより簡単に、あるいはより困難に設定する方法の例を示す。本研究は,(1)特定の課題の特質に適した場所認識アプローチを選択したい人,(2)オープンな研究課題を探求し,特に困難な事例に関心を持つ研究者,(3)再現可能な論文を作成したい著者,(4)レビュー中の論文の潜在的な問題を識別するタスクを持つレビュアーなど,様々な関係者にとって興味深いものである。 Visual place recognition is a fundamental capability for the localization of mobile robots. It places image retrieval in the practical context of physical agents operating in a physical world. It is an active field of research and many different approaches have been proposed and evaluated in many different experiments. In the following, we argue that due to variations of this practical context and individual design decisions, place recognition experiments are barely comparable across different papers and that there is a variety of properties that can change from one experiment to another. We provide an extensive list of such properties and give examples how they can be used to setup a place recognition experiment easier or harder. This might be interesting for different involved parties: (1) people who just want to select a place recognition approach that is suitable for the properties of their particular task at hand, (2) researchers that look for open research questions and are interested in particularly difficult instances, (3) authors that want to create reproducible papers on this topic, and (4) also reviewers that have the task to identify potential problems in papers under review.	翻訳日:2021-06-25 14:57:49 公開日:2021-06-23
# 畳み込みニューラルネットワークによる条件付き変形可能画像登録 Conditional Deformable Image Registration with Convolutional Neural Network ( http://arxiv.org/abs/2106.12673v1 ) ライセンス: Link先を確認	Tony C. W. Mok and Albert C. S. Chung	(参考訳) 近年のディープラーニングに基づく手法は、変形可能な画像登録において有望な結果と実行時間上の利点を示している。しかし、ハイパーパラメータの効果を分析し、最適な正則化パラメータを探索することは、深層学習に基づく手法ではコストが高すぎることが分かっている。これは、異なるハイパーパラメータ値を持つ相当数の個別モデルをトレーニングする必要があるためである。本稿では,深層学習による変形可能な画像登録のための条件付き画像登録手法と新しい自己教師付き学習パラダイムを提案する。正則化ハイパーパラメータと相関する条件特徴を学習することにより、任意のハイパーパラメータに対する最適解を単一の深層畳み込みニューラルネットワークで捉えられることを示す。さらに、結果として得られる変形場の滑らかさは、推論時に任意の強さの滑らかさ正則化で操作できる。大規模脳MRIデータセットでの広範な実験により,提案手法は実行時間上の利点や登録精度を犠牲にすることなく,変形場の滑らかさを精密に制御できることを示した。 Recent deep learning-based methods have shown promising results and runtime advantages in deformable image registration. However, analyzing the effects of hyperparameters and searching for optimal regularization parameters prove to be too prohibitive in deep learning-based methods. This is because it involves training a substantial number of separate models with distinct hyperparameter values. In this paper, we propose a conditional image registration method and a new self-supervised learning paradigm for deep deformable image registration. By learning the conditional features that correlated with the regularization hyperparameter, we demonstrate that optimal solutions with arbitrary hyperparameters can be captured by a single deep convolutional neural network. In addition, the smoothness of the resulting deformation field can be manipulated with arbitrary strength of smoothness regularization during inference. Extensive experiments on a large-scale brain MRI dataset show that our proposed method enables the precise control of the smoothness of the deformation field without sacrificing the runtime advantage or registration accuracy.	翻訳日:2021-06-25 14:57:32 公開日:2021-06-23
# 不可逆過程の予測のための構造保存ブラケットの機械学習 Machine learning structure preserving brackets for forecasting irreversible processes ( http://arxiv.org/abs/2106.12619v1 ) ライセンス: Link先を確認	Kookjin Lee and Nathaniel A. Trask and Panos Stinis	(参考訳) 時系列データの予測には、予測的な外挿を得るために帰納バイアスを課す必要があり、最近の研究では、可逆力学系の構造を保存するためにハミルトニアン/ラグランジアン形式が課されている。本稿では,モデル形式が事前に未知である不可逆力学の学習に適した、メトリプレクティック(metriplectic)力学系の散逸ブラケットの新しいパラメータ化を提案する。この手法は、エネルギーとエントロピーについて、それぞれ保存および非減少が保証された一般化カシミールを学習する。さらに, 熱雑音が加わった場合には、揺動散逸定理の厳密な保存を保証し, 熱力学的整合性を確保する。散逸系のベンチマークを提供し、学習したダイナミクスが"ブラックボックス"やペナルティベースのアプローチよりも頑健で、よりよく一般化することを示す。 Forecasting of time-series data requires imposition of inductive biases to obtain predictive extrapolation, and recent works have imposed Hamiltonian/Lagrangian form to preserve structure for systems with reversible dynamics. In this work we present a novel parameterization of dissipative brackets from metriplectic dynamical systems appropriate for learning irreversible dynamics with unknown a priori model form. The process learns generalized Casimirs for energy and entropy guaranteed to be conserved and nondecreasing, respectively. Furthermore, for the case of added thermal noise, we guarantee exact preservation of a fluctuation-dissipation theorem, ensuring thermodynamic consistency. We provide benchmarks for dissipative systems demonstrating learned dynamics are more robust and generalize better than either "black-box" or penalty-based approaches.	翻訳日:2021-06-25 14:55:05 公開日:2021-06-23
# 協調フィルタ型推薦システムにおけるステレオタイプ問題 The Stereotyping Problem in Collaboratively Filtered Recommender Systems ( http://arxiv.org/abs/2106.12622v1 ) ライセンス: Link先を確認	Wenshuo Guo, Karl Krauth, Michael I. Jordan, Nikhil Garg	(参考訳) 推薦システム、特に行列分解に基づく協調フィルタリングアルゴリズムは、オンライン情報へのアクセスを仲介する上で重要な役割を果たす。このようなアルゴリズムが特定のステレオタイプを誘導することを示している: アイテムの \textit{set} に対する嗜好が一般ユーザ集団で反相関である場合、それらのアイテムはユーザの好みや評価履歴に関係なく、ユーザと一緒に推奨されない可能性がある。まず,一組のアイテムがユーザによって共同でアクセス可能な範囲を計測する「textit{joint accessibility}」という概念を導入する。次に,標準因子化に基づく協調フィルタリングの枠組みに基づく協調的アクセシビリティを研究し,協調的アクセシビリティに違反した場合の理論的必要十分条件を提供する。さらに,ユーザが単一の特徴ベクトルで表される場合,これらの条件が容易に破られることを示す。共同アクセシビリティを向上させるために,マルチベクタ表現を用いて各ユーザの多様な利害関係を捉えるための代替的なモデリング修正を提案する。本研究では,実データとシミュレーションデータについて広範な実験を行い,標準単一ベクトル行列分解モデルを用いてステレオタイプ問題を示す。 Recommender systems -- and especially matrix factorization-based collaborative filtering algorithms -- play a crucial role in mediating our access to online information. We show that such algorithms induce a particular kind of stereotyping: if preferences for a \textit{set} of items are anti-correlated in the general user population, then those items may not be recommended together to a user, regardless of that user's preferences and ratings history. First, we introduce a notion of \textit{joint accessibility}, which measures the extent to which a set of items can jointly be accessed by users. We then study joint accessibility under the standard factorization-based collaborative filtering framework, and provide theoretical necessary and sufficient conditions when joint accessibility is violated. Moreover, we show that these conditions can easily be violated when the users are represented by a single feature vector. To improve joint accessibility, we further propose an alternative modelling fix, which is designed to capture the diverse multiple interests of each user using a multi-vector representation. 
We conduct extensive experiments on real and simulated datasets, demonstrating the stereotyping problem with standard single-vector matrix factorization models.	翻訳日:2021-06-25 14:54:51 公開日:2021-06-23
# 製品探索における意味マッチングのためのエクストリームマルチラベル学習 Extreme Multi-label Learning for Semantic Matching in Product Search ( http://arxiv.org/abs/2106.12657v1 ) ライセンス: Link先を確認	Wei-Cheng Chang, Daniel Jiang, Hsiang-Fu Yu, Choon-Hui Teo, Jiong Zhang, Kai Zhong, Kedarnath Kolluri, Qie Hu, Nikhil Shandilya, Vyacheslav Ievgrafov, Japinder Singh, Inderjit S. Dhillon	(参考訳) 製品検索におけるセマンティックマッチングの問題について考察する。顧客のクエリが与えられたとき、1億以上の規模を持つ巨大なカタログから意味的に関連するすべての商品を検索する。カタログ空間が大きく、リアルタイムのレイテンシ制約もあるため、セマンティックマッチングアルゴリズムには高いリコールだけでなく低レイテンシも求められる。従来の語彙マッチングアプローチ(例えばOkapi-BM25)は、転置インデックスを利用して高速な推論を実現するが、クエリと製品の間の行動シグナルを捉えられない。対照的に、埋め込みベースのモデルは顧客の行動データから意味表現を学習するが、レイテンシの制約により浅いニューラルエンコーダを使わざるを得ず、性能が制限されることが多い。セマンティック製品検索は、顧客クエリを入力インスタンス、製品を出力ラベルとする、極端なマルチラベル分類(XMC)の問題と見なすことができる。本稿では,推論時間の計算量が製品数に対して対数的である木ベースのXMCモデルを用いて,セマンティック製品検索を改善することを目指す。高速なリアルタイム推論のために、n-gram特徴を持つ階層線形モデルを考える。定量的には、本手法は1クエリあたり1.25ミリ秒という低レイテンシを維持しつつ、競合する埋め込みベースのDSSMモデルに対してRecall@100を65%改善した(60.9%対36.8%)。我々のモデルは、さまざまなしきい値での重みプルーニングに対して頑健であり、オンラインデプロイメントの異なるシステム要件を柔軟に満たすことができる。定性的には,本手法は既存の製品検索システムを補完する製品を検索でき,マッチセットに多様性を加えることができる。 We consider the problem of semantic matching in product search: given a customer query, retrieve all semantically related products from a huge catalog of size 100 million, or more. Because of large catalog spaces and real-time latency constraints, semantic matching algorithms not only desire high recall but also need to have low latency. Conventional lexical matching approaches (e.g., Okapi-BM25) exploit inverted indices to achieve fast inference time, but fail to capture behavioral signals between queries and products. In contrast, embedding-based models learn semantic representations from customer behavior data, but the performance is often limited by shallow neural encoders due to latency constraints. Semantic product search can be viewed as an eXtreme Multi-label Classification (XMC) problem, where customer queries are input instances and products are output labels. 
In this paper, we aim to improve semantic product search by using tree-based XMC models where inference time complexity is logarithmic in the number of products. We consider hierarchical linear models with n-gram features for fast real-time inference. Quantitatively, our method maintains a low latency of 1.25 milliseconds per query and achieves a 65% improvement of Recall@100 (60.9% v.s. 36.8%) over a competing embedding-based DSSM model. Our model is robust to weight pruning with varying thresholds, which can flexibly meet different system requirements for online deployments. Qualitatively, our method can retrieve products that are complementary to existing product search system and add diversity to the match set.	翻訳日:2021-06-25 14:54:30 公開日:2021-06-23
# 最適化における現代的な技術を理解する:Frank-Wolfe、NesterovのMomentum、PolyakのMomentum Understanding Modern Techniques in Optimization: Frank-Wolfe, Nesterov's Momentum, and Polyak's Momentum ( http://arxiv.org/abs/2106.12923v1 ) ライセンス: Link先を確認	Jun-Kun Wang	(参考訳) この論文研究の第1部では,凸最適化のための反復アルゴリズムの構築と解析のためのレシピとして機能するモジュラーフレームワークを開発した。具体的には,2プレイヤーゼロサムゲームを反復的に行うことで最適化を行う。フランク・ウルフやネステロフの加速法を含む既存の多くの最適化アルゴリズムは、2人のオンライン学習者を互いに適切な戦略でピットすることでゲームから復元することができる。さらに、ゲーム中のプレイヤーの重み付けされた平均的後悔の和は収束率を示している。その結果,本手法はこれらのアルゴリズムに簡単な代替的証明を与える。さらに,ゲームプレイを反復的に行うことによる最適化のアプローチが,いくつかの制約セットに対してフランク・ウルフ風のアルゴリズムを新たに3つ導入すること,さらに,我々のフレームワークが本当に汎用的でモジュール的で使いやすくなっていることを示す。第2部では,古典的強二次凸問題の解法,神経接核系下での広いreluネットワークの訓練,直交初期化を用いた深い線形ネットワークの訓練など,ある問題に対するpolyakの運動量による証明可能な加速度のモジュラー解析を開発した。我々はメタ定理を開発し、これらの問題にポリアックの運動量を適用するとき、誘導力学はメタ定理を直接適用できる形式を示すことを示した。論文の最後の部分では、ポリアックの運動量の使用の別の利点を示し、滑らかな非凸最適化において、サドルポイントの高速脱出を容易にする。この結果、第2部と共に、現代の非凸最適化とディープラーニングにおけるPolyakの勢いに新たな光を当てた。 In the first part of this dissertation research, we develop a modular framework that can serve as a recipe for constructing and analyzing iterative algorithms for convex optimization. Specifically, our work casts optimization as iteratively playing a two-player zero-sum game. Many existing optimization algorithms including Frank-Wolfe and Nesterov's acceleration methods can be recovered from the game by pitting two online learners with appropriate strategies against each other. Furthermore, the sum of the weighted average regrets of the players in the game implies the convergence rate. As a result, our approach provides simple alternative proofs to these algorithms. Moreover, we demonstrate that our approach of optimization as iteratively playing a game leads to three new fast Frank-Wolfe-like algorithms for some constraint sets, which further shows that our framework is indeed generic, modular, and easy-to-use. 
In the second part, we develop a modular analysis of provable acceleration via Polyak's momentum for certain problems, which include solving the classical strongly quadratic convex problems, training a wide ReLU network under the neural tangent kernel regime, and training a deep linear network with an orthogonal initialization. We develop a meta theorem and show that when applying Polyak's momentum for these problems, the induced dynamics exhibit a form where we can directly apply our meta theorem. In the last part of the dissertation, we show another advantage of the use of Polyak's momentum -- it facilitates fast saddle point escape in smooth non-convex optimization. This result, together with those of the second part, sheds new light on Polyak's momentum in modern non-convex optimization and deep learning.	翻訳日:2021-06-25 14:53:21 公開日:2021-06-23
# テキストデータを用いた株式市場分析:レビュー Stock Market Analysis with Text Data: A Review ( http://arxiv.org/abs/2106.12985v1 ) ライセンス: Link先を確認	Kamaladdin Fataliyev, Aneesh Chivukula, Mukesh Prasad and Wei Liu	(参考訳) 株式市場の動きは、ニュース記事、会社の報告、ソーシャルメディアの議論を通じて共有される公開情報やプライベート情報の影響を受けている。こうした膨大なデータソースを分析することで、市場参加者は利益を上げるための優位性を得られる。しかし、文献における研究の大部分は、構造化されていない膨大なテキストデータの解析には不十分な従来のアプローチに基づいている。本研究では,テキストベースの株式市場分析に関する膨大な既存文献を概観する。入力データ型を示し、主要なテキストデータソースとそのバリエーションをカバーする。次に、特徴表現技法を示す。続いて、分析手法を概説し、主要な株式市場予測モデルの分類体系を作成する。特に、分類体系の各カテゴリにおける代表的な研究を取り上げ、それぞれの貢献を分析する。最後に,未解決の問題に関する知見を示し,今後の課題を提案する。本研究の目的は,主要な株式市場分析モデル,金融市場予測のためのテキスト表現技術,既存手法の欠点を調査し,今後の研究の有望な方向性を提案することである。 Stock market movements are influenced by public and private information shared through news articles, company reports, and social media discussions. Analyzing these vast sources of data can give market participants an edge to make profit. However, the majority of the studies in the literature are based on traditional approaches that come short in analyzing unstructured, vast textual data. In this study, we provide a review on the immense amount of existing literature of text-based stock market analysis. We present input data types and cover main textual data sources and variations. Feature representation techniques are then presented. Then, we cover the analysis techniques and create a taxonomy of the main stock market forecast models. Importantly, we discuss representative work in each category of the taxonomy, analyzing their respective contributions. Finally, this paper shows the findings on unaddressed open problems and gives suggestions for future work. The aim of this study is to survey the main stock market analysis models, text representation techniques for financial market prediction, shortcomings of existing techniques, and propose promising directions for future research.	翻訳日:2021-06-25 14:52:44 公開日:2021-06-23
# ニューラル事後推定を用いたリアルタイム重力波科学 Real-time gravitational-wave science with neural posterior estimation ( http://arxiv.org/abs/2106.12594v1 ) ライセンス: Link先を確認	Maximilian Dax, Stephen R. Green, Jonathan Gair, Jakob H. Macke, Alessandra Buonanno, Bernhard Sch\"olkopf	(参考訳) 深層学習による高速な重力波パラメータ推定について,前例のない精度を示す。ニューラルネットワークをベイズ事後分布のサロゲートとして用いて,最初のLIGO-Virgo Gravitational-Wave Transient Catalogの8つの重力波イベントを解析し,標準的な推論コードと定量的に非常によく一致する一方で,推論時間がO(day)からイベントあたり1分に短縮されることを確認した。ネットワークは、イベント近傍の検出器ノイズ特性の推定を含むシミュレーションデータを用いて訓練する。これにより、数百万のニューラルネットワークパラメータ内に信号モデルとノイズモデルが符号化され、イベントごとのノイズの非定常性を考慮しつつ、訓練分布と整合する任意の観測データに対する推論が可能になる。"DINGO"と呼ばれる我々のアルゴリズムは、検出された重力波イベントの物理パラメータの高速かつ正確な推論の新しい標準を確立し、精度を犠牲にすることなくリアルタイムのデータ解析を可能にするはずである。 We demonstrate unprecedented accuracy for rapid gravitational-wave parameter estimation with deep learning. Using neural networks as surrogates for Bayesian posterior distributions, we analyze eight gravitational-wave events from the first LIGO-Virgo Gravitational-Wave Transient Catalog and find very close quantitative agreement with standard inference codes, but with inference times reduced from O(day) to a minute per event. Our networks are trained using simulated data, including an estimate of the detector-noise characteristics near the event. This encodes the signal and noise models within millions of neural-network parameters, and enables inference for any observed data consistent with the training distribution, accounting for noise nonstationarity from event to event. Our algorithm -- called "DINGO" -- sets a new standard in fast-and-accurate inference of physical parameters of detected gravitational-wave events, which should enable real-time data analysis without sacrificing accuracy.	翻訳日:2021-06-25 14:50:05 公開日:2021-06-23
# 表現の組み合わせによるランキングのための意味的に類似したクエリの活用 Leveraging semantically similar queries for ranking via combining representations ( http://arxiv.org/abs/2106.12621v1 ) ライセンス: Link先を確認	Hayden S. Helm and Marah Abdin and Benjamin D. Pedigo and Shweti Mahajan and Vince Lyzinski and Youngser Park and Amitabh Basu and Piali Choudhury and Christopher M. White and Weiwei Yang and Carey E. Priebe	(参考訳) 現代のランキング問題では、ランク付け対象の項目について、互いに異なる複数の表現がしばしば利用できる。したがって、これらの表現を組み合わせてランキングを改善しようとするのは理にかなっている。実際、表現を組み合わせることによるランキング学習は、特定のクエリのランキング関数を学習する上で、原理的にも実用的にも妥当である。しかし、データが極めて少ない設定では、特定のクエリで利用可能なラベル付きデータの量が少ないために、ばらつきが大きく非効率なランキング関数につながる可能性がある。データ量の少なさの影響を軽減する一つの方法は、意味的に類似したクエリからの情報を活用することである。実際、シミュレーション設定や実データ例で示すように、意味的に類似したクエリが利用可能であれば、特定のクエリに対してランク付けする際にそれらを有効に利用できる。我々は,この現象をバイアス-分散トレードオフの文脈で記述・検討し,Bingナビゲーショングラフとショウジョウバエ幼虫コネクトームというデータの少ない設定に適用する。 In modern ranking problems, different and disparate representations of the items to be ranked are often available. It is sensible, then, to try to combine these representations to improve ranking. Indeed, learning to rank via combining representations is both principled and practical for learning a ranking function for a particular query. In extremely data-scarce settings, however, the amount of labeled data available for a particular query can lead to a highly variable and ineffective ranking function. One way to mitigate the effect of the small amount of data is to leverage information from semantically similar queries. Indeed, as we demonstrate in simulation settings and real data examples, when semantically similar queries are available it is possible to gainfully use them when ranking with respect to a particular query. We describe and explore this phenomenon in the context of the bias-variance trade off and apply it to the data-scarce settings of a Bing navigational graph and the Drosophila larva connectome.	翻訳日:2021-06-25 14:49:47 公開日:2021-06-23
# 低複雑さDFT空間サンプリングに基づくロバスト適応ビームフォーミングの検討 Study of Robust Adaptive Beamforming Based on Low-Complexity DFT Spatial Sampling ( http://arxiv.org/abs/2106.12663v1 ) ライセンス: Link先を確認	Saeed Mohammadzadeh, Vitor H. Nascimento, Rodrigo C. de Lamare and Osman Kukrer	(参考訳) 本稿では,ランダム過程の自己相関列(ACS)を一組の計測データから再構成するという考え方に基づいて,適応ビームフォーミングのための新しいロバストなアルゴリズムを提案する。ACSは、サンプル共分散行列(SCM)を対角線に沿って平均化した後、その第1列と第1行から得られる。次に、離散フーリエ変換(DFT)を用いて相関列のパワースペクトルを推定する。ノイズプラス干渉領域内の角度に対応するDFT係数を用いてノイズプラス干渉共分散行列(NPICM)を再構成し、所望信号共分散行列(DSCM)は、SCMからノイズプラス干渉成分を特定して除去することで推定する。特に、推定された受信信号の空間パワースペクトルを利用して、ノイズプラス干渉に対応する相関列を算出し、その中でノイズプラス干渉の支配的なDFT係数を捉える。提案する適応ビームフォーミングの重要な利点は、わずかな事前情報しか必要としないことである。具体的には、アレイ形状と干渉が位置する角度セクタに関する大まかな知識だけが必要である。シミュレーションの結果,提案手法は従来の再構成ベースのビームフォーマと比較して,非常に広い範囲の入力信号対雑音比にわたり,複数のミスマッチがある場合に全体としてより良い性能を達成できることが示された。 In this paper, a novel and robust algorithm is proposed for adaptive beamforming based on the idea of reconstructing the autocorrelation sequence (ACS) of a random process from a set of measured data. This is obtained from the first column and the first row of the sample covariance matrix (SCM) after averaging along its diagonals. Then, the power spectrum of the correlation sequence is estimated using the discrete Fourier transform (DFT). The DFT coefficients corresponding to the angles within the noise-plus-interference region are used to reconstruct the noise-plus-interference covariance matrix (NPICM), while the desired signal covariance matrix (DSCM) is estimated by identifying and removing the noise-plus-interference component from the SCM. In particular, the spatial power spectrum of the estimated received signal is utilized to compute the correlation sequence corresponding to the noise-plus-interference in which the dominant DFT coefficient of the noise-plus-interference is captured. A key advantage of the proposed adaptive beamforming is that only little prior information is required. 
Specifically, an imprecise knowledge of the array geometry and of the angular sectors in which the interferences are located is needed. Simulation results demonstrate that compared with previous reconstruction-based beamformers, the proposed approach can achieve better overall performance in the case of multiple mismatches over a very large range of input signal-to-noise ratios.	翻訳日:2021-06-25 14:49:31 公開日:2021-06-23
# (参考訳) GKSD(Generalized Kernel Stein Discrepancy) : 非パラメトリックグッドネス・オブ・フィットテストのための統一的アプローチ Generalised Kernel Stein Discrepancy(GKSD): A Unifying Approach for Non-parametric Goodness-of-fit Testing ( http://arxiv.org/abs/2106.12105v1 ) ライセンス: CC BY 4.0	Wenkai Xu	(参考訳) カーネルスタイン差異(KSD)に基づく非パラメトリックな適合度検定手順は、様々なシナリオにおいて一般的な非正規化分布を検証するための有望なアプローチである。既存の研究は、検定性能を高めるための最適なカーネル選択の研究に重点を置いてきた。しかし、Stein作用素は一般に一意ではなく、Stein作用素の選択の違いも検定性能にかなりの影響を与えうる。そこで本研究では,KSDに基づく適合度検定において異なるStein作用素を理論的に比較・解釈するための統一的枠組みとして,一般化カーネルスタイン差異(GKSD)を提案する。提案するGKSDフレームワークが既存のStein作用素とそれに対応する検定をどのように一般化するかを明示的に導出する。さらに、GKSDフレームワークが、切断分布や組成データなどの複雑な新しいデータシナリオに対するカーネルベースの非パラメトリック適合度検定を開発するための指針として使えることを示す。実験結果から,提案する検定は第一種の過誤を適切に制御し,最大平均差異(MMD)に基づく検定を含む既存手法よりも高い検出力を達成することが示された。 Non-parametric goodness-of-fit testing procedures based on kernel Stein discrepancies (KSD) are promising approaches to validate general unnormalised distributions in various scenarios. Existing works have focused on studying optimal kernel choices to boost test performances. However, the Stein operators are generally non-unique, while different choices of Stein operators can also have considerable effect on the test performances. In this work, we propose a unifying framework, the generalised kernel Stein discrepancy (GKSD), to theoretically compare and interpret different Stein operators in performing the KSD-based goodness-of-fit tests. We derive explicitly that how the proposed GKSD framework generalises existing Stein operators and their corresponding tests. In addition, we show that GKSD framework can be used as a guide to develop kernel-based non-parametric goodness-of-fit tests for complex new data scenarios, e.g. truncated distributions or compositional data. Experimental results demonstrate that the proposed tests control type-I error well and achieve higher test power than existing approaches, including the test based on maximum-mean-discrepancy (MMD).	翻訳日:2021-06-25 01:28:52 公開日:2021-06-23
# (参考訳) 分布シフト下における準最適線形回帰 Near-Optimal Linear Regression under Distribution Shift ( http://arxiv.org/abs/2106.12108v1 ) ライセンス: CC0 1.0	Qi Lei, Wei Hu, Jason D. Lee	(参考訳) 転移学習は、ソースドメインからは十分なデータが得られる一方で、ターゲットドメインのラベル付きデータが不足している場合に不可欠である。分布シフト下の線形回帰問題に対して、ミニマックス線形リスクを達成する推定器を開発する。我々のアルゴリズムは,共変量シフトやモデルシフトなど,さまざまな転移学習の設定をカバーする。また、データが線形モデルまたは一般の非線形モデルから生成される場合についても検討する。様々なソース/ターゲット分布に対して、線形ミニマックス推定器は、非線形推定器を含めたミニマックスリスクの絶対定数倍以内にあることを示す。 Transfer learning is essential when sufficient data comes from the source domain, with scarce labeled data from the target domain. We develop estimators that achieve minimax linear risk for linear regression problems under distribution shift. Our algorithms cover different transfer learning settings including covariate shift and model shift. We also consider when data are generated from either linear or general nonlinear models. We show that linear minimax estimators are within an absolute constant of the minimax risk even among nonlinear estimators for various source/target distributions.	翻訳日:2021-06-25 01:01:16 公開日:2021-06-23
# (参考訳) 因果効果の境界と高次元データへの応用 Bounds on Causal Effects and Application to High Dimensional Data ( http://arxiv.org/abs/2106.12121v1 ) ライセンス: CC BY 4.0	Ang Li, Judea Pearl	(参考訳) 本稿では,バックドア条件やフロントドア基準の調整変数が部分的に観察された場合の因果効果を推定する問題に対処する。このようなシナリオでは、2つの非線形最適化問題を解くことによって因果効果の境界を導出し、境界が十分であることを示す。この最適化手法を用いて,推定パワーに対するバイアスをトレードオフできる次元性低減のための枠組みを提案し,その性能をシミュレーションにより実証する。 This paper addresses the problem of estimating causal effects when adjustment variables in the back-door or front-door criterion are partially observed. For such scenarios, we derive bounds on the causal effects by solving two non-linear optimization problems, and demonstrate that the bounds are sufficient. Using this optimization method, we propose a framework for dimensionality reduction that allows one to trade bias for estimation power, and demonstrate its performance using simulation studies.	翻訳日:2021-06-25 01:00:23 公開日:2021-06-23
# (参考訳) ソースフリードメイン適応意味セグメンテーションにおける暗黙的擬似ラベル整流法に対する負学習の活用 Exploiting Negative Learning for Implicit Pseudo Label Rectification in Source-Free Domain Adaptive Semantic Segmentation ( http://arxiv.org/abs/2106.12123v1 ) ライセンス: CC BY 4.0	Xin Luo, Wei Chen, Yusong Tan, Chen Li, Yulin He, Xiaogang Jia	(参考訳) ソースデータがない場合には、十分に訓練されたソースモデルに格納された知識を非注釈のターゲットドメインに転送することが望ましい。しかし、ソースフリードメイン適応(SFDA)のための最先端の手法には厳しい制約がある。1) ソースモデルの内部仕様へのアクセスが必須であり、2) 自己学習中の擬似ラベルがクリーンでなければならないため、セマンティックセグメンテーションに依存する重要なタスクの信頼性が損なわれる。これらの問題点に対し、本研究では擬似ラベル整流を伴うセマンティックセグメンテーションのドメイン適応手法(\textit{PR-SFDA})を開発する。この手法は2つのフェーズで動作する。1) \textit{Confidence-regularized unsupervised learning}(信頼度正則化教師なし学習): 最大二乗損失を用いてターゲットモデルを正則化し、予測の信頼度を確保する。2) \textit{Noise-aware pseudo label learning}(ノイズを考慮した擬似ラベル学習): 負学習により訓練時にノイズを含む擬似ラベルへの耐性を持たせ、同時に正学習により高速な収束を実現する。ドメイン適応型セマンティックセグメンテーションのベンチマークである \textit{GTA5 $\to$ Cityscapes} で大規模な実験を行った。全体として、\textit{PR-SFDA} は 49.0 mIoU の性能を達成しており、これは最先端手法の性能に非常に近い。後者はソースモデルの内部仕様へのアクセスを必要とするのに対し、\textit{PR-SFDA} はそれを一切必要としない点に注意されたい。 It is desirable to transfer the knowledge stored in a well-trained source model onto non-annotated target domain in the absence of source data. However, state-of-the-art methods for source free domain adaptation (SFDA) are subject to strict limits: 1) access to internal specifications of source models is a must; and 2) pseudo labels should be clean during self-training, making critical tasks relying on semantic segmentation unreliable. 
Aiming at these pitfalls, this study develops a domain adaptive solution to semantic segmentation with pseudo label rectification (namely \textit{PR-SFDA}), which operates in two phases: 1) \textit{Confidence-regularized unsupervised learning}: Maximum squares loss applies to regularize the target model to ensure the confidence in prediction; and 2) \textit{Noise-aware pseudo label learning}: Negative learning enables tolerance to noisy pseudo labels in training, meanwhile positive learning achieves fast convergence. Extensive experiments have been performed on domain adaptive semantic segmentation benchmark, \textit{GTA5 $\to$ Cityscapes}. Overall, \textit{PR-SFDA} achieves a performance of 49.0 mIoU, which is very close to that of the state-of-the-art counterparts. Note that the latter demand accesses to the source model's internal specifications, whereas the \textit{PR-SFDA} solution needs none as a sharp contrast.	翻訳日:2021-06-25 00:38:43 公開日:2021-06-23
# (参考訳) 複数ソースによるセキュアなドメイン適応 Secure Domain Adaptation with Multiple Sources ( http://arxiv.org/abs/2106.12124v1 ) ライセンス: CC BY 4.0	Serban Stan, Mohammad Rostami	(参考訳) マルチソース教師なしドメイン適応(MUDA)は、最近検討されている学習フレームワークであり、アノテーション付きデータを持つ複数のソースドメインから知識を転移することで、ターゲットドメインにおけるラベル付きデータの不足に対処することを目的としている。ソースデータは分散しているため、ソースドメインのデータのプライバシが当然の懸念となりうる。 MUDAのプライバシー問題に対処するために、埋め込み空間におけるドメインアライメントという考え方を利用する。本手法は,ドメイン間でデータサンプルをやり取りすることなく,内部で学習した分布を介して間接的にソース分布とターゲット分布を整列させる。提案手法を理論的に正当化し,広範な実験により,提案手法が有効であり既存の手法と比べて遜色ないことを示す。 Multi-source unsupervised domain adaptation (MUDA) is a recently explored learning framework, where the goal is to address the challenge of labeled data scarcity in a target domain via transferring knowledge from multiple source domains with annotated data. Since the source data is distributed, the privacy of source domains' data can be a natural concern. We benefit from the idea of domain alignment in an embedding space to address the privacy concern for MUDA. Our method is based on aligning the sources and target distributions indirectly via internally learned distributions, without communicating data samples between domains. We justify our approach theoretically and perform extensive experiments to demonstrate that our method is effective and compares favorably against existing methods.	翻訳日:2021-06-25 00:26:56 公開日:2021-06-23
# (参考訳) NAX:memristive Xbarベースのコンピューティングシステムのためのニューラルネットワークとハードウェアアーキテクチャの共同設計 NAX: Co-Designing Neural Network and Hardware Architecture for Memristive Xbar based Computing Systems ( http://arxiv.org/abs/2106.12125v1 ) ライセンス: CC BY 4.0	Shubham Negi, Indranil Chakraborty, Aayush Ankit, Kaushik Roy	(参考訳) Memristive Crossbar Arrays (MCAs) を用いたインメモリコンピューティング(IMC)ハードウェアは、von-Neumannアーキテクチャに関連する"メモリウォール"問題を緩和するため、ディープニューラルネットワーク(DNN)の高速化手段として人気を集めている。このようなハードウェアにマッピングされたDNNのハードウェア効率(エネルギー、レイテンシ、面積)とアプリケーション精度(デバイスと回路の非理想性を考慮)は、カーネルサイズや深さなどのネットワークパラメータと、クロスバーサイズなどのハードウェアアーキテクチャパラメータの両方に依存する。しかし、ネットワークパラメータとハードウェアパラメータの共最適化では、様々なクロスバーサイズにマッピングされる異なるカーネルサイズからなる困難な探索空間を扱う必要がある。そこで我々は,ニューラルネットワークとIMCベースのハードウェアアーキテクチャを共同設計する効率的なニューラルアーキテクチャ探索エンジンNAXを提案する。 NAXは前述の探索空間を探索し、各DNN層のカーネルサイズと対応するクロスバーサイズを決定し、ハードウェア効率とアプリケーション精度の最適なトレードオフを実現する。 NAXの結果、得られるネットワークは層ごとに異なるクロスバーサイズを持ち、クロスバーの非理想性を考慮した上で最適なハードウェア効率と精度を達成することが示された。 CIFAR-10 と Tiny ImageNet において,我々のモデルはベースラインの ResNet-20 および ResNet-18 と比較して,精度がそれぞれ0.8%,0.2%高く,EDAP (energy-delay-area product) がそれぞれ17%,4%低い。 In-Memory Computing (IMC) hardware using Memristive Crossbar Arrays (MCAs) are gaining popularity to accelerate Deep Neural Networks (DNNs) since it alleviates the "memory wall" problem associated with von-Neumann architecture. The hardware efficiency (energy, latency and area) as well as application accuracy (considering device and circuit non-idealities) of DNNs mapped to such hardware are co-dependent on network parameters, such as kernel size, depth etc. and hardware architecture parameters such as crossbar size. However, co-optimization of both network and hardware parameters presents a challenging search space comprising of different kernel sizes mapped to varying crossbar sizes. To that effect, we propose NAX -- an efficient neural architecture search engine that co-designs neural network and IMC based hardware architecture. 
NAX explores the aforementioned search space to determine kernel and corresponding crossbar sizes for each DNN layer to achieve optimal tradeoffs between hardware efficiency and application accuracy. Our results from NAX show that the networks have heterogeneous crossbar sizes across different network layers, and achieves optimal hardware efficiency and accuracy considering the non-idealities in crossbars. On CIFAR-10 and Tiny ImageNet, our models achieve 0.8%, 0.2% higher accuracy, and 17%, 4% lower EDAP (energy-delay-area product) compared to a baseline ResNet-20 and ResNet-18 models, respectively.	翻訳日:2021-06-25 00:07:48 公開日:2021-06-23
# (参考訳) patentnet: 大規模不完全なマルチビュー、マルチモーダル、マルチラベル産業製品画像データベース PatentNet: A Large-Scale Incomplete Multiview, Multimodal, Multilabel Industrial Goods Image Database ( http://arxiv.org/abs/2106.12139v1 ) ライセンス: CC BY 4.0	Fangyuan Lei, Da Huang, Jianjian Jiang, Ruijun Ma, Senhong Wang, Jiangzhong Cao, Yusen Lin and Qingyun Dai	(参考訳) ディープラーニング領域では、大規模な画像データセットがオブジェクト認識と検索の成功にブレークスルーをもたらす。今日では、イノベーションの具体例として、産業品の多様性が著しく大きくなり、不完全なマルチビュー、マルチモーダル、マルチラベルが従来のデータセットとは異なる。本稿では,産業製品画像および対応するテキストの多種多様な,正確かつ詳細なアノテーションを備えた産業製品データセットであるPatentNetを紹介する。 patentnetでは、画像とテキストは設計特許から引用される。 6m以上の画像と、専門家が手動でチェックした工業製品の対応するテキストの中で、パテントネットは、以前ベンチマークに使用されていた工業製品データセットよりも多種多様な産業製品画像データベースである。 patentnetは、ロカルノ分類協定に基づいて、何百万もの画像を32のクラスと219のサブクラスに分類する。画像分類,画像検索,不完全なマルチビュークラスタリングに関する広範な実験を通じて,我々の特許ネットワークは,既存の産業画像データセットよりもはるかに多様性があり,複雑で,困難であり,高いポテンシャルを享受できることを実証した。さらに、パテントネットにおける不完全なマルチビュー、マルチモーダル、マルチラベルの特徴は、人工知能コミュニティなどにおいて、別個の機会を提供することができる。 In deep learning area, large-scale image datasets bring a breakthrough in the success of object recognition and retrieval. Nowadays, as the embodiment of innovation, the diversity of the industrial goods is significantly larger, in which the incomplete multiview, multimodal and multilabel are different from the traditional dataset. In this paper, we introduce an industrial goods dataset, namely PatentNet, with numerous highly diverse, accurate and detailed annotations of industrial goods images, and corresponding texts. In PatentNet, the images and texts are sourced from design patent. Within over 6M images and corresponding texts of industrial goods labeled manually checked by professionals, PatentNet is the first ongoing industrial goods image database whose varieties are wider than industrial goods datasets used previously for benchmarking. PatentNet organizes millions of images into 32 classes and 219 subclasses based on the Locarno Classification Agreement. 
Through extensive experiments on image classification, image retrieval and incomplete multiview clustering, we demonstrate that our PatentNet is much more diverse, complex, and challenging, enjoying higher potentials than existing industrial image datasets. Furthermore, the characteristics of incomplete multiview, multimodal and multilabel in PatentNet are able to offer unparalleled opportunities in the artificial intelligence community and beyond.	翻訳日:2021-06-24 23:40:41 公開日:2021-06-23
# (参考訳) 個人別$k$-Clusteringのためのより良いアルゴリズム Better Algorithms for Individually Fair $k$-Clustering ( http://arxiv.org/abs/2106.12150v1 ) ライセンス: CC BY 4.0	Deeparnab Chakrabarty and Maryam Negahbani	(参考訳) データクラスタリングの問題を$\ell_p$-normの目的(例)で研究する。 $k$-Median と $k$-Means) は、個々のフェアネスの文脈における。データセットは$n$ポイントで構成されており、(a)目的が最小化されるような$k$センターを探したいが、(b)各点$v$が最大$r(v)$の範囲内で中心を持つという個々の公正性制約を尊重する一方で、$r(v)$はその$(n/k)$から最寄りの点まで$v$sの距離である。 Jung, Kannan, Lutz [FORC 2020]は、この概念を導入し、$\ell_\infty$または$k$-Centerの目的に対して、証明可能な(近似的な)フェアネスと客観的保証を備えたクラスタリングアルゴリズムを設計した。 MahabadiとVakilian(ICML 2020)はこの問題を再考し、すべての$\ell_p$-normsに対してローカル検索アルゴリズムを提供した。経験上、アルゴリズムはjungなどよりも優れている。アルコスト面では大きな差($k$-Medianと$k$-Meansは$k$-Means)がありますが、フェアネスでは合理的な損失をもたらします。本稿では,Linear Programming (LP) 技術を用いて,理論と実践の両方において,この問題に対するより良いアルゴリズムを得る。我々は、既知のLPラウンドリング技術を変更することで、MV20よりもはるかに優れた目標に対して最悪のケースを保証できることを実証的に証明し、この目標が最適に非常に近いことを実証した。さらに、理論上の公平性保証は理論上mv20と同等であり、経験上、より公平な解が得られる。 lp {\em exactly} を解くことは禁止されるかもしれないが、実際には単純なスパーシフィケーション手法がアルゴリズムの実行時間を大幅に改善することを示している。 We study data clustering problems with $\ell_p$-norm objectives (e.g. $k$-Median and $k$-Means) in the context of individual fairness. The dataset consists of $n$ points, and we want to find $k$ centers such that (a) the objective is minimized, while (b) respecting the individual fairness constraint that every point $v$ has a center within a distance at most $r(v)$, where $r(v)$ is $v$'s distance to its $(n/k)$th nearest point. Jung, Kannan, and Lutz [FORC 2020] introduced this concept and designed a clustering algorithm with provable (approximate) fairness and objective guarantees for the $\ell_\infty$ or $k$-Center objective. Mahabadi and Vakilian [ICML 2020] revisited this problem to give a local-search algorithm for all $\ell_p$-norms. Empirically, their algorithms outperform Jung et. al.'s by a large margin in terms of cost (for $k$-Median and $k$-Means), but they incur a reasonable loss in fairness. 
In this paper, our main contribution is to use Linear Programming (LP) techniques to obtain better algorithms for this problem, both in theory and in practice. We prove that by modifying known LP rounding techniques, one gets a worst-case guarantee on the objective which is much better than in MV20 (Mahabadi and Vakilian, ICML 2020), and empirically, this objective is extremely close to the optimal. Furthermore, our theoretical fairness guarantees are comparable with those of MV20, and empirically, we obtain noticeably fairer solutions. Although solving the LP exactly might be prohibitive, we demonstrate that in practice, a simple sparsification technique drastically improves the run-time of our algorithm.	翻訳日:2021-06-24 23:32:37 公開日:2021-06-23
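The individual fairness constraint described in this abstract can be illustrated with a small sketch. It only computes the radius $r(v)$ and checks whether a given set of centers respects it; it is not the authors' LP-based algorithm. Two points are assumptions not settled by the abstract: each point counts as its own nearest neighbour (distance 0), and $n/k$ is rounded up. The names `fairness_radii`, `is_individually_fair`, and the relaxation factor `alpha` are hypothetical.

```python
import math


def fairness_radii(points, k):
    """Return r(v) for every point v: the distance to its ceil(n/k)-th nearest point.

    Assumption: v itself counts as its first nearest point (distance 0).
    """
    n = len(points)
    m = math.ceil(n / k)
    radii = []
    for v in points:
        # Sorted distances from v to every point in the dataset, including v.
        dists = sorted(math.dist(v, u) for u in points)
        radii.append(dists[m - 1])
    return radii


def is_individually_fair(points, centers, k, alpha=1.0):
    """Check that every point v has a center within alpha * r(v)."""
    radii = fairness_radii(points, k)
    return all(
        min(math.dist(v, c) for c in centers) <= alpha * r
        for v, r in zip(points, radii)
    )


if __name__ == "__main__":
    data = [(0,), (1,), (2,), (10,), (11,), (12,)]
    print(fairness_radii(data, k=2))
    print(is_individually_fair(data, centers=[(1,), (11,)], k=2))
    print(is_individually_fair(data, centers=[(1,), (2,)], k=2))
```

Setting `alpha` above 1 gives the approximate notion of fairness that the cited works provide guarantees for.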
# (参考訳) Atari-2600ベンチマークによる学習ベースポリシとヒューリスティックスを用いた幅ベースルックアヘッド Width-based Lookaheads with Learnt Base Policies and Heuristics Over the Atari-2600 Benchmark ( http://arxiv.org/abs/2106.12151v1 ) ライセンス: CC BY 4.0	Stefan O'Toole, Nir Lipovetzky, Miquel Ramirez, Adrian Pearce	(参考訳) atari-2600ベンチマークを用いて,新たな幅ベースの計画学習アルゴリズムを提案する。提案するアルゴリズムは、以前の幅ベースのプランナーによる設計決定を慎重に分析することから着想を得ている。我々は,Atari-2600ゲームに対して新たなアルゴリズムをベンチマークし,これまで導入した幅ベース計画学習アルゴリズムであるRIW$_C$+CPV,$\pi$-IW(1),$\pi$-IW(1)+,$\pi$-HIW(n, 1)より優れていることを示す。さらに, atari-2600ゲームセットの分類について, その特徴について述べる。このゲームの分析は、導入された幅ベースのアルゴリズムの挙動と性能に関するさらなる洞察を与える。すなわち、大きな分岐因子を持つゲームや、希薄な有意義な報酬を持つゲームの場合、RIW$_C$+CPVは$\pi$-IW, $\pi$-IW(1)+および$\pi$-HIW(n, 1)より優れている。 We propose new width-based planning and learning algorithms applied over the Atari-2600 benchmark. The algorithms presented are inspired from a careful analysis of the design decisions made by previous width-based planners. We benchmark our new algorithms over the Atari-2600 games and show that our best performing algorithm, RIW$_C$+CPV, outperforms previously introduced width-based planning and learning algorithms $\pi$-IW(1), $\pi$-IW(1)+ and $\pi$-HIW(n, 1). Furthermore, we present a taxonomy of the set of Atari-2600 games according to some of their defining characteristics. This analysis of the games provides further insight into the behaviour and performance of the width-based algorithms introduced. Namely, for games with large branching factors, and games with sparse meaningful rewards, RIW$_C$+CPV outperforms $\pi$-IW, $\pi$-IW(1)+ and $\pi$-HIW(n, 1).	翻訳日:2021-06-24 23:12:33 公開日:2021-06-23
# (参考訳) ニューラルファッション画像のキャプション : データ多様性の考慮 Neural Fashion Image Captioning : Accounting for Data Diversity ( http://arxiv.org/abs/2106.12154v1 ) ライセンス: CC BY-SA 4.0	Gilles Hacheme, Noureini Sayouti	(参考訳) 画像キャプションはアプリケーション分野が拡大しており、ファッションも例外ではない。自動アイテム記述を持つことは、何十万もの画像をホストするファッションwebプラットフォームにとって非常に興味深いことです。本論文はファッション画像のキャプションを初めて行う手法の1つである。 InFashAIv1データセットには、約16,000点のアフリカのファッションアイテムイメージとそのタイトル、価格、一般的な説明が含まれている。 InFashAIv1に加えて、よく知られたDeepFashionデータセットも使用しました。キャプションは、CNNエンコーダとRNNデコーダで作られた Show and Tell モデルを使って生成される。両データセットのモデルを共同でトレーニングすることで,アフリカのスタイルのファッションイメージのキャプション品質が向上し,西洋スタイルのデータからの移行学習が示唆された。 InFashAIv1データセットは GitHub ( https://github.com/hgilles06/infashai ) でリリースされ、より多くの多様性を含む作業を促進する。 Image captioning has increasingly large domains of application, and fashion is not an exception. Having automatic item descriptions is of great interest for fashion web platforms, which sometimes host hundreds of thousands of images. This paper is one of the first to tackle image captioning for fashion images. To help address dataset diversity issues, we introduced the InFashAIv1 dataset containing almost 16,000 African fashion item images with their titles, prices and general descriptions. We also used the well-known DeepFashion dataset in addition to InFashAIv1. Captions are generated using the Show and Tell model, made of a CNN encoder and an RNN decoder. We showed that jointly training the model on both datasets improves caption quality for African-style fashion images, suggesting transfer learning from Western-style data. The InFashAIv1 dataset is released on GitHub ( https://github.com/hgilles06/infashai ) to encourage work with more diversity inclusion.	翻訳日:2021-06-24 22:55:52 公開日:2021-06-23
# (参考訳) 地域意識ネットワーク: 群衆カウントのためのモデル人間のトップダウン視覚知覚メカニズム Region-Aware Network: Model Human's Top-Down Visual Perception Mechanism for Crowd Counting ( http://arxiv.org/abs/2106.12163v1 ) ライセンス: CC BY 4.0	Yuehai Chen, Jing Yang, Dong Zhang, Kun Zhang, Badong Chen and Shaoyi Du	(参考訳) 背景雑音とスケール変動は、群集数で長年認識されてきた一般的な問題である。人間は群衆のイメージをちらっと見て、人間のほぼ数を瞬時に把握し、群衆の領域や、地球規模の受容性のある群衆の混雑度に注意を払います。そこで本稿では,人間のトップダウン視覚認識機構をモデル化し,RANetと呼ばれる領域認識ブロックを用いた新しいフィードバックネットワークを提案する。まず,入力画像中の候補群領域を優先する優先順位マップを生成するためのフィードバックアーキテクチャを提案する。前者により、ラネットは群衆地域にもっと注意を払うことができる。次に、グローバルレセプティブフィールドを介して、文脈情報を入力画像に適応的にエンコードできる領域認識ブロックを設計する。具体的には、入力画像全体とその優先度マップを列ベクトルの形でスキャンし、それらの類似性を推定する関連行列を得る。得られた関連行列は、ピクセル間のグローバルな関係を構築するために使用される。提案手法は,いくつかの公開データセットにおいて,最先端の群集カウント法より優れる。 Background noise and scale variation are common problems that have long been recognized in crowd counting. Humans glance at a crowd image and instantly know the approximate number of people and where they are by attending to the crowd regions and their degree of congestion with a global receptive field. Hence, in this paper, we propose a novel feedback network with a Region-Aware block, called RANet, by modeling the human top-down visual perception mechanism. Firstly, we introduce a feedback architecture to generate priority maps that provide a prior about candidate crowd regions in input images. The prior enables RANet to pay more attention to crowd regions. Then we design a Region-Aware block that can adaptively encode contextual information into input images through a global receptive field. More specifically, we scan each whole input image and its priority map in the form of column vectors to obtain a relevance matrix estimating their similarity. The relevance matrix obtained is then used to build global relationships between pixels. Our method outperforms state-of-the-art crowd counting methods on several public datasets.	翻訳日:2021-06-24 22:45:18 公開日:2021-06-23
# (参考訳) Cough Sounds を用いたディープニューラルネットワークによる呼吸病理分類 Deep Neural Network Based Respiratory Pathology Classification Using Cough Sounds ( http://arxiv.org/abs/2106.12174v1 ) ライセンス: CC BY 4.0	Balamurali B T, Hwan Ing Hee, Saumitra Kapoor, Oon Hoe Teoh, Sung Shin Teng, Khai Pin Lee, Dorien Herremans, Jer Ming Chen	(参考訳) インテリジェントなシステムは、私たちの医療システムと同様に、世界を変えつつある。本研究では,気管支喘息,上気道感染症(urti),下気道感染症(lrti)などの健康な小児と,病態のある小児の区別が可能な,深層学習に基づくcough音分類モデルを提案する。深層ニューラルネットワークモデルをトレーニングするために,臨床医の診断でラベル付けされた新しいcough音のデータセットを収集した。選択されたモデルは、Mel Frequency Cepstral Coefficients(MFCC)機能に基づいた双方向長短メモリネットワーク(BiLSTM)である。結果として得られた訓練されたモデルは、健康または病理(一般には特定の呼吸器病理学に属する)の2つのクラスを分類するために訓練された場合、医師の診断によって提供されるラベルに分類すると、84\%を超える精度に達する。対象者の呼吸病状を分類するために, 被験者1人あたりに複数カウエポックの結果が組み合わされた。その結果、3つの呼吸器疾患の予測精度は91\%を超える。しかし、モデルが4種類のうずくの分類と識別を行うように訓練されると、全体的な精度は低下し、1種類の病的くずはしばしば別のものと誤分類される。しかし, 健康的, 病理学的に何らかの病態を有すると分類された健康性うがいを考慮すれば, 4種類のモデル全体の精度は84\%以上である。 MFCCの特徴空間の経時的変化は, 病理学的, 回復的生地を比較して検討した結果, 病態によらず病理的生地が同じ特徴空間を占めるため, MFCCの特徴のみを区別することが困難であった。 Intelligent systems are transforming the world, as well as our healthcare system. We propose a deep learning-based cough sound classification model that can distinguish between children with healthy versus pathological coughs such as asthma, upper respiratory tract infection (URTI), and lower respiratory tract infection (LRTI). In order to train a deep neural network model, we collected a new dataset of cough sounds, labelled with clinician's diagnosis. The chosen model is a bidirectional long-short term memory network (BiLSTM) based on Mel Frequency Cepstral Coefficients (MFCCs) features. The resulting trained model when trained for classifying two classes of coughs -- healthy or pathology (in general or belonging to a specific respiratory pathology), reaches accuracy exceeding 84\% when classifying cough to the label provided by the physicians' diagnosis. 
In order to classify a subject's respiratory pathology condition, the results of multiple cough epochs per subject were combined. The resulting prediction accuracy exceeds 91% for all three respiratory pathologies. However, when the model is trained to classify and discriminate among the four classes of coughs, overall accuracy drops: one class of pathological cough is often misclassified as another. Nevertheless, if one counts a healthy cough classified as healthy and a pathological cough classified as having some kind of pathology as correct, then the overall accuracy of the four-class model is above 84%. A longitudinal study of the MFCC feature space, comparing pathological and recovered coughs collected from the same subjects, revealed that pathological coughs, irrespective of the underlying condition, occupy the same feature space, making it harder to differentiate them using only MFCC features.	翻訳日:2021-06-24 22:29:15 公開日:2021-06-23
# (参考訳) 不確かさ属性による画像生成の公正性 Fairness for Image Generation with Uncertain Sensitive Attributes ( http://arxiv.org/abs/2106.12182v1 ) ライセンス: CC BY 4.0	Ajil Jalal and Sushrut Karmalkar and Jessica Hoffmann and Alexandros G. Dimakis and Eric Price	(参考訳) 本研究は、画像超解像などの生成手順の文脈における公平性の問題に対処し、標準分類設定と異なる定義を包含する。さらに、伝統的グループフェアネスの定義は、通常、指定された保護されたグループ(これらのグループ化が人工的であり、歴史的、政治的モチベーションを担っているという事実を象徴する)に関して定義されるが、本質的な真理の同一性は存在しないことを強調する。例えば、南アジアと東アジアは一つのグループか別々のグループと見なされるべきか? ひとつの人種を全体、あるいはさらに性別によって分割すべきだろうか? どの集団が有効で、どのグループに属しているかを決めることは不可能なジレンマであり、アジア人に関して「フェア」であることは、南アジア人に対して「アンフェア」であることを必要とするかもしれない。これにより、関連するグルーピングに対してアルゴリズムが \emph{oblivious} となるような定義が導入される。グループフェアネスの直感的な概念を定義し、不整合性とトレードオフを研究する。統計学的パリティの自然な拡張はグループ化に強く依存しており、<emph{impossible} は必然的に達成できることを示した。一方、概念的に新しい定義である条件付き比例表現は、後サンプリングによって明確化することができる。実験は, 最新の生成モデルを用いて, 理論結果の検証を行い, 公平な画像再構成を実現する。 This work tackles the issue of fairness in the context of generative procedures, such as image super-resolution, which entail different definitions from the standard classification setting. Moreover, while traditional group fairness definitions are typically defined with respect to specified protected groups -- camouflaging the fact that these groupings are artificial and carry historical and political motivations -- we emphasize that there are no ground truth identities. For instance, should South and East Asians be viewed as a single group or separate groups? Should we consider one race as a whole or further split by gender? Choosing which groups are valid and who belongs in them is an impossible dilemma and being ``fair'' with respect to Asians may require being ``unfair'' with respect to South Asians. This motivates the introduction of definitions that allow algorithms to be \emph{oblivious} to the relevant groupings. We define several intuitive notions of group fairness and study their incompatibilities and trade-offs. 
We show that the natural extension of demographic parity is strongly dependent on the grouping, and \emph{impossible} to achieve obliviously. On the other hand, the conceptually new definition we introduce, Conditional Proportional Representation, can be achieved obliviously through Posterior Sampling. Our experiments validate our theoretical results and achieve fair image reconstruction using state-of-the-art generative models.	翻訳日:2021-06-24 22:13:27 公開日:2021-06-23
# (参考訳) 高齢者の日常生活活動支援技術の現状と展望 A Review of Assistive Technologies for Activities of Daily Living of Elderly ( http://arxiv.org/abs/2106.12183v1 ) ライセンス: CC BY 4.0	Nirmalya Thakur and Chia Y. Han	(参考訳) この世紀の特筆すべき特徴の1つは、常に増加を続けている高齢者の人口である。高齢者は、身体障害、認知障害、記憶の弱化、老化に伴う非組織的行動などにより、様々なニーズと要求を抱えている。これらの制限の範囲は、年齢、性別、背景、経験、スキル、知識など、高齢者の多様性によっても異なる。高齢者が日常生活活動(ADL)を自立的に行うためには,年齢の増大や能力の制限など,様々なニーズと課題がある。さらに、介護者の不足は、高齢者が日常の日常業務をこなし、自立した生活と活動的な高齢化を維持するために、テクノロジーベースのサービスの必要性が高まっている。これらのニーズに対処するため、この作品はこの分野で3つの主要な貢献をしている。まず,高齢者のadl実施支援を目的とした生活支援技術の包括的レビューを行う。第2に, 本研究は, スマートホームとスマートシティにおける高齢者介護支援サービス実施の文脈において現在存在している課題について考察する。最後に、この研究は、この分野での既存の作業の実装、拡張、統合のためのアプローチを概説し、変化し続けるニーズに応じて高齢者にパーソナライズされた支援とユーザー中心の行動介入を提供する、待望のフレームワークを開発するためのものである。 One of the distinct features of this century has been the population of older adults which has been on a constant rise. Elderly people have several needs and requirements due to physical disabilities, cognitive issues, weakened memory and disorganized behavior, that they face with increasing age. The extent of these limitations also differs according to the varying diversities in elderly, which include age, gender, background, experience, skills, knowledge and so on. These varying needs and challenges with increasing age, limits abilities of older adults to perform Activities of Daily Living (ADLs) in an independent manner. To add to it, the shortage of caregivers creates a looming need for technology-based services for elderly people, to assist them in performing their daily routine tasks to sustain their independent living and active aging. To address these needs, this work consists of making three major contributions in this field. First, it provides a rather comprehensive review of assisted living technologies aimed at helping elderly people to perform ADLs. 
Second, the work discusses the challenges identified through this review, that currently exist in the context of implementation of assisted living services for elderly care in Smart Homes and Smart Cities. Finally, the work also outlines an approach for implementation, extension and integration of the existing works in this field for development of a much-needed framework that can provide personalized assistance and user-centered behavior interventions to elderly as per their varying and ever-changing needs.	翻訳日:2021-06-24 21:47:24 公開日:2021-06-23
# (参考訳) 希少クラスのための合成サンプルの画像から画像への変換 Image-to-Image Translation of Synthetic Samples for Rare Classes ( http://arxiv.org/abs/2106.12212v1 ) ライセンス: CC BY 4.0	Edoardo Lanzini and Sara Beery	(参考訳) 希少なクラスは一般的なクラスよりも桁違いに頻繁に観察され、希少なクラスがほんの一握りの例しか持たない高度に不均衡なデータに繋がる。少数の例から学ぶことは、ディープラーニングベースの分類アルゴリズムにとって既知の課題であり、低ショット学習の分野の焦点である。これらのレアクラスのトレーニングデータを増やす潜在的アプローチの一つは、限られた実データを合成サンプルで強化することである。これは役に立つことが示されているが、実際のデータでテストした場合、実データと合成データのドメインシフトはアプローチの有効性を妨げる。本研究では,動物種分類における合成画像と実画像の領域ギャップを埋めるための画像間翻訳手法について,野生動物を観察する静止カメラのカメラトラップから収集したデータを用いて検討する。我々は、ソースドメインとターゲットドメイン間の低レベルの特徴アライメントを用いて、グラフィックエンジンを用いて生成される稀な種の合成データを作成する。非整合合成データを用いたシステムと比較すると, 希少種に対する分類誤差は有意に減少した。 The natural world is long-tailed: rare classes are observed orders of magnitudes less frequently than common ones, leading to highly-imbalanced data where rare classes can have only handfuls of examples. Learning from few examples is a known challenge for deep learning based classification algorithms, and is the focus of the field of low-shot learning. One potential approach to increase the training data for these rare classes is to augment the limited real data with synthetic samples. This has been shown to help, but the domain shift between real and synthetic hinders the approaches' efficacy when tested on real data. We explore the use of image-to-image translation methods to close the domain gap between synthetic and real imagery for animal species classification in data collected from camera traps: motion-activated static cameras used to monitor wildlife. We use low-level feature alignment between source and target domains to make synthetic data for a rare species generated using a graphics engine more "realistic". Compared against a system augmented with unaligned synthetic data, our experiments show a considerable decrease in classification error rates on a rare species.	翻訳日:2021-06-24 21:38:55 公開日:2021-06-23
# (参考訳) バイオメディカルネームの認識:課題と解決 Recognising Biomedical Names: Challenges and Solutions ( http://arxiv.org/abs/2106.12230v1 ) ライセンス: CC BY 4.0	Xiang Dai	(参考訳) 生物医学的な文書の量の増加率は驚異的だ。これらの文書に閉じ込められた情報をアンロックすることで、研究者や実践者は情報の世界において確実に操作できる。バイオメディカルNERは、通常、NLPパイプラインの最初のステップとして用いられる。シーケンシャルタグ技術に基づく標準NERモデルは、ジェネリックドメインにおける短いエンティティ参照を認識するのに長けている。 However, there are several open challenges of applying these models to recognise biomedical names: 1) Biomedical names may contain complex inner structure (discontinuity and overlapping) which cannot be recognised using standard sequence tagging technique; 2) The training of NER models usually requires large amount of labelled data, which are difficult to obtain in the biomedical domain; and, 3) Commonly used language representation models are pre-trained on generic data; a domain shift therefore exists between these models and target biomedical data. 1) 不連続な言及を認識可能なトランジッションベースのnerモデルを提案し, 2) 適切な事前学習データを生成するためのコスト効率の高いアプローチを開発し,3) nerのためのデータ拡張手法をいくつか設計する。我々の貢献は、特に新しいバイオメディカル・アプリケーションが必要な場合に、明らかな実践的意味を持つ。提案手法は,少ないラベル付きデータしか必要とせず,NERモデルの良好な性能を実現するのに有効である。事前学習データの選択に関する調査は、ドメイン内データを用いて事前学習した言語表現モデルを組み込むことで、モデルを改善することができる。最後に,提案する遷移型nerモデルは,不連続な言及を認識することにより,さらに性能を向上させることができる。 The growth rate in the amount of biomedical documents is staggering. Unlocking information trapped in these documents can enable researchers and practitioners to operate confidently in the information world. Biomedical NER, the task of recognising biomedical names, is usually employed as the first step of the NLP pipeline. Standard NER models, based on sequence tagging technique, are good at recognising short entity mentions in the generic domain. 
However, there are several open challenges in applying these models to recognise biomedical names: 1) biomedical names may contain complex inner structure (discontinuity and overlapping) which cannot be recognised using the standard sequence tagging technique; 2) the training of NER models usually requires a large amount of labelled data, which is difficult to obtain in the biomedical domain; and 3) commonly used language representation models are pre-trained on generic data, so a domain shift exists between these models and the target biomedical data. To deal with these challenges, we explore several research directions and make the following contributions: 1) we propose a transition-based NER model which can recognise discontinuous mentions; 2) we develop a cost-effective approach that nominates suitable pre-training data; and 3) we design several data augmentation methods for NER. Our contributions have obvious practical implications, especially when new biomedical applications are needed. Our proposed data augmentation methods can help the NER model achieve decent performance while requiring only a small amount of labelled data. Our investigation into selecting pre-training data can improve the model by incorporating language representation models that are pre-trained using in-domain data. Finally, our proposed transition-based NER model can further improve performance by recognising discontinuous mentions.	翻訳日:2021-06-24 21:28:06 公開日:2021-06-23
# (参考訳) 腎乳頭癌の組織像におけるサブタイピングのためのインスタンスベース視覚トランスフォーマ Instance-based Vision Transformer for Subtyping of Papillary Renal Cell Carcinoma in Histopathological Image ( http://arxiv.org/abs/2106.12265v1 ) ライセンス: CC BY 4.0	Zeyu Gao, Bangyang Hong, Xianli Zhang, Yang Li, Chang Jia, Jialun Wu, Chunbao Wang, Deyu Meng, Chen Li	(参考訳) P型腎細胞癌(RCC)の病理組織学的サブタイプは1型対2型であり,本態性予後因子である。 pRCCの2つのサブタイプは類似したパターン、すなわち乳頭状構造を持つが、細胞および細胞層レベルのパターンを含む微妙な違いがある。しかし、細胞層と細胞層レベルのパターンは、大規模な病理組織像において既存のCNNモデルではほとんど捉えられず、このような細粒度分類にこれらのモデルを直接適用する際の障害となる。そこで,本研究では,pRCCサブタイピングタスクにおける病理像の頑健な表現をインスタンスパッチから抽出し,より微細な特徴を抽出し(セグメント化された核を取り囲み,予測等級を割り当てることにより)学習する。提案するi-vitは、top-kインスタンスを入力として、位置埋め込み層、グレードエンベディング層、マルチヘッド多層セルフアテンションモジュールによってセル層とセル層の両方のレベルパターンをキャプチャする。提案フレームワークの性能を評価するため,1型と2型pRCCの171枚のスライド画像から,経験的病理医を1162個の関心領域に招待した。実験結果から,提案手法は既存のCNNモデルよりも優れた性能を示すことが示された。 Histological subtype of papillary (p) renal cell carcinoma (RCC), type 1 vs. type 2, is an essential prognostic factor. The two subtypes of pRCC have a similar pattern, i.e., the papillary architecture, yet some subtle differences, including cellular and cell-layer level patterns. However, the cellular and cell-layer level patterns almost cannot be captured by existing CNN-based models in large-size histopathological images, which brings obstacles to directly applying these models to such a fine-grained classification task. This paper proposes a novel instance-based Vision Transformer (i-ViT) to learn robust representations of histopathological images for the pRCC subtyping task by extracting finer features from instance patches (by cropping around segmented nuclei and assigning predicted grades). The proposed i-ViT takes top-K instances as input and aggregates them for capturing both the cellular and cell-layer level patterns by a position-embedding layer, a grade-embedding layer, and a multi-head multi-layer self-attention module. 
To evaluate the performance of the proposed framework, experienced pathologists were invited to select 1162 regions of interest from 171 whole slide images of type 1 and type 2 pRCC. Experimental results show that the proposed method achieves better performance than existing CNN-based models by a significant margin.	翻訳日:2021-06-24 21:26:53 公開日:2021-06-23
# (参考訳) 糖尿病網膜症の網膜底画像分類のためのラベル管理機構 A Label Management Mechanism for Retinal Fundus Image Classification of Diabetic Retinopathy ( http://arxiv.org/abs/2106.12284v1 ) ライセンス: CC BY 4.0	Mengdi Gao, Ximeng Feng, Mufeng Geng, Zhe Jiang, Lei Zhu, Xiangxi Meng, Chuanqing Zhou, Qiushi Ren and Yanye Lu	(参考訳) 糖尿病網膜症(DR)は、成人の視力障害と不可逆性失明の最も多い原因である。深層学習 (DL) のルネッサンスにより, DLをベースとしたDR診断は, DRの早期スクリーニングと重症度向上に有効である。しかし、ディープニューラルネットワーク(DNN)のトレーニングには、大量の注意深くラベル付けされたデータが必要である。ノイズの多いラベルデータは、大量のデータをラベル付けすることで、モデルのパフォーマンスを低下させる。本研究では,DNNがノイズの多いデータに対する過度な適合を克服するための新しいラベル管理機構(LMM)を提案する。 LMMはベイズ統計および時間重み付け手法における最大後続確率(MAP)を用いて、不確定データのラベルを選択的に補正し、徐々にトレーニングデータを浄化し、分類性能を向上させる。合成ノイズデータ(Messidor および我々が収集したDRデータセット)と実世界のノイズデータ(ANIMAL-10N)の総合実験により、LMMはモデルの性能を向上し、3つの最先端手法よりも優れていることが示された。 Diabetic retinopathy (DR) remains the most prevalent cause of vision impairment and irreversible blindness in working-age adults. Due to the renaissance of deep learning (DL), DL-based DR diagnosis has become a promising tool for the early screening and severity grading of DR. However, training deep neural networks (DNNs) requires an enormous amount of carefully labeled data. Noisy labels may be introduced when labeling large amounts of data, degrading the performance of models. In this work, we propose a novel label management mechanism (LMM) for DNNs to overcome overfitting on noisy data. LMM utilizes maximum a posteriori probability (MAP) estimation from Bayesian statistics and a time-weighted technique to selectively correct the labels of unclean data, which gradually purifies the training data and improves classification performance. Comprehensive experiments on both synthetic noise data (Messidor & our collected DR dataset) and real-world noise data (ANIMAL-10N) demonstrated that LMM could boost the performance of models and is superior to three state-of-the-art methods.	翻訳日:2021-06-24 21:15:04 公開日:2021-06-23
# (参考訳) 行動模倣分布:フェデレーション学習のための個人行動と集団行動の組み合わせ Behavior Mimics Distribution: Combining Individual and Group Behaviors for Federated Learning ( http://arxiv.org/abs/2106.12300v1 ) ライセンス: CC BY 4.0	Hua Huang, Fanhua Shang, Yuanyuan Liu, Hongying Liu	(参考訳) Federated Learning(FL)は、アクティブで有望な分散機械学習パラダイムになった。統計的不均質性の結果,近年の研究では,ローカル更新によるクライアントドリフトにより,一般的なfl法(fedavgなど)の性能が劇的に低下することが明らかとなった。本稿では,個人とグループの両方の行動を利用して分布を模倣し,不均一性に対処できる新しいフェデレート学習アルゴリズム(IGFL)を提案する。既存のFLメソッドとは異なり、IGFLはクライアントとサーバの最適化にも適用できます。本稿では,IGFLのサーバ最適化における注目度に基づく新しいフェデレーション学習を提案する。私たちの知る限りでは、フェデレーション最適化に注意機構を組み込むのはこれが初めてです。広範な実験を行い,igflが既存の連合学習手法の性能を著しく向上できることを示す。特に個人間のデータの分布が多様である場合、IGFLは以前のベースラインと比較して約13%の精度で分類できる。 Federated Learning (FL) has become an active and promising distributed machine learning paradigm. As a result of statistical heterogeneity, recent studies clearly show that the performance of popular FL methods (e.g., FedAvg) deteriorates dramatically due to the client drift caused by local updates. This paper proposes a novel Federated Learning algorithm (called IGFL), which leverages both Individual and Group behaviors to mimic distribution, thereby improving the ability to deal with heterogeneity. Unlike existing FL methods, our IGFL can be applied to both client and server optimization. As a by-product, we propose a new attention-based federated learning in the server optimization of IGFL. To the best of our knowledge, this is the first time to incorporate attention mechanisms into federated optimization. We conduct extensive experiments and show that IGFL can significantly improve the performance of existing federated learning methods. Especially when the distributions of data among individuals are diverse, IGFL can improve the classification accuracy by about 13% compared with prior baselines.	翻訳日:2021-06-24 20:57:20 公開日:2021-06-23
# (参考訳) 単一"in-the-wild"画像による3次元舌再建 3D human tongue reconstruction from single "in-the-wild" images ( http://arxiv.org/abs/2106.12302v1 ) ライセンス: CC BY 4.0	Stylianos Ploumpis, Stylianos Moschoglou, Vasileios Triantafyllou, Stefanos Zafeiriou	(参考訳) 単一の画像からの3D顔の再構成は、特にリアルな3Dアバター作成、不変顔認識、顔の幻覚といった多くのアプリケーションで広く使われているため、コンピュータビジョンコミュニティへの関心が高まったタスクである。 90年代後半に3D Morphable Modelが導入されて以来、我々はこの課題に特に取り組むことを目的とした研究の爆発を目撃した。しかし, 深層学習に起因した単一画像からの3次元顔再構成の精度は高まっているものの, 3次元アバター表現の現実性には極めて重要であるにもかかわらず, 舌などの顔の微細で変形性の高い成分は, 文学におけるすべての3次元顔モデルにはまだ欠落している。本研究では,まず,舌とともに3次元顔の正確な再構築を行う,エンド・ツー・エンドのトレーニング可能なパイプラインについて述べる。さらに,3次元舌表面生成に適した新しいGAN法を導入することにより,このパイプラインを「夢中」画像で堅牢にする。最後に、性別、年齢、民族的背景の異なる700人の生スキャン1,800人からなる、最初の多様な舌データセットをコミュニティに公開します。定量的および定性的実験の広範なシリーズで示すように、我々のモデルは、悪質な「未熟な」条件下であっても、頑健で現実的な3D舌の構造を捉えることができる。 3D face reconstruction from a single image is a task that has garnered increased interest in the Computer Vision community, especially due to its broad use in a number of applications such as realistic 3D avatar creation, pose invariant face recognition and face hallucination. Since the introduction of the 3D Morphable Model in the late 90's, we witnessed an explosion of research aiming at particularly tackling this task. Nevertheless, despite the increasing level of detail in the 3D face reconstructions from single images mainly attributed to deep learning advances, finer and highly deformable components of the face such as the tongue are still absent from all 3D face models in the literature, although being very important for the realness of the 3D avatar representations. In this work we present the first, to the best of our knowledge, end-to-end trainable pipeline that accurately reconstructs the 3D face together with the tongue. Moreover, we make this pipeline robust in "in-the-wild" images by introducing a novel GAN method tailored for 3D tongue surface generation. 
Finally, we make publicly available to the community the first diverse tongue dataset, consisting of 1,800 raw scans of 700 individuals varying in gender, age, and ethnic background. As we demonstrate in an extensive series of quantitative as well as qualitative experiments, our model proves to be robust and realistically captures the 3D tongue structure, even in adverse "in-the-wild" conditions.	翻訳日:2021-06-24 20:43:10 公開日:2021-06-23
# (参考訳) 学習した特徴空間の構造による分類モデルのロバスト性の推定 Estimating the Robustness of Classification Models by the Structure of the Learned Feature-Space ( http://arxiv.org/abs/2106.12303v1 ) ライセンス: CC BY 4.0	Kalun Ho, Franz-Josef Pfreundt, Janis Keuper, Margret Keuper	(参考訳) 過去10年間で、ディープイメージ分類ネットワークの開発は、主にimagenetのような標準ベンチマークにおける分類精度の観点から、最高のパフォーマンスの探索によって進められてきた。最近では、モデルロバストネスの概念によって、この焦点が拡張されている。データ分布の変化を事前に把握したモデル一般化能力。 ImageNet-Cのような新しいベンチマークは堅牢性を測定するために導入されたが、固定テストセットはデータバリエーションのごく一部しかキャプチャできないため、新しい過度なソリューションを生成する傾向にある、と我々は主張する。これらの欠点を克服するために、学習した特徴空間の構造から直接モデルの堅牢性を推定することを提案する。学習した分類器内の潜在表現の教師なしクラスタリングによって得られるロバスト性指標を導入し,破損したテストデータに対するモデル性能に非常に高い相関を示す。 Over the last decade, the development of deep image classification networks has mostly been driven by the search for the best performance in terms of classification accuracy on standardized benchmarks like ImageNet. More recently, this focus has been expanded by the notion of model robustness, i.e. the generalization abilities of models towards previously unseen changes in the data distribution. While new benchmarks, like ImageNet-C, have been introduced to measure robustness properties, we argue that fixed testsets are only able to capture a small portion of possible data variations and are thus limited and prone to generate new overfitted solutions. To overcome these drawbacks, we suggest to estimate the robustness of a model directly from the structure of its learned feature-space. We introduce robustness indicators which are obtained via unsupervised clustering of latent representations inside a trained classifier and show very high correlations to the model performance on corrupted test data.	翻訳日:2021-06-24 20:26:38 公開日:2021-06-23
# (参考訳) もっと深く行くべきか? 受容場解析による学習を伴わない畳み込みニューラルネットワークアーキテクチャの最適化 Should You Go Deeper? Optimizing Convolutional Neural Network Architectures without Training by Receptive Field Analysis ( http://arxiv.org/abs/2106.12307v1 ) ライセンス: CC BY 4.0	Mats L. Richter, Julius Schöning, Ulf Krumnack	(参考訳) 特定のタスクにニューラルネットワーク(ANN)を適用する場合、研究者、プログラマ、その他の専門家は通常、設計上の畳み込み層の数をオーバーショットする。これらのANNにはパラメータが多すぎるため、結果に影響を与えずに不必要なトレーニングが必要となる。畳み込み層が処理できる特徴は、その受容場によって厳密に制限される。受容場の拡大を階層的に解析することにより、ANNアーキテクチャの推論に質的に寄与しない階層列を確実に予測することができる。これらの分析に基づいて,これらの非効率性を解決するための設計戦略を提案し, ANNの説明可能性と計算性能を最適化する。戦略も分析も実際のモデルのトレーニングを必要としないため、これらの洞察は、将来自動化されるであろうANNアーキテクチャの非常に効率的な設計プロセスを可能にする。 When applying artificial neural networks (ANNs) to specific tasks, researchers, programmers, and other specialists usually overshoot the number of convolutional layers in their designs. By implication, these ANNs hold too many parameters, which need to be trained unnecessarily without impacting the result. The features a convolutional layer can process are strictly limited by its receptive field. By analyzing the expansion of the receptive fields layer by layer, we can reliably predict sequences of layers that will not contribute qualitatively to the inference in the given ANN architecture. Based on these analyses, we propose design strategies to resolve these inefficiencies, optimizing the explainability and the computational performance of ANNs. Since neither the strategies nor the analysis requires training of the actual model, these insights allow for a very efficient design process for ANN architectures, which might be automated in the future.	翻訳日:2021-06-24 20:12:22 公開日:2021-06-23
# (参考訳) GraphConfRec: グラフニューラルネットワークに基づくカンファレンスレコメンダシステム GraphConfRec: A Graph Neural Network-Based Conference Recommender System ( http://arxiv.org/abs/2106.12340v1 ) ライセンス: CC BY 4.0	Andreea Iana, Heiko Paulheim	(参考訳) 今日の学術出版モデル、特にコンピュータ科学において、会議は、それぞれの分野で最新のピアレビューされた進歩を公表するための主要なプラットフォームを構成する。しかし、研究の出版に適した学術的場を選ぶことは、特に学術的キャリアの開始時や通常の領域外の出版を希望する人にとって、利用可能な会議の多さを考える上で困難な課題となる。本稿では,SciGraphとグラフニューラルネットワークを組み合わせた会議推薦システムであるGraphConfRecを提案する。 graphconfrecは、リコール@10を0.580まで、マップを0.336まで、グラフアテンションネットワークベースのレコメンデーションモデルで達成する。 25名の被験者によるユーザスタディは、肯定的な結果を支持する。 In today's academic publishing model, especially in Computer Science, conferences commonly constitute the main platforms for releasing the latest peer-reviewed advancements in their respective fields. However, choosing a suitable academic venue for publishing one's research can represent a challenging task considering the plethora of available conferences, particularly for those at the start of their academic careers, or for those seeking to publish outside of their usual domain. In this paper, we propose GraphConfRec, a conference recommender system which combines SciGraph and graph neural networks, to infer suggestions based not only on title and abstract, but also on co-authorship and citation relationships. GraphConfRec achieves a recall@10 of up to 0.580 and a MAP of up to 0.336 with a graph attention network-based recommendation model. A user study with 25 subjects supports the positive results.	翻訳日:2021-06-24 19:59:55 公開日:2021-06-23
# (参考訳) PALRACE: 人間のデータとラベル付き合理化による包括的データセットを読む PALRACE: Reading Comprehension Dataset with Human Data and Labeled Rationales ( http://arxiv.org/abs/2106.12373v1 ) ライセンス: CC BY 4.0	Jiajie Zou, Yuran Zhang, Peiqing Jin, Cheng Luo, Xunyi Pan, Nai Ding	(参考訳) 事前学習された言語モデルは、機械読解(MRC)タスクにおいて高い性能を達成するが、結果は説明が難しい。モデルを説明するための魅力的なアプローチは、その決定の根拠を提供することである。本稿では,人間理論の教師付き学習を容易にするために,RACEデータセットから選択した800のパスに対して,人間のラベル付き合理性を持つ新しいMRCデータセットであるPALRACE(Pruned And Labeled RACE)を提案する。さらに,質問を各項目に6種類に分類した。各章は少なくとも26人の参加者が読み、質問に答える根拠をラベル付けした。また,ラベル付き合理性のみに基づいた質問への回答を参加者に依頼し,ラベル付き合理性が高品質であり,質問応答を十分に支援できる合理性評価セッションを実施した。 Pre-trained language models achieve high performance on machine reading comprehension (MRC) tasks, but the results are hard to explain. An appealing approach to making models explainable is to provide rationales for their decisions. To facilitate supervised learning of human rationales, here we present PALRACE (Pruned And Labeled RACE), a new MRC dataset with human-labeled rationales for 800 passages selected from the RACE dataset. We further classified the questions for each passage into 6 types. Each passage was read by at least 26 participants, who labeled their rationales for answering the question. In addition, we conducted a rationale evaluation session in which participants were asked to answer the question solely based on labeled rationales, confirming that the labeled rationales were of high quality and can sufficiently support question answering.	翻訳日:2021-06-24 19:37:34 公開日:2021-06-23
# (参考訳) 心臓MRI画像解析における公正性:深層学習におけるデータ不均衡によるバイアスの検討 Fairness in Cardiac MR Image Analysis: An Investigation of Bias Due to Data Imbalance in Deep Learning Based Segmentation ( http://arxiv.org/abs/2106.12387v1 ) ライセンス: CC BY 4.0	Esther Puyol-Anton, Bram Ruijsink, Stefan K. Piechnik, Stefan Neubauer, Steffen E. Petersen, Reza Razavi, and Andrew P. King	(参考訳) 人工知能(AI)における「フェアネス」の主題は、人種や性別などの人口動態特性に基づく潜在的なバイアスに対するAIアルゴリズムの評価と、このバイアスに対処するアルゴリズムの開発である。これまでほとんどのアプリケーションはコンピュータビジョンで使われてきたが、医療分野の仕事がいくつか現れ始めている。心臓mrセグメンテーションにおける深層学習(dl)の使用は近年、印象的な結果をもたらしており、その技術は臨床に翻訳され始めている。しかし、これらのモデルの公平性についてはまだ研究されていない。本研究では,6つの人種グループから5,903人の被験者からなる英国バイオバンクデータセットから,短軸心MRデータをトレーニングし,評価したnnU-Netモデルを用いて,人種/ジェンダーグループを対象としたこのような分析を行った。異なる人種間でのサイコロのパフォーマンスに統計的に有意な差が見られた。人種バイアスを低減するために,(1) 人種間のバランスを確保するためにバッチサンプリングが階層化される階層化バッチサンプリング,(2) 人種分類のための公平なメタラーニング,(2) DL分類器が人種分類を訓練し,セグメンテーションモデルと共同最適化されたグループモデル,(3) 人種毎に異なるセグメンテーションモデルを訓練する保護されたグループモデル,の3つの戦略を検討した。また、完全にバランスの取れたデータベースがあるシナリオと比較しました。公平性を評価するために,平均Dice値の標準偏差(SD)とスキュード誤差比(SER)を用いた。以上の結果から,不均衡なトレーニングデータを用いることにより人種バイアスが生じ,提案されているバイアス緩和戦略はすべて公平性が向上し,保護されたグループモデルを用いた最良のsdとserが得られた。 The subject of "fairness" in artificial intelligence (AI) refers to assessing AI algorithms for potential bias based on demographic characteristics such as race and gender, and the development of algorithms to address this bias. Most applications to date have been in computer vision, although some work in healthcare has started to emerge. The use of deep learning (DL) in cardiac MR segmentation has led to impressive results in recent years, and such techniques are starting to be translated into clinical practice. However, no work has yet investigated the fairness of such models. In this work, we perform such an analysis for racial/gender groups, focusing on the problem of training data imbalance, using a nnU-Net model trained and evaluated on cine short axis cardiac MR data from the UK Biobank dataset, consisting of 5,903 subjects from 6 different racial groups. 
We find statistically significant differences in Dice performance between different racial groups. To reduce the racial bias, we investigated three strategies: (1) stratified batch sampling, in which batch sampling is stratified to ensure balance between racial groups; (2) fair meta-learning for segmentation, in which a DL classifier is trained to classify race and jointly optimized with the segmentation model; and (3) protected group models, in which a different segmentation model is trained for each racial group. We also compared the results to the scenario where we have a perfectly balanced database. To assess fairness we used the standard deviation (SD) and skewed error ratio (SER) of the average Dice values. Our results demonstrate that the racial bias results from the use of imbalanced training data, and that all proposed bias mitigation strategies improved fairness, with the best SD and SER resulting from the use of protected group models.	翻訳日:2021-06-24 19:35:00 公開日:2021-06-23
# (参考訳) 形態的にリッチな言語に対する語彙制約付き機械翻訳 End-to-End Lexically Constrained Machine Translation for Morphologically Rich Languages ( http://arxiv.org/abs/2106.12398v1 ) ライセンス: CC BY 4.0	Josef Jon, João Paulo Aires, Dušan Variš, Ondřej Bojar	(参考訳) 語彙的に制約された機械翻訳では、特定の単語やフレーズの存在や欠如を強制して出力文を操作できる。現在のアプローチでは、翻訳に現れる用語を強制することはできるが、制約語形式を生成された出力の他の部分と一致させるのに苦労することが多い。手動分析の結果、英語からチェコ語への翻訳における基準制約モデルの出力エラーの46%が合意に関連していることがわかった。本研究は, 機械翻訳による単語の正しいインフレクションを許容する機構について検討する。特に,入力シーケンスの一部として制約を付与したモデルトレーニングに基づく手法に着目した。本手法は, 自動評価と手動評価の両方における制約項の翻訳を, 一致の誤りを減らすことにより改善することを示す。提案手法は,新しい誤りや翻訳の全体的な品質を低下させることなく,屈折誤差を除去する。 Lexically constrained machine translation allows the user to manipulate the output sentence by enforcing the presence or absence of certain words and phrases. Although current approaches can enforce terms to appear in the translation, they often struggle to make the constraint word form agree with the rest of the generated output. Our manual analysis shows that 46% of the errors in the output of a baseline constrained model for English to Czech translation are related to agreement. We investigate mechanisms to allow neural machine translation to infer the correct word inflection given lemmatized constraints. In particular, we focus on methods based on training the model with constraints provided as part of the input sequence. Our experiments on the English-Czech language pair show that this approach improves the translation of constrained terms in both automatic and manual evaluation by reducing errors in agreement. Our approach thus eliminates inflection errors, without introducing new errors or decreasing the overall quality of the translation.	翻訳日:2021-06-24 19:25:11 公開日:2021-06-23
# (参考訳) 機械予測における偽の完全性:機械学習における循環問題の検出と評価 False perfection in machine prediction: Detecting and assessing circularity problems in machine learning ( http://arxiv.org/abs/2106.12417v1 ) ライセンス: CC BY 4.0	Michael Hagmann, Stefan Riezler	(参考訳) 機械学習アルゴリズムは、見えないテスト入力の正しい出力を予測することを目的として、入力データとターゲット出力のパターンからモデルをトレーニングする。本稿では, 医療情報学や特許法などの応用分野において, 入力データの表現において, 目標出力が決定論的に定義された測定値を含むことによる機械学習の問題を示す。これは、既知の目標定義の機械的再構成に基づく完全だが円形の予測につながるが、定義された測定値が不完全あるいは不完全であるような実世界のデータでは失敗する。本稿では,任意のデータセットとブラックボックス機械学習モデルに対して,対象の機能定義を再構築可能か,トレーニングに使用しているかを示す循環性テストを行う。我々は,機械学習におけるデータ表現から対象とする結果を定義することで,研究結果を実世界のアプリケーションに転送するには円周性を回避する必要があると論じる。 Machine learning algorithms train models from patterns of input data and target outputs, with the goal of predicting correct outputs for unseen test inputs. Here we demonstrate a problem of machine learning in vital application areas such as medical informatics or patent law: the representations of input data include the measurements on which target outputs are deterministically defined. This leads to perfect but circular predictions based on a machine reconstruction of the known target definition, which fail on real-world data where the defining measurements may be unavailable or only incompletely available. We present a circularity test that shows, for given datasets and black-box machine learning models, whether the target functional definition can be reconstructed and has been used in training. We argue that transferring research results to real-world applications requires avoiding circularity by separating measurements that define target outcomes from data representations in machine learning.	翻訳日:2021-06-24 19:01:45 公開日:2021-06-23
# (参考訳) 単純さの発見:シフト不変変分オートエンコーダによる特徴・パターン・順序パラメータの教師なし発見 Finding simplicity: unsupervised discovery of features, patterns, and order parameters via shift-invariant variational autoencoders ( http://arxiv.org/abs/2106.12472v1 ) ライセンス: CC BY 4.0	Maxim Ziatdinov, Chun Yin Wong, and Sergei V. Kalinin	(参考訳) 走査トンネル法と透過電子顕微鏡(STM, STEM)の最近の進歩により, 材料の構造や機能に関する情報を含む大量のイメージングデータが日常的に生成されるようになった。実験データセットは、STEMにおける物理秩序パラメータ場、偏光およびひずみ勾配、STMにおける定常電子波およびキャリア媒介交換相互作用などの長距離現象のシグネチャを含む。それに応じて、人間の目は格子周期、繰り返し構造要素、微細構造などの画像の特定のパターンを容易に識別することができるが、それらの自動抽出と分類は非常に非自明で、そのような分析を達成するための普遍的な経路が欠如している。 STMおよび(S)TEM画像で観察されるパターンの最も特徴的な要素は、(ほぼ)周期性であり、基本原子構造のパーシモニーから直接発生する挙動であり、秩序パラメータ分布を反映する段階的変化に重畳されている。しかしながら、大域的フーリエ法によるこれらの要素の発見は、可変性と理想的離散的翻訳対称性の欠如により非自明である。この問題に対処するため,画像空間をランダムにサンプリングするためには,画像の特徴的反復的特徴を解消するシフト不変変分オートエンコーダ(shift-VAE)を開発した。シフト-VAEは、対象物の位置の不確実性と形状再構成の不確実性とのバランスをとる。このアプローチはモデル1Dデータに対して説明され、さらに合成および実験的なSTMおよびSTEM2Dデータに拡張される。 Recent advances in scanning tunneling and transmission electron microscopies (STM and STEM) have allowed routine generation of large volumes of imaging data containing information on the structure and functionality of materials. The experimental data sets contain signatures of long-range phenomena such as physical order parameter fields, polarization and strain gradients in STEM, or standing electronic waves and carrier-mediated exchange interactions in STM, all superimposed onto scanning system distortions and gradual changes of contrast due to drift and/or mis-tilt effects. Correspondingly, while the human eye can readily identify certain patterns in the images such as lattice periodicities, repeating structural elements, or microstructures, their automatic extraction and classification are highly non-trivial and universal pathways to accomplish such analyses are absent. 
We pose that the most distinctive elements of the patterns observed in STM and (S)TEM images are similarity and (almost-) periodicity, behaviors stemming directly from the parsimony of elementary atomic structures, superimposed on the gradual changes reflective of order parameter distributions. However, the discovery of these elements via global Fourier methods is non-trivial due to variability and lack of ideal discrete translation symmetry. To address this problem, we develop shift-invariant variational autoencoders (shift-VAE) that allow disentangling characteristic repeating features in the images, their variations, and shifts inevitable for random sampling of image space. Shift-VAEs balance the uncertainty in the position of the object of interest with the uncertainty in shape reconstruction. This approach is illustrated for model 1D data, and further extended to synthetic and experimental STM and STEM 2D data.	翻訳日:2021-06-24 19:00:37 公開日:2021-06-23
# (参考訳) 転校学習に対する教師モデルフィンガープリント攻撃 Teacher Model Fingerprinting Attacks Against Transfer Learning ( http://arxiv.org/abs/2106.12478v1 ) ライセンス: CC BY 4.0	Yufei Chen, Chao Shen, Cong Wang, Yang Zhang	(参考訳) トランスファーラーニングは、トレーニングデータの不足に対処するための一般的なソリューションになっています。訓練された教師モデルの初期の層を再利用または微調整することで、特定の学生モデルを訓練する。しかし、ユーティリティの改善に加えて、移行された公開知識は機密性をモデル化する潜在的な脅威をもたらし、さらに他のセキュリティやプライバシーの問題も引き起こす。本稿では,情報伝達学習の文脈における教師モデル暴露の脅威について,初めて総合的な調査を行い,公開知識とモデル機密性との緊張関係について深い知見を得ることを目的としている。そこで本研究では,学生モデルの起源を推定するために,教師モデルフィンガープリント攻撃を提案する。具体的には,学生モデルを探索して攻撃を実現するために,クエリを慎重に生成する新しい最適化手法を提案する。既存のモデルリバースエンジニアリングのアプローチとは異なり、提案手法では、後部などのきめ細かいモデル出力や、モデルアーキテクチャやトレーニングデータセットの補助情報に依存しない。提案攻撃の有効性を系統的に評価した。実験結果から,本攻撃はプロービングクエリの少ないモデル起源を正確に識別できることが判明した。さらに,提案攻撃は,モデル盗難などの機械学習モデルに対する攻撃を容易化するためのステップストーンとして機能することを示す。 Transfer learning has become a common solution to address training data scarcity in practice. It trains a specified student model by reusing or fine-tuning early layers of a well-trained teacher model that is usually publicly available. However, besides utility improvement, the transferred public knowledge also brings potential threats to model confidentiality, and even further raises other security and privacy issues. In this paper, we present the first comprehensive investigation of the teacher model exposure threat in the transfer learning context, aiming to gain a deeper insight into the tension between public knowledge and model confidentiality. To this end, we propose a teacher model fingerprinting attack to infer the origin of a student model, i.e., the teacher model it transfers from. Specifically, we propose a novel optimization-based method to carefully generate queries to probe the student model to realize our attack. Unlike existing model reverse engineering approaches, our proposed fingerprinting method neither relies on fine-grained model outputs, e.g., posteriors, nor auxiliary information of the model architecture or training dataset. 
We systematically evaluate the effectiveness of our proposed attack. The empirical results demonstrate that our attack can accurately identify the model origin with few probing queries. Moreover, we show that the proposed attack can serve as a stepping stone to facilitating other attacks against machine learning models, such as model stealing.	翻訳日:2021-06-24 18:43:40 公開日:2021-06-23
# (参考訳) アラビア語におけるサーカズム検出と感情分析のための深部マルチタスクモデル Deep Multi-Task Model for Sarcasm Detection and Sentiment Analysis in Arabic Language ( http://arxiv.org/abs/2106.12488v1 ) ライセンス: CC BY 4.0	Abdelkader El Mahdaouy, Abdellah El Mekki, Kabil Essefar, Nabil El Mamoun, Ismail Berrada, Ahmed Khoumsi	(参考訳) 皮肉や皮肉といった比喩的言語装置の普及は、アラビア語の知覚分析(SA)に深刻な課題をもたらす。従来の研究ではSAとsarcasm検出が別々に行われているが,本研究では,両タスク間の知識相互作用を可能にする,エンドツーエンドの深層多タスク学習(MTL)モデルを提案する。我々のMTLモデルは、変換器(BERT)モデルからの双方向エンコーダ表現、マルチタスクアテンション相互作用モジュール、および2つのタスク分類器で構成されている。以上の結果から, 提案手法は, SAおよびsarcasm検出サブタスクにおいて, 単タスクモデルよりも優れていることがわかった。 The prominence of figurative language devices, such as sarcasm and irony, poses serious challenges for Arabic Sentiment Analysis (SA). While previous research works tackle SA and sarcasm detection separately, this paper introduces an end-to-end deep Multi-Task Learning (MTL) model, allowing knowledge interaction between the two tasks. Our MTL model's architecture consists of a Bidirectional Encoder Representation from Transformers (BERT) model, a multi-task attention interaction module, and two task classifiers. The overall obtained results show that our proposed model outperforms its single-task counterparts on both SA and sarcasm detection sub-tasks.	翻訳日:2021-06-24 18:14:33 公開日:2021-06-23
# (参考訳) ハイパースペクトル画像復調のためのマルチモーダルおよび周波数重み付きテンソル核ノルム Multi-modal and frequency-weighted tensor nuclear norm for hyperspectral image denoising ( http://arxiv.org/abs/2106.12489v1 ) ライセンス: CC BY 4.0	Sheng Liu, Xiaozhen Xie, Wenfeng Kong, and Jifeng Ning	(参考訳) 低ランク性はハイパースペクトル画像(hsi)のタスクにおいて重要である。テンソル核ノルム (TNN) は、テンソル特異値分解に基づいて定義され、HSIの低ランク性を記述するための最先端の手法である。しかしながら、TNNは、非正規化タスクに対処する際のHSIの物理的意味を無視し、亜最適デノイズ化パフォーマンスをもたらす。本稿では,マルチモーダル・周波数重み付きテンソル核ノルム (MFWTNN) と非凸MFWTNN を提案する。まず、周波数成分の物理的意味を調査し、その重みを再考し、TNNの低ランク表現能力を改善する。また,2つの空間次元とHSIのスペクトル次元の相関を考察し,上記のTNNの改良と組み合わせてMFWTNNを提案する。次に、非凸関数を用いて周波数テンソルのランク関数を近似し、MFWTNNをより緩和するNonMFWTNNを提案する。また,ノイズ情報を含むスライスに対して,プロファイル情報を含むスライスに対して,より小さなウエイトを適応的に選択する。最後に,提案モデルを解くために,乗算器(admm)に基づくアルゴリズムの効率的な交互方向法を開発し,本手法の有効性をシミュレーションおよび実hsiデータセットで検証した。 Low-rankness is important in the hyperspectral image (HSI) denoising tasks. The tensor nuclear norm (TNN), defined based on the tensor singular value decomposition, is a state-of-the-art method to describe the low-rankness of HSI. However, TNN ignores some of the physical meanings of HSI in tackling the denoising tasks, leading to suboptimal denoising performance. In this paper, we propose the multi-modal and frequency-weighted tensor nuclear norm (MFWTNN) and the non-convex MFWTNN for HSI denoising tasks. Firstly, we investigate the physical meaning of frequency components and reconsider their weights to improve the low-rank representation ability of TNN. Meanwhile, we also consider the correlation among two spatial dimensions and the spectral dimension of HSI and combine the above improvements to TNN to propose MFWTNN. Secondly, we use non-convex functions to approximate the rank function of the frequency tensor and propose the NonMFWTNN to relax the MFWTNN better. Besides, we adaptively choose bigger weights for slices mainly containing noise information and smaller weights for slices containing profile information. 
Finally, we develop an efficient algorithm based on the alternating direction method of multipliers (ADMM) to solve the proposed models, and the effectiveness of our models is substantiated on simulated and real HSI datasets.	翻訳日:2021-06-24 18:07:39 公開日:2021-06-23
# (参考訳) 国・州レベルの現代アラビア語標準および方言アラビア語識別のためのBERTに基づくマルチタスクモデル BERT-based Multi-Task Model for Country and Province Level Modern Standard Arabic and Dialectal Arabic Identification ( http://arxiv.org/abs/2106.12495v1 ) ライセンス: CC BY 4.0	Abdellah El Mekki, Abdelkader El Mahdaouy, Kabil Essefar, Nabil El Mamoun, Ismail Berrada, Ahmed Khoumsi	(参考訳) 方言と標準言語識別は多くのアラビア語自然言語処理アプリケーションにとって重要なタスクである。本稿では,現代標準アラビア語 (msa) と方言アラビア語 (da) の国レベルと州レベルを識別するための第2のnadi共通課題である深層学習に基づくシステムを提案する。このシステムは、国レベルと州レベルのmsa/da識別に取り組むために、エンドツーエンドのディープマルチタスク学習(mtl)モデルに基づいている。後者のMTLモデルは、共通の双方向エンコーダ表現変換器(BERT)エンコーダ、2つのタスク固有の注意層、2つの分類器で構成される。私たちのキーとなる考え方は、タスク識別とタスク間共有機能の両方を活用することです。その結果,MTLモデルは,ほとんどのサブタスクにおいて単一タスクモデルよりも優れていた。 Dialect and standard language identification are crucial tasks for many Arabic natural language processing applications. In this paper, we present our deep learning-based system, submitted to the second NADI shared task for country-level and province-level identification of Modern Standard Arabic (MSA) and Dialectal Arabic (DA). The system is based on an end-to-end deep Multi-Task Learning (MTL) model to tackle both country-level and province-level MSA/DA identification. The latter MTL model consists of a shared Bidirectional Encoder Representation Transformers (BERT) encoder, two task-specific attention layers, and two classifiers. Our key idea is to leverage both the task-discriminative and the inter-task shared features for country and province MSA/DA identification. The obtained results show that our MTL model outperforms single-task models on most subtasks.	翻訳日:2021-06-24 17:44:50 公開日:2021-06-23
# (参考訳) 医用画像セグメンテーションのためのオフザシェルフソースセグメンタの適用 Adapting Off-the-Shelf Source Segmenter for Target Medical Image Segmentation ( http://arxiv.org/abs/2106.12497v1 ) ライセンス: CC BY 4.0	Xiaofeng Liu, Fangxu Xing, Chao Yang, Georges El Fakhri, Jonghye Woo	(参考訳) unsupervised domain adaptation(UDA)は、ラベル付きソースドメインから学んだ知識を、ラベル付きで見当たらないターゲットドメインに転送することを目的としている。しかし、適応段階におけるソースドメインデータへのアクセスは、データストレージやプライバシの問題のため、しばしば制限される。これを軽減するため,本研究では,ソースドメイン内で事前学習した "off-the-shelf" セグメントモデルを,適応型バッチワイド正規化統計適応フレームワークを用いて,対象ドメインに適応させることを提案する。具体的には、ドメイン固有の低次バッチ統計、すなわち平均と分散は指数運動量減衰スキームに徐々に適応し、ドメイン共有可能な高次バッチ統計、すなわちスケーリングとシフトパラメータの整合性は、最適化目標によって明示的に強制される。各チャネルの転送性は、まず各チャネルの寄与のバランスをとるために適応的に測定される。さらに、提案したオープンソースフリーなUDAフレームワークは、例えば自己エントロピーの最小化など、教師なしの学習手法に直交しているため、フレームワークの上に簡単に追加できる。 BraTS 2018データベース上での大規模な実験により、我々のソースフリーなUDAフレームワークは、クロスサブタイプUDAセグメンテーションタスクの既存のソースラックスUDAメソッドよりも優れており、ソースデータとの教師付きUDAメソッドと比較して、クロスモダリティUDAセグメンテーションタスクの同等の結果が得られました。 Unsupervised domain adaptation (UDA) aims to transfer knowledge learned from a labeled source domain to an unlabeled and unseen target domain, which is usually trained on data from both domains. Access to the source domain data at the adaptation stage, however, is often limited, due to data storage or privacy issues. To alleviate this, in this work, we target source free UDA for segmentation, and propose to adapt an "off-the-shelf" segmentation model pre-trained in the source domain to the target domain, with an adaptive batch-wise normalization statistics adaptation framework. Specifically, the domain-specific low-order batch statistics, i.e., mean and variance, are gradually adapted with an exponential momentum decay scheme, while the consistency of domain shareable high-order batch statistics, i.e., scaling and shifting parameters, is explicitly enforced by our optimization objective. The transferability of each channel is adaptively measured first from which to balance the contribution of each channel. 
Moreover, the proposed source free UDA framework is orthogonal to unsupervised learning methods, e.g., self-entropy minimization, which can thus be simply added on top of our framework. Extensive experiments on the BraTS 2018 database show that our source free UDA framework outperformed existing source-relaxed UDA methods for the cross-subtype UDA segmentation task and yielded results comparable to supervised UDA methods that use the source data for the cross-modality UDA segmentation task.	翻訳日:2021-06-24 17:38:44 公開日:2021-06-23
# (参考訳) 深層畳み込みニューラルネットワークの普遍的一貫性 Universal Consistency of Deep Convolutional Neural Networks ( http://arxiv.org/abs/2106.12498v1 ) ライセンス: CC BY 4.0	Shao-Bo Lin, Kaidong Wang, Yao Wang, Ding-Xuan Zhou	(参考訳) 深層畳み込みニューラルネットワーク(dcnn)の実際的な研究活動と比較すると、dcnnの理論的な挙動の研究は遅れている。特にDCNNの普遍的な一貫性は未解決のままである。本稿では,拡張畳み込みを伴うDCNNにおける経験的リスク最小化の実装が,(ゼロパディングを伴う)強固に一貫したものであることを示す。完全連結層がなければ、拡張畳み込みを伴うDCNNは、収縮(ゼロパディング)畳み込み層と複数の完全連結層を含むハイブリッド構造を持つ広く使われているディープニューラルネットワークよりも悪くはならないことを示す一連の実験を行う。 Compared with avid research activities of deep convolutional neural networks (DCNNs) in practice, the study of theoretical behaviors of DCNNs lags heavily behind. In particular, the universal consistency of DCNNs remains open. In this paper, we prove that implementing empirical risk minimization on DCNNs with expansive convolution (with zero-padding) is strongly universally consistent. Motivated by the universal consistency, we conduct a series of experiments to show that without any fully connected layers, DCNNs with expansive convolution perform not worse than the widely used deep neural networks with hybrid structure containing contracting (without zero-padding) convolution layers and several fully connected layers.	翻訳日:2021-06-24 17:27:48 公開日:2021-06-23
# (参考訳) クロスドメイン非教師付きタグ・ツー・シネMRI合成のための生成的自己学習 Generative Self-training for Cross-domain Unsupervised Tagged-to-Cine MRI Synthesis ( http://arxiv.org/abs/2106.12499v1 ) ライセンス: CC BY 4.0	Xiaofeng Liu, Fangxu Xing, Maureen Stone, Jiachen Zhuo, Reese Timothy, Jerry L. Prince, Georges El Fakhri, Jonghye Woo	(参考訳) 自己学習に基づく教師なしドメイン適応(UDA)は、未ラベルのターゲットドメインに訓練されたディープラーニングモデルをソースドメインに適用する場合、ドメインシフトの問題に対処する大きな可能性を示している。しかし、自己学習udaは、ソフトマックス離散ヒストグラムに基づく信頼性の高い疑似ラベル選択により、分類やセグメンテーションなどの判別タスクにおいて有効性を示すが、画像合成などの生成課題に対する自己学習udaは十分に研究されていない。本稿では,連続値予測とクロスドメイン画像合成のための回帰目標を備えた新しい生成的自己学習(gst) udaフレームワークを提案する。具体的には,疑似ラベルを不確実性マスクでフィルタリングし,生成画像の予測信頼度を実用的変動ベイズ学習で定量化する。高速テストタイム適応はラウンドベースの代替最適化スキームによって達成される。我々は、ソースドメインとターゲットドメインのデータセットを異なるスキャナーやセンターから取得する、タグ付き磁気共鳴画像(MRI)合成問題に関する枠組みを検証した。一般的なUDA手法に対して,我々の枠組みを検証するため,広範囲な検証を行った。以上の結果より,新しい対象領域の被験者のMRIをタグ付けしたGSTでは,UDA法と比較すると,合成品質が有意に向上した。 Self-training based unsupervised domain adaptation (UDA) has shown great potential to address the problem of domain shift, when applying a trained deep learning model in a source domain to unlabeled target domains. However, while the self-training UDA has demonstrated its effectiveness on discriminative tasks, such as classification and segmentation, via the reliable pseudo-label selection based on the softmax discrete histogram, the self-training UDA for generative tasks, such as image synthesis, is not fully investigated. In this work, we propose a novel generative self-training (GST) UDA framework with continuous value prediction and regression objective for cross-domain image synthesis. Specifically, we propose to filter the pseudo-label with an uncertainty mask, and quantify the predictive confidence of generated images with practical variational Bayes learning. The fast test-time adaptation is achieved by a round-based alternative optimization scheme. 
We validated our framework on the tagged-to-cine magnetic resonance imaging (MRI) synthesis problem, where datasets in the source and target domains were acquired from different scanners or centers. Extensive validations were carried out to verify our framework against popular adversarial training UDA methods. Results show that our GST, with tagged MRI of test subjects in new target domains, improved the synthesis quality by a large margin, compared with the adversarial training UDA methods.	翻訳日:2021-06-24 17:07:31 公開日:2021-06-23
# (参考訳) ベイズ深層学習ハイパーパラメータ探索による雑音付き多項式へのロバスト関数マッピング Bayesian Deep Learning Hyperparameter Search for Robust Function Mapping to Polynomials with Noise ( http://arxiv.org/abs/2106.12532v1 ) ライセンス: CC BY 4.0	Nidhin Harilal, Udit Bhatia, Auroop R. Ganguly	(参考訳) ニューラルアーキテクチャ探索の進歩と、コネクショナリストアーキテクチャの説明可能性と解釈性は、最近の文献で報告されている。しかし,BDL(Bayesian Deep Learning)ハイパーパラメータの設計方法,特に不確実な定量化を伴うロバストな関数マッピングのための深さ,幅,アンサンブルサイズについて,我々はまだ理解していない。本稿では,ベイズ接続性表現を,ノイズタイプや比率の異なる異なる次数の多項式にマッピングすることで,理解を深めようとする。雑音特性に基づく不確かさを定量化しつつ, 基礎となる多項式信号を抽出するハイパーパラメータの組み合わせを探索するために, 雑音汚染多項式を調べる。具体的には、異なる分布とSNR比と様々な雑音特性を有するノイズで汚染されたn次多項式の信号を検出するために、適切なニューラルネットワークアーキテクチャとアンサンブル構成が見つかるかどうかを考察する。以上の結果から,ネットワーク深度が最適であること,および予測スキルと不確かさの定量化に最適なアンサンブル数があることが示唆された。しかし、高い幅値での幅増加に伴い性能向上率が低下しても、幅に対する最適性は識別できない。我々の実験と洞察は、BDL表現の理論的性質を理解し、実用的なソリューションを設計するための方向性となる。 Advances in neural architecture search, as well as explainability and interpretability of connectionist architectures, have been reported in the recent literature. However, our understanding of how to design Bayesian Deep Learning (BDL) hyperparameters, specifically, the depth, width and ensemble size, for robust function mapping with uncertainty quantification, is still emerging. This paper attempts to further our understanding by mapping Bayesian connectionist representations to polynomials of different orders with varying noise types and ratios. We examine the noise-contaminated polynomials to search for the combination of hyperparameters that can extract the underlying polynomial signals while quantifying uncertainties based on the noise attributes. Specifically, we study the question of whether an appropriate neural architecture and ensemble configuration can be found to detect a signal of any n-th order polynomial contaminated with noise having different distributions, signal-to-noise ratios (SNR), and varying noise attributes. 
Our results suggest the possible existence of an optimal network depth as well as an optimal number of ensembles for prediction skills and uncertainty quantification, respectively. However, optimality is not discernible for width, even though the performance gain diminishes as width increases at large widths. Our experiments and insights can help guide the understanding of theoretical properties of BDL representations and the design of practical solutions.	翻訳日:2021-06-24 16:57:07 公開日:2021-06-23
# (参考訳) PAC-Bayes一般化境界の最小化による確率的多数票の学習 Learning Stochastic Majority Votes by Minimizing a PAC-Bayes Generalization Bound ( http://arxiv.org/abs/2106.12535v1 ) ライセンス: CC BY 4.0	Valentina Zantedeschi, Paul Viallard, Emilie Morvant, Rémi Emonet, Amaury Habrard, Pascal Germain, Benjamin Guedj	(参考訳) 分類器の有限アンサンブルに対する多数票の確率的対向について検討し,その一般化特性について検討する。このアプローチは任意の分布に対して成り立つが、Dirichlet分布をインスタンス化する: これは、期待されるリスクに対して閉じた形式と微分可能な表現を可能にする。その結果得られた確率的多数決学習アルゴリズムは、PAC-Bayes目標を最小化する競合するアルゴリズムと比較した一連の数値実験において、最先端の精度と(空でない)密接な一般化限界の利点を達成する。 We investigate a stochastic counterpart of majority votes over finite ensembles of classifiers, and study its generalization properties. While our approach holds for arbitrary distributions, we instantiate it with Dirichlet distributions: this allows for a closed-form and differentiable expression for the expected risk, which then turns the generalization bound into a tractable training objective. The resulting stochastic majority vote learning algorithm achieves state-of-the-art accuracy and benefits from (non-vacuous) tight generalization bounds, in a series of numerical experiments when compared to competing algorithms which also minimize PAC-Bayes objectives -- both with uninformed (data-independent) and informed (data-dependent) priors.	翻訳日:2021-06-24 16:45:21 公開日:2021-06-23
# (参考訳) 特徴帰属と反事実的説明は操作できる Feature Attributions and Counterfactual Explanations Can Be Manipulated ( http://arxiv.org/abs/2106.12563v1 ) ライセンス: CC BY 4.0	Dylan Slack, Sophie Hilgard, Sameer Singh, Hima Lakkaraju	(参考訳) 機械学習モデルは、重要な意思決定設定(医療や金融など)でますます使われているため、モデル予測を説明する方法の開発に重点が置かれている。このような explanations はモデルの理解と確立に使用され、マシンラーニングパイプラインの重要なコンポーネントである。これらのシステムでは、説明は重要な部分であるが、敵による操作に対する脆弱性についてはほとんど理解されていない。本稿では,2つの幅広い説明のクラスが操作に対して脆弱であるかを論じる。敵がモデルに依存しない特徴帰属法(例: LIME & SHAP)を操作するバイアス付きモデルをどのように設計するかを実証し、反事実探索(例:Wachterのアルゴリズム & DiCE)中のヒル・クライムがモデルのバイアスである concealing へ変換されるという反事実的説明を実証する。これらの脆弱性は、敵がバイアス付きモデルをデプロイすることを可能にするが、説明はこのバイアスを明らかにしないため、ステークホルダーをモデルの信頼性を損なう。我々は,実世界のデータセット上での操作について,COMPAS や Communities & Crime などを評価し,実際に操作できる説明を見つける。 As machine learning models are increasingly used in critical decision-making settings (e.g., healthcare, finance), there has been a growing emphasis on developing methods to explain model predictions. Such explanations are used to understand and establish trust in models and are vital components in machine learning pipelines. Though explanations are a critical piece in these systems, there is little understanding about how they are vulnerable to manipulation by adversaries. In this paper, we discuss how two broad classes of explanations are vulnerable to manipulation. We demonstrate how adversaries can design biased models that manipulate model agnostic feature attribution methods (e.g., LIME & SHAP) and counterfactual explanations that hill-climb during the counterfactual search (e.g., Wachter's Algorithm & DiCE) into concealing the model's biases. These vulnerabilities allow an adversary to deploy a biased model, yet explanations will not reveal this bias, thereby deceiving stakeholders into trusting the model. We evaluate the manipulations on real world data sets, including COMPAS and Communities & Crime, and find explanations can be manipulated in practice.	翻訳日:2021-06-24 16:09:15 公開日:2021-06-23
# (参考訳) ニューラルネットワークにおける近似可逆性のための特徴アライメント Feature Alignment for Approximated Reversibility in Neural Networks ( http://arxiv.org/abs/2106.12562v1 ) ライセンス: CC BY 4.0	Tiago de Souza Farias and Jonas Maziero	(参考訳) 本稿では,ニューラルネットワークにおける近似可逆性を得る手法である特徴アライメントを導入する。特徴抽出によって、ニューラルネットワークを訓練して、出力から入力への逆プロセスのための推定マップを学習することができる。変分オートエンコーダと組み合わせることで、トレーニングデータと同じ統計から新しいサンプルを生成することができる。生成的対向ネットワークの概念を用いて, 結果の改善を図った。最後に、ニューラルネットワークをローカルにトレーニングし、計算メモリリソースを節約するためにこの技術を変更可能であることを示す。これらの手法を適用し,MNIST,CIFAR-10,celebAの3つの視覚生成課題について報告する。 We introduce feature alignment, a technique for obtaining approximate reversibility in artificial neural networks. By means of feature extraction, we can train a neural network to learn an estimated map for its reverse process from outputs to inputs. Combined with variational autoencoders, we can generate new samples from the same statistics as the training data. Improvements of the results are obtained by using concepts from generative adversarial networks. Finally, we show that the technique can be modified for training neural networks locally, saving computational memory resources. Applying these techniques, we report results for three vision generative tasks: MNIST, CIFAR-10, and celebA.	翻訳日:2021-06-24 15:52:19 公開日:2021-06-23
# エイリアスフリー生成型adversarial network Alias-Free Generative Adversarial Networks ( http://arxiv.org/abs/2106.12423v1 ) ライセンス: Link先を確認	Tero Karras, Miika Aittala, Samuli Laine, Erik H\"ark\"onen, Janne Hellsten, Jaakko Lehtinen, Timo Aila	(参考訳) 階層的畳み込みの性質にもかかわらず、典型的な生成逆数ネットワークの合成過程は不健全な方法で絶対画素座標に依存する。例えば、ディテールは描写されたオブジェクトの表面ではなく、画像座標に接着されているように見える。我々は、生成ネットワーク内でエイリアスを引き起こす不注意信号処理に根本原因を辿る。ネットワーク内のすべての信号を連続的に解釈すると、不要な情報が階層的な合成プロセスに漏れないことを保証する小さなアーキテクチャ変更が一般的に適用される。その結果得られるネットワークはstylegan2のfidと一致するが、内部表現では劇的に異なり、サブピクセルスケールでも翻訳と回転に完全同値である。その結果,ビデオやアニメーションに適した生成モデルへの道が開けた。 We observe that despite their hierarchical convolutional nature, the synthesis process of typical generative adversarial networks depends on absolute pixel coordinates in an unhealthy manner. This manifests itself as, e.g., detail appearing to be glued to image coordinates instead of the surfaces of depicted objects. We trace the root cause to careless signal processing that causes aliasing in the generator network. Interpreting all signals in the network as continuous, we derive generally applicable, small architectural changes that guarantee that unwanted information cannot leak into the hierarchical synthesis process. The resulting networks match the FID of StyleGAN2 but differ dramatically in their internal representations, and they are fully equivariant to translation and rotation even at subpixel scales. Our results pave the way for generative models better suited for video and animation.	翻訳日:2021-06-24 15:37:32 公開日:2021-06-23
# NodePiece: 大きな知識グラフの合成とパラメータ効率の良い表現 NodePiece: Compositional and Parameter-Efficient Representations of Large Knowledge Graphs ( http://arxiv.org/abs/2106.12144v1 ) ライセンス: Link先を確認	Mikhail Galkin, Jiapeng Wu, Etienne Denis, William L. Hamilton	(参考訳) 知識グラフ(KG)の従来の表現学習アルゴリズムは、各エンティティを独自の埋め込みベクトルにマッピングする。このような浅いルックアップは、埋め込み行列を格納するためのメモリ消費の線形増加をもたらし、現実世界のKGを扱う際に高い計算コストを発生させる。 NLPで一般的に使われているサブワードトークン化と平行に描画することで、サブ線形メモリ要求を伴うパラメータ効率の高いノード埋め込み戦略の展望を探る。そこで我々は,固定サイズのエンティティ語彙を学習するためのアンカーベースアプローチであるnodepieceを提案する。ノードピースでは、既知の関係型を持つグラフのアンカーノードからサブワード/サブエンティティ単位の語彙を構築する。このような固定サイズの語彙を考えると、トレーニング中に見えないものを含むあらゆるエンティティのエンコーディングと埋め込みをブートストラップすることができる。実験によると、NodePieceはノード分類、リンク予測、関係予測タスクにおいて競合的に動作し、グラフ内の明示的なノードの10%未満をアンカーとして保持し、しばしば10倍のパラメータを持つ。 Conventional representation learning algorithms for knowledge graphs (KG) map each entity to a unique embedding vector. Such a shallow lookup results in a linear growth of memory consumption for storing the embedding matrix and incurs high computational costs when working with real-world KGs. Drawing parallels with subword tokenization commonly used in NLP, we explore the landscape of more parameter-efficient node embedding strategies with possibly sublinear memory requirements. To this end, we propose NodePiece, an anchor-based approach to learn a fixed-size entity vocabulary. In NodePiece, a vocabulary of subword/sub-entity units is constructed from anchor nodes in a graph with known relation types. Given such a fixed-size vocabulary, it is possible to bootstrap an encoding and embedding for any entity, including those unseen during training. Experiments show that NodePiece performs competitively in node classification, link prediction, and relation prediction tasks while retaining less than 10% of explicit nodes in a graph as anchors and often having 10x fewer parameters.	翻訳日:2021-06-24 15:37:01 公開日:2021-06-23
# トランスファーラーニングとデータ変換による事前学習視覚モデルによるテキストデータの分類 Classifying Textual Data with Pre-trained Vision Models through Transfer Learning and Data Transformations ( http://arxiv.org/abs/2106.12479v1 ) ライセンス: Link先を確認	Charaf Eddine Benarab	(参考訳) 知識は経験を通じて人間によって獲得され、異なるタスクで同時に達成できる知識の種類やスキルレベルの境界は設定されない。ニューラルネットワークに関しては、そうではありませんが、この分野における大きなブレークスルーは極めてタスクとドメイン特化です。ビジョンと言語は別々の方法で処理され、別々のメソッドと異なるデータセットを使用する。本稿では,imagenetでトレーニングされたベンチマークビジョンモデルによって得られた知識を用いて,より小さなアーキテクチャでテキストの分類を学ぶことを提案する。 IMDBデータセットに含まれるテキストデータをグレースケールイメージに変換する。異なる領域の解析と転送学習法を実行する。まったく異なるデータセットによる課題にもかかわらず、有望な結果が得られます。この研究の主な貢献は、言語とビジョンの両方で事前訓練された大きなモデルを結びつけて、元のタスクと異なるサブフィールドで最新の結果を達成する、新しいアプローチである。計算能力の高いリソースを必要とせず具体的には、視覚と言語モデル間の知識を転送して感情分析を行う。 BERT埋め込みはグレースケールのイメージに変換され、これらのイメージはVGG16やResNet Index Terms:自然言語、ビジョン、BERT、Transfer Learning、CNN、ドメイン適応といった事前訓練されたビジョンモデルのトレーニング例として使用される。 Knowledge is acquired by humans through experience, and no boundary is set between the kinds of knowledge or skill levels we can achieve on different tasks at the same time. When it comes to Neural Networks, that is not the case, the major breakthroughs in the field are extremely task and domain specific. Vision and language are dealt with in separate manners, using separate methods and different datasets. In this work, we propose to use knowledge acquired by benchmark Vision Models which are trained on ImageNet to help a much smaller architecture learn to classify text. After transforming the textual data contained in the IMDB dataset to gray scale images. An analysis of different domains and the Transfer Learning method is carried out. Despite the challenge posed by the very different datasets, promising results are achieved. The main contribution of this work is a novel approach which links large pretrained models on both language and vision to achieve state-of-the-art results in different sub-fields from the original task. Without needing high compute capacity resources. 
Specifically, sentiment analysis is achieved after transferring knowledge between vision and language models. BERT embeddings are transformed into grayscale images, which are then used as training examples for pretrained vision models such as VGG16 and ResNet. Index Terms: Natural language, Vision, BERT, Transfer Learning, CNN, Domain Adaptation.	翻訳日:2021-06-24 15:36:43 公開日:2021-06-23
# 血液細胞の多種分類 --エンドツーエンドコンピュータビジョンに基づく診断ケーススタディ- Multi-Class Classification of Blood Cells -- End to End Computer Vision based diagnosis case study ( http://arxiv.org/abs/2106.12548v1 ) ライセンス: Link先を確認	Sai Sukruth Bezugam	(参考訳) 血液ベースの疾患の診断は、しばしば患者の血液サンプルを特定して特徴付ける。血液細胞サブタイプの検出と分類の自動化は、重要な医学的応用である。医療画像の自動処理と分析は、医療診断に強力なツールを提供する。本研究では, 白血球の外輪郭, 色の形態的特徴に基づいて, 白血球分類の問題に取り組む。 The work we would explore a set of preprocessing and segmentation (Color-based segmentation, Morphological processing, contouring) algorithms along with a set of features extraction methods (Corner detection algorithms and Histogram of Gradients(HOG)), dimensionality reduction algorithms (Principal Component Analysis(PCA)) that are able to recognize and classify through various Unsupervised(k-nearest neighbors) and Supervised (Support Vector Machine, Decision Trees, Linear Discriminant Analysis, Quadratic Discriminant Analysis, Naive Bayes) algorithms different categories of white blood cells to Eosinophil, Lymphocyte, Monocyte, and Neutrophil. さまざまなDeep Convolutional Neural Network Architecture(Sqeezent、MobilenetV1、MobilenetV2、InceptionNetなど)の探求も進めています。前処理/セグメンテーションおよび前処理なしで。我々は、最小時間複雑さと低リソース要求でロバストなアルゴリズムを特定するために、多くのアルゴリズムを探求したい。この研究の結果は、自動的な血液細胞分類に必要なアルゴリズムの選択の手がかりとなる可能性がある。 The diagnosis of blood-based diseases often involves identifying and characterizing patient blood samples. Automated methods to detect and classify blood cell subtypes have important medical applications. Automated medical image processing and analysis offers a powerful tool for medical diagnosis. In this work we tackle the problem of white blood cell classification based on the morphological characteristics of their outer contour, color. 
We explore a set of preprocessing and segmentation algorithms (color-based segmentation, morphological processing, contouring), feature extraction methods (corner detection algorithms and Histogram of Oriented Gradients (HOG)), and dimensionality reduction algorithms (Principal Component Analysis (PCA)). Combined with various unsupervised (k-nearest neighbors) and supervised (Support Vector Machine, Decision Trees, Linear Discriminant Analysis, Quadratic Discriminant Analysis, Naive Bayes) algorithms, these are able to recognize white blood cells and classify them into four categories: Eosinophil, Lymphocyte, Monocyte, and Neutrophil. We go a step further and explore various deep convolutional neural network architectures (SqueezeNet, MobileNetV1, MobileNetV2, InceptionNet, etc.), both with and without preprocessing/segmentation. We explore many algorithms in order to identify a robust one with the least time complexity and low resource requirements. The outcome of this work can guide the selection of algorithms for automated blood cell classification according to the requirements at hand.	翻訳日:2021-06-24 15:36:21 公開日:2021-06-23
# 安定,高速,高精度:相対的位置エンコーディングによるカーネル化注意 Stable, Fast and Accurate: Kernelized Attention with Relative Positional Encoding ( http://arxiv.org/abs/2106.12566v1 ) ライセンス: Link先を確認	Shengjie Luo, Shanda Li, Tianle Cai, Di He, Dinglan Peng, Shuxin Zheng, Guolin Ke, Liwei Wang, Tie-Yan Liu	(参考訳) トランスフォーマーの重要な要素であるアテンションモジュールは、二次複雑性のため、長いシーケンスに対して効率的にスケールできない。多くの作品は、元々の注意でドット指数のソフトマックス関数を近似することに焦点を当てており、サブクアドラティックあるいは線形複雑トランスフォーマーアーキテクチャへと繋がる。しかし,これらの手法は,例えば相対位置符号化 (rpe) を用いたトランスフォーマなど,dot-then-exponentiate スタイルを超えて,より強力な注意モジュールには適用できないことを示す。多くの最先端モデルでは、相対的な位置符号化がデフォルトとして使用されるため、RPEを組み込む効率的なトランスフォーマーを設計することは魅力的である。本稿では、カーネル化された注目の上にRPEを持つトランスフォーマーの注意計算を高速化する新しい手法を提案する。相対的な位置符号化がtoeplitz行列を形成するという観測に基づいて,高速フーリエ変換(fft)を用いてrpeによるカーネル化注意を効率的に計算できることを数学的に示す。 FFTでは,時間複雑性を$\mathcal{O}(n\log n)$とする。さらに, 相対的位置符号化を適切に使用することで, バニラ核化注意のトレーニング不安定性問題を軽減できることを示す。幅広いタスクにおいて、最適化の問題なしにモデルをゼロからトレーニングできることを経験的に示します。学習されたモデルは、多くの効率的なTransformer変種よりも優れた性能を示し、長周期の標準的なTransformerよりも高速である。 The attention module, which is a crucial component in Transformer, cannot scale efficiently to long sequences due to its quadratic complexity. Many works focus on approximating the dot-then-exponentiate softmax function in the original attention, leading to sub-quadratic or even linear-complexity Transformer architectures. However, we show that these methods cannot be applied to more powerful attention modules that go beyond the dot-then-exponentiate style, e.g., Transformers with relative positional encoding (RPE). Since in many state-of-the-art models, relative positional encoding is used as default, designing efficient Transformers that can incorporate RPE is appealing. In this paper, we propose a novel way to accelerate attention calculation for Transformers with RPE on top of the kernelized attention. Based upon the observation that relative positional encoding forms a Toeplitz matrix, we mathematically show that kernelized attention with RPE can be calculated efficiently using Fast Fourier Transform (FFT). 
With FFT, our method achieves $\mathcal{O}(n\log n)$ time complexity. Interestingly, we further demonstrate that properly using relative positional encoding can mitigate the training instability problem of vanilla kernelized attention. On a wide range of tasks, we empirically show that our models can be trained from scratch without any optimization issues. The learned model performs better than many efficient Transformer variants and is faster than standard Transformer in the long-sequence regime.	翻訳日:2021-06-24 15:36:03 公開日:2021-06-23
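The FFT step named in the abstract above rests on a standard fact: multiplying a Toeplitz matrix (entries that depend only on the offset i - j, as a relative positional bias does) by a vector can be done in $\mathcal{O}(n\log n)$ by embedding the matrix in a circulant one. The following is a minimal sketch of that single building block in plain Python, not the authors' attention implementation; the function names `fft`, `toeplitz_matvec`, `toeplitz_matvec_naive` and the storage layout of `b` are assumptions made for illustration.

```python
import cmath


def fft(a, invert=False):
    """Recursive radix-2 FFT (unnormalised); len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out


def toeplitz_matvec(b, x):
    """Return y with y[i] = sum_j b[i - j] * x[j], using FFTs.

    Hypothetical layout: `b` has length 2n - 1 and b[d + n - 1] holds the
    value for relative offset d, with d ranging over -(n-1) .. n-1.
    """
    n = len(x)
    size = 1
    while size < 2 * n:  # power of two, large enough to avoid wrap-around
        size *= 2
    # First column of the circulant matrix that embeds the Toeplitz matrix.
    c = [0.0] * size
    for d in range(n):  # non-negative offsets sit at the front
        c[d] = b[d + n - 1]
    for d in range(1, n):  # negative offsets wrap around to the back
        c[size - d] = b[-d + n - 1]
    padded = list(x) + [0.0] * (size - n)
    spectrum = [u * v for u, v in zip(fft(c), fft(padded))]
    y = fft(spectrum, invert=True)
    return [(v / size).real for v in y[:n]]


def toeplitz_matvec_naive(b, x):
    """Quadratic-time reference used only to check the fast version."""
    n = len(x)
    return [sum(b[i - j + n - 1] * x[j] for j in range(n)) for i in range(n)]
```

Calling `toeplitz_matvec(b, x)` and `toeplitz_matvec_naive(b, x)` on the same inputs should agree up to floating-point error; how this product is combined with the kernel feature maps is described in the paper itself and is not reproduced here.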
# 説明可能な機械学習における科学研究のための合成ベンチマーク Synthetic Benchmarks for Scientific Research in Explainable Machine Learning ( http://arxiv.org/abs/2106.12543v1 ) ライセンス: Link先を確認	Yang Liu, Sujay Khandagale, Colin White, Willie Neiswanger	(参考訳) 機械学習モデルがより複雑になり、アプリケーションがよりハイテイクになるにつれて、モデル予測を説明するツールがますます重要になっている。説明可能性技術が広く使われているにもかかわらず、異なる特徴帰属法の評価と比較は依然として困難である: 評価は理想的には人間の研究を必要とし、経験的評価メトリクスは実世界のデータセットでは計算的に禁止されることが多い。本稿では,XAI-Benchという合成データセットのスイートと,特徴属性アルゴリズムをベンチマークするライブラリのリリースによってこの問題に対処する。実世界のデータセットとは異なり、合成データセットは、地味なShapley値やその他のメトリクスを評価するのに必要な条件付き期待値の効率的な計算を可能にする。私たちがリリースした合成データセットは、現実世界のデータをシミュレートするように構成できる幅広いパラメータを提供します。我々は,いくつかの評価指標にまたがる一般的な説明可能性手法をベンチマークし,一般的な説明者の障害モードを特定することで,図書館のパワーを実証する。ライブラリの効率は、開発からデプロイまで、新しい説明可能性メソッドをもたらすのに役立つでしょう。 As machine learning models grow more complex and their applications become more high-stakes, tools for explaining model predictions have become increasingly important. Despite the widespread use of explainability techniques, evaluating and comparing different feature attribution methods remains challenging: evaluations ideally require human studies, and empirical evaluation metrics are often computationally prohibitive on real-world datasets. In this work, we address this issue by releasing XAI-Bench: a suite of synthetic datasets along with a library for benchmarking feature attribution algorithms. Unlike real-world datasets, synthetic datasets allow the efficient computation of conditional expected values that are needed to evaluate ground-truth Shapley values and other metrics. The synthetic datasets we release offer a wide variety of parameters that can be configured to simulate real-world data. We demonstrate the power of our library by benchmarking popular explainability techniques across several evaluation metrics and identifying failure modes for popular explainers. The efficiency of our library will help bring new explainability methods from development to deployment.	翻訳日:2021-06-24 15:35:37 公開日:2021-06-23
# 特徴可視化はCNNアクティベーションの因果理解にどの程度有効か? How Well do Feature Visualizations Support Causal Understanding of CNN Activations? ( http://arxiv.org/abs/2106.12447v1 ) ライセンス: Link先を確認	Roland S. Zimmermann, Judy Borowski, Robert Geirhos, Matthias Bethge, Thomas S. A. Wallis, Wieland Brendel	(参考訳) 深層畳み込みニューラルネットワークの内部動作を理解するために広く用いられるアプローチの1つは、アクティベーションの最大化による単位応答の可視化である。アクティベーションの最大化による特徴可視化は、ユニットをアクティベートする画像の特徴に関する正確な情報を提供すると考えられている。もしこれが本当なら、これらの合成画像は、画像の特定のパッチ(例えば犬の頭)を隠すとユニットのアクティベーションが変化するかどうかなど、介入の効果を人間が予測できるようにするはずである。ここでは、2つの正方形のオクルージョンのどちらがユニットのアクティベーションにより大きな変化を引き起こすかを人間に予測させることで、この仮説を検証する。大規模なクラウドソース実験と専門家による測定の両方から、Olahら(2017)による極めて強く活性化する特徴可視化は、平均的にこのタスクで確かに人間を助けることが示された(精度$67 \pm 4\%$;可視化なしのベースライン性能は$60 \pm 3\%$)。しかし、同様の性能(精度$66 \pm 3\%$から$67 \pm 3\%$)をもたらす他の可視化(例えばデータセットサンプル)に対して、有意な優位性は提供しない。本研究では、人間に対するユニットレベルの解釈可能性手法の利点を定量化するための客観的な心理物理学的課題を提案し、特徴可視化が単純な代替的可視化よりも優れた「因果的理解」を人間に与えることを示す証拠は見つからなかった。 One widely used approach towards understanding the inner workings of deep convolutional neural networks is to visualize unit responses via activation maximization. Feature visualizations via activation maximization are thought to provide humans with precise information about the image features that cause a unit to be activated. If this is indeed true, these synthetic images should enable humans to predict the effect of an intervention, such as whether occluding a certain patch of the image (say, a dog's head) changes a unit's activation. Here, we test this hypothesis by asking humans to predict which of two square occlusions causes a larger change to a unit's activation. Both a large-scale crowdsourced experiment and measurements with experts show that on average, the extremely activating feature visualizations by Olah et al. (2017) indeed help humans on this task ($67 \pm 4\%$ accuracy; baseline performance without any visualizations is $60 \pm 3\%$). However, they do not provide any significant advantage over other visualizations (e.g., dataset samples), which yield similar performance ($66 \pm 3\%$ to $67 \pm 3\%$ accuracy). Taken together, we propose an objective psychophysical task to quantify the benefit of unit-level interpretability methods for humans, and find no evidence that feature visualizations provide humans with better "causal understanding" than simple alternative visualizations.	翻訳日:2021-06-24 15:34:31 公開日:2021-06-23
# 粗Q注意:離散化による視覚ロボットマニピュレーションのための効率的な学習 Coarse-to-Fine Q-attention: Efficient Learning for Visual Robotic Manipulation via Discretisation ( http://arxiv.org/abs/2106.12534v1 ) ライセンス: Link先を確認	Stephen James, Kentaro Wada, Tristan Laidlow, Andrew J. Davison	(参考訳) 過去数年間を振り返ると、深層強化学習(RL)における最大のブレークスルーは、離散的なアクション領域にある。しかし、ロボット操作は本質的には連続制御環境であるが、これらの連続制御強化学習アルゴリズムは、俳優と批評家の共同最適化のため、サンプル非効率で本質的に訓練が困難であるアクタ-批判的手法に依存することが多い。そこで我々は,ロボット操作領域に離散型アクションrlアルゴリズムの安定性を実現する方法について検討する。我々は最近リリースされたARMアルゴリズムを拡張し、連続する次ベストポーズエージェントを離散的な次ベストポーズエージェントに置き換える。回転の離散化はその有界性を考えると自明であるが、翻訳は本質的に非有界であり、離散化は困難である。翻訳予測は3次元空間を判別することでボクセル予測問題として定式化するが、大きなワークスペースのボクセル化はメモリ集約的であり、ボクセルの密度が高く、ロボット操作に必要な解像度を得るのに不可欠である。そこで我々は, このボクセル予測を, 分解能を徐々に高め, 粗い方法で適用することを提案する。各ステップにおいて,予測位置として最も高い値のボクセルを抽出し,次のステップで高分解能ボクセル化の中心として使用する。この粗大な予測はいくつかのステップで適用され、翻訳のほとんどロスレスな予測を与える。我々の新しい粗大きめのアルゴリズムは、連続的な制御の同等性よりもずっと効率的にRLBenchのタスクを達成でき、実世界のタスクである表状のラザを7分以内で訓練し、わずか3回のデモしか行えません。さらに,voxel表現に移行することで,複数のカメラからの観測を容易に取り入れることができることを示す。 Reflecting on the last few years, the biggest breakthroughs in deep reinforcement learning (RL) have been in the discrete action domain. Robotic manipulation, however, is inherently a continuous control environment, but these continuous control reinforcement learning algorithms often depend on actor-critic methods that are sample-inefficient and inherently difficult to train, due to the joint optimisation of the actor and critic. To that end, we explore how we can bring the stability of discrete action RL algorithms to the robot manipulation domain. We extend the recently released ARM algorithm, by replacing the continuous next-best pose agent with a discrete next-best pose agent. Discretisation of rotation is trivial given its bounded nature, while translation is inherently unbounded, making discretisation difficult. 
We formulate the translation prediction as a voxel prediction problem by discretising the 3D space; however, voxelisation of a large workspace is memory intensive and would not work with a high density of voxels, which is crucial to obtaining the resolution needed for robotic manipulation. We therefore propose to apply this voxel prediction in a coarse-to-fine manner by gradually increasing the resolution. In each step, we extract the highest valued voxel as the predicted location, which is then used as the centre of the higher-resolution voxelisation in the next step. This coarse-to-fine prediction is applied over several steps, giving a near-lossless prediction of the translation. We show that our new coarse-to-fine algorithm is able to accomplish RLBench tasks much more efficiently than the continuous control equivalent, and even train some real-world tasks, tabula rasa, in less than 7 minutes, with only 3 demonstrations. Moreover, we show that by moving to a voxel representation, we are able to easily incorporate observations from multiple cameras.	翻訳日:2021-06-24 15:34:02 公開日:2021-06-23
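The coarse-to-fine step described in the abstract above (take the highest-valued voxel, re-voxelise around it at higher resolution, repeat) can be sketched in a few lines. This is an illustrative sketch only, not the authors' Q-attention network: `score` is a hypothetical stand-in for the learned per-voxel value, and shrinking the search cube to exactly one voxel width per level is an assumption of this sketch.

```python
from itertools import product


def coarse_to_fine(score, centre, side, voxels_per_axis=4, depth=3):
    """Refine a 3D location by repeatedly picking the best voxel.

    score  : callable mapping an (x, y, z) voxel centre to a number
             (hypothetical stand-in for a learned Q-value).
    centre : centre of the initial cubic workspace.
    side   : edge length of the initial cubic workspace.
    Returns the final voxel centre and the final voxel edge length.
    """
    for _ in range(depth):
        step = side / voxels_per_axis
        best, best_value = None, float("-inf")
        for idx in product(range(voxels_per_axis), repeat=3):
            # Centre of voxel `idx` inside the current cube.
            candidate = tuple(
                centre[a] - side / 2 + (idx[a] + 0.5) * step for a in range(3)
            )
            value = score(candidate)
            if value > best_value:
                best, best_value = candidate, value
        # The best voxel becomes the (smaller) cube for the next level.
        centre, side = best, step
    return centre, side
```

With 4 voxels per axis and 3 levels, this evaluates 3 x 64 = 192 voxels, whereas a single dense grid at the same final resolution (64 voxels per axis) would need 64^3 = 262,144, which is the memory argument the abstract makes.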
# 勾配に基づく解釈可能性手法と二値化ニューラルネットワーク Gradient-Based Interpretability Methods and Binarized Neural Networks ( http://arxiv.org/abs/2106.12569v1 ) ライセンス: Link先を確認	Amy Widdicombe, Simon J. Julier	(参考訳) 二値化ニューラルネットワーク(BNN)は、エッジコンピューティングプラットフォームでディープラーニングが実行される方法に革命をもたらす可能性がある。しかし、これらのネットワークにおける解釈可能性手法の有効性は評価されていない。本稿では、二値化ニューラルネットワークと完全精度ニューラルネットワーク(FPNN)に適用した場合の、広く使われているいくつかのサリエンシーマップに基づく解釈可能性手法(Gradient、SmoothGrad、GradCAM)の性能を比較する。基本的なGradient法は両タイプのネットワークに対して非常に類似したマップを生成する。しかし、SmoothGradはBNNに対して著しくノイズの多いマップを生成する。 GradCAMもネットワークタイプによって異なるサリエンシーマップを生成し、一部のBNNでは意味をなさないように見える説明になる。我々は、これらの相違の考えられる原因を考察し、より広い範囲のネットワークタイプに対して解釈可能性手法をテストすべき理由の一例として提示する。 Binarized Neural Networks (BNNs) have the potential to revolutionize the way that deep learning is carried out in edge computing platforms. However, the effectiveness of interpretability methods on these networks has not been assessed. In this paper, we compare the performance of several widely used saliency map-based interpretability techniques (Gradient, SmoothGrad and GradCAM) when applied to Binarized or Full Precision Neural Networks (FPNNs). We found that the basic Gradient method produces very similar-looking maps for both types of network. However, SmoothGrad produces significantly noisier maps for BNNs. GradCAM also produces saliency maps which differ between network types, with some of the BNNs having seemingly nonsensical explanations. We comment on possible reasons for these differences in explanations and present it as an example of why interpretability techniques should be tested on a wider range of network types.	翻訳日:2021-06-24 15:33:29 公開日:2021-06-23
# ブラックウェルによる公正オンライン学習への統一的アプローチ A Unified Approach to Fair Online Learning via Blackwell Approachability ( http://arxiv.org/abs/2106.12242v1 ) ライセンス: Link先を確認	Evgenii Chzhen (LMO, CELESTE), Christophe Giraud (LMO, CELESTE), Gilles Stoltz (LMO, CELESTE)	(参考訳) 確率的かつ非敏感な文脈でオンライン学習を公平に行うための設定と一般的なアプローチを提供する。設定はプレイヤーと自然の間の繰り返しのゲームであり、それぞれのステージにおいてそれぞれのコンテキストに基づいてアクションを選択する。不知性の概念に触発されて、プレイヤーは決定を下す前に非敏感なコンテキストにしかアクセスできないと仮定し、同時に、敏感なコンテキストにアクセスする自然のケースと、敏感なコンテキストに気付いていない自然のケースについて論じる。未知の文脈分布の場合を扱うためにブラックウェルのアプローチ可能性理論を適用することにより、学習目的が公正性制約に適合するために必要な一般的な条件を提供する。この条件は (group-wise) no-regret と (group-wise) calibration の目的と、追加の制約として人口順にインスタンス化される。目的が制約と適合しない場合、提供されたフレームワークは、両者間の最適なトレードオフを特徴付けることができる。 We provide a setting and a general approach to fair online learning with stochastic sensitive and non-sensitive contexts. The setting is a repeated game between the Player and Nature, where at each stage both pick actions based on the contexts. Inspired by the notion of unawareness, we assume that the Player can only access the non-sensitive context before making a decision, while we discuss both cases of Nature accessing the sensitive contexts and Nature unaware of the sensitive contexts. Adapting Blackwell's approachability theory to handle the case of an unknown contexts' distribution, we provide a general necessary and sufficient condition for learning objectives to be compatible with some fairness constraints. This condition is instantiated on (group-wise) no-regret and (group-wise) calibration objectives, and on demographic parity as an additional constraint. When the objective is not compatible with the constraint, the provided framework permits to characterise the optimal trade-off between the two.	翻訳日:2021-06-24 15:33:14 公開日:2021-06-23
# IQ-Learn:模倣のための逆ソフトQ学習 IQ-Learn: Inverse soft-Q Learning for Imitation ( http://arxiv.org/abs/2106.12142v1 ) ライセンス: Link先を確認	Divyansh Garg, Shuvam Chakraborty, Chris Cundy, Jiaming Song, Stefano Ermon	(参考訳) 多くの逐次的な意思決定問題(ロボット制御、ゲームプレイ、逐次予測など)では、人間または専門家のデータがタスクに関する有用な情報を含んでいる。しかし、少量のエキスパートデータからの模倣学習(il)は、複雑なダイナミクスを持つ高次元環境では困難である。振る舞いのクローニングは、実装の単純さと安定した収束性のため広く使われている単純な方法であるが、環境のダイナミクスに関する情報は利用しない。力学情報を利用する既存の多くの手法は、報酬や政策近似に対する逆最適化プロセスや偏りのある高分散勾配推定器による訓練が困難である。本稿では,1つのq関数を学習し,報酬と方針の両方を暗黙的に表現することにより,敵対的トレーニングを回避するダイナミクス認識il法を提案する。標準ベンチマークでは,暗黙的に学習した報奨は,強健な報奨と高い正の相関を示すが,本手法は逆強化学習(IRL)にも利用できる。提案手法である逆ソフトq学習(iq-learn)は,オフラインとオンラインの模倣学習環境において,必要な環境相互作用の数と高次元空間のスケーラビリティの両方において既存の手法を上回って最先端の結果を得る。 In many sequential decision-making problems (e.g., robotics control, game playing, sequential prediction), human or expert data is available containing useful information about the task. However, imitation learning (IL) from a small amount of expert data can be challenging in high-dimensional environments with complex dynamics. Behavioral cloning is a simple method that is widely used due to its simplicity of implementation and stable convergence but doesn't utilize any information involving the environment's dynamics. Many existing methods that exploit dynamics information are difficult to train in practice due to an adversarial optimization process over reward and policy approximators or biased, high variance gradient estimators. We introduce a method for dynamics-aware IL which avoids adversarial training by learning a single Q-function, implicitly representing both reward and policy. On standard benchmarks, the implicitly learned rewards show a high positive correlation with the ground-truth rewards, illustrating our method can also be used for inverse reinforcement learning (IRL). 
Our method, Inverse soft-Q learning (IQ-Learn), obtains state-of-the-art results in offline and online imitation learning settings, surpassing existing methods both in the number of required environment interactions and in scalability to high-dimensional spaces.	翻訳日:2021-06-24 15:31:46 公開日:2021-06-23
# 制約プログラミングによるベイズネットワーク構造学習のための非巡回性推論の改良 Improved Acyclicity Reasoning for Bayesian Network Structure Learning with Constraint Programming ( http://arxiv.org/abs/2106.12269v1 ) ライセンス: Link先を確認	Fulya Trösser (MIAT INRA), Simon de Givry (MIAT INRA), George Katsirelos (MIA-Paris)	(参考訳) ベイジアンネットワークは確率的グラフィカルモデルであり、遺伝子制御ネットワークの推論、リスク分析、画像処理など幅広い応用領域を持つ。離散データからベイジアンネットワークの構造を学習すること(BNSL)は、有向非巡回グラフの超指数的な探索空間を持つNP困難なタスクであることが知られている。本研究では、全ての可能なクラスタカットの部分集合を発見するための新しい多項式時間アルゴリズム、得られる線形プログラムを近似的に解く貪欲アルゴリズム、非巡回性制約に対する一般化アーク整合アルゴリズムを提案する。これらを制約プログラミングに基づく分枝限定法ソルバーCPBayesに組み込み、最適ではないにもかかわらず、性能が桁違いに向上することを示す。得られたソルバーは、各カットを発見するためにNP困難問題を解き、線形プログラムを厳密に解く、BNSL問題の最先端ソルバーであるGOBNILPと比べても遜色ない。 Bayesian networks are probabilistic graphical models with a wide range of application areas including gene regulatory networks inference, risk analysis and image processing. Learning the structure of a Bayesian network (BNSL) from discrete data is known to be an NP-hard task with a superexponential search space of directed acyclic graphs. In this work, we propose a new polynomial time algorithm for discovering a subset of all possible cluster cuts, a greedy algorithm for approximately solving the resulting linear program, and a generalised arc consistency algorithm for the acyclicity constraint. We embed these in the constraint programming-based branch-and-bound solver CPBayes and show that, despite being suboptimal, they improve performance by orders of magnitude. The resulting solver also compares favourably with GOBNILP, a state-of-the-art solver for the BNSL problem which solves an NP-hard problem to discover each cut and solves the linear program exactly.	翻訳日:2021-06-24 15:31:23 公開日:2021-06-23
# AC/DC: ディープニューラルネットワークの交互圧縮/非圧縮訓練 AC/DC: Alternating Compressed/DeCompressed Training of Deep Neural Networks ( http://arxiv.org/abs/2106.12379v1 ) ライセンス: Link先を確認	Alexandra Peste, Eugenia Iofinova, Adrian Vladu, Dan Alistarh	(参考訳) ディープニューラルネットワーク(DNN)の計算要求の増大により、疎でありながら正確なDNNモデルを得ることに大きな関心が集まっている。最近の研究は、訓練中の計算コストを削減するためにDNNの重みを可能な限り最初から疎にしておく、スパーストレーニングというさらに難しいケースを調査している。既存のスパーストレーニング法は主に経験的であり、密なベースラインと比べて精度が低いことが多い。本稿では、DNNのAlternating Compressed/DeCompressed (AC/DC) トレーニングと呼ばれる一般的な手法を提案し、アルゴリズムの一変種について収束性を示し、同程度の計算予算においてAC/DCが既存のスパーストレーニング手法を精度で上回ることを示す。高いスパース度では、AC/DCは正確な事前学習済み密モデルに依存する既存手法をも上回る。 AC/DCの重要な特性は、密モデルとスパースモデルの同時訓練が可能であり、トレーニングプロセスの終了時に正確なスパース・密のモデルペアが得られることである。これは実用上有用であり、圧縮された変種はトレーニングフロー全体をやり直すことなくリソース制約のある環境への展開に適しており、また、密モデルと圧縮モデルの間の精度ギャップに関する洞察も与える。 The increasing computational requirements of deep neural networks (DNNs) have led to significant interest in obtaining DNN models that are sparse, yet accurate. Recent work has investigated the even harder case of sparse training, where the DNN weights are, for as much as possible, already sparse to reduce computational costs during training. Existing sparse training methods are mainly empirical and often have lower accuracy relative to the dense baseline. In this paper, we present a general approach called Alternating Compressed/DeCompressed (AC/DC) training of DNNs, demonstrate convergence for a variant of the algorithm, and show that AC/DC outperforms existing sparse training methods in accuracy at similar computational budgets; at high sparsity levels, AC/DC even outperforms existing methods that rely on accurate pre-trained dense models. An important property of AC/DC is that it allows co-training of dense and sparse models, yielding accurate sparse-dense model pairs at the end of the training process.
This is useful in practice, where compressed variants may be desirable for deployment in resource-constrained settings without re-doing the entire training flow, and also provides us with insights into the accuracy gap between dense and compressed models.	翻訳日:2021-06-24 15:31:06 公開日:2021-06-23
# 神経odeにおける予測を超える:同定と介入 Beyond Predictions in Neural ODEs: Identification and Interventions ( http://arxiv.org/abs/2106.12430v1 ) ライセンス: Link先を確認	Hananeh Aliee, Fabian J. Theis, Niki Kilbertus	(参考訳) パターンマッチングと予測タスクの膨大な成功に刺激され、研究者は独自の科学的発見を支援するために機械学習に頼るようになった。システムに関する大量の観測データがあれば、その進化を支配するルールを解明できるだろうか? このタスクの解決は、因果的相互作用を完全に理解し、介入の下でシステムの振る舞いについて信頼できる予測を行うという大きな約束を果たす。我々は、通常の微分方程式(ODE)系から生成された時系列データに対して、この問題に答えるための一歩を踏み出した。ガバナンスODEはデータだけでは識別できないかもしれないが、フレキシブルなニューラルODEと単純な正規化スキームを組み合わせることで、時系列データから動的および因果構造を堅牢に復元できることを示す。提案手法は, 実データと同様に, 様々な(非)線形一階および二階システムにおいて検証された。我々は、変数やシステム自体の介入の下で正確な予測を行うこともできることを示して結論付けます。 Spurred by tremendous success in pattern matching and prediction tasks, researchers increasingly resort to machine learning to aid original scientific discovery. Given large amounts of observational data about a system, can we uncover the rules that govern its evolution? Solving this task holds the great promise of fully understanding the causal interactions and being able to make reliable predictions about the system's behavior under interventions. We take a step towards answering this question for time-series data generated from systems of ordinary differential equations (ODEs). While the governing ODEs might not be identifiable from data alone, we show that combining simple regularization schemes with flexible neural ODEs can robustly recover the dynamics and causal structures from time-series data. Our results on a variety of (non)-linear first and second order systems as well as real data validate our method. We conclude by showing that we can also make accurate predictions under interventions on variables or the system itself.	翻訳日:2021-06-24 15:30:44 公開日:2021-06-23
# レバレッジ統計とイノベーションサーチによるクローズドフォーム,プロビブル,ロバストPCA Closed-Form, Provable, and Robust PCA via Leverage Statistics and Innovation Search ( http://arxiv.org/abs/2106.12190v1 ) ライセンス: Link先を確認	Mostafa Rahmani and Ping Li	(参考訳) データクラスタリングのために最初に提案されたInnovation Searchのアイデアは、最近、外れ値検出に使用された。異常検出のためのイノベーション探索の応用において、データポイントの革新を測定するためにイノベーションの方向性が活用された。本研究では,革新探索アルゴリズムで計算された革新価値を二次コスト関数で検討し,新しいコスト関数を用いた革新価値がスコアの活用に等しいことを証明した。この興味深い接続は、Levanage ScoreベースのロバストPCA法に対するいくつかの理論的保証を確立し、新しいロバストPCA法を設計するために利用される。理論的には、アウトリアー分布とインリアー分布の異なるモデルによるパフォーマンス保証が含まれる。さらに,ノイズの存在に対するアルゴリズムの堅牢性を示す。数値的および理論的研究は、提案手法は高速かつ閉形式であるが、既存のアルゴリズムの大部分を上回ることができることを示している。 The idea of Innovation Search, which was initially proposed for data clustering, was recently used for outlier detection. In the application of Innovation Search for outlier detection, the directions of innovation were utilized to measure the innovation of the data points. We study the Innovation Values computed by the Innovation Search algorithm under a quadratic cost function and it is proved that Innovation Values with the new cost function are equivalent to Leverage Scores. This interesting connection is utilized to establish several theoretical guarantees for a Leverage Score based robust PCA method and to design a new robust PCA method. The theoretical results include performance guarantees with different models for the distribution of outliers and the distribution of inliers. In addition, we demonstrate the robustness of the algorithms against the presence of noise. The numerical and theoretical studies indicate that while the presented approach is fast and closed-form, it can outperform most of the existing algorithms.	翻訳日:2021-06-24 15:29:42 公開日:2021-06-23
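The abstract above states that, under a quadratic cost, Innovation Values coincide with leverage scores. As a small illustration of what a leverage score is, the sketch below computes the textbook quantity $x_i^\top (X^\top X)^{-1} x_i$ for each data point in plain Python. It is not the authors' robust PCA algorithm: the helper names `solve` and `leverage_scores` are hypothetical, the data are assumed to have full column rank, and any centring, normalisation, or subspace-recovery step the paper may use is left out.

```python
def solve(a, b):
    """Solve a z = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [bi] for row, bi in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [row[n] for row in m]


def leverage_scores(points):
    """Return x_i^T (X^T X)^{-1} x_i for every point x_i (rows of X)."""
    dim = len(points[0])
    # Gram matrix X^T X; assumed invertible in this sketch.
    gram = [
        [sum(p[i] * p[j] for p in points) for j in range(dim)]
        for i in range(dim)
    ]
    scores = []
    for p in points:
        z = solve(gram, p)
        scores.append(sum(pi * zi for pi, zi in zip(p, z)))
    return scores
```

For points that mostly lie near one direction plus a single point far off that direction, the off-direction point receives the largest score, and the scores sum to the data dimension. Treating the largest scores as outlier candidates is an assumption of this sketch; the precise detection rule and its guarantees are given in the paper.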
# Random Effect Bandits ( http://arxiv.org/abs/2106.12200v1 ) License: see linked page	Rong Zhu and Branislav Kveton	This paper studies regret minimization in multi-armed bandits, a classical online learning problem. To develop more statistically-efficient algorithms, we propose to use the assumption of a random-effect model. In this model, the mean rewards of arms are drawn independently from an unknown distribution, whose parameters we estimate. We provide an estimator of the arm means in this model and also analyze its uncertainty. Based on these results, we design a UCB algorithm, which we call ReUCB. We analyze ReUCB and prove a Bayes regret bound on its $n$-round regret, which matches an existing lower bound. Our experiments show that ReUCB can outperform Thompson sampling in various scenarios, without assuming that the prior distribution of arm means is known.	Translated: 2021-06-24 15:29:27 Published: 2021-06-23
# groupShapley: Efficient prediction explanation with Shapley values for feature groups ( http://arxiv.org/abs/2106.12228v1 ) License: see linked page	Martin Jullum, Annabelle Redelmeier, Kjersti Aas	Shapley values have established themselves as one of the most appropriate and theoretically sound frameworks for explaining predictions from complex machine learning models. The popularity of Shapley values in the explanation setting is probably due to their unique theoretical properties. The main drawback of Shapley values, however, is that their computational complexity grows exponentially in the number of input features, making them infeasible in many real world situations where there could be hundreds or thousands of features. Furthermore, with many (dependent) features, presenting/visualizing and interpreting the computed Shapley values also becomes challenging. The present paper introduces groupShapley: a conceptually simple approach for dealing with the aforementioned bottlenecks. The idea is to group the features, for example by type or dependence, and then compute and present Shapley values for these groups instead of for all individual features. Reducing hundreds or thousands of features to half a dozen or so makes precise computations practically feasible and greatly simplifies the presentation and knowledge extraction. We prove that under certain conditions, groupShapley is equivalent to summing the feature-wise Shapley values within each feature group. Moreover, we provide a simulation study exemplifying the differences when these conditions are not met. We illustrate the usability of the approach in a real world car insurance example, where groupShapley is used to provide simple and intuitive explanations.	Translated: 2021-06-24 15:29:15 Published: 2021-06-23
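The grouping idea in the groupShapley abstract can be tried on a toy cooperative game. The sketch below is illustrative only and is not the authors' implementation: the function names (`shapley`, `feature_game`, `group_game`), the coefficients, and the interaction bonus are all hypothetical. It computes exact Shapley values by brute force, first for four features and then for two groups of two features. In the purely additive game the group values coincide with the summed feature values. Adding a three-way interaction that straddles the groups makes them differ.

```python
from itertools import combinations
from math import factorial


def shapley(players, value):
    """Exact Shapley values for `players` under the set function `value`."""
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain p.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (value(s | {p}) - value(s))
        phi[p] = total
    return phi


def feature_game(coeffs, bonus=0.0, bonus_set=frozenset()):
    """Toy game: additive payoffs plus an optional bonus when `bonus_set` is complete."""
    def value(subset):
        total = sum(coeffs[i] for i in subset)
        if bonus_set and bonus_set <= subset:
            total += bonus
        return total
    return value


def group_game(groups, value):
    """Lift a feature-level game to a game whose players are whole groups."""
    def group_value(group_subset):
        members = frozenset().union(*(groups[g] for g in group_subset))
        return value(members)
    return group_value


# Hypothetical toy setup: four features split into two groups.
coeffs = {0: 2.0, 1: 3.0, 2: -1.0, 3: 0.5}
groups = {"A": frozenset({0, 1}), "B": frozenset({2, 3})}


def compare(value):
    """Return (group-level Shapley values, feature-level values summed per group)."""
    feat = shapley(list(coeffs), value)
    grp = shapley(list(groups), group_game(groups, value))
    summed = {g: sum(feat[i] for i in members) for g, members in groups.items()}
    return grp, summed


additive_grp, additive_sum = compare(feature_game(coeffs))
mixed_grp, mixed_sum = compare(
    feature_game(coeffs, bonus=6.0, bonus_set=frozenset({0, 1, 2}))
)
print("additive:", additive_grp, additive_sum)
print("with cross-group interaction:", mixed_grp, mixed_sum)
```

With two groups the group-level computation visits 2 coalitions per player instead of 8, which is the source of the savings the abstract describes. The brute-force enumeration here is only practical for a handful of players.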
# ParK: Sound and Efficient Kernel Ridge Regression by Feature Space Partitions ( http://arxiv.org/abs/2106.12231v1 ) License: see linked page	Luigi Carratino, Stefano Vigogna, Daniele Calandriello, Lorenzo Rosasco	We introduce ParK, a new large-scale solver for kernel ridge regression. Our approach combines partitioning with random projections and iterative optimization to reduce space and time complexity while provably maintaining the same statistical accuracy. In particular, by constructing suitable partitions directly in the feature space rather than in the input space, we promote orthogonality between the local estimators, thus ensuring that key quantities such as local effective dimension and bias remain under control. We characterize the statistical-computational tradeoff of our model, and demonstrate the effectiveness of our method by numerical experiments on large-scale datasets.	Translated: 2021-06-24 15:28:55 Published: 2021-06-23
# Calibrating the Lee-Carter and the Poisson Lee-Carter models via Neural Networks ( http://arxiv.org/abs/2106.12312v1 ) License: see linked page	Salvatore Scognamiglio	This paper introduces a neural network approach for fitting the Lee-Carter and the Poisson Lee-Carter models on multiple populations. We develop neural networks that replicate the structure of the individual LC models and allow their joint fitting by analysing the mortality data of all the considered populations simultaneously. The neural network architecture is specifically designed to calibrate each individual model using all available information instead of using a population-specific subset of data as in the traditional estimation schemes. A large set of numerical experiments performed on all the countries of the Human Mortality Database (HMD) shows the effectiveness of our approach. In particular, the resulting parameter estimates appear smooth and less sensitive to the random fluctuations often present in mortality rate data, especially for low-population countries. In addition, the forecasting performance improves significantly.	Translated: 2021-06-24 15:28:45 Published: 2021-06-23
# Innovations Autoencoder and its Application in Real-Time Anomaly Detection ( http://arxiv.org/abs/2106.12382v1 ) License: see linked page	Xinyi Wang, Lang Tong	An innovations sequence of a time series is a sequence of independent and identically distributed random variables with which the original time series has a causal representation. The innovation at a time is statistically independent of the prior history of the time series. As such, it represents the new information contained at present but not in the past. Because of its simple probability structure, an innovations sequence is the most efficient signature of the original. Unlike the principal or independent component analysis (PCA/ICA) representations, an innovations sequence preserves not only the complete statistical properties but also the temporal order of the original time series. A long-standing open problem is to find a computationally tractable way to extract an innovations sequence of non-Gaussian processes. This paper presents a deep learning approach, referred to as Innovations Autoencoder (IAE), that extracts innovations sequences using a causal convolutional neural network. An application of IAE to nonparametric anomaly detection with unknown anomaly and anomaly-free models is also presented.	Translated: 2021-06-24 15:28:32 Published: 2021-06-23
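The definition of an innovations sequence in the abstract above is easiest to see in the simplest linear Gaussian case, which is not the non-Gaussian setting the paper targets. The sketch below is a minimal illustration under stated assumptions: the series is a first-order autoregression with a known coefficient `a`, and the helper names are hypothetical. A causal "encoder" recovers the i.i.d. driving noise from the series, and a causal "decoder" rebuilds the series from that noise. The Innovations Autoencoder instead learns such a pair of mappings with neural networks.

```python
import random


def simulate_ar1(a, n, seed=0):
    """Simulate x_t = a * x_{t-1} + e_t with standard Gaussian e_t and x_{-1} = 0."""
    rng = random.Random(seed)
    series, prev = [], 0.0
    for _ in range(n):
        prev = a * prev + rng.gauss(0.0, 1.0)
        series.append(prev)
    return series


def extract_innovations(series, a):
    """Causal encoder: e_t = x_t - a * x_{t-1}, using only present and past samples."""
    prev, innovations = 0.0, []
    for x in series:
        innovations.append(x - a * prev)
        prev = x
    return innovations


def reconstruct(innovations, a):
    """Causal decoder: x_t = a * x_{t-1} + e_t, rebuilding the series in order."""
    prev, series = 0.0, []
    for e in innovations:
        prev = a * prev + e
        series.append(prev)
    return series


a = 0.8  # assumed known here; in practice it would have to be estimated or learned
series = simulate_ar1(a, n=200)
innovations = extract_innovations(series, a)
rebuilt = reconstruct(innovations, a)
max_error = max(abs(x - y) for x, y in zip(series, rebuilt))
print("largest reconstruction error:", max_error)
```

Because both mappings are causal, the innovation at time t can be produced as soon as sample t arrives. That is what makes this kind of representation usable for real-time anomaly detection.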
# Sampling with Mirrored Stein Operators ( http://arxiv.org/abs/2106.12506v1 ) License: see linked page	Jiaxin Shi, Chang Liu, Lester Mackey	We introduce a new family of particle evolution samplers suitable for constrained domains and non-Euclidean geometries. Stein Variational Mirror Descent and Mirrored Stein Variational Gradient Descent minimize the Kullback-Leibler (KL) divergence to constrained target distributions by evolving particles in a dual space defined by a mirror map. Stein Variational Natural Gradient exploits non-Euclidean geometry to more efficiently minimize the KL divergence to unconstrained targets. We derive these samplers from a new class of mirrored Stein operators and adaptive kernels developed in this work. We demonstrate that these new samplers yield accurate approximations to distributions on the simplex, deliver valid confidence intervals in post-selection inference, and converge more rapidly than prior methods in large-scale unconstrained posterior inference. Finally, we establish the convergence of our new procedures under verifiable conditions on the target distribution.	Translated: 2021-06-24 15:28:17 Published: 2021-06-23
# Co-advise: Cross Inductive Bias Distillation ( http://arxiv.org/abs/2106.12378v1 ) License: see linked page	Sucheng Ren, Zhengqi Gao, Tianyu Hua, Zihui Xue, Yonglong Tian, Shengfeng He, Hang Zhao	Transformers have recently been adapted from the natural language processing community as a promising substitute for convolution-based neural networks in visual learning tasks. However, their supremacy degenerates given an insufficient amount of training data (e.g., ImageNet). To make them practically useful, we propose a novel distillation-based method to train vision transformers. Unlike previous works, where merely heavy convolution-based teachers are provided, we introduce lightweight teachers with different architectural inductive biases (e.g., convolution and involution) to co-advise the student transformer. The key is that teachers with different inductive biases attain different knowledge even though they are trained on the same dataset, and such different knowledge compounds and boosts the student's performance during distillation. Equipped with this cross inductive bias distillation method, our vision transformers (termed CivT) outperform all previous transformers of the same architecture on ImageNet.	Translated: 2021-06-24 15:28:00 Published: 2021-06-23
# APNN-TC: Accelerating Arbitrary Precision Neural Networks on Ampere GPU Tensor Cores ( http://arxiv.org/abs/2106.12169v1 ) License: see linked page	Boyuan Feng, Yuke Wang, Tong Geng, Ang Li, Yufei Ding	Over the years, accelerating neural networks with quantization has been widely studied. Unfortunately, prior efforts with diverse precisions (e.g., 1-bit weights and 2-bit activations) are usually restricted by limited precision support on GPUs (e.g., int1 and int4). To break such restrictions, we introduce the first Arbitrary Precision Neural Network framework (APNN-TC) to fully exploit quantization benefits on Ampere GPU Tensor Cores. Specifically, APNN-TC first incorporates a novel emulation algorithm to support arbitrary short bit-width computation with int1 compute primitives and XOR/AND Boolean operations. Second, APNN-TC integrates arbitrary precision layer designs to efficiently map our emulation algorithm to Tensor Cores with novel batching strategies and specialized memory organization. Third, APNN-TC embodies a novel arbitrary precision NN design to minimize memory access across layers and further improve performance. Extensive evaluations show that APNN-TC can achieve significant speedup over CUTLASS kernels and various NN models, such as ResNet and VGG.	Translated: 2021-06-24 15:27:44 Published: 2021-06-23
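The APNN-TC abstract mentions emulating short bit-width arithmetic with 1-bit primitives and Boolean operations. A common way to do this for unsigned operands is bit-plane decomposition. The sketch below shows that general idea in plain Python and is not the paper's GPU kernel: the function names are hypothetical, only the AND case for unsigned values is covered, and the XOR variant for signed (+1/-1) encodings is omitted. Each operand vector is split into 1-bit planes. The dot product is then the sum of `popcount(w_plane AND a_plane)` over all plane pairs, each shifted left by the combined bit position.

```python
def bit_planes(values, bits):
    """Split unsigned integers into `bits` planes; element k of the vector maps to bit k.

    Assumes every value fits in `bits` bits (higher bits would be silently dropped).
    """
    planes = []
    for b in range(bits):
        plane = 0
        for k, v in enumerate(values):
            plane |= ((v >> b) & 1) << k
        planes.append(plane)
    return planes


def emulated_dot(weights, activations, w_bits, a_bits):
    """Dot product of unsigned vectors using only AND, popcount, shift and add."""
    w_planes = bit_planes(weights, w_bits)
    a_planes = bit_planes(activations, a_bits)
    total = 0
    for i, wp in enumerate(w_planes):
        for j, ap in enumerate(a_planes):
            # Plane i of the weights and plane j of the activations carry weight 2^(i+j).
            total += bin(wp & ap).count("1") << (i + j)
    return total


# 1-bit weights and 2-bit activations, the precision pair mentioned in the abstract.
weights = [1, 0, 1, 1]
activations = [3, 2, 0, 1]
reference = sum(w * a for w, a in zip(weights, activations))
emulated = emulated_dot(weights, activations, w_bits=1, a_bits=2)
print(reference, emulated)
```

The cost is `w_bits * a_bits` one-bit dot products per output, so the approach pays off when both bit widths are small and the 1-bit primitive is much faster than the next supported precision.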
# Sentinel-1 and Sentinel-2 Spatio-Temporal Data Fusion for Clouds Removal ( http://arxiv.org/abs/2106.12226v1 ) License: see linked page	Alessandro Sebastianelli, Artur Nowakowski, Erika Puglisi, Maria Pia Del Rosso, Jamila Mifdal, Fiora Pirri, Pierre Philippe Mathieu and Silvia Liberata Ullo	The abundance of clouds, located both spatially and temporally, often makes remote sensing applications with optical images difficult or even impossible. In this manuscript, a novel method for restoring cloud-corrupted optical images is presented and developed, based on a joint data fusion paradigm in which three deep neural networks are combined in order to fuse spatio-temporal features extracted from Sentinel-1 and Sentinel-2 time series of data. It is worth highlighting that both the code and the dataset have been implemented from scratch and made available to interested researchers for further analysis and investigation.	Translated: 2021-06-24 15:27:22 Published: 2021-06-23
# ADAVI: Automatic Dual Amortized Variational Inference Applied To Pyramidal Bayesian Models ( http://arxiv.org/abs/2106.12248v1 ) License: see linked page	Louis Rouillard (PARIETAL, Inria, CEA), Demian Wassermann (PARIETAL, Inria, CEA)	Frequently, population studies feature pyramidally-organized data represented using Hierarchical Bayesian Models (HBM) enriched with plates. These models can become prohibitively large in settings such as neuroimaging, where a sample is composed of a functional MRI signal measured on 64 thousand brain locations, across 4 measurement sessions, and at least tens of subjects. Even a reduced example on a specific cortical region of 300 brain locations features around 1 million parameters, hampering the usage of modern density estimation techniques such as Simulation-Based Inference (SBI). To infer parameter posterior distributions in this challenging class of problems, we designed a novel methodology that automatically produces a variational family dual to a target HBM. This variational family, represented as a neural network, consists of an attention-based hierarchical encoder feeding summary statistics to a set of normalizing flows. Our automatically-derived neural network exploits exchangeability in the plate-enriched HBM and factorizes its parameter space. The resulting architecture reduces its parameterization by orders of magnitude with respect to that of a typical SBI representation, while maintaining expressivity. Our method performs inference on the specified HBM in an amortized setup: once trained, it can readily be applied to a new data sample to compute the parameters' full posterior. We demonstrate the capability of our method on simulated data, as well as a challenging high-dimensional brain parcellation experiment. We also open up several questions that lie at the intersection between SBI techniques and structured Variational Inference.	Translated: 2021-06-24 15:27:10 Published: 2021-06-23
# Unsupervised Speech Enhancement using Dynamical Variational Auto-Encoders ( http://arxiv.org/abs/2106.12271v1 ) License: see linked page	Xiaoyu Bie, Simon Leglaive, Xavier Alameda-Pineda, Laurent Girin	Dynamical variational auto-encoders (DVAEs) are a class of deep generative models with latent variables, dedicated to time series data modeling. DVAEs can be considered as extensions of the variational autoencoder (VAE) that include the modeling of temporal dependencies between successive observed and/or latent vectors in data sequences. Previous work has shown the interest of DVAEs and their better performance over the VAE for speech signal (spectrogram) modeling. Independently, the VAE has been successfully applied to speech enhancement in noise, in an unsupervised noise-agnostic set-up that does not require the use of a parallel dataset of clean and noisy speech samples for training, but only requires clean speech signals. In this paper, we extend those works to DVAE-based single-channel unsupervised speech enhancement, hence exploiting both unsupervised representation learning of speech signals and dynamics modeling. We propose an unsupervised speech enhancement algorithm based on the most general form of DVAEs, which we then adapt to three specific DVAE models to illustrate the versatility of the framework. More precisely, we combine DVAE-based speech priors with a noise model based on nonnegative matrix factorization, and we derive a variational expectation-maximization (VEM) algorithm to perform speech enhancement. Experimental results show that the proposed approach based on DVAEs outperforms its VAE counterpart and a supervised speech enhancement baseline.	Translated: 2021-06-24 15:26:46 Published: 2021-06-23
# High-Throughput Precision Phenotyping of Left Ventricular Hypertrophy with Cardiovascular Deep Learning ( http://arxiv.org/abs/2106.12511v1 ) License: see linked page	Grant Duffy, Paul P Cheng, Neal Yuan, Bryan He, Alan C. Kwan, Matthew J. Shun-Shin, Kevin M. Alexander, Joseph Ebinger, Matthew P. Lungren, Florian Rader, David H. Liang, Ingela Schnittger, Euan A. Ashley, James Y. Zou, Jignesh Patel, Ronald Witteles, Susan Cheng, David Ouyang	Left ventricular hypertrophy (LVH) results from chronic remodeling caused by a broad range of systemic and cardiovascular diseases including hypertension, aortic stenosis, hypertrophic cardiomyopathy, and cardiac amyloidosis. Early detection and characterization of LVH can significantly impact patient care but is limited by under-recognition of hypertrophy, measurement error and variability, and difficulty differentiating etiologies of LVH. To overcome this challenge, we present EchoNet-LVH, a deep learning workflow that automatically quantifies ventricular hypertrophy with precision equal to human experts and predicts etiology of LVH. Trained on 28,201 echocardiogram videos, our model accurately measures intraventricular wall thickness (mean absolute error [MAE] 1.4mm, 95% CI 1.2-1.5mm), left ventricular diameter (MAE 2.4mm, 95% CI 2.2-2.6mm), and posterior wall thickness (MAE 1.2mm, 95% CI 1.1-1.3mm) and classifies cardiac amyloidosis (area under the curve [AUC] 0.83) and hypertrophic cardiomyopathy (AUC 0.98) from other etiologies of LVH. In external datasets from independent domestic and international healthcare systems, EchoNet-LVH accurately quantified ventricular parameters (R2 of 0.96 and 0.90 respectively) and detected cardiac amyloidosis (AUC 0.79) and hypertrophic cardiomyopathy (AUC 0.89) on the domestic external validation site. Leveraging measurements across multiple heart beats, our model can more accurately identify subtle changes in LV geometry and its causal etiologies. Compared to human experts, EchoNet-LVH is fully automated, allowing for reproducible, precise measurements, and lays the foundation for precision diagnosis of cardiac hypertrophy. As a resource to promote further innovation, we also make publicly available a large dataset of 23,212 annotated echocardiogram videos.	Translated: 2021-06-24 15:26:06 Published: 2021-06-23
# Zero-Shot Joint Modeling of Multiple Spoken-Text-Style Conversion Tasks using Switching Tokens ( http://arxiv.org/abs/2106.12131v1 ) License: see linked page	Mana Ihori, Naoki Makishima, Tomohiro Tanaka, Akihiko Takashima, Shota Orihashi, Ryo Masumura	In this paper, we propose a novel spoken-text-style conversion method that can simultaneously execute multiple style conversion modules such as punctuation restoration and disfluency deletion without preparing matched datasets. In practice, transcriptions generated by automatic speech recognition systems are not highly readable because they often include many disfluencies and do not include punctuation marks. To improve their readability, multiple spoken-text-style conversion modules that individually model a single conversion task are cascaded because matched datasets that simultaneously handle multiple conversion tasks are often unavailable. However, the cascade is sensitive to the order of tasks because conversion errors propagate along the chain. In addition, the computation cost of the cascade is higher than that of a single conversion. To execute multiple conversion tasks simultaneously without preparing matched datasets, our key idea is to distinguish individual conversion tasks using an on-off switch. In our proposed zero-shot joint modeling, we switch the individual tasks using multiple switching tokens, enabling us to utilize a zero-shot learning approach to execute simultaneous conversions. Our experiments on joint modeling of disfluency deletion and punctuation restoration demonstrate the effectiveness of our method.	Translated: 2021-06-24 15:25:13 Published: 2021-06-23
# Reinforcement Learning-based Dialogue Guided Event Extraction to Exploit Argument Relations ( http://arxiv.org/abs/2106.12384v1 ) License: see linked page	Qian Li, Hao Peng, Jianxin Li, Yuanxing Ning, Lihong Wang, Philip S. Yu, Zheng Wang	Event extraction is a fundamental task for natural language processing. Finding the roles of event arguments like event participants is essential for event extraction. However, doing so for real-life event descriptions is challenging because an argument's role often varies in different contexts. While the relationship and interactions between multiple arguments are useful for settling the argument roles, such information is largely ignored by existing approaches. This paper presents a better approach for event extraction by explicitly utilizing the relationships of event arguments. We achieve this through a carefully designed task-oriented dialogue system. To model the argument relation, we employ reinforcement learning and incremental learning to extract multiple arguments via a multi-turn, iterative process. Our approach leverages knowledge of the already extracted arguments of the same sentence to determine the role of arguments that would be difficult to decide individually. It then uses the newly obtained information to improve the decisions of previously extracted arguments. This two-way feedback process allows us to exploit the argument relations to effectively settle argument roles, leading to better sentence understanding and event extraction. Experimental results show that our approach consistently outperforms seven state-of-the-art event extraction methods on event classification, argument role classification, and argument identification.	Translated: 2021-06-24 15:24:55 Published: 2021-06-23
# Mixtures of Deep Neural Experts for Automated Speech Scoring ( http://arxiv.org/abs/2106.12475v1 ) License: see linked page	Sara Papi, Edmondo Trentin, Roberto Gretter, Marco Matassoni, Daniele Falavigna	The paper addresses the task of automatically assessing second language proficiency from language learners' spoken responses to test prompts. The task has significant relevance to the field of computer assisted language learning. The approach presented in the paper relies on two separate modules: (1) an automatic speech recognition system that yields text transcripts of the spoken interactions involved, and (2) a multiple classifier system based on deep learners that ranks the transcripts into proficiency classes. Different deep neural network architectures (both feed-forward and recurrent) are specialized over diverse representations of the texts in terms of: a reference grammar, the outcome of probabilistic language models, several word embeddings, and two bag-of-words models. Combination of the individual classifiers is realized either via a probabilistic pseudo-joint model, or via a neural mixture of experts. Using the data of the third Spoken CALL Shared Task challenge, the highest values to date were obtained in terms of three popular evaluation metrics.	Translated: 2021-06-24 15:24:37 Published: 2021-06-23
# LegoFormer: Transformers for Block-by-Block Multi-view 3D Reconstruction ( http://arxiv.org/abs/2106.12102v1 ) License: see linked page	Farid Yagubbayli, Alessio Tonioni, Federico Tombari	Most modern deep learning-based multi-view 3D reconstruction techniques use RNNs or fusion modules to combine information from multiple images after encoding them. These two separate steps are only loosely connected and do not consider all available information while encoding each view. We propose LegoFormer, a transformer-based model that unifies object reconstruction under a single framework and parametrizes the reconstructed occupancy grid by its decomposition factors. This reformulation allows the prediction of an object as a set of independent structures that are then aggregated to obtain the final reconstruction. Experiments conducted on ShapeNet show that our network is competitive with state-of-the-art methods. We also demonstrate how the use of self-attention leads to increased interpretability of the model output.	Translated: 2021-06-24 15:22:50 Published: 2021-06-23
# 医用ボリュームとシーケンスのセグメンテーションのためのブートストラップ表現学習 Bootstrap Representation Learning for Segmentation on Medical Volumes and Sequences ( http://arxiv.org/abs/2106.12153v1 ) ライセンス: Link先を確認	Zejian Chen, Wei Zhuo, Tianfu Wang, Wufeng Xue and Dong Ni	(参考訳) そこで本研究では,アノテーションを限定した医用ボリュームとシーケンスセグメンテーションの簡易化手法を提案する。自己教師付き学習(ssl)の最近の成功は、ラベルなしデータの事前学習を動機付けている。その成功にもかかわらず、ローカルなセマンティックな差別やボリュームやシーケンス構造への稀な利用が欠如しているため、一般的なSSLメソッドをボリューム/シーケンスセグメンテーションに適応することは依然として困難である。スライス/フレーム間の連続性とボリューム/シーケンス間のオルガンの共通空間配置に基づいて,隣接するスライスの予測可能性を活用したブートストラップ自己監督表現学習手法を提案する。本手法の核心は,局所表現の予測に関する単純で分かりやすい自己管理と,グローバルコンテキストに基づく局所表現予測戦略であり,ボリューム間のグローバル表現マイニングと局所表現マイニングの両方に対して安定かつ信頼性の高い監督を可能にする。具体的には,注意誘導型予測器を備えた非対称ネットワークを提案し,ボリューム/シーケンス間のスライス間の距離特異的な予測と監視を行った。次に,新しいプロトタイプベースフォアグラウンド・バックグラウンドキャリブレーションモジュールを導入した。 2つの部分はラベル付きおよびラベルなしのデータに基づいて共同で訓練される。医療用ボリュームとシークエンスの3つのベンチマークデータセットで評価すると、adcdcでは4.5\%dsc、前立腺では1.7\%、camusでは2.3\%という大きなマージンで既存の手法を上回っている。集中評価は,本手法の有効性と優位性を明らかにする。 In this work, we propose a novel straightforward method for medical volume and sequence segmentation with limited annotations. To avert laborious annotating, the recent success of self-supervised learning(SSL) motivates the pre-training on unlabeled data. Despite its success, it is still challenging to adapt typical SSL methods to volume/sequence segmentation, due to their lack of mining on local semantic discrimination and rare exploitation on volume and sequence structures. Based on the continuity between slices/frames and the common spatial layout of organs across volumes/sequences, we introduced a novel bootstrap self-supervised representation learning method by leveraging the predictable possibility of neighboring slices. At the core of our method is a simple and straightforward dense self-supervision on the predictions of local representations and a strategy of predicting locals based on global context, which enables stable and reliable supervision for both global and local representation mining among volumes. 
Specifically, we first propose an asymmetric network with an attention-guided predictor to enforce distance-specific prediction and supervision on slices within and across volumes/sequences. Second, we introduce a novel prototype-based foreground-background calibration module to enhance representation consistency. The two parts are trained jointly on labeled and unlabeled data. When evaluated on three benchmark datasets of medical volumes and sequences, our model outperforms existing methods by a large margin: 4.5% DSC on ACDC, 1.7% on Prostate, and 2.3% on CAMUS. Intensive evaluations reveal the effectiveness and superiority of our method.	翻訳日:2021-06-24 15:22:39 公開日:2021-06-23
# 視覚に基づく豚の新規選好の行動認識 Vision-based Behavioral Recognition of Novelty Preference in Pigs ( http://arxiv.org/abs/2106.12181v1 ) ライセンス: Link先を確認	Aniket Shirke, Rebecca Golden, Mrinal Gautam, Angela Green-Miller, Matthew Caesar, Ryan N. Dilger	(参考訳) 研究データの行動スコアリングは、ドメイン固有のメトリクスを抽出するために重要であるが、人間の労働力を用いて膨大な量の情報を分析する能力にボトルネックがある。ディープラーニングは、このボトルネックを緩和するための重要な進歩と見なされている。我々は,手動スコアリングのプロセスを緩和するために,ディープラーニングを活用できる分野を1つ同定する。新規嗜好のパラダイムはブタの認知記憶の研究に広く用いられているが、これらのビデオの分析には人間の介入が必要である。ブタの行動とキーポイントを完全に注釈付けした 'Pig Novelty Preference Behavior' (PNPB) データセットの形式で,このようなビデオのサブセットを紹介する。本データセットにおける最先端の行動認識モデルの適用例を示すために,様々な分析指標に基づいてlrcn,c3d,tsmを比較し,モデルの落とし穴について考察する。豚の行動推定における平均精度は93%,平均精度は96%であった。コードと注釈付きデータセットをhttps://github.com/AIFARMS/NOR-behavior-recognitionでオープンソース化しました。 Behavioral scoring of research data is crucial for extracting domain-specific metrics but is bottlenecked on the ability to analyze enormous volumes of information using human labor. Deep learning is widely viewed as a key advancement to relieve this bottleneck. We identify one such domain, where deep learning can be leveraged to alleviate the process of manual scoring. Novelty preference paradigms have been widely used to study recognition memory in pigs, but analysis of these videos requires human intervention. We introduce a subset of such videos in the form of the 'Pig Novelty Preference Behavior' (PNPB) dataset that is fully annotated with pig actions and keypoints. In order to demonstrate the application of state-of-the-art action recognition models on this dataset, we compare LRCN, C3D, and TSM on the basis of various analytical metrics and discuss common pitfalls of the models. Our methods achieve an accuracy of 93% and a mean Average Precision of 96% in estimating piglet behavior. We open-source our code and annotated dataset at https://github.com/AIFARMS/NOR-behavior-recognition	翻訳日:2021-06-24 15:22:12 公開日:2021-06-23
# 識別指向マップを用いたリアルタイムインスタンス分割 Real-time Instance Segmentation with Discriminative Orientation Maps ( http://arxiv.org/abs/2106.12204v1 ) ライセンス: Link先を確認	Wentao Du, Zhiyu Xiang, Shuya Chen, Chengyu Qiao, Yiman Chen and Tingming Bai	(参考訳) インスタンスのセグメンテーションは近年かなり進歩していますが、リアルタイムパフォーマンスで高精度なアルゴリズムを設計することは依然として課題です。本稿では,OrienMaskと呼ばれるリアルタイムインスタンスセグメンテーションフレームワークを提案する。一段物検出器YOLOv3では、マスクヘッドが追加され、前景および背景画素の両方の空間オフセットベクトルとして明示的に定義される識別向きマップが予測される。方位マップの識別能力のおかげで、余分なフォアグラウンドセグメンテーションを必要とせずにマスクを復元できる。同じアンカーサイズにマッチするすべてのインスタンスは、共通の向きマップを共有する。この特別な共有戦略は、マスクの粒度が失われることなく、マスク予測のメモリ使用量を削減する。 NMS後に残るボックス予測を考慮すれば、インスタンスマスクは複雑さの低い対応する向きマップから同時に構築することができる。マスク表現の簡潔な設計とアンカーベースオブジェクト検出器との効果的な統合により,本手法は競争精度を維持しつつ,リアルタイム条件下での精度が確保できる。 COCOベンチマークの実験では、OrienMaskは1つのRTX 2080 Tiで評価された42.7 fpsの速度で34.8マスクAPを達成した。コードはhttps://github.com/duwt/orienmaskで入手できる。 Although instance segmentation has made considerable advancement over recent years, it's still a challenge to design high accuracy algorithms with real-time performance. In this paper, we propose a real-time instance segmentation framework termed OrienMask. Upon the one-stage object detector YOLOv3, a mask head is added to predict some discriminative orientation maps, which are explicitly defined as spatial offset vectors for both foreground and background pixels. Thanks to the discrimination ability of orientation maps, masks can be recovered without the need for extra foreground segmentation. All instances that match with the same anchor size share a common orientation map. This special sharing strategy reduces the amortized memory utilization for mask predictions but without loss of mask granularity. Given the surviving box predictions after NMS, instance masks can be concurrently constructed from the corresponding orientation maps with low complexity. 
Owing to the concise design for mask representation and its effective integration with the anchor-based object detector, our method runs in real time while maintaining competitive accuracy. Experiments on the COCO benchmark show that OrienMask achieves 34.8 mask AP at a speed of 42.7 fps, evaluated on a single RTX 2080 Ti. The code is available at https://github.com/duwt/OrienMask.	翻訳日:2021-06-24 15:21:56 公開日:2021-06-23
# 相互情報に基づくFew-Shot分類 Mutual-Information Based Few-Shot Classification ( http://arxiv.org/abs/2106.12252v1 ) ライセンス: Link先を確認	Malik Boudiaf, Ziko Imtiaz Masud, Jérôme Rony, Jose Dolz, Ismail Ben Ayed, Pablo Piantanida	(参考訳) 数ショット学習のためのTIM(Transductive Information Maximization)を提案する。提案手法は,クエリ特徴とラベル予測との相互情報を最大化し,その支援セットに基づく監督損失を付与する。我々は, 分類精度と相互情報最大化の形式的関係を導出することにより, トランスダクティブ損失の動機付けを行う。さらに、勾配に基づく最適化よりもトランスダクティブ推論を大幅に高速化し、競争精度を向上する新しい交互方向解法を提案する。また、Zangwillの理論と有界最適化論に基づく解の収束解析も提供する。 TIM推論はモジュラーであり、任意のベーストレーニング機能抽出器上で使用することができる。 TIMは様々なデータセットやネットワークにまたがる最先端の手法よりも優れており、複雑なメタ学習手法を使わずに、ベースクラス上で単純なクロスエントロピーで訓練された固定された特徴抽出器上で使用されている。最近発表されたMETA-DATASETのようなランダムなタスク、ドメインシフト、より大きなクラス数を含む、より困難なシナリオでも、優れたパフォーマンスのメソッドよりも、一貫して2%から5%の精度の向上を実現している。私たちのコードはhttps://github.com/mboudiaf/TIMで公開されています。また、META-DATASETのスタンドアロンPyTorch実装とベンチマーク結果もhttps://github.com/mboudiaf/pytorch-meta-datasetで公開しています。 We introduce Transductive Information Maximization (TIM) for few-shot learning. Our method maximizes the mutual information between the query features and their label predictions for a given few-shot task, in conjunction with a supervision loss based on the support set. We motivate our transductive loss by deriving a formal relation between the classification accuracy and mutual-information maximization. Furthermore, we propose a new alternating-direction solver, which substantially speeds up transductive inference over gradient-based optimization, while yielding competitive accuracy. We also provide a convergence analysis of our solver based on Zangwill's theory and bound-optimization arguments. TIM inference is modular: it can be used on top of any base-training feature extractor.
Following standard transductive few-shot settings, our comprehensive experiments demonstrate that TIM significantly outperforms state-of-the-art methods across various datasets and networks, while being used on top of a fixed feature extractor trained with simple cross-entropy on the base classes, without resorting to complex meta-learning schemes. It consistently brings between 2% and 5% improvement in accuracy over the best-performing method, not only on all the well-established few-shot benchmarks but also on more challenging scenarios, with random tasks, domain shift and larger numbers of classes, as in the recently introduced META-DATASET. Our code is publicly available at https://github.com/mboudiaf/TIM. We also publicly release a standalone PyTorch implementation of META-DATASET, along with additional benchmarking results, at https://github.com/mboudiaf/pytorch-meta-dataset.	翻訳日:2021-06-24 15:21:38 公開日:2021-06-23
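The TIM abstract above describes maximizing the mutual information between query features and their label predictions, combined with a supervision loss on the support set. The following is a minimal sketch of that idea, not the authors' implementation: it uses the standard empirical estimate (entropy of the average prediction minus the average per-sample entropy). The function names and the `weight` parameter are hypothetical, and the abstract does not state how the terms are weighted.

```python
import math

def softmax(logits):
    # Numerically stable softmax over one row of logits.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(p, eps=1e-12):
    # Shannon entropy (in nats) of a discrete distribution.
    return -sum(pi * math.log(pi + eps) for pi in p)

def mutual_information(query_logits):
    # Empirical estimate: H(mean prediction) - mean H(per-sample prediction).
    probs = [softmax(row) for row in query_logits]
    n, k = len(probs), len(probs[0])
    marginal = [sum(p[c] for p in probs) / n for c in range(k)]
    conditional = sum(entropy(p) for p in probs) / n
    return entropy(marginal) - conditional

def support_cross_entropy(support_logits, support_labels, eps=1e-12):
    # Supervision loss on the labeled support set.
    losses = [-math.log(softmax(row)[y] + eps)
              for row, y in zip(support_logits, support_labels)]
    return sum(losses) / len(losses)

def tim_style_loss(support_logits, support_labels, query_logits, weight=1.0):
    # Hypothetical combination: minimize cross-entropy, maximize mutual information.
    return (support_cross_entropy(support_logits, support_labels)
            - weight * mutual_information(query_logits))
```

Under this estimate, the mutual information is high when each query prediction is confident and the predictions are spread across classes, and it drops to zero when all queries collapse onto one class or all predictions are uniform.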
# まばらなランドマークの集合体からの深部無監督3次元人体再構築 Deep unsupervised 3D human body reconstruction from a sparse set of landmarks ( http://arxiv.org/abs/2106.12282v1 ) ライセンス: Link先を確認	Meysam Madadi and Hugo Bertiche and Sergio Escalera	(参考訳) 本稿では,DeepMurfと呼ばれる,まばらなランドマークの集合から人体表面を推定する,人体再構成における初の深層非教師的アプローチを提案する。欠落したランドマークを推定するためにデノイジングオートエンコーダを適用する。次に,ランドマークからの身体関節の推定に注意モデルを適用する。最後に、身体を再構築する統計的生成モデルのパラメータを回帰するためにカスケードネットワークを適用する。提案した損失関数セットは、教師なしの方法でネットワークをトレーニングすることができる。 4つの公開データセットの結果から,実世界のモキャップデータから人体を正確に再構築できることを示した。 In this paper, we propose DeepMurf, the first deep unsupervised approach to human body reconstruction that estimates the body surface from a sparse set of landmarks. We apply a denoising autoencoder to estimate missing landmarks. We then apply an attention model to estimate body joints from the landmarks. Finally, a cascading network regresses the parameters of a statistical generative model that reconstructs the body. Our proposed set of loss functions allows us to train the network in an unsupervised way. Results on four public datasets show that our approach accurately reconstructs the human body from real-world mocap data.	翻訳日:2021-06-24 15:21:09 公開日:2021-06-23
# Open Images V5 テキストアノテーションと Yet Another Mask Text Spotter Open Images V5 Text Annotation and Yet Another Mask Text Spotter ( http://arxiv.org/abs/2106.12326v1 ) ライセンス: Link先を確認	Ilya Krylov, Sergei Nosov, Vladislav Sovrasov	(参考訳) 大規模な人間ラベルデータセットは、高品質なディープラーニングモデルを作成する上で重要な役割を果たす。本稿では,Open Images V5データセットのテキストアノテーションについて述べる。私たちの知る限り、公開されている手作業で作成したテキストアノテーションの中では最大である。このアノテーションを用いて、Yet Another Mask Text Spotter (YAMTS) と呼ぶシンプルなMask-RCNNベースのネットワークをトレーニングした。このネットワークは、ICDAR2013、ICDAR2015、Total-Textデータセットにおいて競争力のある性能を達成し、場合によっては現在の最先端のアプローチを上回る。テキストスポッティングモデルのコードは https://github.com/openvinotoolkit/training_extensions で公開されている。モデルはOpenVINO形式にエクスポートでき、Intel CPUで動作する。 A large-scale human-labeled dataset plays an important role in creating high-quality deep learning models. In this paper, we present text annotation for the Open Images V5 dataset. To our knowledge, it is the largest among publicly available manually created text annotations. With this annotation, we trained a simple Mask-RCNN-based network, referred to as Yet Another Mask Text Spotter (YAMTS), which achieves competitive performance or, in some cases, even outperforms current state-of-the-art approaches on the ICDAR2013, ICDAR2015 and Total-Text datasets. Code for the text spotting model is available online at: https://github.com/openvinotoolkit/training_extensions. The model can be exported to OpenVINO format and run on Intel CPUs.	翻訳日:2021-06-24 15:20:58 公開日:2021-06-23
# Vision Permutator: 視覚認識のための可変MLP様アーキテクチャ Vision Permutator: A Permutable MLP-Like Architecture for Visual Recognition ( http://arxiv.org/abs/2106.12368v1 ) ライセンス: Link先を確認	Qibin Hou, Zihang Jiang, Li Yuan, Ming-Ming Cheng, Shuicheng Yan, Jiashi Feng	(参考訳) 本稿では,視覚認識のための概念的にシンプルでデータ効率のよいMLP型アーキテクチャであるVision Permutatorを提案する。平面化された空間次元に沿って空間情報を符号化する最近のMLPのようなモデルとは異なり、2次元特徴表現が持つ位置情報の重要性を実現することにより、視覚パーミュータは、高さと幅の表現を線形投影で別々に符号化する。これにより、Vision Permutatorは1つの空間方向に沿った長距離依存関係をキャプチャし、他方の方向に沿った正確な位置情報を保存できる。結果として得られる位置感性出力は相互補完的な方法で集約され、興味のある対象の表現表現を形成する。私たちのVision Permutatorは、畳み込みニューラルネットワーク(CNN)とビジョントランスフォーマーとの激しい競合であることを示す。空間畳み込みやアテンション機構に依存せずに、Vision Permutatorは同じモデルサイズ制約の下でほとんどのCNNや視覚変換器よりもはるかに優れた25Mの学習可能なパラメータを使用して、大規模なトレーニングデータ(例えばImageNet-22k)を使わずに、ImageNet上で81.5%のトップ-1精度を達成する。 88Mまでスケールアップすると、83.2%のトップ1の精度に達する。本研究は,空間情報のエンコーディング方法の再考と,MLPのようなモデルの開発を促進することを目的としている。コードはhttps://github.com/Andrew-Qibin/VisionPermutator.comで入手できる。 In this paper, we present Vision Permutator, a conceptually simple and data efficient MLP-like architecture for visual recognition. By realizing the importance of the positional information carried by 2D feature representations, unlike recent MLP-like models that encode the spatial information along the flattened spatial dimensions, Vision Permutator separately encodes the feature representations along the height and width dimensions with linear projections. This allows Vision Permutator to capture long-range dependencies along one spatial direction and meanwhile preserve precise positional information along the other direction. The resulting position-sensitive outputs are then aggregated in a mutually complementing manner to form expressive representations of the objects of interest. We show that our Vision Permutators are formidable competitors to convolutional neural networks (CNNs) and vision transformers. 
Without the dependence on spatial convolutions or attention mechanisms, Vision Permutator achieves 81.5% top-1 accuracy on ImageNet without extra large-scale training data (e.g., ImageNet-22k) using only 25M learnable parameters, which is much better than most CNNs and vision transformers under the same model size constraint. When scaling up to 88M, it attains 83.2% top-1 accuracy. We hope this work could encourage research on rethinking the way of encoding spatial information and facilitate the development of MLP-like models. Code is available at https://github.com/Andrew-Qibin/VisionPermutator.	翻訳日:2021-06-24 15:20:47 公開日:2021-06-23
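The Vision Permutator abstract above says the model encodes feature representations separately along the height and width dimensions with linear projections and then aggregates the results. The sketch below illustrates only that idea on a small H x W x C tensor stored as nested lists. It is a simplified assumption, not the published architecture: the function names are hypothetical, the two branches are simply summed, and any details not stated in the abstract are omitted.

```python
def mix_axis(x, weight, axis):
    """Apply a linear projection along height (axis=0) or width (axis=1).

    x is an H x W x C nested list; weight is a square matrix whose size
    matches the length of the chosen axis.
    """
    h, w, c = len(x), len(x[0]), len(x[0][0])
    out = [[[0.0] * c for _ in range(w)] for _ in range(h)]
    for i in range(h):
        for j in range(w):
            for k in range(c):
                if axis == 0:
                    # Mix information across rows; the column j is untouched.
                    out[i][j][k] = sum(weight[i][s] * x[s][j][k] for s in range(h))
                else:
                    # Mix information across columns; the row i is untouched.
                    out[i][j][k] = sum(weight[j][s] * x[i][s][k] for s in range(w))
    return out

def permute_block(x, w_height, w_width):
    # Hypothetical aggregation: sum of the height and width branches.
    a = mix_axis(x, w_height, 0)
    b = mix_axis(x, w_width, 1)
    return [[[a[i][j][k] + b[i][j][k] for k in range(len(x[0][0]))]
             for j in range(len(x[0]))]
            for i in range(len(x))]
```

Each branch mixes along one spatial direction while leaving positions along the other direction fixed, which matches the abstract's point about capturing long-range dependencies in one direction and preserving positional information in the other.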
# Transformer Meets Convolution: Very Fine Resolution Urban Scene Imageのセマンティックセグメンテーションのためのバイラテラルアウェアネスネットワーク Transformer Meets Convolution: A Bilateral Awareness Network for Semantic Segmentation of Very Fine Resolution Urban Scene Images ( http://arxiv.org/abs/2106.12413v1 ) ライセンス: Link先を確認	Libo Wang, Rui Li, Dongzhi Wang, Chenxi Duan, Teng Wang, Xiaoliang Meng	(参考訳) 超微細解像度(VFR)の都市景観画像のセマンティックセグメンテーションは、自動運転、土地被覆分類、都市計画など、いくつかのアプリケーションシナリオにおいて重要な役割を果たす。しかし、VFR画像に含まれる膨大な詳細は、既存のディープラーニングアプローチの可能性を著しく制限している。さらに、スケールや物体の出現のかなりの変化は、これらのセマンティックセグメンテーション法の表現能力をさらに悪化させ、隣接する物体の混乱につながった。このような課題に対処することは、シーンレベルの景観パターン分析と意思決定の道を開くリモートセンシングコミュニティにおける有望な研究分野である。本稿では,VFR画像の長距離関係と細粒度をフルに捉えるために,依存経路とテクスチャパスを含む両側認知ネットワーク(BANet)を提案する。特に、依存関係パスはメモリ効率の良いマルチヘッド自己アテンションを備えた新しいトランスフォーマーバックボーンであるResTに基づいて実行され、テクスチャパスはスタック化された畳み込み演算上に構築される。さらに、線形アテンション機構を使用することで、依存性機能とテクスチャ機能を効果的に融合する機能アグリゲーションモジュール(FAM)が設計されている。大規模都市景観画像セグメンテーションデータセット(ISPRS Vaihingen データセット,ISPRS Potsdam データセット,UAVid データセット)で実施された大規模な実験により,BANet の有効性が示された。具体的には、UAVidデータセット上で64.6%のmIoUが達成される。 Semantic segmentation of very fine resolution (VFR) urban scene images plays a significant role in several application scenarios, including autonomous driving, land cover classification, and urban planning. However, the tremendous detail contained in VFR images severely limits the potential of existing deep learning approaches. More seriously, the considerable variations in the scale and appearance of objects further degrade the representational capacity of those semantic segmentation methods, leading to the confusion of adjacent objects. Addressing such issues represents a promising research field in the remote sensing community, which paves the way for scene-level landscape pattern analysis and decision making.
In this manuscript, we propose a bilateral awareness network (BANet) which contains a dependency path and a texture path to fully capture the long-range relationships and fine-grained details in VFR images. Specifically, the dependency path is built on ResT, a novel Transformer backbone with memory-efficient multi-head self-attention, while the texture path is built on stacked convolution operations. In addition, using the linear attention mechanism, a feature aggregation module (FAM) is designed to effectively fuse the dependency features and texture features. Extensive experiments conducted on three large-scale urban scene image segmentation datasets, i.e., the ISPRS Vaihingen dataset, the ISPRS Potsdam dataset, and the UAVid dataset, demonstrate the effectiveness of our BANet. Specifically, a 64.6% mIoU is achieved on the UAVid dataset.	翻訳日:2021-06-24 15:20:21 公開日:2021-06-23
# FusionPainting:3Dオブジェクト検出のための適応注意型マルチモーダルフュージョン FusionPainting: Multimodal Fusion with Adaptive Attention for 3D Object Detection ( http://arxiv.org/abs/2106.12449v1 ) ライセンス: Link先を確認	Shaoqing Xu, Dingfu Zhou, Jin Fang, Junbo Yin, Zhou Bin and Liangjun Zhang	(参考訳) 3dの障害物の正確な検出は、自動運転とインテリジェントな輸送に欠かせない課題である。本研究では,2次元RGB画像と3次元点雲を意味レベルで融合させて3次元物体検出タスクを増強する汎用多モード融合フレームワークFusionPaintingを提案する。特にFusionPaintingフレームワークは、マルチモーダルセマンティックセグメンテーションモジュール、アダプティブアテンションベースのセマンティックフュージョンモジュール、および3Dオブジェクト検出器の3つの主要モジュールで構成されている。まず、2次元および3次元セグメンテーションアプローチに基づく2次元画像および3次元lidar点雲について意味情報を得る。そして、提案する注意に基づくセマンティクス融合モジュールに基づいて、異なるセンサからのセグメンテーション結果を適応的に融合する。最後に、融合セマンティックラベルで塗られた点雲を3D検出器に送信し、3D検出結果を得る。提案手法の有効性を3つの異なるベースラインと比較し,大規模なnuScenes検出ベンチマークで検証した。実験の結果,点群のみを用いた手法や,2次元セグメンテーション情報のみで塗られた点群を用いた手法と比べて,提案する融合戦略は検出性能を大幅に向上させることを示した。さらに、提案手法は、nuScenesテストベンチマークにおいて、他の最先端メソッドよりも優れている。 Accurate detection of obstacles in 3D is an essential task for autonomous driving and intelligent transportation. In this work, we propose a general multimodal fusion framework, FusionPainting, to fuse 2D RGB images and 3D point clouds at a semantic level to boost the 3D object detection task. Specifically, the FusionPainting framework consists of three main modules: a multi-modal semantic segmentation module, an adaptive attention-based semantic fusion module, and a 3D object detector. First, semantic information is obtained for 2D images and 3D LiDAR point clouds based on 2D and 3D segmentation approaches. Then the segmentation results from different sensors are adaptively fused based on the proposed attention-based semantic fusion module. Finally, the point clouds painted with the fused semantic labels are sent to the 3D detector to obtain the 3D detection results. The effectiveness of the proposed framework has been verified on the large-scale nuScenes detection benchmark by comparing it with three different baselines.
The experimental results show that the fusion strategy can significantly improve the detection performance compared to methods using only point clouds and to methods using point clouds painted only with 2D segmentation information. Furthermore, the proposed approach outperforms other state-of-the-art methods on the nuScenes testing benchmark.	翻訳日:2021-06-24 15:19:51 公開日:2021-06-23
# 視覚感情分布学習のための円構造表現 A Circular-Structured Representation for Visual Emotion Distribution Learning ( http://arxiv.org/abs/2106.12450v1 ) ライセンス: Link先を確認	Jingyuan Yang, Ji Lie, Leida Li, Xiumei Wang, and Xinbo Gao	(参考訳) 視覚感情分析(vea)は,近年,ソーシャルネットワーク上で画像共有が普及するにつれて注目を浴びている。人間の感情は曖昧で主観的であるため、単一ラベル分類タスクよりもラベル分散学習(LDL)パラダイムでVEAに取り組む方が妥当である。他のLCLタスクと異なり、心理学理論で示されるように、感情とその内固有の特徴の間に固有の関係が存在する。そこで本研究では,視覚的感情分布学習に先立つ知識を活かした,身近な円形構造表現を提案する。具体的には、まず感情圏を構築し、その中の感情状態を統一する。提案した感情圏では、感情分布は感情ベクトルで表され、3つの属性(感情の極性、感情のタイプ、感情の強さ)と2つの特性(類似性、付加性)で定義される。さらに,予測された感情ベクトルとラベル付き感情ベクトルとの相違を粗い方法でペナルティ化する新たなプログレッシブ・サークル(PC)の損失を設計し,さらに感情特異的な学習プロセスを促進させる。公開視覚感情分布データセット上での広範な実験と比較を行い,提案手法が最先端手法よりも優れていることを示す。 Visual Emotion Analysis (VEA) has attracted increasing attention recently with the prevalence of sharing images on social networks. Since human emotions are ambiguous and subjective, it is more reasonable to address VEA in a label distribution learning (LDL) paradigm rather than a single-label classification task. Different from other LDL tasks, there exist intrinsic relationships between emotions and unique characteristics within them, as demonstrated in psychological theories. Inspired by this, we propose a well-grounded circular-structured representation to utilize the prior knowledge for visual emotion distribution learning. To be specific, we first construct an Emotion Circle to unify any emotional state within it. On the proposed Emotion Circle, each emotion distribution is represented with an emotion vector, which is defined with three attributes (i.e., emotion polarity, emotion type, emotion intensity) as well as two properties (i.e., similarity, additivity). Besides, we design a novel Progressive Circular (PC) loss to penalize the dissimilarities between predicted emotion vector and labeled one in a coarse-to-fine manner, which further boosts the learning process in an emotion-specific way. 
Extensive experiments and comparisons are conducted on public visual emotion distribution datasets, and the results demonstrate that the proposed method outperforms the state-of-the-art methods.	翻訳日:2021-06-24 15:19:33 公開日:2021-06-23
# characterchat:チャットボットによる会話とプログレッシブな表現による架空のキャラクターの創造を支援する CharacterChat: Supporting the Creation of Fictional Characters through Conversation and Progressive Manifestation with a Chatbot ( http://arxiv.org/abs/2106.12314v1 ) ライセンス: Link先を確認	Oliver Schmitt, Daniel Buschek	(参考訳) CharacterChatは、作家が架空のキャラクターを作るのを支援するコンセプトとチャットボットです。具体的には、作家は会話を通じてボットを想像上のキャラクターに変える。著者による文字作成に関する調査(n=30)から,2つの質的ユーザ調査(n=7,n=8)まで,ユーザ中心のアプローチで文字チャットを反復的に開発した。プロトタイプには2つのモードが組み合わさっている。(1) 文字属性の定義を支援するガイドプロンプト。ユーザー:「あなたはジェーンです。属性(例えば、属性)の提案を含む。 Bot: “私の主な動機は何ですか? 概念ネットワークを備えたルールベースのシステムとして実現された。 2) チャットボットとのオープンな会話は, 文字属性を考慮に入れた言語モデルを用いて, 文字の探索とインスピレーション獲得を支援する。ユーザスタディでは,文字生成の初期段階におけるメリットと,会話能力の制限による課題を明らかにする。学んだ教訓と将来の仕事のアイデアで締めくくります。 We present CharacterChat, a concept and chatbot to support writers in creating fictional characters. Concretely, writers progressively turn the bot into their imagined character through conversation. We iteratively developed CharacterChat in a user-centred approach, starting with a survey on character creation with writers (N=30), followed by two qualitative user studies (N=7 and N=8). Our prototype combines two modes: (1) Guided prompts help writers define character attributes (e.g. User: "Your name is Jane."), including suggestions for attributes (e.g. Bot: "What is my main motivation?") and values, realised as a rule-based system with a concept network. (2) Open conversation with the chatbot helps writers explore their character and get inspiration, realised with a language model that takes into account the defined character attributes. Our user studies reveal benefits particularly for early stages of character creation, and challenges due to limited conversational capabilities. We conclude with lessons learned and ideas for future work.	翻訳日:2021-06-24 15:19:10 公開日:2021-06-23
# 模倣学習 : 進歩・分類学・機会 Imitation Learning: Progress, Taxonomies and Opportunities ( http://arxiv.org/abs/2106.12177v1 ) ライセンス: Link先を確認	Boyuan Zheng, Sunny Verma, Jianlong Zhou, Ivor Tsang, Fang Chen	(参考訳) 模倣学習(imitation learning)は、人間の専門家や人工的に作られたエージェントのデモンストレーションから知識を抽出し、その振る舞いを再現することを目的としている。その成功は、ビデオゲーム、自律運転、ロボットシミュレーション、オブジェクト操作などの分野で実証されている。しかし、この複製プロセスは、性能がデモ品質に大きく依存するなど問題があり、ほとんどの訓練されたエージェントはタスク固有の環境でうまく機能するように制限されている。本研究では,模倣学習に関する体系的考察を行う。まず, 開発史と予備知識から得られた背景知識を紹介し, 模倣学習と分野の重要なマイルストーンの中で異なる分類法を提示する。次に,学習戦略の課題を詳述し,準最適なデモンストレーションや音声指示,その他の関連する最適化手法から学習方針を学ぶための研究機会を提案する。 Imitation learning aims to extract knowledge from demonstrations by human experts or artificially created agents in order to replicate their behaviors. Its success has been demonstrated in areas such as video games, autonomous driving, robotic simulations and object manipulation. However, this replication process can be problematic: performance is highly dependent on demonstration quality, and most trained agents perform well only in task-specific environments. In this survey, we provide a systematic review of imitation learning. We first introduce background knowledge covering the field's development history and preliminaries, then present different taxonomies within imitation learning and key milestones of the field. We then detail challenges in learning strategies and present research opportunities in learning policies from suboptimal demonstrations, voice instructions and other associated optimization schemes.	翻訳日:2021-06-24 15:18:22 公開日:2021-06-23
# マルチバンドVAE:連続学習における知識統合のための潜在空間分割 Multiband VAE: Latent Space Partitioning for Knowledge Consolidation in Continual Learning ( http://arxiv.org/abs/2106.12196v1 ) ライセンス: Link先を確認	Kamil Deja, Paweł Wawrzyński, Daniel Marczak, Wojciech Masarczyk, Tomasz Trzciński	(参考訳) 本稿では,変分オートエンコーダの潜伏空間の分割に依存する生成モデルにおける教師なし連続的知識統合手法を提案する。従来を忘れずに新しいデータサンプルに関する知識を取得することは、継続的な学習の重要な問題である。現在提案されている手法は、既存のモデルを拡張しながら、過去のデータで劣化しないように振舞いを制約することで、この目標を達成している。本研究では,この限界を特定し,知識蓄積タスクとして継続学習の目標を実証する。我々は、異なるタスクで見られるサンプルの表現であるバンドを、それらが含む情報の類似性によって駆動する、遅延空間分割を継続的に調整することで解決する。さらに,遅延帯域に符号化された再構成の質を向上する過去のデータの制御をシンプルかつ効果的に行う方法と,知識統合を改善する潜時空間のゆがみ技術を導入する。標準の連続学習評価ベンチマークに基づいて,本手法を新たな知識統合シナリオで評価し,提案手法がすべてのテストシナリオで最大2倍の性能を発揮することを示す。 We propose a new method for unsupervised continual knowledge consolidation in generative models that relies on the partitioning of Variational Autoencoder's latent space. Acquiring knowledge about new data samples without forgetting previous ones is a critical problem of continual learning. Currently proposed methods achieve this goal by extending the existing model while constraining its behavior not to degrade on the past data, which does not exploit the full potential of relations within the entire training dataset. In this work, we identify this limitation and posit the goal of continual learning as a knowledge accumulation task. We solve it by continuously re-aligning latent space partitions that we call bands which are representations of samples seen in different tasks, driven by the similarity of the information they contain. In addition, we introduce a simple yet effective method for controlled forgetting of past data that improves the quality of reconstructions encoded in latent bands and a latent space disentanglement technique that improves knowledge consolidation.
In addition to the standard continual learning evaluation benchmarks, we evaluate our method on a new knowledge consolidation scenario and show that the proposed approach outperforms the state of the art by up to twofold across all testing scenarios.	翻訳日:2021-06-24 15:18:09 公開日:2021-06-23
# 遅延フィードバックによる学習: 勾配遅延に暗黙的に適応する Learning Under Delayed Feedback: Implicitly Adapting to Gradient Delays ( http://arxiv.org/abs/2106.12261v1 ) ライセンス: Link先を確認	Rotem Zamir Aviv (1), Ido Hakimi (2), Assaf Schuster (2), Kfir Y. Levy (1 and 3) ((1) Department of Electrical and Computer Engineering, Technion, (2) Department of Computer Science, Technion, (3) A Viterbi Fellow)	(参考訳) 複数のマシンが共通のメモリを共有しながら非同期かつ並列に動作する確率的凸最適化問題を考える。本稿では,更新遅延,客観的な滑らかさ,勾配分散の事前知識に依存しない,制約付き設定のロバストなトレーニング手法を提案し,非漸近収束保証を導出する。逆に、この設定のための既存のメソッドは、この事前の知識に依存しており、クラウドやデータセンターなど、本質的にすべての共有リソース計算環境に不向きである。具体的には,従来の手法ではマシンの動的割り当てによる遅延の変化に適応できないが,本手法はそのような変化に暗黙的に適応する。 We consider stochastic convex optimization problems in which several machines act asynchronously in parallel while sharing a common memory. We propose a robust training method for the constrained setting and derive non-asymptotic convergence guarantees that do not depend on prior knowledge of update delays, objective smoothness, and gradient variance. In contrast, existing methods for this setting crucially rely on this prior knowledge, which renders them unsuitable for essentially all shared-resource computational environments, such as clouds and data centers. Concretely, existing approaches are unable to accommodate changes in the delays that result from dynamic allocation of the machines, while our method implicitly adapts to such changes.	翻訳日:2021-06-24 15:17:49 公開日:2021-06-23
# マルウェア行動の説明可能な表現の学習 Learning Explainable Representations of Malware Behavior ( http://arxiv.org/abs/2106.12328v1 ) ライセンス: Link先を確認	Paul Prasse, Jan Brabec, Jan Kohout, Martin Kopp, Lukas Bajer, Tobias Scheffer	(参考訳) 我々は,ネットワークテレメトリログにおけるマルウェア識別の問題に対処し,脅威を識別する行動パターンの理解可能な説明である indicators of compromise を提供する。本システムでは,専用検出器群がネットワークフローデータを第1ステップで理解可能な network events に抽象化する。我々は、この一連のイベントを処理し、特定の脅威、マルウェアファミリー、および広範囲のマルウェアを識別するニューラルネットワークを開発した。次に、integrated-gradients メソッドを使用して、脅威の特徴的行動パターンを共同で構成するイベントをハイライトする。 CNN,LSTM,トランスフォーマーに基づくネットワークアーキテクチャを比較し,大規模テレメトリデータを用いた教師なし事前学習の有効性について検討する。本システムは,行動パターンに基づいて,njRATや他のマルウェアを検出する方法を示す。 We address the problems of identifying malware in network telemetry logs and providing indicators of compromise, that is, comprehensible explanations of behavioral patterns that identify the threat. In our system, an array of specialized detectors abstracts network-flow data into comprehensible network events in a first step. We develop a neural network that processes this sequence of events and identifies specific threats, malware families and broad categories of malware. We then use the integrated-gradients method to highlight events that jointly constitute the characteristic behavioral pattern of the threat. We compare network architectures based on CNNs, LSTMs, and transformers, and explore the efficacy of unsupervised pre-training experimentally on large-scale telemetry data. We demonstrate how this system detects njRAT and other malware based on behavioral patterns.	翻訳日:2021-06-24 15:17:37 公開日:2021-06-23
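The malware abstract above uses the integrated-gradients method to highlight which events drive a detection. As a rough illustration of that attribution technique only, the sketch below applies it to a tiny logistic model over hypothetical binary event indicators. The model, the weights, and the feature encoding are assumptions made for the example; the paper's detector is a neural network over event sequences, not this toy model.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def model(x, weights, bias):
    # Toy detector: logistic score over event indicator features.
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def model_gradient(x, weights, bias):
    # Analytic gradient of the logistic score with respect to the input.
    s = model(x, weights, bias)
    return [s * (1.0 - s) * w for w in weights]

def integrated_gradients(x, baseline, weights, bias, steps=200):
    # Average the gradient along the straight path from baseline to x
    # (midpoint Riemann sum), then scale by the input difference.
    total = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        grad = model_gradient(point, weights, bias)
        total = [t + g for t, g in zip(total, grad)]
    return [(xi - b) * t / steps for xi, b, t in zip(x, baseline, total)]
```

A useful sanity check is the completeness property: the attributions should sum to approximately the difference between the model output at the input and at the baseline, and events absent from the input receive zero attribution.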
# 正準相関解析から自己教師付きグラフニューラルネットワークへ From Canonical Correlation Analysis to Self-supervised Graph Neural Networks ( http://arxiv.org/abs/2106.12484v1 ) ライセンス: Link先を確認	Hengrui Zhang, Qitian Wu, Junchi Yan, David Wipf, Philip S. Yu	(参考訳) 本稿では,グラフデータを用いた自己教師付き表現学習のための概念的単純かつ効果的なモデルを提案する。データ拡張を通じて入力グラフの2つのビューを生成する以前の方法に従う。しかし、インスタンスレベルの識別に焦点を当てた対照的な手法とは異なり、古典的正準相関分析に触発された革新的な特徴レベルの目標を最適化する。他の研究と比較すると、パラメータ化された相互情報推定器、追加のプロジェクタ、非対称構造、そして最も重要なのは、コストがかかる負のサンプルを必要としない。本研究の目的は,1) 不変表現を学習することで拡張不変情報を排除し,2) 異なる次元の特徴をデコレーションすることでデジェネレーションソリューションを回避できることである。また,本理論解析は,情報ボトルネック原理のインスタンス化と同等視できる新しい目的の理解を,自己教師付き設定下でも提供する。その単純さにもかかわらず、この手法は7つのパブリックグラフデータセット上で競合的に実行される。 We introduce a conceptually simple yet effective model for self-supervised representation learning with graph data. It follows the previous methods that generate two views of an input graph through data augmentation. However, unlike contrastive methods that focus on instance-level discrimination, we optimize an innovative feature-level objective inspired by classical Canonical Correlation Analysis. Compared with other works, our approach requires none of the parameterized mutual information estimator, additional projector, asymmetric structures, and most importantly, negative samples which can be costly. We show that the new objective essentially 1) aims at discarding augmentation-variant information by learning invariant representations, and 2) can prevent degenerated solutions by decorrelating features in different dimensions. Our theoretical analysis further provides an understanding for the new objective which can be equivalently seen as an instantiation of the Information Bottleneck Principle under the self-supervised setting. Despite its simplicity, our method performs competitively on seven public graph datasets.	翻訳日:2021-06-24 15:17:23 公開日:2021-06-23
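The abstract above describes a feature-level objective that (1) learns representations invariant across two augmented views and (2) decorrelates features in different dimensions to avoid degenerate solutions. The sketch below is one plausible form of such an objective, written under assumptions: the exact normalization, the weight `lam`, and the function names are hypothetical and are not taken from the abstract.

```python
import math

def standardize(z):
    # Center each feature column and scale it to unit norm,
    # so that the Gram matrix of the result has ones on its diagonal.
    n, d = len(z), len(z[0])
    cols = []
    for j in range(d):
        col = [row[j] for row in z]
        mean = sum(col) / n
        std = math.sqrt(sum((v - mean) ** 2 for v in col) / n) or 1.0
        cols.append([(v - mean) / (std * math.sqrt(n)) for v in col])
    return [[cols[j][i] for j in range(d)] for i in range(n)]

def gram(z):
    # Feature-by-feature correlation matrix Z^T Z.
    d = len(z[0])
    return [[sum(row[a] * row[b] for row in z) for b in range(d)]
            for a in range(d)]

def decorrelation(z):
    # Squared distance between the correlation matrix and the identity.
    g = gram(z)
    d = len(g)
    return sum((g[a][b] - (1.0 if a == b else 0.0)) ** 2
               for a in range(d) for b in range(d))

def cca_style_loss(view_a, view_b, lam=1e-3):
    # Invariance term between the two views plus a decorrelation penalty.
    za, zb = standardize(view_a), standardize(view_b)
    invariance = sum((a - b) ** 2
                     for ra, rb in zip(za, zb) for a, b in zip(ra, rb))
    return invariance + lam * (decorrelation(za) + decorrelation(zb))
```

Note that this form needs no negative samples: the decorrelation penalty alone discourages collapsed or redundant feature dimensions, which is the property the abstract emphasizes.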
# 相互監督によるマルチモーダルベイズ学習 Learning Multimodal VAEs through Mutual Supervision ( http://arxiv.org/abs/2106.12570v1 ) ライセンス: Link先を確認	Tom Joy, Yuge Shi, Philip H.S. Torr, Tom Rainforth, Sebastian M. Schmon, N. Siddharth	(参考訳) マルチモーダルVAEは、異種データ(例えば、視覚、言語)上の共同分布をモデル化し、そのようなモダリティをまたいだ共有表現も取得しようとする。先行研究は、典型的には、明示的な積、混合、または他のそのような因子化を通じて認識モデル内で直接の慣用的表現を調和させることによって、モダリティからの情報を結合する。本稿では、半教師付きVAEを再利用し、相互監督を通じて暗黙的にモダリティ間の情報を組み合わせることで、このような明示的な組み合わせを避ける新しい選択肢MEMEを紹介する。この定式化は自然に、いくつかのモダリティが完全に欠落する部分的観測されたデータから学習を可能にする。我々は,MNIST-SVHN (image-image) と CUB (image-text) のデータセットを用いた部分的および完全観察スキームにおいて,MEME が標準指標のベースラインよりも優れていることを示す。また、相互監督によって学習される表現の品質を標準的アプローチと対比し、データ間の関連性を捉えた興味深い傾向を観察する。 Multimodal VAEs seek to model the joint distribution over heterogeneous data (e.g.\ vision, language), whilst also capturing a shared representation across such modalities. Prior work has typically combined information from the modalities by reconciling idiosyncratic representations directly in the recognition model through explicit products, mixtures, or other such factorisations. Here we introduce a novel alternative, the MEME, that avoids such explicit combinations by repurposing semi-supervised VAEs to combine information between modalities implicitly through mutual supervision. This formulation naturally allows learning from partially-observed data where some modalities can be entirely missing -- something that most existing approaches either cannot handle, or do so to a limited extent. We demonstrate that MEME outperforms baselines on standard metrics across both partial and complete observation schemes on the MNIST-SVHN (image-image) and CUB (image-text) datasets. We also contrast the quality of the representations learnt by mutual supervision against standard approaches and observe interesting trends in its ability to capture relatedness between data.	翻訳日:2021-06-24 15:17:06 公開日:2021-06-23
# WeisfeilerとLehmanがセルラーへ: CWネットワーク Weisfeiler and Lehman Go Cellular: CW Networks ( http://arxiv.org/abs/2106.12575v1 ) ライセンス: Link先を確認	Cristian Bodnar, Fabrizio Frasca, Nina Otter, Yu Guang Wang, Pietro Liò, Guido Montúfar, Michael Bronstein	(参考訳) グラフニューラルネットワーク(GNN)は、表現力に制限があり、長距離相互作用に苦慮し、高次構造をモデル化する原則的な方法がない。これらの問題は、計算グラフと入力グラフ構造の間の強い結合によって引き起こされる。最近提案されたMessage Passing Simplicial Networksは、グラフのcliqueコンプレックスにメッセージパッシングを実行することによって、これらの要素を自然に分離する。しかしながら、これらのモデルはSimplicial Complexes (SCs) の厳密な組合せ構造によって厳しく制約されている。本研究では, SC の最近の理論的結果を, SC とグラフを柔軟にサブスムする位相対象である正則なセルコンプレックスに拡張する。この一般化は、グラフ「lifting」変換の強力なセットを提供し、それぞれにユニークな階層的メッセージパッシング手順をもたらす。 CWN(CW Networks)と呼ばれる結果の手法は、WLテストよりも厳格に強力であり、場合によっては3-WLテストに劣らない強力さを持つ。特に,そのようなスキームが分子グラフ問題に適用された場合に,環に基づく効果を示す。提案したアーキテクチャは、一般的に使用されるGNNよりも明らかに大きな表現性、高次信号のモデリングの原則、ノード間の距離の圧縮の利点がある。本モデルにより, 種々の分子データセットの最先端結果が得られた。 Graph Neural Networks (GNNs) are limited in their expressive power, struggle with long-range interactions and lack a principled way to model higher-order structures. These problems can be attributed to the strong coupling between the computational graph and the input graph structure. The recently proposed Message Passing Simplicial Networks naturally decouple these elements by performing message passing on the clique complex of the graph. Nevertheless, these models are severely constrained by the rigid combinatorial structure of Simplicial Complexes (SCs). In this work, we extend recent theoretical results on SCs to regular Cell Complexes, topological objects that flexibly subsume SCs and graphs. We show that this generalisation provides a powerful set of graph "lifting" transformations, each leading to a unique hierarchical message passing procedure. The resulting methods, which we collectively call CW Networks (CWNs), are strictly more powerful than the WL test and, in certain cases, not less powerful than the 3-WL test.
In particular, we demonstrate the effectiveness of one such scheme, based on rings, when applied to molecular graph problems. The proposed architecture benefits from provably larger expressivity than commonly used GNNs, principled modelling of higher-order signals and from compressing the distances between nodes. We demonstrate that our model achieves state-of-the-art results on a variety of molecular datasets.	翻訳日:2021-06-24 15:16:45 公開日:2021-06-23
# すべてのユーザが同じではない: シーケンシャルな意思決定問題に対するパーソナライズされた説明を提供すること Not all users are the same: Providing personalized explanations for sequential decision making problems ( http://arxiv.org/abs/2106.12207v1 ) ライセンス: Link先を確認	Utkarsh Soni, Sarath Sreedharan, Subbarao Kambhampati	(参考訳) 人間と一緒に機能する自律エージェントの設計への関心が高まっている。そのようなエージェントは間違いなく彼らの行動や決定を説明するだろう。説明の生成は積極的に研究されているトピックだが、ほとんどの作品は1つのサイズに適合する説明を生成する方法にフォーカスする傾向がある。ユーザモデルの仕様は、完全に無視されます。説明をユーザーの背景に合わせて調整する作業は、ユーザの特定のモデル(分析モデルや学習したラベル付けモデル)に依存する。本研究の目的は,エージェントが対話できるユーザの種類を学習することから始まる,エンドツーエンドの適応的説明生成システムを提案することである。そして、ターゲットユーザーとのインタラクション中に、フライ上のタイプを特定し、それに応じて説明を調整するタスクが実行される。前者はデータ駆動クラスタリング手法により実現され,後者では説明生成問題をPOMDPにコンパイルする。最先端のPOMDPソルバを用いた2つの領域におけるシステムの有用性を示す。また,人間とロボットのインタラクション設定においてパーソナライズされた説明を提供することのメリットを調査するユーザスタディの結果を報告する。 There is a growing interest in designing autonomous agents that can work alongside humans. Such agents will undoubtedly be expected to explain their behavior and decisions. While generating explanations is an actively researched topic, most works tend to focus on methods that generate explanations that are one size fits all. That is, the specifics of the user model are completely ignored. The handful of works that look at tailoring their explanation to the user's background rely on having specific models of the users (either analytic models or learned labeling models). The goal of this work is thus to propose an end-to-end adaptive explanation generation system that begins by learning the different types of users that the agent could interact with. Then, during the interaction with the target user, it is tasked with identifying the type on the fly and adjusting its explanations accordingly. The former is achieved by a data-driven clustering approach, while for the latter, we compile our explanation generation problem into a POMDP. We demonstrate the usefulness of our system on two domains using state-of-the-art POMDP solvers. 
We also report the results of a user study that investigates the benefits of providing personalized explanations in a human-robot interaction setting.	翻訳日:2021-06-24 15:16:06 公開日:2021-06-23
# DeepStochLog: ニューラルネットワークの確率論理プログラミング DeepStochLog: Neural Stochastic Logic Programming ( http://arxiv.org/abs/2106.12574v1 ) ライセンス: Link先を確認	Thomas Winters, Giuseppe Marra, Robin Manhaeve, Luc De Raedt	(参考訳) DeepProbLogのようなニューラルシンボル学習の最近の進歩は、確率論的論理プログラムをニューラル述語で拡張している。グラフィカルモデルと同様に、これらの確率論的論理プログラムは可能な世界上の確率分布を定義する。本稿では,確率論的定節文法に基づく別のニューラルネットワーク記号フレームワークであるDeepStochLogを提案する。より具体的には、ニューラル文法規則を確率論的定節文法に導入し、エンドツーエンドにトレーニング可能なフレームワークを作成する。神経確率論理プログラミングにおける推論と学習は、神経確率論理プログラムよりもはるかに優れていることを示す。さらに,DeepStochLogを用いた実験結果から,ニューラルシンボリック学習課題における最先端の成果が得られた。 Recent advances in neural symbolic learning, such as DeepProbLog, extend probabilistic logic programs with neural predicates. Like graphical models, these probabilistic logic programs define a probability distribution over possible worlds, for which inference is computationally hard. We propose DeepStochLog, an alternative neural symbolic framework based on stochastic definite clause grammars, a type of stochastic logic program, which defines a probability distribution over possible derivations. More specifically, we introduce neural grammar rules into stochastic definite clause grammars to create a framework that can be trained end-to-end. We show that inference and learning in neural stochastic logic programming scale much better than for neural probabilistic logic programs. Furthermore, the experimental evaluation shows that DeepStochLog achieves state-of-the-art results on challenging neural symbolic learning tasks.	翻訳日:2021-06-24 15:15:47 公開日:2021-06-23
# CxSE: COVID-19診断のための胸部X線スローエンコーディングCNN CxSE: Chest X-ray Slow Encoding CNN for COVID-19 Diagnosis ( http://arxiv.org/abs/2106.12157v1 ) ライセンス: Link先を確認	Thangarajah Akilan	(参考訳) 新型コロナウイルスは指数的なペースで広がる中、私たちの日常生活を混乱させ続けている。さらなる拡散を避けるために、陽性患者を隔離するためには迅速に検出する必要がある。この研究は、'slow Encoding CNN'と呼ばれる新しい畳み込みニューラルネットワーク(CNN)アーキテクチャを提案する。提案モデルの感度および陽性的中率(PPV)に関する最高性能は、AI AGAINST COVID19 - Screening X-ray images for COVID-19 Infections コンペティションのテストデータサンプルにおいて、SP=0.67、PP=0.98、SN=0.96、PN=0.52であった。 SP と PP は COVID-19 陽性クラスの感度と PPV を表し、SN と PN は COVID-19 陰性クラスの感度と PPV を表す。 The coronavirus continues to disrupt our everyday lives as it spreads at an exponential rate. It needs to be detected quickly in order to quarantine positive patients so as to avoid further spread. This work proposes a new convolutional neural network (CNN) architecture called 'slow Encoding CNN'. The proposed model's best performance with respect to Sensitivity and Positive Predictive Value (PPV) was found to be SP=0.67, PP=0.98, SN=0.96, and PN=0.52 on the test data samples of the AI AGAINST COVID19 - Screening X-ray images for COVID-19 Infections competition. SP and PP stand for the Sensitivity and PPV of the COVID-19 positive class, while SN and PN stand for the Sensitivity and PPV of the COVID-19 negative class.	翻訳日:2021-06-24 15:15:32 公開日:2021-06-23
# deformed2self: dynamic medical imagingのための自己教師付きデノイジング Deformed2Self: Self-Supervised Denoising for Dynamic Medical Imaging ( http://arxiv.org/abs/2106.12175v1 ) ライセンス: Link先を確認	Junshen Xu, Elfar Adalsteinsson	(参考訳) 画像変性は,疾患診断や下流画像解析のための画像品質を向上させるため,医用画像システムにとって非常に重要である。様々な応用において、動的イメージング技術を用いて被写体の時間変化の特徴を捉え、同じ被写体に対して異なる時間ポイントで複数の画像を取得する。各時間フレームの信号対雑音比は通常、短い取得時間によって制限されるが、異なる時間フレーム間の相関を利用して、時間フレーム間の共有情報による復調結果を改善することができる。コンピュータビジョンにおけるニューラルネットワークの成功により、教師付きディープラーニング手法は、クリーンなvsノイズのイメージペアを持つ大規模なデータセットに依存する、単一イメージの認知において顕著なパフォーマンスを示す。近年, 自己教師付き深層ノイズモデルがいくつか提案されており, クリーン画像のペアワイズ基底真理を必要とせず, 有望な結果が得られた。しかし,マルチイメージ・Denoisingの分野では,自己教師型深層学習法を用いて,複数のスライスから相関情報を抽出する作業はほとんど行われていない。本研究では,ダイナミックイメージングのためのエンドツーエンドの自己教師型ディープラーニングフレームワークDeformed2Selfを提案する。シングルイメージとマルチイメージのデノゲーションを組み合わせて画質を改善し、空間トランスフォーマーネットワークを使用して異なるスライス間の動きをモデル化する。さらに、トレーニングと推論のために異なる時間フレームでいくつかの補助的な観察を行う単一ノイズ画像のみを必要とする。ノイズ統計値の異なるファントムおよび生体内データを用いて評価したところ,本手法は他の最先端の教師なし・自己監督型復調法と同等の性能を示した。 Image denoising is of great importance for medical imaging system, since it can improve image quality for disease diagnosis and downstream image analyses. In a variety of applications, dynamic imaging techniques are utilized to capture the time-varying features of the subject, where multiple images are acquired for the same subject at different time points. Although signal-to-noise ratio of each time frame is usually limited by the short acquisition time, the correlation among different time frames can be exploited to improve denoising results with shared information across time frames. With the success of neural networks in computer vision, supervised deep learning methods show prominent performance in single-image denoising, which rely on large datasets with clean-vs-noisy image pairs. Recently, several self-supervised deep denoising models have been proposed, achieving promising results without needing the pairwise ground truth of clean images. 
In the field of multi-image denoising, however, very little work has been done on extracting correlated information from multiple slices for denoising using self-supervised deep learning methods. In this work, we propose Deformed2Self, an end-to-end self-supervised deep learning framework for dynamic imaging denoising. It combines single-image and multi-image denoising to improve image quality and uses a spatial transformer network to model motion between different slices. Further, it only requires a single noisy image with a few auxiliary observations at different time frames for training and inference. Evaluations on phantom and in vivo data with different noise statistics show that our method has comparable performance to other state-of-the-art unsupervised or self-supervised denoising methods and outperforms them under high noise levels.	翻訳日:2021-06-24 15:15:16 公開日:2021-06-23
# 複数のスマートフォンのための協調的視覚慣性SLAM Collaborative Visual Inertial SLAM for Multiple Smart Phones ( http://arxiv.org/abs/2106.12186v1 ) ライセンス: Link先を確認	Jialing Liu, Ruyu Liu, Kaiqi Chen, Jianhua Zhang, Dongyan Guo	(参考訳) マッピングの効率性と正確性は、大規模なシーンと長期的なARアプリケーションにおいて極めて重要です。マルチエージェント協調SLAMはマルチユーザARインタラクションの前提条件である。複数のスマートフォンの連携により、タスク完了の効率性と堅牢性が向上し、単一のエージェントができないタスクを完了することができる。しかし、堅牢な通信、効率的な位置検出、ロバストマッピング、エージェント間の効率的な情報共有に依存している。マルチインテリジェンス・コラボレーティブな単眼視覚-慣性SLAMを,集中型アーキテクチャで複数のiOSモバイルデバイスにデプロイする。各エージェントは独立して環境を探索し、視覚的慣性オドメトリーモジュールをオンラインで実行し、高いコンピューティングリソースを持つ中央サーバにすべての計測情報を送信することができる。サーバは受信したすべての情報を管理し、重複領域を検出し、地図をマージして最適化し、必要に応じてエージェントと情報を共有する。我々は,公開データセットと実環境におけるシステムの性能を検証した。提案システムのマッピングと融合の精度は,より高い計算資源を必要とするVINS-Monoに匹敵する。 The efficiency and accuracy of mapping are crucial in large-scene and long-term AR applications. Multi-agent cooperative SLAM is the precondition of multi-user AR interaction. The cooperation of multiple smart phones has the potential to improve efficiency and robustness of task completion and can complete tasks that a single agent cannot do. However, it depends on robust communication, efficient location detection, robust mapping, and efficient information sharing among agents. We propose a multi-intelligence collaborative monocular visual-inertial SLAM deployed on multiple iOS mobile devices with a centralized architecture. Each agent can independently explore the environment, run a visual-inertial odometry module online, and then send all the measurement information to a central server with higher computing resources. The server manages all the information received, detects overlapping areas, merges and optimizes the map, and shares information with the agents when needed. We have verified the performance of the system on public datasets and in real environments. The accuracy of mapping and fusion of the proposed system is comparable to that of VINS-Mono, which requires higher computing resources.	翻訳日:2021-06-24 15:14:46 公開日:2021-06-23
# Pseudo Lesionから学ぶ : 新型コロナウイルスの自己診断フレームワーク Learning from Pseudo Lesion: A Self-supervised Framework for COVID-19 Diagnosis ( http://arxiv.org/abs/2106.12313v1 ) ライセンス: Link先を確認	Zhongliang Li, Zhihao Jin, Xuechen Li, Linlin Shen	(参考訳) 2019年12月の最初の報告以来、新型コロナウイルス(COVID-19)は世界中で急速に広がり、胸部ct(胸部ct)はその診断の主要なツールの1つとなっている。近年、ディープラーニングに基づくアプローチは、無数の画像認識タスクにおいて顕著なパフォーマンスを示している。しかし、通常、トレーニングには大量の注釈付きデータが必要である。今回我々は,COVID-19患者のCT検査でよく見られるグラウンドグラス・オパシティ(GGO)に触発され,疑似病変の発生と回復に基づく自己監督型事前訓練法を提案した。我々は,勾配雑音に基づく数学的モデルであるPerlin noiseを用いて病変様パターンを生成し,正常なCT画像の肺領域にランダムに貼り付け,擬似的なCOVID-19画像を生成する。正常と偽のCOVID-19イメージのペアは、ラベル付きデータを必要としない画像復元のためにエンコーダデコーダアーキテクチャに基づくU-Netのトレーニングに使用された。事前訓練されたエンコーダは、新型コロナウイルスの診断のためにラベル付きデータを使用して微調整された。 CT画像を用いた2つの公開COVID-19診断データセットを用いて評価を行った。総合的な実験結果から,提案手法は,SARS-CoV-2データセットとJinan COVID-19データセットでそれぞれ6.57%,3.03%の精度で教師付きモデルより優れた特徴表現を抽出できることが確認された。 The Coronavirus disease 2019 (COVID-19) has rapidly spread all over the world since its first report in December 2019 and thoracic computed tomography (CT) has become one of the main tools for its diagnosis. In recent years, deep learning-based approaches have shown impressive performance in myriad image recognition tasks. However, they usually require a large amount of annotated data for training. Inspired by Ground Glass Opacity (GGO), a common finding in COVID-19 patients' CT scans, we propose in this paper a novel self-supervised pretraining method based on pseudo-lesion generation and restoration for COVID-19 diagnosis. We used Perlin noise, a gradient-noise-based mathematical model, to generate lesion-like patterns, which were then randomly pasted to the lung regions of normal CT images to generate pseudo COVID-19 images. The pairs of normal and pseudo COVID-19 images were then used to train an encoder-decoder architecture based U-Net for image restoration, which does not require any labelled data. The pretrained encoder was then fine-tuned using labelled data for the COVID-19 diagnosis task. 
Two public COVID-19 diagnosis datasets made up of CT images were employed for evaluation. Comprehensive experimental results demonstrated that the proposed self-supervised learning approach could extract better feature representation for COVID-19 diagnosis and the accuracy of the proposed method outperformed the supervised model pretrained on large scale images by 6.57% and 3.03% on SARS-CoV-2 dataset and Jinan COVID-19 dataset, respectively.	翻訳日:2021-06-24 15:14:31 公開日:2021-06-23
# ステレオカメラを用いた新しいビデオ合成手法 A new Video Synopsis Based Approach Using Stereo Camera ( http://arxiv.org/abs/2106.12362v1 ) ライセンス: Link先を確認	Talha Dilber, Mehmet Serdar Guzel, Erkan Bostanci	(参考訳) 今日の世界では、各分野で生成されるデータ量は予期せぬレベルで増加している。データの増加に直面したデータ処理の重要性は著しく高まっている。本研究のトピックは,データ増加に重要な位置を占めるビデオデータの処理と要約ビデオの生成に関するものです。本研究の範囲内で,映像要約作成中に,オブジェクトベースの教師なし学習を用いた異常検出手法が開発されている。この方法を用いて、映像データを画素として処理し、ビデオセグメントとして結果を生成する。プロセスフローは、次のように簡単に要約できる。ビデオ上のオブジェクトは、そのタイプに応じて検出され、その後追跡される。そして、オブジェクトのトラッキング履歴データを処理し、そのオブジェクトタイプで分類器を訓練する。この分類器により、物体の異常な挙動を検出する。映像セグメントは、異常動作を含む映像モーメントを処理して決定される。検出されたビデオセグメントを元のビデオから抽出して組み合わせることで、ビデオ要約を作成する。私たちが開発したモデルは、シングルカメラとデュアルカメラシステムで別々にテストされ、検証されています。 In today's world, the amount of data produced in every field has increased at an unexpected level. In the face of increasing data, the importance of data processing has increased remarkably. Our research topic is the processing of video data, which accounts for an important share of this growing data, and the production of summary videos. Within the scope of this research, a new method for anomaly detection with object-based unsupervised learning has been developed while creating a video summary. By using this method, the video data is processed as pixels and the result is produced as a video segment. The process flow can be briefly summarized as follows. Objects in the video are detected according to their type, and then they are tracked. Then, the tracking history data of the objects are processed, and the classifier is trained with the object type. Using this classifier, anomalous behavior of objects is detected. Video segments are determined by processing video moments containing anomalous behaviors. The video summary is created by extracting the detected video segments from the original video and combining them. The model we developed has been tested and verified separately for single camera and dual camera systems.	翻訳日:2021-06-24 15:14:04 公開日:2021-06-23
# STRESS:自己監督学習を用いた動的胎児MRIの超解像 STRESS: Super-Resolution for Dynamic Fetal MRI using Self-Supervised Learning ( http://arxiv.org/abs/2106.12407v1 ) ライセンス: Link先を確認	Junshen Xu, Esra Abaci Turk, P. Ellen Grant, Polina Golland, Elfar Adalsteinsson	(参考訳) 胎児の運動は、従来のMRIスキャンのスケールでは予測不可能で急速である。したがって、胎児の運動と胎児機能のダイナミックスを捉えることを目的とした動的胎児MRIは、画像品質と解像度の妥協を伴う高速イメージング技術に限られる。特にオーバーサンプリングのための多方向画像スライススタックが利用できず、胎児や胎盤のダイナミックスを記録するための高時間分解能が望まれる場合、動的胎児MRIの超高解像度化は依然として課題である。さらに、胎児の動きは、教師あり学習方法のための高解像度画像を得るのを難しくする。そこで本研究では,動的胎児MRIのための自己監督型超解像フレームワークSTRESS(Spatio-Temporal Resolution Enhancement with Simulated Scans)を提案する。提案手法は,低解像度画像と高解像度画像のペアを生成するために,元々取得したデータの高分解能軸に沿ったインターリーブスライス取得をシミュレートする。そして、MR時系列における空間的相関と時間的相関を利用して、元のデータの解像度を高めることで超解像ネットワークを訓練する。シミュレーションおよび子宮内データによる評価は,提案手法が他の自己教師付き超解像法より優れ,画質が向上し,他の下流タスクや評価に有用であることを示す。 Fetal motion is unpredictable and rapid on the scale of conventional MR scan times. Therefore, dynamic fetal MRI, which aims at capturing fetal motion and dynamics of fetal function, is limited to fast imaging techniques with compromises in image quality and resolution. Super-resolution for dynamic fetal MRI is still a challenge, especially when multi-oriented stacks of image slices for oversampling are not available and high temporal resolution for recording the dynamics of the fetus or placenta is desired. Further, fetal motion makes it difficult to acquire high-resolution images for supervised learning methods. To address this problem, in this work, we propose STRESS (Spatio-Temporal Resolution Enhancement with Simulated Scans), a self-supervised super-resolution framework for dynamic fetal MRI with interleaved slice acquisitions. Our proposed method simulates an interleaved slice acquisition along the high-resolution axis on the originally acquired data to generate pairs of low- and high-resolution images. 
Then, it trains a super-resolution network by exploiting both spatial and temporal correlations in the MR time series, which is used to enhance the resolution of the original data. Evaluations on both simulated and in utero data show that our proposed method outperforms other self-supervised super-resolution methods and improves image quality, which is beneficial to other downstream tasks and evaluations.	翻訳日:2021-06-24 15:13:50 公開日:2021-06-23
# foldit: 大腸内視鏡ビデオにおけるhaustral foldsの検出とセグメンテーション FoldIt: Haustral Folds Detection and Segmentation in Colonoscopy Videos ( http://arxiv.org/abs/2106.12522v1 ) ライセンス: Link先を確認	Shawn Mathew, Saad Nadeem, Arie Kaufman	(参考訳) ホストラル折りたたみ(haustral fold)は、大腸内視鏡検査中に高いポリープミス率を示す大腸壁突起である。正確にセグメンテーションされた場合、ハストラルフォールドは欠損面のより良い推定を可能にし、また前処理の仮想(CT)と光学的大腸内視鏡を登録するための貴重なランドマークとして機能し、前処理のスキャンで見つかった異常へのナビゲーションをガイドする。本稿では,光学的大腸内視鏡映像から仮想大腸内視鏡画像へのハウストラルフォールドオーバーレイを用いた画像変換のための,新しい生成的逆向きネットワークfolditを提案する。新しい推移的損失を導入し,Hustral fold アノテーションと仮想大腸内視鏡的レンダリングの接地真理情報を活用する。そこで本研究では,本モデルの有効性を,実際に挑戦する光大腸内視鏡ビデオや,臨床医が検証したオーストラルフォールドアノテーションを用いたテクスチャ付き仮想大腸内視鏡ビデオに示す。この論文の実験を再現するコードとスクリプトは、https://github.com/nadeemlab/CEPのComputational Endoscopy Platformで公開されます。 Haustral folds are colon wall protrusions implicated for high polyp miss rate during optical colonoscopy procedures. If segmented accurately, haustral folds can allow for better estimation of missed surface and can also serve as valuable landmarks for registering pre-treatment virtual (CT) and optical colonoscopies, to guide navigation towards the anomalies found in pre-treatment scans. We present a novel generative adversarial network, FoldIt, for feature-consistent image translation of optical colonoscopy videos to virtual colonoscopy renderings with haustral fold overlays. A new transitive loss is introduced in order to leverage ground truth information between haustral fold annotations and virtual colonoscopy renderings. We demonstrate the effectiveness of our model on real challenging optical colonoscopy videos as well as on textured virtual colonoscopy videos with clinician-verified haustral fold annotations. All code and scripts to reproduce the experiments of this paper will be made available via our Computational Endoscopy Platform at https://github.com/nadeemlab/CEP.	翻訳日:2021-06-24 15:13:28 公開日:2021-06-23
# ソーシャルナビゲーションにおける紛争の予防と解決 -調査- Prevention and Resolution of Conflicts in Social Navigation -- a Survey ( http://arxiv.org/abs/2106.12113v1 ) ライセンス: Link先を確認	Reuth Mirsky and Xuesu Xiao and Justin Hart and Peter Stone	(参考訳) 人間とロボットが共有する環境でロボットを協働させるという目標が近づき、このコンテキストにおけるナビゲーションは重要かつ望ましいものとなる。ロボット工学の最近の進歩は、人間とロボットが混在する環境をナビゲートする際のいくつかの課題に遭遇し、対処してきたが、近年は、ソーシャルナビゲーションにおけるエージェント間の衝突をどう扱うかという問題に特に焦点を絞った、関連する研究の急増が観察されている。これらの貢献はモデル、アルゴリズム、評価指標を提供するが、この研究領域は本質的に学際的であるため、関連する論文の多くは同等ではなく、研究者の間には標準的な語彙がない。この調査の主な目標は、このような共通言語を提案し、既存の作業を調査し、オープンな問題を強調することで、このギャップを埋めることである。ソーシャルナビゲーションにおける衝突を定義することから始まり、コンポーネントの詳細な分類を提供する。この調査は、提案する分類法の枠組みを用いて論文を議論しながら、既存の研究を地図化する。最後に,現在ソーシャルナビゲーションの最前線にある今後の方向性と課題について,研究の焦点を絞るために提案する。 With the approaching goal of having robots collaborate in shared human-robot environments, navigation in this context becomes both crucial and desirable. Recent developments in robotics have encountered and tackled some of the challenges of navigating in mixed human-robot environments, and in recent years we observe a surge of related work that specifically targets the question of how to handle conflicts between agents in social navigation. These contributions offer models, algorithms, and evaluation metrics; however, as this research area is inherently interdisciplinary, many of the relevant papers are not comparable and there is no standard vocabulary between the researchers. The main goal of this survey is to bridge this gap by proposing such a common language, using it to survey existing work, and highlighting open problems. It starts by defining a conflict in social navigation, and offers a detailed taxonomy of its components. This survey then maps existing work while discussing papers using the framing of the proposed taxonomy. Finally, this paper proposes some future directions and problems that are currently at the frontier of social navigation to help focus research efforts.	翻訳日:2021-06-24 15:13:09 公開日:2021-06-23
# SKIM-FAカーネル:線形時間における高次元可変選択と非線形相互作用発見 The SKIM-FA Kernel: High-Dimensional Variable Selection and Nonlinear Interaction Discovery in Linear Time ( http://arxiv.org/abs/2106.12408v1 ) ライセンス: Link先を確認	Raj Agrawal and Tamara Broderick	(参考訳) 多くの科学的問題は、標的反応に関連する小さな共変体を同定し、その効果を推定する必要がある。これらの効果は、しばしば非線形であり、相互作用を含むため、線形および加法的手法は、推定と変数選択の貧弱につながる。ベイズフレームワークは、階層的モデルにおいてスパーシリティ、非線形性、相互作用を同時に表現する。しかし、この三量体を扱う他のいくつかの方法と同様に、推論は計算的に難解である。本研究では,この計算ボトルネックを解消する。まず,適切なベイズモデルがガウス過程(gps)として表現できることを示す。次に,これらのgpsを用いた計算を,変数選択と推定の両方においてo(# covariates)時間に短縮する方法を示す。我々の結果の適合性は、ヒルベルト空間における回帰関数のスパース直交分解(つまり、機能的ANOVA分解)に対応し、相互作用効果は低次効果によって説明できないすべての変動を表す。様々な合成データセットと実データセットにおいて、当社のアプローチは、大規模で高次元のデータセットに使用される既存の手法よりも優れています。 Many scientific problems require identifying a small set of covariates that are associated with a target response and estimating their effects. Often, these effects are nonlinear and include interactions, so linear and additive methods can lead to poor estimation and variable selection. The Bayesian framework makes it straightforward to simultaneously express sparsity, nonlinearity, and interactions in a hierarchical model. But, as for the few other methods that handle this trifecta, inference is computationally intractable - with runtime at least quadratic in the number of covariates, and often worse. In the present work, we solve this computational bottleneck. We first show that suitable Bayesian models can be represented as Gaussian processes (GPs). We then demonstrate how a kernel trick can reduce computation with these GPs to O(# covariates) time for both variable selection and estimation. Our resulting fit corresponds to a sparse orthogonal decomposition of the regression function in a Hilbert space (i.e., a functional ANOVA decomposition), where interaction effects represent all variation that cannot be explained by lower-order effects. 
On a variety of synthetic and real datasets, our approach outperforms existing methods used for large, high-dimensional datasets while remaining competitive (or being orders of magnitude faster) in runtime.	翻訳日:2021-06-24 15:12:50 公開日:2021-06-23
# パスシグネチャを用いた近似ベイズ計算 Approximate Bayesian Computation with Path Signatures ( http://arxiv.org/abs/2106.12555v1 ) ライセンス: Link先を確認	Joel Dyer, Patrick Cannon, Sebastian M Schmon	(参考訳) 科学的な関心のシミュレーションモデルは、しばしば扱いやすい尤度関数を欠いており、標準的な尤度ベースの統計推論ができない。シミュレータのパラメータを推定する一般的な尤度フリー手法として近似ベイズ計算があり、シミュレータ出力と観測データを比較して近似後段をサンプリングする。しかし,特に高次元で構造的に複雑である時系列データでは,シミュレーションデータと観測データとの密接度を効果的に測定することは一般的に困難である。既存のアプローチは通常、手動で要約統計を構築したり、ドメインの専門知識や実験を必要としたり、i.i.d.データのような非現実的な仮定に依存したりする。その他、多変量や不規則にサンプリングされた時系列データのようなより複雑な設定では不適切である。本稿では,近似ベイズ計算アルゴリズムで使用する時系列データ間の距離を構築するための自然候補特徴集合としてパスシグネチャを用いることを提案する。実験により, 従来の時系列モデルよりも高精度なベイズ後方推定が可能であることが示された。 Simulation models of scientific interest often lack a tractable likelihood function, precluding standard likelihood-based statistical inference. A popular likelihood-free method for inferring simulator parameters is approximate Bayesian computation, where an approximate posterior is sampled by comparing simulator output and observed data. However, effective measures of closeness between simulated and observed data are generally difficult to construct, particularly for time series data which are often high-dimensional and structurally complex. Existing approaches typically involve manually constructing summary statistics, requiring substantial domain expertise and experimentation, or rely on unrealistic assumptions such as iid data. Others are inappropriate in more complex settings like multivariate or irregularly sampled time series data. In this paper, we introduce the use of path signatures as a natural candidate feature set for constructing distances between time series data for use in approximate Bayesian computation algorithms. Our experiments show that such an approach can generate more accurate approximate Bayesian posteriors than existing techniques for time series models.	翻訳日:2021-06-24 15:12:28 公開日:2021-06-23
# 不確実性認識モデルに基づく強化学習と自動運転への応用 Uncertainty-Aware Model-Based Reinforcement Learning with Application to Autonomous Driving ( http://arxiv.org/abs/2106.12194v1 ) ライセンス: Link先を確認	Jingda Wu, Zhiyu Huang, Chen Lv	(参考訳) 本稿では、強化学習(RL)の学習効率と性能をさらに向上させるために、新しい不確実性を考慮したモデルベースRL(UA-MBRL)フレームワークを提案する。まず,仮想環境モデルとして不確実性評価能力を有する動作条件アンサンブルモデルを確立する。そして,適応的トランケーションアプローチに基づいて,新たな不確実性を考慮したRLフレームワークを開発し,エージェントと環境モデルの仮想インタラクションを提供し,RLのトレーニング効率と性能を向上させる。開発したアルゴリズムは、エンド・ツー・エンドの自動運転車制御タスクで実装され、様々な運転シナリオで最先端の手法と比較される。その結果,UA-MBRL法は既存のモデルベースおよびモデルフリーRL法を学習効率の観点から上回り,性能が向上した。また,様々な自律運転シナリオにおいて,適応性とロバスト性に関して提案手法の有効性を示す。 To further improve the learning efficiency and performance of reinforcement learning (RL), in this paper we propose a novel uncertainty-aware model-based RL (UA-MBRL) framework, and then implement and validate it in autonomous driving under various task scenarios. First, an action-conditioned ensemble model with the ability of uncertainty assessment is established as the virtual environment model. Then, a novel uncertainty-aware model-based RL framework is developed based on the adaptive truncation approach, providing virtual interactions between the agent and environment model, and improving RL's training efficiency and performance. The developed algorithms are then implemented in end-to-end autonomous vehicle control tasks, validated and compared with state-of-the-art methods under various driving scenarios. The validation results suggest that the proposed UA-MBRL method surpasses the existing model-based and model-free RL approaches, in terms of learning efficiency and achieved performance. The results also demonstrate the good ability of the proposed method with respect to the adaptiveness and robustness, under various autonomous driving scenarios.	翻訳日:2021-06-24 15:11:50 公開日:2021-06-23
# MG-DVD:動的不均一グラフ学習に基づくマルウェア検出のためのリアルタイムフレームワーク MG-DVD: A Real-time Framework for Malware Variant Detection Based on Dynamic Heterogeneous Graph Learning ( http://arxiv.org/abs/2106.12288v1 ) ライセンス: Link先を確認	Chen Liu, Bo Li, Jun Zhao, Ming Su, Xu-Dong Liu	(参考訳) 新たなマルウェアをリアルタイムで検出することは、サイバーリスクを軽減し、積極的に侵入を阻止するために重要である。本稿では,動的異種グラフ学習に基づく新しい検出フレームワークMG-DVDを提案する。特にmg-dvdは、マルウェア変異体の細かな実行イベントストリームを動的ヘテロジニアスグラフにモデル化し、マルウェアオブジェクト間の実世界のメタグラフを調査し、マルウェアとその変異種間のより識別的な悪意のある進化パターンを効果的に特徴付ける。そして、MG-DVDは2つの動的ウォークに基づく異種グラフ学習法を示し、より包括的なマルウェアの表現を学習し、グラフ再学習のコストを大幅に削減する。その結果、MG-DVDはマルウェアの変種をリアルタイムで検出する機能を備えており、意味のあるメタグラフを導入することにより、より優れた解釈性を示す。大規模サンプルの総合的な実験により,提案したMG-DVDは,有効性と効率の観点から,マルウェアの変異を検出する最先端の手法より優れていることが示された。 Detecting the newly emerging malware variants in real time is crucial for mitigating cyber risks and proactively blocking intrusions. In this paper, we propose MG-DVD, a novel detection framework based on dynamic heterogeneous graph learning, to detect malware variants in real time. Particularly, MG-DVD first models the fine-grained execution event streams of malware variants into dynamic heterogeneous graphs and investigates real-world meta-graphs between malware objects, which can effectively characterize more discriminative malicious evolutionary patterns between malware and their variants. Then, MG-DVD presents two dynamic walk-based heterogeneous graph learning methods to learn more comprehensive representations of malware variants, which significantly reduces the cost of the entire graph retraining. As a result, MG-DVD is equipped with the ability to detect malware variants in real time, and it presents better interpretability by introducing meaningful meta-graphs. Comprehensive experiments on large-scale samples prove that our proposed MG-DVD outperforms state-of-the-art methods in detecting malware variants in terms of effectiveness and efficiency.	翻訳日:2021-06-24 15:11:31 公開日:2021-06-23
# EXPLAINable DGA Multiclass Classification への第一歩 First Step Towards EXPLAINable DGA Multiclass Classification ( http://arxiv.org/abs/2106.12336v1 ) ライセンス: Link先を確認	Arthur Drichel, Nils Faerber, Ulrike Meyer	(参考訳) 多くのマルウェアファミリーは、コマンドとコントロール(C2)サーバーへの接続を確立するためにドメイン生成アルゴリズム(DGA)に依存している。 DGAと対抗して、特定のドメイン名を生成したDGAを識別し、ターゲットの修復措置を誘発する機械学習分類器が提案されている。しかし、提案した最先端分類器はディープラーニングモデルに基づいている。これらのブラックボックスの性質は、その推論を評価するのを難しくしている。その結果、信頼性の欠如により、そのようなモデルの利用は不可能となる。本稿では,機能ベースでコンテキストレスなDGAマルチクラス分類器EXPLAINを提案する。我々は,同じ実世界のデータに基づいて,複数の最先端の分類器に対して,特徴集合とハイパーパラメータの組み合わせを比較検討した。提案するDGAマルチクラス分類器の予測よりも,特徴に遡ることが容易である。 Numerous malware families rely on domain generation algorithms (DGAs) to establish a connection to their command and control (C2) server. Counteracting DGAs, several machine learning classifiers have been proposed enabling the identification of the DGA that generated a specific domain name and thus triggering targeted remediation measures. However, the proposed state-of-the-art classifiers are based on deep learning models. The black box nature of these makes it difficult to evaluate their reasoning. The resulting lack of confidence makes the utilization of such models impracticable. In this paper, we propose EXPLAIN, a feature-based and contextless DGA multiclass classifier. We comparatively evaluate several combinations of feature sets and hyperparameters for our approach against several state-of-the-art classifiers in a unified setting on the same real-world data. Our classifier achieves competitive results, is real-time capable, and its predictions are easier to trace back to features than the predictions made by the DGA multiclass classifiers proposed in related work.	翻訳日:2021-06-24 15:11:10 公開日:2021-06-23
# ハイスタックでフィッシュを見つける: 認証透明性ログのフィッシュ分類のためのパイプライン Finding Phish in a Haystack: A Pipeline for Phishing Classification on Certificate Transparency Logs ( http://arxiv.org/abs/2106.12343v1 ) ライセンス: Link先を確認	Arthur Drichel, Vincent Drury, Justus von Brandt, Ulrike Meyer	(参考訳) 現在の一般的なフィッシング防止技術は、主に、被害者が保護されていない攻撃者に「機会の窓」を残すリアクティブブロックリストを使用する。このウィンドウを短くする1つの可能なアプローチは、認証透明性(CT)ログを監視して、ウェブサイトの準備中にフィッシング攻撃を早期に検出することである。フィッシング分類のためのCTログデータを扱う以前の試みは存在するが、実際のCTログデータに対する評価は欠如している。本稿では,CTログデータを扱う際の問題に対処し,そのような評価を容易にするパイプラインを提案する。パイプラインにはデータセットの作成、トレーニング、CTログの過去またはライブ分類が含まれている。そのモジュラ構造により、分類器や検証源を簡単に交換し、基底真理ラベルと分類器の比較をサポートすることができる。パイプラインを多数の新しいおよび既存の分類器でテストし、将来このシナリオの分類器を改善する一般的な可能性を見出す。パイプラインと使用するデータセットのソースコードと論文(https://gitlab.com/rwth-itsec/ctl-pipeline)を公開しています。 Current popular phishing prevention techniques mainly utilize reactive blocklists, which leave a ``window of opportunity'' for attackers during which victims are unprotected. One possible approach to shorten this window aims to detect phishing attacks earlier, during website preparation, by monitoring Certificate Transparency (CT) logs. Previous attempts to work with CT log data for phishing classification exist, however they lack evaluations on actual CT log data. In this paper, we present a pipeline that facilitates such evaluations by addressing a number of problems when working with CT log data. The pipeline includes dataset creation, training, and past or live classification of CT logs. Its modular structure makes it possible to easily exchange classifiers or verification sources to support ground truth labeling efforts and classifier comparisons. We test the pipeline on a number of new and existing classifiers, and find a general potential to improve classifiers for this scenario in the future. We publish the source code of the pipeline and the used datasets along with this paper (https://gitlab.com/rwth-itsec/ctl-pipeline), thus making future research in this direction more accessible.	
翻訳日:2021-06-24 15:10:55 公開日:2021-06-23
# パストレースのためのリアルタイムニューラルネットワークラミアンスキャッシング Real-time Neural Radiance Caching for Path Tracing ( http://arxiv.org/abs/2106.12372v1 ) ライセンス: Link先を確認	Thomas M\"uller, Fabrice Rousselle, Jan Nov\'ak, Alexander Keller	(参考訳) 本稿では,パストレースによるグローバル照明のためのリアルタイムニューラルネットワークラミアンスキャッシング手法を提案する。我々のシステムは、完全にダイナミックなシーンを扱うように設計されており、照明、幾何学、材料に関する仮定は一切ない。私たちのアプローチのデータ駆動性は、キャッシュポイントの配置、補間、更新など、キャッシュアルゴリズムの多くの難しさを回避します。ニューラルネットワークをトレーニングして新しいものを扱うため、動的シーンは恐ろしい一般化の課題であるので、事前トレーニングを廃止し、適応によって一般化する。レンダリング中にレイディアンスキャッシュを訓練することにしました低ノイズのトレーニングターゲットを提供し、数バウンストレーニング更新を単に繰り返して無限バウンス輸送をシミュレートするために、自己学習を採用している。最新のハードウェアをフル活用したニューラルネットワークのストリーミング実装のおかげで、更新とキャッシュクエリは -- フルhd解像度で約2.6ミリ秒の軽いオーバーヘッドを伴います。バイアスを小さく抑えることで大きなノイズ低減効果を示すとともに,多くの課題に対して最先端のリアルタイム性能を報告した。 We present a real-time neural radiance caching method for path-traced global illumination. Our system is designed to handle fully dynamic scenes, and makes no assumptions about the lighting, geometry, and materials. The data-driven nature of our approach sidesteps many difficulties of caching algorithms, such as locating, interpolating, and updating cache points. Since pretraining neural networks to handle novel, dynamic scenes is a formidable generalization challenge, we do away with pretraining and instead achieve generalization via adaptation, i.e. we opt for training the radiance cache while rendering. We employ self-training to provide low-noise training targets and simulate infinite-bounce transport by merely iterating few-bounce training updates. The updates and cache queries incur a mild overhead -- about 2.6ms on full HD resolution -- thanks to a streaming implementation of the neural network that fully exploits modern hardware. We demonstrate significant noise reduction at the cost of little induced bias, and report state-of-the-art, real-time performance on a number of challenging scenarios.	翻訳日:2021-06-24 15:10:33 公開日:2021-06-23
# 一般化誤差制御による回帰学習のためのトレーニングデータサブセット選択 Training Data Subset Selection for Regression with Controlled Generalization Error ( http://arxiv.org/abs/2106.12491v1 ) ライセンス: Link先を確認	Durga Sivasubramanian, Rishabh Iyer, Ganesh Ramakrishnan, Abir De	(参考訳) 多数のトレーニングインスタンスからのデータサブセット選択は、効率的でコスト効率の良い機械学習へのアプローチとして成功している。しかし、より小さな部分集合で訓練されたモデルは、一般化能力に乏しい。本稿では,トレーニングデータのサブセットを選択するアルゴリズムを設計することで,精度を著しく犠牲にすることなく,モデルを迅速にトレーニングすることを目的とする。より具体的には、l2正規化回帰問題に対するデータサブセット選択に着目し、トレーニング可能なパラメータとトレーニングデータのサブセットの両方に対するトレーニング損失を最小限に抑えることを目的とした新しい問題定式化を提供する。我々はいくつかの技術革新を用いてこの問題に取り組む。まず、この問題を元のトレーニング問題の双対を用いて単純化した制約で表現し、この新しい表現の目的が様々なモデリング選択に対してモノトーンおよびα-部分モジュラー関数であることを示す。このような特性により、トレーニングがトレーニングされたモデルの不完全推定を提供しても近似を保証する、データサブセット選択のための効率的な分極最小化アルゴリズムであるSELCONを開発することができる。最後に、いくつかのデータセットに対する実験により、SELCONは現在の最先端技術よりも精度と効率を効果的に交換することを示した。 Data subset selection from a large number of training instances has been a successful approach toward efficient and cost-effective machine learning. However, models trained on a smaller subset may show poor generalization ability. In this paper, our goal is to design an algorithm for selecting a subset of the training data, so that the model can be trained quickly, without significantly sacrificing on accuracy. More specifically, we focus on data subset selection for L2 regularized regression problems and provide a novel problem formulation which seeks to minimize the training loss with respect to both the trainable parameters and the subset of training data, subject to error bounds on the validation set. We tackle this problem using several technical innovations. First, we represent this problem with simplified constraints using the dual of the original training problem and show that the objective of this new representation is a monotone and alpha-submodular function, for a wide variety of modeling choices. 
Such properties lead us to develop SELCON, an efficient majorization-minimization algorithm for data subset selection, that admits an approximation guarantee even when the training provides an imperfect estimate of the trained model. Finally, our experiments on several datasets show that SELCON trades off accuracy and efficiency more effectively than the current state-of-the-art.	翻訳日:2021-06-24 15:10:16 公開日:2021-06-23
# 戦略分類で誰がリードし、誰がフォローするか? Who Leads and Who Follows in Strategic Classification? ( http://arxiv.org/abs/2106.12529v1 ) ライセンス: Link先を確認	Tijana Zrnic, Eric Mazumdar, S. Shankar Sastry, Michael I. Jordan	(参考訳) 予測モデルが現実世界にデプロイされるにつれ、彼らはますます戦略的な行動と競合しなくてはならない。戦略分類に関する活動の活発化は、この問題をStackelbergのゲームとして扱う: 意思決定者(deciment-maker)は、モデルをデプロイすることでゲーム内で"リード(leads)"する。重要なのは、このフレーミングでは、学習の負担は意思決定者のみに置かれ、エージェントのベストレスポンスは暗黙的に瞬時に扱われる。本研究では,戦略分類における役割の順序は,意思決定者とエージェントが互いの行動に適応する相対周波数によって決定されると主張している。特に,両プレイヤーが時間とともに学習できるように標準モデルを一般化することにより,エージェントよりも高速に更新を行う意思決定者がプレーの順序を逆転し,エージェントがリードし,意思決定者が従うことを示す。我々は,このような役割の逆転が意思決定者や戦略エージェントにとって望ましいことを,標準的な学習環境で観察する。最後に,更新頻度を自由に選択できる意思決定者は,いずれの順序でもstackelberg equilibriaに収束する学習ダイナミクスを誘導できることを示す。 As predictive models are deployed into the real world, they must increasingly contend with strategic behavior. A growing body of work on strategic classification treats this problem as a Stackelberg game: the decision-maker "leads" in the game by deploying a model, and the strategic agents "follow" by playing their best response to the deployed model. Importantly, in this framing, the burden of learning is placed solely on the decision-maker, while the agents' best responses are implicitly treated as instantaneous. In this work, we argue that the order of play in strategic classification is fundamentally determined by the relative frequencies at which the decision-maker and the agents adapt to each other's actions. In particular, by generalizing the standard model to allow both players to learn over time, we show that a decision-maker that makes updates faster than the agents can reverse the order of play, meaning that the agents lead and the decision-maker follows. We observe in standard learning settings that such a role reversal can be desirable for both the decision-maker and the strategic agents. 
Finally, we show that a decision-maker with the freedom to choose their update frequency can induce learning dynamics that converge to Stackelberg equilibria with either order of play.	翻訳日:2021-06-24 15:09:56 公開日:2021-06-23
# bregmangradient policyの最適化 Bregman Gradient Policy Optimization ( http://arxiv.org/abs/2106.12112v1 ) ライセンス: Link先を確認	Feihu Huang, Shangqian Gao, Heng Huang	(参考訳) 本稿では,Bregman分散度と運動量に基づく強化学習のための新しいBregman勾配ポリシー最適化フレームワークを設計する。具体的には,基本運動量法とミラー降下反復に基づくBregmanグラデーションポリシー最適化(BGPO)アルゴリズムを提案する。同時に,運動量分散を再現した手法に基づいて,ブレグマン勾配ポリシー最適化(VR-BGPO)アルゴリズムを提案する。さらに,非凸条件下でのブレグマン勾配政策最適化のための収束解析フレームワークを提案する。具体的には、BGPOが各反復で1つの軌道のみを必要とする$\epsilon$-stationary pointを見つけるために$\tilde{O}(\epsilon^{-4})$のサンプル複雑性を達成し、VR-BGPOは各反復で1つの軌道のみを必要とする$\tilde{O}(\epsilon^{-3})$の既知のサンプル複雑さに達することを証明している。特に,Bregmanの相違を利用して,既存の政策最適化アルゴリズムと,既存の(分散還元)政策勾配アルゴリズムや(分散還元)自然政策勾配アルゴリズムなどの新しい変種を統一する。複数の強化学習タスクに関する広範な実験結果から,新しいアルゴリズムの有効性が示された。 In this paper, we design a novel Bregman gradient policy optimization framework for reinforcement learning based on Bregman divergences and momentum techniques. Specifically, we propose a Bregman gradient policy optimization (BGPO) algorithm based on the basic momentum technique and mirror descent iteration. At the same time, we present an accelerated Bregman gradient policy optimization (VR-BGPO) algorithm based on a momentum variance-reduced technique. Moreover, we introduce a convergence analysis framework for our Bregman gradient policy optimization under the nonconvex setting. Specifically, we prove that BGPO achieves the sample complexity of $\tilde{O}(\epsilon^{-4})$ for finding $\epsilon$-stationary point only requiring one trajectory at each iteration, and VR-BGPO reaches the best known sample complexity of $\tilde{O}(\epsilon^{-3})$ for finding an $\epsilon$-stationary point, which also only requires one trajectory at each iteration. In particular, by using different Bregman divergences, our methods unify many existing policy optimization algorithms and their new variants such as the existing (variance-reduced) policy gradient algorithms and (variance-reduced) natural policy gradient algorithms. 
Extensive experimental results on multiple reinforcement learning tasks demonstrate the efficiency of our new algorithms.	翻訳日:2021-06-24 15:08:56 公開日:2021-06-23
# 運動方程式の保守的ニューラルネットワーク解のためのラグランジュ双対フレームワーク Lagrangian dual framework for conservative neural network solutions of kinetic equations ( http://arxiv.org/abs/2106.12147v1 ) ライセンス: Link先を確認	Hyung Ju Hwang and Hwijae Son	(参考訳) 本稿では,ニューラルネットワークによる運動方程式の解法として,新しい保存的定式化を提案する。より正確には、学習問題を物理的保存則を表す制約付き最適化問題として定式化する。制約はラグランジアン双対性により残留損失関数に対して緩和される。学習問題の制約として解の物理的保存特性を仮定することにより、解の誤差や保存則の観点からのより正確な近似を、速度論的フォッカー・プランク方程式と均質なボルツマン方程式に対して示す。 In this paper, we propose a novel conservative formulation for solving kinetic equations via neural networks. More precisely, we formulate the learning problem as a constrained optimization problem with constraints that represent the physical conservation laws. The constraints are relaxed toward the residual loss function by the Lagrangian duality. By imposing physical conservation properties of the solution as constraints of the learning problem, we demonstrate far more accurate approximations of the solutions in terms of errors and the conservation laws, for the kinetic Fokker-Planck equation and the homogeneous Boltzmann equation.	翻訳日:2021-06-24 15:08:32 公開日:2021-06-23
# エネルギーを考慮した資源配分のための畳み込みニューラルネットワークとGated Recurrent Unitの組み合わせ Combination of Convolutional Neural Network and Gated Recurrent Unit for Energy Aware Resource Allocation ( http://arxiv.org/abs/2106.12178v1 ) ライセンス: Link先を確認	Zeinab Khodaverdian, Hossein Sadr, Seyed Ahmad Edalatpanah and Mojdeh Nazari Solimandarabi	(参考訳) クラウドコンピューティングサービスモデルは急速に成長し、非効率なリソース利用は、クラウドデータセンターにおける高エネルギー消費の最大の原因の1つとして知られている。仮想マシン (VM) のライブマイグレーションと, 少数の物理マシン (PM) への統合により, エネルギー消費削減を目的としたクラウドデータセンターの資源配分を行った。しかし、移行に適したvmの選択は重要な課題である。この問題を解決するため、ユーザリクエストのパターンに従ってVMを機密性のある、あるいは非機密なクラスに分類し、その後、マイグレーション用に適切なVMを選択することができる。本稿では、Microsoft Azureデータセット内のVMの分類に、畳み込みニューラルネットワーク(CNN)とGRU(Gated Recurrent Unit)の組み合わせを利用する。このデータセットのほとんどのVMは遅延に敏感であるとラベル付けされているため、このグループのVMへの移行はエネルギー消費を減らすだけでなく、サービスレベルアグリーメント(SLA)に違反している。実験結果に基づき,提案モデルでは,既存モデルと比較して,提案モデルが優れていることを示す95.18の精度を得た。 Cloud computing service models have experienced rapid growth and inefficient resource usage is known as one of the greatest causes of high energy consumption in cloud data centers. Resource allocation in cloud data centers aiming to reduce energy consumption has been conducted using live migration of Virtual Machines (VMs) and their consolidation into the small number of Physical Machines (PMs). However, the selection of the appropriate VM for migration is an important challenge. To solve this issue, VMs can be classified according to the pattern of user requests into sensitive or insensitive classes to latency, and thereafter suitable VMs can be selected for migration. In this paper, the combination of Convolution Neural Network (CNN) and Gated Recurrent Unit (GRU) is utilized for the classification of VMs in the Microsoft Azure dataset. Due to the fact the majority of VMs in this dataset are labeled as insensitive to latency, migration of more VMs in this group not only reduces energy consumption but also decreases the violation of Service Level Agreements (SLA). 
Based on the empirical results, the proposed model obtained an accuracy of 95.18%, which clearly demonstrates the superiority of our proposed model compared to other existing models.	翻訳日:2021-06-24 15:08:23 公開日:2021-06-23
# BiblioDAP: 第1回 書誌データ分析・処理ワークショップ BiblioDAP: The 1st Workshop on Bibliographic Data Analysis and Processing ( http://arxiv.org/abs/2106.12320v1 ) ライセンス: Link先を確認	Zeyd Boukhers, Philipp Mayr, Silvio Peroni	(参考訳) 毎年出版される論文数の大幅な増加に対応する必要性と、処理に内在する課題の両面から、書誌データの自動処理は電子図書館、データサイエンス、機械学習において非常に重要になっている。この処理には、I) PDF文書からの参考文献の自動抽出、II) 正確な引用グラフの構築、III) 著者名の曖昧性解消などを含む、いくつかの側面がある。書誌データは本質的に不均質であり、構造化形式(例: 引用グラフ)と非構造化形式(例: 出版物)の両方で存在する。そのため、その処理と分析にはデータサイエンスと機械学習の技術が必要である。ここでは、BiblioDAP'21: The 1st Workshop on Bibliographic Data Analysis and Processingを紹介する。 Automatic processing of bibliographic data becomes very important in digital libraries, data science and machine learning due to its importance in keeping pace with the significant increase of published papers every year from one side and to the inherent challenges from the other side. This processing has several aspects including but not limited to I) Automatic extraction of references from PDF documents, II) Building an accurate citation graph, III) Author name disambiguation, etc. Bibliographic data is heterogeneous by nature and occurs in both structured (e.g. citation graph) and unstructured (e.g. publications) formats. Therefore, it requires data science and machine learning techniques to be processed and analysed. Here we introduce BiblioDAP'21: The 1st Workshop on Bibliographic Data Analysis and Processing.	翻訳日:2021-06-24 15:08:05 公開日:2021-06-23
# ラジオマップを用いたリアルタイム屋外位置推定 : 深層学習アプローチ Real-time Outdoor Localization Using Radio Maps: A Deep Learning Approach ( http://arxiv.org/abs/2106.12556v1 ) ライセンス: Link先を確認	\c{C}a\u{g}kan Yapar, Ron Levie, Gitta Kutyniok, Giuseppe Caire	(参考訳) 本稿では,密集した都市シナリオにおけるセルネットワークの局在の問題を扱う。グローバル・ナビゲーション・サテライト・システムは通常、機器と衛星の間の視線条件が低くなる都市環境では性能が悪く、適切な精度のために代替のローカライズ方法が必要となる。本稿では,パスロスのみに基づく局所化のための深層学習手法を提案する。これは,到着時刻や到着角に依存する手法とは異なり,デバイス標準操作に対するユーザデバイスでの計算複雑性の増大を必要としない。無線ネットワークにおいて、ユーザデバイスは、ベースステーションビーコンスロットをスキャンし、ハンドオーバおよびユーザベースステーションアソシエーションのために、数少ない最強のベースステーション信号を特定する。提案手法では,受信した信号強度をクラウド上に位置する中央処理ユニットに簡易に報告する。各基地局に対して、地図上の高密度グリッド内のすべての位置におけるパスロスをよく近似する。この近似は,都市環境におけるパスロス関数の深層学習に基づくシミュレータであるRadioUNetによって提供される。提案した深層学習アルゴリズムは,すべての基地局の推定パスロスラジオマップとそれに対応する信号強度を用いて,ユーザの正確な位置推定を行うことができる。提案手法はLocUNetと呼ばれ,推定無線地図における不正確性が高い。これを数値実験により実演し,最新の結果を得た。 This paper deals with the problem of localization in a cellular network in a dense urban scenario. Global Navigation Satellite Systems typically perform poorly in urban environments, where the likelihood of line-of-sight conditions between the devices and the satellites is low, and thus alternative localization methods are required for good accuracy. We present a deep learning method for localization, based merely on pathloss, which does not require any increase in computation complexity at the user devices with respect to the device standard operations, unlike methods that rely on time of arrival or angle of arrival information. In a wireless network, user devices scan the base station beacon slots and identify the few strongest base station signals for handover and user-base station association purposes. In the proposed method, the user to be localized simply reports such received signal strengths to a central processing unit, which may be located in the cloud. For each base station we have good approximation of the pathloss at every location in a dense grid in the map. 
This approximation is provided by RadioUNet, a deep learning-based simulator of pathloss functions in urban environments, which we previously proposed and published. Using the estimated pathloss radio maps of all base stations and the corresponding reported signal strengths, the proposed deep learning algorithm can extract a very accurate localization of the user. The proposed method, called LocUNet, enjoys high robustness to inaccuracies in the estimated radio maps. We demonstrate this by numerical experiments, which obtain state-of-the-art results.	翻訳日:2021-06-24 15:07:53 公開日:2021-06-23
# (参考訳) STEP-EZ:Syntax Tree Guided semantic ExPlanation for Explainable Zero-shot Modeling of Clinical depression symptoms from text STEP-EZ: Syntax Tree guided semantic ExPlanation for Explainable Zero-shot modeling of clinical depression symptoms from text ( http://arxiv.org/abs/2106.10928v2 ) ライセンス: CC BY 4.0	Nawshad Farruque, Randy Goebel, Osmar Zaiane, Sudhakar Sivapalan	(参考訳) 我々は,ZSL(Zero-Shot Learning)の様々なアプローチと,データ不足のトレーニングで有名な,重要な教師付き学習課題の説明可能性に焦点をあてる。 Depression Symptoms Detection (DSD) from text (英語) まず、ZSLモデリングの様々な構成要素の総合的な合成と、臨床医の助けを借りて、地上の真理サンプルの分析と抑うつ症状の手がかりのキュレーションプロセスから始める。次に、様々な最先端ZSLモデルの精度と、タスクの潜在的な拡張について分析する。さらに,ZSLを階層的テキストベース説明機構に用いるためのフレームワークをスケッチし,Syntax Tree-Guided Semantic Explanation (STEP) と呼ぶ。最後に,提案する説明可能性指標(ei)を用いて,zslモデルを用いて合理的な正確性と説明可能性を達成する実験をまとめる。この研究は、我々の知る限り、DSDタスクにおけるZSLモデルの有効性を、精度と説明可能性の両方の観点から徹底的に探求する最初の成果である。 We focus on exploring various approaches of Zero-Shot Learning (ZSL) and their explainability for a challenging yet important supervised learning task notorious for training data scarcity, i.e. Depression Symptoms Detection (DSD) from text. We start with a comprehensive synthesis of different components of our ZSL modeling and analysis of our ground truth samples and Depression symptom clues curation process with the help of a practicing clinician. We next analyze the accuracy of various state-of-the-art ZSL models and their potential enhancements for our task. Further, we sketch a framework for the use of ZSL for hierarchical text-based explanation mechanism, which we call, Syntax Tree-Guided Semantic Explanation (STEP). Finally, we summarize experiments from which we conclude that we can use ZSL models and achieve reasonable accuracy and explainability, measured by a proposed Explainability Index (EI). This work is, to our knowledge, the first work to exhaustively explore the efficacy of ZSL models for DSD task, both in terms of accuracy and explainability.	翻訳日:2021-06-24 12:58:23 公開日:2021-06-23
# (参考訳) 差分認識モデルを用いた学習型実測光場画像圧縮 Learning-Based Practical Light Field Image Compression Using A Disparity-Aware Model ( http://arxiv.org/abs/2106.11558v2 ) ライセンス: CC BY 4.0	Mohana Singh and Renu M. Rameshan	(参考訳) 光分野技術は研究コミュニティの注目を集め、多くの応用が期待されている。商用のレンズカメラのレンズレットアレイは、光線の空間情報と角情報の両方を単一の露光で捉えるのに役立つ。光フィールドデータの高次元性により、その優れた機能を実現する一方で、その広範な採用を妨げる。そのため、光電界画像の効率的な圧縮が求められている。既存のソリューションは通常、いくつかの異なるモジュールで構成されており、いくつかは光フィールドデータの特定の構造と品質のために設計されていないかもしれない。これによりコーデックの複雑さが増し、非実用的なデコーディングランタイムが発生する。並列デコーディングが可能な4次元光フィールド画像の圧縮のための,学習に基づく分散支援モデルを提案する。モデルはエンドツーエンドのトレーニングが可能で、手動でモジュールを調整する必要がなく、レートと歪みの同時学習が可能である。格差支援アプローチは、再構成された光場の構造的整合性を保証する。 PSNRとMS-SSIMの指標で比較すると,性能が向上している。また、ランタイムのエンコーディングとデコードにも顕著な利益がある。ソースコードはhttps://moha23.github.io/LF-DAAEで公開されている。 Light field technology has increasingly attracted the attention of the research community with its many possible applications. The lenslet array in commercial plenoptic cameras helps capture both the spatial and angular information of light rays in a single exposure. While the resulting high dimensionality of light field data enables its superior capabilities, it also impedes its extensive adoption. Hence, there is a compelling need for efficient compression of light field images. Existing solutions are commonly composed of several separate modules, some of which may not have been designed for the specific structure and quality of light field data. This increases the complexity of the codec and results in impractical decoding runtimes. We propose a new learning-based, disparity-aided model for compression of 4D light field images capable of parallel decoding. The model is end-to-end trainable, eliminating the need for hand-tuning separate modules and allowing joint learning of rate and distortion. The disparity-aided approach ensures the structural integrity of the reconstructed light fields. Comparisons with the state of the art show encouraging performance in terms of PSNR and MS-SSIM metrics. 
Also, there is a notable gain in the encoding and decoding runtimes. Source code is available at https://moha23.github.io/LF-DAAE.	翻訳日:2021-06-24 12:39:42 公開日:2021-06-23
# (参考訳) 可視化:物理インフォームドデータ拡張によるデータ駆動型地震インバージョン Making Invisible Visible: Data-Driven Seismic Inversion with Physics-Informed Data Augmentation ( http://arxiv.org/abs/2106.11892v2 ) ライセンス: CC BY 4.0	Yuxin Yang, Xitong Zhang, Qiang Guan, Youzuo Lin	(参考訳) ディープラーニングとデータ駆動アプローチは、科学的領域において大きな可能性を示しています。データ駆動技術の約束は、大量の高品質なトレーニングデータセットが利用可能であることに依存している。高価な物理実験、機器、シミュレーションを通じてデータを取得するコストが高いため、近年、科学応用のためのデータ拡張技術が科学データを得るための新しい方向として登場した。しかし、コンピュータビジョンに由来する既存のデータ拡張技術は、私たちが関心を持つドメイン問題には役に立たない物理的に受け入れられないデータサンプルを生み出します。本稿では,畳み込みニューラルネットワークを用いた新しい物理情報拡張手法を提案する。特に、生成モデルは、合成データの質を改善するために、異なる物理知識(制御方程式、観測可能な知覚、物理現象など)を利用する。本研究では,データ拡張手法の有効性を検証するために,co$_2$リークデータを用いた地中地震波フルウェーブフォームインバージョン法を適用した。我々の関心は、極小のco$_2$リークを伴う地下速度モデルに逆戻りすることである。本手法の有効性を総合的な数値テストを用いて検証する。比較と解析により,物理インフォームドデータ拡張技術を用いて,データ駆動型地震イメージングを著しく向上させることができることを示す。特に,本手法で得られた拡張学習セットを用いた場合,画像品質は,一般大規模漏洩テストシナリオでは15%向上し,小型リークでは17%向上した。 Deep learning and data-driven approaches have shown great potential in scientific domains. The promise of data-driven techniques relies on the availability of a large volume of high-quality training datasets. Due to the high cost of obtaining data through expensive physical experiments, instruments, and simulations, data augmentation techniques for scientific applications have emerged as a new direction for obtaining scientific data recently. However, existing data augmentation techniques originating from computer vision, yield physically unacceptable data samples that are not helpful for the domain problems that we are interested in. In this paper, we develop new physics-informed data augmentation techniques based on convolutional neural networks. Specifically, our generative models leverage different physics knowledge (such as governing equations, observable perception, and physics phenomena) to improve the quality of the synthetic data. 
To validate the effectiveness of our data augmentation techniques, we apply them to solve a subsurface seismic full-waveform inversion using simulated CO$_2$ leakage data. Our interest is to invert for subsurface velocity models associated with very small CO$_2$ leakage. We validate the performance of our methods using comprehensive numerical tests. Via comparison and analysis, we show that data-driven seismic imaging can be significantly enhanced by using our physics-informed data augmentation techniques. Particularly, the imaging quality has been improved by 15% in test scenarios of general-sized leakage and 17% in small-sized leakage when using an augmented training set obtained with our techniques.	翻訳日:2021-06-24 12:29:08 公開日:2021-06-23
# Trinity: 複雑な空間データセットのためのノーコードAIプラットフォーム Trinity: A No-Code AI platform for complex spatial datasets ( http://arxiv.org/abs/2106.11756v2 ) ライセンス: Link先を確認	C.V.Krishnakumar Iyer, Feili Hou, Henry Wang, Yonghong Wang, Kay Oh, Swetava Ganguli, Vipul Pandey	(参考訳) 本稿では、機械学習研究者と非技術系の地理空間ドメイン専門家の両方が、さまざまな複雑な問題を自ら解決するために、ドメイン固有の信号やデータセットで実験できるようにすることを主な設計目標とした、Trinityと呼ばれるノーコードの人工知能(AI)プラットフォームを提案する。多様な問題を解決するこの汎用性は、複雑な時空間データセットを変換して標準的なディープラーニングモデル(ここでは畳み込みニューラルネットワーク(CNN))で利用できるようにし、異なる問題を標準的な方法(例: セマンティックセグメンテーション)で定式化できるようにすることで実現される。直感的なユーザインターフェース、複雑な特徴量エンジニアリングの成果物をホストする特徴量ストア、ディープラーニングカーネル、スケーラブルなデータ処理メカニズムを備えたTrinityは、ドメインの専門家がビジネスクリティカルな問題を解決する上で、科学者やエンジニアと同じ舞台に立つための強力なプラットフォームを提供する。迅速なプロトタイピングと実験を可能にし、モデルの構築とデプロイを標準化することで、本番投入までの時間を短縮する。本稿では、Trinityの背景にある動機とその設計を示すとともに、サンプルアプリケーションを紹介し、AI利用のハードルを下げるというアイデアを動機づける。 We present a no-code Artificial Intelligence (AI) platform called Trinity with the main design goal of enabling both machine learning researchers and non-technical geospatial domain experts to experiment with domain-specific signals and datasets for solving a variety of complex problems on their own. This versatility to solve diverse problems is achieved by transforming complex spatio-temporal datasets to make them consumable by standard deep learning models, in this case, Convolutional Neural Networks (CNNs), and giving the ability to formulate disparate problems in a standard way, e.g., semantic segmentation. With an intuitive user interface, a feature store that hosts derivatives of complex feature engineering, a deep learning kernel, and a scalable data processing mechanism, Trinity provides a powerful platform for domain experts to share the stage with scientists and engineers in solving business-critical problems. It enables quick prototyping, rapid experimentation and reduces the time to production by standardizing model building and deployment. 
In this paper, we present our motivation behind Trinity and its design along with showcasing sample applications to motivate the idea of lowering the bar to using AI.	翻訳日:2021-06-24 12:08:23 公開日:2021-06-23
# クエリーとしてインスタンスを追跡する Tracking Instances as Queries ( http://arxiv.org/abs/2106.11963v2 ) ライセンス: Link先を確認	Shusheng Yang, Yuxin Fang, Xinggang Wang, Yu Li, Ying Shan, Bin Feng, Wenyu Liu	(参考訳) 最近、クエリベースのディープネットワークは、エンドツーエンドのパイプラインと、オブジェクト検出、セマンティックセグメンテーション、インスタンスセグメンテーションなど、いくつかの基本的なコンピュータビジョンタスクにおける競争力のある結果により、多くの注目を集めている。しかし、エレガントなアーキテクチャと強力なパフォーマンスを備えたクエリベースのビデオインスタンスセグメンテーション(VIS)フレームワークをどのように確立するかは、まだ解決されていない。本稿では、QueryInstにおけるインスタンスとクエリの固有の一対一対応をフル活用した統合クエリベースのVISフレームワークである、QueryTrack(クエリとしてのインスタンスの追跡)を提案する。提案手法は、単一のオンラインエンドツーエンドモデル、単一スケールでのテスト、控えめな量のトレーニングデータのみで、YouTube-VIS-2019 / 2021データセット上で52.7 / 52.3 APを達成し、CVPR 2021のYouTube-VIS Challengeで2位を獲得した。また、VISコミュニティのリファレンスとして、YouTube-VIS-2021 valセットにおけるQueryTrack-ResNet-50のベースライン結果も提供する。 Recently, query based deep networks catch lots of attention owing to their end-to-end pipeline and competitive results on several fundamental computer vision tasks, such as object detection, semantic segmentation, and instance segmentation. However, how to establish a query based video instance segmentation (VIS) framework with elegant architecture and strong performance remains to be settled. In this paper, we present QueryTrack (i.e., tracking instances as queries), a unified query based VIS framework fully leveraging the intrinsic one-to-one correspondence between instances and queries in QueryInst. The proposed method obtains 52.7 / 52.3 AP on YouTube-VIS-2019 / 2021 datasets, which wins the 2nd place in the YouTube-VIS Challenge at CVPR 2021 with a single online end-to-end model, single scale testing & modest amount of training data. We also provide QueryTrack-ResNet-50 baseline results on YouTube-VIS-2021 val set as references for the VIS community.	翻訳日:2021-06-24 12:08:01 公開日:2021-06-23
# CPM-2:大規模費用対効果事前訓練言語モデル CPM-2: Large-scale Cost-effective Pre-trained Language Models ( http://arxiv.org/abs/2106.10715v2 ) ライセンス: Link先を確認	Zhengyan Zhang, Yuxian Gu, Xu Han, Shengqi Chen, Chaojun Xiao, Zhenbo Sun, Yuan Yao, Fanchao Qi, Jian Guan, Pei Ke, Yanzheng Cai, Guoyang Zeng, Zhixing Tan, Zhiyuan Liu, Minlie Huang, Wentao Han, Yang Liu, Xiaoyan Zhu, Maosong Sun	(参考訳) 近年、事前学習型言語モデル(PLM)のサイズは飛躍的に増大している。しかし、これらの大規模PLMの効率の問題が、現実のシナリオでの利用を制限している。本稿では、PLMの事前学習、微調整、推論における効率の問題に対処するための、一連の費用対効果の高い技術を提示する。 (1) モデルをスクラッチから訓練する代わりに既存のPLMを活用し、事前学習プロセスを高速化するために知識継承を導入する。 (2) 大規模PLMを用いたプロンプトチューニングのベストプラクティスを検討する。従来の微調整に比べて、プロンプトチューニングはタスク固有のパラメータの数を大幅に削減する。 (3) 限られた計算資源で大規模PLMを使用するための新しい推論ツールキットInfMoEを実装した。この費用対効果の高いパイプラインに基づいて、110億のパラメータを持つエンコーダ・デコーダ型バイリンガルモデル(CPM-2)と、それに対応する1980億のパラメータを持つMoEバージョンという2つのモデルを事前訓練する。実験では、下流タスクにおいてCPM-2とmT5を比較した。実験の結果、CPM-2は優れた汎用言語知能を持つことが示された。さらに、単一のGPU上で数百億のパラメータを持つ大規模モデルの推論を行う際のInfMoEの効率を検証する。すべてのソースコードとモデルパラメータはhttps://github.com/TsinghuaAI/CPMで入手できる。 In recent years, the size of pre-trained language models (PLMs) has grown by leaps and bounds. However, efficiency issues of these large-scale PLMs limit their utilization in real-world scenarios. We present a suite of cost-effective techniques for the use of PLMs to deal with the efficiency issues of pre-training, fine-tuning, and inference. (1) We introduce knowledge inheritance to accelerate the pre-training process by exploiting existing PLMs instead of training models from scratch. (2) We explore the best practice of prompt tuning with large-scale PLMs. Compared with conventional fine-tuning, prompt tuning significantly reduces the number of task-specific parameters. (3) We implement a new inference toolkit, namely InfMoE, for using large-scale PLMs with limited computational resources. Based on our cost-effective pipeline, we pre-train two models: an encoder-decoder bilingual model with 11 billion parameters (CPM-2) and its corresponding MoE version with 198 billion parameters. 
In our experiments, we compare CPM-2 with mT5 on downstream tasks. Experimental results show that CPM-2 has excellent general language intelligence. Moreover, we validate the efficiency of InfMoE when conducting inference of large-scale models having tens of billions of parameters on a single GPU. All source code and model parameters are available at https://github.com/TsinghuaAI/CPM.	翻訳日:2021-06-24 12:07:38 公開日:2021-06-23
# トランスフォーマーに基づく自然言語処理手法を用いた広告テキスト分類 Ad Text Classification with Transformer-Based Natural Language Processing Methods ( http://arxiv.org/abs/2106.10899v2 ) ライセンス: Link先を確認	Umut Özdil, Büşra Arslan, D. Emre Taşar, Gökçe Polat, Şükrü Ozan	(参考訳) 本研究では、オンライン広告プラットフォーム上で作成された広告テキストをセクター別に自動分類するための、自然言語処理(NLP)に基づく手法を提案する。データセットは、12の異なるセクターから集めた約21,000件のラベル付き広告テキストで構成されている。本研究では、自然言語処理の文献においてテキスト分類などの分野で近年用いられている、トランスフォーマーに基づく言語モデルであるBERT (Bidirectional Encoder Representations from Transformers) モデルを用いた。トルコ語向けの事前訓練済みBERTモデルを用いて得られた分類効率を詳細に示す。 In this study, a natural language processing-based (NLP-based) method is proposed for the sector-wise automatic classification of ad texts created on online advertising platforms. Our data set consists of approximately 21,000 labeled advertising texts from 12 different sectors. In the study, the Bidirectional Encoder Representations from Transformers (BERT) model, which is a transformer-based language model that is recently used in fields such as text classification in the natural language processing literature, was used. The classification efficiencies obtained using a pre-trained BERT model for the Turkish language are shown in detail.	翻訳日:2021-06-24 12:07:18 公開日:2021-06-23
# 画像分類補助タスクとしてのフーリエ変換近似 Fourier Transform Approximation as an Auxiliary Task for Image Classification ( http://arxiv.org/abs/2106.11478v2 ) ライセンス: Link先を確認	Chen Liu	(参考訳) 画像再構成は、おそらく画像分類において最も主流な補助タスクである。本稿では「入力画像のフーリエ変換の近似」を潜在的な代替案として検討し、これが主タスクの性能をさらに向上させるか、あるいは画像再構成では十分にカバーされない新しい制約を導入することを期待する。 CIFAR-10データセット上で5つの一般的な分類アーキテクチャを用いて実験した結果、提案する補助タスクは概して分類精度を向上させた。さらに、提案する補助タスクが、高速勾配符号法を用いて生成される敵対的攻撃に対する分類器の耐性を高める場合があることが示された。 Image reconstruction is likely the most predominant auxiliary task for image classification. In this paper, we investigate "approximating the Fourier Transform of the input image" as a potential alternative, in the hope that it may further boost the performances on the primary task or introduce novel constraints not well covered by image reconstruction. We experimented with five popular classification architectures on the CIFAR-10 dataset, and the empirical results indicated that our proposed auxiliary task generally improves the classification accuracy. More notably, the results showed that in certain cases our proposed auxiliary task may enhance the classifiers' resistance to adversarial attacks generated using the fast gradient sign method.	翻訳日:2021-06-24 12:07:10 公開日:2021-06-23
# BanditMF:マルチArmed Bandit-based Matrix Factorization Recommender System BanditMF: Multi-Armed Bandit Based Matrix Factorization Recommender System ( http://arxiv.org/abs/2106.10898v2 ) ライセンス: Link先を確認	Shenghao Xu	(参考訳) マルチアームバンディット(mab)は、探索と搾取のバランスを達成するために原則化されたオンライン学習アプローチを提供する。複数の状況で行動する学習を伴わない優れたパフォーマンスと低フィードバック学習のため、マルチアームのバンディットはレコメンデーションシステムのようなアプリケーションで広く注目を集めている。同様に、リコメンダシステム内では、コラボレーティブフィルタリング(cf)はおそらくリコメンダシステムにおいて最も早く、最も影響力のある方法である。重要なことは、新しいユーザーと推奨アイテムのプールが、レコメンデーターシステムに対処する必要がある課題だ。協調フィルタリングでは、古典的な手法はモデルをオフラインでトレーニングし、オンラインテストを実行するが、このアプローチは、いわゆるコールドスタートであるユーザの好みの動的変更をもはや処理できない。では、効果的な情報がないユーザに対して、効果的にアイテムを推奨する方法? 上記の問題に対処するため、BanditMFというマルチアームバンディットに基づく協調フィルタリング推薦システムが提案されている。 BanditMF は,(1) 有効情報の不足条件下での協調フィルタリングにおけるコールドスタート問題の解法,(2) ユーザと関係する未知のパラメータを独立に推定し,ユーザ間の相関を無視することによる,強い関係領域におけるバンディットアルゴリズムの最適部分問題の解法,という2つの課題に対処するように設計されている。 Multi-armed bandits (MAB) provide a principled online learning approach to attain the balance between exploration and exploitation. Due to the superior performance and low feedback learning without the learning to act in multiple situations, Multi-armed Bandits drawing widespread attention in applications ranging such as recommender systems. Likewise, within the recommender system, collaborative filtering (CF) is arguably the earliest and most influential method in the recommender system. Crucially, new users and an ever-changing pool of recommended items are the challenges that recommender systems need to address. For collaborative filtering, the classical method is training the model offline, then perform the online testing, but this approach can no longer handle the dynamic changes in user preferences which is the so-called cold start. So how to effectively recommend items to users in the absence of effective information? To address the aforementioned problems, a multi-armed bandit based collaborative filtering recommender system has been proposed, named BanditMF. BanditMF is designed to address two challenges in the multi-armed bandits algorithm and collaborative filtering: (1) how to solve the cold start problem for collaborative filtering under the condition of scarcity of valid information, (2) how to solve the sub-optimal problem of bandit algorithms in strong social relations domains caused by independently estimating unknown parameters associated with each user and ignoring correlations between users.	翻訳日:2021-06-24 12:06:59 公開日:2021-06-23
# 微粒と粗粒の誤情報を分類する : COVID-19インフォデミックの実証的研究 Categorising Fine-to-Coarse Grained Misinformation: An Empirical Study of COVID-19 Infodemic ( http://arxiv.org/abs/2106.11702v2 ) ライセンス: Link先を確認	Ye Jiang, Xingyi Song, Carolina Scarton, Ahmet Aker, Kalina Bontcheva	(参考訳) ソーシャルメディア上での新型コロナウイルス(COVID-19)に関する誤情報の拡散は、すでに多くの研究者の注目を集めている。Google Scholarによると、COVID-19関連の誤情報研究はこれまでに約26000件が発表されている。これらの研究の多くは、COVID-19関連の誤情報の 1) 検出、および/または 2) 特徴の分析に焦点を当てている。しかし、誤情報に関連する社会行動の研究は見過ごされがちである。本稿では、社会行動アノテーション(例: 誤情報に対するコメントや質問)を含む、きめ細かくアノテーションされた誤情報ツイートデータセットを紹介する。このデータセットは、社会行動分析を可能にするだけでなく、証拠に基づく、または証拠に基づかない誤情報分類タスクにも適している。さらに、実験にleave-claim-out検証を導入し、実世界の未知の誤情報に適用した場合に誤情報の分類性能が大きく異なりうることを示す。 The spread of COVID-19 misinformation over social media has already drawn the attention of many researchers. According to Google Scholar, about 26000 COVID-19 related misinformation studies have been published to date. Most of these studies focus on 1) detecting and/or 2) analysing the characteristics of COVID-19 related misinformation. However, the study of the social behaviours related to misinformation is often neglected. In this paper, we introduce a fine-grained annotated misinformation tweets dataset including social behaviours annotation (e.g. comment or question to the misinformation). The dataset not only allows social behaviours analysis but is also suitable for both evidence-based and non-evidence-based misinformation classification tasks. In addition, we introduce leave-claim-out validation in our experiments and demonstrate that the misinformation classification performance could be significantly different when applied to real-world unseen misinformation.	翻訳日:2021-06-24 12:06:33 公開日:2021-06-23

# QuNetSim: 量子ネットワークのためのソフトウェアフレームワーク

QuNetSim: A Software Framework for Quantum Networks ( http://arxiv.org/abs/2003.06397v5 )

ライセンス: Link先を確認

Stephen DiAdamo, Janis Nötzel, Benjamin Zanger, Mehmet Mert Beşe

(参考訳) 量子インターネット技術の発展に伴い、シミュレーションソフトウェアや量子インターネット教育の必要性が高まっている。 QuNetSimはこのニーズを満たすことを目指している。 QuNetSimは、ネットワーク層までの量子ネットワークをシミュレートするために使用できるPythonソフトウェアフレームワークである。 QuNetSimの目標は、さまざまな量子ネットワーク構成とパラメータ上での量子ネットワークプロトコルの調査とテストを容易にすることである。このフレームワークには多くの既知の量子ネットワークプロトコルが組み込まれており、ユーザーはシミュレーションを素早く構築でき、初心者は簡単に独自の量子ネットワークプロトコルを実装することができる。

As quantum internet technologies develop, the need for simulation software and education for quantum internet rises. QuNetSim aims to fill this need. QuNetSim is a Python software framework that can be used to simulate quantum networks up to the network layer. The goal of QuNetSim is to make it easier to investigate and test quantum networking protocols over various quantum network configurations and parameters. The framework incorporates many known quantum network protocols so that users can quickly build simulations and beginners can easily learn to implement their own quantum networking protocols.

翻訳日:2023-05-29 06:13:38 公開日:2021-06-23

# 量子レイリー問題と熱コヒーレントオンサーガー関係

Quantum Rayleigh problem and thermocoherent Onsager relations ( http://arxiv.org/abs/2006.03186v4 )

ライセンス: Link先を確認

Onur Pusuluk and Özgür E. Müstecaplıoğlu

(参考訳) 熱流と平衡化における量子コヒーレンスと相関の役割を、平衡化に関するレイリーの力学的問題を量子領域で調べ、熱電現象に対するオンサーガーのアプローチに従うことで研究する。具体的には、側面から2量子ビットの発射体の照射を受ける1つの量子ビットを考える。任意の衝突時間と初期状態に対して、逐次衝突および集団衝突のマスター方程式を導く。マスター方程式からフォッカー・プランク方程式を導出することで、レイリーの熱伝導方程式の量子版を同定する。発射体間で共有される量子ディスコードとエンタングルメントは、いわゆる熱交換コヒーレンスと関連している場合にのみ、真の熱流に寄与しうることがわかった。オンサーガーがレイリーのエネルギー最小散逸の原理を用いたのと同様に、エントロピー生成率を用いてコヒーレンス流を同定する。コヒーレンス流と熱流はともに量子オンサーガー関係の形に書くことができ、そこからコヒーレントなペルチェ効果とコヒーレントなゼーベック効果を予測する。これらの効果は、衝突時間と集団性によって最適化できる。最後に、さまざまなプラットフォームにおける熱コヒーレント現象の実験的実現と技術応用の可能性について議論する。

The role of quantum coherence and correlations in heat flow and equilibration is investigated by exploring the Rayleigh's dynamical problem to equilibration in the quantum regime and following Onsager's approach to thermoelectricity. Specifically, we consider a qubit bombarded by two-qubit projectiles from a side. For arbitrary collision times and initial states, we develop the master equation for sequential and collective collisions. By deriving the Fokker-Planck equation out of the master equation, we identify the quantum version of the Rayleigh's heat conduction equation. We find that quantum discord and entanglement shared between the projectiles can contribute to genuine heat flow only when they are associated with so-called heat-exchange coherences. Analogous to Onsager's use of Rayleigh's principle of least dissipation of energy, we use the entropy production rate to identify the coherence current. Both coherence and heat flows can be written in the form of quantum Onsager relations, from which we predict coherent Peltier and coherent Seebeck effects. The effects can be optimized by the collision times and collectivity. Finally, we discuss some of the possible experimental realizations and technological applications of the thermocoherent phenomena in different platforms.

翻訳日:2023-05-17 02:17:14 公開日:2021-06-23

# sub-bosonic (deformed) ladder operator

Sub-bosonic (deformed) ladder operators ( http://arxiv.org/abs/2009.06392v2 )

ライセンス: Link先を確認

J. Damastor Serafim, Ricardo Ximenes, and Fernando Parisio

(参考訳) 正準演算子 $\hat{a}^{\dagger}$ ($\hat{a}$) は、初等量子力学と場の量子論の両方において、物理系に正確な量のエネルギー $E$ を加える(系から差し引く)理想的な過程を表す。これは、演算子レベルでは $E$ のまわりの変動が許されないという意味で "sharp" な概念である。本研究では、ファジィネスという厳密な概念から導かれる変形生成・消滅演算子のクラスを示す。これにより、変形したサブボソン交換関係が得られ、修正された固有エネルギーとフォック状態を持つ単純な代数構造が導かれる。さらに、導入した形式が場の量子論にもたらしうる帰結、例えば自由準ボソンの分散関係における線形性からのずれについて検討する。

The canonical operator $\hat{a}^{\dagger}$ ($\hat{a}$) represents the ideal process of adding (subtracting) an exact amount of energy $E$ to (from) a physical system in both elementary quantum mechanics and quantum field theory. This is a "sharp" notion in the sense that no variability around $E$ is possible at the operator level. In this work, we present a class of deformed creation and annihilation operators that originates from a rigorous notion of fuzziness. This leads to deformed, sub-bosonic commutation relations inducing a simple algebraic structure with modified eigenenergies and Fock states. In addition, we investigate possible consequences of the introduced formalism in quantum field theories, as for instance, deviations from linearity in the dispersion relation for free quasibosons.

翻訳日:2023-05-03 00:29:08 公開日:2021-06-23
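
For orientation only, the sketch below builds the standard (undeformed) annihilation operator in a truncated Fock basis and checks the canonical relation $[\hat{a},\hat{a}^{\dagger}]=1$ that the abstract takes as its "sharp" baseline. It does not implement the paper's deformed operators, whose definition is not given in the abstract. The function names and the truncation size are assumptions made here, and the negative last diagonal entry is an artifact of truncating the basis.

```python
import math


def annihilation(n_levels):
    """Standard annihilation operator in the Fock basis |0>, ..., |n_levels-1>."""
    a = [[0.0] * n_levels for _ in range(n_levels)]
    for n in range(1, n_levels):
        a[n - 1][n] = math.sqrt(n)  # a|n> = sqrt(n) |n-1>
    return a


def transpose(m):
    """Transpose; equals the adjoint here because all entries are real."""
    return [list(row) for row in zip(*m)]


def matmul(x, y):
    """Plain matrix product of two square lists-of-lists."""
    size = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def commutator_diagonal(n_levels):
    """Diagonal of [a, a^dagger] in the truncated basis."""
    a = annihilation(n_levels)
    a_dag = transpose(a)
    left = matmul(a, a_dag)
    right = matmul(a_dag, a)
    return [left[i][i] - right[i][i] for i in range(n_levels)]


if __name__ == "__main__":
    # Entries are close to 1 except the last, which is close to -(n_levels - 1)
    # purely because the basis was cut off.
    print(commutator_diagonal(5))
```

A deformed pair of operators would change the square-root matrix elements, and with them the commutator; the abstract alone does not say how.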

# 弱駆動量子系における3体相互作用の量子シミュレーション

Quantum simulation of three-body interactions in weakly driven quantum systems ( http://arxiv.org/abs/2011.03399v2 )

ライセンス: Link先を確認

Francesco Petiziol, Mahdi Sameti, Stefano Carretta, Sandro Wimberger, Florian Mintert

(参考訳) 一対のカップリングを超えて多体相互作用を特徴とする有効ハミルトニアンの実現は、トポロジカル物理学と量子計算の基盤となる中心モデルの量子シミュレーションを可能にする。摂動フロッケ工学の限界を克服し、超伝導回路および分子ナノマグネットにおける純三体ハミルトンの高精度実現について論じる。

The realization of effective Hamiltonians featuring many-body interactions beyond pairwise coupling would enable the quantum simulation of central models underpinning topological physics and quantum computation. We overcome crucial limitations of perturbative Floquet engineering and discuss the highly accurate realization of a purely three-body Hamiltonian in superconducting circuits and molecular nanomagnets.

翻訳日:2023-04-25 03:13:57 公開日:2021-06-23

# テンソルネットワークを持つ有限密度における3+1次元格子量子電磁力学

Lattice Quantum Electrodynamics in (3+1)-dimensions at finite density with Tensor Networks ( http://arxiv.org/abs/2011.10658v2 )

ライセンス: Link先を確認

Giuseppe Magnifico, Timo Felser, Pietro Silvi, Simone Montangero

(参考訳) ゲージ理論は物質の基本構成要素とその相互作用を理解する上で最も重要なものである。しかしながら、その相図の完全な特徴づけと非摂動効果の完全な理解は、特に有限電荷密度において、主にモンテカルロ数値シミュレーションに影響する符号問題のために、いまだ議論が続いている。本稿では、動的な物質場を含むハミルトニアン定式化における3次元格子ゲージ理論のテンソルネットワークシミュレーションについて報告する。この符号問題のない手法を用いて、ゼロおよび有限電荷密度におけるコンパクト量子電磁力学の基底状態をシミュレートし、モデルの集団相の特徴づけ、大きなゲージ結合における閉じ込め相の存在、電荷遮蔽効果の研究といった基本的な問題に取り組む。

Gauge theories are of paramount importance in our understanding of fundamental constituents of matter and their interactions. However, the complete characterization of their phase diagrams and the full understanding of non-perturbative effects are still debated, especially at finite charge density, mostly due to the sign-problem affecting Monte Carlo numerical simulations. Here, we report the Tensor Network simulation of a three dimensional lattice gauge theory in the Hamiltonian formulation including dynamical matter: Using this sign-problem-free method, we simulate the ground states of a compact Quantum Electrodynamics at zero and finite charge densities, and address fundamental questions such as the characterization of collective phases of the model, the presence of a confining phase at large gauge coupling, and the study of charge-screening effects.

翻訳日:2023-04-23 19:09:49 公開日:2021-06-23

# 2光子遷移によるフラクソニウム量子ビット上のエンタングリングゲートの提案

Proposal for entangling gates on fluxonium qubits via a two-photon transition ( http://arxiv.org/abs/2011.10011v2 )

ライセンス: Link先を確認

Konstantin N. Nesterov, Quentin Ficheux, Vladimir E. Manucharyan, Maxim G. Vavilov

(参考訳) 2つの容量結合型フラクソニウム量子ビットに対する、マイクロ波で駆動されるエンタングリングゲートのファミリーを提案する。いずれかの量子ビットに $|00\rangle - |11\rangle$ 遷移の周波数の半分付近の周波数でマイクロ波パルスを印加すると、フラクソニウムの強い非調和性により、計算部分空間の外への漏れが無視できる2光子ラビ振動が誘起される。駆動周波数、振幅、持続時間を調整することにより、$\sqrt{\rm SWAP}$ 型ゲートや制御位相ゲートなどのフェルミオンシミュレーションゲートと局所的に等価なゲートファミリーが得られる。ゲート誤差は、回路パラメータを過度に合わせ込むことなく、100 ns 未満のパルス持続時間で $10^{-4}$ 未満に調整できる。フラクソニウムのコヒーレンス時間が 1 ms を超えうることを考えると、このゲート方式は大規模量子プロセッサに有望である。

We propose a family of microwave-activated entangling gates on two capacitively coupled fluxonium qubits. A microwave pulse applied to either qubit at a frequency near the half-frequency of the $|00\rangle - |11\rangle$ transition induces two-photon Rabi oscillations with a negligible leakage outside the computational subspace, owing to the strong anharmonicity of fluxoniums. By adjusting the drive frequency, amplitude, and duration, we obtain the gate family that is locally equivalent to the fermionic-simulation gates such as $\sqrt{\rm SWAP}$-like and controlled-phase gates. The gate error can be tuned below $10^{-4}$ for a pulse duration under 100 ns without excessive circuit parameter matching. Given that the fluxonium coherence time can exceed 1 ms, our gate scheme is promising for large-scale quantum processors.

翻訳日:2023-04-23 17:07:52 公開日:2021-06-23
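
The drive condition in the abstract ("near the half-frequency of the $|00\rangle - |11\rangle$ transition") can be written as a two-photon resonance condition. The symbols $\omega_d$ (drive frequency), $E_{00}$ and $E_{11}$ (energies of the two computational states) are notation introduced here, not taken from the paper, and drive-induced level shifts are neglected in this rough statement.

```latex
\omega_d \;\approx\; \frac{\omega_{00\to 11}}{2} \;=\; \frac{E_{11}-E_{00}}{2\hbar}
\qquad\Longleftrightarrow\qquad
2\hbar\,\omega_d \;\approx\; E_{11}-E_{00}
```

Read this way, two drive photons together supply the energy of the $|00\rangle \to |11\rangle$ transition, which is why the resulting oscillation is called a two-photon Rabi oscillation.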

# 回転波近似を超える2モードジョセフソン回路におけるサイドバンド遷移

Sideband transitions in a two-mode Josephson circuit driven beyond the rotating wave approximation ( http://arxiv.org/abs/2011.14600v2 )

ライセンス: Link先を確認

Byoung-moo Ann, Wouter Kessels, and Gary A. Steele

(参考訳) 量子系を時間的に周期駆動することは、量子状態のコヒーレント制御において重要な役割を果たす。回転波近似(RWA)は、弱く、ほぼ共鳴した駆動場に対する良い近似手法である。しかし、実験では大きな離調と強い駆動場が必要になることがあり、その場合RWAは成り立たない可能性がある。本研究では、強い駆動と大きな離調の領域において、強く駆動された2モードジョセフソン回路を実験的、数値的、解析的に調べる。具体的には、2光子サイドバンド遷移の駆動によって誘起される、2つのモード間のビームスプリッタ相互作用および2モードスクイージング相互作用を調べる。数値シミュレーションにより、RWAではサイドバンド遷移レートの大きさを正しく捉えられないことを確認する。この結果を、摂動補正に基づく解析モデルを用いて検証する。調べた領域におけるRWAの破綻は、定性的に異なるダイナミクスをもたらすのではなく、より高い駆動強度におけるRWA理論と同じ結果を与え、結合レートを予測より増大させる。これは、RWAの破綻が量子状態の定性的に異なる時間発展をもたらすキャリア遷移の場合と比べて興味深い結果である。本研究は、RWAを超えた時間周期駆動系の振る舞いに関する洞察を与える。また、これらの知見を回路量子電磁力学における量子プロトコルの計算と較正に取り入れるための頑健な理論的枠組みも提供する。

Driving quantum systems periodically in time plays an essential role in the coherent control of quantum states. The rotating wave approximation (RWA) is a good approximation technique for weak and nearly-resonance driven fields. However, these experiments sometimes require large detuning and strong driving fields, for which the RWA may not hold. In this work, we experimentally, numerically, and analytically explore strongly driven two-mode Josephson circuits in the regime of strong driving and large detuning. Specifically, we investigate beam-splitter and two-mode squeezing interaction between the two modes induced by driving a two-photon sideband transition. Using numerical simulations, we observe that the RWA is unable to correctly capture the amplitude of the sideband transition rates. We verify this finding using an analytical model that is based on perturbative corrections. We find that the breakdown of the RWA in the regime studied does not lead to qualitatively different dynamics, but gives the same results as the RWA theory at higher drive strengths, enhancing the coupling rates compared to what one would predict. This is an interesting consequence compared to the carrier transition case, where the breakdown of the RWA results in qualitatively different time evolution of the quantum state. Our work provides an insight into the behavior of time-periodically driven systems beyond the RWA. We also provide a robust theoretical framework for including these findings in the calculation and calibration of quantum protocols in circuit quantum electrodynamics.

翻訳日:2023-04-22 14:49:37 公開日:2021-06-23

# 連結ボソニックおよび離散変数量子符号に基づく量子リピータ

Quantum repeaters based on concatenated bosonic and discrete-variable quantum codes ( http://arxiv.org/abs/2011.15076v2 )

ライセンス: Link先を確認

Filip Rozpędek, Kyungjoo Noh, Qian Xu, Saikat Guha, Liang Jiang

(参考訳) 本稿では,離散および連続可変量子情報に使用される手法を組み合わせた量子エラー補正型量子リピータのアーキテクチャを提案する。具体的には、送信されたキュービットを2つのレベルからなる連結コードにエンコードする。最初のレベルでは、1つのボソニックモードでキュービットを符号化する連続可変GKPコードを使用します。第2のレベルでは、小さな離散変数コードを使用します。このようなアーキテクチャには2つの重要な特徴がある。まず、2つの異なるタイプのリピータにおいて、それぞれのレベルにおける誤差を補正する。これにより、すべてのリピータが同じアーキテクチャに対して、実用シナリオに必要なパフォーマンスをコスト削減で達成することができる。第二に、低レベルでの連続可変gkpコードの使用は、第2レベルのコードの誤り訂正能力を高める追加のアナログ情報を生成するため、4つまたは7つの光学モードからなる符号化で長距離通信が可能となる。

We propose an architecture of quantum-error-correction-based quantum repeaters that combines techniques used in discrete- and continuous-variable quantum information. Specifically, we propose to encode the transmitted qubits in a concatenated code consisting of two levels. On the first level we use a continuous-variable GKP code encoding the qubit in a single bosonic mode. On the second level we use a small discrete-variable code. Such an architecture has two important features. Firstly, errors on each of the two levels are corrected in repeaters of two different types. This enables for achieving performance needed in practical scenarios with a reduced cost with respect to an architecture for which all repeaters are the same. Secondly, the use of continuous-variable GKP code on the lower level generates additional analog information which enhances the error-correcting capabilities of the second-level code such that long-distance communication becomes possible with encodings consisting of only four or seven optical modes.

翻訳日:2023-04-22 14:19:30 公開日:2021-06-23

# 環境改善型コヒーレント光収穫

Environmentally improved coherent light harvesting ( http://arxiv.org/abs/2012.11864v2 )

ライセンス: Link先を確認

Stefano Tomasi, Dominic M. Rouse, Erik M. Gauger, Brendon W. Lovett, Ivan Kassal

(参考訳) コヒーレンスが光捕集性能を大きく向上させうるという理論的証拠があるにもかかわらず、コヒーレンスによる光捕集の増強は実験的に直接観測されていない。主な実験上の障害は、交絡変数が存在する中でコヒーレンスの効果を分離することの難しさである。光の偏光度を操作することでコヒーレンスを外部から制御するという最近の提案は、コヒーレンスによる効率向上が可能であることを示したが、環境と弱く結合した光捕集系に限られていた。本稿では、系と浴の結合強度の増大が、コヒーレンスによる効率向上を抑制するのではなく、むしろ増幅しうることを示す。この結果は、コヒーレンスによる光捕集の増強を決定的に実証したり、人工光捕集デバイスにコヒーレント効果を組み込んだりするために利用できる系の範囲を大きく広げる。

Coherence-enhanced light harvesting has not been directly observed experimentally, despite theoretical evidence that coherence can significantly enhance light-harvesting performance. The main experimental obstacle has been the difficulty in isolating the effect of coherence in the presence of confounding variables. Recent proposals for externally controlling coherence by manipulating the light's degree of polarization showed that coherent efficiency enhancements would be possible, but were restricted to light-harvesting systems weakly coupled to their environment. Here, we show that increases in system-bath coupling strength can amplify coherent efficiency enhancements, rather than suppress them. This result dramatically broadens the range of systems that could be used to conclusively demonstrate coherence-enhanced light harvesting or to engineer coherent effects into artificial light-harvesting devices.

翻訳日:2023-04-19 22:24:17 公開日:2021-06-23

# ニューラルネットワークを用いた実験データからの非古典性同定

Identifying nonclassicality from experimental data using artificial neural networks ( http://arxiv.org/abs/2101.07112v2 )

ライセンス: Link先を確認

Valentin Gebhart, Martin Bohmann, Karsten Weiher, Nicola Biagi, Alessandro Zavatta, Marco Bellini, Elizabeth Agudelo

(参考訳) 非古典的資源の高速でアクセス可能な検証は、連続変数量子技術の幅広い利用に向けて不可欠のステップである。本稿では,ホモダイン検出により得られた実験データを処理し,光量子状態の非古典性同定のための機械学習手法を提案する。そこで我々は,古典的,非古典的状態の分類を行うニューラルネットワークを訓練した。光の状態の異なる実実験的な二次データから古典的特徴や非古典的特徴を正確に識別できることを実証する。さらに,訓練段階で使用されていない状態の非古典性も認識できることを示した。ホモダイントモグラフィを行うのに必要な大きな試料サイズの必要性を回避し,小型標本サイズに対する非古典性の同定に有望な代替案を示し,高速選別や実験データの直接監視への適用性を示す。

The fast and accessible verification of nonclassical resources is an indispensable step towards a broad utilization of continuous-variable quantum technologies. Here, we use machine learning methods for the identification of nonclassicality of quantum states of light by processing experimental data obtained via homodyne detection. For this purpose, we train an artificial neural network to classify classical and nonclassical states from their quadrature-measurement distributions. We demonstrate that the network is able to correctly identify classical and nonclassical features from real experimental quadrature data for different states of light. Furthermore, we show that nonclassicality of some states that were not used in the training phase is also recognized. Circumventing the requirement of the large sample sizes needed to perform homodyne tomography, our approach presents a promising alternative for the identification of nonclassicality for small sample sizes, indicating applicability for fast sorting or direct monitoring of experimental data.

翻訳日:2023-04-14 21:09:29 公開日:2021-06-23

# 2つの加速Unruh-DeWitt検出器間の絡み合い収穫における熱場の役割

Role of thermal field in entanglement harvesting between two accelerated Unruh-DeWitt detectors ( http://arxiv.org/abs/2104.11269v2 )

ライセンス: Link先を確認

Dipankar Barman, Subhajit Barman, Bibhas Ranjan Majhi

(参考訳) 一様加速する2つの検出器間のエンタングルメント・ハーベスティングに対する場の温度 $T^{(f)}$ の影響を調べる。検出器が平行運動する場合、場の熱的性質はエンタングルメントを生じないため、結果は非熱的な状況と同じである。これに対し、検出器が反平行運動する場合、すなわち検出器 $A$ と $B$ がそれぞれ右と左のリンドラーウェッジにある場合には、$T^{(f)}$ がエンタングルメント・ハーベスティングに影響する。$T^{(f)}=0$ では $A$ の加速度 $a_A$ のすべての値でエンタングルメント・ハーベスティングが可能であるが、温度が存在する場合には $a_A$ の狭い範囲でのみ可能である。$(1+1)$ 次元では、この範囲は特定の値から始まって無限大まで広がり、$T^{(f)}$ が増加するにつれて、ハーベスティングに必要な $a_A$ の最小値が増加する。さらに、臨界値 $a_A=a_c$ より上では $T^{(f)}$ の増加とともにハーベスティングが増加し、これはそれより下の加速度の場合とは正反対である。$(1+3)$ 次元では、2つの検出器の加速度が異なる場合にいくつかの臨界値が存在する。$(1+1)$ 次元における単一の範囲とは対照的に、ここではハーベスティングは $a_A$ のいくつかの離散的な範囲内で可能である。興味深いことに、加速度が等しい場合には臨界点は1つであり、その性質は $(1+1)$ 次元の結果とよく似ている。また、これらの検出器間の相互情報量の $a_A$ と $T^{(f)}$ への依存性についても論じる。

We investigate the effects of field temperature $T^{(f)}$ on the entanglement harvesting between two uniformly accelerated detectors. For their parallel motion, the thermal nature of fields does not produce any entanglement, and therefore, the outcome is the same as the non-thermal situation. On the contrary, $T^{(f)}$ affects entanglement harvesting when the detectors are in anti-parallel motion, i.e., when detectors $A$ and $B$ are in the right and left Rindler wedges, respectively. While for $T^{(f)}=0$ entanglement harvesting is possible for all values of $A$'s acceleration $a_A$, in the presence of temperature, it is possible only within a narrow range of $a_A$. In $(1+1)$ dimensions, the range starts from specific values and extends to infinity, and as we increase $T^{(f)}$, the minimum required value of $a_A$ for entanglement harvesting increases. Moreover, above a critical value $a_A=a_c$ harvesting increases as we increase $T^{(f)}$, which is just opposite to the accelerations below it. There are several critical values in $(1+3)$ dimensions when they are in different accelerations. Contrary to the single range in $(1+1)$ dimensions, here harvesting is possible within several discrete ranges of $a_A$. Interestingly, for equal accelerations, one has a single critical point, with nature quite similar to $(1+1)$ dimensional results. We also discuss the dependence of mutual information among these detectors on $a_A$ and $T^{(f)}$.

翻訳日:2023-04-02 19:59:08 公開日:2021-06-23

# ポスト量子時代の生産環境のための公開鍵インフラのセキュリティレコメンデーションに向けて

Towards security recommendations for public-key infrastructures for production environments in the post-quantum era ( http://arxiv.org/abs/2105.01324v2 )

ライセンス: Link先を確認

S.E. Yunakovsky, M. Kot, N.O. Pozhar, D. Nabokov, M.A. Kudinov, A. Guglya, E.O. Kiktenko, E. Kolycheva, A. Borisov, and A.K. Fedorov

(参考訳) 量子コンピューティング技術は、現在使われている公開鍵暗号プロトコルに重大な脅威をもたらす。本稿では、運用環境を保護するためのセキュリティシステムの一部として使用される公開鍵基盤(PKI)に対する量子脅威の影響について論じる。我々は,量子化後のソリューションへの迅速な移行の要件に着目し,既存モデルのセキュリティ問題を解析する。量子コンピューティングによる攻撃に重点を置いていますが、使用する暗号アルゴリズムとは直接関係なく、PKI全体のセキュリティに不可欠なセキュリティ上の問題についても論じています。我々は、量子コンピュータによる攻撃の観点から、PKIに関する一連のセキュリティ勧告を提供する。

Quantum computing technologies pose a significant threat to the currently employed public-key cryptography protocols. In this paper, we discuss the impact of the quantum threat on public key infrastructures (PKIs), which are used as a part of security systems for protecting production environments. We analyze security issues of existing models with a focus on requirements for a fast transition to post-quantum solutions. Although our primary focus is on the attacks with quantum computing, we also discuss some security issues that are not directly related to the used cryptographic algorithms but are essential for the overall security of the PKI. We attempt to provide a set of security recommendations regarding the PKI from the viewpoints of attacks with quantum computers.

翻訳日:2023-04-01 15:42:17 公開日:2021-06-23

# ド・ジッター空間におけるバタフライ速度とカオス抑制

Butterfly velocity and chaos suppression in de Sitter space ( http://arxiv.org/abs/2105.02258v2 )

ライセンス: Link先を確認

Dmitry S. Ageev

(参考訳) 本稿では、有限温度 $T$ および化学ポテンシャルのもとで、ド・ジッター静的パッチにおけるホログラフィックCFTを調べる。このような場の理論におけるバタフライ速度 $v_B$ は、ハッブルパラメータ $H$ と $T$ のすべての値に対して縮退することがわかった。これを、$v_B$ で制約されるカオス的相関の広がりと、ド・ジッター曲率による効果との相互作用によって引き起こされるカオスの崩壊と解釈する。化学ポテンシャルは、ある温度範囲において健全なバタフライ速度を回復させる。また、このカオス抑制と、ド・ジッター空間におけるシュウィンガー効果や衝撃波衝突によるブラックホール形成との類似性についても述べる。

In this note, we study the holographic CFT in the de Sitter static patch at finite temperature $T$ and chemical potential. We find that butterfly velocity $v_B$ in such field theory degenerates for all values of the Hubble parameter $H$ and $T$. We interpret this as a chaos disruption caused by the interplay between the expansion of chaotic correlations constrained by $v_B$ and effects caused by de Sitter curvature. The chemical potential restores healthy butterfly velocity for some range of temperatures. Also, we provide some analogy of this chaos suppression with the Schwinger effect in de Sitter and black hole formation from shock wave collision.

翻訳日:2023-04-01 13:06:33 公開日:2021-06-23

# UPBの強い非局所集合

Strong nonlocal sets of UPB ( http://arxiv.org/abs/2106.08699v2 )

ライセンス: Link先を確認

Bichen Che, Zhao Dou, Min Lei, Yixian Yang

(参考訳) 拡張不可能な積基底(UPB)は、直交積状態の族の興味深いメンバーである。本稿では、異なるサイズで強い非局所性を持つ3量子ビットUPBの構成について検討する。まず、Shifts UPBに基づいて、サイズ12の ${{C}^{3}}\otimes {{C}^{3}}\otimes {{C}^{3}}$ のUPB集合を示す。その構造は、系を $3\times 3\times 3$ のルービックキューブに対応づけることで記述される。各量子ビットの直交グラフを観察したうえで、${{C}^{d}}\otimes {{C}^{d}}\otimes {{C}^{d}}$ においてサイズ ${{\left( d-1 \right)}^{3}}+3\left( d-2 \right)+1$ のUPBを構成する一般的な方法を与える。次に、量子ビットの次元が異なるより一般的な場合について、タイル構造を3量子ビット系に拡張し、3量子ビットUPBのためのトリタイル構造を提案する。この構造を用いて、${{C}^{3}}\otimes {{C}^{3}}\otimes {{C}^{4}}$ 系に基づき、サイズ30の ${{C}^{4}}\otimes {{C}^{4}}\otimes {{C}^{5}}$ 系を得る。同様に、この手法を ${{C}^{{{d}_{1}}}}\otimes {{C}^{{{d}_{2}}}}\otimes {{C}^{{{d}_{3}}}}$ 系に一般化する。これは ${{C}^{d}}\otimes {{C}^{d}}\otimes {{C}^{d}}$ と同様の構成を持つ。我々の研究は、[Halder, et al., PRL, 122, 040403 (2019)]で提起された未解決問題に肯定的な回答を与え、エンタングルメントなしに強い量子非局所性を示す多量子ビットUPBが実際に存在することを示している。

The unextendible product bases (UPBs) are interesting members from the family of orthogonal product states. In this paper, we investigate the construction of 3-qubit UPB with strong nonlocality of different sizes. First, a UPB set in ${{C}^{3}}\otimes {{C}^{3}}\otimes {{C}^{3}}$ of size 12 is presented based on the Shifts UPB, the structure of which is described by mapping the system to a $3\times 3\times 3$ Rubik's Cube. After observing the orthogonal graph of each qubit, we provide a general method of constructing UPB in ${{C}^{d}}\otimes {{C}^{d}}\otimes {{C}^{d}}$ of size ${{\left( d-1 \right)}^{3}}+3\left( d-2 \right)+1$. Second, for the more general case where the dimensions of qubits are different, we extend the tile structure to 3-qubit system and propose a Tri-tile structure for 3-qubit UPB. Then, by means of this structure, a ${{C}^{4}}\otimes {{C}^{4}}\otimes {{C}^{5}}$ system of size 30 is obtained based on a ${{C}^{3}}\otimes {{C}^{3}}\otimes {{C}^{4}}$ system. Similarly, we generalize this approach to ${{C}^{{{d}_{1}}}}\otimes {{C}^{{{d}_{2}}}}\otimes {{C}^{{{d}_{3}}}}$ system which has a similar composition to ${{C}^{d}}\otimes {{C}^{d}}\otimes {{C}^{d}}$. Our research provides a positive answer to the open questions raised in [Halder, et al., PRL, 122, 040403 (2019)], indicating that there do exist multi-qubit UPBs that can exhibit strong quantum nonlocality without entanglement.

翻訳日:2023-03-26 13:18:42 公開日:2021-06-23
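
As a quick arithmetic check of the size formula quoted in the abstract, ${{\left( d-1 \right)}^{3}}+3\left( d-2 \right)+1$, the sketch below evaluates it for a few local dimensions and compares it with the full dimension $d^3$. The function names are hypothetical, and restricting to $d \ge 3$ is an assumption made here because $d=3$ is the smallest case the abstract mentions.

```python
def upb_size(d: int) -> int:
    """Size (d-1)^3 + 3(d-2) + 1 quoted in the abstract for a UPB in C^d x C^d x C^d."""
    if d < 3:
        # Assumption: the abstract's smallest example is d = 3.
        raise ValueError("expected local dimension d >= 3")
    return (d - 1) ** 3 + 3 * (d - 2) + 1


def full_dimension(d: int) -> int:
    """Dimension of the whole tripartite space C^d x C^d x C^d."""
    return d ** 3


if __name__ == "__main__":
    for d in (3, 4, 5):
        print(d, upb_size(d), full_dimension(d))
```

For $d=3$ the formula gives $8+3+1=12$, matching the size-12 set in ${{C}^{3}}\otimes {{C}^{3}}\otimes {{C}^{3}}$ stated in the abstract; in each case the set is smaller than the full dimension, as expected for a product basis that cannot be extended.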

# 位置相関を用いた未検出光子を用いた量子イメージングにおける解像度限界

Resolution limit in quantum imaging with undetected photons using position correlations ( http://arxiv.org/abs/2106.11358v2 )

ライセンス: Link先を確認

Balakrishnan Viswanathan, Gabriela Barreto Lemos and Mayukh Lahiri

(参考訳) 未検出光子(QIUP)を用いた量子イメージングは、物体を照らす光子を検出できない独自の画像取得方法である。この方法は、双対光子間の量子干渉と空間的相関を利用して画像を形成する。ここでは位置相関が有効であるQIUPの分解能限界について詳細に検討する。自然パラメトリックダウンコンバージョンプロセス(SPDC)における空間分解能と双光子位置相関の定量的な関係を確立する。さらに,検出されていない照明場の波長と検出されたフィールドの波長が解像度で果たす役割を定量的に確立する。ゴーストイメージングや従来のイメージングとは異なり、QIUPにおける双対光子の空間相関による分解能限界は従来の光学技術ではさらに改善できない。

Quantum imaging with undetected photons (QIUP) is a unique method of image acquisition where the photons illuminating the object are not detected. This method relies on quantum interference and spatial correlations between the twin photons to form an image. Here we present a detailed study of the resolution limits of position correlation enabled QIUP. We establish a quantitative relation between the spatial resolution and the twin photon position correlation in the spontaneous parametric down-conversion process (SPDC). Furthermore, we also quantitatively establish the roles that the wavelength of the undetected illumination field and the wavelength of the detected field play in the resolution. Like ghost imaging and unlike conventional imaging, the resolution limit imposed by the spatial correlation between twin photons in QIUP cannot be further improved by conventional optical techniques.

翻訳日:2023-03-25 22:54:56 公開日:2021-06-23

# Som-Raychaudhuri時空における一般化Duffin-Kemmer-Petiau発振子に対するAharonov-Bohm効果

Aharonov-Bohm effect on the generalized Duffin-Kemmer-Petiau oscillator in the Som-Raychaudhuri space-time ( http://arxiv.org/abs/2106.12192v1 )

ライセンス: Link先を確認

Yi Yang, Zheng-Wen Long, Hao Chen, Zi-Long Zhao and Chao-Yun Long

(参考訳) 曲線時空における電磁相互作用を持つ一般化ダフィン-ケムマー-ペティオー振動子(dkp)について検討した。まず、コーネルポテンシャルを持つSom-Raychaudhuri時空における一般化DKP発振器を紹介する。次に、一般化DKP発振器における電磁相互作用について考察する。我々の問題のエネルギー固有値と固有関数が得られた。エネルギー固有値に対する時空パラメータ,振動子周波数,コーネル電位,磁束の影響を解析した。我々は,アハロノフ・ボーム効果から有界状態に対する類似効果を考察した。

The generalized Duffin-Kemmer-Petiau (DKP) oscillator with electromagnetic interactions in curved space-time is investigated. We first introduce the generalized DKP oscillator in the Som-Raychaudhuri space-time with a Cornell potential. Then, we include electromagnetic interactions in the generalized DKP oscillator. The energy eigenvalues and eigenfunctions of the problem are obtained. The effects of the space-time parameters, the oscillator frequency, the Cornell potential, and the magnetic flux on the energy eigenvalues are analyzed. We find an effect on the bound states analogous to the Aharonov-Bohm effect in the system considered.

翻訳日:2023-03-25 18:45:41 公開日:2021-06-23

# HF帯を用いた長距離無線リンク

Enlaces de rádio de longa distância utilizando a banda de HF ( http://arxiv.org/abs/2106.12187v1 )

ライセンス: Link先を確認

Rafael Diniz, Mylène C. Q. Farias

(参考訳) 通信におけるHFバンドの使用に対する関心は、主にHFにおける軍用通信の新たな標準の開発とHFバンドにおけるデジタル放送の拡大により、この10年間で著しく高まっている。より具体的には、これらの新しい標準により、数百から数千kmのリンクを低コストで実装できるため、広く採用される可能性がある。ブラジルでは、この種のコミュニケーションは、アマゾン熱帯雨林地域のような、遠隔地やアクセスが難しい地域で使用することができる。 HF通信システムの物理層に関する技術の進化に加え、音声や画像の符号化に機械学習アルゴリズムを用いる技術が盛んに開発されてきた。これらすべての進歩により、通信インフラのない場所での通信サービスにHFバンドを使用できると信じられている。本研究は、ブラジルにおけるデジタルリンクにおけるHFラジオの最近の応用について、HFバンドにおける通信システム開発における課題について述べる。

The interest in the use of the HF band in telecommunication has increased significantly in the last decade, mainly due to the development of new standards for military telecommunications in HF, as well as the expansion of digital broadcasting in the HF band. More specifically, these new standards allow the implementation of links of hundreds or thousands of kilometers at a low cost, which suggests a widespread adoption can occur. In Brazil, this type of communication can be used in remote regions or regions of difficult access, such as the Amazon rain-forest region. In addition to the evolution of technologies concerning the physical layer of the HF telecommunication systems, there has been a great development of techniques that use machine learning algorithms for audio and image coding. It is believed that all these advances will enable the use of the HF band for communication services in places without telecommunication infrastructure. This work presents recent applications of HF radio for digital links in Brazil, describing the challenges present for the development of telecommunication systems in the HF band.

翻訳日:2023-03-25 18:45:32 公開日:2021-06-23

# 駅周辺における都市機能の多様性と密度

Diversity and density of urban functions in station areas ( http://arxiv.org/abs/2106.12107v1 )

ライセンス: Link先を確認

Yusuke Kumakoshi, Hideki Koizumi, Yuji Yoshimura

(参考訳) 都市機能の多様性と密度は都市の活力に正の影響を与えることが知られているが、両者の関係は実証的に検討されてこなかった。ある地域で高密度が低い多様性と結びついているならば、その活力は高まらない可能性がある。都市の新陳代謝をよりよく理解し、都市計画上の介入の方向性を得るために、本稿では、モンテカルロシミュレーションにより決定したロバストな密度指標を用いて、東京都市圏における都市機能の多様性と密度の関連について実証的な証拠を示す。関連分析の結果、高密度の駅周辺は複数のスケールで多様性が低い傾向にあることがわかった。さらなる調査により、この負の相関は、機能ごとの空間的特性の違いと、アクセス性の高い駅周辺の間での機能の相互補完に起因することが示唆された。本稿は、駅周辺を活力があり回復力のある場所にするために、都市計画において多様性と密度の両方を考慮すべきであると論じる。

The diversity and density of urban functions have been known to affect urban vibrancy positively, but the relation between the two has not been empirically examined; if high density is associated with low diversity in an area, its vibrancy may not increase. To obtain a better understanding of the metabolism of cities and directions for urban planning interventions, this paper offers empirical evidence on the association between the diversity and density of urban functions in the Tokyo Metropolitan Area, using a robust density index that was determined via a Monte Carlo simulation. By conducting association analyses, it was found that highly dense station areas tended to display low diversity at multiple scales. Further investigation indicated that this negative correlation was owing to different spatial characteristics of functions and complementary functioning among highly accessible station areas. This paper argues for considering both diversity and density in urban planning to make station areas vibrant and resilient.

翻訳日:2023-03-25 18:44:42 公開日:2021-06-23
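
To make the two quantities concrete, the sketch below computes a diversity score and a density for one station area. The abstract does not say which diversity measure the paper uses, so the Shannon entropy here is an assumption, and the plain count-per-area density is a stand-in rather than the paper's Monte Carlo based robust density index. All names and the sample data are hypothetical.

```python
import math
from collections import Counter


def shannon_diversity(functions):
    """Shannon entropy (natural log) of the urban-function categories in an area."""
    counts = Counter(functions)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def density(functions, area_km2):
    """Number of facilities per square kilometre (simple stand-in measure)."""
    if area_km2 <= 0:
        raise ValueError("area must be positive")
    return len(functions) / area_km2


if __name__ == "__main__":
    # Hypothetical station areas of equal size: one dense but uniform, one sparse but mixed.
    dense_uniform = ["office"] * 40
    sparse_mixed = ["office", "shop", "clinic", "school"]
    for name, area in (("dense_uniform", dense_uniform), ("sparse_mixed", sparse_mixed)):
        print(name, density(area, 0.5), shannon_diversity(area))
```

The toy data only illustrates how high density and low diversity can coexist in a single area; it says nothing about the Tokyo data analysed in the paper.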

# 量子ホモダイントモグラフィの高次元法

High-Dimensional Methods for Quantum Homodyne Tomography ( http://arxiv.org/abs/2106.12353v1 )

ライセンス: Link先を確認

Nicola Mosco, Lorenzo Maccone

(参考訳) ホモダイントモグラフィのための最適化された漸化式を与える。従来用いられてきたパターン関数の計算に内在する発散を緩和することで既存の手法を改良し、モンテカルロシミュレーションを通じたデータ解析の実装方法を詳述する。この改良は、電磁場のヒルベルト空間の高次元部分空間を占める励起量子状態の再構成に必要である。また、解析および再構成手法のためのJuliaパッケージも提示する。

We provide optimized recursion relations for homodyne tomography. We improve previous methods by mitigating the divergences intrinsic in the calculation of the pattern functions used previously, and detail how to implement the data analysis through Monte Carlo simulations. Our refinements are necessary for the reconstruction of excited quantum states which populate a high-dimensional subspace of the electromagnetic field Hilbert space. We also present a Julia package for the analysis and the reconstruction method.

翻訳日:2023-03-25 18:40:36 公開日:2021-06-23

# 圧縮熱浴における量子パラメトリック発振器:理論的基礎問題

Quantum Parametric Oscillator Heat Engines in Squeezed Thermal Baths: Foundational Theoretical Issues ( http://arxiv.org/abs/2106.12325v1 )

ライセンス: Link先を確認

Onat Arısoy, Jen-Tsung Hsiang and Bei-Lok Hu

(参考訳) In this paper we examine some foundational issues of a class of quantum engines where the system consists of a single quantum parametric oscillator, operating in an Otto cycle consisting of 4 stages of two alternating phases: the isentropic phase is detached from any bath (thus a closed system) where the natural frequency of the oscillator is changed from one value to another, and the isothermal phase where the system (now rendered open) is put in contact with one or two squeezed baths of different temperatures, whose nonequilibrium dynamics follows the Hu-Paz-Zhang (HPZ) master equation for quantum Brownian motion. hpz方程式は密度作用素の正則性を保つ完全非マルコフ方程式であり、有効である。 a) すべての温度 b)浴槽の任意のスペクトル密度,及び c) システムと浴槽との間の任意の結合強度。これらの性質を生かして、量子オットーエンジンのこれら2つの相に対する量子オープン・スクイーズド系の理論の重要な基礎的問題について検討する。以下を含む。一非マルコフ政権の非正統で低温の浴場二非断熱周波数変調に期待するもの三強固なシステムバス結合及び強固なシステムバスカップリング四この二つの相の間の適切な接合条件ここでの目標は、より高い効率を実現する方法を示すのではなく、より広い範囲のパラメータ空間をカバーする連続変数の量子エンジンのより堅実な理論的基礎を構築することである。

In this paper we examine some foundational issues of a class of quantum engines where the system consists of a single quantum parametric oscillator, operating in an Otto cycle consisting of 4 stages of two alternating phases: the isentropic phase is detached from any bath (thus a closed system) where the natural frequency of the oscillator is changed from one value to another, and the isothermal phase where the system (now rendered open) is put in contact with one or two squeezed baths of different temperatures, whose nonequilibrium dynamics follows the Hu-Paz-Zhang (HPZ) master equation for quantum Brownian motion. The HPZ equation is an exact non-Markovian equation which preserves the positivity of the density operator and is valid for a) all temperatures, b) arbitrary spectral density of the bath, and c) arbitrary coupling strength between the system and the bath. Taking advantage of these properties we examine some key foundational issues of theories of quantum open and squeezed systems for these two phases of the quantum Otto engines. These include i) the non-Markovian regimes for non-Ohmic, low temperature baths, ii) what to expect in nonadiabatic frequency modulations, iii) strong system-bath coupling, as well as iv) the proper junction conditions between these two phases. Our aim here is not to present ways for attaining higher efficiency but to build a more solid theoretical foundation for quantum engines of continuous variables covering a broader range of parameter spaces, hopefully of use for exploring such possibilities.

翻訳日:2023-03-25 18:40:29 公開日:2021-06-23

# 量子脳ネットワークの展望

Quantum Brain Networks: a Perspective ( http://arxiv.org/abs/2106.12295v1 )

ライセンス: Link先を確認

E. R. Miranda, S. Venkatesh, C. Hernani-Morales, L. Lamata, J. D. Martín-Guerrero, and E. Solano

(参考訳) 我々はニューロテクノロジー、人工知能、量子コンピューティングの知識と手法を統合する新たな分野として量子脳ネットワーク(QBraiNs)を提案する。目標は、さまざまな破壊的応用のために、人間の脳と量子コンピュータの接続性を高めることである。我々は、ウェットウェアとハードウェアノードのハイブリッド古典量子ネットワークの出現を予測し、機械学習技術とブレイン・マシン・インタフェースを媒介とする。 QBraiNsは前例のない方法で芸術、科学、技術、起業家精神、特に医学、人間のインターネット、インテリジェントデバイス、感覚体験、ゲーム、物のインターネット、暗号取引、ビジネスに関連する活動を活用し、変革する。

We propose Quantum Brain Networks (QBraiNs) as a new interdisciplinary field integrating knowledge and methods from neurotechnology, artificial intelligence, and quantum computing. The objective is to develop an enhanced connectivity between the human brain and quantum computers for a variety of disruptive applications. We foresee the emergence of hybrid classical-quantum networks of wetware and hardware nodes, mediated by machine learning techniques and brain-machine interfaces. QBraiNs will harness and transform in unprecedented ways arts, science, technologies, and entrepreneurship, in particular activities related to medicine, Internet of humans, intelligent devices, sensorial experience, gaming, Internet of things, crypto trading, and business.

翻訳日:2023-03-25 18:40:08 公開日:2021-06-23

# 強相互作用原子の駆動非平衡系におけるエピデミック拡散と群免疫

Epidemic spreading and herd immunity in a driven non-equilibrium system of strongly-interacting atoms ( http://arxiv.org/abs/2106.12290v1 )

ライセンス: Link先を確認

Dong-Sheng Ding, Zong-Kai Liu, Hannes Busche, Bao-Sen Shi, Guang-Can Guo, Charles S. Adams, and Franco Nori

(参考訳) 疫病の空間的ダイナミクスを理解することがますます重要である。流行の数学的モデルが数多く存在するが、量的モデルテストを可能にする十分な制御パラメータを持つ物理システムが不足している。また、顕微鏡系における複雑なモデルのマクロ非平衡効果の再現も困難である。本研究では, 強い相互作用を持つリドバーグ原子における光学的非平衡相転移を利用した拡散拡散の物理アナログを実験的に示す。複数のレーザービームを使用することで、任意の所望の空間構造を課すことができる。サンプルの異なる部分で空間的局所化相転移とその相互作用を観察する。これらの相転移は、複数の場所での感染症の発生をシミュレートし、異なる体制下での免疫と疫病状態へのダイナミクスをシミュレートする。報告された結果は、Rydberg系は複雑な時空間力学をモデル化するのに十分な万能性を持っていることを示している。

It is increasingly important to understand the spatial dynamics of epidemics. While there are numerous mathematical models of epidemics, there is a scarcity of physical systems with sufficiently well-controlled parameters to allow quantitative model testing. It is also challenging to replicate the macro non-equilibrium effects of complex models in microscopic systems. In this work, we demonstrate experimentally a physics analog of epidemic spreading using optically driven non-equilibrium phase transitions in strongly interacting Rydberg atoms. Using multiple laser beams we can impose any desired spatial structure. We observe spatially localized phase transitions and their interplay in different parts of the sample. These phase transitions simulate the outbreak of an infectious disease in multiple locations, as well as the dynamics towards herd immunity and endemic state in different regimes. The reported results indicate that Rydberg systems are versatile enough to model complex spatial-temporal dynamics.

翻訳日:2023-03-25 18:39:57 公開日:2021-06-23

# 連続可変量子状態の数量子ビットへの普遍的ユニタリ移動

Universal unitary transfer of continuous-variable quantum states into a few qubits ( http://arxiv.org/abs/2106.12272v1 )

ライセンス: Link先を確認

Jacob Hastrup, Kimin Park, Jonatan Bohr Brask, Radim Filip and Ulrik Lund Andersen

(参考訳) 任意の連続変数量子状態を数個の離散変数量子ビットに転送するためのプロトコルを提案する。このプロトコルは決定論的であり、トラップイオンおよび超伝導回路プラットフォームで容易に利用できる2モードのラビ型相互作用のみを利用する。無限次元の状態を有限次元レジスタに転送することで生じる避けられない誤差は、指数関数的に量子ビットの数で抑制される。さらに、エンコードされた状態は、量子ビットに作用するデファスメントや振幅減衰などのノイズに対して頑健性を示す。このプロトコルは、離散連続型ハイブリッド量子システムのための強力で柔軟なツールを提供する。

We present a protocol for transferring arbitrary continuous-variable quantum states into a few discrete-variable qubits and back. The protocol is deterministic and utilizes only two-mode Rabi-type interactions which are readily available in trapped-ion and superconducting circuit platforms. The inevitable errors caused by transferring an infinite-dimensional state into a finite-dimensional register are suppressed exponentially with the number of qubits. Furthermore, the encoded states exhibit robustness against noise, such as dephasing and amplitude damping, acting on the qubits. Our protocol thus provides a powerful and flexible tool for discrete-continuous hybrid quantum systems.

翻訳日:2023-03-25 18:39:43 公開日:2021-06-23

# 軌道光学格子におけるフェルミガスの量子退化

Quantum degenerate Fermi gas in an orbital optical lattice ( http://arxiv.org/abs/2106.12241v1 )

ライセンス: Link先を確認

M. Hachmann, Y. Kiefer, J. Riebesehl, R. Eichberger, A. Hemmerich

(参考訳) 光学式チェッカーボード正方形格子の励起ブロッホ帯において, スピン偏極試料と量子退化フェルミオン原子のスピン混合物を調製した。スピン偏極の場合、パウリの排除原理による衝突の抑制を反映して、10,$s以上の極端帯域寿命が観測される。スピン混合物の場合、寿命は異なるスピン成分間の2体衝突によって桁違いに減少するが、それでも約1秒の顕著な大きな値が見つかる。運動量スペクトルを分析することで、光学格子の軌道特性を直接観測することができる。ここで実証された観測は、ユニタリティの体制を含む軌道光学格子における2対のスピン成分を持つフェルミ気体の物理を探索する基礎となる。

Spin-polarized samples and spin mixtures of quantum degenerate fermionic atoms are prepared in selected excited Bloch bands of an optical chequerboard square lattice. For the spin-polarized case, extreme band lifetimes above $10\,$s are observed, reflecting the suppression of collisions by Pauli's exclusion principle. For spin mixtures, lifetimes are reduced by an order of magnitude by two-body collisions between different spin components, but still remarkably large values of about one second are found. By analyzing momentum spectra, we can directly observe the orbital character of the optical lattice. The observations demonstrated here form the basis for exploring the physics of Fermi gases with two paired spin components in orbital optical lattices, including the regime of unitarity.

翻訳日:2023-03-25 18:39:32 公開日:2021-06-23

# Revenge Porn: 予備的専門家分析

Reporting Revenge Porn: a Preliminary Expert Analysis ( http://arxiv.org/abs/2106.12223v1 )

ライセンス: Link先を確認

A. De Angeli, M. Falduti, M. Menendez Blanco, S. Tessaris

(参考訳) 本研究では,リベンジポルノ(リベンジポルノ)と呼ばれる成人の親密・性的に露骨なデジタル画像の非コンセンサス分布に対する,被害者の視点からの対応に焦点を当てた。本稿では,選択したコンテンツ共有プラットフォームにおけるリベンジポルノ乱用を報告するためのプロセスに関する予備的専門家分析を行う。その中には、ソーシャルネットワーク、画像ホスティングサイト、ビデオホスティングプラットフォーム、フォーラム、ポルノサイトが含まれていました。性的行為の文脈における被害者の描写を目的とし、本人の顔を元の視覚的内容に置き換えるディープフェイク技術(ディープフェイク技術)の活用と、非合意による性的イメージやビデオのオンライン配信(リベンジポルノ)について、虐待を報告する方法について検討した。この予備分析は、これらの乱用を報告するためにプロバイダが設計した手順における、現在のプラクティスと潜在的な問題を理解することを目的としている。

In our research, we focus on the response to the non-consensual distribution of intimate or sexually explicit digital images of adults, also referred to as revenge porn, from the point of view of the victims. In this paper, we present a preliminary expert analysis of the process for reporting revenge porn abuses in selected content sharing platforms. Among these, we included social networks, image hosting websites, video hosting platforms, forums, and pornographic sites. We looked at the way to report abuse, concerning both the non-consensual online distribution of private sexual images or videos (revenge pornography) and the use of deepfake techniques, where the face of a person can be replaced on original visual content with the aim of portraying the victim in the context of sexual behaviours. This preliminary analysis is directed at understanding the current practices and potential issues in the procedures designed by the providers for reporting these abuses.

翻訳日:2023-03-25 18:39:11 公開日:2021-06-23

# 平面k-一様状態:平面最大絡み合い状態の一般化

Planar k-Uniform States: a Generalization of Planar Maximally Entangled States ( http://arxiv.org/abs/2106.12209v1 )

ライセンス: Link先を確認

Yan-Ling Wang

(参考訳) 最近,ドローディアーニとカリミポーリ [Phys]. rev. a \textbf{102} 012427(2020)] は、極度に絡み合った (ame) 状態よりもより広い多成分の絡み合った状態のクラスである平面的極大絡み合い (pme) 状態の表記を提案した。そこで彼らは多成分系でその構成を示したが、粒子の数は偶数に制限されている。ここでは、まず残りのケース、すなわち、奇数の粒子を持つ系の平面的最大絡み合った状態の構成を解く。さらに、pme を平面的 $k$-uniform 状態に一般化し、n$ パーティの円に沿って隣接する$k$ パーティが最大に混合されるようにした。我々は最小のサポートを持つ平面$k$-一様状態の集合を構築する方法を示した。

Recently, Doroudiani and Karimipour [Phys. Rev. A 102, 012427 (2020)] proposed the notion of planar maximally entangled (PME) states, which are a wider class of multipartite entangled states than absolutely maximally entangled (AME) states. There they presented constructions in multipartite systems, but with the number of particles restricted to be even. Here we first solve the remaining cases, i.e., constructions of planar maximally entangled states on systems with an odd number of particles. In addition, we generalize PME states to planar $k$-uniform states, whose reductions to any adjacent $k$ parties along a circle of $N$ parties are maximally mixed. We present a method to construct sets of planar $k$-uniform states which have minimal support.
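
The defining property above (every block of $k$ adjacent parties on a ring of $N$ parties has a maximally mixed reduction) can be checked numerically for small qubit systems. The following is a minimal sketch, not the construction from the paper: the helper names and the two test states (two Bell pairs placed on opposite corners of a four-site ring, and a four-qubit GHZ state) are chosen here purely for illustration.

```python
from itertools import product


def reduced_density_matrix(state, n, keep):
    """Partial trace of |state><state| onto the qubits listed in `keep`.

    `state` maps n-bit tuples to amplitudes; missing keys mean amplitude 0.
    The result maps (row_bits, col_bits) to a matrix entry.
    """
    rest = [site for site in range(n) if site not in keep]
    rho = {}
    for row in product((0, 1), repeat=len(keep)):
        for col in product((0, 1), repeat=len(keep)):
            entry = 0
            for traced in product((0, 1), repeat=len(rest)):
                bits_row, bits_col = [0] * n, [0] * n
                for site, bit in zip(keep, row):
                    bits_row[site] = bit
                for site, bit in zip(keep, col):
                    bits_col[site] = bit
                for site, bit in zip(rest, traced):
                    bits_row[site] = bits_col[site] = bit
                amp_row = state.get(tuple(bits_row), 0)
                amp_col = state.get(tuple(bits_col), 0)
                entry += amp_row * amp_col.conjugate()
            rho[(row, col)] = entry
    return rho


def is_maximally_mixed(rho, k, tol=1e-12):
    """True if rho equals the identity divided by 2**k."""
    dim = 2 ** k
    return all(abs(value - (1 / dim if row == col else 0)) < tol
               for (row, col), value in rho.items())


def is_planar_k_uniform(state, n, k):
    """Check every block of k adjacent sites on a ring of n qubits."""
    return all(
        is_maximally_mixed(
            reduced_density_matrix(
                state, n, tuple((start + j) % n for j in range(k))), k)
        for start in range(n))


# Illustrative state 1: Bell pairs on the opposite corners (0, 2) and (1, 3).
bell_pairs = {(a, b, a, b): 0.5 for a in (0, 1) for b in (0, 1)}
# Illustrative state 2: the four-qubit GHZ state.
ghz = {(0, 0, 0, 0): 2 ** -0.5, (1, 1, 1, 1): 2 ** -0.5}

print(is_planar_k_uniform(bell_pairs, 4, 2))  # adjacent pairs only
print(is_planar_k_uniform(ghz, 4, 2))
```

In this sketch the Bell-pair state passes the adjacent-pair test while its non-adjacent reduction onto sites (0, 2) is a pure Bell state, which is why planar uniformity is a weaker requirement than uniformity over all subsets of $k$ parties.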

翻訳日:2023-03-25 18:38:53 公開日:2021-06-23

# 一貫した質量を持たない粒子論の群理論的導出

Group theoretical derivation of consistent massless particle theories ( http://arxiv.org/abs/2106.12206v1 )

ライセンス: Link先を確認

Giuseppe Nisticò

(参考訳) 質量のない自由粒子の現在の理論は、空間反転と反ユニタリ作用素を仮定する。したがって、可能な理論の強固なクラスは破棄される。現在の無質量系の研究理論は、相対論的不変性の原理から厳密に推論的発展を通じて導かれるため、空間反転や時間反転作用素の一種が不整合を引き起こす場合にのみ排除される。その結果、質量のない孤立系に対する新しい一貫した理論のクラスが明確に決定される。一方、このアプローチは不変原理が示唆する一定の制約を定めており、過去のいくつかの調査で無視された結果、結果として不変原理と一致しないことがわかった。また、マスレスシステムのローカライズ可能性の問題は、新しい理論枠組みの中で再考され、一般化と過去の結果のより詳細な情報が得られる。

Current theories of massless free particles assume unitary space inversion and anti-unitary time reversal operators. In so doing, robust classes of possible theories are discarded. In the present work theories of massless systems are derived through a strictly deductive development from the principle of relativistic invariance, so that a kind of space inversion or time reversal operator is ruled out only if it causes inconsistencies. As a result, new classes of consistent theories for massless isolated systems are explicitly determined. On the other hand, the approach determines definite constraints implied by the invariance principle; they were ignored by some past investigations that, as a consequence, turn out not to be consistent with the invariance principle. The problem of localizability for massless systems is also reconsidered within the new theoretical framework, yielding a generalization and a more detailed account of previous results.

翻訳日:2023-03-25 18:38:36 公開日:2021-06-23

# MEMS単点磁気格子計用カシミールプルイン抵抗形パラメトリック増幅器の解析

Analysis of a Casimir-driven Parametric Amplifier with Resilience to Casimir Pull-in for MEMS Single-Point Magnetic Gradiometry ( http://arxiv.org/abs/2106.12477v1 )

ライセンス: Link先を確認

Josh Javor, Zhancheng Yao, Matthias Imboden, David K. Campbell and David J. Bishop

(参考訳) 量子力学的効果であるカシミール力は、いくつかのマイクロエレクトロメカニカル・システム(MEMS)プラットフォームで観測されている。 2つの物体の分離に対する極度な感度のため、カシミール力は量子力学の優れた道として提案されている。しかし、実用的応用はカシミールプルイン(casimir pull-in)と呼ばれる装置の消耗や故障に繋がる魅力的な力によって困難である。本研究では,時間遅延に基づくパラメトリック増幅手法を開発し,定常状態を実現し,プルインを回避するカシミール駆動型メトロロジープラットフォームの設計とシミュレーションを行う。この設計は、心臓と脳のイオン電流から発生するものと類似した弱い、低周波、勾配磁場の検出に応用する。シミュレーションパラメータは、MEMSプラットフォーム上でカシミール気象学および磁気グラディオメトリーのために開発された最近の実験プラットフォームから選択される。 MEMSはそのような用途に多くの利点を提供するが、検出された信号は通常、生体磁場の低周波状態において感度を低下させるため、デバイスの共振周波数でなければならない。カシミール駆動パラメトリック増幅器を用いて,MEMS単点勾配計の最適分解能が1万倍向上し,最大感度は1Hzで6Hz/(pT/cm)であった。提案した設計は、気象学に革命をもたらす可能性があり、特に環境条件下での生体磁場の非シールドモニタリングを可能にする可能性がある。

The Casimir Force, a quantum mechanical effect, has been observed in several microelectromechanical systems (MEMS) platforms. Due to its extreme sensitivity to the separation of two objects, the Casimir Force has been proposed as an excellent avenue for quantum metrology. Practical application, however, is challenging due to attractive forces leading to stiction and failure of the device, called Casimir pull-in. In this work, we design and simulate a Casimir-driven metrology platform, where a time-delay based parametric amplification technique is developed to achieve a steady state and avoid pull-in. We apply the design to the detection of weak, low frequency, gradient magnetic fields, similar to those emanating from ionic currents in the heart and brain. Simulation parameters are selected from recent experimental platforms developed for Casimir metrology and magnetic gradiometry, both on MEMS platforms. While MEMS offer many advantages to such an application, the detected signal must typically be at the resonant frequency of the device, with diminished sensitivity in the low frequency regime of biomagnetic fields. Using a Casimir-driven parametric amplifier, we report a 10,000-fold improvement in the best-case resolution of MEMS single-point gradiometers, with a maximum sensitivity of 6 Hz/(pT/cm) at 1 Hz. The development of the proposed design has the potential to revolutionize metrology, and specifically may enable unshielded monitoring of biomagnetic fields in ambient conditions.
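
To give a feel for the "extreme sensitivity to the separation" mentioned above, here is a rough sketch using the textbook result for two ideal, perfectly conducting parallel plates at zero temperature, $P = \pi^2 \hbar c / (240\, d^4)$. This geometry is an assumption made only for illustration; the abstract does not specify the device geometry, and a real MEMS device would need corrections that are not modelled here.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant, J*s
C = 299792458.0         # speed of light, m/s


def casimir_pressure(separation_m):
    """Magnitude of the attractive pressure (Pa) between ideal parallel plates.

    Idealized assumption: perfect conductors, zero temperature, flat plates.
    """
    return math.pi ** 2 * HBAR * C / (240 * separation_m ** 4)


# The d**-4 scaling means halving the gap multiplies the pressure by 16.
for d_nm in (50, 100, 200, 400):
    print(f"{d_nm:4d} nm -> {casimir_pressure(d_nm * 1e-9):.3e} Pa")
```

The steep inverse-fourth-power growth at small gaps is also what makes pull-in a concern: a small reduction in separation produces a much larger attractive force.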

翻訳日:2023-03-25 18:31:30 公開日:2021-06-23

# 光遠心分離型ガス混合系における状態及び分子選択的回転制御

State- and molecule-selective rotational control in gas mixtures with a shaped optical centrifuge ( http://arxiv.org/abs/2106.12468v1 )

ライセンス: Link先を確認

P. Amani, A. A. Milner, V. Milner

(参考訳) ガス混合系におけるオールオプティカル選択的回転制御法を実験的に示す。線形偏光が加速速度で回転する強いレーザーパルスである光遠心子を用いて、2つの異なる分子種を同時に2つの異なる回転周波数に励起する。新しいレベルの制御は、遠心分離分子の回転スペクトルに従って遠心スペクトルを形成することで達成される。形状の光学遠心分離機は、1つの分子種を他の分子よりも早く放出し、ターゲットの回転周波数と対応する回転状態とを分離する。この技術は、分子回転が衝突や化学反応に与える影響の研究において、回転制御の有用性を拡大する。

We demonstrate experimentally a method of all-optical selective rotational control in gas mixtures. Using an optical centrifuge - an intense laser pulse whose linear polarization rotates at an accelerated rate, we simultaneously excite two different molecular species to two different rotational frequencies of choice. The new level of control is achieved by shaping the centrifuge spectrum according to the rotational spectra of the centrifuged molecules. The shaped optical centrifuge releases one molecular species earlier than the other, therefore separating their target rotational frequencies and corresponding rotational states. The technique will expand the utility of rotational control in the studies of the effects of molecular rotation on collisions and chemical reactions.

翻訳日:2023-03-25 18:31:04 公開日:2021-06-23

# 深部ネットワークにおけるアナログ回路の展望

Prospects for Analog Circuits in Deep Networks ( http://arxiv.org/abs/2106.12444v1 )

ライセンス: Link先を確認

Shih-Chii Liu, John Paul Strachan, Arindam Basu

(参考訳) 機械学習のal-gorithms(例:addsとsoft max)で一般的に使用される演算は、compactアナログ回路で実装できる。 Analog Application-Specific Integrated Circuit (ASIC) は、電荷共有回路やサブスレッショルドトランジスタなどの技術を用いてこれらのアルゴリズムを実装し、非常に高い電力効率を実現する。近年のディープラーニングアルゴリズムの進歩により、一般的な行列ベクトル乗算処理を実装するハードウェアデジタルアクセラレータの設計に焦点が移った。これらの設計の電力は通常、ネットワークの重みとアクティベーションを保持するのに必要なオフチップDRAMのメモリアクセスパワーによって支配される。複雑な非揮発性メモリ技術はオンチップメモリを提供するのに役立ち、アナログ回路はインコンピュータメモリアプローチと組み合わせて必要な乗算ベクトル演算を実装するのに適している。本稿では,様々な機械学習アルゴリズムを実装したアナログ設計について概説する。そして、エッジや小さな機械学習アプリケーションに適した低消費電力ディープネットワークアクセラレータでofanalog回路を使用するための展望を示す。

Operations typically used in machine learning algorithms (e.g. adds and soft max) can be implemented by compact analog circuits. Analog Application-Specific Integrated Circuit (ASIC) designs that implement these algorithms using techniques such as charge sharing circuits and subthreshold transistors, achieve very high power efficiencies. With the recent advances in deep learning algorithms, focus has shifted to hardware digital accelerator designs that implement the prevalent matrix-vector multiplication operations. Power in these designs is usually dominated by the memory access power of off-chip DRAM needed for storing the network weights and activations. Emerging dense non-volatile memory technologies can help to provide on-chip memory and analog circuits can be well suited to implement the needed multiplication-vector operations coupled with in-computing memory approaches. This paper presents a brief review of analog designs that implement various machine learning algorithms. It then presents an outlook for the use of analog circuits in low-power deep network accelerators suitable for edge or tiny machine learning applications.

翻訳日:2023-03-25 18:30:10 公開日:2021-06-23

# 量子計量に基づく普遍的半古典方程式

Universal semiclassical equations based on the quantum metric ( http://arxiv.org/abs/2106.12383v1 )

ライセンス: Link先を確認

C. Leblanc and G. Malpuech and D. D. Solnyshkov

(参考訳) 二バンド系における加速波束に対する半古典的運動方程式を導出する。これらの方程式は、量子計量によって記述される静的バンド幾何を用いて定式化できることを示す。我々は、ゼーマン項の有無にかかわらず、ラシュバ・ハミルトニアンの特定の場合を考える。半古典的軌道はシュリンガー方程式の解法から得られるものと完全に一致している。この形式主義は、伝統的なベリー曲率による断熱限界と異常ホール効果をうまく記述した。また、コヒーレントバンド重ね合わせの反対の極限を記述し、空間的に振動するZitterbewegung運動を引き起こす。 k=0$で、そのような波束は実空間において円軌道を示し、その半径は量子計量の平方根によって与えられる。この量は普遍長スケールとして現れ、コンプトン波長の幾何学的起源を与える。

We derive semiclassical equations of motion for an accelerated wavepacket in a two-band system. We show that these equations can be formulated in terms of the static band geometry described by the quantum metric. We consider the specific cases of the Rashba Hamiltonian with and without a Zeeman term. The semiclassical trajectories are in full agreement with the ones found by solving the Schrödinger equation. This formalism successfully describes the adiabatic limit and the anomalous Hall effect traditionally attributed to Berry curvature. It also describes the opposite limit of coherent band superposition giving rise to a spatially oscillating Zitterbewegung motion. At $k=0$, such a wavepacket exhibits a circular trajectory in real space, with its radius given by the square root of the quantum metric. This quantity appears as a universal length scale, providing a geometrical origin of the Compton wavelength.
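
As a small numerical illustration of the quantum metric for a two-band model, the sketch below uses the standard two-level expression $g_{ij} = \tfrac{1}{4}\,\partial_i \mathbf{n} \cdot \partial_j \mathbf{n}$, where $\mathbf{n}$ is the unit Bloch vector of the Hamiltonian $H = \mathbf{d}(\mathbf{k}) \cdot \boldsymbol{\sigma}$. The Rashba convention $\mathbf{d} = (\alpha k_y, -\alpha k_x, \Delta)$, the parameter names `alpha` and `delta`, and the finite-difference step are assumptions made here for illustration; they are not taken from the paper.

```python
import math


def bloch_vector(kx, ky, alpha, delta):
    """Unit vector n = d/|d| for d = (alpha*ky, -alpha*kx, delta)."""
    dx, dy, dz = alpha * ky, -alpha * kx, delta
    norm = math.sqrt(dx * dx + dy * dy + dz * dz)  # must be nonzero
    return (dx / norm, dy / norm, dz / norm)


def quantum_metric(kx, ky, alpha, delta, h=1e-5):
    """Return (g_xx, g_xy, g_yy) using central finite differences of n."""
    n_xp = bloch_vector(kx + h, ky, alpha, delta)
    n_xm = bloch_vector(kx - h, ky, alpha, delta)
    n_yp = bloch_vector(kx, ky + h, alpha, delta)
    n_ym = bloch_vector(kx, ky - h, alpha, delta)
    dn_dx = [(p - m) / (2 * h) for p, m in zip(n_xp, n_xm)]
    dn_dy = [(p - m) / (2 * h) for p, m in zip(n_yp, n_ym)]

    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    return dot(dn_dx, dn_dx) / 4, dot(dn_dx, dn_dy) / 4, dot(dn_dy, dn_dy) / 4


# With a Zeeman term, the metric at k = 0 is finite: g_xx = alpha**2 / (4 * delta**2).
alpha, delta = 1.0, 0.5
g_xx, g_xy, g_yy = quantum_metric(0.0, 0.0, alpha, delta)
print(math.sqrt(g_xx), alpha / (2 * abs(delta)))  # the two values should agree closely
```

Under these assumptions the length scale $\sqrt{g_{xx}}$ at $k=0$ equals $\alpha / (2|\Delta|)$, which is the kind of quantity the abstract identifies with the radius of the circular trajectory. Without the Zeeman term the Bloch vector is undefined at $k=0$, so the function should only be evaluated away from the origin in that case.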

翻訳日:2023-03-25 18:28:56 公開日:2021-06-23

# 絡み合った状態からのマジック状態蒸留

Magic State Distillation from Entangled States ( http://arxiv.org/abs/2106.12591v1 )

ライセンス: Link先を確認

Ning Bao, ChunJun Cao, Vincent Paul Su

(参考訳) マジックは、凝縮物系の低エネルギー状態のような多体絡み合い状態において非局所的に分散することができる。ブラヴィイ・キタエフ・マジックステート蒸留プロトコルを用いて、非局所魔法は蒸留可能であり、蒸留結果を改善することができる。いくつかの明確な例を分析し、スピンスクイージングにより蒸留不能状態が蒸留可能状態に変換することができることを示した。また, 魔法蒸留プロトコルによって仮定される従来の製品入力状態は, 蒸留可能な魔法を持つ一般の州では非常に非定型的であることも示唆した。さらに、高い確率でマジック状態を与える様々なエンタングル入力を研究する必要性を正当化している。

Magic can be distributed non-locally in many-body entangled states, such as the low energy states of condensed matter systems. Using the Bravyi-Kitaev magic state distillation protocol, we find that non-local magic is distillable and can improve the distillation outcome. We analyze a few explicit examples and show that spin squeezing can be used to convert non-distillable states into distillable ones. Our analysis also suggests that the conventional product input states assumed by magic distillation protocols are extremely atypical among general states with distillable magic. It further justifies the need for studying a diverse range of entangled inputs that yield magic states with high probability.

翻訳日:2023-03-25 18:22:08 公開日:2021-06-23

# 演算子のユニタリ分解によるオープン量子系の量子シミュレーション

Quantum Simulation of Open Quantum Systems Using a Unitary Decomposition of Operators ( http://arxiv.org/abs/2106.12588v1 )

ライセンス: Link先を確認

Anthony W. Schlimgen, Kade Head-Marsden, LeeAnn M. Sager, Prineha Narang, and David A. Mazziotti

(参考訳) 現実の物理系と化学系における電子輸送は、しばしば大きな環境と非自明なエネルギー交換を伴い、開量子系の定義と処理を必要とする。オープン量子系の時間進化は非ユニタリ演算子を用いるため、オープン量子系のシミュレーションはユニタリ演算子やゲートのみから構築される普遍量子コンピュータの課題を示す。本稿では、量子デバイス上の任意の状態に対する非ユニタリ作用素の作用を実装するための一般的なアルゴリズムを提案する。任意の量子作用素は、少なくとも4つのユニタリ作用素の線型結合として正確に分解できることを示す。本手法を,ゼロおよび有限温度振幅減衰チャネルの2レベル系で実証する。結果は古典計算と一致しており、中間および将来の量子デバイス上での非ユニタリ操作をシミュレートする可能性を示している。

Electron transport in realistic physical and chemical systems often involves the non-trivial exchange of energy with a large environment, requiring the definition and treatment of open quantum systems. Because the time evolution of an open quantum system employs a non-unitary operator, the simulation of open quantum systems presents a challenge for universal quantum computers constructed from only unitary operators or gates. Here we present a general algorithm for implementing the action of any non-unitary operator on an arbitrary state on a quantum device. We show that any quantum operator can be exactly decomposed as a linear combination of at most four unitary operators. We demonstrate this method on a two-level system in both zero and finite temperature amplitude damping channels. The results are in agreement with classical calculations, showing promise in simulating non-unitary operations on intermediate-term and future quantum devices.
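
The idea of writing a non-unitary operator as a linear combination of unitaries can be made concrete for the amplitude damping channel mentioned above. The sketch below is an illustration only and is not claimed to be the authors' algorithm: it uses the usual Kraus operators $K_0 = \mathrm{diag}(1, \sqrt{1-\gamma})$ and $K_1 = \sqrt{\gamma}\,|0\rangle\langle 1|$, and the elementary identities $K_0 = (U + U^\dagger)/2$ with $U = \mathrm{diag}(1, e^{i\theta})$, $\cos\theta = \sqrt{1-\gamma}$, and $K_1 = (\sqrt{\gamma}/2)(X + iY)$. All function names are hypothetical.

```python
import cmath
import math

IDENTITY = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]


def lincomb(terms):
    """Sum of coefficient * matrix over (coefficient, matrix) pairs."""
    return [[sum(c * m[i][j] for c, m in terms) for j in range(2)]
            for i in range(2)]


def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))


def is_unitary(u):
    return close(matmul(dagger(u), u), IDENTITY)


def amplitude_damping_kraus(gamma):
    """Kraus operators of the zero-temperature amplitude damping channel."""
    k0 = [[1, 0], [0, math.sqrt(1 - gamma)]]
    k1 = [[0, math.sqrt(gamma)], [0, 0]]
    return k0, k1


def decompose_kraus(gamma):
    """Each Kraus operator as a list of (coefficient, unitary) pairs."""
    theta = math.acos(math.sqrt(1 - gamma))
    u = [[1, 0], [0, cmath.exp(1j * theta)]]
    k0_terms = [(0.5, u), (0.5, dagger(u))]   # Hermitian part only
    amp = math.sqrt(gamma) / 2
    k1_terms = [(amp, X), (1j * amp, Y)]      # Hermitian + anti-Hermitian part
    return k0_terms, k1_terms


k0, k1 = amplitude_damping_kraus(0.3)
k0_terms, k1_terms = decompose_kraus(0.3)
print(close(lincomb(k0_terms), k0), close(lincomb(k1_terms), k1))
```

Each Kraus operator here needs only two unitaries because $K_0$ is Hermitian and $K_1$ splits into one Hermitian and one anti-Hermitian Pauli term; a general operator with both parts nontrivial would need up to four, consistent with the bound quoted in the abstract.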

翻訳日:2023-03-25 18:21:55 公開日:2021-06-23

# 弱重力場における相対論的粒子運動と量子光学

Relativistic Particle Motion and Quantum Optics in a Weak Gravitational Field ( http://arxiv.org/abs/2106.12514v1 )

ライセンス: Link先を確認

Charis Anastopoulos and Bei-Lok Hu

(参考訳) 宇宙における長いベースライン量子実験の可能性は、弱い重力場における相対論的量子粒子の時間的進化をよりよく理解する必要がある。従来の量子光学と量子力学に基づく原子物理学による従来の処理が、局所性、同時性、シグナリング、因果性などの問題に直面したときに不適切になる理由を説明する。量子場理論が必要である。重力効果を加えると、曲線時空(qftcst)における場の量子論に導かれる。この確立された理論は、重力と量子理論の基礎、および相対論的設定における量子情報理論の基礎概念をテストする、提案された宇宙実験の大規模なクラスに対する標準参照理論として機能するべきである。これは、qftcstの観点から宇宙空間における近距離量子光学および物質波実験を扱う一連の論文の最初のものである。我々はQFTCSTを用いた光子及びスカラー粒子の量子運動と干渉計実験への応用について分析した。我々の主な結果は、光子の場合、弱い重力場は不均質誘電体と完全に等しい順序に導かれるため、光媒体の理論からよく知られた概念を用いて、曲面空間における量子光学実験を記述できるということである。また、一階の量子コヒーレンスをプローブする干渉実験、共変粒子検出理論の重要性、到着時刻の関連性についても論じる。内部構造を持つ大質量粒子に対しては、励起内部状態に起因する異なる重力質量に由来する新しい重力誘起相転移を同定する。この位相シフトは、宇宙実験で原理的に測定することができる。

The possibility of long-baseline quantum experiments in space makes it necessary to better understand the time evolution of relativistic quantum particles in a weakly varying gravitational field. We explain why conventional treatments by traditional quantum optics and atomic physics based on quantum mechanics may become inadequate when faced with issues related to locality, simultaneity, signaling, causality, etc. Quantum field theory is needed. Adding the effects of gravitation, we are led to Quantum Field Theory in Curved Spacetime (QFTCST). This well-established theory should serve as the canonical reference theory to a large class of proposed space experiments testing the foundations of gravitation and quantum theory, and the basic notions of quantum information theory in relativistic settings. This is the first in a series of papers treating near-term quantum optics and matter waves experiments in space from the perspective of QFTCST. We analyze the quantum motion of photons and of scalar massive particles using QFTCST with application to interferometer experiments. Our main result is that, for photons, the weak gravitational field is to leading order completely equivalent to an inhomogeneous dielectric, thus allowing for a description of quantum optics experiments in curved space using familiar notions from the theory of optical media. We also discuss interference experiments that probe first-order quantum coherence, the importance of a covariant particle detection theory, and the relevance of time of arrival measurements. For massive particles with internal structure, we identify a novel gravity-induced phase shift that originates from the different gravitational masses attributed to the excited internal states. This phase shift can in principle be measured in space experiments.

翻訳日:2023-03-25 18:20:32 公開日:2021-06-23

# ソーシャルエンジニアリング:概念,技術,セキュリティ対策

Social engineering: Concepts, Techniques and Security Countermeasures ( http://arxiv.org/abs/2107.14082v1 )

ライセンス: Link先を確認

Adib Mohammed Syed

(参考訳) 本報告の目的は,サイバーセキュリティにおける社会工学と呼ばれるトピックを調査し,実社会工学の意味,概念,技術,セキュリティ対策について,事実的な学術研究に基づいて解説することである。

The purpose of this report is to research the topic called Social Engineering in Cyber Security and present the explanation of the meaning, concepts, techniques, and security countermeasures of Social Engineering based on factual academic research.

翻訳日:2023-03-25 18:13:09 公開日:2021-06-23

# 気候変動のための量子技術:予備評価

Quantum technologies for climate change: Preliminary assessment ( http://arxiv.org/abs/2107.05362v1 )

ライセンス: Link先を確認

Casey Berger, Agustin Di Paolo, Tracey Forrest, Stuart Hadfield, Nicolas Sawaya, Michał Stęchły and Karl Thibault

(参考訳) 気候変動は、人間社会や地球の生態系にもっと一般的な脅威をもたらす。緩和戦略は自然に科学、工学、経済学の幅広い課題を解決する必要がある。この文脈では、コンピュータ、センシング、通信における量子技術が急速に発展し、気候変動の影響を診断し緩和するための有用なツールになり得る。しかし、気候と量子科学の交わりはほとんど解明されていない。本報告は, 物理システムのシミュレーション, 組合せ最適化, センシング, エネルギー効率という4つの分野に着目し, 気候変動における量子技術の潜在的高インパクト利用事例を明らかにすることを目的としている。このレポートは、気候と量子科学のコミュニティを結びつける上で有用なリソースを提供してくれることを願っています。

Climate change presents an existential threat to human societies and the Earth's ecosystems more generally. Mitigation strategies naturally require solving a wide range of challenging problems in science, engineering, and economics. In this context, rapidly developing quantum technologies in computing, sensing, and communication could become useful tools to diagnose and help mitigate the effects of climate change. However, the intersection between climate and quantum sciences remains largely unexplored. This preliminary report aims to identify potential high-impact use-cases of quantum technologies for climate change with a focus on four main areas: simulating physical systems, combinatorial optimization, sensing, and energy efficiency. We hope this report provides a useful resource towards connecting the climate and quantum science communities, and to this end we identify relevant research questions and next steps.

翻訳日:2023-03-25 18:13:04 公開日:2021-06-23

# Analisis Kualitas Layanan website E-Commerce Bukalapak Terhadap Kepuasan Pengguna Mahasiswa Universitas Bina Darma Menggunakan Metode Webqual 4.0

Analisis Kualitas Layanan Website E-Commerce Bukalapak Terhadap Kepuasan Pengguna Mahasiswa Universitas Bina Darma Menggunakan Metode Webqual 4.0 ( http://arxiv.org/abs/2106.15342v1 )

ライセンス: Link先を確認

Adellia, Leon Andretti Abdillah

(参考訳) 新しいテクノロジーの成長は、オンラインで行うプロダクトマーケティングを動機付けている。オンライン開発をサポートする要因の1つは、オンラインの売買サイトやElectronic Commerceである。電子商取引の支持要因の1つはウェブサイトの利用である。 Webサイト(Webサイト、英: web)は、様々なテキスト情報、データ、静止画像、アニメーションデータ、サウンド、ビデオ、静的および動的の両方を表示するページの集合として解釈できるメディアの一種である。電子商取引企業はWebを通じて消費者と対話し、そのうちの1つはBukalapakのウェブサイトである。ウェブサイトの品質を決定するためには、測定する必要がある。 Webサイトの品質を測定することで、Webサイトに対するユーザの認識を見ることができる。本研究では,ユーザビリティ,情報品質,ユーザ満足度に関するインタラクション品質という3次元からなる webqual 4.0 を用いた。使用するデータは、アンケートを配布して元の情報源から直接得られるデータソースである一次データである。収集したデータは104人。本研究の回答者はbina darma大学生で,webサイトを客観的に評価することが期待された。

The growth of new technology motivates some product marketing to be done online. One of the factors that support online development is online buying and selling sites, or Electronic Commerce. One of the supporting factors for Electronic Commerce is the use of a website. A website, also commonly called the web, is a form of media that can be interpreted as a collection of pages that display various kinds of text information, data, still or moving images, animation data, sound, and video, both static and dynamic. Electronic Commerce companies interact with consumers through the web; one of these is the Bukalapak website, an online site provider for buying and selling products to be marketed. To determine the quality of a website, it is necessary to measure it. Measuring the quality of a website reveals the users' perception of the website. This study uses the Webqual 4.0 method, which consists of three dimensions, namely usability, information quality, and interaction quality, in relation to user satisfaction. The data used are primary data, that is, data obtained directly from the original source by distributing questionnaires. Data were obtained from a total of 104 respondents. Respondents in this study were Bina Darma University students, who were expected to provide an objective assessment of the website to be analyzed.

翻訳日:2023-03-25 18:12:13 公開日:2021-06-23

# マルチタスク強化学習における階層型メモリ予測マシンの進化

Evolving Hierarchical Memory-Prediction Machines in Multi-Task Reinforcement Learning ( http://arxiv.org/abs/2106.12659v1 )

ライセンス: Link先を確認

Stephen Kelly, Tatiana Voegerl, Wolfgang Banzhaf, Cedric Gondro

(参考訳) 行動の基本的な側面は、記憶における経験の突出した特徴をエンコードし、これらの記憶を現在の感覚情報と組み合わせて、長期的な目標を最大化するような各状況に対する最善の行動を予測する能力である。世界は非常にダイナミックで、行動エージェントは時間とともに様々な環境や目的にまたがって一般化する必要がある。このシナリオは、部分的に観測可能なマルチタスク強化学習問題としてモデル化することができる。遺伝的プログラミングを用いて、OpenAIのClassic Controlスイートを含む6つのユニークな環境で動作可能な、高度に一般化されたエージェントを進化させる。これはエージェントが離散的および連続的なアクションを同時にサポートする必要がある。タスク識別センサーの入力は提供されないため、エージェントは状態変数のダイナミクスからタスクを識別し、各タスクの制御ポリシーを定義する必要がある。進化するプログラムにおける創発的階層構造は、時間分解とメモリ上の問題環境の符号化を成功させるマルチタスクエージェントをもたらすことを示す。結果として得られるエージェントは、6つの環境すべてにおいてタスク固有のエージェントと競合する。さらに、プログラムの階層構造は動的実行時の複雑さを許容し、これは比較的効率的な操作をもたらす。

A fundamental aspect of behaviour is the ability to encode salient features of experience in memory and use these memories, in combination with current sensory information, to predict the best action for each situation such that long-term objectives are maximized. The world is highly dynamic, and behavioural agents must generalize across a variety of environments and objectives over time. This scenario can be modeled as a partially-observable multi-task reinforcement learning problem. We use genetic programming to evolve highly-generalized agents capable of operating in six unique environments from the control literature, including OpenAI's entire Classic Control suite. This requires the agent to support discrete and continuous actions simultaneously. No task-identification sensor inputs are provided, thus agents must identify tasks from the dynamics of state variables alone and define control policies for each task. We show that emergent hierarchical structure in the evolving programs leads to multi-task agents that succeed by performing a temporal decomposition and encoding of the problem environments in memory. The resulting agents are competitive with task-specific agents in all six environments. Furthermore, the hierarchical structure of programs allows for dynamic run-time complexity, which results in relatively efficient operation.

翻訳日:2023-03-25 18:11:48 公開日:2021-06-23

# ブリルアンゾーンにおけるトンネリング: バレーホールエッジチャネルにおける後方散乱の理論

Tunneling in the Brillouin Zone: Theory of Backscattering in Valley Hall Edge Channels ( http://arxiv.org/abs/2106.12646v1 )

ライセンス: Link先を確認

Tirth Shah, Florian Marquardt, and Vittorio Peano

(参考訳) 最近の多くの実験が、光子やフォノンなどのボソン系におけるトポロジカル輸送を探索している。その大部分では時間反転対称性が保たれ、幾何構造を適切に選ぶことでバンド構造が設計され、高対称点近傍にトポロジカルに非自明なバンドギャップが生成される。しかし、この場合、大きな準運動量の移行を伴う後方散乱が起こりうるため、トポロジカル保護が破壊される可能性が残る。これまでのところ、この効果を十分に抑制できる条件が正確に何であるかは不明であった。本研究では、運動量空間におけるトンネル遷移の包括的な半古典理論を導入し、バレーホール効果に基づく最も重要な系のクラスの一つについて後方散乱を記述する。滑らかなドメイン壁であっても、壁の局所的な傾きとエネルギーの両方によって決まる位置に実効的な散乱中心が生じると予測する。さらに、本理論は、ドメイン壁の滑らかさの増加に伴う全体の反射振幅の指数関数的抑制を定量的に解析する。

A large set of recent experiments has been exploring topological transport in bosonic systems, e.g. of photons or phonons. In the vast majority, time-reversal symmetry is preserved, and band structures are engineered by a suitable choice of geometry, to produce topologically nontrivial bandgaps in the vicinity of high-symmetry points. However, this leaves open the possibility of large-quasimomentum backscattering, destroying the topological protection. Up to now, it has been unclear what precisely are the conditions where this effect can be sufficiently suppressed. In the present work, we introduce a comprehensive semiclassical theory of tunneling transitions in momentum space, describing backscattering for one of the most important system classes, based on the valley Hall effect. We predict that even for a smooth domain wall effective scattering centres develop at locations determined by both the local slope of the wall and the energy. Moreover, our theory provides a quantitative analysis of the exponential suppression of the overall reflection amplitude with increasing domain wall smoothness.

翻訳日:2023-03-25 18:10:17 公開日:2021-06-23

# アナログ量子シミュレータにおける電流の非侵襲計測

Non-invasive measurement of currents in analog quantum simulators ( http://arxiv.org/abs/2106.12599v1 )

ライセンス: Link先を確認

Kevin T. Geier, Janika Reichstetter, Philipp Hauke

(参考訳) アナログ量子シミュレータは量子ダイナミクスを研究する優れた能力を持つが、電流を検出する手段は限られている。本稿では、系をアンシラに弱く結合させ、続いてアンシラの占有数を測定することで、量子多体系の電流を測定する柔軟な非侵襲的手法を提案する。ハーパー・ホフスタッター光格子ラダーにおける相互作用するボソンを例として本手法を数値的に評価し、実験上起こりうる誤差要因について考察する。この非常に柔軟なプロトコルは、ハードコアおよびソフトコアのボソンとフェルミオンの両方に使用でき、電流-電流相関のようなより一般的な観測量に容易に拡張でき、イオントラップのプラットフォームで例示するように、冷却原子以外の系にも適用できる。

Despite the pristine abilities of analog quantum simulators to study quantum dynamics, possibilities to detect currents are sparse. Here, we propose a flexible non-invasive technique to measure currents in quantum many-body systems by weakly coupling the system to an ancilla, followed by a measurement of the ancilla population. We numerically benchmark the scheme at the example of interacting bosons in a Harper-Hofstadter optical-lattice ladder, and discuss potential experimental error sources. The highly flexible protocol can be used with both hard-core and soft-core bosons as well as fermions, is easily extendable to more general observables like current-current correlations, and applies to other setups beyond cold atoms as we exemplify for the trapped-ion platform.

翻訳日:2023-03-25 18:10:00 公開日:2021-06-23

# Graph Universal Adversarial Attacks: 少数の悪役がグラフ学習モデルを台無しにする

Graph Universal Adversarial Attacks: A Few Bad Actors Ruin Graph Learning Models ( http://arxiv.org/abs/2002.04784v2 )

ライセンス: Link先を確認

Xiao Zang, Yi Xie, Jie Chen, Bo Yuan

(参考訳) ディープニューラルネットワークはよく汎化する一方で、小さな敵対的摂動に敏感であることが知られている。この現象は深刻なセキュリティ上の脅威をもたらし、ディープラーニングモデルの堅牢性についての詳細な調査を必要とする。グラフ構造データのためのニューラルネットワークの登場に伴い、その堅牢性を理解するために同様の調査が求められている。グラフ構造やノード特徴を敵対的に摂動すると、モデルの性能が著しく低下しうることが知られている。本研究では、グラフに少数の悪役(bad-actor)ノードが含まれる場合にも同様の脆弱性が生じることを、異なる角度から示す。これらのノードは、標的とする任意の被害ノードとの接続を反転させることで、訓練済みのグラフニューラルネットワークを侵害する。さらに悪いことに、あるグラフモデルで見つかった悪役ノードは、他のモデルも深刻に侵害する。我々は悪役ノードを「アンカーノード」と呼び、それらを特定するGUAというアルゴリズムを提案する。徹底した実証調査から、アンカーノードはしばしば同じクラスに属するという興味深い知見が得られ、アンカーノードの数と攻撃成功率の間の直感的なトレードオフも裏付けられた。2708ノードを含むデータセットCoraでは、わずか6つのアンカーノードで、GCNおよび他の3つのモデルに対する攻撃成功率が80%を超える。

Deep neural networks, while generalizing well, are known to be sensitive to small adversarial perturbations. This phenomenon poses a severe security threat and calls for in-depth investigation of the robustness of deep learning models. With the emergence of neural networks for graph-structured data, similar investigations are needed to understand their robustness. It has been found that adversarially perturbing the graph structure and/or node features may result in a significant degradation of the model performance. In this work, we show from a different angle that such fragility similarly occurs if the graph contains a few bad-actor nodes, which compromise a trained graph neural network by flipping the connections to any targeted victim. Worse, the bad actors found for one graph model severely compromise other models as well. We call the bad actors "anchor nodes" and propose an algorithm, named GUA, to identify them. Thorough empirical investigations suggest an interesting finding that the anchor nodes often belong to the same class; they also corroborate the intuitive trade-off between the number of anchor nodes and the attack success rate. For the dataset Cora, which contains 2708 nodes, as few as six anchor nodes result in an attack success rate higher than 80% for GCN and three other models.

翻訳日:2023-01-01 19:37:33 公開日:2021-06-23

# クロスドメインオブジェクト検出のための無バイアス平均教師

Unbiased Mean Teacher for Cross-domain Object Detection ( http://arxiv.org/abs/2003.00707v2 )

ライセンス: Link先を確認

Jinhong Deng, Wen Li, Yuhua Chen, Lixin Duan

(参考訳) 物体検出モデルはデータのばらつき、特に2つの異なるドメイン間の大きなドメインシフトに対して脆弱であることが多いため、クロスドメイン物体検出は困難である。本稿では、クロスドメイン物体検出のための新しいUnbiased Mean Teacher (UMT)モデルを提案する。クロスドメインのシナリオでは、単純な平均教師(MT)モデルにかなりのモデルバイアスがしばしば存在することを明らかにし、いくつかの単純かつ非常に効果的な戦略でそのバイアスを除去する。特に、教師モデルについては、その専門知識を最大限に活用するためのMT向けクロスドメイン蒸留法を提案する。また、学生モデルについては、画素レベルの適応で訓練サンプルを拡張することでバイアスを軽減する。最後に、教授過程については、分布外推定戦略を用いて現在のモデルに最も適合するサンプルを選択し、クロスドメイン蒸留過程をさらに強化する。これらの戦略でモデルバイアスの問題に取り組むことで、UMTモデルはベンチマークデータセットClipart1k、Watercolor2k、Foggy Cityscapes、CityscapesでそれぞれmAP 44.1%、58.1%、41.7%、43.1%を達成し、既存の最先端の結果を大きく上回る。実装は https://github.com/kinredon/umt で公開されている。

Cross-domain object detection is challenging, because object detection model is often vulnerable to data variance, especially to the considerable domain shift between two distinctive domains. In this paper, we propose a new Unbiased Mean Teacher (UMT) model for cross-domain object detection. We reveal that there often exists a considerable model bias for the simple mean teacher (MT) model in cross-domain scenarios, and eliminate the model bias with several simple yet highly effective strategies. In particular, for the teacher model, we propose a cross-domain distillation method for MT to maximally exploit the expertise of the teacher model. Moreover, for the student model, we alleviate its bias by augmenting training samples with pixel-level adaptation. Finally, for the teaching process, we employ an out-of-distribution estimation strategy to select samples that most fit the current model to further enhance the cross-domain distillation process. By tackling the model bias issue with these strategies, our UMT model achieves mAPs of 44.1%, 58.1%, 41.7%, and 43.1% on benchmark datasets Clipart1k, Watercolor2k, Foggy Cityscapes, and Cityscapes, respectively, which outperforms the existing state-of-the-art results in notable margins. Our implementation is available at https://github.com/kinredon/umt.

翻訳日:2022-12-27 05:25:20 公開日:2021-06-23

# 人間デモから長距離タスクを一般化する学習

Learning to Generalize Across Long-Horizon Tasks from Human Demonstrations ( http://arxiv.org/abs/2003.06085v2 )

ライセンス: Link先を確認

Ajay Mandlekar, Danfei Xu, Roberto Martín-Martín, Silvio Savarese, Li Fei-Fei

(参考訳) 模倣学習は、高価なランダム探索プロセスに依存しないため、現実世界でロボットポリシーを訓練するための効果的で安全な手法である。しかし、探索の欠如により、実証された行動を超えて一般化する学習方針は依然としてオープンな課題である。本稿では,ロボットの模倣学習の枠組みを提案する。 1)少数の人間のデモンストレーションから複雑な実世界の操作タスクを効率的に学習し、 2) 収集した実演に含まれない新たな行動の合成。我々の重要な洞察は、多タスク領域がしばしば潜在構造を持ち、状態空間の共通領域で異なるタスクの軌道が交差することを示すことである。本稿では,この間欠的構造を利用した2段階のオフライン模倣学習アルゴリズムであるimitation(gti)による一般化について述べる。 GTIの第1段階では、異なる実演軌跡から行動を構成する能力を持つために軌道交叉を利用する確率的ポリシーを訓練する。 GTIの第2段階では、第1段階の無条件確率ポリシーからロールアウトの小さなセットを収集し、ゴール指向エージェントをトレーニングして、新規なスタートおよびゴール設定を一般化する。我々は,実世界におけるGTIのシミュレーション領域と長距離ロボット操作領域の両面での検証を行った。追加の結果とビデオはhttps://sites.google.com/view/gti2020/で見ることができる。

Imitation learning is an effective and safe technique to train robot policies in the real world because it does not depend on an expensive random exploration process. However, due to the lack of exploration, learning policies that generalize beyond the demonstrated behaviors is still an open challenge. We present a novel imitation learning framework to enable robots to 1) learn complex real world manipulation tasks efficiently from a small number of human demonstrations, and 2) synthesize new behaviors not contained in the collected demonstrations. Our key insight is that multi-task domains often present a latent structure, where demonstrated trajectories for different tasks intersect at common regions of the state space. We present Generalization Through Imitation (GTI), a two-stage offline imitation learning algorithm that exploits this intersecting structure to train goal-directed policies that generalize to unseen start and goal state combinations. In the first stage of GTI, we train a stochastic policy that leverages trajectory intersections to have the capacity to compose behaviors from different demonstration trajectories together. In the second stage of GTI, we collect a small set of rollouts from the unconditioned stochastic policy of the first stage, and train a goal-directed agent to generalize to novel start and goal configurations. We validate GTI in both simulated domains and a challenging long-horizon robotic manipulation domain in the real world. Additional results and videos are available at https://sites.google.com/view/gti2020/ .

翻訳日:2022-12-24 01:23:44 公開日:2021-06-23

# 訴訟手続の状況予測:逐次的テキストデータに基づくアプローチ

Predicting Legal Proceedings Status: Approaches Based on Sequential Text Data ( http://arxiv.org/abs/2003.11561v4 )

ライセンス: Link先を確認

Felipe Maia Polo, Itamar Ciochetti, Emerson Bertolo

(参考訳) 本研究の目的は,ブラジルの法的手続を3段階に分類する予測モデルを開発することである。 (i)アーカイブされた手続 (ii)積極的な手続、及び (iii)停止。この問題の解決は、公共機関や民間機関が大規模な法的手続きのポートフォリオを管理し、規模と効率性を高めることを目的としている。本論文では,「運動」と呼ばれる短文の系列からなる訴訟手続について述べる。自然言語処理(NLP)と機械学習技術を組み合わせて問題解決を行った。資源不足のため、ポルトガルのNLPで作業することは難しいが、我々のアプローチは分類作業において非常にうまく行っており、最大精度は.93、最高スコアは.89(マクロ)と.93(重み)である。さらに,モデルの1つで学習したパターンを抽出・解釈し,そのパターンが分類タスクとどのように関連しているかを定量化することができた。解釈可能性のステップは、マシンラーニングの法的アプリケーションにおいて重要であり、ブラックボックスモデルがどのように意思決定を行うかに関するエキサイティングな洞察を与えてくれます。

The objective of this paper is to develop predictive models to classify Brazilian legal proceedings in three possible classes of status: (i) archived proceedings, (ii) active proceedings, and (iii) suspended proceedings. This problem's resolution is intended to assist public and private institutions in managing large portfolios of legal proceedings, providing gains in scale and efficiency. In this paper, legal proceedings are made up of sequences of short texts called "motions." We combined several natural language processing (NLP) and machine learning techniques to solve the problem. Although working with Portuguese NLP, which can be challenging due to lack of resources, our approaches performed remarkably well in the classification task, achieving maximum accuracy of .93 and top average F1 Scores of .89 (macro) and .93 (weighted). Furthermore, we could extract and interpret the patterns learned by one of our models besides quantifying how those patterns relate to the classification task. The interpretability step is important among machine learning legal applications and gives us an exciting insight into how black-box models make decisions.

翻訳日:2022-12-24 00:55:54 公開日:2021-06-23

# PO-EMO:ドイツ語・英語詩における美的感情の概念化・注釈・モデル化

PO-EMO: Conceptualization, Annotation, and Modeling of Aesthetic Emotions in German and English Poetry ( http://arxiv.org/abs/2003.07723v3 )

ライセンス: Link先を確認

Thomas Haider, Steffen Eger, Evgeny Kim, Roman Klinger, Winfried Menninghaus

(参考訳) ソーシャルメディア、文学、ニュース、その他のドメインの感情分析へのほとんどのアプローチは、ekmanやplutchikが定義する基本的な感情カテゴリのみに焦点を当てている。しかし、芸術(文学など)はより複雑で微妙な感情の幅広い範囲への関与を可能にする。それらには感情的な反応も混ざり合っていることが示されている。詩の感情は、著者がテキストで表現したものや意図したものではなく、読者によって引き起こされるものだと考える。そこで我々は,読者の審美的評価を予測可能な審美感情の集合を概念化し,複数のラベルの注釈を1行にまとめることで,その文脈内での混合感情を捉える。注意深い訓練を受けた専門家とクラウドソーシングによるアノテーション実験において,この新しい設定を評価した。専門家とのアノテーションは、kappa = .70の許容可能な一致をもたらし、将来の大規模分析のための一貫したデータセットをもたらす。最後に、BERTに基づく最初の感情分類実験を行い、ドイツのサブセットで最大.52F1-microの美的感情の識別が困難であることを示す。データとリソースはhttps://github.com/tnhaider/poetry-emotionで入手できる。

Most approaches to emotion analysis of social media, literature, news, and other domains focus exclusively on basic emotion categories as defined by Ekman or Plutchik. However, art (such as literature) enables engagement in a broader range of more complex and subtle emotions. These have been shown to also include mixed emotional responses. We consider emotions in poetry as they are elicited in the reader, rather than what is expressed in the text or intended by the author. Thus, we conceptualize a set of aesthetic emotions that are predictive of aesthetic appreciation in the reader, and allow the annotation of multiple labels per line to capture mixed emotions within their context. We evaluate this novel setting in an annotation experiment both with carefully trained experts and via crowdsourcing. Our annotation with experts leads to an acceptable agreement of kappa = .70, resulting in a consistent dataset for future large scale analysis. Finally, we conduct first emotion classification experiments based on BERT, showing that identifying aesthetic emotions is challenging in our data, with up to .52 F1-micro on the German subset. Data and resources are available at https://github.com/tnhaider/poetry-emotion

翻訳日:2022-12-22 21:11:38 公開日:2021-06-23

# ぼやけ、ノイズ、圧縮ロバストな生成型逆ネットワーク

Blur, Noise, and Compression Robust Generative Adversarial Networks ( http://arxiv.org/abs/2003.07849v2 )

ライセンス: Link先を確認

Takuhiro Kaneko, Tatsuya Harada

(参考訳) 敵対的生成ネットワーク(GAN)は、画像を再現する能力によって大きな注目を集めている。しかし、GANは、ぼけ、ノイズ、圧縮といった画像劣化がある場合でも訓練画像を忠実に再現するため、同様に劣化した画像を生成してしまう。この問題に対し、最近提案されたノイズロバストGAN(NR-GAN)は、画像生成器とノイズ生成器からなる2生成器モデルを用いて、ノイズを含む画像から直接クリーンな画像生成器を学習できることを示し、部分的な解決策を与えている。しかし、その適用はノイズに限られる。ノイズは加法的かつ可逆的な性質を持つため比較的分解しやすいが、ぼけ、圧縮、およびそれらすべての組み合わせといった不可逆的な画像劣化への適用は依然として課題である。これらの問題に対処するため、劣化パラメータ(ぼけカーネルの種類、ノイズ量、品質係数の値など)を知らなくても、劣化画像から直接クリーンな画像生成器を学習できる、ぼけ・ノイズ・圧縮ロバストGAN(BNCR-GAN)を提案する。NR-GANに着想を得て、BNCR-GANは、画像、ぼけカーネル、ノイズ、品質係数の各生成器からなる多生成器モデルを用いる。ただしNR-GANとは対照的に、不可逆的な性質に対処するため、劣化の前後をつなぐバイパスを用いて劣化強度の値をデータ駆動で調整するマスキングアーキテクチャを導入する。さらに、ぼけ、ノイズ、圧縮の組み合わせに起因する不確実性を抑えるため、劣化強度に応じて不可逆的な劣化過程の間に一貫性を課す適応的一貫性損失を導入する。CIFAR-10での大規模な比較実験とFFHQでの汎用性分析を通じて、BNCR-GANの有効性を示す。さらに、画像復元におけるBNCR-GANの適用可能性を示す。

Generative adversarial networks (GANs) have gained considerable attention owing to their ability to reproduce images. However, they can recreate training images faithfully despite image degradation in the form of blur, noise, and compression, generating similarly degraded images. To solve this problem, the recently proposed noise robust GAN (NR-GAN) provides a partial solution by demonstrating the ability to learn a clean image generator directly from noisy images using a two-generator model comprising image and noise generators. However, its application is limited to noise, which is relatively easy to decompose owing to its additive and reversible characteristics, and its application to irreversible image degradation, in the form of blur, compression, and combination of all, remains a challenge. To address these problems, we propose blur, noise, and compression robust GAN (BNCR-GAN) that can learn a clean image generator directly from degraded images without knowledge of degradation parameters (e.g., blur kernel types, noise amounts, or quality factor values). Inspired by NR-GAN, BNCR-GAN uses a multiple-generator model composed of image, blur-kernel, noise, and quality-factor generators. However, in contrast to NR-GAN, to address irreversible characteristics, we introduce masking architectures adjusting degradation strength values in a data-driven manner using bypasses before and after degradation. Furthermore, to suppress uncertainty caused by the combination of blur, noise, and compression, we introduce adaptive consistency losses imposing consistency between irreversible degradation processes according to the degradation strengths. We demonstrate the effectiveness of BNCR-GAN through large-scale comparative studies on CIFAR-10 and a generality analysis on FFHQ. In addition, we demonstrate the applicability of BNCR-GAN in image restoration.

翻訳日:2022-12-22 20:18:33 公開日:2021-06-23

# 自然言語処理のための事前学習モデル:調査

Pre-trained Models for Natural Language Processing: A Survey ( http://arxiv.org/abs/2003.08271v4 )

ライセンス: Link先を確認

Xipeng Qiu, Tianxiang Sun, Yige Xu, Yunfan Shao, Ning Dai, and Xuanjing Huang

(参考訳) 近年,事前学習モデル(PTM)の出現により,自然言語処理(NLP)が新たな時代を迎えている。本調査では,NLP 用 PTM について概説する。まず,言語表現学習とその研究の進展について紹介する。そして,4つの観点から,既存のPTMを分類的に分類する。次に,PTMの知識を下流タスクに適応させる方法について述べる。最後に,今後の研究に向けた PTM の可能性について概説する。この調査は、様々なNLPタスクに対するPTMの理解、利用、開発のためのハンズオンガイドになることを目的としている。

Recently, the emergence of pre-trained models (PTMs) has brought natural language processing (NLP) to a new era. In this survey, we provide a comprehensive review of PTMs for NLP. We first briefly introduce language representation learning and its research progress. Then we systematically categorize existing PTMs based on a taxonomy with four perspectives. Next, we describe how to adapt the knowledge of PTMs to the downstream tasks. Finally, we outline some potential directions of PTMs for future research. This survey is purposed to be a hands-on guide for understanding, using, and developing PTMs for various NLP tasks.

翻訳日:2022-12-22 09:31:44 公開日:2021-06-23

# 集中効果を持つマクロアクションを用いた効率的なブラックボックス計画

Efficient Black-Box Planning Using Macro-Actions with Focused Effects ( http://arxiv.org/abs/2004.13242v3 )

ライセンス: Link先を確認

Cameron Allen, Michael Katz, Tim Klinger, George Konidaris, Matthew Riemer, Gerald Tesauro

(参考訳) 決定論的計画の難しさは探索木深度とともに指数関数的に増加する。ブラックボックスプランニングは、プランナーがドメインの明示的なモデルなしで運用する必要があるため、さらに大きな課題となる。ヒューリスティックは検索をより効率的にするが、ブラックボックス計画のための目標認識ヒューリスティックは通常ゴールカウントに依存している。本稿では,目標数ヒューリスティックをより正確にするマクロアクションの発見によって,この制限を克服する方法を示す。提案手法は,目標数ヒューリスティックによる仮定とよく一致した,集中した効果(つまり少数の状態変数のみを修飾するマクロ)を持つマクロアクションを探索する。フォーカスされたマクロは、幅広い計画領域におけるブラックボックス計画効率を劇的に改善し、時には完全なドメインモデルへのアクセスで最先端のプランナーを圧倒する。

The difficulty of deterministic planning increases exponentially with search-tree depth. Black-box planning presents an even greater challenge, since planners must operate without an explicit model of the domain. Heuristics can make search more efficient, but goal-aware heuristics for black-box planning usually rely on goal counting, which is often quite uninformative. In this work, we show how to overcome this limitation by discovering macro-actions that make the goal-count heuristic more accurate. Our approach searches for macro-actions with focused effects (i.e. macros that modify only a small number of state variables), which align well with the assumptions made by the goal-count heuristic. Focused macros dramatically improve black-box planning efficiency across a wide range of planning domains, sometimes beating even state-of-the-art planners with access to a full domain model.

翻訳日:2022-12-08 21:59:49 公開日:2021-06-23

# カテゴリカルデータセットのクラスタリングのための効率的な$k$-modesアルゴリズム

An Efficient $k$-modes Algorithm for Clustering Categorical Datasets ( http://arxiv.org/abs/2006.03936v3 )

ライセンス: Link先を確認

Karin S. Dorman and Ranjan Maitra

(参考訳) データからクラスタを抽出することは、多くの応用において重要な課題である。$k$-means法は、数値データをクラスタリングするための、広く使われ効率的で分布を仮定しない手法であるが、カテゴリ値の観測には適用できない。$k$-modes法は、$k$-meansの目的関数においてユークリッド距離をハミング距離に、平均をモード(最頻値)に置き換えることで、この欠落に対処する。我々は、OTQTと呼ぶ、新規で計算効率の高い$k$-modesの実装を提供する。OTQTが、既存の$k$-modesアルゴリズムでは検出できない、目的関数を改善する更新を見つけることを証明する。アルゴリズムの複雑さのために反復あたりではわずかに遅いが、OTQTは反復あたりでは常により正確であり、最終的な最適解にはほぼ常に速く(一部のデータセットではごくわずかに遅く)到達する。したがって、$k$-modes最適化の推奨デフォルトアルゴリズムとしてOTQTを勧める。

Mining clusters from data is an important endeavor in many applications. The $k$-means method is a popular, efficient, and distribution-free approach for clustering numerical-valued data, but does not apply for categorical-valued observations. The $k$-modes method addresses this lacuna by replacing the Euclidean with the Hamming distance and the means with the modes in the $k$-means objective function. We provide a novel, computationally efficient implementation of $k$-modes, called OTQT. We prove that OTQT finds updates to improve the objective function that are undetectable to existing $k$-modes algorithms. Although slightly slower per iteration due to algorithmic complexity, OTQT is always more accurate per iteration and almost always faster (and only barely slower on some datasets) to the final optimum. Thus, we recommend OTQT as the preferred, default algorithm for $k$-modes optimization.
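
The replacement of Euclidean distance by Hamming distance and of means by modes can be illustrated with a minimal sketch of the plain alternating $k$-modes iteration. This is not the OTQT algorithm described in the abstract; the function names (`hamming`, `column_modes`, `k_modes`) and the toy records are hypothetical and chosen only for illustration.

```python
from collections import Counter


def hamming(a, b):
    """Number of positions at which two equal-length categorical records differ."""
    return sum(x != y for x, y in zip(a, b))


def column_modes(records):
    """Per-attribute most frequent category of a non-empty list of records."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*records))


def k_modes(data, init_modes, max_iter=100):
    """Alternate assignment and mode updates until the assignment stops changing."""
    modes = [tuple(m) for m in init_modes]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each record goes to the mode with the smallest Hamming distance.
        new_labels = [
            min(range(len(modes)), key=lambda j: hamming(x, modes[j])) for x in data
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: each mode becomes the attribute-wise mode of its members.
        for j in range(len(modes)):
            members = [x for x, lab in zip(data, labels) if lab == j]
            if members:  # an empty cluster keeps its previous mode
                modes[j] = column_modes(members)
    # k-modes objective: total Hamming distance of records to their assigned modes.
    cost = sum(hamming(x, modes[lab]) for x, lab in zip(data, labels))
    return labels, modes, cost


# Toy data (hypothetical): two visibly separated groups of categorical records.
data = [
    ("a", "x", "p"), ("a", "x", "q"), ("a", "y", "p"),
    ("b", "z", "r"), ("b", "z", "s"), ("b", "w", "r"),
]
labels, modes, cost = k_modes(data, init_modes=[data[0], data[3]])
```

The returned `cost` is the quantity that any $k$-modes variant tries to decrease; the abstract's claim is that OTQT finds improving updates that an alternating scheme like this one cannot detect.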

翻訳日:2022-11-24 21:32:05 公開日:2021-06-23

# 複雑ネットワーク上の感染ダイナミクスの深層学習

Deep learning of contagion dynamics on complex networks ( http://arxiv.org/abs/2006.05410v5 )

ライセンス: Link先を確認

Charles Murphy, Edward Laurence, Antoine Allard

(参考訳) 感染力学の進化を予測することは、力学モデルが部分解のみを与えるようなオープンな問題である。数学的または計算的に計算可能となるためには、これらのモデルは仮定を単純化し、予測の量的精度とモデル化できる力学の複雑さを制限する必要がある。本稿では,ネットワーク上で動的に制御する効果的な局所機構を時系列データから学習する深層学習に基づく補完的アプローチを提案する。当社のグラフニューラルネットワークアーキテクチャは,そのダイナミクスに関する仮定をほとんど行わず,複雑化に伴う異なる伝染ダイナミクスを用いてその正確さを実証する。任意のネットワーク構造をシミュレーションすることで,学習したダイナミックスの性質を学習データを超えて探索することが可能になる。最後に,スペインにおけるcovid-19流行の実データを用いて,このアプローチの適用性を示す。この結果は,ネットワーク上での感染動態の効果的なモデルを構築するために,ディープラーニングが新たな補完的な視点を提供することを示す。

Forecasting the evolution of contagion dynamics is still an open problem to which mechanistic models only offer a partial answer. To remain mathematically or computationally tractable, these models must rely on simplifying assumptions, thereby limiting the quantitative accuracy of their predictions and the complexity of the dynamics they can model. Here, we propose a complementary approach based on deep learning where the effective local mechanisms governing a dynamic on a network are learned from time series data. Our graph neural network architecture makes very few assumptions about the dynamics, and we demonstrate its accuracy using different contagion dynamics of increasing complexity. By allowing simulations on arbitrary network structures, our approach makes it possible to explore the properties of the learned dynamics beyond the training data. Finally, we illustrate the applicability of our approach using real data of the COVID-19 outbreak in Spain. Our results demonstrate how deep learning offers a new and complementary perspective to build effective models of contagion dynamics on networks.

翻訳日:2022-11-23 15:39:55 公開日:2021-06-23

# ShapeFlow: 3次元形状の学習可能な変形

ShapeFlow: Learnable Deformations Among 3D Shapes ( http://arxiv.org/abs/2006.07982v2 )

ライセンス: Link先を確認

Chiyu "Max" Jiang, Jingwei Huang, Andrea Tagliasacchi, Leonidas Guibas

(参考訳) 本稿では,3次元形状全体の変形空間を学習するためのフローベースモデルであるshapeflowを提案する。 ShapeFlowは、形状トポロジーに非依存なマルチテンプレートの変形空間を学習できるが、微妙な幾何学的詳細を保存できる。遅延ベクトルが直接形状にデコードされる生成空間と異なり、変形空間はベクトルを連続流れにデコードし、ソース形状を目標に向けて対流させることができる。このような空間は、自然に幾何学的スタイル(元から来る)と構造的ポーズ(ターゲットに変形する)の切り離しを許す。ニューラルネットワークによって学習された連続的流れ場としてジオメトリ間の変形をパラメトリ化し、そのような変形が単射性、自己切断の自由、体積保存といった望ましい特性を持つことを保証できることを示す。本研究は, 変形による形状生成, 幾何学的様相転移, 形状のクラス全体に対する一貫したパラメータ化の教師なし学習, 形状補間など, 下流の様々な応用において学習された変形空間の有効性を示す。

We present ShapeFlow, a flow-based model for learning a deformation space for entire classes of 3D shapes with large intra-class variations. ShapeFlow allows learning a multi-template deformation space that is agnostic to shape topology, yet preserves fine geometric details. Different from a generative space where a latent vector is directly decoded into a shape, a deformation space decodes a vector into a continuous flow that can advect a source shape towards a target. Such a space naturally allows the disentanglement of geometric style (coming from the source) and structural pose (conforming to the target). We parametrize the deformation between geometries as a learned continuous flow field via a neural network and show that such deformations can be guaranteed to have desirable properties, such as bijectivity, freedom from self-intersections, or volume preservation. We illustrate the effectiveness of this learned deformation space for various downstream applications, including shape generation via deformation, geometric style transfer, unsupervised learning of a consistent parameterization for entire classes of shapes, and shape interpolation.

翻訳日:2022-11-21 13:32:46 公開日:2021-06-23

# Dual T: ラベルノイズ学習における遷移行列の推定誤差の低減

Dual T: Reducing Estimation Error for Transition Matrix in Label-noise Learning ( http://arxiv.org/abs/2006.07805v3 )

ライセンス: Link先を確認

Yu Yao, Tongliang Liu, Bo Han, Mingming Gong, Jiankang Deng, Gang Niu, Masashi Sugiyama

(参考訳) クリーンラベルからノイズラベルへの遷移関係を表す遷移行列は、ラベルノイズ学習において統計的に一致性のある分類器を構築するために不可欠である。遷移行列を推定する既存の手法は、ノイズクラス事後確率の推定に大きく依存している。しかし、ラベルノイズのランダム性のためにノイズクラス事後確率の推定誤差が大きくなり、遷移行列の推定精度が低下する可能性がある。そこで本稿では、分割統治パラダイムを活用してこの問題の解決を目指す。具体的には、ノイズクラス事後確率を直接推定することを避けるために中間クラスを導入する。この中間クラスにより、元の遷移行列は、推定が容易な2つの遷移行列の積に分解できる。提案手法をdual-T推定器と呼ぶ。理論解析と実験結果の両方が、遷移行列の推定におけるdual-T推定器の有効性を示しており、より良い分類性能につながる。

The transition matrix, denoting the transition relationship from clean labels to noisy labels, is essential to build statistically consistent classifiers in label-noise learning. Existing methods for estimating the transition matrix rely heavily on estimating the noisy class posterior. However, the estimation error for noisy class posterior could be large due to the randomness of label noise, which would lead the transition matrix to be poorly estimated. Therefore, in this paper, we aim to solve this problem by exploiting the divide-and-conquer paradigm. Specifically, we introduce an intermediate class to avoid directly estimating the noisy class posterior. By this intermediate class, the original transition matrix can then be factorized into the product of two easy-to-estimate transition matrices. We term the proposed method the dual-T estimator. Both theoretical analyses and empirical results illustrate the effectiveness of the dual-T estimator for estimating transition matrices, leading to better classification performances.
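
At the level of probabilities, the factorization mentioned in the abstract can be read as an instance of the law of total probability. The following is a sketch only: the symbols $Y$ (clean label), $\bar{Y}$ (noisy label), and $Y'$ (intermediate class) are notation assumed here and do not appear in the abstract.

```latex
\begin{align}
T_{ij} &= P(\bar{Y} = j \mid Y = i) \\
       &= \sum_{l} P(\bar{Y} = j \mid Y' = l,\, Y = i)\; P(Y' = l \mid Y = i).
\end{align}
```

In general the first factor inside the sum may still depend on the clean label $i$. Only when it does not does the sum reduce to an ordinary matrix product of a matrix with entries $P(Y' = l \mid Y = i)$ and a matrix with entries $P(\bar{Y} = j \mid Y' = l)$.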

翻訳日:2022-11-21 09:50:26 公開日:2021-06-23

# パンデミック初期段階の米国と英国における新型コロナウイルスワクチンの受容: AI生成ワクチン、未成年者への接種躊躇、および政府の役割

COVID-19 Vaccine Acceptance in the US and UK in the Early Phase of the Pandemic: AI-Generated Vaccines Hesitancy for Minors, and the Role of Governments ( http://arxiv.org/abs/2006.08164v3 )

ライセンス: Link先を確認

Gabriel Lima, Meeyoung Cha, Chiyoung Cha, Hyeyoung Hwang

(参考訳) 本研究は、パンデミック初期における新型コロナウイルス(COVID-19)ワクチン接種への国民の意思に関する調査結果を示し、被験者間計画に基づいてワクチン受容に影響しうる要因を検討する。米国と英国の成人572人からなる代表的なクォータサンプルがオンライン調査に参加した。まず、参加者の医療利用傾向と当初のワクチン受容を評価し、その後、短いビネット(場面想定文)を提示して新型コロナウイルスワクチンに対する態度の変化を評価した。データ解析には分散分析(ANOVA)と事後の対比較を用いた。参加者は、自分自身や高齢者への接種よりも、自分の子供への接種に消極的だった。ワクチン開発における人工知能(AI)の使用は、ワクチン受容に影響を与えなかった。ワクチンの高い有効性を明示したビネットは、ワクチン受容を高めた。本研究は、ウイルスに対するワクチンの有効性を強調する公共政策が接種率の向上につながりうることを示唆している。また、ワクチンの安全性に関して国民が政府に期待することについても論じ、結果に基づく一連の示唆を提示する。

This study presents survey results of the public's willingness to get vaccinated against COVID-19 during an early phase of the pandemic and examines factors that could influence vaccine acceptance based on a between-subjects design. A representative quota sample of 572 adults in the US and UK participated in an online survey. First, the participants' medical use tendencies and initial vaccine acceptance were assessed; then, short vignettes were provided to evaluate their changes in attitude towards COVID-19 vaccines. For data analysis, ANOVA and post hoc pairwise comparisons were used. The participants were more reluctant to vaccinate their children than themselves and the elderly. The use of artificial intelligence (AI) in vaccine development did not influence vaccine acceptance. Vignettes that explicitly stated the high effectiveness of vaccines led to an increase in vaccine acceptance. Our study suggests public policies emphasizing the vaccine effectiveness against the virus could lead to higher vaccination rates. We also discuss the public's expectations of governments concerning vaccine safety and present a series of implications based on our findings.

翻訳日:2022-11-21 04:44:48 公開日:2021-06-23

# 生成モデルを用いたロバスト圧縮センシング

Robust Compressed Sensing using Generative Models ( http://arxiv.org/abs/2006.09461v3 )

ライセンス: Link先を確認

Ajil Jalal, Liu Liu, Alexandros G. Dimakis, Constantine Caramanis

(参考訳) 圧縮センシングの目標は、ノイズを含む劣決定の線形方程式系から高次元ベクトルを推定することである。古典的な圧縮センシングになぞらえて、ここでは生成モデルを事前分布として仮定する。すなわち、ベクトルは深層生成モデル $G: \mathbb{R}^k \rightarrow \mathbb{R}^n$ によって表現されると仮定する。経験的リスク最小化(ERM)のような古典的な復元手法は、測定行列がサブガウスである場合に成功が保証される。しかし、測定行列や測定値の裾が重い場合、あるいは外れ値を含む場合、復元は劇的に失敗しうる。本稿では、Median-of-Means (MOM) に着想を得たアルゴリズムを提案する。我々のアルゴリズムは、外れ値が存在する場合でも、裾の重いデータに対する復元を保証する。理論的には、MOMに基づく我々の新しいアルゴリズムが、サブガウスの仮定の下でERMと同じサンプル複雑度の保証を持つことを示す。実験は我々の主張の両面を検証している。すなわち、他のアルゴリズムは実際に脆弱であり、裾の重いデータや破損したデータの下で失敗するのに対し、我々の手法は予測どおりの頑健性を示す。

The goal of compressed sensing is to estimate a high dimensional vector from an underdetermined system of noisy linear equations. In analogy to classical compressed sensing, here we assume a generative model as a prior, that is, we assume the vector is represented by a deep generative model $G: \mathbb{R}^k \rightarrow \mathbb{R}^n$. Classical recovery approaches such as empirical risk minimization (ERM) are guaranteed to succeed when the measurement matrix is sub-Gaussian. However, when the measurement matrix and measurements are heavy-tailed or have outliers, recovery may fail dramatically. In this paper we propose an algorithm inspired by the Median-of-Means (MOM). Our algorithm guarantees recovery for heavy-tailed data, even in the presence of outliers. Theoretically, our results show our novel MOM-based algorithm enjoys the same sample complexity guarantees as ERM under sub-Gaussian assumptions. Our experiments validate both aspects of our claims: other algorithms are indeed fragile and fail under heavy-tailed and/or corrupted data, while our approach exhibits the predicted robustness.
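
The robustness idea behind Median-of-Means can be shown with the basic scalar estimator. This is only the textbook building block, not the recovery algorithm proposed in the paper. The function name `median_of_means`, the contiguous block split, and the toy sample are assumptions made for illustration.

```python
import statistics


def median_of_means(values, num_blocks):
    """Split the sample into equal-sized blocks, average each block, return the median.

    Trailing values that do not fill a complete block are ignored in this sketch.
    """
    values = list(values)
    block_size = len(values) // num_blocks
    if block_size == 0:
        raise ValueError("num_blocks must not exceed the number of values")
    block_means = [
        statistics.fmean(values[i * block_size:(i + 1) * block_size])
        for i in range(num_blocks)
    ]
    return statistics.median(block_means)


# Toy sample (hypothetical): nine clean measurements and one gross outlier.
sample = [1.0] * 9 + [1000.0]
plain_mean = statistics.fmean(sample)      # pulled far away by the outlier
mom = median_of_means(sample, num_blocks=5)  # the outlier corrupts only one block
```

A single corrupted value can spoil at most one block mean, so the median over blocks is unaffected as long as most blocks stay clean. In practice the blocks are usually formed from a random partition of the sample.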

翻訳日:2022-11-20 20:19:59 公開日:2021-06-23

# Lookahead-MinmaxによるGANの処理

Taming GANs with Lookahead-Minmax ( http://arxiv.org/abs/2006.14567v3 )

ライセンス: Link先を確認

Tatjana Chavdarova, Matteo Pagliardini, Sebastian U. Stich, Francois Fleuret, Martin Jaggi

(参考訳) 敵対的生成ネットワーク(GAN)は訓練が難しいことで知られている。基礎となるminmax最適化は、確率的勾配の分散と、対応するゲームベクトル場の回転成分の影響を非常に受けやすい。これらの課題に取り組むため、もともと単一目的の最小化のために開発されたLookaheadアルゴリズムを、minmax最適化向けに提案する。Lookahead-minmaxのバックトラックステップは、回転を伴うゲームダイナミクスを自然に処理する。この性質は、文献でしばしば分析される難しい例において勾配上昇降下法を収束させる鍵であることが確認されている。さらに、最先端の性能に到達するために不可欠とされる大きなミニバッチを使わずに、高い分散を暗黙的に処理する。MNIST、SVHN、CIFAR-10、ImageNetでの実験結果は、Lookahead-minmaxをAdamまたはextragradientと組み合わせることで、無視できる程度のメモリおよび計算コストで性能と安定性が向上するという明確な利点を示している。30分の1のパラメータと16分の1のミニバッチを用いて、クラスラベルを使わずにFID 12.19を達成し、CIFAR-10におけるクラス依存型BigGANの報告性能を上回った。これにより、最先端のGAN訓練が一般的な計算資源で手の届くものとなる。

Generative Adversarial Networks are notoriously challenging to train. The underlying minmax optimization is highly susceptible to the variance of the stochastic gradient and the rotational component of the associated game vector field. To tackle these challenges, we propose the Lookahead algorithm for minmax optimization, originally developed for single objective minimization only. The backtracking step of our Lookahead-minmax naturally handles the rotational game dynamics, a property which was identified to be key for enabling gradient ascent descent methods to converge on challenging examples often analyzed in the literature. Moreover, it implicitly handles high variance without using large mini-batches, known to be essential for reaching state of the art performance. Experimental results on MNIST, SVHN, CIFAR-10, and ImageNet demonstrate a clear advantage of combining Lookahead-minmax with Adam or extragradient, in terms of performance and improved stability, for negligible memory and computational cost. Using 30-fold fewer parameters and 16-fold smaller minibatches we outperform the reported performance of the class-dependent BigGAN on CIFAR-10 by obtaining FID of 12.19 without using the class labels, bringing state-of-the-art GAN training within reach of common computational resources.
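
The effect of the backtracking step on rotational dynamics can be sketched on the bilinear toy game $\min_x \max_y xy$, a standard example on which simultaneous gradient descent-ascent spirals outward. The code below is an illustrative sketch under assumed settings (learning rate 0.1, $k = 5$ inner steps, interpolation factor $\alpha = 0.5$), not the authors' implementation. All function names are hypothetical.

```python
import math


def gda_step(x, y, lr):
    """One simultaneous gradient descent (on x) / ascent (on y) step for f(x, y) = x * y."""
    return x - lr * y, y + lr * x


def run_gda(x, y, lr, steps):
    """Plain simultaneous gradient descent-ascent."""
    for _ in range(steps):
        x, y = gda_step(x, y, lr)
    return x, y


def run_lookahead_gda(x, y, lr, steps, k=5, alpha=0.5):
    """GDA with a joint lookahead step applied to both players every k iterations."""
    slow_x, slow_y = x, y
    for t in range(1, steps + 1):
        x, y = gda_step(x, y, lr)
        if t % k == 0:
            # Backtracking: move the slow weights part of the way towards the
            # fast weights, then restart the fast weights from the slow ones.
            slow_x += alpha * (x - slow_x)
            slow_y += alpha * (y - slow_y)
            x, y = slow_x, slow_y
    return x, y


# Distance from the equilibrium (0, 0) after 1000 steps, starting from (1, 1).
gda_norm = math.hypot(*run_gda(1.0, 1.0, lr=0.1, steps=1000))
lookahead_norm = math.hypot(*run_lookahead_gda(1.0, 1.0, lr=0.1, steps=1000))
```

With these settings plain GDA moves away from the equilibrium, while the lookahead variant contracts towards it, because averaging the start and end of each rotating inner trajectory pulls the iterate inward.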

翻訳日:2022-11-17 03:23:43 公開日:2021-06-23
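The backtracking step described in the abstract above can be illustrated on a toy problem. The following is a minimal sketch, not the authors' implementation: the bilinear game f(x, y) = x * y, the step size `lr`, the number of fast steps `k`, and the interpolation weight `alpha` are all illustrative assumptions, and the base optimizer is plain simultaneous gradient descent-ascent rather than Adam or extragradient.

```python
import math


def gda_step(x, y, lr):
    """One simultaneous gradient descent-ascent step on f(x, y) = x * y."""
    # df/dx = y (x descends), df/dy = x (y ascends).
    return x - lr * y, y + lr * x


def run_gda(x, y, lr, steps):
    """Plain gradient descent-ascent; on this game it rotates and spirals outward."""
    for _ in range(steps):
        x, y = gda_step(x, y, lr)
    return x, y


def run_lookahead_gda(x, y, lr, k, alpha, outer_steps):
    """Lookahead wrapper: k fast steps, then backtrack the slow weights."""
    for _ in range(outer_steps):
        fast_x, fast_y = run_gda(x, y, lr, k)
        # Backtracking step: move the slow weights only a fraction alpha
        # of the way toward the fast weights.
        x = x + alpha * (fast_x - x)
        y = y + alpha * (fast_y - y)
    return x, y


# Both runs use 500 gradient evaluations from the same starting point.
gda_norm = math.hypot(*run_gda(1.0, 1.0, lr=0.3, steps=500))
la_norm = math.hypot(
    *run_lookahead_gda(1.0, 1.0, lr=0.3, k=5, alpha=0.5, outer_steps=100)
)
print("distance to equilibrium, plain GDA:", gda_norm)
print("distance to equilibrium, with lookahead:", la_norm)
```

With these assumed settings, the plain iterates move away from the equilibrium at (0, 0) while the lookahead iterates contract toward it, which is the qualitative behavior the abstract attributes to the backtracking step.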

# Sequential Model Adaptation Using Domain Agnostic Internal Distributions ( http://arxiv.org/abs/2007.00197v4 )

License: see the linked page

Mohammad Rostami, Aram Galstyan

We develop an algorithm for sequential adaptation of a classifier that is trained on a source domain so that it generalizes to an unannotated target domain. We assume the model has been trained on annotated source-domain data and must then be adapted using unannotated target-domain data when the source-domain data is no longer accessible. We align the distributions of the source and the target domains in a discriminative embedding space via an intermediate internal distribution. This distribution is estimated using the source data representations in the embedding. We conduct experiments on four benchmarks to demonstrate that the method is effective and compares favorably against existing methods.

Translated: 2022-11-14 22:35:52 Published: 2021-06-23

# Order from chaos in quantum walks on cyclic graphs ( http://arxiv.org/abs/2008.00316v3 )

License: see the linked page

Abhisek Panda, Colin Benjamin

It has been shown classically that combining two chaotic random walks can yield an ordered (periodic) walk. Our aim in this paper is to find a quantum analog for this rather counter-intuitive result. We study the chaotic and periodic nature of cyclic quantum walks and focus on a unique situation wherein a periodic quantum walk on a 3-cycle graph is generated via a deterministic combination of two chaotic quantum walks on the same graph. We also extend our results to even-numbered cyclic graphs, specifically a 4-cycle graph. Our results will be relevant in quantum cryptography and quantum chaos control.

Translated: 2022-11-04 00:55:20 Published: 2021-06-23

# Algorithm Based on One Monocular Video Delivers Highly Valid and Reliable Gait Parameters ( http://arxiv.org/abs/2008.08045v5 )

License: see the linked page

Dr. Arash Azhand, Dr. Sophie Rabe, Dr. Swantje Müller, Igor Sattler, Dr. Anika Steinert

Despite its paramount importance for manifold use cases (e.g., in the health care industry, sports, rehabilitation and fitness assessment), sufficiently valid and reliable gait parameter measurement is still mostly limited to high-tech gait laboratories. Here, we demonstrate the excellent validity and test-retest repeatability of a novel gait assessment system which is built upon modern convolutional neural networks to extract three-dimensional skeleton joints from monocular frontal-view videos of walking humans. The validity study is based on a comparison to the GAITRite pressure-sensitive walkway system. All measured gait parameters (gait speed, cadence, step length and step time) showed excellent concurrent validity for multiple walk trials at normal and fast gait speeds. The test-retest repeatability is on the same level as that of the GAITRite system. In conclusion, we are convinced that our results can pave the way for cost-, space- and operationally effective gait analysis in broad mainstream applications. Most sensor-based systems are costly, must be operated by extensively trained personnel (e.g., motion capture systems) or, even if not quite as costly, still possess considerable complexity (e.g., wearable sensors). In contrast, a video sufficient for the assessment method presented here can be obtained by anyone, without much training, via a smartphone camera.

Translated: 2022-11-02 18:56:36 Published: 2021-06-23

# Physics-constrained Bayesian inference of state functions in classical density-functional theory ( http://arxiv.org/abs/2010.03374v4 )

License: see the linked page

Peter Yatsyshin, Serafim Kalliadasis and Andrew B. Duncan

We develop a novel data-driven approach to the inverse problem of classical statistical mechanics: given experimental data on the collective motion of a classical many-body system, how does one characterise the free energy landscape of that system? By combining non-parametric Bayesian inference with physically-motivated constraints, we develop an efficient learning algorithm which automates the construction of approximate free energy functionals. In contrast to optimisation-based machine learning approaches, which seek to minimise a cost function, the central idea of the proposed Bayesian inference is to propagate a set of prior assumptions through the model, derived from physical principles. The experimental data is used to probabilistically weigh the possible model predictions. This naturally leads to humanly interpretable algorithms with full uncertainty quantification of predictions. In our case, the output of the learning algorithm is a probability distribution over a family of free energy functionals, consistent with the observed particle data. We find that surprisingly small data samples contain sufficient information for inferring highly accurate analytic expressions of the underlying free energy functionals, making our algorithm highly data efficient. We consider excluded volume particle interactions, which are ubiquitous in nature, whilst being highly challenging for modelling in terms of free energy. To validate our approach we consider the paradigmatic case of a one-dimensional fluid and develop inference algorithms for the canonical and grand-canonical statistical-mechanical ensembles. Extensions to higher-dimensional systems are conceptually straightforward, whilst standard coarse-graining techniques allow one to easily incorporate attractive interactions.

Translated: 2022-10-10 00:05:09 Published: 2021-06-23

# TextSETTR: Few-Shot Text Style Extraction and Tunable Targeted Restyling ( http://arxiv.org/abs/2010.03802v3 )

License: see the linked page

Parker Riley, Noah Constant, Mandy Guo, Girish Kumar, David Uthus, Zarana Parekh

We present a novel approach to the problem of text style transfer. Unlike previous approaches requiring style-labeled training data, our method makes use of readily available unlabeled text by relying on the implicit connection in style between adjacent sentences, and uses labeled data only at inference time. We adapt T5 (Raffel et al., 2020), a strong pretrained text-to-text model, to extract a style vector from text and use it to condition the decoder to perform style transfer. As our label-free training results in a style vector space encoding many facets of style, we recast transfers as "targeted restyling" vector operations that adjust specific attributes of the input while preserving others. We demonstrate that training on unlabeled Amazon reviews data results in a model that is competitive on sentiment transfer, even compared to models trained fully on labeled data. Furthermore, applying our novel method to a diverse corpus of unlabeled web text results in a single model capable of transferring along multiple dimensions of style (dialect, emotiveness, formality, politeness, sentiment) despite no additional training and using only a handful of exemplars at inference time.

Translated: 2022-10-09 11:14:31 Published: 2021-06-23

# Permuted AdaIN: Reducing the Bias Towards Global Statistics in Image Classification ( http://arxiv.org/abs/2010.05785v3 )

License: see the linked page

Oren Nuriel, Sagie Benaim, Lior Wolf

Recent work has shown that convolutional neural network classifiers overly rely on texture at the expense of shape cues. We make a similar but different distinction between shape and local image cues, on the one hand, and global image statistics, on the other. Our method, called Permuted Adaptive Instance Normalization (pAdaIN), reduces the representation of global statistics in the hidden layers of image classifiers. pAdaIN samples a random permutation $\pi$ that rearranges the samples in a given batch. Adaptive Instance Normalization (AdaIN) is then applied between the activations of each (non-permuted) sample $i$ and the corresponding activations of the sample $\pi(i)$, thus swapping statistics between the samples of the batch. Since the global image statistics are distorted, this swapping procedure causes the network to rely on cues, such as shape or texture. By choosing the random permutation with probability $p$ and the identity permutation otherwise, one can control the effect's strength. With the correct choice of $p$, fixed a priori for all experiments and selected without considering test data, our method consistently outperforms baselines in multiple settings. In image classification, our method improves on both CIFAR-100 and ImageNet using multiple architectures. In the setting of robustness, our method improves on both ImageNet-C and CIFAR-100-C for multiple architectures. In the setting of domain adaptation and domain generalization, our method achieves state-of-the-art results on the transfer learning task from GTAV to Cityscapes and on the PACS benchmark.

Translated: 2022-10-09 05:59:22 Published: 2021-06-23
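The swapping procedure in the abstract above can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the authors' code: activations are represented as nested Python lists (`batch[i][c]` is the flattened spatial activation of sample `i`, channel `c`), the statistics are the per-channel mean and standard deviation, and the names `adain`, `padain`, and `EPS` are hypothetical.

```python
import math
import random

EPS = 1e-5  # assumed small constant to avoid division by zero


def channel_stats(values):
    """Mean and (eps-stabilized) standard deviation of one channel."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(var + EPS)


def adain(content, style):
    """Normalize `content`, then give it the mean and std of `style`."""
    c_mean, c_std = channel_stats(content)
    s_mean, s_std = channel_stats(style)
    return [s_std * (v - c_mean) / c_std + s_mean for v in content]


def padain(batch, p, rng):
    """With probability p, swap channel statistics between sample i and pi(i)."""
    if rng.random() >= p:
        return batch  # identity permutation: activations pass through unchanged
    perm = list(range(len(batch)))
    rng.shuffle(perm)  # random permutation pi of the batch indices
    return [
        [adain(batch[i][c], batch[perm[i]][c]) for c in range(len(batch[i]))]
        for i in range(len(batch))
    ]


rng = random.Random(0)
batch = [[[0.0, 2.0]], [[10.0, 14.0]]]  # two samples, one channel each
mixed = padain(batch, p=1.0, rng=rng)
print([channel_stats(sample[0])[0] for sample in mixed])
```

Each output sample keeps its own spatial pattern but carries the channel mean and standard deviation of the sample it was paired with, so the set of channel means in the batch is preserved while their assignment to samples may change.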

# A Federated Learning Approach to Anomaly Detection in Smart Buildings ( http://arxiv.org/abs/2010.10293v3 )

License: see the linked page

Raed Abdel Sater and A. Ben Hamza

Internet of Things (IoT) sensors in smart buildings are becoming increasingly ubiquitous, making buildings more livable, energy efficient, and sustainable. These devices sense the environment and generate multivariate temporal data of paramount importance for detecting anomalies and improving the prediction of energy usage in smart buildings. However, detecting these anomalies in centralized systems is often plagued by a huge delay in response time. To overcome this issue, we formulate the anomaly detection problem in a federated learning setting by leveraging the multi-task learning paradigm, which aims at solving multiple tasks simultaneously while taking advantage of the similarities and differences across tasks. We propose a novel privacy-by-design federated learning model using a stacked long short-term memory (LSTM) model, and we demonstrate that it converges more than twice as fast during training as the centralized LSTM. The effectiveness of our federated learning approach is demonstrated on three real-world datasets generated by the IoT production system at General Electric Current smart building, achieving state-of-the-art performance compared to baseline methods in both classification and regression tasks. Our experimental results demonstrate the effectiveness of the proposed framework in reducing the overall training cost without compromising the prediction performance.

Translated: 2022-10-05 07:21:34 Published: 2021-06-23

# PHEW: Constructing Sparse Networks that Learn Fast and Generalize Well without Training Data ( http://arxiv.org/abs/2010.11354v2 )

License: see the linked page

Shreyas Malakarjun Patil, Constantine Dovrolis

Methods that sparsify a network at initialization are important in practice because they greatly improve the efficiency of both learning and inference. Our work is based on a recently proposed decomposition of the Neural Tangent Kernel (NTK) that has decoupled the dynamics of the training process into a data-dependent component and an architecture-dependent kernel, the latter referred to as the Path Kernel. That work has shown how to design sparse neural networks for faster convergence, without any training data, using the SynFlow-L2 algorithm. We first show that even though SynFlow-L2 is optimal in terms of convergence, for a given network density, it results in sub-networks with "bottleneck" (narrow) layers, leading to poor performance as compared to other data-agnostic methods that use the same number of parameters. Then we propose a new method to construct sparse networks, without any training data, referred to as Paths with Higher-Edge Weights (PHEW). PHEW is a probabilistic network formation method based on biased random walks that only depends on the initial weights. It has path kernel properties similar to those of SynFlow-L2, but it generates much wider layers, resulting in better generalization and performance. PHEW achieves significant improvements over the data-independent SynFlow and SynFlow-L2 methods at a wide range of network densities.

Translated: 2022-10-04 07:18:31 Published: 2021-06-23

# FDRN: A Fast Deformable Registration Network for Medical Images ( http://arxiv.org/abs/2011.02307v4 )

License: see the linked page

Kaicong Sun and Sven Simon

Deformable image registration is a fundamental task in medical imaging. Due to the large computational complexity of deformable registration of volumetric images, conventional iterative methods usually face a tradeoff between registration accuracy and computation time in practice. In order to boost the registration performance in both accuracy and runtime, we propose a fast convolutional neural network. Specifically, to efficiently utilize the memory resources and enlarge the model capacity, we adopt additive forwarding instead of channel concatenation and deepen the network in each encoder and decoder stage. To improve learning efficiency, we leverage skip connections within the encoder and decoder stages to enable residual learning and employ an auxiliary loss at the bottom layer with the lowest resolution to provide deep supervision. In particular, the low-resolution auxiliary loss is weighted by an exponentially decayed parameter during the training phase. In conjunction with the main loss on the high-resolution grid, a coarse-to-fine learning strategy is achieved. Last but not least, we introduce an auxiliary loss based on the segmentation prior to improve the registration performance in terms of Dice score. Compared to the auxiliary loss using the average Dice score, the proposed multi-label segmentation loss does not induce additional memory cost in the training phase and can be employed on images with an arbitrary number of categories. In the experiments, we show that FDRN outperforms the existing state-of-the-art registration methods for brain MR images by virtue of its compact network structure and efficient learning. Besides, FDRN is a generalized framework for image registration which is not confined to a particular type of medical image or anatomy.

Translated: 2022-09-29 23:00:00 Published: 2021-06-23

# HILONet: Hierarchical Imitation Learning from Non-Aligned Observations ( http://arxiv.org/abs/2011.02671v2 )

License: see the linked page

Shanqi Liu, Junjie Cao, Wenzhou Chen, Licheng Wen, Yong Liu

Learning from demonstrated observation-only trajectories in a non-time-aligned environment is challenging because most imitation learning methods aim to imitate experts by following the demonstration step by step. However, aligned demonstrations are seldom obtainable in real-world scenarios. In this work, we propose a new imitation learning approach called Hierarchical Imitation Learning from Observation (HILONet), which adopts a hierarchical structure to choose feasible sub-goals from demonstrated observations dynamically. Our method can solve all kinds of tasks by achieving these sub-goals, whether the task has a single goal position or not. We also present three different ways to increase sample efficiency in the hierarchical structure. We conduct extensive experiments using several environments. The results show improvements in both performance and learning efficiency.

Translated: 2022-09-29 12:25:23 Published: 2021-06-23

# GANMEX: One-vs-One Attributions Guided by GAN-based Counterfactual Explanation Baselines ( http://arxiv.org/abs/2011.06015v4 )

License: see the linked page

Sheng-Min Shih, Pin-Ju Tien, Zohar Karnin

Attribution methods have been shown to be promising approaches for identifying key features that led to learned model predictions. While most existing attribution methods rely on a baseline input for performing feature perturbations, limited research has been conducted to address the baseline selection issues. Poor choices of baselines limit the ability of one-vs-one (1-vs-1) explanations for multi-class classifiers, which means the attribution methods were not able to explain why an input belongs to its original class but not the other specified target class. 1-vs-1 explanation is crucial when certain classes are more similar than others, e.g. two bird types among multiple animals, by focusing on key differentiating features rather than shared features across classes. In this paper, we present GAN-based Model EXplainability (GANMEX), a novel approach applying Generative Adversarial Networks (GAN) by incorporating the to-be-explained classifier as part of the adversarial networks. Our approach effectively selects the counterfactual baseline as the closest realistic sample belonging to the target class, which allows attribution methods to provide true 1-vs-1 explanations. We showed that GANMEX baselines improved the saliency maps and led to stronger performance on perturbation-based evaluation metrics over the existing baselines. Existing attribution results are known for being insensitive to model randomization, and we demonstrated that GANMEX baselines led to better outcomes under the cascading randomization of the model.

Translated: 2022-09-27 00:33:58 Published: 2021-06-23

# Testing the Genomic Bottleneck Hypothesis in Hebbian Meta-Learning ( http://arxiv.org/abs/2011.06811v2 )

License: see the linked page

Rasmus Berg Palm, Elias Najarro, Sebastian Risi

Hebbian meta-learning has recently shown promise to solve hard reinforcement learning problems, allowing agents to adapt to some degree to changes in the environment. However, because each synapse in these approaches can learn a very specific learning rule, the ability to generalize to very different situations is likely reduced. We hypothesize that limiting the number of Hebbian learning rules through a "genomic bottleneck" can act as a regularizer leading to better generalization across changes to the environment. We test this hypothesis by decoupling the number of Hebbian learning rules from the number of synapses and systematically varying the number of Hebbian learning rules. The results in this paper suggest that simultaneously learning the Hebbian learning rules and their assignment to synapses is a difficult optimization problem, leading to poor performance in the environments tested. However, parallel research to ours finds that it is indeed possible to reduce the number of learning rules by clustering similar rules together. How to best implement a "genomic bottleneck" algorithm is thus an important research direction that warrants further investigation.

Translated: 2022-09-25 23:44:26 Published: 2021-06-23

# Policy Gradient Methods for the Noisy Linear Quadratic Regulator over a Finite Horizon ( http://arxiv.org/abs/2011.10300v2 )

License: see the linked page

Ben Hambly, Renyuan Xu and Huining Yang

We explore reinforcement learning methods for finding the optimal policy in the linear quadratic regulator (LQR) problem. In particular, we consider the convergence of policy gradient methods in the setting of known and unknown parameters. We are able to produce a global linear convergence guarantee for this approach in the setting of finite time horizon and stochastic state dynamics under weak assumptions. The convergence of a projected policy gradient method is also established in order to handle problems with constraints. We illustrate the performance of the algorithm with two examples. The first example is the optimal liquidation of a holding in an asset. We show results for the case where we assume a model for the underlying dynamics and for the case where we apply the method to the data directly. The empirical evidence suggests that the policy gradient method can learn the global optimal solution for a larger class of stochastic systems containing the LQR framework and that it is more robust with respect to model misspecification when compared to a model-based approach. The second example is an LQR system in a higher dimensional setting with synthetic data.

Translated: 2022-09-23 06:32:53 Published: 2021-06-23

# HAWQV3: Dyadic Neural Network Quantization ( http://arxiv.org/abs/2011.10680v3 )

License: see the linked page

Zhewei Yao, Zhen Dong, Zhangcheng Zheng, Amir Gholami, Jiali Yu, Eric Tan, Leyuan Wang, Qijing Huang, Yida Wang, Michael W. Mahoney, Kurt Keutzer

Current low-precision quantization algorithms often have the hidden cost of conversion back and forth from floating point to quantized integer values. This hidden cost limits the latency improvement realized by quantizing Neural Networks. To address this, we present HAWQV3, a novel mixed-precision integer-only quantization framework. The contributions of HAWQV3 are the following: (i) An integer-only inference where the entire computational graph is performed only with integer multiplication, addition, and bit shifting, without any floating point operations or even integer division; (ii) A novel hardware-aware mixed-precision quantization method where the bit-precision is calculated by solving an integer linear programming problem that balances the trade-off between model perturbation and other constraints, e.g., memory footprint and latency; (iii) Direct hardware deployment and open source contribution for 4-bit uniform/mixed-precision quantization in TVM, achieving an average speedup of $1.45\times$ for uniform 4-bit, as compared to uniform 8-bit for ResNet50 on T4 GPUs; and (iv) extensive evaluation of the proposed methods on ResNet18/50 and InceptionV3, for various model compression levels with/without mixed precision. For ResNet50, our INT8 quantization achieves an accuracy of $77.58\%$, which is $2.68\%$ higher than prior integer-only work, and our mixed-precision INT4/8 quantization can reduce INT8 latency by $23\%$ and still achieve $76.73\%$ accuracy. Our framework and the TVM implementation have been open sourced.

Translated: 2022-09-23 06:06:01 Published: 2021-06-23
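Contribution (i) above, inference using only integer multiplication, addition, and bit shifting, can be illustrated with a dyadic rescaling step. This is a minimal sketch and an assumption on our part: the abstract does not spell out the arithmetic, so here a real-valued scale is approximated offline by a dyadic number `multiplier / 2**shift_bits`, and the names `to_dyadic`, `rescale`, and the 16-bit shift are hypothetical.

```python
def to_dyadic(scale, shift_bits=16):
    """Approximate a positive real scale as multiplier / 2**shift_bits.

    This conversion uses floating point, but it would run once, offline,
    before inference starts.
    """
    multiplier = round(scale * (1 << shift_bits))
    return multiplier, shift_bits


def rescale(acc, multiplier, shift_bits):
    """Integer-only rescaling of an accumulator: multiply, add, shift.

    Adding half of 2**shift_bits before the shift rounds to nearest
    instead of rounding toward negative infinity.
    """
    rounding = 1 << (shift_bits - 1)
    return (acc * multiplier + rounding) >> shift_bits


scale = 0.0123  # illustrative real-valued requantization scale
multiplier, shift_bits = to_dyadic(scale)
for acc in (12345, 255, -1000):
    print(acc, rescale(acc, multiplier, shift_bits), acc * scale)
```

The loop prints each integer result next to the floating point product it approximates; no division or floating point operation appears inside `rescale`. A larger `shift_bits` tightens the approximation at the cost of a wider intermediate product.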

# A System for Automatic Rice Disease Detection from Rice Paddy Images Serviced via a Chatbot ( http://arxiv.org/abs/2011.10823v2 )

License: see the linked page

Pitchayagan Temniranrat, Kantip Kiratiratanapruk, Apichon Kitvimonrat, Wasin Sinthupinyo and Sujin Patarapuwadol

A LINE Bot System to diagnose rice diseases from actual paddy field images was developed and is presented in this paper. It is an easy-to-use, automatic system designed to help rice farmers improve rice yield and quality. The targeted images were taken from the actual paddy environment without special sample preparation. We used a deep learning neural network technique to detect rice diseases from the images. We developed an object detection model training and refinement process to improve on the performance of our previous research on rice leaf disease detection. The process was based on analyzing the model's predictive results and could be used repeatedly to improve the quality of the database for the next training of the model. The deployment model for our LINE Bot system was created with the best-performing technique selected in our previous paper, YOLOv3, trained on the refined training data set. The performance of the deployment model was measured on 5 target classes, and the Average True Positive Point was found to have improved from 91.1% in the previous paper to 95.6% in this study. Therefore, we used this deployment model for the Rice Disease LINE Bot system. Our system worked automatically in real time to suggest primary diagnosis results to the users in the LINE group, which included rice farmers and rice disease specialists. They could communicate freely via chat. In the real LINE Bot deployment, the model's performance was measured by our own defined measurement, Average True Positive Point, and was found to be 78.86% on average. The system was fast and took only 2-3 s for the detection process on our system server.

Translated: 2022-09-22 23:41:52 Published: 2021-06-23

# Market Basket Analysis in Supermarkets Using Self-Organizing Maps (Análisis de Canasta de mercado en supermercados mediante mapas auto-organizados) ( http://arxiv.org/abs/2107.10647v1 )

License: CC BY 4.0

Joaquín Cordero, Alfredo Bolt and Mauricio Valle

(参考訳) 導入:チリの首都の西部地域で重要なスーパーマーケットチェーンは、決定を行う上で重要な情報を得る必要があり、この情報はデータベースで利用可能であるが、可視化が困難になる情報の複雑さと量のために処理する必要がある。方法: この目的のために, 人工ニューラルネットワークを用いて, コホーネンのSOM法を用いたアルゴリズムを開発した。これを実行するには、特定の重要な手順に従う必要がある。例えば、データマイニングはフィルタリングに責任を持ち、関連するデータのみをマーケットバスケット分析に使用する。情報をフィルタリングした後、データは準備されなければならない。データ準備の後、サンプルデータに適応するためにPythonプログラミング環境を用意し、テスト結果の後にパラメータをセットしてSOMのトレーニングを進めました。結果:SOMの成果は,SOMのトレーニングと実際の取引の結果として得られたことから,店主が考慮すべきプロモーション,パック,バンドルを形成するために,トポロジカルに近接して配置して購入した商品間の関係が得られた。結論:これに基づいて,調査で使用したデータを提供するスーパーマーケットチェーンに対して,頻繁な買い物かごの推薦がなされている。

Introduction: An important supermarket chain in the western zone of the capital of Chile needs to obtain key information to make decisions. This information is available in its databases but needs to be processed, because its complexity and volume make it difficult to visualize. Method: For this purpose, an algorithm was developed using artificial neural networks, applying Kohonen's SOM method. Carrying it out requires certain key procedures, such as data mining, which filters the data so that only the data relevant to market basket analysis are used. After filtering the information, the data must be prepared. After data preparation, we prepared the Python programming environment to adapt it to the sample data, then proceeded to train the SOM with its parameters set after test results. Result: The SOM captures the relationships between the most-purchased products by positioning them topologically close to each other, so that the retail manager can take them into consideration when forming promotions, packs and bundles, because these relationships were obtained by training the SOM on the clients' real transactions. Conclusion: Based on this, recommendations on frequent shopping baskets have been made to the supermarket chain that provided the data used in the research.

翻訳日:2021-07-25 15:03:07 公開日:2021-06-23

# (参考訳) MegazordNet:時系列予測のための統計と機械学習の視点を組み合わせる

MegazordNet: combining statistical and machine learning standpoints for time series forecasting ( http://arxiv.org/abs/2107.01017v1 )

ライセンス: CC BY 4.0

Angelo Garangau Menezes and Saulo Martiello Mastelini

(参考訳) 金融時系列の予測は、シリーズのカオス的特徴のために難しい課題であると考えられている。統計学的アプローチは、市場方向の予測や株価の単価など、いくつかの特定の問題において確固たる結果を示しているが、近年のディープラーニングとビッグデータ技術の進歩により、金融時系列予測に新たな有望な選択肢が生まれている。さらに,近年の文献では,統計と機械学習を組み合わせることで,単一解と比較して予測精度が向上する可能性が示唆されている。そこで本研究では,時系列予測のための構造化深層学習モデルと組み合わせて,金融時系列内の統計的特徴を探索するフレームワークであるMegazordNetを提案する。我々は、s&p500種株価の終値予測手法を異なる指標を用いて評価し、単一統計および機械学習手法を上回った。

Forecasting financial time series is considered to be a difficult task due to the chaotic nature of the series. Statistical approaches have shown solid results in some specific problems, such as predicting market direction and the price of individual stocks; however, with the recent advances in deep learning and big data techniques, promising new options have arisen to tackle financial time series forecasting. Moreover, recent literature has shown that employing a combination of statistics and machine learning may improve forecast accuracy in comparison to single solutions. Taking these aspects into consideration, in this work we propose MegazordNet, a framework that explores statistical features within a financial series combined with a structured deep learning model for time series forecasting. We evaluated our approach by predicting the closing price of stocks in the S&P 500 using different metrics, and we were able to beat single statistical and machine learning methods.

翻訳日:2021-07-11 13:00:20 公開日:2021-06-23

# トランスファーラーニングによるインフォーマル・フォーマル言語シナリオにおけるジェンダー認識

Gender Recognition in Informal and Formal Language Scenarios via Transfer Learning ( http://arxiv.org/abs/2107.02759v1 )

ライセンス: Link先を確認

Daniel Escobar-Grisales, Juan Camilo Vasquez-Correa, Juan Rafael Orozco-Arroyave

(参考訳) テキストデータに基づく人口統計情報検索への関心は,セキュリティ,マーケティング,ヒースケアなどさまざまな分野において,アプリケーションが成功を収めていることから,研究コミュニティで高まっている。テキストデータに基づく性別、年齢、場所、性格などの人口統計特性の認識と識別は、異なるマーケティング戦略を改善するのに役立つ。例えば、オファーのセグメンテーションとパーソナライズを可能にすることで、製品やサービスを最も関心のあるグループに公開することができる。この種の技術は、ソーシャルメディアの文書で広く議論されている。しかし、これらの手法は、ソーシャルメディアにしか存在しないエモティコン、言及、その他の言語現象へのアクセスがない、より形式的な構造を持つデータで研究されていない。本稿では,再帰的・畳み込み型ニューラルネットワークと,非公式言語と形式言語で書かれた文書における性別認識のための伝達学習戦略を提案する。モデルは、ツイートとコールセンター会話からなる2つの異なるデータベースでテストされる。両方のデータベースで最大75\%のアキュラティが達成される。また、ソーシャルメディアで一般的に使用されるような特定の表現やイディオムに基づいて訓練されたシステムから、より形式的なテキストデータに知識を移すことも可能であり、データ量が少なく、構造が完全に異なることを示している。

The interest in demographic information retrieval based on text data has increased in the research community because applications have shown success in different sectors such as security, marketing, health care, and others. Recognition and identification of demographic traits such as gender, age, location, or personality based on text data can help to improve different marketing strategies. For instance, it makes it possible to segment and personalize offers, so that products and services are exposed to the group of greatest interest. This type of technology has been discussed widely for documents from social media. However, the methods have been poorly studied in data with a more formal structure, where there is no access to emoticons, mentions, and other linguistic phenomena that are only present in social media. This paper proposes the use of recurrent and convolutional neural networks, and a transfer learning strategy, for gender recognition in documents that are written in informal and formal language. Models are tested on two different databases consisting of Tweets and call-center conversations. Accuracies of up to 75% are achieved for both databases. The results also indicate that it is possible to transfer the knowledge from a system trained on a specific type of expressions or idioms, such as those typically used in social media, into a more formal type of text data, where the amount of data is more scarce and its structure is completely different.

翻訳日:2021-07-11 11:34:03 公開日:2021-06-23

# (参考訳) 対話型セグメンテーションのための確率的注意

Probabilistic Attention for Interactive Segmentation ( http://arxiv.org/abs/2106.15338v1 )

ライセンス: CC BY 4.0

Prasad Gabbur and Manjot Bilkhu and Javier Movellan

(参考訳) 我々は注意の確率論的解釈を提供し、トランスフォーマーにおける標準ドット生産注意は最大後方推定(map)の特別な場合であることを示す。提案手法は,キーおよび値モデルパラメータのオンライン適応に期待最大化アルゴリズムを用いることを提案する。このアプローチは、外部エージェント、例えば注釈器が、いくつかのトークンの正しい値、例えば、いくつかのピクセルの意味圏に関する推論時間情報を提供する場合に有用であり、この新しい情報は、原則的に他のトークンに伝播する必要がある。本稿では,アノテーションの効率を向上させるために,アノテーションとモデルがオンラインで協調する対話型意味セグメンテーションタスクのアプローチについて述べる。標準ベンチマークを用いて、キー適応は低フィードバック方式におけるモデル性能を向上し(\sim10\%$ mIoU)、高フィードバック方式における値伝搬はモデル応答性を向上させる。確率的注意モデルのpytorch層の実装が公開される予定だ。

We provide a probabilistic interpretation of attention and show that the standard dot-product attention in transformers is a special case of Maximum A Posteriori (MAP) inference. The proposed approach suggests the use of Expectation Maximization algorithms for online adaptation of key and value model parameters. This approach is useful for cases in which external agents, e.g., annotators, provide inference-time information about the correct values of some tokens, e.g., the semantic category of some pixels, and this new information needs to propagate to other tokens in a principled manner. We illustrate the approach on an interactive semantic segmentation task in which annotators and models collaborate online to improve annotation efficiency. Using standard benchmarks, we observe that key adaptation boosts model performance ($\sim10\%$ mIoU) in the low feedback regime and value propagation improves model responsiveness in the high feedback regime. A PyTorch layer implementation of our probabilistic attention model will be made publicly available.
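One way to see the connection between softmax attention and probabilistic inference can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the derivation or the PyTorch layer from the paper: each key is treated as the mean of a unit-variance isotropic Gaussian with a uniform prior, all keys have the same norm, and the usual 1/sqrt(d) scaling is omitted. Under those assumptions the posterior probability of each key given the query coincides with the dot-product attention weights. The function names and the example vectors are hypothetical.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot_product_attention_weights(query, keys):
    # Standard (unscaled) dot-product attention weights: softmax(q . k_j).
    return softmax([sum(q * k for q, k in zip(query, key)) for key in keys])

def gaussian_mixture_posterior(query, keys):
    # Posterior p(j | q) for a mixture of unit-variance Gaussians centred on the
    # keys with a uniform prior: proportional to exp(-0.5 * ||q - k_j||^2).
    log_liks = [-0.5 * sum((q - k) ** 2 for q, k in zip(query, key)) for key in keys]
    return softmax(log_liks)

# Hypothetical example: three unit-norm keys and one query.
query = [0.3, -1.2]
keys = [[1.0, 0.0], [0.0, 1.0], [-0.6, 0.8]]
attn = dot_product_attention_weights(query, keys)
post = gaussian_mixture_posterior(query, keys)
```

Because the squared distance expands to `||q||^2 - 2 q.k + ||k||^2` and the two norm terms are constant across equal-norm keys, they cancel inside the softmax, so `attn` and `post` agree up to floating-point error.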

翻訳日:2021-07-04 21:08:49 公開日:2021-06-23

# (参考訳) ScanBank: Scanned Electronic Theses and Dissertationsから図を抽出するためのベンチマークデータセット

ScanBank: A Benchmark Dataset for Figure Extraction from Scanned Electronic Theses and Dissertations ( http://arxiv.org/abs/2106.15320v1 )

ライセンス: CC BY 4.0

Sampanna Yashwant Kahu, William A. Ingram, Edward A. Fox, Jian Wu

(参考訳) 我々は,600万人以上が公開されており,アクセスの向上と実用性の拡大をめざして,電子製図・論文(ETDs)に焦点を合わせ,研究・教育を専門分野にわたって支援するための重要なコーパスを構成している。新たなデジタル文書が含まれるにつれてコーパスは成長しており、何百万もの古い論文や論文がデジタル形式に変換され、機関リポジトリに電子的に配布されている。 ETDでは、他の学術作品と同様に、数字や表は簡潔な方法で大量の情報を伝達することができる。デジタルPDFから図形や表を抽出する手法が提案されているが、スキャンされたETDではうまく機能しない。この問題を考慮し,本研究では,スキャンしたPDFでうまく機能しない理由として,デジタル文書でのみトレーニングを行ったことが挙げられる。この制限に対処するため、ScanBankは1万ページの画像をスキャンし、人間が手動でラベル付けした新しいデータセットである。このデータセットを用いて、YOLOv5に基づくディープニューラルネットワークモデルをトレーニングし、スキャンされたETDから数値とテーブルを正確に抽出する。我々は,スキャンされた文書から図形を抽出するためのより良い方法を見つけることを目的とした,重要な研究課題を提起し,回答する。そのうちの1つは、スキャンされたドキュメントからの図形抽出に適したモデルをトレーニングするために使用される、生まれながらのデジタルドキュメントに適用されるデータ拡張技術である。我々の知る限りでは、ScanBankはスキャンされたETDのフィギュアとテーブル抽出のための最初の手動アノテートデータセットである。 ScanBankでトレーニングされたYOLOv5ベースのモデルでは、既存の同等のオープンソースおよび無償のベースラインメソッドよりも大幅にパフォーマンスが向上している。

We focus on electronic theses and dissertations (ETDs), aiming to improve access and expand their utility, since more than 6 million are publicly available, and they constitute an important corpus to aid research and education across disciplines. The corpus is growing as new born-digital documents are included, and since millions of older theses and dissertations have been converted to digital form to be disseminated electronically in institutional repositories. In ETDs, as with other scholarly works, figures and tables can communicate a large amount of information in a concise way. Although methods have been proposed for extracting figures and tables from born-digital PDFs, they do not work well with scanned ETDs. Considering this problem, our assessment of state-of-the-art figure extraction systems is that they do not function well on scanned PDFs because they have only been trained on born-digital documents. To address this limitation, we present ScanBank, a new dataset containing 10 thousand scanned page images, manually labeled by humans as to the presence of the 3.3 thousand figures or tables found therein. We use this dataset to train a deep neural network model based on YOLOv5 to accurately extract figures and tables from scanned ETDs. We pose and answer important research questions aimed at finding better methods for figure extraction from scanned documents. One of those concerns the value of data augmentation techniques applied to born-digital documents, which are used to train models better suited for figure extraction from scanned documents. To the best of our knowledge, ScanBank is the first manually annotated dataset for figure and table extraction for scanned ETDs. A YOLOv5-based model, trained on ScanBank, outperforms existing comparable open-source and freely available baseline methods by a considerable margin.

翻訳日:2021-07-04 20:49:55 公開日:2021-06-23

# (参考訳) Wasserstein生成逆数インプットネットワークを用いた画像インパインティング

Image Inpainting Using Wasserstein Generative Adversarial Imputation Network ( http://arxiv.org/abs/2106.15341v1 )

ライセンス: CC BY 4.0

Daniel Vašata, Tomáš Halama, Magda Friedjungová

(参考訳) 画像インペイントは、画像内の欠落した領域の再構築に焦点を当てたコンピュータビジョンにおける重要なタスクの1つである。本研究の目的は,Wasserstein Generative Adversarial Imputation Networkに基づく画像インペイントモデルの導入である。モデルのジェネレータネットワークは、異なるダイレーションレートの畳み込み層の構築ブロックと、モデルが出力の詳細を再現するのに役立つスキップ接続を使用する。この組み合わせは、不足する様々なシナリオを十分な品質で扱える普遍的な計算モデルをもたらす。これを実験的に示すために、このモデルはランダムなピクセルの欠落、様々な小さな平方領域の欠落、画像の中心に1つの欠落した四角の欠落という3つのシナリオを同時に扱うように訓練されている。私たちのモデルはすべてのシナリオで高品質なインペインティング結果を達成しています。 2つの実世界のベンチマークデータセット、celeba facesとparis streetviewにおけるピーク信号対雑音比と構造類似性指数を用いて性能評価を行う。本モデルの結果は,バイハーモニック・インパクション法や,他の最先端画像インパインティング法と比較された。

Image inpainting is one of the important tasks in computer vision which focuses on the reconstruction of missing regions in an image. The aim of this paper is to introduce an image inpainting model based on Wasserstein Generative Adversarial Imputation Network. The generator network of the model uses building blocks of convolutional layers with different dilation rates, together with skip connections that help the model reproduce fine details of the output. This combination yields a universal imputation model that is able to handle various scenarios of missingness with sufficient quality. To show this experimentally, the model is simultaneously trained to deal with three scenarios given by missing pixels at random, missing various smaller square regions, and one missing square placed in the center of the image. It turns out that our model achieves high-quality inpainting results on all scenarios. Performance is evaluated using peak signal-to-noise ratio and structural similarity index on two real-world benchmark datasets, CelebA faces and Paris StreetView. The results of our model are compared to biharmonic imputation and to some of the other state-of-the-art image inpainting methods.
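For readers unfamiliar with the first of the two evaluation metrics named above, peak signal-to-noise ratio can be sketched as follows. This is the textbook definition, PSNR = 10 log10(MAX^2 / MSE), and not the authors' evaluation code; representing an image as a flat list of pixel intensities and the default `max_value` of 255 are assumptions made for the example.

```python
import math

def psnr(reference, reconstructed, max_value=255.0):
    # Mean squared error between two images given as flat lists of pixel values.
    n = len(reference)
    mse = sum((r - x) ** 2 for r, x in zip(reference, reconstructed)) / n
    if mse == 0:
        # Identical images: PSNR is unbounded.
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)

# Hypothetical two-pixel "images".
identical = psnr([0.0, 128.0], [0.0, 128.0])
worst_case = psnr([0.0, 255.0], [255.0, 0.0])
```

Higher values indicate a reconstruction closer to the reference; the structural similarity index, the second metric, is not covered by this sketch.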

翻訳日:2021-07-04 20:31:39 公開日:2021-06-23

# (参考訳) ワトソン博士型人工知性(ai)システム

Dr. Watson type Artificial Intellect (AI) Systems ( http://arxiv.org/abs/2106.13322v1 )

ライセンス: CC BY 4.0

Saveli Goldberg (1), Stanislav Belyaev (2), Vladimir Sluchak ((1) MGH Radiation Oncology Department, (2) Eastern New Mexico Medical Center)

(参考訳) この記事では、ソリューションを直接提供せず、むしろその方向を指して、ユーザーに質問やメッセージの調整を促す新しいタイプのAIシステムを提案する。 aiヒューマンコラボレーションのモデルは、コナン・ドイルの物語からホームズ氏とワトソン博士の相互作用の古典的な文学的例から導き出され、高度に資格のあるホームズ氏はワトソン博士の問いに答える。ここでMr. Holmesは、ルールベースの計算、ロジック、メモリ管理と共に、明らかにAIシステムの役割を担っており、Watson博士がユーザである。同じホームズとワトソンのインタラクションを調べて、Watson博士のようなAIが行動する別のモデルを見つけ、促進します。この原理に基づいて、これらのシステムを「ワトソン博士型システム」と呼ぶ。本稿では,これらのシステムの特徴について述べ,集中治療医のための患者管理システムとデータエラー防止システムについて紹介する。

The article proposes a new type of AI system that does not give solutions directly but rather points toward them, prompting the user in a friendly way with questions and adjusting messages. Models of AI-human collaboration can be deduced from the classic literary example of interaction between Mr. Holmes and Dr. Watson from the stories by Conan Doyle, where the highly qualified expert Mr. Holmes answers questions posed by Dr. Watson. Here Mr. Holmes, with his rule-based calculations, logic, and memory management, apparently plays the role of an AI system, and Dr. Watson is the user. Looking into the same Holmes-Watson interaction, we find and promote another model in which the AI behaves like Dr. Watson, who, by asking questions and acting in a particular way, helps Holmes (the AI user) make the right decisions. We call the systems based on this principle "Dr. Watson-type systems." The article describes the properties of such systems and introduces two particular examples: a Patient Management System for intensive care physicians and a Data Error Prevention System.

翻訳日:2021-06-29 06:15:42 公開日:2021-06-23

# 逐次文書上での反復結合トピックモデリング

Recurrent Coupled Topic Modeling over Sequential Documents ( http://arxiv.org/abs/2106.13732v1 )

ライセンス: Link先を確認

Jinjin Guo, Longbing Cao and Zhiguo Gong

(参考訳) オンラインアーカイブ、ソーシャルメディア、ニュースフィードなどの豊富なシーケンシャルなドキュメントはストリーミング更新され、各ドキュメントはスムーズに進化するが依存するトピックに組み込まれる。このようなデジタルテキストは、隠れた進化するトピックとその時間的依存性を推測するために、動的トピックモデリングに関する広範な研究を惹きつけている。しかし、既存のアプローチのほとんどはシングルトピックとスレッドの進化に焦点を当てており、現在のトピックが複数の関連する先行トピックと結合される可能性があるという事実を無視している。さらに、これらの手法は遅延パラメータを推論する際の難解な推論問題も引き起こし、高い計算コストと性能劣化をもたらす。この研究では、現在のトピックが対応する結合重み付き以前のトピックから進化し、マルチトピック・スレッドの進化が形成されると仮定する。我々の手法は、進化するトピック間の依存関係をモデル化し、時間ステップで複雑なマルチカップリングを徹底的にエンコードする。難解な推論課題を克服するために,新しいデータ拡張手法のセットを用いた新しい解を提案し,進化するトピック間の多重結合をうまく分解する。これにより、完全な共役モデルが得られ、推論手法の有効性と効率が保証される。後方フィルタアルゴリズムを備えた新しいギブスサンプリング器は、閉形式の潜時時間パラメータを効率的に学習する。さらに、潜在インディアンバッファプロセス(IBP)複合分布を利用して、全体のトピック番号を自動的に推測し、バイアスのない各シーケンシャル文書のスパーストピック比をカスタマイズする。提案手法は, 競合するベースラインに対する合成データセットと実世界のデータセットの両方で評価され, 単語ごとのパープレキシティの低さ, 一貫性の高いトピック, 文書時間予測の精度が向上した。

Abundant sequential documents such as online archives, social media and news feeds are updated in a streaming fashion, where each chunk of documents incorporates smoothly evolving yet dependent topics. Such digital texts have attracted extensive research on dynamic topic modeling to infer hidden evolving topics and their temporal dependencies. However, most of the existing approaches focus on single-topic-thread evolution and ignore the fact that a current topic may be coupled with multiple relevant prior topics. In addition, these approaches also incur an intractable inference problem when inferring latent parameters, resulting in a high computational cost and performance degradation. In this work, we assume that a current topic evolves from all prior topics with corresponding coupling weights, forming the multi-topic-thread evolution. Our method models the dependencies between evolving topics and thoroughly encodes their complex multi-couplings across time steps. To conquer the intractable inference challenge, a new solution with a set of novel data augmentation techniques is proposed, which successfully decomposes the multi-couplings between evolving topics. A fully conjugate model is thus obtained to guarantee the effectiveness and efficiency of the inference technique. A novel Gibbs sampler with a backward-forward filter algorithm efficiently learns latent time-evolving parameters in closed form. In addition, the latent Indian Buffet Process (IBP) compound distribution is exploited to automatically infer the overall topic number and customize the sparse topic proportions for each sequential document without bias. The proposed method is evaluated on both synthetic and real-world datasets against competitive baselines, demonstrating its superiority over the baselines in terms of lower per-word perplexity, more coherent topics, and better document time prediction.

翻訳日:2021-06-28 12:56:32 公開日:2021-06-23

# (参考訳) AIのための議論に関するオンラインハンドブック: 第2巻

Online Handbook of Argumentation for AI: Volume 2 ( http://arxiv.org/abs/2106.10832v2 )

ライセンス: CC BY 4.0

OHAAI Collaboration: Andreas Brannstrom, Federico Castagna, Theo Duchatelle, Matt Foulis, Timotheus Kampik, Isabelle Kuhlmann, Lars Malmqvist, Mariela Morveli-Espinoza, Jack Mumford, Stipe Pandzic, Robin Schaefer, Luke Thorburn, Andreas Xydis, Antonio Yuste-Ginel, Heng Zheng

(参考訳) 本巻は、OHAAI(Online Handbook of Argumentation for AI)の第2巻に選択された論文の改訂版を含む。従来、議論と議論の相互作用の形式理論が提案され研究され、近年では議論の計算モデルが研究されている。人工知能(AI)の分野としての論証は、知識の象徴的表現や実現不可能な推論に関心を持つ研究者にとって非常に重要である。このハンドブックの目的は、議論研究コミュニティにオープンアクセスとキュレートされたアンソロジーを提供することである。 OHAAIは、AIに関連するあらゆる分野における議論の理論と応用に関する、最新のおよび今後の博士主導の研究を追跡するための研究ハブとして設計されている。

This volume contains revised versions of the papers selected for the second volume of the Online Handbook of Argumentation for AI (OHAAI). Previously, formal theories of argument and argument interaction have been proposed and studied, and this has led to the more recent study of computational models of argument. Argumentation, as a field within artificial intelligence (AI), is highly relevant for researchers interested in symbolic representations of knowledge and defeasible reasoning. The purpose of this handbook is to provide an open access and curated anthology for the argumentation research community. OHAAI is designed to serve as a research hub to keep track of the latest and upcoming PhD-driven research on the theory and application of argumentation in all areas related to AI.

翻訳日:2021-06-27 09:52:53 公開日:2021-06-23

# (参考訳) 硬膜外電図信号からのロバスト軌道復号のための部分的最大コレントロピー回帰

Partial Maximum Correntropy Regression for Robust Trajectory Decoding from Noisy Epidural Electrocorticographic Signals ( http://arxiv.org/abs/2106.13086v1 )

ライセンス: CC BY 4.0

Yuanhao Li, Badong Chen, Gang Wang, Natsue Yoshimura, Yasuharu Koike

(参考訳) PLSR(Partial Least Square Regression)アルゴリズムは、脳-コンピュータインタフェースにおける相関脳記録から連続変数を予測する特別な能力を示し、近年のマカクの硬膜外電図から3次元連続ハンドトラジェクトリへの予測に成功した。それにもかかわらず、PLSRは本質的に最小二乗基準に基づいて定式化されており、結果として複雑な雑音に関して損なわれない。本研究の目的は,PLSRの堅牢なバージョンを提案することである。この目的のために、最大コレントロピー基準は、PLSRの新しい頑健な変種であるPartial Maximum Correntropy Regression (PMCR)を構築するために採用されている。半量子最適化手法を用いて頑健な潜在変数を計算する。提案するPMCRを合成例と公開Neurotychoデータセットを用いて評価した。従来のPLSRと最先端の変種と比較して、PMCRは、汚染されたトレーニングセットを持つ3つの異なるパフォーマンス指標に対して優れた予測能力を実現した。提案するpmcrは雑音下脳測定からのロバスト復号化に有効な手法として実証され,ノイズによる性能劣化を低減し,脳-コンピュータ界面の復号ロバスト性を向上させることができた。

The Partial Least Square Regression (PLSR) algorithm exhibits exceptional competence for predicting continuous variables from inter-correlated brain recordings in brain-computer interfaces, and it recently achieved successful prediction of three-dimensional continuous hand trajectories from epidural electrocorticography of macaques. Nevertheless, PLSR is in essence formulated based on the least square criterion and is therefore non-robust with respect to complicated noise. The aim of the present study is to propose a robust version of PLSR. To this end, the maximum correntropy criterion is adopted to construct a new robust variant of PLSR, namely Partial Maximum Correntropy Regression (PMCR). A half-quadratic optimization technique is utilized to calculate the robust latent variables. We assess the proposed PMCR on a synthetic example and the public Neurotycho dataset. Compared with the conventional PLSR and the state-of-the-art variant, PMCR achieved superior prediction competence on three different performance indicators with a contaminated training set. The proposed PMCR was demonstrated to be an effective approach for robust decoding from noisy brain measurements, which could reduce the performance degradation resulting from adverse noise and thus improve the decoding robustness of brain-computer interfaces.
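The contrast between the least square criterion and the maximum correntropy criterion can be illustrated on a much simpler problem than PLSR. The sketch below fits a single slope (no intercept, no latent variables) and is therefore not PMCR itself; it only shows how a half-quadratic style iteration turns the correntropy objective into repeated weighted least squares, with Gaussian-kernel weights that shrink toward zero for large residuals. The kernel width `sigma`, the iteration count, and the toy data are assumptions chosen for the example.

```python
import math

def ols_slope(xs, ys):
    # Ordinary least squares slope for the model y = w * x.
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

def mcc_slope(xs, ys, sigma=2.0, iterations=20):
    # Maximum correntropy estimate of w via half-quadratic style iterations:
    # each pass recomputes Gaussian-kernel weights from the current residuals
    # and then solves a weighted least squares problem in closed form.
    w = ols_slope(xs, ys)
    for _ in range(iterations):
        weights = [math.exp(-((y - w * x) ** 2) / (2.0 * sigma ** 2))
                   for x, y in zip(xs, ys)]
        denom = sum(p * x * x for p, x in zip(weights, xs))
        if denom == 0.0:
            # All weights underflowed; keep the current estimate.
            break
        w = sum(p * x * y for p, x, y in zip(weights, xs, ys)) / denom
    return w

# Toy data: y = 2x with one gross outlier in the last sample.
xs = [float(i) for i in range(1, 11)]
ys = [2.0 * x for x in xs]
ys[-1] = 100.0
w_ols = ols_slope(xs, ys)
w_mcc = mcc_slope(xs, ys)
```

On this toy data the least squares slope is pulled far from 2 by the single outlier, while the correntropy-based estimate stays close to 2 because the outlier's weight becomes negligible.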

翻訳日:2021-06-26 12:00:07 公開日:2021-06-23

# (参考訳) ジオタグ付きソーシャルメディアにおけるリアルタイム時空間イベント検出

Real-time Spatio-temporal Event Detection on Geotagged Social Media ( http://arxiv.org/abs/2106.13121v1 )

ライセンス: CC BY 4.0

Yasmeen George, Shanika Karunasekera, Aaron Harwood and Kwan Hui Lim

(参考訳) ソーシャルメディアデータストリームのマイニングにおける重要な課題は、特定の地域またはグローバルな地域の人々のグループによって活発に議論されるイベントを特定することである。このような出来事は、事故、抗議、選挙、突破ニュースの早期警告に有用である。しかし、イベントのリストやイベント時間と空間の解決は事前に固定または既知のものではない。本研究では,異なる時間と空間解像度のイベントを検出可能なソーシャルメディアを用いたオンライン時空間イベント検出システムを提案する。まず, イベントの空間分解に関する課題に対処するため, ソーシャルメディアデータの密度に基づいて, 地理的空間をマルチスケール領域に分割するために, クワッドツリー法を用いる。次に,ポアソン分布とソーシャルポストの予期せぬ密度の領域を強調する平滑化を含む統計的非教師なしアプローチを行う。さらに、連続した時間間隔で同じ領域で発生した事象をマージすることにより、イベント期間を正確に推定する。ポスト処理ステージは、スパム、フェイク、不正なイベントをフィルタリングするために導入される。最後に,ソーシャルメディアを利用した単純な意味論を取り入れ,検出された事象の完全性や正確性を評価する。提案手法は,メルボルン,ロンドン,パリ,ニューヨークの各都市を対象としたtwitterとflickrのソーシャルメディアデータセットを用いて評価される。提案手法の有効性を検証するため,地理的空間の固定分割とクラスタリング法に基づく2つのベースラインアルゴリズムとの比較を行った。性能評価のために,手動でリコールと精度を計算する。また,報告された事象の正確性を自動的に測定する「強度指標」という新しい品質指標を提案する。

A key challenge in mining social media data streams is to identify events which are actively discussed by a group of people in a specific local or global area. Such events are useful for early warning of accidents, protests, elections or breaking news. However, neither the list of events nor the resolution of both event time and space is fixed or known beforehand. In this work, we propose an online spatio-temporal event detection system using social media that is able to detect events at different time and space resolutions. First, to address the challenge related to the unknown spatial resolution of events, a quad-tree method is exploited in order to split the geographical space into multiscale regions based on the density of social media data. Then, a statistical unsupervised approach is performed that involves the Poisson distribution and a smoothing method for highlighting regions with unexpected density of social posts. Further, event duration is precisely estimated by merging events happening in the same region at consecutive time intervals. A post-processing stage is introduced to filter out events that are spam, fake or wrong. Finally, we incorporate simple semantics by using social media entities to assess the integrity and accuracy of detected events. The proposed method is evaluated using different social media datasets (Twitter and Flickr) for different cities: Melbourne, London, Paris and New York. To verify the effectiveness of the proposed method, we compare our results with two baseline algorithms based on a fixed split of geographical space and a clustering method. For performance evaluation, we manually compute recall and precision. We also propose a new quality measure named strength index, which automatically measures how accurate the reported event is.
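The first two steps described above (density-driven quad-tree splitting, then a Poisson-based check for unexpected post density) can be sketched as follows. This is a minimal illustration, not the authors' system: the splitting threshold `max_points`, the depth limit, the use of an upper-tail Poisson probability as the "surprise" score, and the sample coordinates are all assumptions, and the smoothing step mentioned in the abstract is not modelled.

```python
import math

def quadtree_regions(points, box, max_points=4, max_depth=6, depth=0):
    # Recursively split box = (x0, y0, x1, y1) into four quadrants while it
    # holds more than max_points posts. Boxes are half-open: a point on the
    # upper or right edge of the root box is not counted.
    x0, y0, x1, y1 = box
    inside = [(x, y) for x, y in points if x0 <= x < x1 and y0 <= y < y1]
    if len(inside) <= max_points or depth == max_depth:
        return [(box, len(inside))]
    xm, ym = (x0 + x1) / 2, (y0 + y1) / 2
    regions = []
    for child in ((x0, y0, xm, ym), (xm, y0, x1, ym),
                  (x0, ym, xm, y1), (xm, ym, x1, y1)):
        regions.extend(quadtree_regions(inside, child, max_points, max_depth, depth + 1))
    return regions

def poisson_tail(count, expected):
    # P(X >= count) for X ~ Poisson(expected); small values flag regions whose
    # observed post count is unexpectedly high for their usual rate.
    term = math.exp(-expected)
    cdf = 0.0
    for i in range(count):
        cdf += term
        term *= expected / (i + 1)
    return max(0.0, 1.0 - cdf)

# Hypothetical geotagged posts: a dense cluster plus one isolated post.
posts = [(0.5, 0.5), (0.6, 0.7), (0.7, 0.6), (0.8, 0.8),
         (0.9, 0.5), (0.5, 0.9), (6.0, 6.0)]
regions = quadtree_regions(posts, (0.0, 0.0, 8.0, 8.0))
```

Dense areas end up covered by small regions and sparse areas by large ones, which is what lets a single pass detect events at several spatial resolutions; each region's count can then be compared with its historical rate through `poisson_tail`.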

翻訳日:2021-06-26 11:16:32 公開日:2021-06-23

# (参考訳) 芸術解釈と意味のモデル化。イコノロジーとイコノグラフィーを記述するためのデータモデル

Modelling Art Interpretation and Meaning. A Data Model for Describing Iconology and Iconography ( http://arxiv.org/abs/2106.12967v1 )

ライセンス: CC BY 4.0

S. Baroncini (1), M. Daquino (1), F. Tomasi (1) ((1) Department of Classical Philology and Italian Studies, University of Bologna)

(参考訳) イコノロジー(Iconology)は、美術史の分野の一つで、芸術の社会的・文化的背景に関する意味を研究する。今日、いくつかの学際研究分野は、データサイエンスの手法とセマンティックウェブ技術を用いて定量的美術史を追求するために、イコノロジーに近い理論的な枠組みを利用している。しかし、近年ではイコノグラフィー研究がオントロジーで取り上げられているが、イコノロジー研究に関連する側面の完全な記述はいまだに欠落している。本稿では,本論文から選択した11の事例研究について予備研究を行い,既存のオントロジーを拡張するための新たな用語を提案する。我々は,新しい用語を共通の評価手法で検証し,デジタル美術史のコミュニティにおいて,このような拡張オントロジーが生まれる可能性に照らして,その結果について考察する。

Iconology is a branch of art history that investigates the meaning of artworks in relation to their social and cultural background. Nowadays, several interdisciplinary research fields leverage theoretical frameworks close to iconology to pursue quantitative art history with data science methods and Semantic Web technologies. However, while iconographic studies have recently been addressed in ontologies, a complete description of aspects relevant to iconological studies is still missing. In this article, we present a preliminary study on eleven case studies selected from the literature and we envision new terms for extending existing ontologies. We validate new terms according to a common evaluation method and we discuss our results in the light of the opportunities that such an extended ontology would create in the community of Digital Art History.

翻訳日:2021-06-26 10:45:24 公開日:2021-06-23

# (参考訳) 連続時間深部グリオーマ成長モデル

Continuous-Time Deep Glioma Growth Models ( http://arxiv.org/abs/2106.12917v1 )

ライセンス: CC BY-SA 4.0

Jens Petersen and Fabian Isensee and Gregor Köhler and Paul F. Jäger and David Zimmerer and Ulf Neuberger and Wolfgang Wick and Jürgen Debus and Sabine Heiland and Martin Bendszus and Philipp Vollmuth and Klaus H. Maier-Hein

(参考訳) 将来、腫瘍がどのように進化するかを推定できる能力は、治療決定の改善から放射線治療における線量分布の改善まで、大きな臨床効果をもたらす可能性がある。最近の研究は、深層学習と変分推論を通じてグリオーマ成長モデル問題にアプローチし、実際の患者データ分布から完全に学習する。これまでのところ、このアプローチは画像取得間隔と固定長のシーケンスに制約されており、より現実的なシナリオにおける適用性を制限する。本稿では,確率的時系列の条件生成モデルであるNeural Processesを拡張し,時空間の注意機構を含む階層的マルチスケール表現符号化を行う。その結果、任意の数の観測で条件付けできる学習的成長モデルとなり、連続時間軸上で時間的に一貫した成長軌道の分布を生成することができる。 379人の患者のデータセット上で、この手法は画像のグローバルおよびよりきめ細かなバリエーションを捉え、他の学習された成長モデルよりも優れたパフォーマンスを示す。

The ability to estimate how a tumor might evolve in the future could have tremendous clinical benefits, from improved treatment decisions to better dose distribution in radiation therapy. Recent work has approached the glioma growth modeling problem via deep learning and variational inference, thus learning growth dynamics entirely from a real patient data distribution. So far, this approach was constrained to predefined image acquisition intervals and sequences of fixed length, which limits its applicability in more realistic scenarios. We overcome these limitations by extending Neural Processes, a class of conditional generative models for stochastic time series, with a hierarchical multi-scale representation encoding including a spatio-temporal attention mechanism. The result is a learned growth model that can be conditioned on an arbitrary number of observations, and that can produce a distribution of temporally consistent growth trajectories on a continuous time axis. On a dataset of 379 patients, the approach successfully captures both global and finer-grained variations in the images, exhibiting superior performance compared to other learned growth models.

翻訳日:2021-06-26 10:24:29 公開日:2021-06-23

# (参考訳) 解釈可能なグラフニューラルネットワークのための学習スパーシフィケーション

Learnt Sparsification for Interpretable Graph Neural Networks ( http://arxiv.org/abs/2106.12920v1 )

ライセンス: CC BY 4.0

Mandeep Rathee, Zijian Zhang, Thorben Funke, Megha Khosla, and Avishek Anand

(参考訳) グラフニューラルネットワーク(GNN)は、リレーショナルモデリングを必要とするさまざまなタスクや分野において大きな成功を収めている。 GNNは、グラフ構造を帰納バイアスとして利用し、柔軟性と強力なモデルを生成する。しかし、ノード特徴とグラフ構造の間の相互作用が暗黙的にのみ学習されるため、GNNの解釈は困難である。本稿では,不要な近傍を除去し,基礎となるグラフを明示的にスパースする手法であるkedgeを提案する。提案手法は,任意のgnnモデルと共役で使用可能な硬いkumaraswamy分布を用いた,扱いやすいスパーシフィケーション法に基づいている。 Kedgeは、任意のGNNでトレーニングされたモジュール方式でエッジマスクを学び、エンドツーエンドで勾配ベースの最適化を実現する。実験では,実験精度に小さな影響を及ぼさずに,エッジのかなりの割合をプルーピングできることを実証した。具体的には、pubmedデータセットでkedgeは、エッジの80%以上をドロップし、わずか2%の精度低下でグラフ構造がノードの機能に対して小さな貢献しか持たないことを学んでいる。最後に、Kedgeは、GNN層の増加とともにタスク性能を向上し、深いGNNにおいて過度にスムースな現象に効果的に対処することを示した。

Graph neural networks (GNNs) have achieved great success on various tasks and fields that require relational modeling. GNNs aggregate node features using the graph structure as an inductive bias, resulting in flexible and powerful models. However, GNNs remain hard to interpret, as the interplay between node features and graph structure is only implicitly learned. In this paper, we propose a novel method called Kedge for explicitly sparsifying the underlying graph by removing unnecessary neighbors. Our key idea is based on a tractable method for sparsification using the Hard Kumaraswamy distribution that can be used in conjunction with any GNN model. Kedge learns edge masks in a modular fashion and is trained with any GNN, allowing for gradient-based optimization in an end-to-end fashion. We demonstrate through extensive experiments that our model Kedge can prune a large proportion of the edges with only a minor effect on the test accuracy. Specifically, on the PubMed dataset, Kedge learns to drop more than 80% of the edges with an accuracy drop of merely 2%, showing that graph structure makes only a small contribution in comparison to node features. Finally, we also show that Kedge effectively counters the over-smoothing phenomenon in deep GNNs by maintaining good task performance with increasing GNN layers.
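A Hard Kumaraswamy sample for a single edge mask can be sketched with the stretch-and-rectify construction that is commonly used to define this distribution. The sketch is an assumption about how such a mask could be drawn and is not taken from the Kedge implementation: the stretch bounds `lower` and `upper`, the shape parameters, and the function name are hypothetical, and in an actual GNN the shape parameters would be learned with an autodiff framework rather than fixed.

```python
import random

def sample_hard_kumaraswamy(a, b, lower=-0.1, upper=1.1, rng=random):
    # Draw u ~ Uniform(0, 1) and push it through the inverse CDF of
    # Kumaraswamy(a, b), which has the closed form below.
    u = rng.random()
    k = (1.0 - (1.0 - u) ** (1.0 / b)) ** (1.0 / a)
    # Stretch the sample from (0, 1) to (lower, upper), then clamp to [0, 1].
    # The clamping puts non-zero probability mass on exactly 0 (edge dropped)
    # and exactly 1 (edge kept).
    stretched = lower + (upper - lower) * k
    return min(1.0, max(0.0, stretched))

# Hypothetical usage: sample masks for 2000 edges with a = b = 1.
generator = random.Random(0)
masks = [sample_hard_kumaraswamy(1.0, 1.0, rng=generator) for _ in range(2000)]
```

Because the sample is a deterministic function of uniform noise and the shape parameters, the same computation written in a differentiable framework passes gradients to `a` and `b`, which is what makes mask learning compatible with end-to-end training.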

翻訳日:2021-06-26 10:11:42 公開日:2021-06-23

# (参考訳) 機械学習を用いた救急医療における入院予測

Using machine learning techniques to predict hospital admission at the emergency department ( http://arxiv.org/abs/2106.12921v1 )

ライセンス: CC BY 4.0

Georgios Feretzakis, George Karlis, Evangelos Loupelis, Dimitris Kalles, Rea Chatzikyriakou, Nikolaos Trakas, Eugenia Karakou, Aikaterini Sakagianni, Lazaros Tzelves, Stavroula Petropoulou, Aikaterini Tika, Ilias Dalainas and Vasileios Kaldis

(参考訳) 紹介:救急部門(ED)における最も重要な課題の1つは、病院入院の恩恵を受ける患者を迅速に特定することである。機械学習(ML)技術は、医療における診断支援として有望であることを示している。材料と方法: 尿素, クレアチニン, 乳酸脱水素酵素, クレアチンキナーゼ, c-反応性蛋白, 血液計数, 活性化部分トロンボプラスチン時間, dダイマー, 国際正規化比, 年齢, 性別, edユニットへのトリアージ配置, 救急車の使用率など, 入院率の予測における成績について検討した。合計3,204回のED訪問が分析された。結果:提案アルゴリズムは,ED患者の入院予測における許容性能を示すモデルを生成する。 8つの評価アルゴリズムのF値とROC値の範囲はそれぞれ [0.679-0.708] と [0.734-0.774] であった。議論: このツールの主な利点は、簡単アクセス、可用性、イエス/ノー結果、低コストである。本手法の臨床的意義は,従来の臨床的意思決定からより洗練されたモデルへの移行を促進する可能性がある。結論: 共通のバイオマーカーを利用したロバストな予後モデルの開発は, 救急医療の将来を形作るかもしれない。本研究は,実用的ED試験の実施を保証している。

Introduction: One of the most important tasks in the Emergency Department (ED) is to promptly identify the patients who will benefit from hospital admission. Machine Learning (ML) techniques show promise as diagnostic aids in healthcare. Material and methods: We investigated the performance of the following features in predicting hospital admission: serum levels of Urea, Creatinine, Lactate Dehydrogenase, Creatine Kinase, C-Reactive Protein, Complete Blood Count with differential, Activated Partial Thromboplastin Time, D-Dimer, International Normalized Ratio, age, gender, triage disposition to ED unit and ambulance utilization. A total of 3,204 ED visits were analyzed. Results: The proposed algorithms generated models which demonstrated acceptable performance in predicting hospital admission of ED patients. The ranges of the F-measure and ROC Area values of all eight evaluated algorithms were [0.679-0.708] and [0.734-0.774], respectively. Discussion: The main advantages of this tool include easy access, availability, yes/no result, and low cost. The clinical implications of our approach might facilitate a shift from traditional clinical decision-making to a more sophisticated model. Conclusion: Developing robust prognostic models with the utilization of common biomarkers is a project that might shape the future of emergency medicine. Our findings warrant confirmation with implementation in pragmatic ED trials.
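As a reminder of what the reported F-measure summarizes, the sketch below computes it from confusion-matrix counts for a binary admit/discharge prediction. The counts in the example are hypothetical and are not taken from the study, and the ROC Area, the second reported metric, is not covered here.

```python
def f_measure(tp, fp, fn, beta=1.0):
    # Precision: share of predicted admissions that were real admissions.
    precision = tp / (tp + fp) if tp + fp else 0.0
    # Recall: share of real admissions that the model predicted.
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    # Weighted harmonic mean of precision and recall (beta = 1 gives F1).
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)

# Hypothetical counts: 70 true positives, 30 false positives, 30 false negatives.
example_f1 = f_measure(70, 30, 30)
```

With these made-up counts both precision and recall are 0.7, so the F-measure is also 0.7, which is in the same range as the values reported above.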

翻訳日:2021-06-26 09:58:28 公開日:2021-06-23

# (参考訳) contextized token representationsを用いた臨床名付きエンティティ認識

Clinical Named Entity Recognition using Contextualized Token Representations ( http://arxiv.org/abs/2106.12608v1 )

ライセンス: CC BY 4.0

Yichao Zhou, Chelsea Ju, J. Harry Caufield, Kevin Shih, Calvin Chen, Yizhou Sun, Kai-Wei Chang, Peipei Ping, Wei Wang

(参考訳) clinical named entity recognition (cner) タスクは、診断手順、疾患障害、重症度、薬物、薬物量、徴候などの予め定義されたカテゴリに臨床用語を分類することを目的としている。 CNERは、新しい現象の同定や人為的な情報抽出を含む薬物に対する副作用の研究を促進する。関心の実体を抽出する既存のアプローチは、各単語を表現するために静的な単語埋め込みを使うことに焦点を当てている。しかし、1つの単語は、文の文脈に依存する異なる解釈を持つことができる。静的な単語埋め込みは、単語の多様な解釈を統合するには不十分である。この課題を克服するために,各単語の意味的意味をより正確に把握するために,文脈的単語埋め込み技術が導入された。これら2つの言語モデルであるelmoとflairは、自然言語処理の分野で広く使われ、ドメインジェネリックドキュメントにコンテキスト化された単語埋め込みを生成する。しかし、これらの埋め込みは通常、特定のドメインの語彙間の近接を捉えるには一般的すぎる。臨床症例報告 (CCR) を用いた下流の様々な応用を容易にするため, PubMed Central による臨床関連コーパスを用いて, 深層文脈言語モデル (C-ELMo) と臨床コンテキスト文字列埋め込み (C-Flair) を事前訓練した。明示的な実験により、私たちのモデルは静的な単語埋め込みとドメイン固有言語モデルの両方と比較して劇的な改善が得られます。

The clinical named entity recognition (CNER) task seeks to locate and classify clinical terminologies into predefined categories, such as diagnostic procedure, disease disorder, severity, medication, medication dosage, and sign symptom. CNER facilitates the study of side effects of medications, including identification of novel phenomena and human-focused information extraction. Existing approaches to extracting the entities of interest focus on using static word embeddings to represent each word. However, one word can have different interpretations that depend on the context of the sentence. Evidently, static word embeddings are insufficient to integrate the diverse interpretations of a word. To overcome this challenge, the technique of contextualized word embedding has been introduced to better capture the semantic meaning of each word based on its context. Two of these language models, ELMo and Flair, have been widely used in the field of Natural Language Processing to generate contextualized word embeddings on domain-generic documents. However, these embeddings are usually too general to capture the proximity among vocabularies of specific domains. To facilitate various downstream applications using clinical case reports (CCRs), we pre-train two deep contextualized language models, Clinical Embeddings from Language Model (C-ELMo) and Clinical Contextual String Embeddings (C-Flair), using a clinical-related corpus from PubMed Central. Explicit experiments show that our models gain dramatic improvements compared to both static word embeddings and domain-generic language models.

翻訳日:2021-06-26 09:52:02 公開日:2021-06-23

# (参考訳) 機械学習とディープラーニングを用いた手書き文字認識

Handwritten Digit Recognition using Machine and Deep Learning Algorithms ( http://arxiv.org/abs/2106.12614v1 )

ライセンス: CC0 1.0

Samay Pashine, Ritik Dixit, and Rishika Kushwah

(参考訳) 人間のマシンへの依存度はかつてないほど高く、写真のオブジェクト分類からサイレント映画への音の追加まで、ディープラーニングと機械学習アルゴリズムの助けを借りてあらゆることが実行できる。同様に、手書きのテキスト認識は研究と開発において重要な分野の1つであり、多くの可能性があり得る。手書き文字認識 (HWR) は、手書きテキスト認識 (HTR) とも呼ばれ、紙文書、写真、タッチスクリーン、その他の装置から手書き入力を受信し、解釈するコンピュータの能力である。本稿では,MNISTデータセットを用いて,Support Vector Machines (SVM), Multi-Layer Perceptron (MLP), Convolution Neural Network (CNN)モデルを用いて手書き桁認識を行った。我々の主な目的は、上述したモデルの精度と実行時間を比較して、桁認識に最適なモデルを得ることである。

Human reliance on machines has never been higher: from object classification in photographs to adding sound to silent movies, everything can be performed with the help of deep learning and machine learning algorithms. Likewise, handwritten text recognition is a significant area of research and development with a growing number of possible applications. Handwriting recognition (HWR), also known as Handwritten Text Recognition (HTR), is the ability of a computer to receive and interpret intelligible handwritten input from sources such as paper documents, photographs, touch-screens and other devices [1]. In this paper, we perform handwritten digit recognition on the MNIST dataset using Support Vector Machines (SVM), Multi-Layer Perceptron (MLP) and Convolution Neural Network (CNN) models. Our main objective is to compare the accuracy of the models stated above along with their execution time to get the best possible model for digit recognition.

翻訳日:2021-06-26 09:40:17 公開日:2021-06-23

# (参考訳) 量子多体問題に対する確率的機械学習

Provably efficient machine learning for quantum many-body problems ( http://arxiv.org/abs/2106.12627v1 )

ライセンス: CC BY 4.0

Hsin-Yuan Huang, Richard Kueng, Giacomo Torlai, Victor V. Albert, John Preskill

(参考訳) 古典機械学習(ML)は、物理学と化学における量子多体問題の解決に潜在的に強力なアプローチを提供する。しかし,従来の手法に比べてMLの優位性は確立されていない。本研究では, 古典的mlアルゴリズムを用いて, ガッピングハミルトニアンの有限次元における基底状態特性を, 同じ量子相で他のハミルトニアンを測定した結果から効率的に予測できることを実証する。対照的に、広く受け入れられている複雑性理論の仮定の下では、データから学ばない古典的アルゴリズムは同じ保証を達成できない。また、古典的MLアルゴリズムは、幅広い量子位相の物質を効率的に分類できることを示す。我々の議論は古典的な影の概念に基づいており、これは多体量子状態の簡潔な古典的な記述であり、実現可能な量子実験で構築でき、状態の多くの特性を予測できる。大規模数値実験は、Rydberg原子系、2次元ランダムハイゼンベルクモデル、対称性保護位相、位相秩序相などの様々なシナリオにおいて、我々の理論結果を裏付ける。

Classical machine learning (ML) provides a potentially powerful approach to solving challenging quantum many-body problems in physics and chemistry. However, the advantages of ML over more traditional methods have not been firmly established. In this work, we prove that classical ML algorithms can efficiently predict ground state properties of gapped Hamiltonians in finite spatial dimensions, after learning from data obtained by measuring other Hamiltonians in the same quantum phase of matter. In contrast, under widely accepted complexity theory assumptions, classical algorithms that do not learn from data cannot achieve the same guarantee. We also prove that classical ML algorithms can efficiently classify a wide range of quantum phases of matter. Our arguments are based on the concept of a classical shadow, a succinct classical description of a many-body quantum state that can be constructed in feasible quantum experiments and be used to predict many properties of the state. Extensive numerical experiments corroborate our theoretical results in a variety of scenarios, including Rydberg atom systems, 2D random Heisenberg models, symmetry-protected topological phases, and topologically ordered phases.

翻訳日:2021-06-26 09:31:18 公開日:2021-06-23

# (参考訳) リスク階層化と分析のための医療的主張に基づくトランスフォーマーに基づく教師なし患者表現学習

Transformer-based unsupervised patient representation learning based on medical claims for risk stratification and analysis ( http://arxiv.org/abs/2106.12658v1 )

ライセンス: CC BY 4.0

Xianlong Zeng, Simon Lin, Chang Liu

(参考訳) クレームデータは、医療コード、サービス情報、および発生した支出を含むものであり、個人の健康状態と医療リスクレベルを推定するのによい資源である。本研究では,マルチモーダルオートエンコーダ(tmae,transformer-based multimodal autoencoder)を開発した。 TMAEは、医療の実践的なニーズにより、患者を異なるリスクレベルに階層化し、ケア提供と管理を改善する。従来のアプローチと比較して, TMAEは, 1) 入院者, 外来患者, 服薬請求を総合的にモデル化し, 2) 医療イベント間の不規則な時間間隔を処理し, 3) まれな医療基準の空白問題を緩和し, 4) 医療費情報を組み込むことができる。我々は,60万人以上の患者を含む実世界の小児クレームデータセットを用いてtmaeを訓練し,その性能を2つのクラスタリングタスクにおける様々なアプローチと比較した。実験の結果, TMAEは全てのベースラインに比べて優れた性能を示した。フレームワークの有効性を説明するために、複数のダウンストリームアプリケーションも実施する。有望な結果は,TMAEフレームワークが大規模クレームデータに対してスケーラブルであり,リスク階層化と分析のために効率的な患者埋め込みを生成することができることを確認した。

The claims data, containing medical codes, services information, and incurred expenditure, can be a good resource for estimating an individual's health condition and medical risk level. In this study, we developed Transformer-based Multimodal AutoEncoder (TMAE), an unsupervised learning framework that can learn efficient patient representation by encoding meaningful information from the claims data. TMAE is motivated by the practical needs in healthcare to stratify patients into different risk levels for improving care delivery and management. Compared to previous approaches, TMAE is able to 1) model inpatient, outpatient, and medication claims collectively, 2) handle irregular time intervals between medical events, 3) alleviate the sparsity issue of the rare medical codes, and 4) incorporate medical expenditure information. We trained TMAE using a real-world pediatric claims dataset containing more than 600,000 patients and compared its performance with various approaches in two clustering tasks. Experimental results demonstrate that TMAE has superior performance compared to all baselines. Multiple downstream applications are also conducted to illustrate the effectiveness of our framework. The promising results confirm that the TMAE framework is scalable to large claims data and is able to generate efficient patient embeddings for risk stratification and analysis.

翻訳日:2021-06-26 09:29:32 公開日:2021-06-23

# (参考訳) 畳み込みニューラルネットワークを用いた高速で忠実なlyman$\alpha$ forests

Fast, high-fidelity Lyman $\alpha$ forests with convolutional neural networks ( http://arxiv.org/abs/2106.12662v1 )

ライセンス: CC BY 4.0

Peter Harrington, Mustafa Mustafa, Max Dornfest, Benjamin Horowitz, Zarija Luki\'c

(参考訳) フル物理宇宙学シミュレーションは宇宙の構造の形成と進化を研究する強力なツールであるが、極端な計算資源を必要とする。そこで我々は,Nyxシミュレーションのデータを用いて,Lyman-$\alpha$(Ly$\alpha$)森林のバリオン流体力学変数(密度,温度,速度)を復元するために,より安価なN-body-onlyシミュレーションを使用するように畳み込みニューラルネットワークを訓練する。本手法は20kpcの解像度でこれらのフィールドの迅速な推定を可能にし,既存の近似値よりもはるかに精度の高いly$\alpha$ forestの統計値を取得する。私たちのモデルは完全なコンボリューションであるため、より小さなシミュレーションボックスでトレーニングし、より大きなモデルにデプロイすることが可能です。さらに, この手法は, ly$\alpha$ flux ではなく, 流体力学場の近似を生成するので, 電離背景や平均透過流束の特定の選択に限定されない。

Full-physics cosmological simulations are powerful tools for studying the formation and evolution of structure in the universe but require extreme computational resources. Here, we train a convolutional neural network to use a cheaper N-body-only simulation to reconstruct the baryon hydrodynamic variables (density, temperature, and velocity) on scales relevant to the Lyman-$\alpha$ (Ly$\alpha$) forest, using data from Nyx simulations. We show that our method enables rapid estimation of these fields at a resolution of $\sim$20kpc, and captures the statistics of the Ly$\alpha$ forest with much greater accuracy than existing approximations. Because our model is fully-convolutional, we can train on smaller simulation boxes and deploy on much larger ones, enabling substantial computational savings. Furthermore, as our method produces an approximation for the hydrodynamic fields instead of Ly$\alpha$ flux directly, it is not limited to a particular choice of ionizing background or mean transmitted flux.

翻訳日:2021-06-26 09:16:14 公開日:2021-06-23

# (参考訳) 連続ウェーブレット変換と畳み込みニューラルネットワークを用いたヒューマンアクティビティ認識

Human Activity Recognition using Continuous Wavelet Transform and Convolutional Neural Networks ( http://arxiv.org/abs/2106.12666v1 )

ライセンス: CC BY 4.0

Anna Nedorubova, Alena Kadyrova, Aleksey Khlyupin

(参考訳) 糖尿病患者や慢性疾患のある人、高齢者、障害者など、健康上の理由から常時監視を必要とする人は世界にかなり多く、これらの集団は、生命を脅かすような転倒や失神に見舞われるリスクが高い可能性がある。資源が限られているため、リスクのある人の大部分は必要な監視を受けられず、過度の危険にさらされる。現在、この問題はHAR(Human Activity Recognition)手法を用いて解決されている。 HARは、医療、スポーツ、セキュリティなど、幅広い分野の応用分野を持つ、有望で進展の速いデータサイエンス分野である。しかし,現在の認識技術では精度が著しく低いため,人間の行動分類の高精度な手法が提案されている。我々は、HAR問題に対処する新しいワークフローを提案し、加速度センサ信号からなるUniMiB SHARデータセット上で評価する。提案するモデルは連続ウェーブレット変換(CWT)と畳み込みニューラルネットワーク(CNN)に基づいている。ウェーブレット変換は信号特徴を時間領域と周波数領域の両方にローカライズし、その後、cnnはこれらの特徴を抽出して活動を認識する。また、CWTは1D加速度計信号を2D画像に変換するため、2Dネットワークの予測能力が著しく高いため、より良い結果が得られる。研究の過程で、畳み込みニューラルネットワークを構築し、空間軸の数、層数、各層内のニューロン数、画像サイズ、母ウェーブレットの種類、母ウェーブレットのゼロモーメントの順序など、モデルパラメータを変化させる。さらに,残差ブロックを持つモデルを適用することで,測定値が大きく向上する。最後に、99.26パーセントの精度に達することに成功し、この問題に対して価値のあるパフォーマンスである。

Quite a few people in the world have to stay under permanent surveillance for health reasons; they include diabetic people or people with some other chronic conditions, the elderly and the disabled. These groups may face a heightened risk of life-threatening falls or syncope. Due to the limited availability of resources, a substantial share of people at risk cannot receive the necessary monitoring and are thus exposed to excessive danger. Nowadays, this problem is usually addressed by applying Human Activity Recognition (HAR) methods. HAR is a promising and fast-moving Data Science field with a wide range of application areas such as healthcare, sport, and security. However, current recognition techniques are markedly lacking in accuracy; hence, the present paper suggests a highly accurate method for human activity classification. We propose a new workflow to address the HAR problem and evaluate it on the UniMiB SHAR dataset, which consists of accelerometer signals. The model we suggest is based on continuous wavelet transform (CWT) and convolutional neural networks (CNNs). The wavelet transform localizes signal features in both the time and frequency domains, after which a CNN extracts these features and recognizes the activity. It is also worth noting that CWT converts a 1D accelerometer signal into 2D images and thus makes it possible to obtain better results, as 2D networks have a significantly higher predictive capacity. In the course of the work we build a convolutional neural network and vary such model parameters as the number of spatial axes, number of layers, number of neurons in each layer, image size, type of mother wavelet, and the order of zero moment of the mother wavelet. In addition, we apply models with residual blocks, which resulted in significantly higher metric values. Finally, we reach 99.26% accuracy, which is a strong result for this problem.
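The 1D-to-2D step described in the abstract can be illustrated with a minimal, standard-library-only sketch of a continuous wavelet transform. The choice of the Ricker ("Mexican hat") mother wavelet, the scale list, the zero padding, and the function names `ricker` and `cwt` are all assumptions made for illustration; the abstract only says that the type of mother wavelet was varied, so this is not the authors' pipeline.

```python
import math


def ricker(t, scale):
    """Ricker ("Mexican hat") wavelet evaluated at offset t for a given scale."""
    u = t / scale
    norm = 2.0 / (math.sqrt(3.0 * scale) * math.pi ** 0.25)
    return norm * (1.0 - u * u) * math.exp(-0.5 * u * u)


def cwt(signal, scales):
    """Return a scalogram with one row per scale and one column per sample."""
    n = len(signal)
    rows = []
    for s in scales:
        half = int(math.ceil(5 * s))  # truncate the wavelet at +/- 5 scales
        kernel = [ricker(k, s) for k in range(-half, half + 1)]
        row = []
        for i in range(n):
            acc = 0.0
            for j, w in enumerate(kernel):
                idx = i + j - half
                if 0 <= idx < n:  # samples outside the signal count as zero
                    acc += signal[idx] * w
            row.append(acc)
        rows.append(row)
    return rows


# A toy 1D "accelerometer" trace: a pure sine wave with 100 samples.
signal = [math.sin(2 * math.pi * 5 * k / 100) for k in range(100)]
scalogram = cwt(signal, scales=[1, 2, 4, 8])
print(len(scalogram), len(scalogram[0]))
```

The resulting list of rows can be treated as a single-channel image, which is the kind of input a 2D CNN would consume.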

翻訳日:2021-06-26 09:01:31 公開日:2021-06-23

# (参考訳) 表現中立化による公正性

Fairness via Representation Neutralization ( http://arxiv.org/abs/2106.12674v1 )

ライセンス: CC BY 4.0

Mengnan Du, Subhabrata Mukherjee, Guanchu Wang, Ruixiang Tang, Ahmed Hassan Awadallah, Xia Hu

(参考訳) DNNモデルの既存のバイアス軽減手法は、主にデバイアスエンコーダの学習に取り組んでいる。このプロセスは、センシティブな属性に対して多くのインスタンスレベルのアノテーションを必要とするだけでなく、すべての公平さに敏感な情報がエンコーダから削除されたことを保証しません。これらの制限に対処するために、我々は以下の研究課題を探求する: DNNモデルの識別は、入力として偏りのある表現であっても、分類ヘッドを乱すだけで抑えられるか? そこで本研究では,DNNモデルのタスク固有分類先頭のみを曖昧にすることで,公平性を実現するための表現中立化(Representation Neutralization for Fairness, RNF)を提案する。そこで我々は,DNNモデルの分類ヘッドをトレーニングするために,同一の地下構造ラベルを持つサンプルを,異なる感度特性で利用し,その中性表現を用いて評価する。 RNFの鍵となる考え方は、特定のクラスラベルを持つエンコーダ表現において、公平さに敏感な情報間の素早い相関を捉えないようにすることである。機密属性アノテーションにアクセスせずに低リソース設定に対処するため、バイアス増幅モデルを用いて機密属性のプロキシアノテーションを生成する。複数のベンチマークデータセットに対する実験結果は、タスク固有の性能の低下を最小限に抑えつつ、DNNモデルの識別を効果的に削減するRNFフレームワークを実証している。

Existing bias mitigation methods for DNN models primarily work on learning debiased encoders. This process not only requires a lot of instance-level annotations for sensitive attributes, it also does not guarantee that all fairness sensitive information has been removed from the encoder. To address these limitations, we explore the following research question: Can we reduce the discrimination of DNN models by only debiasing the classification head, even with biased representations as inputs? To this end, we propose a new mitigation technique, namely, Representation Neutralization for Fairness (RNF) that achieves fairness by debiasing only the task-specific classification head of DNN models. Specifically, we leverage samples with the same ground-truth label but different sensitive attributes, and use their neutralized representations to train the classification head of the DNN model. The key idea of RNF is to discourage the classification head from capturing spurious correlation between fairness sensitive information in encoder representations and specific class labels. To address low-resource settings with no access to sensitive attribute annotations, we leverage a bias-amplified model to generate proxy annotations for sensitive attributes. Experimental results over several benchmark datasets demonstrate that our RNF framework effectively reduces discrimination of DNN models with minimal degradation in task-specific performance.

翻訳日:2021-06-26 08:46:05 公開日:2021-06-23

# Charformer: Gradient-based Subword Tokenizationによる高速文字変換器

Charformer: Fast Character Transformers via Gradient-based Subword Tokenization ( http://arxiv.org/abs/2106.12672v1 )

ライセンス: Link先を確認

Yi Tay, Vinh Q. Tran, Sebastian Ruder, Jai Gupta, Hyung Won Chung, Dara Bahri, Zhen Qin, Simon Baumgartner, Cong Yu, Donald Metzler

(参考訳) 自然言語処理における最先端モデルは、その一般化能力と新しい設定への適応を制限する、別個の厳密なサブワードトークン化アルゴリズムに依存している。本稿では,モデルの一部として単語のトークン化を端から端まで学習するモデルインダクティブバイアスを提案する。そこで本研究では,データ駆動方式で文字から潜在サブワード表現を自動的に学習する,ソフトグラデーションベースのサブワードトークンモジュール(GBST)を提案する。具体的には、gbstは候補のサブワードブロックを列挙し、ブロックスコアリングネットワークを用いて位置的にスコア付けすることを学習する。また、GBSTを統合し、バイトレベルで動作する深層トランスフォーマーモデルであるCharformerを紹介する。英語のグルー、多言語、騒がしいテキストデータセットに関する広範な実験を通じて、charformerは、一般的にparおよび時としてsubwordベースのモデルよりも優れたパフォーマンスを保ちながら、一連の競合バイトレベルのベースラインよりも優れています。さらにCharformerは高速で、バニラバイトレベルのトランスフォーマーとサブワードレベルのトランスフォーマーの両方のスピードを28%-100%向上し、競争上の品質を維持している。この作業は、エンドツーエンドで完全にトレーニングされた高性能なトークンフリーモデルの道を開くものだと考えています。

State-of-the-art models in natural language processing rely on separate rigid subword tokenization algorithms, which limit their generalization ability and adaptation to new settings. In this paper, we propose a new model inductive bias that learns a subword tokenization end-to-end as part of the model. To this end, we introduce a soft gradient-based subword tokenization module (GBST) that automatically learns latent subword representations from characters in a data-driven fashion. Concretely, GBST enumerates candidate subword blocks and learns to score them in a position-wise fashion using a block scoring network. We additionally introduce Charformer, a deep Transformer model that integrates GBST and operates on the byte level. Via extensive experiments on English GLUE, multilingual, and noisy text datasets, we show that Charformer outperforms a series of competitive byte-level baselines while generally performing on par and sometimes outperforming subword-based models. Additionally, Charformer is fast, improving the speed of both vanilla byte-level and subword-level Transformers by 28%-100% while maintaining competitive quality. We believe this work paves the way for highly performant token-free models that are trained completely end-to-end.
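The idea of enumerating candidate subword blocks and scoring them position-wise can be sketched as below. This is a toy illustration under strong assumptions, not the GBST module itself: each character is a single float rather than an embedding vector, blocks are non-overlapping mean pools, the `score` callable is a hypothetical stand-in for the learned block scoring network, and the downsampling that would follow is omitted.

```python
import math


def mean_pool_blocks(seq, b):
    """Mean-pool non-overlapping blocks of size b, repeating each block mean
    so that the output still has one value per character position."""
    out = []
    for start in range(0, len(seq), b):
        block = seq[start:start + b]
        out.extend([sum(block) / len(block)] * len(block))
    return out


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def soft_subword_mix(char_values, max_block=3, score=lambda v: v):
    """For every position, softly mix the candidate block representations of
    sizes 1..max_block, weighted by a softmax over their scores."""
    candidates = [mean_pool_blocks(char_values, b)
                  for b in range(1, max_block + 1)]
    mixed = []
    for i in range(len(char_values)):
        per_size = [c[i] for c in candidates]
        weights = softmax([score(v) for v in per_size])
        mixed.append(sum(w * v for w, v in zip(weights, per_size)))
    return mixed


# With a constant scorer every block size gets equal weight.
print(soft_subword_mix([0.0, 2.0], max_block=2, score=lambda v: 0.0))
```

Because the mixing weights come from a softmax, the whole operation is differentiable with respect to the scores, which is what would let such a tokenizer be trained end-to-end with the rest of a model.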

翻訳日:2021-06-25 15:22:24 公開日:2021-06-23

# 多層ランダムreluネットワークにおける逆例

Adversarial Examples in Multi-Layer Random ReLU Networks ( http://arxiv.org/abs/2106.12611v1 )

ライセンス: Link先を確認

Peter L. Bartlett, S\'ebastien Bubeck and Yeshwanth Cherapanamjeri

(参考訳) 独立ガウスパラメータを持つReLUネットワークにおける逆例の現象を考察する。一定の深さと広い幅のネットワーク(例えば、各層の幅が他の層の多項式であれば十分)に対して、入力ベクトルの小さな摂動は出力の大きな変化をもたらす。これにより、急速に幅を減らしたネットワークに対する Daniely と Schacham (2020) の結果と、2層ネットワークに対する Bubeck et al (2021) の結果が一般化される。この証明は、それらが計算する関数が線形に非常に近いため、これらのネットワークで逆例が生じることを示している。ネットワーク内のいくつかのポイントまでの最小の幅は、そのポイントまで計算されたマッピングのスケールと感度を決定する。主な結果は、深さが一定であるネットワークに対してであるが、この種の結果には深さの制約が必要であり、それは、一定の確率で定数に近い関数を計算する、適切なディープネットワークが存在するためである。

We consider the phenomenon of adversarial examples in ReLU networks with independent Gaussian parameters. For networks of constant depth and with a large range of widths (for instance, it suffices if the width of each layer is polynomial in that of any other layer), small perturbations of input vectors lead to large changes of outputs. This generalizes results of Daniely and Schacham (2020) for networks of rapidly decreasing width and of Bubeck et al. (2021) for two-layer networks. The proof shows that adversarial examples arise in these networks because the functions that they compute are very close to linear. Bottleneck layers in the network play a key role: the minimal width up to some point in the network determines scales and sensitivities of mappings computed up to that point. The main result is for networks with constant depth, but we also show that some constraint on depth is necessary for a result of this kind, because there are suitably deep networks that, with constant probability, compute a function that is close to constant.

翻訳日:2021-06-25 15:21:28 公開日:2021-06-23

# タブラルデータからのアイデアによるGNN説明の再検討

Reimagining GNN Explanations with ideas from Tabular Data ( http://arxiv.org/abs/2106.12665v1 )

ライセンス: Link先を確認

Anjali Singh, Shamanth R Nayak K, Balaji Ganesan

(参考訳) グラフニューラルネットワークの説明可能性技術は、グラフデータに基づいてトレーニングされたニューラルネットワークと決定木ベースのモデルの両方で利用可能な説明と比較して、まだ長い道のりがある。グラフと表データの両方にまたがるタスク、すなわちEntity Matchingを使って、GNNモデル説明に欠けている説明可能性の重要な側面についてコメントする。

Explainability techniques for Graph Neural Networks still have a long way to go compared to explanations available for both neural and decision tree-based models trained on tabular data. Using a task that straddles both graphs and tabular data, namely Entity Matching, we comment on key aspects of explainability that are missing in GNN model explanations.

翻訳日:2021-06-25 15:19:13 公開日:2021-06-23

# 多目的非同期逐次Halving

Multi-objective Asynchronous Successive Halving ( http://arxiv.org/abs/2106.12639v1 )

ライセンス: Link先を確認

Robin Schmucker, Michele Donini, Muhammad Bilal Zafar, David Salinas, C\'edric Archambeau

(参考訳) ハイパーパラメータ最適化(HPO)は、機械学習モデルの予測性能(例えば精度)を自動調整するために、ますます使われている。しかし、現実世界のアプリケーションでは、精度は複数の(しばしば矛盾する)パフォーマンス基準の1つに過ぎず、多目的(MO)の観点を採用する必要がある。 MO最適化に関する文献は豊富だが、HPOに焦点を当てた先行研究はほとんどない。本稿では,非同期連続半減期(ASHA)をMO設定に拡張するアルゴリズムを提案する。複数の評価指標を考慮して,3つの実世界課題,すなわち(i)ニューラルアーキテクチャ探索,(ii)アルゴリズム的公平性,(iii)言語モデル最適化の性能評価を行った。実験分析の結果,MO ASHAはMO HPOを大規模に実行可能であることがわかった。さらに,パレートフロント全体を候補選択の考慮に入れることで,壁時計時間の観点からのmoスカラー化に基づくマルチ忠実度hpoを一貫して上回っていることを観察する。私たちのアルゴリズム(オープンソース化)は、この分野における今後の研究のための新しいベースラインを確立します。

Hyperparameter optimization (HPO) is increasingly used to automatically tune the predictive performance (e.g., accuracy) of machine learning models. However, in a plethora of real-world applications, accuracy is only one of the multiple -- often conflicting -- performance criteria, necessitating the adoption of a multi-objective (MO) perspective. While the literature on MO optimization is rich, few prior studies have focused on HPO. In this paper, we propose algorithms that extend asynchronous successive halving (ASHA) to the MO setting. Considering multiple evaluation metrics, we assess the performance of these methods on three real-world tasks: (i) neural architecture search, (ii) algorithmic fairness and (iii) language model optimization. Our empirical analysis shows that MO ASHA makes it possible to perform MO HPO at scale. Further, we observe that taking the entire Pareto front into account for candidate selection consistently outperforms multi-fidelity HPO based on MO scalarization in terms of wall-clock time. Our algorithms (to be open-sourced) establish new baselines for future research in the area.
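To make "taking the entire Pareto front into account for candidate selection" concrete, here is a small sketch of one promotion step of successive halving that ranks candidates by non-dominated sorting. It is a synchronous simplification written for illustration: the function names, the `eta` reduction factor, and the tie-breaking by input order are assumptions, and the abstract does not specify the selection strategies the authors actually use.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and strictly better
    on at least one (all objectives are minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def nondominated_sort(points):
    """Peel off successive Pareto fronts; returns a list of index lists."""
    remaining = list(range(len(points)))
    fronts = []
    while remaining:
        front = [i for i in remaining
                 if not any(dominates(points[j], points[i])
                            for j in remaining)]
        fronts.append(front)
        remaining = [i for i in remaining if i not in front]
    return fronts


def promote(points, eta=2):
    """Keep the best len(points) // eta candidates (at least one),
    filling from the first Pareto front onward."""
    k = max(1, len(points) // eta)
    chosen = []
    for front in nondominated_sort(points):
        chosen.extend(front)
        if len(chosen) >= k:
            break
    return chosen[:k]


# Hypothetical (validation error, latency) pairs for four configurations.
points = [(0.10, 5.0), (0.20, 1.0), (0.15, 6.0), (0.30, 2.0)]
print(nondominated_sort(points))
print(promote(points, eta=2))
```

A scalarization-based alternative would instead collapse each tuple to one number (for example a weighted sum) before ranking, which can miss good trade-offs that the front-based ranking keeps.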

翻訳日:2021-06-25 15:18:31 公開日:2021-06-23

# ディープフェイク検出:顔面マニピュレーション検出ソリューションの調査

Deep Fake Detection: Survey of Facial Manipulation Detection Solutions ( http://arxiv.org/abs/2106.12605v1 )

ライセンス: Link先を確認

Samay Pashine, Sagar Mandiya, Praveen Gupta, and Rashid Sheikh

(参考訳) 分野としてのディープラーニングは、数十年前に想像できなかったような、多くの複雑な問題の解決に成功しています。しかし、それがもたらす多くの利益は、社会に害をもたらすのに使える方法がまだ残っています。ディープフェイクはそのような問題のひとつであることが証明されており、スマートフォン上で単にアプリケーションを使って偽の画像やビデオを作成できる場合には、画像や動画が偽物なのか本物なのかを検知し、オンライン情報の信頼性を脅かす問題を処分する、何らかの対策が必要になります。ニューラルネットワークによって作成されたディープフェイクは、実際の画像やビデオと同じくらいリアルに思えるかもしれないが、モデレーション後の空間的および時間的痕跡やシグネチャは残っており、人間の目に見えないシグネチャは、ディープフェイク検出を専門に訓練されたニューラルネットワークによって検出することができる。本稿では,アートニューラルネット(mesonet,resnet-50,vgg-19,xception net)のいくつかの状態を分析し,それらを比較することで,分類をできるだけ早く行うべきオンラインソーシャルメディアプラットフォームや,その分類をリアルタイムに必要とせず,かつ最も正確性を要する小さなニュース機関において,リアルタイムのディープフェイク検出のような様々なシナリオに対して最適な解決策を見出す。

Deep Learning as a field has been successfully used to solve a plethora of complex problems, the likes of which we could not have imagined a few decades back. But for all the benefits it brings, there are still ways in which it can be used to harm our society. Deep fakes have proven to be one such problem, and now more than ever, when any individual can create a fake image or video simply using an application on a smartphone, there need to be countermeasures with which we can detect whether an image or video is fake or real and address the problem threatening the trustworthiness of online information. Although deep fakes created by neural networks may seem as real as a real image or video, they still leave behind spatial and temporal traces or signatures after moderation; these signatures, while invisible to the human eye, can be detected with the help of a neural network trained to specialize in deep fake detection. In this paper, we analyze several such state-of-the-art neural networks (MesoNet, ResNet-50, VGG-19, and Xception Net) and compare them against each other to find an optimal solution for various scenarios, such as real-time deep fake detection deployed in online social media platforms, where the classification should be made as fast as possible, or a small news agency, where the classification need not be in real time but requires utmost accuracy.

翻訳日:2021-06-25 15:16:29 公開日:2021-06-23

# IA-RED$^2$:視覚変換器の解釈可能性を考慮した冗長性低減

IA-RED$^2$: Interpretability-Aware Redundancy Reduction for Vision Transformers ( http://arxiv.org/abs/2106.12620v1 )

ライセンス: Link先を確認

Bowen Pan, Yifan Jiang, Rameswar Panda, Zhangyang Wang, Rogerio Feris, Aude Oliva

(参考訳) 自己注意型モデルであるTransformerは最近、コンピュータビジョン分野における主要なバックボーンになりつつある。様々なビジョンタスクでトランスフォーマーが素晴らしい成功をおさめたにもかかわらず、計算量と集中的なメモリコストに苦しめられている。本稿では,この制限に対処するため,解釈可能性を考慮したredundancy REDuction framework(IA-RED$^2$)を提案する。まず,非相関な入力パッチに主に費やされる大量の冗長な計算を観察し,その冗長なパッチを動的かつ優雅に削除するための解釈可能なモジュールを導入する。この新たなフレームワークは階層構造に拡張され、異なるステージで無相関なトークンが徐々に削除され、計算コストが大幅に削減される。 DeiTやTimeSformerのような最先端モデルの1.4倍のスピードアップを実現するために、画像タスクとビデオタスクの両方で広範な実験を行いました。さらに重要なことは、他の加速手法とは対照的に、我々の手法は本質的にかなりの視覚的証拠で解釈可能であり、より軽量でありながら、より人間に理解可能なアーキテクチャに近づきます。筆者らは,本フレームワークで自然に現れる解釈可能性について,本来の視覚変換器で学習した生の注意力,および既成の解釈法で生成されたものより質的かつ定量的な結果よりも優れていることを示した。プロジェクトページ: http://people.csail.mit.edu/bpan/ia-red/

The self-attention-based model, the transformer, has recently become the leading backbone in the field of computer vision. In spite of the impressive success of transformers in a variety of vision tasks, they still suffer from heavy computation and intensive memory cost. To address this limitation, this paper presents an Interpretability-Aware REDundancy REDuction framework (IA-RED$^2$). We start by observing a large amount of redundant computation, mainly spent on uncorrelated input patches, and then introduce an interpretable module to dynamically and gracefully drop these redundant patches. This novel framework is then extended to a hierarchical structure, where uncorrelated tokens at different stages are gradually removed, resulting in a considerable shrinkage of computational cost. We include extensive experiments on both image and video tasks, where our method could deliver up to 1.4X speed-up for state-of-the-art models like DeiT and TimeSformer, while sacrificing less than 0.7% accuracy. More importantly, contrary to other acceleration approaches, our method is inherently interpretable with substantial visual evidence, bringing the vision transformer closer to a more human-understandable architecture while making it lighter. We demonstrate that the interpretability that naturally emerged in our framework can outperform the raw attention learned by the original visual transformer, as well as those generated by off-the-shelf interpretation methods, with both qualitative and quantitative results. Project Page: http://people.csail.mit.edu/bpan/ia-red/.

翻訳日:2021-06-25 15:09:27 公開日:2021-06-23

# 最小シャープネス:ニューラルネットワークのスケール不変パラメータロバストネス

Minimum sharpness: Scale-invariant parameter-robustness of neural networks ( http://arxiv.org/abs/2106.12612v1 )

ライセンス: Link先を確認

Hikaru Ibayashi, Takuo Hamaguchi, Masaaki Imaizum

(参考訳) 堅牢で防御的なニューラルネットワークの実現に向けて、重量パラメータ摂動(シャープネス)に対する堅牢性は近年注目を集めている(Sun et al., 2020)。しかし、鋭さは「スケール感度」という重要な問題のままである。本稿では,新しいシャープネス尺度,Minimum Sharpnessを提案する。 NNは、機能的特性が完全に同一である同値なクラスを構成する特定のスケール変換を持ち、同時にそのシャープさは無限に変化することが知られている。我々は、スケール変換に不変な等価NNに対する最小化問題を通じて、シャープさを定義する。また, 研削性を実現するための効率的かつ精密な手法を開発し, ヘシアンの計算コストを低減した。実験の結果,我々のシャープネスはNNの一般化と有効に相関しており,既存のシャープネス対策よりも計算コストが低いことがわかった。

Toward achieving robust and defensive neural networks, robustness against weight-parameter perturbations, i.e., sharpness, has attracted attention in recent years (Sun et al., 2020). However, sharpness is known to suffer from a critical issue, "scale-sensitivity." In this paper, we propose a novel sharpness measure, Minimum Sharpness. It is known that NNs have a specific scale transformation that constitutes equivalence classes in which functional properties are completely identical while, at the same time, their sharpness can change without bound. We define our sharpness through a minimization problem over the equivalent NNs, which makes it invariant to the scale transformation. We also develop an efficient and exact technique to make the sharpness tractable, which reduces the heavy computational costs involved with the Hessian. In the experiment, we observed that our sharpness has a valid correlation with the generalization of NNs and runs with less computational cost than existing sharpness measures.

翻訳日:2021-06-25 15:03:51 公開日:2021-06-23

# オンライン学習におけるベストケースローワーバウンダリ

Best-Case Lower Bounds in Online Learning ( http://arxiv.org/abs/2106.12688v1 )

ライセンス: Link先を確認

Crist\'obal Guzm\'an and Nishant A. Mehta and Ali Mortazavi

(参考訳) オンライン学習における研究の多くは、後悔に対する下線上界の研究に焦点を当てている。本研究は,オンライン凸最適化における最良ケース下界の研究を開始し,アルゴリズムが後から得られる最良動作に対する最大の改善点を定めている。この問題は、学習アルゴリズムの適応性をよりよく理解することを目的としている。もうひとつのモチベーションは、グループフェアネスの概念を満たす決定理論オンライン学習(DTOL)のアルゴリズムを得る上で、ベストケースの下位境界が有効であることが知られていることである。我々のコントリビューションは、Follow The Regularized Leader (FTRL)アルゴリズムに時間変化レギュラーライザを付加する一般的な方法であり、このアルゴリズムは、最良のケースの下位境界が既存の上位の後悔境界と同じ順序であることを示すために使われる。対照的に、FTRLの線形化バージョンは負の線形後悔を達成できることを示す。最後に、2人の専門家とバイナリ予測を持つdtolでは、ベストケースシーケンスを完全に特徴付けし、ベストケース下限をより詳細に理解します。

Much of the work in online learning focuses on the study of sublinear upper bounds on the regret. In this work, we initiate the study of best-case lower bounds in online convex optimization, wherein we bound the largest improvement an algorithm can obtain relative to the single best action in hindsight. This problem is motivated by the goal of better understanding the adaptivity of a learning algorithm. Another motivation comes from fairness: it is known that best-case lower bounds are instrumental in obtaining algorithms for decision-theoretic online learning (DTOL) that satisfy a notion of group fairness. Our contributions are a general method to provide best-case lower bounds in Follow The Regularized Leader (FTRL) algorithms with time-varying regularizers, which we use to show that best-case lower bounds are of the same order as existing upper regret bounds: this includes situations with a fixed learning rate, decreasing learning rates, timeless methods, and adaptive gradient methods. In stark contrast, we show that the linearized version of FTRL can attain negative linear regret. Finally, in DTOL with two experts and binary predictions, we fully characterize the best-case sequences, which provides a finer understanding of the best-case lower bounds.

翻訳日:2021-06-25 15:03:36 公開日:2021-06-23

# バックプロパゲーションにおけるReLU'(0)の数値解析効果

Numerical influence of ReLU'(0) on backpropagation ( http://arxiv.org/abs/2106.12915v1 )

ライセンス: Link先を確認

David Bertoin (ISAE-SUPAERO), J\'er\^ome Bolte (UT1, TSE), S\'ebastien Gerchinovitz (IMT), Edouard Pauwels (CNRS, IRIT)

(参考訳) 理論上、ニューラルネットワークの[0, 1]におけるReLU'(0)の選択は、バックプロパゲーションとトレーニングの両方に無視できる程度の影響しか与えない。しかし、現実世界では、32ビットのデフォルト精度とディープラーニングの問題のサイズが組み合わさって、トレーニング手法のハイパーパラメータとなる。各種ネットワーク(全接続, VGG, ResNet)およびデータセット(MNIST, CIFAR10, SVHN)における複数の精度レベル(16, 32, 64ビット)に対するReLU'(0)の値の重要性について検討する。約半分の時間で32ビット精度で発生するバックプロパゲーション出力のかなりの変動を観測する。この効果は倍精度で消失するが、16ビットで体系化される。バニラSGDトレーニングでは、ReLU'(0) = 0の選択が最も効率的と思われる。また、バッチノルムやADAMのようなリコンディショニングアプローチは、ReLU'(0)値の影響を緩衝する傾向にあることを示す。全体として、我々が伝えたいメッセージは、非滑らかな問題のアルゴリズム的微分が、有利に調整できるパラメータを隠蔽する可能性があるということだ。

In theory, the choice of ReLU'(0) in [0, 1] for a neural network has a negligible influence both on backpropagation and training. Yet, in the real world, 32-bit default precision combined with the size of deep learning problems makes it a hyperparameter of training methods. We investigate the importance of the value of ReLU'(0) for several precision levels (16, 32, 64 bits), on various networks (fully connected, VGG, ResNet) and datasets (MNIST, CIFAR10, SVHN). We observe considerable variations of backpropagation outputs which occur around half of the time in 32-bit precision. The effect disappears with double precision, while it is systematic at 16 bits. For vanilla SGD training, the choice ReLU'(0) = 0 seems to be the most efficient. We also show that reconditioning approaches such as batch-norm or ADAM tend to buffer the influence of the value of ReLU'(0). Overall, the message we want to convey is that algorithmic differentiation of nonsmooth problems potentially hides parameters that could be tuned advantageously.
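Why the value assigned to ReLU'(0) can matter at all is easy to see on a one-neuron example: whenever a pre-activation is exactly zero, the chain rule multiplies the gradient by whatever value was chosen. The sketch below is a hand-written illustration with hypothetical names (`relu_grad`, `grad_w1`) and hand-picked numbers; it does not reproduce the paper's experiments on precision levels.

```python
def relu(z):
    return max(0.0, z)


def relu_grad(z, s):
    """Derivative of ReLU, where s in [0, 1] is the value chosen at z == 0."""
    if z > 0:
        return 1.0
    if z < 0:
        return 0.0
    return s


def grad_w1(x, w1, b, w2, s):
    """d/dw1 of y = w2 * relu(w1 * x + b), computed by the chain rule."""
    z = w1 * x + b
    return w2 * relu_grad(z, s) * x


# Inputs chosen so that the pre-activation w1 * x + b is exactly zero.
x, w1, b, w2 = 2.0, 0.5, -1.0, 3.0
for s in (0.0, 0.5, 1.0):
    print(s, grad_w1(x, w1, b, w2, s))
```

In exact arithmetic such exact zeros are rare for generic weights, which matches the abstract's "in theory" statement; the abstract's point is that limited floating-point precision makes them frequent enough to matter.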

翻訳日:2021-06-25 15:02:01 公開日:2021-06-23
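
To make the role of ReLU'(0) in the entry above concrete, here is a minimal sketch in plain Python. It is not the authors' code: the function names and the tiny two-weight model `y = w2 * relu(w1 * x)` are hypothetical, chosen only to show where the value assigned to the derivative at exactly zero enters a backpropagated gradient.

```python
def relu(x):
    """Standard ReLU activation."""
    return x if x > 0.0 else 0.0


def relu_grad(x, s=0.0):
    """Derivative of ReLU; `s` in [0, 1] is the value used at exactly x == 0."""
    if x > 0.0:
        return 1.0
    if x < 0.0:
        return 0.0
    return s  # the free choice that the abstract calls ReLU'(0)


def backprop_two_weights(x, w1, w2, s):
    """Gradients of y = w2 * relu(w1 * x) with respect to (w1, w2)."""
    z = w1 * x                      # pre-activation
    a = relu(z)                     # activation
    dy_dw2 = a                      # dy/dw2 = relu(z)
    dy_dw1 = w2 * relu_grad(z, s) * x  # chain rule through the ReLU
    return dy_dw1, dy_dw2


# The pre-activation is exactly zero here, so the gradient depends on s.
grad_s0 = backprop_two_weights(2.0, 0.0, 3.0, s=0.0)
grad_s1 = backprop_two_weights(2.0, 0.0, 3.0, s=1.0)
```

When the pre-activation is nonzero the two choices give identical gradients, so any difference comes only from inputs that land exactly on zero; how often that happens in practice depends on the numerical precision, which is the effect the abstract reports.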

# L'Apprentissage Automatique dans la planification et le contrôle de la production : un état de l'art

L'Apprentissage Automatique dans la planification et le contrôle de la production : un état de l'art ( http://arxiv.org/abs/2106.12916v1 )

ライセンス: Link先を確認

Juan Pablo Usuga Cadavid (LAMIH, ENSAM), Samir Lamouri (LAMIH, ENSAM), Bernard Grabot (LGP, ENIT), Arnaud Fortin

(参考訳) 適切な生産計画・管理(PPC: Production Planning and Control)は、競合他社に対して優位に立ち、コストを削減し、納期を守るうえで極めて重要である。PPCに関しては、機械学習(ML)がデータに基づいてインテリジェントな意思決定を行う新たな機会を提供する。そこで本稿では、PPCに適用されたMLに関する文献の初期的な体系的レビューを行う。本研究の目的は2つある。第1に、PPCにMLを適用するための技術やツールを特定し、第2に、最近の研究論文におけるインダストリー4.0(I4.0)の特徴をレビューすることである。第2の目的については、I4.0の7つの特徴を分析フレームワークで用い、そのうち2つは著者らが提案したものである。さらに、科学文献においてML支援PPCが扱っている領域を特定する。最後に、結果を分析し、さらなる研究の動機となりうるギャップを示す。

Proper Production Planning and Control (PPC) is crucial for gaining an edge over competitors, reducing costs, and respecting delivery dates. With regard to PPC, Machine Learning (ML) provides new opportunities to make intelligent decisions based on data. Therefore, this communication provides an initial systematic review of publications on ML applied in PPC. The research objective of this study is twofold: firstly, it aims to identify techniques and tools that enable the application of ML in PPC, and secondly, it reviews the characteristics of Industry 4.0 (I4.0) in recent research papers. Concerning the second objective, seven characteristics of I4.0 are used in the analysis framework, two of which are proposed by the authors. Additionally, the domains addressed by ML-aided PPC in the scientific literature are identified. Finally, results are analyzed and gaps that may motivate further research are highlighted.

翻訳日:2021-06-25 15:01:42 公開日:2021-06-23

# トレーニングとテストセグメンテーションミスマッチによる対処: FBK@IWSLT2021

Dealing with training and test segmentation mismatch: FBK@IWSLT2021 ( http://arxiv.org/abs/2106.12607v1 )

ライセンス: Link先を確認

Sara Papi, Marco Gaido, Matteo Negri, Marco Turchi

(参考訳) 本稿では、IWSLT 2021オフライン音声翻訳タスクに対するFBKのシステム提出について述べる。我々は、英語の音声データをドイツ語のテキストに翻訳するよう訓練されたTransformerベースのアーキテクチャであるダイレクトモデルで参加した。訓練パイプラインは、知識蒸留と2段階の微調整手順を特徴とする。知識蒸留と第1の微調整ステップはいずれも、手動でセグメンテーションされた実データと合成データで行い、後者は利用可能なコーパスで訓練されたMTシステムで生成する。一方、第2の微調整ステップは、MuST-C v2 En-Deデータセットのランダムセグメンテーションで実行される。その主な目的は、手動でセグメンテーションされたデータ(すなわち、理想的な文単位のセグメンテーション)で訓練された音声翻訳モデルを、自動セグメンテーションされた音声(すなわち、実際のより現実的なテスト条件)で評価した際に生じる性能低下を減らすことである。同じ目的のために、音声の内容(ポーズ)と生成されるセグメントの長さの両方を考慮した独自のハイブリッドセグメンテーション手順を、システムに渡す前のテストデータに適用する。推論時には、この手順をVoice Activity Detection (VAD) に基づくベースラインセグメンテーション法と比較した。結果は、手動セグメンテーションとの差が8.3 BLEUポイントから1.4 BLEUポイントに縮小したことから、提案するハイブリッド手法の有効性を示している。

This paper describes FBK's system submission to the IWSLT 2021 Offline Speech Translation task. We participated with a direct model, which is a Transformer-based architecture trained to translate English speech audio data into German texts. The training pipeline is characterized by knowledge distillation and a two-step fine-tuning procedure. Both knowledge distillation and the first fine-tuning step are carried out on manually segmented real and synthetic data, the latter being generated with an MT system trained on the available corpora. Differently, the second fine-tuning step is carried out on a random segmentation of the MuST-C v2 En-De dataset. Its main goal is to reduce the performance drops occurring when a speech translation model trained on manually segmented data (i.e. an ideal, sentence-like segmentation) is evaluated on automatically segmented audio (i.e. actual, more realistic testing conditions). For the same purpose, a custom hybrid segmentation procedure that accounts for both audio content (pauses) and for the length of the produced segments is applied to the test data before passing them to the system. At inference time, we compared this procedure with a baseline segmentation method based on Voice Activity Detection (VAD). Our results indicate the effectiveness of the proposed hybrid approach, shown by a reduction of the gap with manual segmentation from 8.3 to 1.4 BLEU points.

翻訳日:2021-06-25 14:59:18 公開日:2021-06-23

# フロリダ野生生物カメラトラップデータセット

Florida Wildlife Camera Trap Dataset ( http://arxiv.org/abs/2106.12628v1 )

ライセンス: Link先を確認

Crystal Gagne, Jyoti Kini, Daniel Smith, Mubarak Shah

(参考訳) トレイルカメラの画像は、保全や生態学の研究のために生物学者の間で人気が高まっている。カメラトラップの運用に必要な人間の干渉は最小限であるため、偏りのない種の活動を捉えることができる。人間と野生生物の相互作用、様々な種の移動パターン、絶滅危惧個体群の絶滅リスクなどに関するいくつかの研究は、豊富なデータの不足と、トレイルカメラ画像への手動アノテーションに時間がかかることによって制限されている。本稿では、フロリダ州南西部の2つの異なる場所で収集された、挑戦的な野生生物カメラトラップ分類データセットを紹介する。このデータセットは104,495枚の画像からなり、視覚的に類似した種、多様な照明条件、偏ったクラス分布、絶滅危惧種(フロリダパンサー)のサンプルを含む。ResNet-50アーキテクチャによる実験的評価は、この画像分類データセットが野生生物の統計モデリングをさらに前進させうることを示している。データセットは公開する予定である。

Trail camera imagery has increasingly gained popularity amongst biologists for conservation and ecological research. Minimal human interference required to operate camera traps allows capturing unbiased species activities. Several studies - based on human and wildlife interactions, migratory patterns of various species, risk of extinction in endangered populations - are limited by the lack of rich data and the time-consuming nature of manually annotating trail camera imagery. We introduce a challenging wildlife camera trap classification dataset collected from two different locations in Southwestern Florida, consisting of 104,495 images featuring visually similar species, varying illumination conditions, skewed class distribution, and including samples of endangered species, i.e. Florida panthers. Experimental evaluations with ResNet-50 architecture indicate that this image classification-based dataset can further push the advancements in wildlife statistical modeling. We will make the dataset publicly available.

翻訳日:2021-06-25 14:58:05 公開日:2021-06-23

# 視覚的な場所認識が簡単か難しいか?

What makes visual place recognition easy or hard? ( http://arxiv.org/abs/2106.12671v1 )

ライセンス: Link先を確認

Stefan Schubert and Peer Neubert

(参考訳) 視覚的位置認識は、移動ロボットの自己位置推定における基本的な機能である。これは、物理世界で動作する物理エージェントという実践的な文脈に画像検索を位置づけるものである。活発な研究分野であり、多くの異なるアプローチが提案され、多くの異なる実験で評価されている。本稿では、この実践的文脈や個々の設計判断の違いのために、位置認識実験は論文間でほとんど比較できず、実験ごとに変わりうる様々な特性が存在すると論じる。そのような特性の広範なリストを示し、それらを用いて位置認識実験をより簡単に、あるいはより困難に設定する方法の例を示す。本研究は、(1)手元の課題の特性に適した位置認識手法を選択したい人、(2)未解決の研究課題を探し、特に困難な事例に関心を持つ研究者、(3)このトピックで再現可能な論文を作成したい著者、(4)査読中の論文の潜在的な問題を特定する役割を担う査読者など、様々な関係者にとって興味深いものとなりうる。

Visual place recognition is a fundamental capability for the localization of mobile robots. It places image retrieval in the practical context of physical agents operating in a physical world. It is an active field of research and many different approaches have been proposed and evaluated in many different experiments. In the following, we argue that due to variations of this practical context and individual design decisions, place recognition experiments are barely comparable across different papers and that there is a variety of properties that can change from one experiment to another. We provide an extensive list of such properties and give examples of how they can be used to make a place recognition experiment easier or harder. This might be interesting for the different parties involved: (1) people who just want to select a place recognition approach that is suitable for the properties of their particular task at hand, (2) researchers who look for open research questions and are interested in particularly difficult instances, (3) authors who want to create reproducible papers on this topic, and (4) reviewers who have the task of identifying potential problems in papers under review.

翻訳日:2021-06-25 14:57:49 公開日:2021-06-23

# 畳み込みニューラルネットワークによる条件変形可能な画像登録

Conditional Deformable Image Registration with Convolutional Neural Network ( http://arxiv.org/abs/2106.12673v1 )

ライセンス: Link先を確認

Tony C. W. Mok and Albert C. S. Chung

(参考訳) 近年のディープラーニングに基づく手法は、変形可能な画像レジストレーションにおいて有望な結果と実行時間の優位性を示している。しかし、ハイパーパラメータの効果を分析し、最適な正則化パラメータを探索することは、ディープラーニングに基づく手法ではコストが高すぎる。これは、異なるハイパーパラメータ値を持つかなりの数のモデルを個別に訓練する必要があるためである。本稿では、深層変形可能画像レジストレーションのための条件付き画像レジストレーション手法と新しい自己教師付き学習パラダイムを提案する。正則化ハイパーパラメータと相関する条件付き特徴を学習することにより、任意のハイパーパラメータに対する最適解を単一の深層畳み込みニューラルネットワークで捉えられることを示す。さらに、得られる変形場の滑らかさは、推論時に滑らかさ正則化の強度を任意に変えることで操作できる。大規模脳MRIデータセットでの広範な実験により、提案手法は実行時間の優位性やレジストレーション精度を犠牲にすることなく、変形場の滑らかさを正確に制御できることを示す。

Recent deep learning-based methods have shown promising results and runtime advantages in deformable image registration. However, analyzing the effects of hyperparameters and searching for optimal regularization parameters prove prohibitive in deep learning-based methods. This is because it involves training a substantial number of separate models with distinct hyperparameter values. In this paper, we propose a conditional image registration method and a new self-supervised learning paradigm for deep deformable image registration. By learning the conditional features that are correlated with the regularization hyperparameter, we demonstrate that optimal solutions with arbitrary hyperparameters can be captured by a single deep convolutional neural network. In addition, the smoothness of the resulting deformation field can be manipulated with arbitrary strength of smoothness regularization during inference. Extensive experiments on a large-scale brain MRI dataset show that our proposed method enables the precise control of the smoothness of the deformation field without sacrificing the runtime advantage or registration accuracy.

翻訳日:2021-06-25 14:57:32 公開日:2021-06-23

# 不可逆過程を予測するブラケットを保存する機械学習構造

Machine learning structure preserving brackets for forecasting irreversible processes ( http://arxiv.org/abs/2106.12619v1 )

ライセンス: Link先を確認

Kookjin Lee and Nathaniel A. Trask and Panos Stinis

(参考訳) 時系列データの予測では、予測的な外挿を得るために帰納バイアスを課す必要があり、最近の研究では可逆力学系の構造を保存するためにハミルトニアン/ラグランジアン形式が課されている。本稿では、モデル形式が事前に未知である不可逆力学の学習に適した、メトリプレクティック力学系に由来する散逸ブラケットの新しいパラメータ化を提案する。この手法は、エネルギーとエントロピーに対する一般化カシミールを学習し、それぞれが保存されること、非減少であることが保証される。さらに、熱雑音を加えた場合には、揺動散逸定理の厳密な保存を保証し、熱力学的整合性を確保する。散逸系のベンチマークを示し、学習されたダイナミクスが「ブラックボックス」やペナルティベースのアプローチよりも頑健で、よりよく一般化することを実証する。

Forecasting of time-series data requires imposition of inductive biases to obtain predictive extrapolation, and recent works have imposed Hamiltonian/Lagrangian form to preserve structure for systems with reversible dynamics. In this work we present a novel parameterization of dissipative brackets from metriplectic dynamical systems appropriate for learning irreversible dynamics with unknown a priori model form. The process learns generalized Casimirs for energy and entropy guaranteed to be conserved and nondecreasing, respectively. Furthermore, for the case of added thermal noise, we guarantee exact preservation of a fluctuation-dissipation theorem, ensuring thermodynamic consistency. We provide benchmarks for dissipative systems demonstrating learned dynamics are more robust and generalize better than either "black-box" or penalty-based approaches.

翻訳日:2021-06-25 14:55:05 公開日:2021-06-23

# 協調フィルタ型推薦システムにおけるステレオタイプ問題

The Stereotyping Problem in Collaboratively Filtered Recommender Systems ( http://arxiv.org/abs/2106.12622v1 )

ライセンス: Link先を確認

Wenshuo Guo, Karl Krauth, Michael I. Jordan, Nikhil Garg

(参考訳) 推薦システム、特に行列分解に基づく協調フィルタリングアルゴリズムは、オンライン情報へのアクセスを仲介する上で重要な役割を果たす。我々は、このようなアルゴリズムが特定の種類のステレオタイプ化を引き起こすことを示す。すなわち、あるアイテムの集合に対する嗜好が一般ユーザ集団において反相関である場合、それらのアイテムは、ユーザの嗜好や評価履歴に関係なく、そのユーザに一緒に推薦されない可能性がある。まず、アイテムの集合がユーザによって共同でアクセスできる程度を測る「共同アクセシビリティ(joint accessibility)」という概念を導入する。次に、標準的な行列分解ベースの協調フィルタリングの枠組みのもとで共同アクセシビリティを研究し、共同アクセシビリティが破られる場合の理論的な必要十分条件を与える。さらに、ユーザが単一の特徴ベクトルで表される場合、これらの条件が容易に破られうることを示す。共同アクセシビリティを改善するために、マルチベクトル表現を用いて各ユーザの多様な複数の関心を捉えるよう設計された、代替的なモデリング上の修正を提案する。実データとシミュレーションデータで広範な実験を行い、標準的な単一ベクトル行列分解モデルにおけるステレオタイプ化の問題を示す。

Recommender systems -- and especially matrix factorization-based collaborative filtering algorithms -- play a crucial role in mediating our access to online information. We show that such algorithms induce a particular kind of stereotyping: if preferences for a set of items are anti-correlated in the general user population, then those items may not be recommended together to a user, regardless of that user's preferences and ratings history. First, we introduce a notion of joint accessibility, which measures the extent to which a set of items can jointly be accessed by users. We then study joint accessibility under the standard factorization-based collaborative filtering framework, and provide theoretical necessary and sufficient conditions for when joint accessibility is violated. Moreover, we show that these conditions can easily be violated when the users are represented by a single feature vector. To improve joint accessibility, we further propose an alternative modelling fix, which is designed to capture the diverse multiple interests of each user using a multi-vector representation. We conduct extensive experiments on real and simulated datasets, demonstrating the stereotyping problem with standard single-vector matrix factorization models.

翻訳日:2021-06-25 14:54:51 公開日:2021-06-23
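
The stereotyping effect in the entry above can be illustrated with a toy single-vector model. The sketch below is an assumption-laden illustration, not the paper's definition or code: it treats a set of k items as "jointly accessible" if some user vector makes exactly that set the top-k by dot-product score, and it only searches unit-length 2-D user vectors on a grid of angles. The item names and vectors in `ITEMS` are hypothetical.

```python
import math

# Hypothetical 2-D item embeddings; "a" and "b" are anti-correlated, as are "c" and "d".
ITEMS = {
    "a": (1.0, 0.0),
    "b": (-1.0, 0.0),
    "c": (0.0, 1.0),
    "d": (0.0, -1.0),
}


def top_k(user, items, k):
    """Return the set of k items with the highest dot-product score for `user`."""
    scores = {name: sum(u * v for u, v in zip(user, vec)) for name, vec in items.items()}
    ranked = sorted(scores, key=lambda name: scores[name], reverse=True)
    return set(ranked[:k])


def jointly_accessible(target, items, k, steps=360):
    """Grid search over unit user vectors: is `target` the top-k set for any of them?"""
    for i in range(steps):
        angle = 2.0 * math.pi * i / steps
        user = (math.cos(angle), math.sin(angle))
        if top_k(user, items, k) == set(target):
            return True
    return False


can_reach_a_and_c = jointly_accessible({"a", "c"}, ITEMS, k=2)
can_reach_a_and_b = jointly_accessible({"a", "b"}, ITEMS, k=2)
```

In this toy setup a user pointing between "a" and "c" gets both recommended, but no single user vector can rank the opposed pair "a" and "b" above everything else at once, since whatever raises one score lowers the other. A grid search is only a numerical check, not a proof.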

# 製品探索における意味マッチングのためのエクストリームマルチラベル学習

Extreme Multi-label Learning for Semantic Matching in Product Search ( http://arxiv.org/abs/2106.12657v1 )

ライセンス: Link先を確認

Wei-Cheng Chang, Daniel Jiang, Hsiang-Fu Yu, Choon-Hui Teo, Jiong Zhang, Kai Zhong, Kedarnath Kolluri, Qie Hu, Nikhil Shandilya, Vyacheslav Ievgrafov, Japinder Singh, Inderjit S. Dhillon

(参考訳) 製品検索におけるセマンティックマッチングの問題について考察する。顧客のクエリが与えられたとき、1億以上の規模の巨大なカタログから、意味的に関連するすべての商品を検索する。カタログ空間が大きく、リアルタイムのレイテンシ制約があるため、セマンティックマッチングアルゴリズムには高いリコールだけでなく低レイテンシも求められる。従来の語彙マッチングアプローチ(例えばOkapi-BM25)は、転置インデックスを利用して高速な推論時間を達成するが、クエリと商品の間の行動シグナルを捉えられない。対照的に、埋め込みベースのモデルは顧客の行動データからセマンティック表現を学習するが、レイテンシの制約のために浅いニューラルエンコーダしか使えず、性能が制限されることが多い。セマンティック製品検索は、顧客クエリを入力インスタンス、商品を出力ラベルとする極端なマルチラベル分類(XMC)問題と見なすことができる。本稿では、推論時間の計算量が商品数に対して対数的である木ベースのXMCモデルを用いて、セマンティック製品検索を改善することを目指す。高速なリアルタイム推論のために、n-gram特徴を持つ階層線形モデルを考える。定量的には、提案手法は1クエリあたり1.25ミリ秒という低レイテンシを維持し、競合する埋め込みベースのDSSMモデルに対してRecall@100を65%改善する(60.9%対36.8%)。我々のモデルは、様々なしきい値での重みプルーニングに対して頑健であり、オンラインデプロイメントの異なるシステム要件を柔軟に満たすことができる。定性的には、提案手法は既存の製品検索システムと相補的な商品を検索でき、マッチセットに多様性を加えることができる。

We consider the problem of semantic matching in product search: given a customer query, retrieve all semantically related products from a huge catalog of size 100 million, or more. Because of large catalog spaces and real-time latency constraints, semantic matching algorithms not only need high recall but also low latency. Conventional lexical matching approaches (e.g., Okapi-BM25) exploit inverted indices to achieve fast inference time, but fail to capture behavioral signals between queries and products. In contrast, embedding-based models learn semantic representations from customer behavior data, but the performance is often limited by shallow neural encoders due to latency constraints. Semantic product search can be viewed as an eXtreme Multi-label Classification (XMC) problem, where customer queries are input instances and products are output labels. In this paper, we aim to improve semantic product search by using tree-based XMC models where inference time complexity is logarithmic in the number of products. We consider hierarchical linear models with n-gram features for fast real-time inference. Quantitatively, our method maintains a low latency of 1.25 milliseconds per query and achieves a 65% improvement of Recall@100 (60.9% vs. 36.8%) over a competing embedding-based DSSM model. Our model is robust to weight pruning with varying thresholds, which can flexibly meet different system requirements for online deployments. Qualitatively, our method can retrieve products that are complementary to those of existing product search systems and add diversity to the match set.

翻訳日:2021-06-25 14:54:30 公開日:2021-06-23
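
The "65% improvement of Recall@100 (60.9% vs. 36.8%)" in the entry above is a relative gain, which a few lines of arithmetic confirm. The sketch below also spells out one common definition of Recall@k; the function names and the tiny product lists are hypothetical and are not taken from the paper.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant items that appear among the first k retrieved items."""
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & set(relevant))
    return hits / len(relevant)


def relative_improvement(new, baseline):
    """Relative gain of `new` over `baseline`, as a fraction of the baseline."""
    return (new - baseline) / baseline


# Figures quoted in the abstract: Recall@100 of 60.9% against 36.8%.
gain = relative_improvement(60.9, 36.8)

# Toy retrieval example: one of the two relevant products is in the top 2.
example_recall = recall_at_k(["p1", "p2", "p3"], {"p2", "p9"}, k=2)
```

The gain evaluates to roughly 0.655, which rounds to the 65% quoted; the absolute difference between the two recall figures is 24.1 percentage points.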

# 最適化における現代的な技術を理解する:Frank-Wolfe、NesterovのMomentum、PolyakのMomentum

Understanding Modern Techniques in Optimization: Frank-Wolfe, Nesterov's Momentum, and Polyak's Momentum ( http://arxiv.org/abs/2106.12923v1 )

ライセンス: Link先を確認

Jun-Kun Wang

(参考訳) この論文研究の第1部では,凸最適化のための反復アルゴリズムの構築と解析のためのレシピとして機能するモジュラーフレームワークを開発した。具体的には,2プレイヤーゼロサムゲームを反復的に行うことで最適化を行う。フランク・ウルフやネステロフの加速法を含む既存の多くの最適化アルゴリズムは、2人のオンライン学習者を互いに適切な戦略でピットすることでゲームから復元することができる。さらに、ゲーム中のプレイヤーの重み付けされた平均的後悔の和は収束率を示している。その結果,本手法はこれらのアルゴリズムに簡単な代替的証明を与える。さらに,ゲームプレイを反復的に行うことによる最適化のアプローチが,いくつかの制約セットに対してフランク・ウルフ風のアルゴリズムを新たに3つ導入すること,さらに,我々のフレームワークが本当に汎用的でモジュール的で使いやすくなっていることを示す。第2部では,古典的強二次凸問題の解法,神経接核系下での広いreluネットワークの訓練,直交初期化を用いた深い線形ネットワークの訓練など,ある問題に対するpolyakの運動量による証明可能な加速度のモジュラー解析を開発した。我々はメタ定理を開発し、これらの問題にポリアックの運動量を適用するとき、誘導力学はメタ定理を直接適用できる形式を示すことを示した。論文の最後の部分では、ポリアックの運動量の使用の別の利点を示し、滑らかな非凸最適化において、サドルポイントの高速脱出を容易にする。この結果、第2部と共に、現代の非凸最適化とディープラーニングにおけるPolyakの勢いに新たな光を当てた。

In the first part of this dissertation research, we develop a modular framework that can serve as a recipe for constructing and analyzing iterative algorithms for convex optimization. Specifically, our work casts optimization as iteratively playing a two-player zero-sum game. Many existing optimization algorithms including Frank-Wolfe and Nesterov's acceleration methods can be recovered from the game by pitting two online learners with appropriate strategies against each other. Furthermore, the sum of the weighted average regrets of the players in the game implies the convergence rate. As a result, our approach provides simple alternative proofs to these algorithms. Moreover, we demonstrate that our approach of optimization as iteratively playing a game leads to three new fast Frank-Wolfe-like algorithms for some constraint sets, which further shows that our framework is indeed generic, modular, and easy-to-use. In the second part, we develop a modular analysis of provable acceleration via Polyak's momentum for certain problems, which include solving the classical strongly quadratic convex problems, training a wide ReLU network under the neural tangent kernel regime, and training a deep linear network with an orthogonal initialization. We develop a meta theorem and show that when applying Polyak's momentum for these problems, the induced dynamics exhibit a form where we can directly apply our meta theorem. In the last part of the dissertation, we show another advantage of the use of Polyak's momentum -- it facilitates fast saddle point escape in smooth non-convex optimization. This result, together with those of the second part, sheds new light on Polyak's momentum in modern non-convex optimization and deep learning.

翻訳日:2021-06-25 14:53:21 公開日:2021-06-23

# テキストデータを用いた株式市場分析:レビュー

Stock Market Analysis with Text Data: A Review ( http://arxiv.org/abs/2106.12985v1 )

ライセンス: Link先を確認

Kamaladdin Fataliyev, Aneesh Chivukula, Mukesh Prasad and Wei Liu

(参考訳) 株式市場の動きは、ニュース記事、会社の報告、ソーシャルメディアの議論を通じて共有される公開情報やプライベート情報の影響を受けている。こうした膨大なデータソースを分析することで、市場参加者に利益をもたらすことができる。しかし、文学における研究の大部分は、構造化されていない膨大なテキストデータの解析に近づいた伝統的なアプローチに基づいている。本研究では,テキストベースの株式市場分析における既存文献の膨大な量について概観する。入力データ型を示し、主要なテキストデータソースとバリエーションをカバーする。特徴表現技法が提示される。次に、分析手法を概説し、主要な株式市場予測モデルの分類を作成する。ここでは,分類学の各分野の代表的業績について論じ,それぞれの貢献を分析した。最後に,未解決の未解決問題に関する知見を示し,今後の課題の提案を行う。本研究の目的は,主要な株式市場分析モデル,金融市場予測のためのテキスト表現技術,既存手法の欠点,今後の研究への道筋を提案することである。

Stock market movements are influenced by public and private information shared through news articles, company reports, and social media discussions. Analyzing these vast sources of data can give market participants an edge to make profit. However, the majority of the studies in the literature are based on traditional approaches that fall short in analyzing unstructured, vast textual data. In this study, we provide a review of the immense amount of existing literature on text-based stock market analysis. We present input data types and cover main textual data sources and variations. Feature representation techniques are then presented. Then, we cover the analysis techniques and create a taxonomy of the main stock market forecast models. Importantly, we discuss representative work in each category of the taxonomy, analyzing their respective contributions. Finally, this paper shows the findings on unaddressed open problems and gives suggestions for future work. The aim of this study is to survey the main stock market analysis models, text representation techniques for financial market prediction, and shortcomings of existing techniques, and to propose promising directions for future research.

翻訳日:2021-06-25 14:52:44 公開日:2021-06-23

# 神経後部推定を用いたリアルタイム重力波科学

Real-time gravitational-wave science with neural posterior estimation ( http://arxiv.org/abs/2106.12594v1 )

ライセンス: Link先を確認

Maximilian Dax, Stephen R. Green, Jonathan Gair, Jakob H. Macke, Alessandra Buonanno, Bernhard Schölkopf

(参考訳) 深層学習による高速重力波パラメータ推定について,前例のない精度を示す。ニューラルネットワークをベイズ分布のサロゲートとして用いて,最初のLIGO-Virgo Gravitational-Wave Transient Catalogから8つの重力波イベントを解析し,標準推論符号と非常に密に一致しているが,推定時間はO(day)から1分間に短縮された。ネットワークはシミュレーションデータを用いて,事象近傍の検出器ノイズ特性の推定を含むトレーニングを行う。これにより、数百万のニューラルネットワークパラメータ内の信号とノイズモデルを符号化し、イベントからイベントまでのノイズ非定常性を考慮して、トレーニング分布に整合した観測データの推論を可能にする。私たちのアルゴリズムは、"dingo"と呼ばれ、検出された重力波イベントの物理的パラメータの高速かつ正確な推論の新しい標準を設定します。

We demonstrate unprecedented accuracy for rapid gravitational-wave parameter estimation with deep learning. Using neural networks as surrogates for Bayesian posterior distributions, we analyze eight gravitational-wave events from the first LIGO-Virgo Gravitational-Wave Transient Catalog and find very close quantitative agreement with standard inference codes, but with inference times reduced from O(day) to a minute per event. Our networks are trained using simulated data, including an estimate of the detector-noise characteristics near the event. This encodes the signal and noise models within millions of neural-network parameters, and enables inference for any observed data consistent with the training distribution, accounting for noise nonstationarity from event to event. Our algorithm -- called "DINGO" -- sets a new standard in fast-and-accurate inference of physical parameters of detected gravitational-wave events, which should enable real-time data analysis without sacrificing accuracy.

翻訳日:2021-06-25 14:50:05 公開日:2021-06-23

# 表現の組み合わせによるランキングのセマンティックな類似クエリの活用

Leveraging semantically similar queries for ranking via combining representations ( http://arxiv.org/abs/2106.12621v1 )

ライセンス: Link先を確認

Hayden S. Helm and Marah Abdin and Benjamin D. Pedigo and Shweti Mahajan and Vince Lyzinski and Youngser Park and Amitabh Basu and Piali Choudhury and Christopher M. White and Weiwei Yang and Carey E. Priebe

(参考訳) 現代のランキング問題では、ランク付けされる項目の異なる、異なる表現がしばしば利用できる。したがって、これらの表現を組み合わせてランキングを改善するのは賢明である。実際、表現を組み合わせることでランク付けを学ぶことは、特定のクエリのランク付け関数を学ぶための原則と実践の両方である。しかし、極めてデータ量の多い設定では、特定のクエリで利用可能なラベル付きデータの量は、高度に可変で非効率なランキング機能につながる可能性がある。少量のデータの影響を軽減する一つの方法は、セマンティックに類似したクエリからの情報を活用することである。実際、シミュレーション設定や実データ例で示すように、セマンティックに類似したクエリが利用可能であれば、特定のクエリに対してランク付けするときに、それらを適切に使用できる。我々は,この現象をバイアス分散トレードオフの文脈で記述し,Bingナビゲーショングラフとショウジョウバエ幼虫コネクトームのデータスカース設定に適用する。

In modern ranking problems, different and disparate representations of the items to be ranked are often available. It is sensible, then, to try to combine these representations to improve ranking. Indeed, learning to rank via combining representations is both principled and practical for learning a ranking function for a particular query. In extremely data-scarce settings, however, the amount of labeled data available for a particular query can lead to a highly variable and ineffective ranking function. One way to mitigate the effect of the small amount of data is to leverage information from semantically similar queries. Indeed, as we demonstrate in simulation settings and real data examples, when semantically similar queries are available it is possible to gainfully use them when ranking with respect to a particular query. We describe and explore this phenomenon in the context of the bias-variance trade off and apply it to the data-scarce settings of a Bing navigational graph and the Drosophila larva connectome.

翻訳日:2021-06-25 14:49:47 公開日:2021-06-23

# 低複雑さDFT空間サンプリングに基づくロバスト適応ビームフォーミングの検討

Study of Robust Adaptive Beamforming Based on Low-Complexity DFT Spatial Sampling ( http://arxiv.org/abs/2106.12663v1 )

ライセンス: Link先を確認

Saeed Mohammadzadeh, Vitor H. Nascimento, Rodrigo C. de Lamare and Osman Kukrer

(参考訳) 本稿では,無作為過程の自己相関列(acs)を一組の計測データから再構成する手法に基づいて,適応的ビームフォーミングのための新しいロバストなアルゴリズムを提案する。これは、その対角線に沿って平均化した後、サンプル共分散行列(SCM)の第1列と第1列から得られる。次に、離散フーリエ変換(DFT)を用いて相関系列のパワースペクトルを推定する。ノイズプラス干渉領域内の角度に対応するDFT係数を用いてノイズプラス干渉共分散行列(NPICM)を再構成し、所望の信号共分散行列(DSCM)をSCMからノイズプラス干渉成分を同定して除去する。特に、推定された受信信号の空間パワースペクトルを利用して、ノイズプラス干渉の優位dft係数をキャプチャしたノイズプラス干渉に対応する相関シーケンスを算出する。提案した適応ビームフォーミングの重要な利点は、わずかな事前情報しか必要としないことである。具体的には、配列幾何学と干渉が位置する角のセクターに関する不正確な知識が必要である。シミュレーションの結果,提案手法は従来の再構成方式のビームフォーマと比較して,入力信号-雑音比が非常に広い範囲で複数ミスマッチした場合の全体的な性能が向上することが示された。

In this paper, a novel and robust algorithm is proposed for adaptive beamforming based on the idea of reconstructing the autocorrelation sequence (ACS) of a random process from a set of measured data. This is obtained from the first column and the first row of the sample covariance matrix (SCM) after averaging along its diagonals. Then, the power spectrum of the correlation sequence is estimated using the discrete Fourier transform (DFT). The DFT coefficients corresponding to the angles within the noise-plus-interference region are used to reconstruct the noise-plus-interference covariance matrix (NPICM), while the desired signal covariance matrix (DSCM) is estimated by identifying and removing the noise-plus-interference component from the SCM. In particular, the spatial power spectrum of the estimated received signal is utilized to compute the correlation sequence corresponding to the noise-plus-interference in which the dominant DFT coefficient of the noise-plus-interference is captured. A key advantage of the proposed adaptive beamforming is that only little prior information is required. Specifically, an imprecise knowledge of the array geometry and of the angular sectors in which the interferences are located is needed. Simulation results demonstrate that compared with previous reconstruction-based beamformers, the proposed approach can achieve better overall performance in the case of multiple mismatches over a very large range of input signal-to-noise ratios.

翻訳日:2021-06-25 14:49:31 公開日:2021-06-23
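
The first two steps described in the entry above (averaging the sample covariance matrix along its diagonals to get an autocorrelation sequence, then taking a DFT to estimate a spatial power spectrum) can be sketched in plain Python. This is only an illustration under stated assumptions, not the authors' algorithm: the function names are hypothetical, the lag convention `r[k] = mean of R[i+k][i]` is a choice made here, no window is applied (so the estimate can go negative away from the peak), and the later covariance-reconstruction steps are not covered.

```python
import cmath


def average_diagonals(R):
    """Autocorrelation estimate r[0..M-1]: mean of each lower diagonal of the M x M matrix R."""
    M = len(R)
    return [sum(R[i + k][i] for i in range(M - k)) / (M - k) for k in range(M)]


def spatial_spectrum(acs, n_bins):
    """DFT of the two-sided sequence r[-(M-1)], ..., r[M-1], using r[-k] = conj(r[k])."""
    spectrum = []
    for b in range(n_bins):
        w = 2.0 * cmath.pi * b / n_bins
        # Conjugate-symmetric lags combine into twice the real part.
        total = acs[0].real + 2.0 * sum(
            (acs[k] * cmath.exp(-1j * w * k)).real for k in range(1, len(acs))
        )
        spectrum.append(total)
    return spectrum


# Toy covariance: one unit-power plane wave at spatial frequency bin 2 of 8, plus noise power 0.1.
M, N_BINS = 4, 8
phi = 2.0 * cmath.pi * 2 / N_BINS
R = [
    [cmath.exp(1j * phi * (m - n)) + (0.1 if m == n else 0.0) for n in range(M)]
    for m in range(M)
]

acs = average_diagonals(R)
spec = spatial_spectrum(acs, N_BINS)
peak_bin = spec.index(max(spec))
```

With this toy input the spectrum peaks at the bin where the plane wave was placed. In the method summarised above, bins falling inside the interference sectors would then be the ones used for reconstruction.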

# (参考訳) GKSD(Generalized Kernel Stein Discrepancy) : 非パラメトリックグッドネス・オブ・フィットテストのための統一的アプローチ

Generalised Kernel Stein Discrepancy (GKSD): A Unifying Approach for Non-parametric Goodness-of-fit Testing ( http://arxiv.org/abs/2106.12105v1 )

ライセンス: CC BY 4.0

Wenkai Xu

(参考訳) カーネルStein差異(KSD)に基づく非パラメトリックな適合度検定手法は、様々なシナリオにおいて一般的な非正規化分布を検証するための有望なアプローチである。既存の研究は、検定性能を高めるための最適なカーネルの選択に焦点を当ててきた。しかし、Stein作用素は一般に一意ではなく、Stein作用素の選び方も検定性能にかなりの影響を及ぼしうる。本研究では、KSDに基づく適合度検定を行う際に、異なるStein作用素を理論的に比較し解釈するための統一的枠組みである一般化カーネルStein差異(GKSD)を提案する。提案するGKSDの枠組みが、既存のStein作用素とそれに対応する検定をどのように一般化するかを明示的に導出する。さらに、GKSDの枠組みは、切断分布や組成データといった複雑な新しいデータシナリオに対して、カーネルベースの非パラメトリック適合度検定を開発するための指針として利用できることを示す。実験結果は、提案する検定が第一種の過誤を適切に制御し、最大平均差異(MMD)に基づく検定を含む既存手法よりも高い検出力を達成することを示している。

Non-parametric goodness-of-fit testing procedures based on kernel Stein discrepancies (KSD) are promising approaches to validate general unnormalised distributions in various scenarios. Existing works have focused on studying optimal kernel choices to boost test performances. However, the Stein operators are generally non-unique, while different choices of Stein operators can also have considerable effect on the test performances. In this work, we propose a unifying framework, the generalised kernel Stein discrepancy (GKSD), to theoretically compare and interpret different Stein operators in performing the KSD-based goodness-of-fit tests. We derive explicitly how the proposed GKSD framework generalises existing Stein operators and their corresponding tests. In addition, we show that the GKSD framework can be used as a guide to develop kernel-based non-parametric goodness-of-fit tests for complex new data scenarios, e.g. truncated distributions or compositional data. Experimental results demonstrate that the proposed tests control type-I error well and achieve higher test power than existing approaches, including the test based on maximum-mean-discrepancy (MMD).

翻訳日:2021-06-25 01:28:52 公開日:2021-06-23

# (参考訳) 分布シフト下における近似線形回帰

Near-Optimal Linear Regression under Distribution Shift ( http://arxiv.org/abs/2106.12108v1 )

ライセンス: CC0 1.0

Qi Lei, Wei Hu, Jason D. Lee

(参考訳) 転移学習は、ソースドメインからは十分なデータが得られる一方で、ターゲットドメインのラベル付きデータが不足している場合に不可欠である。我々は、分布シフト下の線形回帰問題に対してミニマックス線形リスクを達成する推定器を開発する。提案アルゴリズムは、共変量シフトやモデルシフトを含む様々な転移学習の設定をカバーする。また、データが線形モデルまたは一般の非線形モデルから生成される場合についても検討する。様々なソース/ターゲット分布に対して、線形ミニマックス推定器のリスクは、非線形推定器まで含めたミニマックスリスクの絶対定数倍以内に収まることを示す。

Transfer learning is essential when sufficient data comes from the source domain, with scarce labeled data from the target domain. We develop estimators that achieve minimax linear risk for linear regression problems under distribution shift. Our algorithms cover different transfer learning settings including covariate shift and model shift. We also consider when data are generated from either linear or general nonlinear models. We show that linear minimax estimators are within an absolute constant of the minimax risk even among nonlinear estimators for various source/target distributions.

翻訳日:2021-06-25 01:01:16 公開日:2021-06-23

# (参考訳) 因果効果の境界と高次元データへの応用

Bounds on Causal Effects and Application to High Dimensional Data ( http://arxiv.org/abs/2106.12121v1 )

ライセンス: CC BY 4.0

Ang Li, Judea Pearl

(参考訳) 本稿では,バックドア条件やフロントドア基準の調整変数が部分的に観察された場合の因果効果を推定する問題に対処する。このようなシナリオでは、2つの非線形最適化問題を解くことによって因果効果の境界を導出し、境界が十分であることを示す。この最適化手法を用いて,推定パワーに対するバイアスをトレードオフできる次元性低減のための枠組みを提案し,その性能をシミュレーションにより実証する。

This paper addresses the problem of estimating causal effects when adjustment variables in the back-door or front-door criterion are partially observed. For such scenarios, we derive bounds on the causal effects by solving two non-linear optimization problems, and demonstrate that the bounds are sufficient. Using this optimization method, we propose a framework for dimensionality reduction that allows one to trade bias for estimation power, and demonstrate its performance using simulation studies.

翻訳日:2021-06-25 01:00:23 公開日:2021-06-23

# (参考訳) ソースフリードメイン適応意味セグメンテーションにおける暗黙的擬似ラベル整流法に対する負学習の活用

Exploiting Negative Learning for Implicit Pseudo Label Rectification in Source-Free Domain Adaptive Semantic Segmentation ( http://arxiv.org/abs/2106.12123v1 )

ライセンス: CC BY 4.0

Xin Luo, Wei Chen, Yusong Tan, Chen Li, Yulin He, Xiaogang Jia

(参考訳) ソースデータがない状況で、十分に訓練されたソースモデルに蓄えられた知識を、アノテーションのないターゲットドメインに転移することが望ましい。しかし、ソースフリードメイン適応(SFDA)の最先端手法には厳しい制約がある。1) ソースモデルの内部仕様へのアクセスが必須であること、2) 自己学習中の擬似ラベルがクリーンでなければならないことであり、これらはセマンティックセグメンテーションに依存する重要なタスクを信頼できないものにする。これらの問題に対処するため、本研究では擬似ラベル整流を伴うセマンティックセグメンテーション向けのドメイン適応手法(PR-SFDA)を開発する。この手法は2つのフェーズで動作する。1) 信頼度正則化教師なし学習: 最大二乗損失を用いてターゲットモデルを正則化し、予測の信頼度を確保する。2) ノイズを考慮した擬似ラベル学習: 負学習により訓練中のノイズを含む擬似ラベルへの耐性を持たせ、同時に正学習により高速な収束を達成する。ドメイン適応セマンティックセグメンテーションのベンチマークであるGTA5 → Cityscapesで広範な実験を行った。全体として、PR-SFDAは49.0 mIoUの性能を達成し、これは最先端手法の性能に非常に近い。なお、後者はソースモデルの内部仕様へのアクセスを必要とするのに対し、PR-SFDAは対照的にそれを一切必要としない。

It is desirable to transfer the knowledge stored in a well-trained source model onto a non-annotated target domain in the absence of source data. However, state-of-the-art methods for source free domain adaptation (SFDA) are subject to strict limits: 1) access to internal specifications of source models is a must; and 2) pseudo labels should be clean during self-training, making critical tasks relying on semantic segmentation unreliable. Aiming at these pitfalls, this study develops a domain adaptive solution to semantic segmentation with pseudo label rectification (namely PR-SFDA), which operates in two phases: 1) Confidence-regularized unsupervised learning: maximum squares loss is applied to regularize the target model to ensure the confidence in prediction; and 2) Noise-aware pseudo label learning: negative learning enables tolerance to noisy pseudo labels in training, while positive learning achieves fast convergence. Extensive experiments have been performed on the domain adaptive semantic segmentation benchmark GTA5 → Cityscapes. Overall, PR-SFDA achieves a performance of 49.0 mIoU, which is very close to that of the state-of-the-art counterparts. Note that the latter demand access to the source model's internal specifications, whereas the PR-SFDA solution needs none, in sharp contrast.

翻訳日:2021-06-25 00:38:43 公開日:2021-06-23

# (参考訳) 複数ソースによるセキュアなドメイン適応

Secure Domain Adaptation with Multiple Sources ( http://arxiv.org/abs/2106.12124v1 )

ライセンス: CC BY 4.0

Serban Stan, Mohammad Rostami

(参考訳) マルチソースアン教師付きドメイン適応(MUDA)は、最近検討された学習フレームワークであり、アノテーション付きデータで複数のソースドメインから知識を伝達することで、ターゲットドメインにおけるラベル付きデータの不足に対処することを目的としている。ソースデータは分散されているため、ソースドメインのデータのプライバシは自然な関心事になり得る。 MUDAのプライバシー問題に対処するために、埋め込みスペースにおけるドメインアライメントというアイデアの恩恵を受けます。本手法は,ドメイン間のデータサンプルを伝達することなく,内部学習分布を介して間接的にソースとターゲット分布を整列する手法である。提案手法を理論的に正当化し,提案手法が有効であることを示す実験を行い,既存の手法と比較した。

Multi-source unsupervised domain adaptation (MUDA) is a recently explored learning framework, where the goal is to address the challenge of labeled data scarcity in a target domain via transferring knowledge from multiple source domains with annotated data. Since the source data is distributed, the privacy of source domains' data can be a natural concern. We benefit from the idea of domain alignment in an embedding space to address the privacy concern for MUDA. Our method is based on aligning the sources and target distributions indirectly via internally learned distributions, without communicating data samples between domains. We justify our approach theoretically and perform extensive experiments to demonstrate that our method is effective and compares favorably against existing methods.

翻訳日:2021-06-25 00:26:56 公開日:2021-06-23

# (参考訳) NAX:memristive Xbarベースのコンピューティングシステムのためのニューラルネットワークとハードウェアアーキテクチャの共同設計

NAX: Co-Designing Neural Network and Hardware Architecture for Memristive Xbar based Computing Systems ( http://arxiv.org/abs/2106.12125v1 )

ライセンス: CC BY 4.0

Shubham Negi, Indranil Chakraborty, Aayush Ankit, Kaushik Roy

(参考訳) Memristive Crossbar Arrays (MCAs) を用いたインメモリコンピューティング(IMC)ハードウェアは、von-Neumannアーキテクチャに関連する"メモリウォール"問題を緩和するため、ディープニューラルネットワーク(DNN)を加速するために人気を集めている。このようなハードウェアにマッピングされたDNNのハードウェア効率(エネルギー、レイテンシ、領域)とアプリケーション精度(デバイスと回路の非理想性)は、カーネルサイズ、深さなどのネットワークパラメータに共依存する。クロスバーサイズのようなハードウェアアーキテクチャのパラメータですしかし、ネットワークパラメータとハードウェアパラメータの共最適化は、様々なクロスバーサイズにマッピングされた異なるカーネルサイズからなる困難な探索空間を示している。そこで我々は,ニューラルネットワークとiccベースのハードウェアアーキテクチャを共設計する効率的なニューラルネットワーク検索エンジンnaxを提案する。 NAXは前述の検索空間を探索し、各DNN層のカーネルと対応するクロスバーサイズを決定し、ハードウェア効率とアプリケーション精度の最適なトレードオフを実現する。 NAXの結果,ネットワークは異なるネットワーク層にまたがって異質なクロスバーサイズを持ち,クロスバーの非理想性を考慮した最適ハードウェア効率と精度が得られた。 CIFAR-10 と Tiny ImageNet では,ベースラインの ResNet-20 と ResNet-18 と比較して0.8%,0.2%,17%,4% の EDAP (Energy-delay-area product) を実現している。

In-Memory Computing (IMC) hardware using Memristive Crossbar Arrays (MCAs) is gaining popularity for accelerating Deep Neural Networks (DNNs), since it alleviates the "memory wall" problem associated with the von Neumann architecture. The hardware efficiency (energy, latency and area) as well as the application accuracy (considering device and circuit non-idealities) of DNNs mapped to such hardware are co-dependent on network parameters, such as kernel size and depth, and on hardware architecture parameters, such as crossbar size. However, co-optimization of both network and hardware parameters presents a challenging search space comprising different kernel sizes mapped to varying crossbar sizes. To that effect, we propose NAX -- an efficient neural architecture search engine that co-designs the neural network and the IMC-based hardware architecture. NAX explores the aforementioned search space to determine kernel and corresponding crossbar sizes for each DNN layer to achieve optimal tradeoffs between hardware efficiency and application accuracy. Our results from NAX show that the networks have heterogeneous crossbar sizes across different network layers, and achieve optimal hardware efficiency and accuracy considering the non-idealities in crossbars. On CIFAR-10 and Tiny ImageNet, our models achieve 0.8% and 0.2% higher accuracy, and 17% and 4% lower EDAP (energy-delay-area product), compared to baseline ResNet-20 and ResNet-18 models, respectively.
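To make the kernel-size versus crossbar-size coupling concrete, here is a rough sketch. It assumes the usual unrolled mapping of a convolution layer onto square crossbars (one weight per cell, no bit slicing) and takes EDAP literally as the product of energy, delay and area. The function names and every number in the tests are hypothetical and are not taken from NAX.

```python
import math


def crossbars_needed(kernel_size, in_channels, out_channels, xbar_size):
    """Count square crossbars needed for one conv layer.

    Assumed mapping: the layer is unrolled into a weight matrix with
    kernel_size * kernel_size * in_channels rows and out_channels columns,
    then tiled onto xbar_size x xbar_size arrays.
    """
    rows = kernel_size * kernel_size * in_channels
    cols = out_channels
    return math.ceil(rows / xbar_size) * math.ceil(cols / xbar_size)


def edap(energy, delay, area):
    """Energy-delay-area product; units are whatever the inputs use."""
    return energy * delay * area


def relative_reduction(baseline, candidate):
    """Fractional improvement of `candidate` over `baseline` (0.17 = 17%)."""
    return (baseline - candidate) / baseline
```

Even in this simplified model, changing the kernel size changes the number of rows to be tiled, so the best crossbar size can differ from layer to layer.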

翻訳日:2021-06-25 00:07:48 公開日:2021-06-23

# (参考訳) patentnet: 大規模不完全なマルチビュー、マルチモーダル、マルチラベル産業製品画像データベース

PatentNet: A Large-Scale Incomplete Multiview, Multimodal, Multilabel Industrial Goods Image Database ( http://arxiv.org/abs/2106.12139v1 )

ライセンス: CC BY 4.0

Fangyuan Lei, Da Huang, Jianjian Jiang, Ruijun Ma, Senhong Wang, Jiangzhong Cao, Yusen Lin and Qingyun Dai

(参考訳) ディープラーニング領域では、大規模な画像データセットがオブジェクト認識と検索の成功にブレークスルーをもたらす。今日では、イノベーションの具体例として、産業品の多様性が著しく大きくなり、不完全なマルチビュー、マルチモーダル、マルチラベルが従来のデータセットとは異なる。本稿では,産業製品画像および対応するテキストの多種多様な,正確かつ詳細なアノテーションを備えた産業製品データセットであるPatentNetを紹介する。 patentnetでは、画像とテキストは設計特許から引用される。 6m以上の画像と、専門家が手動でチェックした工業製品の対応するテキストの中で、パテントネットは、以前ベンチマークに使用されていた工業製品データセットよりも多種多様な産業製品画像データベースである。 patentnetは、ロカルノ分類協定に基づいて、何百万もの画像を32のクラスと219のサブクラスに分類する。画像分類,画像検索,不完全なマルチビュークラスタリングに関する広範な実験を通じて,我々の特許ネットワークは,既存の産業画像データセットよりもはるかに多様性があり,複雑で,困難であり,高いポテンシャルを享受できることを実証した。さらに、パテントネットにおける不完全なマルチビュー、マルチモーダル、マルチラベルの特徴は、人工知能コミュニティなどにおいて、別個の機会を提供することができる。

In the deep learning area, large-scale image datasets have brought breakthroughs in object recognition and retrieval. Nowadays, as the embodiment of innovation, the diversity of industrial goods is significantly larger, and their incomplete multiview, multimodal and multilabel characteristics differ from those of traditional datasets. In this paper, we introduce an industrial goods dataset, namely PatentNet, with numerous highly diverse, accurate and detailed annotations of industrial goods images and corresponding texts. In PatentNet, the images and texts are sourced from design patents. With over 6M images and corresponding texts of industrial goods, labeled and manually checked by professionals, PatentNet is the first ongoing industrial goods image database whose variety is wider than that of industrial goods datasets previously used for benchmarking. PatentNet organizes millions of images into 32 classes and 219 subclasses based on the Locarno Classification Agreement. Through extensive experiments on image classification, image retrieval and incomplete multiview clustering, we demonstrate that PatentNet is much more diverse, complex, and challenging, and has higher potential, than existing industrial image datasets. Furthermore, the incomplete multiview, multimodal and multilabel characteristics of PatentNet offer unparalleled opportunities in the artificial intelligence community and beyond.

翻訳日:2021-06-24 23:40:41 公開日:2021-06-23

# (参考訳) 個人別$k$-Clusteringのためのより良いアルゴリズム

Better Algorithms for Individually Fair $k$-Clustering ( http://arxiv.org/abs/2106.12150v1 )

ライセンス: CC BY 4.0

Deeparnab Chakrabarty and Maryam Negahbani

(参考訳) データクラスタリングの問題を$\ell_p$-normの目的(例)で研究する。 $k$-Median と $k$-Means) は、個々のフェアネスの文脈における。データセットは$n$ポイントで構成されており、(a)目的が最小化されるような$k$センターを探したいが、(b)各点$v$が最大$r(v)$の範囲内で中心を持つという個々の公正性制約を尊重する一方で、$r(v)$はその$(n/k)$から最寄りの点まで$v$sの距離である。 Jung, Kannan, Lutz [FORC 2020]は、この概念を導入し、$\ell_\infty$または$k$-Centerの目的に対して、証明可能な(近似的な)フェアネスと客観的保証を備えたクラスタリングアルゴリズムを設計した。 MahabadiとVakilian(ICML 2020)はこの問題を再考し、すべての$\ell_p$-normsに対してローカル検索アルゴリズムを提供した。経験上、アルゴリズムはjungなどよりも優れている。アルコスト面では大きな差($k$-Medianと$k$-Meansは$k$-Means)がありますが、フェアネスでは合理的な損失をもたらします。本稿では,Linear Programming (LP) 技術を用いて,理論と実践の両方において,この問題に対するより良いアルゴリズムを得る。我々は、既知のLPラウンドリング技術を変更することで、MV20よりもはるかに優れた目標に対して最悪のケースを保証できることを実証的に証明し、この目標が最適に非常に近いことを実証した。さらに、理論上の公平性保証は理論上mv20と同等であり、経験上、より公平な解が得られる。 lp {\em exactly} を解くことは禁止されるかもしれないが、実際には単純なスパーシフィケーション手法がアルゴリズムの実行時間を大幅に改善することを示している。

We study data clustering problems with $\ell_p$-norm objectives (e.g. $k$-Median and $k$-Means) in the context of individual fairness. The dataset consists of $n$ points, and we want to find $k$ centers such that (a) the objective is minimized, while (b) respecting the individual fairness constraint that every point $v$ has a center within a distance at most $r(v)$, where $r(v)$ is $v$'s distance to its $(n/k)$th nearest point. Jung, Kannan, and Lutz [FORC 2020] introduced this concept and designed a clustering algorithm with provable (approximate) fairness and objective guarantees for the $\ell_\infty$ or $k$-Center objective. Mahabadi and Vakilian [ICML 2020] revisited this problem to give a local-search algorithm for all $\ell_p$-norms. Empirically, their algorithms outperform Jung et al.'s by a large margin in terms of cost (for $k$-Median and $k$-Means), but they incur a reasonable loss in fairness. In this paper, our main contribution is to use Linear Programming (LP) techniques to obtain better algorithms for this problem, both in theory and in practice. We prove that by modifying known LP rounding techniques, one gets a worst-case guarantee on the objective which is much better than in MV20, and empirically, this objective is extremely close to the optimal. Furthermore, our theoretical fairness guarantees are comparable with MV20 in theory, and empirically, we obtain noticeably fairer solutions. Although solving the LP {\em exactly} might be prohibitive, we demonstrate that in practice, a simple sparsification technique drastically improves the run-time of our algorithm.
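The individual fairness constraint in this abstract is easy to state in code. The sketch below computes the fairness radius $r(v)$ for every point and checks whether a given set of centers is fair up to a factor `alpha`. It assumes Euclidean distance, rounds $n/k$ up, and counts $v$ itself as its own nearest point; these conventions and the function names are assumptions for illustration, and the code is a brute-force checker, not the LP-based algorithm of the paper.

```python
import math


def fairness_radii(points, k):
    """r(v) for every point: distance to its ceil(n / k)-th nearest point.

    Assumption: v counts as its own nearest point (distance 0), so r(v) is
    the radius of the smallest ball around v holding ceil(n / k) points.
    """
    n = len(points)
    m = math.ceil(n / k)
    radii = []
    for v in points:
        dists = sorted(math.dist(v, u) for u in points)
        radii.append(dists[m - 1])
    return radii


def is_alpha_fair(points, centers, k, alpha=1.0):
    """True if every point v has some center within alpha * r(v)."""
    radii = fairness_radii(points, k)
    return all(
        min(math.dist(v, c) for c in centers) <= alpha * r
        for v, r in zip(points, radii)
    )
```

For example, with the four points `(0, 0)`, `(1, 0)`, `(10, 0)`, `(11, 0)` and `k = 2`, every radius is 1, so placing both centers in the left pair is unfair to the right pair unless `alpha` is at least 10.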

翻訳日:2021-06-24 23:32:37 公開日:2021-06-23

# (参考訳) Atari-2600ベンチマークによる学習ベースポリシとヒューリスティックスを用いた幅ベースルックアヘッド

Width-based Lookaheads with Learnt Base Policies and Heuristics Over the Atari-2600 Benchmark ( http://arxiv.org/abs/2106.12151v1 )

ライセンス: CC BY 4.0

Stefan O'Toole, Nir Lipovetzky, Miquel Ramirez, Adrian Pearce

(参考訳) atari-2600ベンチマークを用いて,新たな幅ベースの計画学習アルゴリズムを提案する。提案するアルゴリズムは、以前の幅ベースのプランナーによる設計決定を慎重に分析することから着想を得ている。我々は,Atari-2600ゲームに対して新たなアルゴリズムをベンチマークし,これまで導入した幅ベース計画学習アルゴリズムであるRIW$_C$+CPV,$\pi$-IW(1),$\pi$-IW(1)+,$\pi$-HIW(n, 1)より優れていることを示す。さらに, atari-2600ゲームセットの分類について, その特徴について述べる。このゲームの分析は、導入された幅ベースのアルゴリズムの挙動と性能に関するさらなる洞察を与える。すなわち、大きな分岐因子を持つゲームや、希薄な有意義な報酬を持つゲームの場合、RIW$_C$+CPVは$\pi$-IW, $\pi$-IW(1)+および$\pi$-HIW(n, 1)より優れている。

We propose new width-based planning and learning algorithms applied over the Atari-2600 benchmark. The algorithms presented are inspired by a careful analysis of the design decisions made by previous width-based planners. We benchmark our new algorithms over the Atari-2600 games and show that our best performing algorithm, RIW$_C$+CPV, outperforms previously introduced width-based planning and learning algorithms $\pi$-IW(1), $\pi$-IW(1)+ and $\pi$-HIW(n, 1). Furthermore, we present a taxonomy of the set of Atari-2600 games according to some of their defining characteristics. This analysis of the games provides further insight into the behaviour and performance of the width-based algorithms introduced. Namely, for games with large branching factors, and games with sparse meaningful rewards, RIW$_C$+CPV outperforms $\pi$-IW, $\pi$-IW(1)+ and $\pi$-HIW(n, 1).

翻訳日:2021-06-24 23:12:33 公開日:2021-06-23

# (参考訳) ニューラルファッション画像のキャプション : データ多様性の会計

Neural Fashion Image Captioning : Accounting for Data Diversity ( http://arxiv.org/abs/2106.12154v1 )

ライセンス: CC BY-SA 4.0

Gilles Hacheme, Noureini Sayouti

(参考訳) 画像キャプションはアプリケーション分野が拡大しており、ファッションも例外ではない。自動アイテム記述を持つことは、何十万もの画像をホストするファッションwebプラットフォームにとって非常に興味深いことです。本論文はファッション画像のキャプションを初めて行う手法の1つである。 InFashAIv1データセットには、約16万のアフリカのファッションアイテムイメージとそのタイトル、価格、一般的な説明が含まれている。 InFashAIv1に加えて、よく知られたDeepFashionデータセットも使用しました。キャプションは、CNNエンコーダとRNNデコーダで作られた \textit{Show and Tell} モデルを使って生成される。両データセットのモデルを共同でトレーニングすることで,アフリカのスタイルのファッションイメージのキャプション品質が向上し,西洋スタイルのデータからの移行学習が示唆された。 InFashAIv1データセットは \href{https://github.com/hgilles06/infashai}{Github} でリリースされ、より多くの多様性を含む作業を促進する。

Image captioning has increasingly large domains of application, and fashion is no exception. Automatic item descriptions are of great interest to fashion web platforms, which sometimes host hundreds of thousands of images. This paper is one of the first to tackle image captioning for fashion images. To help address dataset diversity issues, we introduced the InFashAIv1 dataset containing almost 16,000 African fashion item images with their titles, prices and general descriptions. We also used the well-known DeepFashion dataset in addition to InFashAIv1. Captions are generated using the \textit{Show and Tell} model, made of a CNN encoder and an RNN decoder. We showed that jointly training the model on both datasets improves caption quality for African style fashion images, suggesting transfer learning from Western style data. The InFashAIv1 dataset is released on \href{https://github.com/hgilles06/infashai}{Github} to encourage work that includes more diversity.

翻訳日:2021-06-24 22:55:52 公開日:2021-06-23

# (参考訳) 地域意識ネットワーク: 群衆カウントのためのモデル人間のトップダウン視覚知覚メカニズム

Region-Aware Network: Model Human's Top-Down Visual Perception Mechanism for Crowd Counting ( http://arxiv.org/abs/2106.12163v1 )

ライセンス: CC BY 4.0

Yuehai Chen, Jing Yang, Dong Zhang, Kun Zhang, Badong Chen and Shaoyi Du

(参考訳) 背景雑音とスケール変動は、群集数で長年認識されてきた一般的な問題である。人間は群衆のイメージをちらっと見て、人間のほぼ数を瞬時に把握し、群衆の領域や、地球規模の受容性のある群衆の混雑度に注意を払います。そこで本稿では,人間のトップダウン視覚認識機構をモデル化し,RANetと呼ばれる領域認識ブロックを用いた新しいフィードバックネットワークを提案する。まず,入力画像中の候補群領域を優先する優先順位マップを生成するためのフィードバックアーキテクチャを提案する。前者により、ラネットは群衆地域にもっと注意を払うことができる。次に、グローバルレセプティブフィールドを介して、文脈情報を入力画像に適応的にエンコードできる領域認識ブロックを設計する。具体的には、入力画像全体とその優先度マップを列ベクトルの形でスキャンし、それらの類似性を推定する関連行列を得る。得られた関連行列は、ピクセル間のグローバルな関係を構築するために使用される。提案手法は,いくつかの公開データセットにおいて,最先端の群集カウント法より優れる。

Background noise and scale variation are common problems that have long been recognized in crowd counting. Humans glance at a crowd image and instantly know the approximate number of people and where they are, by attending to the crowd regions and their degree of congestion with a global receptive field. Hence, in this paper, we propose a novel feedback network with a Region-Aware block, called RANet, by modeling the human top-down visual perception mechanism. Firstly, we introduce a feedback architecture to generate priority maps that provide a prior about candidate crowd regions in input images. The prior enables RANet to pay more attention to crowd regions. Then we design a Region-Aware block that can adaptively encode contextual information into input images through a global receptive field. More specifically, we scan each whole input image and its priority map in the form of column vectors to obtain a relevance matrix estimating their similarity. The relevance matrix obtained is then used to build global relationships between pixels. Our method outperforms state-of-the-art crowd counting methods on several public datasets.

翻訳日:2021-06-24 22:45:18 公開日:2021-06-23

# (参考訳) Cough Sounds を用いたディープニューラルネットワークによる呼吸病理分類

Deep Neural Network Based Respiratory Pathology Classification Using Cough Sounds ( http://arxiv.org/abs/2106.12174v1 )

ライセンス: CC BY 4.0

Balamurali B T, Hwan Ing Hee, Saumitra Kapoor, Oon Hoe Teoh, Sung Shin Teng, Khai Pin Lee, Dorien Herremans, Jer Ming Chen

(参考訳) インテリジェントなシステムは、私たちの医療システムと同様に、世界を変えつつある。本研究では,気管支喘息,上気道感染症(urti),下気道感染症(lrti)などの健康な小児と,病態のある小児の区別が可能な,深層学習に基づくcough音分類モデルを提案する。深層ニューラルネットワークモデルをトレーニングするために,臨床医の診断でラベル付けされた新しいcough音のデータセットを収集した。選択されたモデルは、Mel Frequency Cepstral Coefficients(MFCC)機能に基づいた双方向長短メモリネットワーク(BiLSTM)である。結果として得られた訓練されたモデルは、健康または病理(一般には特定の呼吸器病理学に属する)の2つのクラスを分類するために訓練された場合、医師の診断によって提供されるラベルに分類すると、84\%を超える精度に達する。対象者の呼吸病状を分類するために, 被験者1人あたりに複数カウエポックの結果が組み合わされた。その結果、3つの呼吸器疾患の予測精度は91\%を超える。しかし、モデルが4種類のうずくの分類と識別を行うように訓練されると、全体的な精度は低下し、1種類の病的くずはしばしば別のものと誤分類される。しかし, 健康的, 病理学的に何らかの病態を有すると分類された健康性うがいを考慮すれば, 4種類のモデル全体の精度は84\%以上である。 MFCCの特徴空間の経時的変化は, 病理学的, 回復的生地を比較して検討した結果, 病態によらず病理的生地が同じ特徴空間を占めるため, MFCCの特徴のみを区別することが困難であった。

Intelligent systems are transforming the world, as well as our healthcare system. We propose a deep learning-based cough sound classification model that can distinguish between children with healthy coughs and children with pathological coughs caused by conditions such as asthma, upper respiratory tract infection (URTI), and lower respiratory tract infection (LRTI). In order to train a deep neural network model, we collected a new dataset of cough sounds, labelled with the clinician's diagnosis. The chosen model is a bidirectional long short-term memory network (BiLSTM) based on Mel Frequency Cepstral Coefficient (MFCC) features. When trained to classify two classes of coughs -- healthy or pathological (in general or belonging to a specific respiratory pathology) -- the resulting model reaches an accuracy exceeding 84\% when classifying coughs against the label provided by the physicians' diagnosis. In order to classify a subject's respiratory pathology condition, the results of multiple cough epochs per subject were combined. The resulting prediction accuracy exceeds 91\% for all three respiratory pathologies. However, when the model is trained to classify and discriminate among the four classes of coughs, overall accuracy drops: one class of pathological cough is often misclassified as another. Nevertheless, if one counts a healthy cough classified as healthy and a pathological cough classified as having some kind of pathology as correct, then the overall accuracy of the four-class model is above 84\%. A longitudinal study of the MFCC feature space, comparing pathological and recovered coughs collected from the same subjects, revealed that pathological coughs, irrespective of the underlying condition, occupy the same feature space, making them harder to differentiate using only MFCC features.
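The abstract says that results from several cough epochs were combined per subject but does not say how. One simple possibility, shown purely as an assumption, is a majority vote over the per-epoch labels. The function names and label strings below are hypothetical.

```python
from collections import Counter


def subject_label(epoch_predictions):
    """Majority vote over per-epoch predicted labels for one subject.

    Ties are broken in favour of the label that appears first, because
    Counter.most_common keeps first-seen order among equal counts.
    """
    counts = Counter(epoch_predictions)
    return counts.most_common(1)[0][0]


def subject_accuracy(predictions_by_subject, truth_by_subject):
    """Fraction of subjects whose voted label matches the diagnosis."""
    correct = sum(
        subject_label(preds) == truth_by_subject[subject]
        for subject, preds in predictions_by_subject.items()
    )
    return correct / len(predictions_by_subject)
```

Voting of this kind can lift subject-level accuracy above epoch-level accuracy, since occasional misclassified epochs are outvoted; whether the authors used voting, probability averaging, or another rule is not stated in the abstract.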

翻訳日:2021-06-24 22:29:15 公開日:2021-06-23

# (参考訳) 不確かさ属性による画像生成の公正性

Fairness for Image Generation with Uncertain Sensitive Attributes ( http://arxiv.org/abs/2106.12182v1 )

ライセンス: CC BY 4.0

Ajil Jalal and Sushrut Karmalkar and Jessica Hoffmann and Alexandros G. Dimakis and Eric Price

(参考訳) 本研究は、画像超解像などの生成手順の文脈における公平性の問題に対処し、標準分類設定と異なる定義を包含する。さらに、伝統的グループフェアネスの定義は、通常、指定された保護されたグループ(これらのグループ化が人工的であり、歴史的、政治的モチベーションを担っているという事実を象徴する)に関して定義されるが、本質的な真理の同一性は存在しないことを強調する。例えば、南アジアと東アジアは一つのグループか別々のグループと見なされるべきか? ひとつの人種を全体、あるいはさらに性別によって分割すべきだろうか? どの集団が有効で、どのグループに属しているかを決めることは不可能なジレンマであり、アジア人に関して「フェア」であることは、南アジア人に対して「アンフェア」であることを必要とするかもしれない。これにより、関連するグルーピングに対してアルゴリズムが \emph{oblivious} となるような定義が導入される。グループフェアネスの直感的な概念を定義し、不整合性とトレードオフを研究する。統計学的パリティの自然な拡張はグループ化に強く依存しており、<emph{impossible} は必然的に達成できることを示した。一方、概念的に新しい定義である条件付き比例表現は、後サンプリングによって明確化することができる。実験は, 最新の生成モデルを用いて, 理論結果の検証を行い, 公平な画像再構成を実現する。

This work tackles the issue of fairness in the context of generative procedures, such as image super-resolution, which entail different definitions from the standard classification setting. Moreover, while traditional group fairness definitions are typically defined with respect to specified protected groups -- camouflaging the fact that these groupings are artificial and carry historical and political motivations -- we emphasize that there are no ground truth identities. For instance, should South and East Asians be viewed as a single group or separate groups? Should we consider one race as a whole or further split by gender? Choosing which groups are valid and who belongs in them is an impossible dilemma and being ``fair'' with respect to Asians may require being ``unfair'' with respect to South Asians. This motivates the introduction of definitions that allow algorithms to be \emph{oblivious} to the relevant groupings. We define several intuitive notions of group fairness and study their incompatibilities and trade-offs. We show that the natural extension of demographic parity is strongly dependent on the grouping, and \emph{impossible} to achieve obliviously. On the other hand, the conceptually new definition we introduce, Conditional Proportional Representation, can be achieved obliviously through Posterior Sampling. Our experiments validate our theoretical results and achieve fair image reconstruction using state-of-the-art generative models.
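A toy sketch can show why sampling from the posterior behaves differently from picking the single most likely reconstruction. It reduces the problem to a discrete choice among groups given one degraded observation, which is a strong simplification of image reconstruction. The group names, probabilities and function names are hypothetical, and reading Conditional Proportional Representation as "output group frequencies match the posterior" is stated here as an assumption.

```python
import random


def posterior(prior, likelihood):
    """Bayes rule over a finite set of groups for one observation."""
    joint = {g: prior[g] * likelihood[g] for g in prior}
    z = sum(joint.values())
    return {g: p / z for g, p in joint.items()}


def map_reconstruct(post):
    """Always return the single most probable group."""
    return max(post, key=post.get)


def posterior_sample(post, rng):
    """Return a group with probability equal to its posterior mass."""
    groups = list(post)
    weights = [post[g] for g in groups]
    return rng.choices(groups, weights=weights, k=1)[0]
```

With a prior of 0.8 for group `A` and 0.2 for group `B` and an uninformative observation, the maximum a posteriori rule returns `A` every time, whereas posterior sampling returns `B` roughly one time in five, and it does so without being told which grouping matters.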

翻訳日:2021-06-24 22:13:27 公開日:2021-06-23

# (参考訳) 高齢者の日常生活活動支援技術の現状と展望

A Review of Assistive Technologies for Activities of Daily Living of Elderly ( http://arxiv.org/abs/2106.12183v1 )

ライセンス: CC BY 4.0

Nirmalya Thakur and Chia Y. Han

(参考訳) この世紀の特筆すべき特徴の1つは、常に増加を続けている高齢者の人口である。高齢者は、身体障害、認知障害、記憶の弱化、老化に伴う非組織的行動などにより、様々なニーズと要求を抱えている。これらの制限の範囲は、年齢、性別、背景、経験、スキル、知識など、高齢者の多様性によっても異なる。高齢者が日常生活活動(ADL)を自立的に行うためには,年齢の増大や能力の制限など,様々なニーズと課題がある。さらに、介護者の不足は、高齢者が日常の日常業務をこなし、自立した生活と活動的な高齢化を維持するために、テクノロジーベースのサービスの必要性が高まっている。これらのニーズに対処するため、この作品はこの分野で3つの主要な貢献をしている。まず,高齢者のadl実施支援を目的とした生活支援技術の包括的レビューを行う。第2に, 本研究は, スマートホームとスマートシティにおける高齢者介護支援サービス実施の文脈において現在存在している課題について考察する。最後に、この研究は、この分野での既存の作業の実装、拡張、統合のためのアプローチを概説し、変化し続けるニーズに応じて高齢者にパーソナライズされた支援とユーザー中心の行動介入を提供する、待望のフレームワークを開発するためのものである。

One of the distinct features of this century has been the constantly rising population of older adults. Elderly people have several needs and requirements due to the physical disabilities, cognitive issues, weakened memory and disorganized behavior that they face with increasing age. The extent of these limitations also differs according to the diversity among the elderly, which includes age, gender, background, experience, skills, knowledge and so on. These varying needs and challenges, which grow with age, limit the ability of older adults to perform Activities of Daily Living (ADLs) independently. In addition, the shortage of caregivers creates a looming need for technology-based services for elderly people, to assist them in performing their daily routine tasks and to sustain their independent living and active aging. To address these needs, this work makes three major contributions to this field. First, it provides a rather comprehensive review of assisted living technologies aimed at helping elderly people to perform ADLs. Second, the work discusses the challenges identified through this review that currently exist in the implementation of assisted living services for elderly care in Smart Homes and Smart Cities. Finally, the work also outlines an approach for the implementation, extension and integration of existing works in this field, towards the development of a much-needed framework that can provide personalized assistance and user-centered behavior interventions to the elderly according to their varying and ever-changing needs.

翻訳日:2021-06-24 21:47:24 公開日:2021-06-23

# (参考訳) 希少クラスのための合成サンプルの画像から画像への変換

Image-to-Image Translation of Synthetic Samples for Rare Classes ( http://arxiv.org/abs/2106.12212v1 )

ライセンス: CC BY 4.0

Edoardo Lanzini and Sara Beery

(参考訳) 希少なクラスは一般的なクラスよりも桁違いに頻繁に観察され、希少なクラスがほんの一握りの例しか持たない高度に不均衡なデータに繋がる。少数の例から学ぶことは、ディープラーニングベースの分類アルゴリズムにとって既知の課題であり、低ショット学習の分野の焦点である。これらのレアクラスのトレーニングデータを増やす潜在的アプローチの一つは、限られた実データを合成サンプルで強化することである。これは役に立つことが示されているが、実際のデータでテストした場合、実データと合成データのドメインシフトはアプローチの有効性を妨げる。本研究では,動物種分類における合成画像と実画像の領域ギャップを埋めるための画像間翻訳手法について,野生動物を観察する静止カメラのカメラトラップから収集したデータを用いて検討する。我々は、ソースドメインとターゲットドメイン間の低レベルの特徴アライメントを用いて、グラフィックエンジンを用いて生成される稀な種の合成データを作成する。非整合合成データを用いたシステムと比較すると, 希少種に対する分類誤差は有意に減少した。

The natural world is long-tailed: rare classes are observed orders of magnitude less frequently than common ones, leading to highly imbalanced data where rare classes can have only a handful of examples. Learning from few examples is a known challenge for deep learning based classification algorithms, and is the focus of the field of low-shot learning. One potential approach to increase the training data for these rare classes is to augment the limited real data with synthetic samples. This has been shown to help, but the domain shift between real and synthetic data hinders the approach's efficacy when tested on real data. We explore the use of image-to-image translation methods to close the domain gap between synthetic and real imagery for animal species classification in data collected from camera traps: motion-activated static cameras used to monitor wildlife. We use low-level feature alignment between source and target domains to make synthetic data for a rare species generated using a graphics engine more "realistic". Compared against a system augmented with unaligned synthetic data, our experiments show a considerable decrease in classification error rates on a rare species.

翻訳日:2021-06-24 21:38:55 公開日:2021-06-23

# (参考訳) バイオメディカルネームの認識:課題と解決

Recognising Biomedical Names: Challenges and Solutions ( http://arxiv.org/abs/2106.12230v1 )

ライセンス: CC BY 4.0

Xiang Dai

(参考訳) 生物医学的な文書の量の増加率は驚異的だ。これらの文書に閉じ込められた情報をアンロックすることで、研究者や実践者は情報の世界において確実に操作できる。バイオメディカルNERは、通常、NLPパイプラインの最初のステップとして用いられる。シーケンシャルタグ技術に基づく標準NERモデルは、ジェネリックドメインにおける短いエンティティ参照を認識するのに長けている。 However, there are several open challenges of applying these models to recognise biomedical names: 1) Biomedical names may contain complex inner structure (discontinuity and overlapping) which cannot be recognised using standard sequence tagging technique; 2) The training of NER models usually requires large amount of labelled data, which are difficult to obtain in the biomedical domain; and, 3) Commonly used language representation models are pre-trained on generic data; a domain shift therefore exists between these models and target biomedical data. 1) 不連続な言及を認識可能なトランジッションベースのnerモデルを提案し, 2) 適切な事前学習データを生成するためのコスト効率の高いアプローチを開発し,3) nerのためのデータ拡張手法をいくつか設計する。我々の貢献は、特に新しいバイオメディカル・アプリケーションが必要な場合に、明らかな実践的意味を持つ。提案手法は,少ないラベル付きデータしか必要とせず,NERモデルの良好な性能を実現するのに有効である。事前学習データの選択に関する調査は、ドメイン内データを用いて事前学習した言語表現モデルを組み込むことで、モデルを改善することができる。最後に,提案する遷移型nerモデルは,不連続な言及を認識することにより,さらに性能を向上させることができる。

The growth rate in the amount of biomedical documents is staggering. Unlocking information trapped in these documents can enable researchers and practitioners to operate confidently in the information world. Biomedical NER, the task of recognising biomedical names, is usually employed as the first step of the NLP pipeline. Standard NER models, based on sequence tagging technique, are good at recognising short entity mentions in the generic domain. However, there are several open challenges of applying these models to recognise biomedical names: 1) Biomedical names may contain complex inner structure (discontinuity and overlapping) which cannot be recognised using standard sequence tagging technique; 2) The training of NER models usually requires large amount of labelled data, which are difficult to obtain in the biomedical domain; and, 3) Commonly used language representation models are pre-trained on generic data; a domain shift therefore exists between these models and target biomedical data. To deal with these challenges, we explore several research directions and make the following contributions: 1) we propose a transition-based NER model which can recognise discontinuous mentions; 2) We develop a cost-effective approach that nominates the suitable pre-training data; and, 3) We design several data augmentation methods for NER. Our contributions have obvious practical implications, especially when new biomedical applications are needed. Our proposed data augmentation methods can help the NER model achieve decent performance, requiring only a small amount of labelled data. Our investigation regarding selecting pre-training data can improve the model by incorporating language representation models, which are pre-trained using in-domain data. Finally, our proposed transition-based NER model can further improve the performance by recognising discontinuous mentions.

翻訳日:2021-06-24 21:28:06 公開日:2021-06-23

# (参考訳) 腎乳頭癌の組織像におけるサブタイピングのためのインスタンスベース視覚トランスフォーマ

Instance-based Vision Transformer for Subtyping of Papillary Renal Cell Carcinoma in Histopathological Image ( http://arxiv.org/abs/2106.12265v1 )

ライセンス: CC BY 4.0

Zeyu Gao, Bangyang Hong, Xianli Zhang, Yang Li, Chang Jia, Jialun Wu, Chunbao Wang, Deyu Meng, Chen Li

(参考訳) P型腎細胞癌(RCC)の病理組織学的サブタイプは1型対2型であり,本態性予後因子である。 pRCCの2つのサブタイプは類似したパターン、すなわち乳頭状構造を持つが、細胞および細胞層レベルのパターンを含む微妙な違いがある。しかし、細胞層と細胞層レベルのパターンは、大規模な病理組織像において既存のCNNモデルではほとんど捉えられず、このような細粒度分類にこれらのモデルを直接適用する際の障害となる。そこで,本研究では,pRCCサブタイピングタスクにおける病理像の頑健な表現をインスタンスパッチから抽出し,より微細な特徴を抽出し(セグメント化された核を取り囲み,予測等級を割り当てることにより)学習する。提案するi-vitは、top-kインスタンスを入力として、位置埋め込み層、グレードエンベディング層、マルチヘッド多層セルフアテンションモジュールによってセル層とセル層の両方のレベルパターンをキャプチャする。提案フレームワークの性能を評価するため,1型と2型pRCCの171枚のスライド画像から,経験的病理医を1162個の関心領域に招待した。実験結果から,提案手法は既存のCNNモデルよりも優れた性能を示すことが示された。

The histological subtype of papillary (p) renal cell carcinoma (RCC), type 1 vs. type 2, is an essential prognostic factor. The two subtypes of pRCC have a similar pattern, i.e., the papillary architecture, yet some subtle differences, including cellular and cell-layer level patterns. However, the cellular and cell-layer level patterns can hardly be captured by existing CNN-based models in large-size histopathological images, which is an obstacle to directly applying these models to such a fine-grained classification task. This paper proposes a novel instance-based Vision Transformer (i-ViT) to learn robust representations of histopathological images for the pRCC subtyping task by extracting finer features from instance patches (by cropping around segmented nuclei and assigning predicted grades). The proposed i-ViT takes top-K instances as input and aggregates them to capture both the cellular and cell-layer level patterns using a position-embedding layer, a grade-embedding layer, and a multi-head multi-layer self-attention module. To evaluate the performance of the proposed framework, experienced pathologists were invited to select 1162 regions of interest from 171 whole slide images of type 1 and type 2 pRCC. Experimental results show that the proposed method outperforms existing CNN-based models by a significant margin.

翻訳日:2021-06-24 21:26:53 公開日:2021-06-23

# (参考訳) 糖尿病網膜症の網膜底画像分類のためのラベル管理機構

A Label Management Mechanism for Retinal Fundus Image Classification of Diabetic Retinopathy ( http://arxiv.org/abs/2106.12284v1 )

ライセンス: CC BY 4.0

Mengdi Gao, Ximeng Feng, Mufeng Geng, Zhe Jiang, Lei Zhu, Xiangxi Meng, Chuanqing Zhou, Qiushi Ren and Yanye Lu

(参考訳) 糖尿病網膜症(DR)は、成人の視力障害と不可逆性失明の最も多い原因である。深層学習 (DL) のルネッサンスにより, DLをベースとしたDR診断は, DRの早期スクリーニングと重症度向上に有効である。しかし、ディープニューラルネットワーク(DNN)のトレーニングには、大量の注意深くラベル付けされたデータが必要である。ノイズの多いラベルデータは、大量のデータをラベル付けすることで、モデルのパフォーマンスを低下させる。本研究では,DNNがノイズの多いデータに対する過度な適合を克服するための新しいラベル管理機構(LMM)を提案する。 LMMはベイズ統計および時間重み付け手法における最大後続確率(MAP)を用いて、不確定データのラベルを選択的に補正し、徐々にトレーニングデータを浄化し、分類性能を向上させる。合成ノイズデータ(Messidor \ and our collected DR dataset)と実世界のノイズデータ(ANIMAL-10N)の総合実験により、LMMはモデルの性能を向上し、3つの最先端手法よりも優れていることが示された。

Diabetic retinopathy (DR) remains the most prevalent cause of vision impairment and irreversible blindness in working-age adults. Due to the renaissance of deep learning (DL), DL-based DR diagnosis has become a promising tool for the early screening and severity grading of DR. However, training deep neural networks (DNNs) requires an enormous amount of carefully labeled data. Noisy labels may be introduced when labeling large amounts of data, degrading the performance of models. In this work, we propose a novel label management mechanism (LMM) for DNNs to overcome overfitting on noisy data. LMM utilizes maximum a posteriori (MAP) probability from Bayesian statistics and a time-weighted technique to selectively correct the labels of unclean data, which gradually purifies the training data and improves classification performance. Comprehensive experiments on both synthetic noise data (Messidor \& our collected DR dataset) and real-world noise data (ANIMAL-10N) demonstrated that LMM could boost the performance of models and is superior to three state-of-the-art methods.

翻訳日:2021-06-24 21:15:04 公開日:2021-06-23

# (参考訳) 行動模倣分布:フェデレーション学習のための個人行動と集団行動の組み合わせ

Behavior Mimics Distribution: Combining Individual and Group Behaviors for Federated Learning ( http://arxiv.org/abs/2106.12300v1 )

ライセンス: CC BY 4.0

Hua Huang, Fanhua Shang, Yuanyuan Liu, Hongying Liu

(参考訳) Federated Learning(FL)は、アクティブで有望な分散機械学習パラダイムになった。統計的不均質性の結果,近年の研究では,ローカル更新によるクライアントドリフトにより,一般的なfl法(fedavgなど)の性能が劇的に低下することが明らかとなった。本稿では,個人とグループの両方の行動を利用して分布を模倣し,不均一性に対処できる新しいフェデレート学習アルゴリズム(IGFL)を提案する。既存のFLメソッドとは異なり、IGFLはクライアントとサーバの最適化にも適用できます。本稿では,IGFLのサーバ最適化における注目度に基づく新しいフェデレーション学習を提案する。私たちの知る限りでは、フェデレーション最適化に注意機構を組み込むのはこれが初めてです。広範な実験を行い,igflが既存の連合学習手法の性能を著しく向上できることを示す。特に個人間のデータの分布が多様である場合、IGFLは以前のベースラインと比較して約13%の精度で分類できる。

Federated Learning (FL) has become an active and promising distributed machine learning paradigm. As a result of statistical heterogeneity, recent studies clearly show that the performance of popular FL methods (e.g., FedAvg) deteriorates dramatically due to the client drift caused by local updates. This paper proposes a novel Federated Learning algorithm (called IGFL), which leverages both Individual and Group behaviors to mimic distribution, thereby improving the ability to deal with heterogeneity. Unlike existing FL methods, our IGFL can be applied to both client and server optimization. As a by-product, we propose a new attention-based federated learning in the server optimization of IGFL. To the best of our knowledge, this is the first time to incorporate attention mechanisms into federated optimization. We conduct extensive experiments and show that IGFL can significantly improve the performance of existing federated learning methods. Especially when the distributions of data among individuals are diverse, IGFL can improve the classification accuracy by about 13% compared with prior baselines.
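For context, the FedAvg baseline mentioned in this abstract aggregates client models on the server by averaging their parameters, weighted by each client's number of local samples. The sketch below shows only that baseline step on flat parameter lists. It is not IGFL, whose individual/group and attention-based aggregation the abstract does not specify, and the function name is an assumption.

```python
def fedavg(client_weights, client_sizes):
    """Sample-size-weighted average of client parameter vectors.

    client_weights: list of equal-length lists of floats, one per client.
    client_sizes:   number of local training samples held by each client.
    """
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]
```

When client data distributions differ strongly, the locally trained vectors drift apart and this plain average can land far from any client's optimum, which is the client-drift problem the abstract refers to.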

翻訳日:2021-06-24 20:57:20 公開日:2021-06-23
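
As a point of reference for the abstract above, the FedAvg baseline it mentions aggregates client models with a sample-size-weighted average. The sketch below shows only that baseline aggregation step in plain Python. It is not IGFL, and the names `fedavg`, `client_weights`, and `client_sizes` are illustrative assumptions, not identifiers from the paper.

```python
from typing import Dict, List

# A "model" here is just a mapping from parameter name to a flat list of floats
# (an assumption made to keep the sketch free of external dependencies).
Weights = Dict[str, List[float]]


def fedavg(client_weights: List[Weights], client_sizes: List[int]) -> Weights:
    """Average client parameters, weighting each client by its sample count."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client model")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of samples must be positive")

    averaged: Weights = {}
    for name in client_weights[0]:
        length = len(client_weights[0][name])
        averaged[name] = [
            sum(w[name][i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(length)
        ]
    return averaged


if __name__ == "__main__":
    clients = [{"w": [1.0, 2.0]}, {"w": [3.0, 4.0]}]
    # The second client holds three times as much data, so it dominates the average.
    print(fedavg(clients, client_sizes=[1, 3]))  # {'w': [2.5, 3.5]}
```

When client data distributions differ strongly, this plain average is the step that suffers from client drift, which is the problem the abstract sets out to address.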

# (参考訳) 単一"in-the-wild"画像による3次元舌再建

3D human tongue reconstruction from single "in-the-wild" images ( http://arxiv.org/abs/2106.12302v1 )

ライセンス: CC BY 4.0

Stylianos Ploumpis, Stylianos Moschoglou, Vasileios Triantafyllou, Stefanos Zafeiriou

(参考訳) 単一の画像からの3D顔の再構成は、特にリアルな3Dアバター作成、不変顔認識、顔の幻覚といった多くのアプリケーションで広く使われているため、コンピュータビジョンコミュニティへの関心が高まったタスクである。 90年代後半に3D Morphable Modelが導入されて以来、我々はこの課題に特に取り組むことを目的とした研究の爆発を目撃した。しかし, 深層学習に起因した単一画像からの3次元顔再構成の精度は高まっているものの, 3次元アバター表現の現実性には極めて重要であるにもかかわらず, 舌などの顔の微細で変形性の高い成分は, 文学におけるすべての3次元顔モデルにはまだ欠落している。本研究では,まず,舌とともに3次元顔の正確な再構築を行う,エンド・ツー・エンドのトレーニング可能なパイプラインについて述べる。さらに,3次元舌表面生成に適した新しいGAN法を導入することにより,このパイプラインを「夢中」画像で堅牢にする。最後に、性別、年齢、民族的背景の異なる700人の生スキャン1,800人からなる、最初の多様な舌データセットをコミュニティに公開します。定量的および定性的実験の広範なシリーズで示すように、我々のモデルは、悪質な「未熟な」条件下であっても、頑健で現実的な3D舌の構造を捉えることができる。

3D face reconstruction from a single image is a task that has garnered increased interest in the Computer Vision community, especially due to its broad use in a number of applications such as realistic 3D avatar creation, pose invariant face recognition and face hallucination. Since the introduction of the 3D Morphable Model in the late 90's, we have witnessed an explosion of research aimed specifically at this task. Nevertheless, despite the increasing level of detail in 3D face reconstructions from single images, mainly attributed to deep learning advances, finer and highly deformable components of the face such as the tongue are still absent from all 3D face models in the literature, although they are very important for the realness of 3D avatar representations. In this work we present the first, to the best of our knowledge, end-to-end trainable pipeline that accurately reconstructs the 3D face together with the tongue. Moreover, we make this pipeline robust on "in-the-wild" images by introducing a novel GAN method tailored for 3D tongue surface generation. Finally, we make publicly available to the community the first diverse tongue dataset, consisting of 1,800 raw scans of 700 individuals varying in gender, age, and ethnic background. As we demonstrate in an extensive series of quantitative as well as qualitative experiments, our model proves to be robust and realistically captures the 3D tongue structure, even in adverse "in-the-wild" conditions.

翻訳日:2021-06-24 20:43:10 公開日:2021-06-23

# (参考訳) 学習した特徴空間の構造による分類モデルのロバスト性の推定

Estimating the Robustness of Classification Models by the Structure of the Learned Feature-Space ( http://arxiv.org/abs/2106.12303v1 )

ライセンス: CC BY 4.0

Kalun Ho, Franz-Josef Pfreundt, Janis Keuper, Margret Keuper

(参考訳) 過去10年間で、ディープイメージ分類ネットワークの開発は、主にimagenetのような標準ベンチマークにおける分類精度の観点から、最高のパフォーマンスの探索によって進められてきた。最近では、モデルロバストネスの概念によって、この焦点が拡張されている。データ分布の変化を事前に把握したモデル一般化能力。 ImageNet-Cのような新しいベンチマークは堅牢性を測定するために導入されたが、固定テストセットはデータバリエーションのごく一部しかキャプチャできないため、新しい過度なソリューションを生成する傾向にある、と我々は主張する。これらの欠点を克服するために、学習した特徴空間の構造から直接モデルの堅牢性を推定することを提案する。学習した分類器内の潜在表現の教師なしクラスタリングによって得られるロバスト性指標を導入し,破損したテストデータに対するモデル性能に非常に高い相関を示す。

Over the last decade, the development of deep image classification networks has mostly been driven by the search for the best performance in terms of classification accuracy on standardized benchmarks like ImageNet. More recently, this focus has been expanded by the notion of model robustness, i.e., the generalization abilities of models towards previously unseen changes in the data distribution. While new benchmarks, like ImageNet-C, have been introduced to measure robustness properties, we argue that fixed test sets are only able to capture a small portion of possible data variations and are thus limited and prone to generate new overfitted solutions. To overcome these drawbacks, we suggest estimating the robustness of a model directly from the structure of its learned feature space. We introduce robustness indicators which are obtained via unsupervised clustering of latent representations inside a trained classifier and show very high correlations to the model performance on corrupted test data.

翻訳日:2021-06-24 20:26:38 公開日:2021-06-23

# (参考訳) もっと深く行くべきか? 受容場解析による学習を伴わない畳み込みニューラルネットワークアーキテクチャの最適化

Should You Go Deeper? Optimizing Convolutional Neural Network Architectures without Training by Receptive Field Analysis ( http://arxiv.org/abs/2106.12307v1 )

ライセンス: CC BY 4.0

Mats L. Richter, Julius Schöning, Ulf Krumnack

(参考訳) 特定のタスクにニューラルネットワーク(ann)を適用する場合、研究者、プログラマ、その他の専門家は通常、設計上の畳み込み層の数をオーバーショットする。これらのannにはパラメータが多すぎるため、結果に影響を与えずに不必要なトレーニングが必要となる。畳み込み層が処理できる特徴は、その受容場によって厳密に制限される。受容場の拡大を階層的に解析することにより、ANNアーキテクチャの推論に質的に寄与しない階層列を確実に予測することができる。これらの分析に基づいて,これらの非効率性を解決するための設計戦略を提案し, annの解法と計算性能を最適化する。戦略も分析も実際のモデルのトレーニングを必要としないため、これらの洞察は、将来自動化されるであろうansアーキテクチャの非常に効率的な設計プロセスを可能にする。

When applying artificial neural networks (ANNs) to specific tasks, researchers, programmers, and other specialists usually overshoot the number of convolutional layers in their designs. As a consequence, these ANNs hold too many parameters, which need to be trained unnecessarily without affecting the result. The features a convolutional layer can process are strictly limited by its receptive field. By analyzing the expansion of the receptive fields layer by layer, we can reliably predict sequences of layers that will not contribute qualitatively to the inference in the given ANN architecture. Based on these analyses, we propose design strategies to resolve these inefficiencies, optimizing the explainability and the computational performance of ANNs. Since neither the strategies nor the analysis requires training of the actual model, these insights allow for a very efficient design process for ANN architectures, which might be automated in the future.

翻訳日:2021-06-24 20:12:22 公開日:2021-06-23
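
The layer-wise receptive field analysis described in the abstract above can be illustrated with the standard receptive field recurrence for a sequential stack of convolution or pooling layers. The sketch below is a minimal illustration under stated assumptions: layers are given as hypothetical `(kernel_size, stride)` pairs with no dilation, and `unproductive_layers` uses a simple heuristic (a layer is flagged once the receptive field of its input already exceeds the input resolution), which may differ in detail from the criterion used in the paper.

```python
from typing import List, Tuple

Layer = Tuple[int, int]  # (kernel_size, stride); an assumed, simplified layer description


def receptive_fields(layers: List[Layer]) -> List[int]:
    """Return the receptive field size (in input pixels) after each layer."""
    rf, jump = 1, 1  # receptive field and cumulative stride at the network input
    sizes = []
    for kernel, stride in layers:
        rf += (kernel - 1) * jump  # each extra kernel tap adds `jump` input pixels
        jump *= stride             # striding spreads later taps further apart
        sizes.append(rf)
    return sizes


def unproductive_layers(layers: List[Layer], input_size: int) -> List[int]:
    """Indices of layers whose input already 'sees' more than the whole image.

    Heuristic assumption: such layers cannot integrate qualitatively new context.
    """
    sizes = receptive_fields(layers)
    flagged = []
    for index in range(len(layers)):
        rf_of_input = 1 if index == 0 else sizes[index - 1]
        if rf_of_input > input_size:
            flagged.append(index)
    return flagged


if __name__ == "__main__":
    stack = [(3, 1), (3, 2), (3, 1), (3, 2), (3, 1)]
    print(receptive_fields(stack))                    # [3, 5, 9, 13, 21]
    print(unproductive_layers(stack, input_size=8))   # [3, 4]
```

Because the recurrence depends only on kernel sizes and strides, it can be evaluated before any training, which matches the abstract's point that the analysis needs no trained model.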

# (参考訳) GraphConfRec: グラフニューラルネットワークに基づくカンファレンスレコメンダシステム

GraphConfRec: A Graph Neural Network-Based Conference Recommender System ( http://arxiv.org/abs/2106.12340v1 )

ライセンス: CC BY 4.0

Andreea Iana, Heiko Paulheim

(参考訳) 今日の学術出版モデル、特にコンピュータ科学において、会議は、それぞれの分野で最新のピアレビューされた進歩を公表するための主要なプラットフォームを構成する。しかし、研究の出版に適した学術的場を選ぶことは、特に学術的キャリアの開始時や通常の領域外の出版を希望する人にとって、利用可能な会議の多さを考える上で困難な課題となる。本稿では,SciGraphとグラフニューラルネットワークを組み合わせた会議推薦システムであるGraphConfRecを提案する。 graphconfrecは、リコール@10を0.580まで、マップを0.336まで、グラフアテンションネットワークベースのレコメンデーションモデルで達成する。 25名の被験者によるユーザスタディは、肯定的な結果を支持する。

In today's academic publishing model, especially in Computer Science, conferences commonly constitute the main platforms for releasing the latest peer-reviewed advancements in their respective fields. However, choosing a suitable academic venue for publishing one's research can represent a challenging task considering the plethora of available conferences, particularly for those at the start of their academic careers, or for those seeking to publish outside of their usual domain. In this paper, we propose GraphConfRec, a conference recommender system which combines SciGraph and graph neural networks, to infer suggestions based not only on title and abstract, but also on co-authorship and citation relationships. GraphConfRec achieves a recall@10 of up to 0.580 and a MAP of up to 0.336 with a graph attention network-based recommendation model. A user study with 25 subjects supports the positive results.

翻訳日:2021-06-24 19:59:55 公開日:2021-06-23
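
The abstract above reports recall@10 and MAP. As a hedged illustration of how such ranking metrics are commonly computed, the sketch below implements recall@k and mean average precision in plain Python. The exact evaluation protocol of GraphConfRec is not given in the abstract, so the function names, the venue names in the usage example, and the choice to normalize by the number of relevant items are assumptions.

```python
from typing import List, Sequence, Set


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of relevant items that appear among the top-k recommendations."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / len(relevant)


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Mean of precision values taken at each rank where a relevant item occurs."""
    if not relevant:
        return 0.0
    hits, score = 0, 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            score += hits / rank
    return score / len(relevant)


def mean_average_precision(rankings: List[Sequence[str]],
                           relevants: List[Set[str]]) -> float:
    """Average of per-query average precision over all queries."""
    if not rankings:
        return 0.0
    pairs = zip(rankings, relevants)
    return sum(average_precision(r, rel) for r, rel in pairs) / len(rankings)


if __name__ == "__main__":
    ranked = ["ICML", "KDD", "WWW", "ACL"]   # hypothetical recommendation list
    relevant = {"KDD", "ACL"}                # hypothetical ground-truth venues
    print(recall_at_k(ranked, relevant, k=2))     # 0.5
    print(average_precision(ranked, relevant))    # 0.5
```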

# (参考訳) PALRACE: 人間のデータとラベル付き合理化による包括的データセットを読む

PALRACE: Reading Comprehension Dataset with Human Data and Labeled Rationales ( http://arxiv.org/abs/2106.12373v1 )

ライセンス: CC BY 4.0

Jiajie Zou, Yuran Zhang, Peiqing Jin, Cheng Luo, Xunyi Pan, Nai Ding

(参考訳) 事前学習された言語モデルは、機械読解(MRC)タスクにおいて高い性能を達成するが、結果は説明が難しい。モデルを説明するための魅力的なアプローチは、その決定の根拠を提供することである。本稿では,人間理論の教師付き学習を容易にするために,レースデータセットから選択した800のパスに対して,人間のラベル付き合理性を持つ新しいmrcデータセットであるpalrace(pruned and labeled race)を提案する。さらに,質問を各項目に6種類に分類した。各章は少なくとも26人の参加者が読み、質問に答える根拠をラベル付けした。また,ラベル付き合理性のみに基づいた質問への回答を参加者に依頼し,ラベル付き合理性が高品質であり,質問応答を十分に支援できる合理性評価セッションを実施した。

Pre-trained language models achieve high performance on machine reading comprehension (MRC) tasks, but the results are hard to explain. An appealing approach to making models explainable is to provide rationales for their decisions. To facilitate supervised learning of human rationales, here we present PALRACE (Pruned And Labeled RACE), a new MRC dataset with human-labeled rationales for 800 passages selected from the RACE dataset. We further classified the question for each passage into 6 types. Each passage was read by at least 26 participants, who labeled their rationales for answering the question. In addition, we conducted a rationale evaluation session in which participants were asked to answer the question solely based on labeled rationales, confirming that the labeled rationales were of high quality and can sufficiently support question answering.

翻訳日:2021-06-24 19:37:34 公開日:2021-06-23

# (参考訳) 心臓MRI画像解析における公正性:深層学習におけるデータ不均衡によるバイアスの検討

Fairness in Cardiac MR Image Analysis: An Investigation of Bias Due to Data Imbalance in Deep Learning Based Segmentation ( http://arxiv.org/abs/2106.12387v1 )

ライセンス: CC BY 4.0

Esther Puyol-Anton, Bram Ruijsink, Stefan K. Piechnik, Stefan Neubauer, Steffen E. Petersen, Reza Razavi, and Andrew P. King

(参考訳) 人工知能(AI)における「フェアネス」の主題は、人種や性別などの人口動態特性に基づく潜在的なバイアスに対するAIアルゴリズムの評価と、このバイアスに対処するアルゴリズムの開発である。これまでほとんどのアプリケーションはコンピュータビジョンで使われてきたが、医療分野の仕事がいくつか現れ始めている。心臓mrセグメンテーションにおける深層学習(dl)の使用は近年、印象的な結果をもたらしており、その技術は臨床に翻訳され始めている。しかし、これらのモデルの公平性についてはまだ研究されていない。本研究では,6つの人種グループから5,903人の被験者からなる英国バイオバンクデータセットから,短軸心MRデータをトレーニングし,評価したnnU-Netモデルを用いて,人種/ジェンダーグループを対象としたこのような分析を行った。異なる人種間でのサイコロのパフォーマンスに統計的に有意な差が見られた。人種バイアスを低減するために,(1) 人種間のバランスを確保するためにバッチサンプリングが階層化される階層化バッチサンプリング,(2) 人種分類のための公平なメタラーニング,(2) DL分類器が人種分類を訓練し,セグメンテーションモデルと共同最適化されたグループモデル,(3) 人種毎に異なるセグメンテーションモデルを訓練する保護されたグループモデル,の3つの戦略を検討した。また、完全にバランスの取れたデータベースがあるシナリオと比較しました。公平性を評価するために,平均Dice値の標準偏差(SD)とスキュード誤差比(SER)を用いた。以上の結果から,不均衡なトレーニングデータを用いることにより人種バイアスが生じ,提案されているバイアス緩和戦略はすべて公平性が向上し,保護されたグループモデルを用いた最良のsdとserが得られた。

The subject of "fairness" in artificial intelligence (AI) refers to assessing AI algorithms for potential bias based on demographic characteristics such as race and gender, and the development of algorithms to address this bias. Most applications to date have been in computer vision, although some work in healthcare has started to emerge. The use of deep learning (DL) in cardiac MR segmentation has led to impressive results in recent years, and such techniques are starting to be translated into clinical practice. However, no work has yet investigated the fairness of such models. In this work, we perform such an analysis for racial/gender groups, focusing on the problem of training data imbalance, using a nnU-Net model trained and evaluated on cine short axis cardiac MR data from the UK Biobank dataset, consisting of 5,903 subjects from 6 different racial groups. We find statistically significant differences in Dice performance between different racial groups. To reduce the racial bias, we investigated three strategies: (1) stratified batch sampling, in which batch sampling is stratified to ensure balance between racial groups; (2) fair meta-learning for segmentation, in which a DL classifier is trained to classify race and jointly optimized with the segmentation model; and (3) protected group models, in which a different segmentation model is trained for each racial group. We also compared the results to the scenario where we have a perfectly balanced database. To assess fairness we used the standard deviation (SD) and skewed error ratio (SER) of the average Dice values. Our results demonstrate that the racial bias results from the use of imbalanced training data, and that all proposed bias mitigation strategies improved fairness, with the best SD and SER resulting from the use of protected group models.

翻訳日:2021-06-24 19:35:00 公開日:2021-06-23
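
The fairness measures named in the abstract above, the standard deviation (SD) and skewed error ratio (SER) of per-group average Dice values, can be sketched as follows. This is a minimal illustration under stated assumptions: masks are flat sequences of 0/1 values, the SD is taken as the population standard deviation, and SER is assumed to be the ratio of the largest to the smallest per-group error `1 - Dice`; the paper may define these details differently. All function and group names are hypothetical.

```python
import statistics
from typing import Dict, Sequence, Tuple


def dice(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Dice overlap between two binary masks given as flat 0/1 sequences."""
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    size = sum(pred) + sum(truth)
    return 1.0 if size == 0 else 2.0 * intersection / size


def fairness_summary(group_dice: Dict[str, float]) -> Tuple[float, float]:
    """Return (SD, SER) computed from the average Dice score of each group."""
    scores = list(group_dice.values())
    sd = statistics.pstdev(scores)          # assumption: population SD across groups
    errors = [1.0 - s for s in scores]      # assumption: error defined as 1 - Dice
    smallest = min(errors)
    ser = float("inf") if smallest == 0 else max(errors) / smallest
    return sd, ser


if __name__ == "__main__":
    print(dice([1, 1, 0, 0], [1, 0, 0, 0]))  # 2 * 1 / (2 + 1), about 0.667
    sd, ser = fairness_summary({"group_a": 0.9, "group_b": 0.8})
    print(round(sd, 3), round(ser, 3))       # 0.05 2.0
```

Under these definitions, a perfectly fair model would give an SD of 0 and an SER of 1, so lower values of both indicate smaller gaps between groups.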

# (参考訳) 形態的にリッチな言語に対する語彙制約付き機械翻訳

End-to-End Lexically Constrained Machine Translation for Morphologically Rich Languages ( http://arxiv.org/abs/2106.12398v1 )

ライセンス: CC BY 4.0

Josef Jon, João Paulo Aires, Dušan Variš, Ondřej Bojar

(参考訳) 語彙的に制約された機械翻訳では、特定の単語やフレーズの存在や欠如を強制して出力文を操作できる。現在のアプローチでは、翻訳に現れる用語を強制することはできるが、制約語形式を生成された出力の他の部分と一致させるのに苦労することが多い。手動分析の結果、英語からチェコ語への翻訳における基準制約モデルの出力エラーの46%が合意に関連していることがわかった。本研究は, 機械翻訳による単語の正しいインフレクションを許容する機構について検討する。特に,入力シーケンスの一部として制約を付与したモデルトレーニングに基づく手法に着目した。本手法は, 自動評価と手動評価の両方における制約項の翻訳を, 一致の誤りを減らすことにより改善することを示す。提案手法は,新しい誤りや翻訳の全体的な品質を低下させることなく,屈折誤差を除去する。

Lexically constrained machine translation allows the user to manipulate the output sentence by enforcing the presence or absence of certain words and phrases. Although current approaches can enforce terms to appear in the translation, they often struggle to make the constraint word form agree with the rest of the generated output. Our manual analysis shows that 46% of the errors in the output of a baseline constrained model for English to Czech translation are related to agreement. We investigate mechanisms to allow neural machine translation to infer the correct word inflection given lemmatized constraints. In particular, we focus on methods based on training the model with constraints provided as part of the input sequence. Our experiments on the English-Czech language pair show that this approach improves the translation of constrained terms in both automatic and manual evaluation by reducing errors in agreement. Our approach thus eliminates inflection errors, without introducing new errors or decreasing the overall quality of the translation.

翻訳日:2021-06-24 19:25:11 公開日:2021-06-23
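
The abstract above focuses on training the model with constraints provided as part of the input sequence. A minimal sketch of that input construction is shown below. The separator tokens `<sep>` and `<c>`, the function name, and the exact layout are assumptions for illustration only; the abstract does not specify the format the authors use.

```python
from typing import List


def add_constraints(source_tokens: List[str],
                    constraints: List[str],
                    sep: str = "<sep>",
                    between: str = "<c>") -> List[str]:
    """Append (lemmatized) target-side constraints to the source token sequence.

    Layout (assumed): source tokens, then `sep`, then the constraints joined by
    `between`. With no constraints, the source is returned unchanged.
    """
    if not constraints:
        return list(source_tokens)
    augmented = list(source_tokens) + [sep]
    for index, constraint in enumerate(constraints):
        if index > 0:
            augmented.append(between)
        augmented.extend(constraint.split())
    return augmented


if __name__ == "__main__":
    source = "The cat sat on the mat".split()
    # Lemmatized Czech constraints; the model is expected to inflect them itself.
    print(" ".join(add_constraints(source, ["kočka", "rohožka"])))
    # The cat sat on the mat <sep> kočka <c> rohožka
```

Feeding lemmas rather than surface forms leaves the choice of inflection to the decoder, which is the agreement problem the abstract describes.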

# (参考訳) 機械予測における偽の完全性:機械学習における循環問題の検出と評価

False perfection in machine prediction: Detecting and assessing circularity problems in machine learning ( http://arxiv.org/abs/2106.12417v1 )

ライセンス: CC BY 4.0

Michael Hagmann, Stefan Riezler

(参考訳) 機械学習アルゴリズムは、見えないテスト入力の正しい出力を予測することを目的として、入力データとターゲット出力のパターンからモデルをトレーニングする。本稿では, 医療情報学や特許法などの応用分野において, 入力データの表現において, 目標出力が決定論的に定義された測定値を含むことによる機械学習の問題を示す。これは、既知の目標定義の機械的再構成に基づく完全だが円形の予測につながるが、定義された測定値が不完全あるいは不完全であるような実世界のデータでは失敗する。本稿では,任意のデータセットとブラックボックス機械学習モデルに対して,対象の機能定義を再構築可能か,トレーニングに使用しているかを示す循環性テストを行う。我々は,機械学習におけるデータ表現から対象とする結果を定義することで,研究結果を実世界のアプリケーションに転送するには円周性を回避する必要があると論じる。

Machine learning algorithms train models from patterns of input data and target outputs, with the goal of predicting correct outputs for unseen test inputs. Here we demonstrate a problem of machine learning in vital application areas such as medical informatics or patent law: the representations of input data include the very measurements on which the target outputs are deterministically defined. This leads to perfect but circular predictions based on a machine reconstruction of the known target definition, which fail on real-world data where the defining measurements may be unavailable or only incompletely available. We present a circularity test that shows, for given datasets and black-box machine learning models, whether the target functional definition can be reconstructed and has been used in training. We argue that a transfer of research results to real-world applications requires avoiding circularity by separating measurements that define target outcomes from data representations in machine learning.

翻訳日:2021-06-24 19:01:45 公開日:2021-06-23

# (参考訳) 単純さの発見:シフト不変変分オートエンコーダによる特徴・パターン・順序パラメータの教師なし発見

Finding simplicity: unsupervised discovery of features, patterns, and order parameters via shift-invariant variational autoencoders ( http://arxiv.org/abs/2106.12472v1 )

ライセンス: CC BY 4.0

Maxim Ziatdinov, Chun Yin Wong, and Sergei V. Kalinin

(参考訳) 走査トンネル法と透過電子顕微鏡(STM, STEM)の最近の進歩により, 材料の構造や機能に関する情報を含む大量のイメージングデータが日常的に生成されるようになった。実験データセットは、STEMにおける物理秩序パラメータ場、偏光およびひずみ勾配、STMにおける定常電子波およびキャリア媒介交換相互作用などの長距離現象のシグネチャを含む。それに応じて、人間の目は格子周期、繰り返し構造要素、微細構造などの画像の特定のパターンを容易に識別することができるが、それらの自動抽出と分類は非常に非自明で、そのような分析を達成するための普遍的な経路が欠如している。 STMおよび(S)TEM画像で観察されるパターンの最も特徴的な要素は、(ほぼ)周期性であり、基本原子構造のパーシモニーから直接発生する挙動であり、秩序パラメータ分布を反映する段階的変化に重畳されている。しかしながら、大域的フーリエ法によるこれらの要素の発見は、可変性と理想的離散的翻訳対称性の欠如により非自明である。この問題に対処するため,画像空間をランダムにサンプリングするためには,画像の特徴的反復的特徴を解消するシフト不変変分オートエンコーダ(shift-VAE)を開発した。シフト-VAEは、対象物の位置の不確実性と形状再構成の不確実性とのバランスをとる。このアプローチはモデル1Dデータに対して説明され、さらに合成および実験的なSTMおよびSTEM2Dデータに拡張される。

Recent advances in scanning tunneling and transmission electron microscopies (STM and STEM) have allowed routine generation of large volumes of imaging data containing information on the structure and functionality of materials. The experimental data sets contain signatures of long-range phenomena such as physical order parameter fields, polarization and strain gradients in STEM, or standing electronic waves and carrier-mediated exchange interactions in STM, all superimposed onto scanning system distortions and gradual changes of contrast due to drift and/or mis-tilt effects. Correspondingly, while the human eye can readily identify certain patterns in the images such as lattice periodicities, repeating structural elements, or microstructures, their automatic extraction and classification are highly non-trivial and universal pathways to accomplish such analyses are absent. We pose that the most distinctive elements of the patterns observed in STM and (S)TEM images are similarity and (almost-) periodicity, behaviors stemming directly from the parsimony of elementary atomic structures, superimposed on the gradual changes reflective of order parameter distributions. However, the discovery of these elements via global Fourier methods is non-trivial due to variability and lack of ideal discrete translation symmetry. To address this problem, we develop shift-invariant variational autoencoders (shift-VAE) that allow disentangling characteristic repeating features in the images, their variations, and shifts inevitable for random sampling of image space. Shift-VAEs balance the uncertainty in the position of the object of interest with the uncertainty in shape reconstruction. This approach is illustrated for model 1D data, and further extended to synthetic and experimental STM and STEM 2D data.

翻訳日:2021-06-24 19:00:37 公開日:2021-06-23

# (参考訳) 転校学習に対する教師モデルフィンガープリント攻撃

Teacher Model Fingerprinting Attacks Against Transfer Learning ( http://arxiv.org/abs/2106.12478v1 )

ライセンス: CC BY 4.0

Yufei Chen, Chao Shen, Cong Wang, Yang Zhang

(参考訳) トランスファーラーニングは、トレーニングデータの不足に対処するための一般的なソリューションになっています。訓練された教師モデルの初期の層を再利用または微調整することで、特定の学生モデルを訓練する。しかし、ユーティリティの改善に加えて、移行された公開知識は機密性をモデル化する潜在的な脅威をもたらし、さらに他のセキュリティやプライバシーの問題も引き起こす。本稿では,情報伝達学習の文脈における教師モデル暴露の脅威について,初めて総合的な調査を行い,公開知識とモデル機密性との緊張関係について深い知見を得ることを目的としている。そこで本研究では,学生モデルの起源を推定するために,教師モデルフィンガープリント攻撃を提案する。具体的には,学生モデルを探索して攻撃を実現するために,クエリを慎重に生成する新しい最適化手法を提案する。既存のモデルリバースエンジニアリングのアプローチとは異なり、提案手法では、後部などのきめ細かいモデル出力や、モデルアーキテクチャやトレーニングデータセットの補助情報に依存しない。提案攻撃の有効性を系統的に評価した。実験結果から,本攻撃はプロービングクエリの少ないモデル起源を正確に識別できることが判明した。さらに,提案攻撃は,モデル盗難などの機械学習モデルに対する攻撃を容易化するためのステップストーンとして機能することを示す。

Transfer learning has become a common solution to address training data scarcity in practice. It trains a specified student model by reusing or fine-tuning early layers of a well-trained teacher model that is usually publicly available. However, besides utility improvement, the transferred public knowledge also brings potential threats to model confidentiality, and even further raises other security and privacy issues. In this paper, we present the first comprehensive investigation of the teacher model exposure threat in the transfer learning context, aiming to gain a deeper insight into the tension between public knowledge and model confidentiality. To this end, we propose a teacher model fingerprinting attack to infer the origin of a student model, i.e., the teacher model it transfers from. Specifically, we propose a novel optimization-based method to carefully generate queries to probe the student model to realize our attack. Unlike existing model reverse engineering approaches, our proposed fingerprinting method neither relies on fine-grained model outputs, e.g., posteriors, nor auxiliary information of the model architecture or training dataset. We systematically evaluate the effectiveness of our proposed attack. The empirical results demonstrate that our attack can accurately identify the model origin with few probing queries. Moreover, we show that the proposed attack can serve as a stepping stone to facilitating other attacks against machine learning models, such as model stealing.

翻訳日:2021-06-24 18:43:40 公開日:2021-06-23

# (参考訳) アラビア語におけるサーカズム検出と感情分析のための深部マルチタスクモデル

Deep Multi-Task Model for Sarcasm Detection and Sentiment Analysis in Arabic Language ( http://arxiv.org/abs/2106.12488v1 )

ライセンス: CC BY 4.0

Abdelkader El Mahdaouy, Abdellah El Mekki, Kabil Essefar, Nabil El Mamoun, Ismail Berrada, Ahmed Khoumsi

(参考訳) 皮肉や皮肉といった比喩的言語装置の普及は、アラビア語の知覚分析(SA)に深刻な課題をもたらす。従来の研究ではSAとsarcasm検出が別々に行われているが,本研究では,両タスク間の知識相互作用を可能にする,エンドツーエンドの深層多タスク学習(MTL)モデルを提案する。我々のMTLモデルは、変換器(BERT)モデルからの双方向エンコーダ表現、マルチタスクアテンション相互作用モジュール、および2つのタスク分類器で構成されている。以上の結果から, 提案手法は, SAおよびsarcasm検出サブタスクにおいて, 単タスクモデルよりも優れていることがわかった。

The prominence of figurative language devices, such as sarcasm and irony, poses serious challenges for Arabic Sentiment Analysis (SA). While previous research works tackle SA and sarcasm detection separately, this paper introduces an end-to-end deep Multi-Task Learning (MTL) model, allowing knowledge interaction between the two tasks. Our MTL model's architecture consists of a Bidirectional Encoder Representation from Transformers (BERT) model, a multi-task attention interaction module, and two task classifiers. The overall obtained results show that our proposed model outperforms its single-task counterparts on both SA and sarcasm detection sub-tasks.

翻訳日:2021-06-24 18:14:33 公開日:2021-06-23

# (参考訳) ハイパースペクトル画像復調のためのマルチモーダルおよび周波数重み付きテンソル核ノルム

Multi-modal and frequency-weighted tensor nuclear norm for hyperspectral image denoising ( http://arxiv.org/abs/2106.12489v1 )

ライセンス: CC BY 4.0

Sheng Liu, Xiaozhen Xie, Wenfeng Kong, and Jifeng Ning

(参考訳) 低ランク性はハイパースペクトル画像(hsi)のタスクにおいて重要である。テンソル核ノルム (TNN) は、テンソル特異値分解に基づいて定義され、HSIの低ランク性を記述するための最先端の手法である。しかしながら、TNNは、非正規化タスクに対処する際のHSIの物理的意味を無視し、亜最適デノイズ化パフォーマンスをもたらす。本稿では,マルチモーダル・周波数重み付きテンソル核ノルム (MFWTNN) と非凸MFWTNN を提案する。まず、周波数成分の物理的意味を調査し、その重みを再考し、TNNの低ランク表現能力を改善する。また,2つの空間次元とHSIのスペクトル次元の相関を考察し,上記のTNNの改良と組み合わせてMFWTNNを提案する。次に、非凸関数を用いて周波数テンソルのランク関数を近似し、MFWTNNをより緩和するNonMFWTNNを提案する。また,ノイズ情報を含むスライスに対して,プロファイル情報を含むスライスに対して,より小さなウエイトを適応的に選択する。最後に,提案モデルを解くために,乗算器(admm)に基づくアルゴリズムの効率的な交互方向法を開発し,本手法の有効性をシミュレーションおよび実hsiデータセットで検証した。

Low-rankness is important in hyperspectral image (HSI) denoising tasks. The tensor nuclear norm (TNN), defined based on the tensor singular value decomposition, is a state-of-the-art method to describe the low-rankness of HSI. However, TNN ignores some of the physical meanings of HSI in tackling the denoising tasks, leading to suboptimal denoising performance. In this paper, we propose the multi-modal and frequency-weighted tensor nuclear norm (MFWTNN) and the non-convex MFWTNN for HSI denoising tasks. Firstly, we investigate the physical meaning of frequency components and reconsider their weights to improve the low-rank representation ability of TNN. Meanwhile, we also consider the correlation among the two spatial dimensions and the spectral dimension of HSI and combine the above improvements to TNN to propose MFWTNN. Secondly, we use non-convex functions to approximate the rank function of the frequency tensor and propose the NonMFWTNN to relax the MFWTNN better. Besides, we adaptively choose bigger weights for slices mainly containing noise information and smaller weights for slices containing profile information. Finally, we develop an efficient algorithm based on the alternating direction method of multipliers (ADMM) to solve the proposed models, and the effectiveness of our models is substantiated on simulated and real HSI datasets.

翻訳日:2021-06-24 18:07:39 公開日:2021-06-23

# (参考訳) 国・州レベルの現代アラビア語標準および方言アラビア語識別のためのBERTに基づくマルチタスクモデル

BERT-based Multi-Task Model for Country and Province Level Modern Standard Arabic and Dialectal Arabic Identification ( http://arxiv.org/abs/2106.12495v1 )

ライセンス: CC BY 4.0

Abdellah El Mekki, Abdelkader El Mahdaouy, Kabil Essefar, Nabil El Mamoun, Ismail Berrada, Ahmed Khoumsi

(参考訳) 方言と標準言語識別は多くのアラビア語自然言語処理アプリケーションにとって重要なタスクである。本稿では,現代標準アラビア語 (msa) と方言アラビア語 (da) の国レベルと州レベルを識別するための第2のnadi共通課題である深層学習に基づくシステムを提案する。このシステムは、国レベルと州レベルのmsa/da識別に取り組むために、エンドツーエンドのディープマルチタスク学習(mtl)モデルに基づいている。後者のMTLモデルは、共通の双方向エンコーダ表現変換器(BERT)エンコーダ、2つのタスク固有の注意層、2つの分類器で構成される。私たちのキーとなる考え方は、タスク識別とタスク間共有機能の両方を活用することです。その結果,MTLモデルは,ほとんどのサブタスクにおいて単一タスクモデルよりも優れていた。

Dialect and standard language identification are crucial tasks for many Arabic natural language processing applications. In this paper, we present our deep learning-based system, submitted to the second NADI shared task for country-level and province-level identification of Modern Standard Arabic (MSA) and Dialectal Arabic (DA). The system is based on an end-to-end deep Multi-Task Learning (MTL) model to tackle both country-level and province-level MSA/DA identification. The latter MTL model consists of a shared Bidirectional Encoder Representation Transformers (BERT) encoder, two task-specific attention layers, and two classifiers. Our key idea is to leverage both the task-discriminative and the inter-task shared features for country and province MSA/DA identification. The obtained results show that our MTL model outperforms single-task models on most subtasks.

翻訳日:2021-06-24 17:44:50 公開日:2021-06-23

# (参考訳) 医用画像セグメンテーションのためのオフザシェルフソースセグメンタの適用

Adapting Off-the-Shelf Source Segmenter for Target Medical Image Segmentation ( http://arxiv.org/abs/2106.12497v1 )

ライセンス: CC BY 4.0

Xiaofeng Liu, Fangxu Xing, Chao Yang, Georges El Fakhri, Jonghye Woo

(参考訳) unsupervised domain adaptation(uda)は、ラベル付きソースドメインから学んだ知識を、ラベル付きで見当たらないターゲットドメインに転送することを目的としている。しかし、適応段階におけるソースドメインデータへのアクセスは、データストレージやプライバシの問題のため、しばしば制限される。これを軽減するため,本研究では,ソースドメイン内で事前学習した ``off-the-shelf' セグメントモデルを,適応型バッチワイド正規化統計適応フレームワークを用いて,対象ドメインに適応させることを提案する。具体的には、ドメイン固有の低次バッチ統計、すなわち平均と分散は指数運動量減衰スキームに徐々に適応し、ドメイン共有可能な高次バッチ統計、すなわちスケーリングとシフトパラメータの整合性は、最適化目標によって明示的に強制される。各チャネルの転送性は、まず各チャネルの寄与のバランスをとるために適応的に測定される。さらに、提案したオープンソースフリーなUDAフレームワークは、例えば自己エントロピーの最小化など、教師なしの学習手法に直交しているため、フレームワークの上に簡単に追加できる。 BraTS 2018データベース上での大規模な実験により、我々のソースフリーなUDAフレームワークは、クロスサブタイプUDAセグメンテーションタスクの既存のソースラックスUDAメソッドよりも優れており、ソースデータとの教師付きUDAメソッドと比較して、クロスモダリティUDAセグメンテーションタスクの同等の結果が得られました。

Unsupervised domain adaptation (UDA) aims to transfer knowledge learned from a labeled source domain to an unlabeled and unseen target domain, and UDA models are usually trained on data from both domains. Access to the source domain data at the adaptation stage, however, is often limited, due to data storage or privacy issues. To alleviate this, in this work, we target source-free UDA for segmentation, and propose to adapt an "off-the-shelf" segmentation model pre-trained in the source domain to the target domain, with an adaptive batch-wise normalization statistics adaptation framework. Specifically, the domain-specific low-order batch statistics, i.e., mean and variance, are gradually adapted with an exponential momentum decay scheme, while the consistency of domain-shareable high-order batch statistics, i.e., scaling and shifting parameters, is explicitly enforced by our optimization objective. The transferability of each channel is first measured adaptively and then used to balance the contribution of each channel. Moreover, the proposed source-free UDA framework is orthogonal to unsupervised learning methods, e.g., self-entropy minimization, which can thus be simply added on top of our framework. Extensive experiments on the BraTS 2018 database show that our source-free UDA framework outperformed existing source-relaxed UDA methods for the cross-subtype UDA segmentation task and yielded comparable results for the cross-modality UDA segmentation task, compared with supervised UDA methods that use the source data.

翻訳日:2021-06-24 17:38:44 公開日:2021-06-23

# (参考訳) 深層畳み込みニューラルネットワークの普遍的一貫性

Universal Consistency of Deep Convolutional Neural Networks ( http://arxiv.org/abs/2106.12498v1 )

ライセンス: CC BY 4.0

Shao-Bo Lin, Kaidong Wang, Yao Wang, Ding-Xuan Zhou

(参考訳) 深層畳み込みニューラルネットワーク(dcnn)の実際的な研究活動と比較すると、dcnnの理論的な挙動の研究は遅れている。特にDCNNの普遍的な一貫性は未解決のままである。本稿では,拡張畳み込みを伴うDCNNにおける経験的リスク最小化の実装が,(ゼロパディングを伴う)強固に一貫したものであることを示す。完全連結層がなければ、拡張畳み込みを伴うDCNNは、収縮(ゼロパディング)畳み込み層と複数の完全連結層を含むハイブリッド構造を持つ広く使われているディープニューラルネットワークよりも悪くはならないことを示す一連の実験を行う。

Compared with the avid research activity on deep convolutional neural networks (DCNNs) in practice, the study of the theoretical behavior of DCNNs lags heavily behind. In particular, the universal consistency of DCNNs remains open. In this paper, we prove that implementing empirical risk minimization on DCNNs with expansive convolution (with zero-padding) is strongly universally consistent. Motivated by the universal consistency, we conduct a series of experiments to show that without any fully connected layers, DCNNs with expansive convolution perform no worse than the widely used deep neural networks with hybrid structure containing contracting (without zero-padding) convolution layers and several fully connected layers.

翻訳日:2021-06-24 17:27:48 公開日:2021-06-23
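
The distinction in the abstract above between expansive convolution (with zero-padding) and contracting convolution (without zero-padding) can be made concrete with a one-dimensional example. The sketch below is an illustration only, written in plain Python with a hypothetical `conv1d` helper: with zero-padding the output is longer than the input (length `d + s - 1` for input length `d` and filter length `s`), and without it the output is shorter (length `d - s + 1`).

```python
from typing import List, Sequence


def conv1d(signal: Sequence[float], kernel: Sequence[float],
           zero_padding: bool = True) -> List[float]:
    """1D convolution: expansive if zero_padding is True, contracting otherwise."""
    s = len(kernel)
    if zero_padding:
        pad = [0.0] * (s - 1)
        padded = pad + list(signal) + pad   # pad both ends so every overlap is kept
    else:
        padded = list(signal)               # only positions with full overlap
    flipped = list(kernel)[::-1]            # convolution flips the kernel
    return [
        sum(flipped[j] * padded[i + j] for j in range(s))
        for i in range(len(padded) - s + 1)
    ]


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0]
    w = [1.0, 1.0]
    print(conv1d(x, w, zero_padding=True))   # [1.0, 3.0, 5.0, 3.0]  (length grows)
    print(conv1d(x, w, zero_padding=False))  # [3.0, 5.0]            (length shrinks)
```

Stacking expansive layers therefore widens the representation with depth, while stacking contracting layers narrows it, which is one way to read why the two families are compared in the abstract.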

# (参考訳) クロスドメイン非教師付きタグ・ツー・シネMRI合成のための生成的自己学習

Generative Self-training for Cross-domain Unsupervised Tagged-to-Cine MRI Synthesis ( http://arxiv.org/abs/2106.12499v1 )

ライセンス: CC BY 4.0

Xiaofeng Liu, Fangxu Xing, Maureen Stone, Jiachen Zhuo, Reese Timothy, Jerry L. Prince, Georges El Fakhri, Jonghye Woo

(参考訳) 自己学習に基づく教師なしドメイン適応(UDA)は、未ラベルのターゲットドメインに訓練されたディープラーニングモデルをソースドメインに適用する場合、ドメインシフトの問題に対処する大きな可能性を示している。しかし、自己学習udaは、ソフトマックス離散ヒストグラムに基づく信頼性の高い疑似ラベル選択により、分類やセグメンテーションなどの判別タスクにおいて有効性を示すが、画像合成などの生成課題に対する自己学習udaは十分に研究されていない。本稿では,連続値予測とクロスドメイン画像合成のための回帰目標を備えた新しい生成的自己学習(gst) udaフレームワークを提案する。具体的には,疑似ラベルを不確実性マスクでフィルタリングし,生成画像の予測信頼度を実用的変動ベイズ学習で定量化する。高速テストタイム適応はラウンドベースの代替最適化スキームによって達成される。我々は、ソースドメインとターゲットドメインのデータセットを異なるスキャナーやセンターから取得する、タグ付き磁気共鳴画像(MRI)合成問題に関する枠組みを検証した。一般的なUDA手法に対して,我々の枠組みを検証するため,広範囲な検証を行った。以上の結果より,新しい対象領域の被験者のMRIをタグ付けしたGSTでは,UDA法と比較すると,合成品質が有意に向上した。

Self-training based unsupervised domain adaptation (UDA) has shown great potential to address the problem of domain shift, when applying a trained deep learning model in a source domain to unlabeled target domains. However, while the self-training UDA has demonstrated its effectiveness on discriminative tasks, such as classification and segmentation, via the reliable pseudo-label selection based on the softmax discrete histogram, the self-training UDA for generative tasks, such as image synthesis, is not fully investigated. In this work, we propose a novel generative self-training (GST) UDA framework with continuous value prediction and regression objective for cross-domain image synthesis. Specifically, we propose to filter the pseudo-label with an uncertainty mask, and quantify the predictive confidence of generated images with practical variational Bayes learning. The fast test-time adaptation is achieved by a round-based alternative optimization scheme. We validated our framework on the tagged-to-cine magnetic resonance imaging (MRI) synthesis problem, where datasets in the source and target domains were acquired from different scanners or centers. Extensive validations were carried out to verify our framework against popular adversarial training UDA methods. Results show that our GST, with tagged MRI of test subjects in new target domains, improved the synthesis quality by a large margin, compared with the adversarial training UDA methods.

翻訳日:2021-06-24 17:07:31 公開日:2021-06-23

# (参考訳) ベイズ深層学習ハイパーパラメータ探索による雑音付き多項式へのロバスト関数マッピング

Bayesian Deep Learning Hyperparameter Search for Robust Function Mapping to Polynomials with Noise ( http://arxiv.org/abs/2106.12532v1 )

ライセンス: CC BY 4.0

Nidhin Harilal, Udit Bhatia, Auroop R. Ganguly

(参考訳) ニューラルアーキテクチャ探索の進歩と、コネクショナリストアーキテクチャの説明可能性と解釈性は、最近の文献で報告されている。しかし,BDL(Bayesian Deep Learning)ハイパーパラメータの設計方法,特に不確実な定量化を伴うロバストな関数マッピングのための深さ,幅,アンサンブルサイズについて,我々はまだ理解していない。本稿では,ベイズ接続性表現を,ノイズタイプや比率の異なる異なる次数の多項式にマッピングすることで,理解を深めようとする。雑音特性に基づく不確かさを定量化しつつ, 基礎となる多項式信号を抽出するハイパーパラメータの組み合わせを探索するために, 雑音汚染多項式を調べる。具体的には、異なる分布とSNR比と様々な雑音特性を有するノイズで汚染されたn次多項式の信号を検出するために、適切なニューラルネットワークアーキテクチャとアンサンブル構成が見つかるかどうかを考察する。以上の結果から,ネットワーク深度が最適であること,および予測スキルと不確かさの定量化に最適なアンサンブル数があることが示唆された。しかし、高い幅値での幅増加に伴い性能向上率が低下しても、幅に対する最適性は識別できない。我々の実験と洞察は、BDL表現の理論的性質を理解し、実用的なソリューションを設計するための方向性となる。

Advances in neural architecture search, as well as explainability and interpretability of connectionist architectures, have been reported in the recent literature. However, our understanding of how to design Bayesian Deep Learning (BDL) hyperparameters, specifically, the depth, width and ensemble size, for robust function mapping with uncertainty quantification, is still emerging. This paper attempts to further our understanding by mapping Bayesian connectionist representations to polynomials of different orders with varying noise types and ratios. We examine the noise-contaminated polynomials to search for the combination of hyperparameters that can extract the underlying polynomial signals while quantifying uncertainties based on the noise attributes. Specifically, we study whether an appropriate neural architecture and ensemble configuration can be found to detect a signal of any n-th order polynomial contaminated with noise of different distributions, signal-to-noise ratios (SNR), and other noise attributes. Our results suggest the possible existence of an optimal network depth as well as an optimal number of ensembles for prediction skills and uncertainty quantification, respectively. However, optimality is not discernible for width, even though the performance gain diminishes as width increases at large widths. Our experiments and insights can give direction to understanding the theoretical properties of BDL representations and to designing practical solutions.
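As a minimal sketch of the kind of data the abstract describes, the snippet below builds a noise-contaminated polynomial at a chosen signal-to-noise ratio. The function name `noisy_polynomial`, the Gaussian noise model, and the specific coefficients are assumptions made for illustration; the paper's own noise types and ratios are not specified here.

```python
import math
import random

def noisy_polynomial(coeffs, xs, snr_db, rng):
    """Evaluate a polynomial and add Gaussian noise at a target SNR (in dB)."""
    # Horner evaluation; coeffs are ordered from highest degree to constant.
    clean = []
    for x in xs:
        y = 0.0
        for c in coeffs:
            y = y * x + c
        clean.append(y)
    # Signal power is the mean squared value of the clean signal.
    signal_power = sum(y * y for y in clean) / len(clean)
    # SNR_dB = 10 * log10(signal_power / noise_power), solved for noise_power.
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    noisy = [y + rng.gauss(0.0, sigma) for y in clean]
    return clean, noisy

rng = random.Random(0)
xs = [i / 100 for i in range(-100, 101)]
# Hypothetical cubic: x^3 - 2x^2 + 0.5x + 3, contaminated at 10 dB SNR.
clean, noisy = noisy_polynomial([1.0, -2.0, 0.5, 3.0], xs, snr_db=10.0, rng=rng)
```

Sweeping `snr_db`, the polynomial order, and the noise distribution would give the grid of regression targets over which depth, width, and ensemble size can be compared.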

翻訳日:2021-06-24 16:57:07 公開日:2021-06-23

# (参考訳) PAC-Bayes一般化境界の最小化による確率的多数票の学習

Learning Stochastic Majority Votes by Minimizing a PAC-Bayes Generalization Bound ( http://arxiv.org/abs/2106.12535v1 )

ライセンス: CC BY 4.0

Valentina Zantedeschi, Paul Viallard, Emilie Morvant, Rémi Emonet, Amaury Habrard, Pascal Germain, Benjamin Guedj

(参考訳) 分類器の有限アンサンブルに対する多数票の確率的対向について検討し,その一般化特性について検討する。このアプローチは任意の分布に対して成り立つが、Dirichlet分布をインスタンス化する: これは、期待されるリスクに対して閉じた形式と微分可能な表現を可能にし、一般化境界を扱いやすい訓練目的に変える。その結果得られた確率的多数決学習アルゴリズムは、PAC-Bayes目標を最小化する競合するアルゴリズムと比較した一連の数値実験において、最先端の精度と(空でない)密接な一般化限界の利点を達成する。

We investigate a stochastic counterpart of majority votes over finite ensembles of classifiers, and study its generalization properties. While our approach holds for arbitrary distributions, we instantiate it with Dirichlet distributions: this allows for a closed-form and differentiable expression for the expected risk, which then turns the generalization bound into a tractable training objective. The resulting stochastic majority vote learning algorithm achieves state-of-the-art accuracy and benefits from (non-vacuous) tight generalization bounds, in a series of numerical experiments when compared to competing algorithms which also minimize PAC-Bayes objectives -- both with uninformed (data-independent) and informed (data-dependent) priors.
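To make the idea of a stochastic majority vote concrete, here is a minimal sketch that estimates its expected 0-1 risk by Monte Carlo sampling of Dirichlet-distributed voter weights. Note that this sampling estimate is a stand-in chosen for illustration: the abstract states that a closed-form, differentiable expression exists, and that expression is not reproduced here. The function names and the toy votes are assumptions.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw one weight vector from Dirichlet(alpha) via normalised Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def stochastic_mv_risk(votes, labels, alpha, n_samples, rng):
    """Monte Carlo estimate of the expected 0-1 risk of a randomly weighted vote.

    votes[i][j] is the prediction (+1 or -1) of voter j on example i.
    """
    errors = 0
    for _ in range(n_samples):
        theta = sample_dirichlet(alpha, rng)
        for row, y in zip(votes, labels):
            # The weighted vote errs when its margin is not strictly positive.
            margin = y * sum(t * h for t, h in zip(theta, row))
            if margin <= 0:
                errors += 1
    return errors / (n_samples * len(labels))

rng = random.Random(0)
# Toy ensemble: voter 0 is always right, voter 2 is wrong on the first two examples.
votes = [[1, 1, -1], [-1, -1, 1], [1, -1, 1]]
labels = [1, -1, 1]
risk_good = stochastic_mv_risk(votes, labels, [50.0, 1.0, 1.0], 500, rng)
risk_bad = stochastic_mv_risk(votes, labels, [1.0, 1.0, 50.0], 500, rng)
```

Concentrating the Dirichlet mass on the reliable voter drives the estimated risk towards zero, while concentrating it on the unreliable voter drives it towards that voter's own error rate.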

翻訳日:2021-06-24 16:45:21 公開日:2021-06-23

# (参考訳) 特徴帰属と反事実的説明は操作できる

Feature Attributions and Counterfactual Explanations Can Be Manipulated ( http://arxiv.org/abs/2106.12563v1 )

ライセンス: CC BY 4.0

Dylan Slack, Sophie Hilgard, Sameer Singh, Hima Lakkaraju

(参考訳) 機械学習モデルは、重要な意思決定設定(医療や金融など)でますます使われているため、モデル予測を説明する方法の開発に重点が置かれている。このような説明はモデルの理解と信頼の確立に使用され、機械学習パイプラインの重要なコンポーネントである。これらのシステムでは、説明は重要な部分であるが、敵による操作に対する脆弱性についてはほとんど理解されていない。本稿では,2つの幅広い説明のクラスが操作に対してどのように脆弱であるかを論じる。敵がバイアス付きモデルを設計し、モデルに依存しない特徴帰属法(例: LIME & SHAP)や、反事実探索中にヒルクライムを行う反事実的説明(例: Wachterのアルゴリズム & DiCE)を操作して、モデルのバイアスを隠蔽させる方法を実証する。これらの脆弱性は、敵がバイアス付きモデルをデプロイすることを可能にするが、説明はこのバイアスを明らかにしないため、ステークホルダーは欺かれてモデルを信頼してしまう。我々は,COMPAS や Communities & Crime などの実世界のデータセット上でこれらの操作を評価し,説明が実際に操作可能であることを見出す。

As machine learning models are increasingly used in critical decision-making settings (e.g., healthcare, finance), there has been a growing emphasis on developing methods to explain model predictions. Such explanations are used to understand and establish trust in models and are vital components in machine learning pipelines. Though explanations are a critical piece in these systems, there is little understanding about how they are vulnerable to manipulation by adversaries. In this paper, we discuss how two broad classes of explanations are vulnerable to manipulation. We demonstrate how adversaries can design biased models that manipulate model agnostic feature attribution methods (e.g., LIME & SHAP) and counterfactual explanations that hill-climb during the counterfactual search (e.g., Wachter's Algorithm & DiCE) into concealing the model's biases. These vulnerabilities allow an adversary to deploy a biased model, yet explanations will not reveal this bias, thereby deceiving stakeholders into trusting the model. We evaluate the manipulations on real world data sets, including COMPAS and Communities & Crime, and find explanations can be manipulated in practice.

翻訳日:2021-06-24 16:09:15 公開日:2021-06-23

# (参考訳) ニューラルネットワークにおける近似可逆性のための特徴アライメント

Feature Alignment for Approximated Reversibility in Neural Networks ( http://arxiv.org/abs/2106.12562v1 )

ライセンス: CC BY 4.0

Tiago de Souza Farias and Jonas Maziero

(参考訳) 本稿では,ニューラルネットワークにおける近似可逆性を得る手法である特徴アライメントを導入する。特徴抽出によって、ニューラルネットワークを訓練して、出力から入力への逆プロセスのための推定マップを学習することができる。変分オートエンコーダと組み合わせることで、トレーニングデータと同じ統計から新しいサンプルを生成することができる。生成的対向ネットワークの概念を用いて, 結果の改善を図った。最後に、ニューラルネットワークをローカルにトレーニングし、計算メモリリソースを節約するためにこの技術を変更可能であることを示す。これらの手法を適用し,MNIST,CIFAR-10,celebAの3つの視覚生成課題について報告する。

We introduce feature alignment, a technique for obtaining approximate reversibility in artificial neural networks. By means of feature extraction, we can train a neural network to learn an estimated map for its reverse process from outputs to inputs. Combined with variational autoencoders, we can generate new samples from the same statistics as the training data. Improvements of the results are obtained by using concepts from generative adversarial networks. Finally, we show that the technique can be modified for training neural networks locally, saving computational memory resources. Applying these techniques, we report results for three vision generative tasks: MNIST, CIFAR-10, and celebA.

翻訳日:2021-06-24 15:52:19 公開日:2021-06-23

# エイリアスフリー生成型adversarial network

Alias-Free Generative Adversarial Networks ( http://arxiv.org/abs/2106.12423v1 )

ライセンス: Link先を確認

Tero Karras, Miika Aittala, Samuli Laine, Erik Härkönen, Janne Hellsten, Jaakko Lehtinen, Timo Aila

(参考訳) 階層的畳み込みの性質にもかかわらず、典型的な生成逆数ネットワークの合成過程は不健全な方法で絶対画素座標に依存する。例えば、ディテールは描写されたオブジェクトの表面ではなく、画像座標に接着されているように見える。我々は、生成ネットワーク内でエイリアスを引き起こす不注意信号処理に根本原因を辿る。ネットワーク内のすべての信号を連続的に解釈すると、不要な情報が階層的な合成プロセスに漏れないことを保証する小さなアーキテクチャ変更が一般的に適用される。その結果得られるネットワークはstylegan2のfidと一致するが、内部表現では劇的に異なり、サブピクセルスケールでも翻訳と回転に完全同値である。その結果,ビデオやアニメーションに適した生成モデルへの道が開けた。

We observe that despite their hierarchical convolutional nature, the synthesis process of typical generative adversarial networks depends on absolute pixel coordinates in an unhealthy manner. This manifests itself as, e.g., detail appearing to be glued to image coordinates instead of the surfaces of depicted objects. We trace the root cause to careless signal processing that causes aliasing in the generator network. Interpreting all signals in the network as continuous, we derive generally applicable, small architectural changes that guarantee that unwanted information cannot leak into the hierarchical synthesis process. The resulting networks match the FID of StyleGAN2 but differ dramatically in their internal representations, and they are fully equivariant to translation and rotation even at subpixel scales. Our results pave the way for generative models better suited for video and animation.

翻訳日:2021-06-24 15:37:32 公開日:2021-06-23

# NodePiece: 大きな知識グラフの合成とパラメータ効率の良い表現

NodePiece: Compositional and Parameter-Efficient Representations of Large Knowledge Graphs ( http://arxiv.org/abs/2106.12144v1 )

ライセンス: Link先を確認

Mikhail Galkin, Jiapeng Wu, Etienne Denis, William L. Hamilton

(参考訳) 知識グラフ(KG)の従来の表現学習アルゴリズムは、各エンティティを独自の埋め込みベクトルにマッピングする。このような浅いルックアップは、埋め込み行列を格納するためのメモリ消費の線形増加をもたらし、現実世界のKGを扱う際に高い計算コストを発生させる。 NLPで一般的に使われているサブワードトークン化と平行に描画することで、サブ線形メモリ要求を伴うパラメータ効率の高いノード埋め込み戦略の展望を探る。そこで我々は,固定サイズのエンティティ語彙を学習するためのアンカーベースアプローチであるnodepieceを提案する。ノードピースでは、既知の関係型を持つグラフのアンカーノードからサブワード/サブエンティティ単位の語彙を構築する。このような固定サイズの語彙を考えると、トレーニング中に見えないものを含むあらゆるエンティティのエンコーディングと埋め込みをブートストラップすることができる。実験によると、NodePieceはノード分類、リンク予測、関係予測タスクにおいて競合的に動作し、グラフ内の明示的なノードの10%未満をアンカーとして保持し、しばしば10倍のパラメータを持つ。

Conventional representation learning algorithms for knowledge graphs (KG) map each entity to a unique embedding vector. Such a shallow lookup results in a linear growth of memory consumption for storing the embedding matrix and incurs high computational costs when working with real-world KGs. Drawing parallels with subword tokenization commonly used in NLP, we explore the landscape of more parameter-efficient node embedding strategies with possibly sublinear memory requirements. To this end, we propose NodePiece, an anchor-based approach to learn a fixed-size entity vocabulary. In NodePiece, a vocabulary of subword/sub-entity units is constructed from anchor nodes in a graph with known relation types. Given such a fixed-size vocabulary, it is possible to bootstrap an encoding and embedding for any entity, including those unseen during training. Experiments show that NodePiece performs competitively in node classification, link prediction, and relation prediction tasks while retaining less than 10% of explicit nodes in a graph as anchors and often having 10x fewer parameters.
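The anchor-based vocabulary can be illustrated with a small sketch: a node is described by its nearest anchor nodes (with hop distances) plus the relation types on its own edges, so no per-node embedding row is needed. This is a simplified reading of the abstract, not the authors' implementation; `tokenize_node`, the undirected toy graph, and the choice of `k` are assumptions.

```python
from collections import deque

def bfs_distances(adj, source):
    """Hop distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v, _rel in adj.get(u, []):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def tokenize_node(adj, node, anchors, k):
    """Describe a node by its k nearest anchors (with distances) and its relation types."""
    dist = bfs_distances(adj, node)
    reachable = [(dist[a], a) for a in anchors if a in dist]
    nearest = sorted(reachable)[:k]
    relations = sorted({rel for _v, rel in adj.get(node, [])})
    return {"anchors": [(a, d) for d, a in nearest], "relations": relations}

# Toy knowledge graph as (head, relation, tail) triples, treated as undirected.
edges = [("a", "r1", "b"), ("b", "r2", "c"), ("c", "r1", "d"), ("d", "r3", "e")]
adj = {}
for head, rel, tail in edges:
    adj.setdefault(head, []).append((tail, rel))
    adj.setdefault(tail, []).append((head, rel))
anchors = ["a", "e"]
token_b = tokenize_node(adj, "b", anchors, k=2)

# An entity added after "training" still gets a representation from existing anchors.
adj.setdefault("f", []).append(("e", "r2"))
adj.setdefault("e", []).append(("f", "r2"))
token_f = tokenize_node(adj, "f", anchors, k=2)
```

In a full model these tokens would be fed to a learned encoder to produce the entity embedding; only the anchors and relation types would need stored parameters.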

翻訳日:2021-06-24 15:37:01 公開日:2021-06-23

# トランスファーラーニングとデータ変換による事前学習視覚モデルによるテキストデータの分類

Classifying Textual Data with Pre-trained Vision Models through Transfer Learning and Data Transformations ( http://arxiv.org/abs/2106.12479v1 )

ライセンス: Link先を確認

Charaf Eddine Benarab

(参考訳) 知識は経験を通じて人間によって獲得され、異なるタスクで同時に達成できる知識の種類やスキルレベルの境界は設定されない。ニューラルネットワークに関しては、そうではありませんが、この分野における大きなブレークスルーは極めてタスクとドメイン特化です。ビジョンと言語は別々の方法で処理され、別々のメソッドと異なるデータセットを使用する。本稿では,imagenetでトレーニングされたベンチマークビジョンモデルによって得られた知識を用いて,より小さなアーキテクチャでテキストの分類を学ぶことを提案する。 IMDBデータセットに含まれるテキストデータをグレースケールイメージに変換する。異なる領域の解析と転送学習法を実行する。まったく異なるデータセットによる課題にもかかわらず、有望な結果が得られます。この研究の主な貢献は、言語とビジョンの両方で事前訓練された大きなモデルを結びつけて、元のタスクと異なるサブフィールドで最新の結果を達成する、新しいアプローチである。計算能力の高いリソースを必要とせず具体的には、視覚と言語モデル間の知識を転送して感情分析を行う。 BERT埋め込みはグレースケールのイメージに変換され、これらのイメージはVGG16やResNet Index Terms:自然言語、ビジョン、BERT、Transfer Learning、CNN、ドメイン適応といった事前訓練されたビジョンモデルのトレーニング例として使用される。

Knowledge is acquired by humans through experience, and no boundary is set between the kinds of knowledge or skill levels we can achieve on different tasks at the same time. When it comes to neural networks, that is not the case: the major breakthroughs in the field are extremely task- and domain-specific. Vision and language are dealt with in separate manners, using separate methods and different datasets. In this work, we propose to use knowledge acquired by benchmark vision models trained on ImageNet to help a much smaller architecture learn to classify text. After transforming the textual data contained in the IMDB dataset to grayscale images, an analysis of different domains and of the transfer learning method is carried out. Despite the challenge posed by the very different datasets, promising results are achieved. The main contribution of this work is a novel approach which links large pretrained models on both language and vision to achieve state-of-the-art results in different sub-fields from the original task, without needing high compute capacity resources. Specifically, sentiment analysis is achieved after transferring knowledge between vision and language models. BERT embeddings are transformed into grayscale images; these images are then used as training examples for pretrained vision models such as VGG16 and ResNet.

Index Terms: Natural language, Vision, BERT, Transfer Learning, CNN, Domain Adaptation.
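One plausible way to turn an embedding vector into a grayscale image is min-max scaling to the 0..255 range followed by a reshape. The sketch below shows that under stated assumptions: the abstract does not describe the exact transformation, so `vector_to_grayscale`, the 24 x 32 layout, and the made-up embedding values are all hypothetical.

```python
def vector_to_grayscale(vec, height, width):
    """Min-max scale a flat vector to 0..255 and reshape it into height x width rows."""
    if len(vec) != height * width:
        raise ValueError("vector length must equal height * width")
    lo, hi = min(vec), max(vec)
    span = (hi - lo) or 1.0  # avoid division by zero for constant vectors
    pixels = [round(255 * (v - lo) / span) for v in vec]
    return [pixels[r * width:(r + 1) * width] for r in range(height)]

# Stand-in for a 768-dimensional sentence embedding (values are made up).
embedding = [((i * 37) % 101) / 50.0 - 1.0 for i in range(768)]
image = vector_to_grayscale(embedding, 24, 32)
```

An image produced this way would still have to be resized or tiled to the input size a pretrained vision model expects.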

翻訳日:2021-06-24 15:36:43 公開日:2021-06-23

# 血液細胞の多種分類 --エンドツーエンドコンピュータビジョンに基づく診断ケーススタディ-

Multi-Class Classification of Blood Cells -- End to End Computer Vision based diagnosis case study ( http://arxiv.org/abs/2106.12548v1 )

ライセンス: Link先を確認

Sai Sukruth Bezugam

(参考訳) 血液ベースの疾患の診断は、しばしば患者の血液サンプルを特定して特徴付ける。血液細胞サブタイプの検出と分類の自動化は、重要な医学的応用である。医療画像の自動処理と分析は、医療診断に強力なツールを提供する。本研究では, 白血球の外輪郭, 色の形態的特徴に基づいて, 白血球分類の問題に取り組む。 The work we would explore a set of preprocessing and segmentation (Color-based segmentation, Morphological processing, contouring) algorithms along with a set of features extraction methods (Corner detection algorithms and Histogram of Gradients(HOG)), dimensionality reduction algorithms (Principal Component Analysis(PCA)) that are able to recognize and classify through various Unsupervised(k-nearest neighbors) and Supervised (Support Vector Machine, Decision Trees, Linear Discriminant Analysis, Quadratic Discriminant Analysis, Naive Bayes) algorithms different categories of white blood cells to Eosinophil, Lymphocyte, Monocyte, and Neutrophil. さまざまなDeep Convolutional Neural Network Architecture(Sqeezent、MobilenetV1、MobilenetV2、InceptionNetなど)の探求も進めています。前処理/セグメンテーションおよび前処理なしで。我々は、最小時間複雑さと低リソース要求でロバストなアルゴリズムを特定するために、多くのアルゴリズムを探求したい。この研究の結果は、自動的な血液細胞分類に必要なアルゴリズムの選択の手がかりとなる可能性がある。

The diagnosis of blood-based diseases often involves identifying and characterizing patient blood samples. Automated methods to detect and classify blood cell subtypes have important medical applications. Automated medical image processing and analysis offers a powerful tool for medical diagnosis. In this work we tackle the problem of white blood cell classification based on the morphological characteristics of the cells' outer contour and color. We explore a set of preprocessing and segmentation algorithms (color-based segmentation, morphological processing, contouring), along with feature extraction methods (corner detection algorithms and Histogram of Oriented Gradients (HOG)) and dimensionality reduction algorithms (Principal Component Analysis (PCA)), that are able to recognize and classify white blood cells into four categories, Eosinophil, Lymphocyte, Monocyte, and Neutrophil, through various unsupervised (k-nearest neighbors) and supervised (Support Vector Machine, Decision Trees, Linear Discriminant Analysis, Quadratic Discriminant Analysis, Naive Bayes) algorithms. We go a step further and explore various deep convolutional neural network architectures (SqueezeNet, MobileNetV1, MobileNetV2, InceptionNet, etc.), both with and without preprocessing/segmentation. We explore many algorithms in order to identify a robust algorithm with the least time complexity and low resource requirements. The outcome of this work can guide the selection of algorithms, as per requirement, for automated blood cell classification.

翻訳日:2021-06-24 15:36:21 公開日:2021-06-23

# 安定,高速,高精度:相対的位置エンコーディングによるカーネル化注意

Stable, Fast and Accurate: Kernelized Attention with Relative Positional Encoding ( http://arxiv.org/abs/2106.12566v1 )

ライセンス: Link先を確認

Shengjie Luo, Shanda Li, Tianle Cai, Di He, Dinglan Peng, Shuxin Zheng, Guolin Ke, Liwei Wang, Tie-Yan Liu

(参考訳) トランスフォーマーの重要な要素であるアテンションモジュールは、二次複雑性のため、長いシーケンスに対して効率的にスケールできない。多くの作品は、元々の注意でドット指数のソフトマックス関数を近似することに焦点を当てており、サブクアドラティックあるいは線形複雑トランスフォーマーアーキテクチャへと繋がる。しかし,これらの手法は,例えば相対位置符号化 (rpe) を用いたトランスフォーマなど,dot-then-exponentiate スタイルを超えて,より強力な注意モジュールには適用できないことを示す。多くの最先端モデルでは、相対的な位置符号化がデフォルトとして使用されるため、RPEを組み込む効率的なトランスフォーマーを設計することは魅力的である。本稿では、カーネル化された注目の上にRPEを持つトランスフォーマーの注意計算を高速化する新しい手法を提案する。相対的な位置符号化がtoeplitz行列を形成するという観測に基づいて,高速フーリエ変換(fft)を用いてrpeによるカーネル化注意を効率的に計算できることを数学的に示す。 FFTでは,時間複雑性を$\mathcal{O}(n\log n)$とする。さらに, 相対的位置符号化を適切に使用することで, バニラ核化注意のトレーニング不安定性問題を軽減できることを示す。幅広いタスクにおいて、最適化の問題なしにモデルをゼロからトレーニングできることを経験的に示します。学習されたモデルは、多くの効率的なTransformer変種よりも優れた性能を示し、長周期の標準的なTransformerよりも高速である。

The attention module, which is a crucial component in Transformer, cannot scale efficiently to long sequences due to its quadratic complexity. Many works focus on approximating the dot-then-exponentiate softmax function in the original attention, leading to sub-quadratic or even linear-complexity Transformer architectures. However, we show that these methods cannot be applied to more powerful attention modules that go beyond the dot-then-exponentiate style, e.g., Transformers with relative positional encoding (RPE). Since in many state-of-the-art models, relative positional encoding is used as default, designing efficient Transformers that can incorporate RPE is appealing. In this paper, we propose a novel way to accelerate attention calculation for Transformers with RPE on top of the kernelized attention. Based upon the observation that relative positional encoding forms a Toeplitz matrix, we mathematically show that kernelized attention with RPE can be calculated efficiently using Fast Fourier Transform (FFT). With FFT, our method achieves $\mathcal{O}(n\log n)$ time complexity. Interestingly, we further demonstrate that properly using relative positional encoding can mitigate the training instability problem of vanilla kernelized attention. On a wide range of tasks, we empirically show that our models can be trained from scratch without any optimization issues. The learned model performs better than many efficient Transformer variants and is faster than standard Transformer in the long-sequence regime.
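The Toeplitz observation is what makes the FFT applicable: a matrix whose entry (i, j) depends only on i - j can be embedded in a circulant matrix, and circulant products are circular convolutions. The sketch below shows that generic trick for a single matrix-vector product in pure Python; it is not the paper's attention kernel, and the function names and test values are assumptions.

```python
import cmath

def fft(a, invert=False):
    """Recursive radix-2 FFT; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out

def toeplitz_matvec(c, x):
    """Compute y[i] = sum_j c[i - j] * x[j] in O(n log n).

    c has length 2n - 1; c[k + n - 1] holds the value for offset k = i - j.
    """
    n = len(x)
    size = 1
    while size < 2 * n:
        size *= 2
    # First column of the circulant embedding: offsets 0..n-1 at the front,
    # offsets -1..-(n-1) wrapped around to the back, zeros in between.
    col = [0.0] * size
    for k in range(n):
        col[k] = c[k + n - 1]
    for k in range(1, n):
        col[size - k] = c[-k + n - 1]
    xp = list(x) + [0.0] * (size - n)
    prod = [p * q for p, q in zip(fft(col), fft(xp))]
    y = fft(prod, invert=True)
    # The unnormalised inverse transform needs a division by size.
    return [(v / size).real for v in y[:n]]

c = [0.5, -1.0, 2.0, 0.0, 1.0, 3.0, -2.0, 0.25, 4.0]
x = [1.0, 2.0, 3.0, 4.0, 5.0]
fast = toeplitz_matvec(c, x)
naive = [sum(c[i - j + 4] * x[j] for j in range(5)) for i in range(5)]
```

With a relative positional bias that depends only on i - j, the positional part of the attention matrix has exactly this structure, which is where the $\mathcal{O}(n\log n)$ cost quoted above comes from.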

翻訳日:2021-06-24 15:36:03 公開日:2021-06-23

# 説明可能な機械学習における科学研究のための合成ベンチマーク

Synthetic Benchmarks for Scientific Research in Explainable Machine Learning ( http://arxiv.org/abs/2106.12543v1 )

ライセンス: Link先を確認

Yang Liu, Sujay Khandagale, Colin White, Willie Neiswanger

(参考訳) 機械学習モデルがより複雑になり、アプリケーションがよりハイテイクになるにつれて、モデル予測を説明するツールがますます重要になっている。説明可能性技術が広く使われているにもかかわらず、異なる特徴帰属法の評価と比較は依然として困難である: 評価は理想的には人間の研究を必要とし、経験的評価メトリクスは実世界のデータセットでは計算的に禁止されることが多い。本稿では,XAI-Benchという合成データセットのスイートと,特徴属性アルゴリズムをベンチマークするライブラリのリリースによってこの問題に対処する。実世界のデータセットとは異なり、合成データセットは、地味なShapley値やその他のメトリクスを評価するのに必要な条件付き期待値の効率的な計算を可能にする。私たちがリリースした合成データセットは、現実世界のデータをシミュレートするように構成できる幅広いパラメータを提供します。我々は,いくつかの評価指標にまたがる一般的な説明可能性手法をベンチマークし,一般的な説明者の障害モードを特定することで,図書館のパワーを実証する。ライブラリの効率は、開発からデプロイまで、新しい説明可能性メソッドをもたらすのに役立つでしょう。

As machine learning models grow more complex and their applications become more high-stakes, tools for explaining model predictions have become increasingly important. Despite the widespread use of explainability techniques, evaluating and comparing different feature attribution methods remains challenging: evaluations ideally require human studies, and empirical evaluation metrics are often computationally prohibitive on real-world datasets. In this work, we address this issue by releasing XAI-Bench: a suite of synthetic datasets along with a library for benchmarking feature attribution algorithms. Unlike real-world datasets, synthetic datasets allow the efficient computation of conditional expected values that are needed to evaluate ground-truth Shapley values and other metrics. The synthetic datasets we release offer a wide variety of parameters that can be configured to simulate real-world data. We demonstrate the power of our library by benchmarking popular explainability techniques across several evaluation metrics and identifying failure modes for popular explainers. The efficiency of our library will help bring new explainability methods from development to deployment.
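The benefit of synthetic data for ground-truth Shapley values can be seen in a small sketch: when the feature distribution is known, the conditional expectation that defines the value function is available in closed form. The example below assumes a linear model with independent features, which is a deliberately simple case and not XAI-Bench's API; all names and numbers are illustrative.

```python
from itertools import combinations
from math import factorial

def shapley_values(value_fn, n_features):
    """Exact Shapley values by enumerating all coalitions (exponential in n_features)."""
    players = range(n_features)
    phi = [0.0] * n_features
    for i in players:
        others = [j for j in players if j != i]
        for size in range(len(others) + 1):
            # Standard Shapley weight |S|! (n - |S| - 1)! / n!.
            weight = factorial(size) * factorial(n_features - size - 1) / factorial(n_features)
            for subset in combinations(others, size):
                s = frozenset(subset)
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi

# Synthetic setting: linear model f(x) = w . x with independent features of known mean mu.
w = [2.0, -1.0, 0.5]
mu = [0.0, 1.0, -2.0]
x = [1.0, 3.0, 0.0]

def value_fn(subset):
    # E[f(X) | X_S = x_S] under independence: known features use x, the rest use their mean.
    return sum(w[j] * (x[j] if j in subset else mu[j]) for j in range(len(w)))

phi = shapley_values(value_fn, len(w))
```

In this special case each attribution reduces to `w[i] * (x[i] - mu[i])`, which gives an exact reference against which an approximate explainer can be scored.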

翻訳日:2021-06-24 15:35:37 公開日:2021-06-23

# 機能可視化はcnnアクティベーションの因果理解にどの程度有効か?

How Well do Feature Visualizations Support Causal Understanding of CNN Activations? ( http://arxiv.org/abs/2106.12447v1 )

ライセンス: Link先を確認

Roland S. Zimmermann, Judy Borowski, Robert Geirhos, Matthias Bethge, Thomas S. A. Wallis, Wieland Brendel

(参考訳) 深層畳み込みニューラルネットワークの内部動作を理解するために広く用いられるアプローチの1つは、アクティベーションの最大化による単位応答の可視化である。アクティベーションの最大化による特徴可視化は、ユニットをアクティベートする画像の特徴に関する正確な情報を提供すると考えられている。もしこれが本当なら、これらの合成画像は、画像の特定のパッチ(例えば犬の頭)を隠すことがユニットのアクティベーションを変化させるかどうかなど、人間が介入の効果を予測できるようにすべきである。ここでは、2つの正方形のオクルージョンのどちらがユニットのアクティベーションに大きな変化を引き起こすかを人間に予測させることで、この仮説をテストする。大規模なクラウドソースによる実験と専門家による測定は、平均的に、Olahら (2017) による非常に活発な特徴視覚化が確かにこのタスクで人間を助けることを示している($67 \pm 4\%$の正確さ;視覚化なしのベースラインのパフォーマンスは$60 \pm 3\%$)。しかし、同様のパフォーマンス($66 \pm 3\%$から$67 \pm 3\%$の正確さ)をもたらす他の視覚化(データセットのサンプルなど)に比べて大きな優位性は提供されない。本研究では,人間に対する単位レベルの解釈可能性手法の利点を定量化するための客観的心理物理学的課題を提案し,特徴可視化が単純な代替的可視化よりも優れた「因果的理解」を人間に与えることを示す証拠は見つからない。

One widely used approach towards understanding the inner workings of deep convolutional neural networks is to visualize unit responses via activation maximization. Feature visualizations via activation maximization are thought to provide humans with precise information about the image features that cause a unit to be activated. If this is indeed true, these synthetic images should enable humans to predict the effect of an intervention, such as whether occluding a certain patch of the image (say, a dog's head) changes a unit's activation. Here, we test this hypothesis by asking humans to predict which of two square occlusions causes a larger change to a unit's activation. Both a large-scale crowdsourced experiment and measurements with experts show that on average, the extremely activating feature visualizations by Olah et al. (2017) indeed help humans on this task ($67 \pm 4\%$ accuracy; baseline performance without any visualizations is $60 \pm 3\%$). However, they do not provide any significant advantage over other visualizations (such as dataset samples), which yield similar performance ($66 \pm 3\%$ to $67 \pm 3\%$ accuracy). Taken together, we propose an objective psychophysical task to quantify the benefit of unit-level interpretability methods for humans, and find no evidence that feature visualizations provide humans with better "causal understanding" than simple alternative visualizations.

翻訳日:2021-06-24 15:34:31 公開日:2021-06-23

# 粗Q注意:離散化による視覚ロボットマニピュレーションのための効率的な学習

Coarse-to-Fine Q-attention: Efficient Learning for Visual Robotic Manipulation via Discretisation ( http://arxiv.org/abs/2106.12534v1 )

ライセンス: Link先を確認

Stephen James, Kentaro Wada, Tristan Laidlow, Andrew J. Davison

(参考訳) 過去数年間を振り返ると、深層強化学習(RL)における最大のブレークスルーは、離散的なアクション領域にある。しかし、ロボット操作は本質的には連続制御環境であるが、これらの連続制御強化学習アルゴリズムは、俳優と批評家の共同最適化のため、サンプル非効率で本質的に訓練が困難であるアクタ-批判的手法に依存することが多い。そこで我々は,ロボット操作領域に離散型アクションrlアルゴリズムの安定性を実現する方法について検討する。我々は最近リリースされたARMアルゴリズムを拡張し、連続する次ベストポーズエージェントを離散的な次ベストポーズエージェントに置き換える。回転の離散化はその有界性を考えると自明であるが、翻訳は本質的に非有界であり、離散化は困難である。翻訳予測は3次元空間を判別することでボクセル予測問題として定式化するが、大きなワークスペースのボクセル化はメモリ集約的であり、ボクセルの密度が高く、ロボット操作に必要な解像度を得るのに不可欠である。そこで我々は, このボクセル予測を, 分解能を徐々に高め, 粗い方法で適用することを提案する。各ステップにおいて,予測位置として最も高い値のボクセルを抽出し,次のステップで高分解能ボクセル化の中心として使用する。この粗大な予測はいくつかのステップで適用され、翻訳のほとんどロスレスな予測を与える。我々の新しい粗大きめのアルゴリズムは、連続的な制御の同等性よりもずっと効率的にRLBenchのタスクを達成でき、実世界のタスクである表状のラザを7分以内で訓練し、わずか3回のデモしか行えません。さらに,voxel表現に移行することで,複数のカメラからの観測を容易に取り入れることができることを示す。

Reflecting on the last few years, the biggest breakthroughs in deep reinforcement learning (RL) have been in the discrete action domain. Robotic manipulation, however, is inherently a continuous control environment, but these continuous control reinforcement learning algorithms often depend on actor-critic methods that are sample-inefficient and inherently difficult to train, due to the joint optimisation of the actor and critic. To that end, we explore how we can bring the stability of discrete action RL algorithms to the robot manipulation domain. We extend the recently released ARM algorithm, by replacing the continuous next-best pose agent with a discrete next-best pose agent. Discretisation of rotation is trivial given its bounded nature, while translation is inherently unbounded, making discretisation difficult. We formulate the translation prediction as a voxel prediction problem by discretising the 3D space; however, voxelisation of a large workspace is memory intensive and would not work with a high density of voxels, crucial to obtaining the resolution needed for robotic manipulation. We therefore propose to apply this voxel prediction in a coarse-to-fine manner by gradually increasing the resolution. In each step, we extract the highest valued voxel as the predicted location, which is then used as the centre of the higher-resolution voxelisation in the next step. This coarse-to-fine prediction is applied over several steps, giving a near-lossless prediction of the translation. We show that our new coarse-to-fine algorithm is able to accomplish RLBench tasks much more efficiently than the continuous control equivalent, and even train some real-world tasks, tabula rasa, in less than 7 minutes, with only 3 demonstrations. Moreover, we show that by moving to a voxel representation, we are able to easily incorporate observations from multiple cameras.
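The coarse-to-fine step described above can be sketched as a repeated argmax over a small voxel grid that shrinks around the best voxel each time. In this sketch `q_fn` is a hypothetical stand-in for the learned Q-function (here a simple distance to a made-up target), and the grid size, number of levels, and zoom rule are illustrative assumptions rather than the paper's settings.

```python
def coarse_to_fine_argmax(q_fn, center, half_width, grid=4, levels=4):
    """Refine a 3D location by repeatedly voxelising around the best voxel so far."""
    for _ in range(levels):
        step = 2 * half_width / grid
        best_value, best_center = None, None
        for i in range(grid):
            for j in range(grid):
                for k in range(grid):
                    # Centre of voxel (i, j, k) in the current cube.
                    voxel = tuple(
                        c - half_width + (idx + 0.5) * step
                        for c, idx in zip(center, (i, j, k))
                    )
                    value = q_fn(voxel)
                    if best_value is None or value > best_value:
                        best_value, best_center = value, voxel
        # Zoom in: the best voxel becomes the next, smaller workspace.
        center, half_width = best_center, step / 2
    return center

# Hypothetical Q-function: higher is better, peaked at a made-up target position.
target = (0.3, -0.42, 0.77)

def q_fn(p):
    return -sum((a - b) ** 2 for a, b in zip(p, target))

estimate = coarse_to_fine_argmax(q_fn, center=(0.0, 0.0, 0.0), half_width=1.0)
```

With a 4 x 4 x 4 grid over 4 levels this evaluates 256 voxels in total, yet reaches the resolution of a single 256 x 256 x 256 grid, which would need about 16.8 million voxels.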

翻訳日:2021-06-24 15:34:02 公開日:2021-06-23

# 勾配に基づく解法と二元化ニューラルネットワーク

Gradient-Based Interpretability Methods and Binarized Neural Networks ( http://arxiv.org/abs/2106.12569v1 )

ライセンス: Link先を確認

Amy Widdicombe, Simon J. Julier

(参考訳) バイナリニューラルネットワーク(BNN)は、エッジコンピューティングプラットフォームでディープラーニングが実行される方法に革命をもたらす可能性がある。しかし,これらのネットワークにおける解釈可能性手法の有効性は評価されていない。本稿では,2値化および完全精度ニューラルネットワーク(fpnn)に適用した場合に,多種多様な塩分マップに基づく解釈手法(gradient, smoothgrad, gradcam)の性能を比較する。基礎的なグラディエント法は両タイプのネットワークに対して非常に類似したマップを生成する。しかし、SmoothGradはBNNに対して非常にノイズの多いマップを生成する。 GradCAMはまた、ネットワークタイプによって異なるサリエンシマップも作成しており、BNNのいくつかは意味のない説明をしている。我々は,これらの相違の原因を解説し,より広い範囲のネットワークタイプに対して,解釈可能性手法をテストすべき理由の例として提示する。

Binarized Neural Networks (BNNs) have the potential to revolutionize the way that deep learning is carried out in edge computing platforms. However, the effectiveness of interpretability methods on these networks has not been assessed. In this paper, we compare the performance of several widely used saliency map-based interpretability techniques (Gradient, SmoothGrad and GradCAM), when applied to Binarized or Full Precision Neural Networks (FPNNs). We found that the basic Gradient method produces very similar-looking maps for both types of network. However, SmoothGrad produces significantly noisier maps for BNNs. GradCAM also produces saliency maps which differ between network types, with some of the BNNs having seemingly nonsensical explanations. We comment on possible reasons for these differences in explanations and present it as an example of why interpretability techniques should be tested on a wider range of network types.
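For readers unfamiliar with the two gradient-based methods being compared, the sketch below contrasts a plain gradient with SmoothGrad, which averages gradients over Gaussian-perturbed copies of the input. The toy function and its analytic gradient are assumptions for illustration only; the example does not model a binarized network, whose piecewise-constant activations are exactly what makes the comparison in the paper interesting.

```python
import random

def smoothgrad(grad_fn, x, sigma, n_samples, rng):
    """Average the gradient over Gaussian-perturbed copies of the input x."""
    total = [0.0] * len(x)
    for _ in range(n_samples):
        noisy_x = [xi + rng.gauss(0.0, sigma) for xi in x]
        g = grad_fn(noisy_x)
        total = [t + gi for t, gi in zip(total, g)]
    return [t / n_samples for t in total]

# Toy differentiable "model": f(x) = x0^2 + 3 * x1, with its analytic gradient.
def grad_fn(x):
    return [2.0 * x[0], 3.0]

rng = random.Random(0)
plain = grad_fn([1.0, -2.0])
smoothed = smoothgrad(grad_fn, [1.0, -2.0], sigma=0.1, n_samples=200, rng=rng)
```

On a smooth function like this one the two maps nearly coincide; differences only appear when the gradient changes sharply within the noise radius.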

翻訳日:2021-06-24 15:33:29 公開日:2021-06-23

# ブラックウェルによる公正オンライン学習への統一的アプローチ

A Unified Approach to Fair Online Learning via Blackwell Approachability ( http://arxiv.org/abs/2106.12242v1 )

ライセンス: Link先を確認

Evgenii Chzhen (LMO, CELESTE), Christophe Giraud (LMO, CELESTE), Gilles Stoltz (LMO, CELESTE)

(参考訳) 確率的かつ非敏感な文脈でオンライン学習を公平に行うための設定と一般的なアプローチを提供する。設定はプレイヤーと自然の間の繰り返しのゲームであり、それぞれのステージにおいてそれぞれのコンテキストに基づいてアクションを選択する。不知性の概念に触発されて、プレイヤーは決定を下す前に非敏感なコンテキストにしかアクセスできないと仮定し、同時に、敏感なコンテキストにアクセスする自然のケースと、敏感なコンテキストに気付いていない自然のケースについて論じる。未知の文脈分布の場合を扱うためにブラックウェルのアプローチ可能性理論を適用することにより、学習目的が公正性制約に適合するために必要な一般的な条件を提供する。この条件は (group-wise) no-regret と (group-wise) calibration の目的と、追加の制約として人口順にインスタンス化される。目的が制約と適合しない場合、提供されたフレームワークは、両者間の最適なトレードオフを特徴付けることができる。

We provide a setting and a general approach to fair online learning with stochastic sensitive and non-sensitive contexts. The setting is a repeated game between the Player and Nature, where at each stage both pick actions based on the contexts. Inspired by the notion of unawareness, we assume that the Player can only access the non-sensitive context before making a decision, while we discuss both cases of Nature accessing the sensitive contexts and Nature unaware of the sensitive contexts. Adapting Blackwell's approachability theory to handle the case of an unknown distribution of contexts, we provide a general necessary and sufficient condition for learning objectives to be compatible with some fairness constraints. This condition is instantiated on (group-wise) no-regret and (group-wise) calibration objectives, and on demographic parity as an additional constraint. When the objective is not compatible with the constraint, the provided framework makes it possible to characterise the optimal trade-off between the two.

翻訳日:2021-06-24 15:33:14 公開日:2021-06-23

# IQ-Learn:模倣のための逆ソフトQ学習

IQ-Learn: Inverse soft-Q Learning for Imitation ( http://arxiv.org/abs/2106.12142v1 )

ライセンス: Link先を確認

Divyansh Garg, Shuvam Chakraborty, Chris Cundy, Jiaming Song, Stefano Ermon

(参考訳) 多くの逐次的な意思決定問題(ロボット制御、ゲームプレイ、逐次予測など)では、人間または専門家のデータがタスクに関する有用な情報を含んでいる。しかし、少量のエキスパートデータからの模倣学習(il)は、複雑なダイナミクスを持つ高次元環境では困難である。振る舞いのクローニングは、実装の単純さと安定した収束性のため広く使われている単純な方法であるが、環境のダイナミクスに関する情報は利用しない。力学情報を利用する既存の多くの手法は、報酬や政策近似に対する逆最適化プロセスや偏りのある高分散勾配推定器による訓練が困難である。本稿では,1つのq関数を学習し,報酬と方針の両方を暗黙的に表現することにより,敵対的トレーニングを回避するダイナミクス認識il法を提案する。標準ベンチマークでは,暗黙的に学習した報奨は,強健な報奨と高い正の相関を示すが,本手法は逆強化学習(IRL)にも利用できる。提案手法である逆ソフトq学習(iq-learn)は,オフラインとオンラインの模倣学習環境において,必要な環境相互作用の数と高次元空間のスケーラビリティの両方において既存の手法を上回って最先端の結果を得る。

In many sequential decision-making problems (e.g., robotics control, game playing, sequential prediction), human or expert data is available containing useful information about the task. However, imitation learning (IL) from a small amount of expert data can be challenging in high-dimensional environments with complex dynamics. Behavioral cloning is a simple method that is widely used due to its simplicity of implementation and stable convergence but doesn't utilize any information involving the environment's dynamics. Many existing methods that exploit dynamics information are difficult to train in practice due to an adversarial optimization process over reward and policy approximators or biased, high variance gradient estimators. We introduce a method for dynamics-aware IL which avoids adversarial training by learning a single Q-function, implicitly representing both reward and policy. On standard benchmarks, the implicitly learned rewards show a high positive correlation with the ground-truth rewards, illustrating our method can also be used for inverse reinforcement learning (IRL). Our method, Inverse soft-Q learning (IQ-Learn) obtains state-of-the-art results in offline and online imitation learning settings, surpassing existing methods both in the number of required environment interactions and scalability in high-dimensional spaces.
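How a single Q-function can stand in for both a policy and a reward can be sketched with the usual maximum-entropy (soft) Q-learning relations: the policy is a softmax over Q, and the reward is Q minus the discounted soft value of the next state. The snippet assumes a temperature of 1, a tiny tabular problem, and deterministic transitions; the Q-values are made up and nothing here is taken from the authors' code.

```python
import math

def soft_value(q_row):
    """V(s) = log sum_a exp Q(s, a), computed stably."""
    m = max(q_row)
    return m + math.log(sum(math.exp(q - m) for q in q_row))

def policy_from_q(q_row):
    """Softmax policy pi(a|s) proportional to exp Q(s, a)."""
    v = soft_value(q_row)
    return [math.exp(q - v) for q in q_row]

def reward_from_q(q, transitions, gamma):
    """r(s, a) = Q(s, a) - gamma * V(s') for deterministic transitions s' = transitions[s][a]."""
    return [
        [q[s][a] - gamma * soft_value(q[transitions[s][a]]) for a in range(len(q[s]))]
        for s in range(len(q))
    ]

# Made-up Q-table for 2 states and 2 actions; action a leads to state a.
q = [[1.0, 0.0], [0.5, 2.0]]
transitions = [[0, 1], [0, 1]]
gamma = 0.9
pi0 = policy_from_q(q[0])
rewards = reward_from_q(q, transitions, gamma)
```

Because both quantities are read off the same table, there is no separate reward network to train against the policy, which is the sense in which adversarial training is avoided.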

翻訳日:2021-06-24 15:31:46 公開日:2021-06-23

# 制約プログラミングによるベイズネットワーク構造学習のための非周期推論の改良

Improved Acyclicity Reasoning for Bayesian Network Structure Learning with Constraint Programming ( http://arxiv.org/abs/2106.12269v1 )

ライセンス: Link先を確認

Fulya Trösser (MIAT INRA), Simon de Givry (MIAT INRA), George Katsirelos (MIA-Paris)

(参考訳) ベイジアンネットワークは確率的グラフィカルモデルであり、遺伝子制御ネットワークの推論、リスク分析、画像処理など幅広い応用領域を持つ。離散データからベイズネットワーク(BNSL)の構造を学習することは、有向非巡回グラフの超指数探索空間を持つNPハードタスクであることが知られている。本研究では,全ての可能なクラスタカットのサブセットを発見するための新しい多項式時間アルゴリズム,結果の線形プログラムを近似的に解くグリーディアルゴリズム,非循環性制約に対する一般化アーク整合アルゴリズムを提案する。制約プログラミングに基づく分岐結合解法 CPBayes にこれらを組み込んで, 最適ではないにもかかわらず, 桁違いの性能向上を図っている。結果として得られる解法は、NPハード問題を解くBNSL問題に対する最先端の解法である GOBNILP と好意的に比較し、線形プログラムを正確に解く。

Bayesian networks are probabilistic graphical models with a wide range of application areas including gene regulatory networks inference, risk analysis and image processing. Learning the structure of a Bayesian network (BNSL) from discrete data is known to be an NP-hard task with a superexponential search space of directed acyclic graphs. In this work, we propose a new polynomial time algorithm for discovering a subset of all possible cluster cuts, a greedy algorithm for approximately solving the resulting linear program, and a generalised arc consistency algorithm for the acyclicity constraint. We embed these in the constraint programming-based branch-and-bound solver CPBayes and show that, despite being suboptimal, they improve performance by orders of magnitude. The resulting solver also compares favourably with GOBNILP, a state-of-the-art solver for the BNSL problem which solves an NP-hard problem to discover each cut and solves the linear program exactly.

翻訳日:2021-06-24 15:31:23 公開日:2021-06-23

# AC/DC: ディープニューラルネットワークの交互圧縮/非圧縮訓練

AC/DC: Alternating Compressed/DeCompressed Training of Deep Neural Networks ( http://arxiv.org/abs/2106.12379v1 )

ライセンス: Link先を確認

Alexandra Peste, Eugenia Iofinova, Adrian Vladu, Dan Alistarh

(参考訳) ディープニューラルネットワーク(DNN)の計算要求の増大は、疎いが正確でないDNNモデルを得ることに大きな関心を惹き付けている。最近の研究は、DNNの重量が可能な限り、訓練中の計算コストを減らすために既に不足しているスパーストレーニングのさらに難しいケースを調査している。既存のスパーストレーニング法は主に経験的であり、しばしば密度の高いベースラインと比較して精度が低い。本稿では,DNNのAlternating Compressed/DeCompressed (AC/DC) トレーニングと呼ばれる一般的な手法を提案し,アルゴリズムの変種に対する収束性を実証し,AC/DCが既存のスパーストレーニング手法を類似の計算予算で精度良く上回っていることを示す。 AC/DCの重要な特性は、密度とスパースモデルのコトレーニングが可能であり、トレーニングプロセスの終了時に正確なスパースセンスモデルペアが得られることである。これは実際に有用であり、圧縮された変種は、トレーニングフロー全体をやり直すことなく、リソース制約された設定に展開するのに好適であり、また、密集モデルと圧縮モデルの間の精度ギャップに関する洞察を提供する。

The increasing computational requirements of deep neural networks (DNNs) have led to significant interest in obtaining DNN models that are sparse, yet accurate. Recent work has investigated the even harder case of sparse training, where the DNN weights are, for as much as possible, already sparse to reduce computational costs during training. Existing sparse training methods are mainly empirical and often have lower accuracy relative to the dense baseline. In this paper, we present a general approach called Alternating Compressed/DeCompressed (AC/DC) training of DNNs, demonstrate convergence for a variant of the algorithm, and show that AC/DC outperforms existing sparse training methods in accuracy at similar computational budgets; at high sparsity levels, AC/DC even outperforms existing methods that rely on accurate pre-trained dense models. An important property of AC/DC is that it allows co-training of dense and sparse models, yielding accurate sparse-dense model pairs at the end of the training process. This is useful in practice, where compressed variants may be desirable for deployment in resource-constrained settings without re-doing the entire training flow, and also provides us with insights into the accuracy gap between dense and compressed models.

Translated: 2021-06-24 15:31:06 Published: 2021-06-23

# 神経odeにおける予測を超える:同定と介入

Beyond Predictions in Neural ODEs: Identification and Interventions ( http://arxiv.org/abs/2106.12430v1 )

License: see the linked page

Hananeh Aliee, Fabian J. Theis, Niki Kilbertus

(参考訳) パターンマッチングと予測タスクの膨大な成功に刺激され、研究者は独自の科学的発見を支援するために機械学習に頼るようになった。システムに関する大量の観測データがあれば、その進化を支配するルールを解明できるだろうか? このタスクの解決は、因果的相互作用を完全に理解し、介入の下でシステムの振る舞いについて信頼できる予測を行うという大きな約束を果たす。我々は、通常の微分方程式(ODE)系から生成された時系列データに対して、この問題に答えるための一歩を踏み出した。ガバナンスODEはデータだけでは識別できないかもしれないが、フレキシブルなニューラルODEと単純な正規化スキームを組み合わせることで、時系列データから動的および因果構造を堅牢に復元できることを示す。提案手法は, 実データと同様に, 様々な(非)線形一階および二階システムにおいて検証された。我々は、変数やシステム自体の介入の下で正確な予測を行うこともできることを示して結論付けます。

Spurred by tremendous success in pattern matching and prediction tasks, researchers increasingly resort to machine learning to aid original scientific discovery. Given large amounts of observational data about a system, can we uncover the rules that govern its evolution? Solving this task holds the great promise of fully understanding the causal interactions and being able to make reliable predictions about the system's behavior under interventions. We take a step towards answering this question for time-series data generated from systems of ordinary differential equations (ODEs). While the governing ODEs might not be identifiable from data alone, we show that combining simple regularization schemes with flexible neural ODEs can robustly recover the dynamics and causal structures from time-series data. Our results on a variety of (non)-linear first and second order systems as well as real data validate our method. We conclude by showing that we can also make accurate predictions under interventions on variables or the system itself.

Translated: 2021-06-24 15:30:44 Published: 2021-06-23

# レバレッジ統計とイノベーションサーチによるクローズドフォーム,プロビブル,ロバストPCA

Closed-Form, Provable, and Robust PCA via Leverage Statistics and Innovation Search ( http://arxiv.org/abs/2106.12190v1 )

License: see the linked page

Mostafa Rahmani and Ping Li

(参考訳) データクラスタリングのために最初に提案されたInnovation Searchのアイデアは、最近、外れ値検出に使用された。異常検出のためのイノベーション探索の応用において、データポイントの革新を測定するためにイノベーションの方向性が活用された。本研究では,革新探索アルゴリズムで計算された革新価値を二次コスト関数で検討し,新しいコスト関数を用いた革新価値がスコアの活用に等しいことを証明した。この興味深い接続は、Levanage ScoreベースのロバストPCA法に対するいくつかの理論的保証を確立し、新しいロバストPCA法を設計するために利用される。理論的には、アウトリアー分布とインリアー分布の異なるモデルによるパフォーマンス保証が含まれる。さらに,ノイズの存在に対するアルゴリズムの堅牢性を示す。数値的および理論的研究は、提案手法は高速かつ閉形式であるが、既存のアルゴリズムの大部分を上回ることができることを示している。

The idea of Innovation Search, which was initially proposed for data clustering, was recently used for outlier detection. In the application of Innovation Search for outlier detection, the directions of innovation were utilized to measure the innovation of the data points. We study the Innovation Values computed by the Innovation Search algorithm under a quadratic cost function and prove that Innovation Values with the new cost function are equivalent to Leverage Scores. This interesting connection is utilized to establish several theoretical guarantees for a Leverage Score-based robust PCA method and to design a new robust PCA method. The theoretical results include performance guarantees with different models for the distribution of outliers and the distribution of inliers. In addition, we demonstrate the robustness of the algorithms against the presence of noise. The numerical and theoretical studies indicate that while the presented approach is fast and closed-form, it can outperform most of the existing algorithms.
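
As a rough illustration of the quantity this abstract refers to, the sketch below computes textbook statistical leverage scores, h_i = x_i^T (X^T X)^{-1} x_i, for the rows of a small n-by-2 matrix in plain Python. It is not the robust PCA algorithm from the paper, whose exact construction is not given here; the function name `leverage_scores_2d` and the toy data are assumptions made for this example.

```python
def leverage_scores_2d(points):
    """Leverage score h_i = x_i^T (X^T X)^{-1} x_i for each row x_i of an n-by-2 matrix X."""
    # Entries of the 2x2 Gram matrix X^T X = [[sxx, sxy], [sxy, syy]].
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    syy = sum(y * y for _, y in points)
    det = sxx * syy - sxy * sxy
    if det == 0:
        raise ValueError("X^T X is singular; the rows do not span the plane")
    # Closed-form 2x2 inverse applied to each row: (syy*x^2 - 2*sxy*x*y + sxx*y^2) / det.
    return [(syy * x * x - 2 * sxy * x * y + sxx * y * y) / det for x, y in points]


# Hypothetical toy data: four inliers on the x-axis and one point off that line.
points = [(1.0, 0.0), (-1.0, 0.0), (2.0, 0.0), (-2.0, 0.0), (0.0, 1.0)]
scores = leverage_scores_2d(points)
```

In this toy data the single off-line point receives the largest score, and the scores sum to 2, the rank of the matrix, which is a general property of leverage scores.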

Translated: 2021-06-24 15:29:42 Published: 2021-06-23

# ランダム効果バンディット

Random Effect Bandits ( http://arxiv.org/abs/2106.12200v1 )

License: see the linked page

Rong Zhu and Branislav Kveton

(参考訳) 本稿では,古典的オンライン学習問題である多腕バンディットにおける後悔の最小化について述べる。より統計的に効率的なアルゴリズムを開発するために,ランダム効果モデルの仮定を用いることを提案する。このモデルでは、腕の平均報酬は未知の分布から独立に引き出され、そのパラメータは推定される。我々は,本モデルにおけるアーム平均の推定器を提供し,その不確実性を分析する。これらの結果に基づいて,我々はReUCBと呼ぶ UCB アルゴリズムを設計する。 reucbを分析して、既存の下限に合致した、n$roundの後悔に縛られたベイズ後悔を証明する。実験の結果,reucbは,アーム平均の事前分布が分かっていないと仮定することなく,様々なシナリオにおいてトンプソンサンプリングよりも優れることがわかった。

This paper studies regret minimization in multi-armed bandits, a classical online learning problem. To develop more statistically-efficient algorithms, we propose to use the assumption of a random-effect model. In this model, the mean rewards of arms are drawn independently from an unknown distribution, whose parameters we estimate. We provide an estimator of the arm means in this model and also analyze its uncertainty. Based on these results, we design a UCB algorithm, which we call ReUCB. We analyze ReUCB and prove a Bayes regret bound on its $n$-round regret, which matches an existing lower bound. Our experiments show that ReUCB can outperform Thompson sampling in various scenarios, without assuming that the prior distribution of arm means is known.
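
To make the random-effect idea concrete, here is a minimal sketch of the standard Gaussian shrinkage estimate of a single arm's mean, assuming the arm mean is drawn from N(mu0, sigma0_sq) and rewards carry noise of variance sigma_sq. This is not ReUCB itself: the paper estimates the distribution parameters, whereas here `mu0`, `sigma0_sq`, and `sigma_sq` are treated as known, and the function name `shrunk_arm_mean` is hypothetical.

```python
def shrunk_arm_mean(pulls, sample_mean, mu0, sigma0_sq, sigma_sq):
    """Posterior mean of an arm's mean reward under a Gaussian random-effect model."""
    # Weight on the arm's own data; it grows from 0 towards 1 as the arm is pulled more.
    weight = pulls * sigma0_sq / (pulls * sigma0_sq + sigma_sq)
    return weight * sample_mean + (1.0 - weight) * mu0


# An arm that was never pulled falls back to the shared mean mu0.
unpulled = shrunk_arm_mean(pulls=0, sample_mean=0.0, mu0=0.5, sigma0_sq=1.0, sigma_sq=1.0)
# One pull with equal variances lands halfway between the observation and mu0.
one_pull = shrunk_arm_mean(pulls=1, sample_mean=1.0, mu0=0.0, sigma0_sq=1.0, sigma_sq=1.0)
```

The point of the sketch is only that arms with few pulls borrow strength from the shared distribution, which is the intuition behind using a random-effect assumption for statistical efficiency.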

Translated: 2021-06-24 15:29:27 Published: 2021-06-23

# groupShapley: 特徴群に対するShapley値を用いた効率的な予測説明

groupShapley: Efficient prediction explanation with Shapley values for feature groups ( http://arxiv.org/abs/2106.12228v1 )

License: see the linked page

Martin Jullum, Annabelle Redelmeier, Kjersti Aas

(参考訳) 共有値は、複雑な機械学習モデルから予測を説明する最も適切で理論的に健全なフレームワークの1つとして確立されている。説明設定におけるシェープリー値の人気は、おそらくそのユニークな理論的性質によるものである。しかし、Shapley値の最大の欠点は、その計算複雑性が入力機能の数で指数関数的に増加し、何百、何千もの機能がある多くの現実世界の状況では実現不可能であることだ。さらに、多くの(依存した)機能により、計算されたShapley値の提示と解釈も困難になる。本稿では,上記のボトルネックに対処するための概念的にシンプルなアプローチであるgroupshapleyを紹介する。そのアイデアは、例えば、型や依存によって、機能をグループ化し、その後、個々の機能ではなく、これらのグループのためにshapley値を計算し、提示することです。数百から数千の機能を半ダース程度に削減することで、正確な計算が事実上可能になり、プレゼンテーションや知識の抽出が大幅に単純化される。特定の条件下では、groupShapleyは各特徴群内の特徴量Shapley値の和と同値であることを示す。さらに,これらの条件を満たさない場合の違いを示すシミュレーション実験を行う。このアプローチのユーザビリティを、grouphapleyがシンプルで直感的な説明を提供する実世界の自動車保険の例で説明します。

Shapley values have established themselves as one of the most appropriate and theoretically sound frameworks for explaining predictions from complex machine learning models. The popularity of Shapley values in the explanation setting is probably due to their unique theoretical properties. The main drawback of Shapley values, however, is that their computational complexity grows exponentially in the number of input features, making them infeasible in many real-world situations where there could be hundreds or thousands of features. Furthermore, with many (dependent) features, presenting/visualizing and interpreting the computed Shapley values also becomes challenging. The present paper introduces groupShapley: a conceptually simple approach for dealing with the aforementioned bottlenecks. The idea is to group the features, for example by type or dependence, and then compute and present Shapley values for these groups instead of for all individual features. Reducing hundreds or thousands of features to half a dozen or so makes precise computation practically feasible and greatly simplifies presentation and knowledge extraction. We prove that under certain conditions, groupShapley is equivalent to summing the feature-wise Shapley values within each feature group. Moreover, we provide a simulation study exemplifying the differences when these conditions are not met. We illustrate the usability of the approach in a real-world car insurance example, where groupShapley is used to provide simple and intuitive explanations.
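
The grouping idea can be sketched with an exact Shapley computation in which the players are either single features or whole feature groups. The code below is a toy illustration, not the authors' implementation: the value functions, the feature names, and the numbers are all made up, and the game is deliberately chosen so that its only interaction stays inside one group.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values; `value` maps a frozenset of players to a number."""
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for size in range(n):
            # Shapley weight of a coalition of this size that excludes p.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = frozenset(subset)
                total += weight * (value(coalition | {p}) - value(coalition))
        phi[p] = total
    return phi


# Hypothetical toy game: additive feature effects plus one interaction inside the car group.
effects = {"age": 1.0, "income": 0.5, "car_age": -0.3, "car_power": 0.8}
groups = {"driver": ["age", "income"], "car": ["car_age", "car_power"]}


def feature_value(coalition):
    total = sum(effects[f] for f in coalition)
    if {"car_age", "car_power"} <= coalition:
        total += 2.0  # interaction that is fully contained in the car group
    return total


def group_value(coalition):
    # A group coalition is worth whatever the union of its member features is worth.
    members = frozenset(f for g in coalition for f in groups[g])
    return feature_value(members)


feature_phi = shapley_values(list(effects), feature_value)  # 4 players
group_phi = shapley_values(list(groups), group_value)       # 2 players
```

In this particular toy game the group-level value for `car` equals the sum of the feature-level values of `car_age` and `car_power`. That is a property of this example only; it is not a restatement of the paper's conditions. The cost saving is visible in the player count: each player's sum runs over 2^(n-1) coalitions, so two groups need far fewer value-function calls than four features.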

Translated: 2021-06-24 15:29:15 Published: 2021-06-23

# ParK: 特徴空間分割による音と効率の良いカーネルリッジ回帰

ParK: Sound and Efficient Kernel Ridge Regression by Feature Space Partitions ( http://arxiv.org/abs/2106.12231v1 )

License: see the linked page

Luigi Carratino, Stefano Vigogna, Daniele Calandriello, Lorenzo Rosasco

(参考訳) 我々は,カーネルリッジ回帰のための新しい大規模解法parkを紹介する。分割とランダムな投影と反復最適化を組み合わせることで,同じ統計的精度を維持しつつ,空間と時間の複雑さを低減できる。特に、入力空間ではなく特徴空間に直接適切な分割を構築することにより、局所的推定子間の直交性が促進され、局所的有効次元やバイアスといった重要な量が制御されていることが保証される。本手法は,大規模データセット上での数値実験により,統計計算のトレードオフを特徴とし,その効果を実証する。

We introduce ParK, a new large-scale solver for kernel ridge regression. Our approach combines partitioning with random projections and iterative optimization to reduce space and time complexity while provably maintaining the same statistical accuracy. In particular, constructing suitable partitions directly in the feature space rather than in the input space, we promote orthogonality between the local estimators, thus ensuring that key quantities such as local effective dimension and bias remain under control. We characterize the statistical-computational tradeoff of our model, and demonstrate the effectiveness of our method by numerical experiments on large-scale datasets.

Translated: 2021-06-24 15:28:55 Published: 2021-06-23

# ニューラルネットワークによるLee-CarterモデルとPoisson Lee-Carterモデルの校正

Calibrating the Lee-Carter and the Poisson Lee-Carter models via Neural Networks ( http://arxiv.org/abs/2106.12312v1 )

License: see the linked page

Salvatore Scognamiglio

(参考訳) 本稿では,複数の個体群にLee-CarterモデルとPoisson Lee-Carterモデルを適用するニューラルネットワーク手法を提案する。我々は, 個々のlcモデルの構造を再現したニューラルネットワークを開発し, 全集団の死亡データを同時に解析することにより, それらの統合的適合を可能にする。ニューラルネットワークアーキテクチャは、従来の推定スキームのように、人口固有のデータサブセットを使用するのではなく、利用可能なすべての情報を使用して各モデルを調整するように特別に設計されている。 HMD(Human Mortality Database)のすべての国で実施された大規模な数値実験は、我々のアプローチの有効性を示している。特に、結果のパラメータ推定値は、死亡率のデータ、特に低人口国でしばしば発生するランダムな変動に対して滑らかに、より敏感に見えます。また,予測性能も大幅に向上した。

This paper introduces a neural network approach for fitting the Lee-Carter and the Poisson Lee-Carter models on multiple populations. We develop neural networks that replicate the structure of the individual LC models and allow their joint fitting by analysing the mortality data of all the considered populations simultaneously. The neural network architecture is specifically designed to calibrate each individual model using all available information instead of using a population-specific subset of data as in the traditional estimation schemes. A large set of numerical experiments performed on all the countries of the Human Mortality Database (HMD) shows the effectiveness of our approach. In particular, the resulting parameter estimates appear smooth and less sensitive to the random fluctuations often present in mortality rate data, especially for low-population countries. In addition, the forecasting performance is also significantly improved.
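
For readers unfamiliar with the two models named here, their standard textbook formulations are sketched below. The notation is chosen for this note and may differ from the paper's: $m_{x,t}$ is the central death rate at age $x$ in year $t$, $D_{x,t}$ the death count, and $E_{x,t}$ the exposure.

```latex
% Lee-Carter model with its usual identifiability constraints
\begin{align}
  \log m_{x,t} &= a_x + b_x k_t + \varepsilon_{x,t},
  & \sum_x b_x &= 1, \quad \sum_t k_t = 0 \\
% Poisson Lee-Carter: same log-bilinear structure, Poisson death counts
  D_{x,t} &\sim \mathrm{Poisson}\!\left(E_{x,t}\, e^{a_x + b_x k_t}\right)
\end{align}
```

Here $a_x$ is the average age profile of log-mortality, $k_t$ the time index, and $b_x$ the age-specific sensitivity to that index; these are the parameters that are fitted per population.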

Translated: 2021-06-24 15:28:45 Published: 2021-06-23

# オートエンコーダの革新とリアルタイム異常検出への応用

Innovations Autoencoder and its Application in Real-Time Anomaly Detection ( http://arxiv.org/abs/2106.12382v1 )

License: see the linked page

Xinyi Wang, Lang Tong

(参考訳) 時系列のイノベーティブシーケンス(innovations sequence of a time series)は、元の時系列が因果表現を持つ独立かつ同分布の確率変数の列である。当時の革新は、時系列の以前の歴史とは統計的に独立している。そのため、現在に含まれている新しい情報を表すが、過去にはない。単純な確率構造のため、イノベーションシーケンスはオリジナルの最も効率的な署名である。原理または独立解析(PCA/ICA)表現とは異なり、革新系列は完全な統計的性質だけでなく、オリジナルの時系列の時間順序も保存する。長年の未解決問題は、非ガウス過程のイノベーション列を抽出するための計算的に扱いやすい方法を見つけることである。本稿では,因果畳み込みニューラルネットワークを用いてイノベーションシーケンスを抽出する,innovations autoencoder(iae)と呼ばれるディープラーニング手法を提案する。未知の異常モデルと無異常モデルを用いた非パラメトリック異常検出へのIAEの適用について述べる。

An innovations sequence of a time series is a sequence of independent and identically distributed random variables with which the original time series has a causal representation. The innovation at a time is statistically independent of the prior history of the time series. As such, it represents the new information contained at present but not in the past. Because of its simple probability structure, an innovations sequence is the most efficient signature of the original. Unlike the principal or independent component analysis (PCA/ICA) representations, an innovations sequence preserves not only the complete statistical properties but also the temporal order of the original time series. A long-standing open problem is to find a computationally tractable way to extract an innovations sequence of non-Gaussian processes. This paper presents a deep learning approach, referred to as Innovations Autoencoder (IAE), that extracts innovations sequences using a causal convolutional neural network. An application of IAE to nonparametric anomaly detection with unknown anomaly and anomaly-free models is also presented.

Translated: 2021-06-24 15:28:32 Published: 2021-06-23

# ミラードステイン演算子によるサンプリング

Sampling with Mirrored Stein Operators ( http://arxiv.org/abs/2106.12506v1 )

License: see the linked page

Jiaxin Shi, Chang Liu, Lester Mackey

(参考訳) 制約領域と非ユークリッド測地に適した新しい粒子進化型試料群を紹介する。ステイン変分ミラーDescentとミラーレッド変分グレイディエントDescentは、鏡写像で定義される双対空間における粒子の進化による制約対象分布へのクルバック・リーブラー(KL)の偏差を最小化する。スタイン変分自然勾配は非ユークリッド幾何学を利用して、KLの非拘束対象への発散をより効率的に最小化する。この研究で開発されたミラー化されたスタイン作用素と適応カーネルからこれらのサンプルを導出する。これらの新しい標本は, 単純集合上の分布に正確な近似を与え, 選択後の推論において有効な信頼区間を与え, 大規模非拘束後推定において, 従来法よりも高速に収束することを示す。最後に,対象分布の検証可能な条件下での新たな手続きの収束を確立する。

We introduce a new family of particle evolution samplers suitable for constrained domains and non-Euclidean geometries. Stein Variational Mirror Descent and Mirrored Stein Variational Gradient Descent minimize the Kullback-Leibler (KL) divergence to constrained target distributions by evolving particles in a dual space defined by a mirror map. Stein Variational Natural Gradient exploits non-Euclidean geometry to more efficiently minimize the KL divergence to unconstrained targets. We derive these samplers from a new class of mirrored Stein operators and adaptive kernels developed in this work. We demonstrate that these new samplers yield accurate approximations to distributions on the simplex, deliver valid confidence intervals in post-selection inference, and converge more rapidly than prior methods in large-scale unconstrained posterior inference. Finally, we establish the convergence of our new procedures under verifiable conditions on the target distribution.

Translated: 2021-06-24 15:28:17 Published: 2021-06-23

# co-advise:クロスインダクティブバイアス蒸留

Co-advise: Cross Inductive Bias Distillation ( http://arxiv.org/abs/2106.12378v1 )

License: see the linked page

Sucheng Ren, Zhengqi Gao, Tianyu Hua, Zihui Xue, Yonglong Tian, Shengfeng He, Hang Zhao

(参考訳) 近年のトランスフォーマーは、自然言語処理のコミュニティから、視覚学習タスクのための畳み込みベースのニューラルネットワークの代替として適応している。しかし、その優越性は不十分なトレーニングデータ(例: imagenet)を与えられた。そこで本研究では,視覚変換器を訓練するための蒸留法を提案する。単に重い畳み込みベースの教師が提供される以前の作品とは異なり、学生トランスフォーマーを助言するために異なるアーキテクチャ的帰納的バイアス(例えば、畳み込みと畳み込み)を持つ軽量の教師を導入する。鍵となるのは、異なるインダクティブバイアスを持つ教師は、同じデータセットでトレーニングされているにもかかわらず異なる知識を得ることであり、そのような異なる知識の複合物であり、蒸留中の生徒のパフォーマンスを高めることである。このクロスインダクティブバイアス蒸留法により、私たちのビジョントランスフォーマー(CivT)は、ImageNet上の同じアーキテクチャの以前のトランスフォーマーよりも優れています。

Transformers have recently been adapted from the natural language processing community as a promising substitute for convolution-based neural networks in visual learning tasks. However, their advantage degrades given an insufficient amount of training data (e.g., ImageNet). To make them practical, we propose a novel distillation-based method to train vision transformers. Unlike previous works, where merely heavy convolution-based teachers are provided, we introduce lightweight teachers with different architectural inductive biases (e.g., convolution and involution) to co-advise the student transformer. The key is that teachers with different inductive biases attain different knowledge even though they are trained on the same dataset, and such different knowledge compounds and boosts the student's performance during distillation. Equipped with this cross inductive bias distillation method, our vision transformers (termed CivT) outperform all previous transformers of the same architecture on ImageNet.

Translated: 2021-06-24 15:28:00 Published: 2021-06-23

# APNN-TC: Ampere GPU Tensor Core上での任意精度ニューラルネットワークの高速化

APNN-TC: Accelerating Arbitrary Precision Neural Networks on Ampere GPU Tensor Cores ( http://arxiv.org/abs/2106.12169v1 )

License: see the linked page

Boyuan Feng, Yuke Wang, Tong Geng, Ang Li, Yufei Ding

(参考訳) 近年,量子化によるニューラルネットワークの高速化が広く研究されている。残念なことに、さまざまな精度(1ビットの重みや2ビットのアクティベーションなど)の以前の取り組みは、gpu(例えば、int1やint4)の精度の制限によって制限される。このような制約を破るために,最初の任意精度ニューラルネットワークフレームワーク(apnn-tc)を導入し,アンペアgpuテンソルコアの量子化利点を十分に活用する。具体的には、APNN-TCはまず、int1計算プリミティブとXOR/ANDブール演算による任意の短ビット幅計算をサポートする新しいエミュレーションアルゴリズムを組み込んだ。第2に、APNN-TCは任意の精度層の設計を統合し、エミュレーションアルゴリズムを新しいバッチ戦略と特別なメモリ構成でTensor Coresに効率的にマッピングする。第3に、apnn-tcは層間のメモリアクセスを最小化し、さらにパフォーマンスを向上させるために、任意の精度のnn設計を具体化する。大規模な評価の結果、APNN-TCはCUTLASSカーネルやResNetやVGGといったNNモデルよりも大幅に高速化できることがわかった。

Over the years, accelerating neural networks with quantization has been widely studied. Unfortunately, prior efforts with diverse precisions (e.g., 1-bit weights and 2-bit activations) are usually restricted by limited precision support on GPUs (e.g., int1 and int4). To break such restrictions, we introduce the first Arbitrary Precision Neural Network framework (APNN-TC) to fully exploit quantization benefits on Ampere GPU Tensor Cores. Specifically, APNN-TC first incorporates a novel emulation algorithm to support arbitrary short bit-width computation with int1 compute primitives and XOR/AND Boolean operations. Second, APNN-TC integrates arbitrary precision layer designs to efficiently map our emulation algorithm to Tensor Cores with novel batching strategies and specialized memory organization. Third, APNN-TC embodies a novel arbitrary precision NN design to minimize memory access across layers and further improve performance. Extensive evaluations show that APNN-TC can achieve significant speedup over CUTLASS kernels and various NN models, such as ResNet and VGG.
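
The emulation idea, building a multi-bit dot product out of 1-bit operations, can be illustrated with a generic bit-plane decomposition. The sketch below handles unsigned integers only and uses just AND plus a population count; it is an assumption-laden toy, not the paper's algorithm or its Tensor Core kernels, and the XOR case mentioned in the abstract is not covered. Function names are hypothetical.

```python
def bit_planes(values, bits):
    """Split unsigned integers into `bits` one-bit planes, each packed into one Python int."""
    planes = []
    for b in range(bits):
        plane = 0
        for i, v in enumerate(values):
            plane |= ((v >> b) & 1) << i  # bit b of element i goes to position i
        planes.append(plane)
    return planes


def emulated_dot(weights, activations, w_bits, a_bits):
    """Dot product of unsigned vectors using only 1-bit AND + popcount, scaled by powers of two."""
    total = 0
    for p, w_plane in enumerate(bit_planes(weights, w_bits)):
        for q, a_plane in enumerate(bit_planes(activations, a_bits)):
            # popcount(w_plane AND a_plane) is a 1-bit dot product; its weight is 2^(p+q).
            total += bin(w_plane & a_plane).count("1") << (p + q)
    return total


# 1-bit weights with 2-bit activations, as in the abstract's example precisions.
weights = [1, 0, 1, 1]
activations = [3, 2, 0, 1]
result = emulated_dot(weights, activations, w_bits=1, a_bits=2)
reference = sum(w * a for w, a in zip(weights, activations))
```

Each pair of bit planes contributes one 1-bit dot product, so a w-bit by a-bit product costs w times a such primitives; that trade-off is what makes short bit-widths attractive.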

Translated: 2021-06-24 15:27:44 Published: 2021-06-23

# 雲除去のためのセンチネル-1とセンチネル-2時空間データ融合

Sentinel-1 and Sentinel-2 Spatio-Temporal Data Fusion for Clouds Removal ( http://arxiv.org/abs/2106.12226v1 )

License: see the linked page

Alessandro Sebastianelli, Artur Nowakowski, Erika Puglisi, Maria Pia Del Rosso, Jamila Mifdal, Fiora Pirri, Pierre Philippe Mathieu and Silvia Liberata Ullo

(参考訳) 空間的にも時間的にも多数の雲は、光学画像を用いたリモートセンシングアプリケーションを困難または不可能にすることがしばしばある。本研究では,sentinel-1とsentinel-2の時系列データから抽出した時空間的特徴を融合するために,3つの深層ニューラルネットワークを結合した合同データ融合パラダイムに基づいて,雲分解光画像復元法を提案する。コードとデータセットの両方がスクラッチから実装され、さらなる分析と調査のために興味のある研究に利用可能であることは注目に値する。

The abundance of clouds, in both space and time, often makes remote sensing applications with optical images difficult or even impossible. In this manuscript, a novel method for cloud-corrupted optical image restoration is presented, based on a joint data fusion paradigm, where three deep neural networks are combined in order to fuse spatio-temporal features extracted from Sentinel-1 and Sentinel-2 time series of data. It is worth highlighting that both the code and the dataset have been implemented from scratch and made available to interested researchers for further analysis and investigation.

Translated: 2021-06-24 15:27:22 Published: 2021-06-23

# ADAVI:ピラミッドベイズモデルに適用された2値補正変分自動推定

ADAVI: Automatic Dual Amortized Variational Inference Applied To Pyramidal Bayesian Models ( http://arxiv.org/abs/2106.12248v1 )

License: see the linked page

Louis Rouillard (PARIETAL, Inria, CEA), Demian Wassermann (PARIETAL, Inria, CEA)

(参考訳) しばしば、人口調査は階層ベイズモデル(HBM)で表されるピラミッド的に組織化されたデータで表される。これらのモデルは、ニューロイメージングのような設定では違法に大きくなり、サンプルは6万の脳位置で測定された機能的なMRI信号からなり、4回の測定セッションにまたがり、少なくとも10人の被験者からなる。 300の脳の特定の皮質領域の縮小例でさえ、約100万のパラメータが特徴であり、シミュレーションベース推論(SBI)のような現代的な密度推定技術の使用を妨げる。この課題のクラスにおいて,パラメータの後方分布を推定するために,ターゲットHBMに双対な変動族を自動生成する手法を考案した。ニューラルネットワークとして表現されるこのVariatonal familyは、注意に基づく階層エンコーダの組み合わせによって、要約統計を正規化フローの集合に供給する。我々のニューラルネットワークはプレート強化HBMの交換性を利用してパラメータ空間を分解する。結果として得られるアーキテクチャは、表現性を維持しながら、典型的なSBI表現に関するパラメータ化を桁違いに削減する。トレーニングが完了すれば,パラメータの完全後部を計算するために,新しいデータサンプルに容易に適用することができる。シミュレーションデータにおける本手法の有効性を実証するとともに,高次元脳解析実験を行った。また、SBI技術と構造化変分推論の共通点にあるいくつかの質問も開きます。

Frequently, population studies feature pyramidally-organized data represented using Hierarchical Bayesian Models (HBM) enriched with plates. These models can become prohibitively large in settings such as neuroimaging, where a sample is composed of a functional MRI signal measured on 64 thousand brain locations, across 4 measurement sessions, and at least tens of subjects. Even a reduced example on a specific cortical region of 300 brain locations features around 1 million parameters, hampering the usage of modern density estimation techniques such as Simulation-Based Inference (SBI). To infer parameter posterior distributions in this challenging class of problems, we designed a novel methodology that automatically produces a variational family dual to a target HBM. This variational family, represented as a neural network, consists of an attention-based hierarchical encoder feeding summary statistics to a set of normalizing flows. Our automatically-derived neural network exploits exchangeability in the plate-enriched HBM and factorizes its parameter space. The resulting architecture reduces its parameterization by orders of magnitude relative to a typical SBI representation, while maintaining expressivity. Our method performs inference on the specified HBM in an amortized setup: once trained, it can readily be applied to a new data sample to compute the parameters' full posterior. We demonstrate the capability of our method on simulated data, as well as a challenging high-dimensional brain parcellation experiment. We also open up several questions that lie at the intersection between SBI techniques and structured Variational Inference.

Translated: 2021-06-24 15:27:10 Published: 2021-06-23

# 動的変分オートエンコーダを用いた教師なし音声強調

Unsupervised Speech Enhancement using Dynamical Variational Auto-Encoders ( http://arxiv.org/abs/2106.12271v1 )

License: see the linked page

Xiaoyu Bie, Simon Leglaive, Xavier Alameda-Pineda, Laurent Girin

(参考訳) 動的変分自動エンコーダ(Dynamical Variational Auto-Encoders, DVAE)は、時系列データモデリングに特化した潜伏変数を持つ深部生成モデルのクラスである。 DVAEは、連続した観測ベクトルおよび/または遅延ベクトル間の時間的依存関係のモデリングを含む変分オートエンコーダ(VAE)の拡張と見なすことができる。従来の研究は、音声信号(スペクトログラム)モデリングにおいて、DVAEの関心と、VAEよりも優れた性能を示してきた。独立して、VAEは、訓練にクリーンでノイズの多い音声サンプルの並列データセットを必要とせず、クリーンな音声信号のみを必要とする、教師なしノイズ非依存のセットアップにおいて、音声強調に成功している。本稿では,dvaeを用いた単一チャネル非教師なし音声強調に拡張し,教師なし表現学習とダイナミクスモデリングの両方を利用する。我々は,最も一般的なdvaesに基づく教師なし音声強調アルゴリズムを提案し,フレームワークの汎用性を説明するために3つのdvaeモデルに適応させた。より正確には、DVAEに基づく先行音声を非負行列分解に基づく雑音モデルと組み合わせ、変動予測最大化(VEM)アルゴリズムを導出し、音声強調を行う。実験の結果,DVAEsに基づく提案手法は,VAEと教師付き音声強調ベースラインよりも優れていた。

Dynamical variational auto-encoders (DVAEs) are a class of deep generative models with latent variables, dedicated to time series data modeling. DVAEs can be considered as extensions of the variational autoencoder (VAE) that include the modeling of temporal dependencies between successive observed and/or latent vectors in data sequences. Previous work has shown the interest of DVAEs and their better performance over the VAE for speech signal (spectrogram) modeling. Independently, the VAE has been successfully applied to speech enhancement in noise, in an unsupervised noise-agnostic set-up that does not require the use of a parallel dataset of clean and noisy speech samples for training, but only requires clean speech signals. In this paper, we extend those works to DVAE-based single-channel unsupervised speech enhancement, hence exploiting both unsupervised representation learning of speech signals and dynamics modeling. We propose an unsupervised speech enhancement algorithm based on the most general form of DVAEs, which we then adapt to three specific DVAE models to illustrate the versatility of the framework. More precisely, we combine DVAE-based speech priors with a noise model based on nonnegative matrix factorization, and we derive a variational expectation-maximization (VEM) algorithm to perform speech enhancement. Experimental results show that the proposed approach based on DVAEs outperforms its VAE counterpart and a supervised speech enhancement baseline.

Translated: 2021-06-24 15:26:46 Published: 2021-06-23

# 心血管深層学習による左室肥大の高スループット表現型化

High-Throughput Precision Phenotyping of Left Ventricular Hypertrophy with Cardiovascular Deep Learning ( http://arxiv.org/abs/2106.12511v1 )

License: see the linked page

Grant Duffy, Paul P Cheng, Neal Yuan, Bryan He, Alan C. Kwan, Matthew J. Shun-Shin, Kevin M. Alexander, Joseph Ebinger, Matthew P. Lungren, Florian Rader, David H. Liang, Ingela Schnittger, Euan A. Ashley, James Y. Zou, Jignesh Patel, Ronald Witteles, Susan Cheng, David Ouyang

(参考訳) 左室肥大 (LVH) は、高血圧、大動脈狭窄症、肥大型心筋症、心アミロイドーシスなど、幅広い全身・心血管疾患による慢性リモデリングによるものである。 LVHの早期検出と評価は、患者のケアに大きな影響を及ぼすが、肥大の低認識、測定誤差と変動性、LVHの差別化の難しさによって制限される。この課題を克服するために、人間の専門家に匹敵する精度で心室肥大を自動的に定量化し、LVHのエチオロジーを予測するディープラーニングワークフローであるEchoNet-LVHを提案する。 28,201心エコービデオを用いて心内膜厚(平均絶対誤差[MAE]1.4mm,95%CI1.2-1.5mm),左室径(MAE 2.4mm,95%CI 2.2-2.6mm),後壁厚(MAE 1.2mm,95%CI 1.1-1.3mm)を正確に測定し,心アミロイドーシスと肥大型心筋症(AUC 0.98)をLVHの他の病因から分類した。内外の医療システムからの外部データセットでは、EchoNet-LVHは心室パラメータ(それぞれ0.96と0.90)を正確に定量し、心アミロイドーシス(AUC 0.79)と肥大型心筋症(AUC 0.89)を検出した。複数の心拍数を測定することで、LV形状の微妙な変化とその因果関係をより正確に識別することができる。人間の専門家と比較して、EchoNet-LVHは完全に自動化されており、再現可能で正確な測定が可能であり、心臓肥大の正確な診断の基礎となっている。さらなるイノベーションを促進するためのリソースとして,23,212の注釈付き心エコービデオの大規模なデータセットを公開しています。

Left ventricular hypertrophy (LVH) results from chronic remodeling caused by a broad range of systemic and cardiovascular diseases, including hypertension, aortic stenosis, hypertrophic cardiomyopathy, and cardiac amyloidosis. Early detection and characterization of LVH can significantly impact patient care but is limited by under-recognition of hypertrophy, measurement error and variability, and difficulty differentiating etiologies of LVH. To overcome this challenge, we present EchoNet-LVH, a deep learning workflow that automatically quantifies ventricular hypertrophy with precision equal to human experts and predicts etiology of LVH. Trained on 28,201 echocardiogram videos, our model accurately measures intraventricular wall thickness (mean absolute error [MAE] 1.4mm, 95% CI 1.2-1.5mm), left ventricular diameter (MAE 2.4mm, 95% CI 2.2-2.6mm), and posterior wall thickness (MAE 1.2mm, 95% CI 1.1-1.3mm) and classifies cardiac amyloidosis (area under the curve of 0.83) and hypertrophic cardiomyopathy (AUC 0.98) from other etiologies of LVH. In external datasets from independent domestic and international healthcare systems, EchoNet-LVH accurately quantified ventricular parameters (R² of 0.96 and 0.90, respectively) and detected cardiac amyloidosis (AUC 0.79) and hypertrophic cardiomyopathy (AUC 0.89) on the domestic external validation site. Leveraging measurements across multiple heartbeats, our model can more accurately identify subtle changes in LV geometry and its causal etiologies. Compared to human experts, EchoNet-LVH is fully automated, allowing for reproducible, precise measurements, and lays the foundation for precision diagnosis of cardiac hypertrophy. As a resource to promote further innovation, we also make publicly available a large dataset of 23,212 annotated echocardiogram videos.

Translated: 2021-06-24 15:26:06 Published: 2021-06-23

# 切替トークンを用いた複数音声テキスト変換タスクのゼロショットジョイントモデリング

Zero-Shot Joint Modeling of Multiple Spoken-Text-Style Conversion Tasks using Switching Tokens ( http://arxiv.org/abs/2106.12131v1 )

License: see the linked page

Mana Ihori, Naoki Makishima, Tomohiro Tanaka, Akihiko Takashima, Shota Orihashi, Ryo Masumura

(参考訳) 本稿では,一致したデータセットを作成することなく,句読取復元や不規則削除といった複数のスタイル変換モジュールを同時に実行可能な,音声文型変換手法を提案する。実際には、自動音声認識システムによって生成された文字は、多くの不一致を含むことが多く、句読点を含まないため、読めない。可読性を向上させるために、単一の変換タスクを個別にモデル化する複数の音声テキストスタイルの変換モジュールがカスケードされる。しかし、変換エラーの連鎖のため、カスケードはタスクの順序に対して不安定である。加えて、カスケードの計算コストは単一変換よりも高くなければならない。一致したデータセットを準備せずに複数の変換タスクを同時に実行するためには、オンオフスイッチを使用して個々の変換タスクを区別する。提案したゼロショット共同モデリングでは,複数の切替トークンを用いて個々のタスクを切り替え,ゼロショット学習アプローチを用いて同時変換を行う。ディフルエンシ除去と句読取回復の連成モデリング実験により,本手法の有効性を実証した。

In this paper, we propose a novel spoken-text-style conversion method that can simultaneously execute multiple style conversion modules such as punctuation restoration and disfluency deletion without preparing matched datasets. In practice, transcriptions generated by automatic speech recognition systems are not highly readable because they often include many disfluencies and do not include punctuation marks. To improve their readability, multiple spoken-text-style conversion modules that individually model a single conversion task are cascaded because matched datasets that simultaneously handle multiple conversion tasks are often unavailable. However, the cascading is unstable against the order of tasks because of the chain of conversion errors. Besides, the computation cost of the cascading must be higher than the single conversion. To execute multiple conversion tasks simultaneously without preparing matched datasets, our key idea is to distinguish individual conversion tasks using the on-off switch. In our proposed zero-shot joint modeling, we switch the individual tasks using multiple switching tokens, enabling us to utilize a zero-shot learning approach to executing simultaneous conversions. Our experiments on joint modeling of disfluency deletion and punctuation restoration demonstrate the effectiveness of our method.
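
A minimal sketch of the switching-token idea follows: one on/off token per conversion task is attached to the input so that a single model can be told which conversions to apply. The token spelling, the task names, and the choice to prepend the tokens to the input are all assumptions made for illustration; the abstract does not specify where or how the tokens are fed to the model.

```python
def add_switching_tokens(tokens, tasks):
    """Prepend one on/off switching token per task to a token sequence."""
    prefix = [f"[{name}:{'on' if enabled else 'off'}]" for name, enabled in tasks.items()]
    return prefix + list(tokens)


# Hypothetical request: delete disfluencies but leave punctuation untouched.
tasks = {"disfluency_deletion": True, "punctuation_restoration": False}
model_input = add_switching_tokens(["uh", "hello"], tasks)
```

With this encoding, training examples for each single task set exactly one switch to on, while at inference time several switches can be turned on together, which is the zero-shot combination the abstract describes.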

Translated: 2021-06-24 15:25:13 Published: 2021-06-23

# 強化学習に基づく対話型イベント抽出による議論関係の活用

Reinforcement Learning-based Dialogue Guided Event Extraction to Exploit Argument Relations ( http://arxiv.org/abs/2106.12384v1 )

License: see the linked page

Qian Li, Hao Peng, Jianxin Li, Yuanxing Ning, Lihong Wang, Philip S. Yu, Zheng Wang

(参考訳) イベント抽出は自然言語処理の基本的なタスクである。イベント参加者のようなイベント引数の役割を見つけることは、イベント抽出に不可欠である。しかし、実生活におけるイベント記述のために行うことは、議論の役割が異なる状況でしばしば異なるため、難しい。複数の引数間の関係と相互作用は引数の役割を解決するのに有用であるが、そのような情報は既存のアプローチでは無視されている。本稿では,イベント引数の関係を明示的に活用し,イベント抽出のためのより良い手法を提案する。タスク指向対話システムによってこれを実現できる。議論関係をモデル化するために,強化学習とインクリメンタル学習を用い,マルチターン反復プロセスを通じて複数の引数を抽出する。提案手法では,すでに抽出された同一文の引数の知識を活用して,個別に決定しにくい議論の役割を決定する。その後、新たに取得した情報を使用して、以前抽出された議論の判断を改善する。この双方向フィードバックプロセスにより、議論関係を利用して議論の役割を効果的に解決し、文理解とイベント抽出を改善することができる。実験の結果,提案手法は,イベントの分類や引数の役割,引数の識別において,7つの最先端イベント抽出手法を一貫して上回っていることがわかった。

Event extraction is a fundamental task for natural language processing. Finding the roles of event arguments like event participants is essential for event extraction. However, doing so for real-life event descriptions is challenging because an argument's role often varies in different contexts. While the relationship and interactions between multiple arguments are useful for settling the argument roles, such information is largely ignored by existing approaches. This paper presents a better approach for event extraction by explicitly utilizing the relationships of event arguments. We achieve this through a carefully designed task-oriented dialogue system. To model the argument relation, we employ reinforcement learning and incremental learning to extract multiple arguments via a multi-turn, iterative process. Our approach leverages knowledge of the already extracted arguments of the same sentence to determine the role of arguments that would be difficult to decide individually. It then uses the newly obtained information to improve the decisions of previously extracted arguments. This two-way feedback process allows us to exploit the argument relations to effectively settle argument roles, leading to better sentence understanding and event extraction. Experimental results show that our approach consistently outperforms seven state-of-the-art event extraction methods for event classification, argument role classification, and argument identification.

翻訳日:2021-06-24 15:24:55 公開日:2021-06-23

# 音声自動スコアリングのためのディープニューラル専門家の混合

Mixtures of Deep Neural Experts for Automated Speech Scoring ( http://arxiv.org/abs/2106.12475v1 )

ライセンス: Link先を確認

Sara Papi, Edmondo Trentin, Roberto Gretter, Marco Matassoni, Daniele Falavigna

(参考訳) 本論文は,言語学習者の音声応答からテストプロンプトに対する第二言語能力の自動評価の課題に対処する。このタスクは、コンピュータ支援言語学習の分野に大きく関係している。本論文で提示されたアプローチは,(1)音声対話のテキスト書き起こしを生成する自動音声認識システム,(2)書き起こしを熟練度クラスに分類する深層学習者に基づく多重分類システムという,2つの異なるモジュールに依存している。異なるディープニューラルネットワークアーキテクチャ(フィードフォワードとリカレントの両方)は、参照文法、確率言語モデルの結果、複数の単語埋め込み、2つのbag-of-wordsモデルという観点でテキストの多様な表現に特化している。個々の分類器の組み合わせは、確率的擬似結合モデルまたはニューラルな専門家混合モデルを介して実現される。第3回Spoken CALL Shared Task Challengeのデータを用いて,3つの一般的な評価指標において,現在までの最高値を得た。

The paper addresses the task of automatic assessment of second language proficiency from the language learners' spoken responses to test prompts. The task has significant relevance to the field of computer assisted language learning. The approach presented in the paper relies on two separate modules: (1) an automatic speech recognition system that yields text transcripts of the spoken interactions involved, and (2) a multiple classifier system based on deep learners that ranks the transcripts into proficiency classes. Different deep neural network architectures (both feed-forward and recurrent) are specialized over diverse representations of the texts in terms of: a reference grammar, the outcome of probabilistic language models, several word embeddings, and two bag-of-words models. Combination of the individual classifiers is realized either via a probabilistic pseudo-joint model, or via a neural mixture of experts. Using the data of the third Spoken CALL Shared Task challenge, the highest values to date were obtained in terms of three popular evaluation metrics.

翻訳日:2021-06-24 15:24:37 公開日:2021-06-23
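
The combination step described above (a neural mixture of experts over individual classifiers) can be illustrated with a minimal sketch. This is not the authors' implementation: the expert outputs, the gate scores, and the function names `softmax` and `mixture_of_experts` are hypothetical, and in a real system a gating network would compute the gate scores from the input.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of real-valued scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def mixture_of_experts(expert_probs, gate_scores):
    """Combine per-expert class distributions using softmax gating weights."""
    weights = softmax(gate_scores)
    n_classes = len(expert_probs[0])
    return [
        sum(w * probs[c] for w, probs in zip(weights, expert_probs))
        for c in range(n_classes)
    ]

# Hypothetical outputs of three experts over three proficiency classes.
expert_probs = [
    [0.7, 0.2, 0.1],
    [0.5, 0.4, 0.1],
    [0.2, 0.3, 0.5],
]
# Illustrative gate scores (assumed values, not taken from the paper).
gate_scores = [2.0, 1.0, 0.0]

combined = mixture_of_experts(expert_probs, gate_scores)
predicted_class = max(range(len(combined)), key=combined.__getitem__)
```

Because the gating weights sum to one and each expert outputs a distribution, the combined vector is again a valid distribution over classes.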

# LegoFormer:マルチビュー3D再構築のためのトランスフォーマー

LegoFormer: Transformers for Block-by-Block Multi-view 3D Reconstruction ( http://arxiv.org/abs/2106.12102v1 )

ライセンス: Link先を確認

Farid Yagubbayli, Alessio Tonioni, Federico Tombari

(参考訳) 現代のディープラーニングベースの多視点3D再構成技術のほとんどは、RNNまたは融合モジュールを使用して、エンコード後の複数の画像からの情報を組み合わせている。これら2つのステップは疎結合であり、各ビューをエンコーディングしている間に利用可能なすべての情報を考慮しない。 legoformerは,単一のフレームワークでオブジェクトの再構成を統一し,その分解因子によって再構成された占有グリッドをパラメータ化するトランスフォーマモデルである。この再構成により、オブジェクトを独立した構造の集合として予測し、最終的な再構成を得ることができる。 shapenet上で行った実験では,最先端の手法に関して,ネットワークの競合性能を示す。また,自己注意の使用がモデル出力の解釈可能性の向上につながることを示す。

Most modern deep learning-based multi-view 3D reconstruction techniques use RNNs or fusion modules to combine information from multiple images after encoding them. These two separate steps have loose connections and do not consider all available information while encoding each view. We propose LegoFormer, a transformer-based model that unifies object reconstruction under a single framework and parametrizes the reconstructed occupancy grid by its decomposition factors. This reformulation allows the prediction of an object as a set of independent structures then aggregated to obtain the final reconstruction. Experiments conducted on ShapeNet display the competitive performance of our network with respect to the state-of-the-art methods. We also demonstrate how the use of self-attention leads to increased interpretability of the model output.

翻訳日:2021-06-24 15:22:50 公開日:2021-06-23
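
To make the idea of parametrizing an occupancy grid by decomposition factors more concrete, the sketch below builds each block as the outer product of three vectors and sums the blocks. It assumes rank-1 blocks and a clipped sum as the aggregation rule; both choices, the helper names, and the toy factors are illustrative assumptions rather than details confirmed by the abstract.

```python
def rank_one_block(x, y, z):
    """Outer product of three vectors, giving a len(x) x len(y) x len(z) grid."""
    return [[[xi * yj * zk for zk in z] for yj in y] for xi in x]

def aggregate_blocks(blocks):
    """Sum blocks element-wise, then clip each voxel to [0, 1]."""
    first = blocks[0]
    nx, ny, nz = len(first), len(first[0]), len(first[0][0])
    grid = [[[0.0] * nz for _ in range(ny)] for _ in range(nx)]
    for block in blocks:
        for i in range(nx):
            for j in range(ny):
                for k in range(nz):
                    grid[i][j][k] += block[i][j][k]
    return [[[min(1.0, max(0.0, v)) for v in row] for row in plane] for plane in grid]

# Two hypothetical factor triplets on a 2x2x2 grid.
factors = [
    ([1.0, 0.0], [1.0, 0.0], [1.0, 1.0]),  # fills voxels (0, 0, 0) and (0, 0, 1)
    ([0.0, 1.0], [0.0, 1.0], [0.5, 0.0]),  # half-fills voxel (1, 1, 0)
]
occupancy = aggregate_blocks([rank_one_block(*f) for f in factors])
```

Each block needs only three short vectors instead of a full voxel grid, which is the appeal of predicting factors rather than voxels directly.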

# 医用ボリュームとシーケンスのセグメンテーションのためのブートストラップ表現学習

Bootstrap Representation Learning for Segmentation on Medical Volumes and Sequences ( http://arxiv.org/abs/2106.12153v1 )

ライセンス: Link先を確認

Zejian Chen, Wei Zhuo, Tianfu Wang, Wufeng Xue and Dong Ni

(参考訳) 本研究では,アノテーションが限られた医用ボリュームおよびシーケンスのセグメンテーションのための,新規で簡潔な手法を提案する。手間のかかるアノテーション作業を避けるため、自己教師付き学習(SSL)の最近の成功は、ラベルなしデータでの事前学習を動機付けている。その成功にもかかわらず、局所的な意味的識別のマイニングが欠けており、ボリュームやシーケンスの構造もほとんど活用されていないため、一般的なSSLメソッドをボリューム/シーケンスセグメンテーションに適応することは依然として困難である。スライス/フレーム間の連続性とボリューム/シーケンス間の臓器の共通空間配置に基づいて,隣接するスライスの予測可能性を活用したブートストラップ自己教師付き表現学習手法を提案する。本手法の核心は,局所表現の予測に対する単純で分かりやすい密な自己教師と,グローバルコンテキストに基づく局所表現予測戦略であり,ボリューム間のグローバル表現マイニングと局所表現マイニングの両方に対して安定かつ信頼性の高い監督を可能にする。具体的には,まず注意誘導型予測器を備えた非対称ネットワークを提案し,ボリューム/シーケンス内およびボリューム/シーケンス間のスライスに対して距離特異的な予測と監督を行う。次に,表現の一貫性を高めるために,新しいプロトタイプベースの前景・背景キャリブレーションモジュールを導入する。2つの部分はラベル付きおよびラベルなしのデータに基づいて共同で訓練される。医用ボリュームとシーケンスの3つのベンチマークデータセットで評価すると、ACDCでは4.5% DSC、Prostateでは1.7%、CAMUSでは2.3%という大きなマージンで既存の手法を上回っている。集中的な評価により,本手法の有効性と優位性が明らかになった。

In this work, we propose a novel, straightforward method for medical volume and sequence segmentation with limited annotations. To avert laborious annotation, the recent success of self-supervised learning (SSL) motivates pre-training on unlabeled data. Despite its success, it is still challenging to adapt typical SSL methods to volume/sequence segmentation, because they lack mining of local semantic discrimination and rarely exploit volume and sequence structures. Based on the continuity between slices/frames and the common spatial layout of organs across volumes/sequences, we introduce a novel bootstrap self-supervised representation learning method that leverages the predictability of neighboring slices. At the core of our method is a simple and straightforward dense self-supervision on the predictions of local representations and a strategy of predicting local representations based on global context, which enables stable and reliable supervision for both global and local representation mining among volumes. Specifically, we first propose an asymmetric network with an attention-guided predictor to enforce distance-specific prediction and supervision on slices within and across volumes/sequences. Secondly, we introduce a novel prototype-based foreground-background calibration module to enhance representation consistency. The two parts are trained jointly on labeled and unlabeled data. When evaluated on three benchmark datasets of medical volumes and sequences, our model outperforms existing methods by a large margin of 4.5% DSC on ACDC, 1.7% on Prostate, and 2.3% on CAMUS. Intensive evaluations reveal the effectiveness and superiority of our method.

翻訳日:2021-06-24 15:22:39 公開日:2021-06-23

# 視覚に基づく豚の新規選好の行動認識

Vision-based Behavioral Recognition of Novelty Preference in Pigs ( http://arxiv.org/abs/2106.12181v1 )

ライセンス: Link先を確認

Aniket Shirke, Rebecca Golden, Mrinal Gautam, Angela Green-Miller, Matthew Caesar, Ryan N. Dilger

(参考訳) 研究データの行動スコアリングは、ドメイン固有のメトリクスを抽出するために重要であるが、人間の労働力を用いて膨大な量の情報を分析する能力にボトルネックがある。ディープラーニングは、このボトルネックを緩和するための重要な進歩と見なされている。我々は,手動スコアリングのプロセスを緩和するために,ディープラーニングを活用できる分野を1つ同定する。新規嗜好のパラダイムはブタの認知記憶の研究に広く用いられているが、これらのビデオの分析には人間の介入が必要である。ブタの行動とキーポイントを完全に注釈付けした 'Pig Novelty Preference Behavior' (PNPB) データセットの形式で,このようなビデオのサブセットを紹介する。本データセットにおける最先端の行動認識モデルの適用例を示すために,様々な分析指標に基づいてlrcn,c3d,tsmを比較し,モデルの落とし穴について考察する。豚の行動推定における平均精度は93%,平均精度は96%であった。コードと注釈付きデータセットをhttps://github.com/AIFARMS/NOR-behavior-recognitionでオープンソース化しました。

Behavioral scoring of research data is crucial for extracting domain-specific metrics but is bottlenecked on the ability to analyze enormous volumes of information using human labor. Deep learning is widely viewed as a key advancement to relieve this bottleneck. We identify one such domain, where deep learning can be leveraged to alleviate the process of manual scoring. Novelty preference paradigms have been widely used to study recognition memory in pigs, but analysis of these videos requires human intervention. We introduce a subset of such videos in the form of the 'Pig Novelty Preference Behavior' (PNPB) dataset that is fully annotated with pig actions and keypoints. In order to demonstrate the application of state-of-the-art action recognition models on this dataset, we compare LRCN, C3D, and TSM on the basis of various analytical metrics and discuss common pitfalls of the models. Our methods achieve an accuracy of 93% and a mean Average Precision of 96% in estimating piglet behavior. We open-source our code and annotated dataset at https://github.com/AIFARMS/NOR-behavior-recognition

翻訳日:2021-06-24 15:22:12 公開日:2021-06-23

# 識別指向マップを用いたリアルタイムインスタンス分割

Real-time Instance Segmentation with Discriminative Orientation Maps ( http://arxiv.org/abs/2106.12204v1 )

ライセンス: Link先を確認

Wentao Du, Zhiyu Xiang, Shuya Chen, Chengyu Qiao, Yiman Chen and Tingming Bai

(参考訳) インスタンスのセグメンテーションは近年かなり進歩していますが、リアルタイムパフォーマンスで高精度なアルゴリズムを設計することは依然として課題です。本稿では,OrienMaskと呼ばれるリアルタイムインスタンスセグメンテーションフレームワークを提案する。一段物検出器YOLOv3では、マスクヘッドが追加され、前景および背景画素の両方の空間オフセットベクトルとして明示的に定義される識別向きマップが予測される。方位マップの識別能力のおかげで、余分なフォアグラウンドセグメンテーションを必要とせずにマスクを復元できる。同じアンカーサイズにマッチするすべてのインスタンスは、共通の向きマップを共有する。この特別な共有戦略は、マスクの粒度が失われることなく、マスク予測のメモリ使用量を削減する。 NMS後に残るボックス予測を考慮すれば、インスタンスマスクは複雑さの低い対応する向きマップから同時に構築することができる。マスク表現の簡潔な設計とアンカーベースオブジェクト検出器との効果的な統合により,本手法は競争精度を維持しつつ,リアルタイム条件下での精度が確保できる。 COCOベンチマークの実験では、OrienMaskは1つのRTX 2080 Tiで評価された42.7 fpsの速度で34.8マスクAPを達成した。コードはhttps://github.com/duwt/orienmaskで入手できる。

Although instance segmentation has made considerable advancement over recent years, it is still a challenge to design high-accuracy algorithms with real-time performance. In this paper, we propose a real-time instance segmentation framework termed OrienMask. Upon the one-stage object detector YOLOv3, a mask head is added to predict some discriminative orientation maps, which are explicitly defined as spatial offset vectors for both foreground and background pixels. Thanks to the discrimination ability of orientation maps, masks can be recovered without the need for extra foreground segmentation. All instances that match with the same anchor size share a common orientation map. This special sharing strategy reduces the amortized memory utilization for mask predictions without loss of mask granularity. Given the surviving box predictions after NMS, instance masks can be concurrently constructed from the corresponding orientation maps with low complexity. Owing to the concise design for mask representation and its effective integration with the anchor-based object detector, our method is qualified under real-time conditions while maintaining competitive accuracy. Experiments on the COCO benchmark show that OrienMask achieves 34.8 mask AP at the speed of 42.7 fps evaluated with a single RTX 2080 Ti. The code is available at https://github.com/duwt/OrienMask.

翻訳日:2021-06-24 15:21:56 公開日:2021-06-23

# 相互情報に基づくFew-Shot分類

Mutual-Information Based Few-Shot Classification ( http://arxiv.org/abs/2106.12252v1 )

ライセンス: Link先を確認

Malik Boudiaf, Ziko Imtiaz Masud, Jérôme Rony, Jose Dolz, Ismail Ben Ayed, Pablo Piantanida

(参考訳) 数ショット学習のためのTIM(Transductive Information Maximization)を提案する。提案手法は,与えられた数ショットタスクにおけるクエリ特徴とラベル予測との相互情報を最大化し,サポートセットに基づく教師あり損失と組み合わせる。我々は, 分類精度と相互情報最大化の形式的関係を導出することにより, トランスダクティブ損失の動機付けを行う。さらに、勾配に基づく最適化よりもトランスダクティブ推論を大幅に高速化し、競争力のある精度を得る新しい交互方向解法を提案する。また、Zangwillの理論と有界最適化論に基づく解法の収束解析も提供する。 TIM推論はモジュラーであり、任意のベーストレーニング特徴抽出器上で使用することができる。標準的なトランスダクティブ数ショット設定において、TIMは様々なデータセットやネットワークにまたがって最先端の手法を大きく上回っており、複雑なメタ学習手法を使わずに、ベースクラス上で単純なクロスエントロピーで訓練された固定の特徴抽出器上で使用されている。確立された数ショットベンチマークだけでなく、最近発表されたMETA-DATASETのようなランダムなタスク、ドメインシフト、より大きなクラス数を含むより困難なシナリオでも、最も性能の高い手法に対して一貫して2%から5%の精度向上を実現している。私たちのコードは https://github.com/mboudiaf/TIM で公開されています。また、META-DATASETのスタンドアロンPyTorch実装と追加のベンチマーク結果も https://github.com/mboudiaf/pytorch-meta-dataset で公開しています。

We introduce Transductive Information Maximization (TIM) for few-shot learning. Our method maximizes the mutual information between the query features and their label predictions for a given few-shot task, in conjunction with a supervision loss based on the support set. We motivate our transductive loss by deriving a formal relation between the classification accuracy and mutual-information maximization. Furthermore, we propose a new alternating-direction solver, which substantially speeds up transductive inference over gradient-based optimization, while yielding competitive accuracy. We also provide a convergence analysis of our solver based on Zangwill's theory and bound-optimization arguments. TIM inference is modular: it can be used on top of any base-training feature extractor. Following standard transductive few-shot settings, our comprehensive experiments demonstrate that TIM outperforms state-of-the-art methods significantly across various datasets and networks, while used on top of a fixed feature extractor trained with simple cross-entropy on the base classes, without resorting to complex meta-learning schemes. It consistently brings between 2% and 5% improvement in accuracy over the best performing method, not only on all the well-established few-shot benchmarks but also on more challenging scenarios, with random tasks, domain shift and larger numbers of classes, as in the recently introduced META-DATASET. Our code is publicly available at https://github.com/mboudiaf/TIM. We also publicly release a standalone PyTorch implementation of META-DATASET, along with additional benchmarking results, at https://github.com/mboudiaf/pytorch-meta-dataset.

翻訳日:2021-06-24 15:21:38 公開日:2021-06-23
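
The mutual-information term mentioned in the abstract can be sketched as the entropy of the average prediction minus the average entropy of the individual predictions. The sketch below is a plain, unweighted empirical estimate; it is not the authors' objective (which also includes a supervision loss on the support set), and the function names and toy predictions are assumptions made for illustration.

```python
import math

def entropy(p):
    """Shannon entropy (natural log) of a discrete distribution."""
    return -sum(q * math.log(q) for q in p if q > 0.0)

def empirical_mutual_information(predictions):
    """Entropy of the mean prediction minus the mean per-sample entropy."""
    n = len(predictions)
    k = len(predictions[0])
    marginal = [sum(p[c] for p in predictions) / n for c in range(k)]
    conditional = sum(entropy(p) for p in predictions) / n
    return entropy(marginal) - conditional

# Confident predictions spread evenly over both classes: the estimate is log 2.
confident_balanced = [[1.0, 0.0], [0.0, 1.0]]
# Fully uncertain predictions: the estimate is 0.
uncertain = [[0.5, 0.5], [0.5, 0.5]]

mi_high = empirical_mutual_information(confident_balanced)
mi_low = empirical_mutual_information(uncertain)
```

Maximizing this quantity rewards predictions that are individually confident while remaining balanced across classes over the query set.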

# まばらなランドマークの集合体からの深部無監督3次元人体再構築

Deep unsupervised 3D human body reconstruction from a sparse set of landmarks ( http://arxiv.org/abs/2106.12282v1 )

ライセンス: Link先を確認

Meysam Madadi and Hugo Bertiche and Sergio Escalera

(参考訳) 本稿では,DeepMurfと呼ばれる,まばらなランドマークの集合から人体表面を推定する,人体再構成における初の深層非教師的アプローチを提案する。欠落したランドマークを推定するためにデノナイズドオートエンコーダを適用する。次に,ランドマークからの身体関節の推定に注意モデルを適用する。最後に、身体を再構築する統計的生成モデルの回帰パラメータにカスケードネットワークを適用する。提案した損失関数セットは、教師なしの方法でネットワークをトレーニングすることができる。 4つの公開データセットの結果から,実世界のモキャップデータから人体を正確に再構築した。

In this paper, we propose the first deep unsupervised approach in human body reconstruction to estimate the body surface from a sparse set of landmarks, called DeepMurf. We apply a denoising autoencoder to estimate missing landmarks. Then we apply an attention model to estimate body joints from landmarks. Finally, a cascading network is applied to regress the parameters of a statistical generative model that reconstructs the body. Our set of proposed loss functions allows us to train the network in an unsupervised way. Results on four public datasets show that our approach accurately reconstructs the human body from real-world mocap data.

翻訳日:2021-06-24 15:21:09 公開日:2021-06-23

# Open Images V5のテキストアノテーションとYet Another Mask Text Spotter

Open Images V5 Text Annotation and Yet Another Mask Text Spotter ( http://arxiv.org/abs/2106.12326v1 )

ライセンス: Link先を確認

Ilya Krylov, Sergei Nosov, Vladislav Sovrasov

(参考訳) 大規模な人間ラベルデータセットは、高品質なディープラーニングモデルを作成する上で重要な役割を果たす。本稿では,Open Images V5データセットのテキストアノテーションについて述べる。私たちの知る限り、公開されている手作業で作成したテキストアノテーションの中では最大である。このアノテーションを用いて、Yet Another Mask Text Spotter (YAMTS) と呼ぶシンプルなMask-RCNNベースのネットワークを訓練した。このネットワークは、ICDAR2013、ICDAR2015、Total-Textデータセットにおいて競争力のある性能を達成し、場合によっては現在の最先端のアプローチを上回る。テキストスポッティングモデルのコードは https://github.com/openvinotoolkit/training_extensions で公開されている。モデルはOpenVINO形式にエクスポートでき、Intel CPUで動作する。

A large-scale human-labeled dataset plays an important role in creating high-quality deep learning models. In this paper, we present text annotation for the Open Images V5 dataset. To our knowledge, it is the largest among publicly available manually created text annotations. Using this annotation, we trained a simple Mask-RCNN-based network, referred to as Yet Another Mask Text Spotter (YAMTS), which achieves competitive performance with, and in some cases outperforms, current state-of-the-art approaches on the ICDAR2013, ICDAR2015 and Total-Text datasets. Code for the text spotting model is available online at: https://github.com/openvinotoolkit/training_extensions. The model can be exported to OpenVINO format and run on Intel CPUs.

翻訳日:2021-06-24 15:20:58 公開日:2021-06-23

# Vision Permutator: 視覚認識のための可変MLP様アーキテクチャ

Vision Permutator: A Permutable MLP-Like Architecture for Visual Recognition ( http://arxiv.org/abs/2106.12368v1 )

ライセンス: Link先を確認

Qibin Hou, Zihang Jiang, Li Yuan, Ming-Ming Cheng, Shuicheng Yan, Jiashi Feng

(参考訳) 本稿では,視覚認識のための概念的にシンプルでデータ効率のよいMLP型アーキテクチャであるVision Permutatorを提案する。平面化された空間次元に沿って空間情報を符号化する最近のMLPのようなモデルとは異なり、2次元特徴表現が持つ位置情報の重要性に着目し、Vision Permutatorは、高さと幅の次元に沿った特徴表現を線形投影で別々に符号化する。これにより、Vision Permutatorは1つの空間方向に沿った長距離依存関係をキャプチャし、他方の方向に沿った正確な位置情報を保存できる。結果として得られる位置感性出力は相互補完的な方法で集約され、興味のある対象の表現力のある表現を形成する。私たちのVision Permutatorは、畳み込みニューラルネットワーク(CNN)とビジョントランスフォーマーの手ごわい競合相手であることを示す。空間畳み込みやアテンション機構に依存せずに、Vision Permutatorは、追加の大規模なトレーニングデータ(例えばImageNet-22k)を使わずに、わずか25Mの学習可能なパラメータでImageNet上で81.5%のトップ1精度を達成する。これは、同じモデルサイズ制約の下でほとんどのCNNやビジョントランスフォーマーよりもはるかに優れている。 88Mまでスケールアップすると、83.2%のトップ1の精度に達する。本研究は,空間情報のエンコーディング方法の再考を促し,MLPのようなモデルの開発を促進することを目的としている。コードは https://github.com/Andrew-Qibin/VisionPermutator で入手できる。

In this paper, we present Vision Permutator, a conceptually simple and data efficient MLP-like architecture for visual recognition. By realizing the importance of the positional information carried by 2D feature representations, unlike recent MLP-like models that encode the spatial information along the flattened spatial dimensions, Vision Permutator separately encodes the feature representations along the height and width dimensions with linear projections. This allows Vision Permutator to capture long-range dependencies along one spatial direction and meanwhile preserve precise positional information along the other direction. The resulting position-sensitive outputs are then aggregated in a mutually complementing manner to form expressive representations of the objects of interest. We show that our Vision Permutators are formidable competitors to convolutional neural networks (CNNs) and vision transformers. Without the dependence on spatial convolutions or attention mechanisms, Vision Permutator achieves 81.5% top-1 accuracy on ImageNet without extra large-scale training data (e.g., ImageNet-22k) using only 25M learnable parameters, which is much better than most CNNs and vision transformers under the same model size constraint. When scaling up to 88M, it attains 83.2% top-1 accuracy. We hope this work could encourage research on rethinking the way of encoding spatial information and facilitate the development of MLP-like models. Code is available at https://github.com/Andrew-Qibin/VisionPermutator.

翻訳日:2021-06-24 15:20:47 公開日:2021-06-23

# Transformer Meets Convolution: Very Fine Resolution Urban Scene Imageのセマンティックセグメンテーションのためのバイラテラルアウェアネスネットワーク

Transformer Meets Convolution: A Bilateral Awareness Network for Semantic Segmentation of Very Fine Resolution Urban Scene Images ( http://arxiv.org/abs/2106.12413v1 )

ライセンス: Link先を確認

Libo Wang, Rui Li, Dongzhi Wang, Chenxi Duan, Teng Wang, Xiaoliang Meng

(参考訳) 超微細解像度(VFR)の都市景観画像からのセマンティックセグメンテーションは、自動運転、土地被覆分類、都市計画など、いくつかのアプリケーションシナリオにおいて重要な役割を果たす。しかし、VFR画像に含まれる膨大な詳細は、既存のディープラーニングアプローチの可能性を著しく制限している。さらに、スケールや物体の外観のかなりの変化は、これらのセマンティックセグメンテーション法の表現能力をさらに悪化させ、隣接する物体の混同につながる。このような課題に対処することは、リモートセンシングコミュニティにおける有望な研究分野であり、シーンレベルの景観パターン分析と意思決定への道を開く。本稿では,VFR画像の長距離関係と細粒度をフルに捉えるために,依存経路とテクスチャパスを含むバイラテラルアウェアネスネットワーク(BANet)を提案する。特に、依存関係パスはメモリ効率の良いマルチヘッド自己アテンションを備えた新しいトランスフォーマーバックボーンであるResTに基づいて実行され、テクスチャパスは積み重ねた畳み込み操作上に構築される。さらに、線形アテンション機構を使用することで、依存性機能とテクスチャ機能を効果的に融合する機能アグリゲーションモジュール(FAM)が設計されている。3つの大規模都市景観画像セグメンテーションデータセット(ISPRS Vaihingen データセット,ISPRS Potsdam データセット,UAVid データセット)で実施された大規模な実験により,BANet の有効性が示された。具体的には、UAVidデータセット上で64.6%のmIoUが達成される。

Semantic segmentation from very fine resolution (VFR) urban scene images plays a significant role in several application scenarios including autonomous driving, land cover classification, and urban planning. However, the tremendous details contained in the VFR image severely limit the potential of the existing deep learning approaches. More seriously, the considerable variations in scale and appearance of objects further deteriorate the representational capacity of those semantic segmentation methods, leading to the confusion of adjacent objects. Addressing such issues represents a promising research field in the remote sensing community, which paves the way for scene-level landscape pattern analysis and decision making. In this manuscript, we propose a bilateral awareness network (BANet) which contains a dependency path and a texture path to fully capture the long-range relationships and fine-grained details in VFR images. Specifically, the dependency path is conducted based on the ResT, a novel Transformer backbone with memory-efficient multi-head self-attention, while the texture path is built on the stacked convolution operation. Besides, using the linear attention mechanism, a feature aggregation module (FAM) is designed to effectively fuse the dependency features and texture features. Extensive experiments conducted on the three large-scale urban scene image segmentation datasets, i.e., ISPRS Vaihingen dataset, ISPRS Potsdam dataset, and UAVid dataset, demonstrate the effectiveness of our BANet. Specifically, a 64.6% mIoU is achieved on the UAVid dataset.

翻訳日:2021-06-24 15:20:21 公開日:2021-06-23

# FusionPainting:3Dオブジェクト検出のための適応注意型マルチモーダルフュージョン

FusionPainting: Multimodal Fusion with Adaptive Attention for 3D Object Detection ( http://arxiv.org/abs/2106.12449v1 )

ライセンス: Link先を確認

Shaoqing Xu, Dingfu Zhou, Jin Fang, Junbo Yin, Zhou Bin and Liangjun Zhang

(参考訳) 3Dの障害物の正確な検出は、自動運転とインテリジェントな輸送に欠かせない課題である。本研究では,2次元RGB画像と3次元点群を意味レベルで融合させて3次元物体検出タスクを強化する汎用マルチモーダル融合フレームワークFusionPaintingを提案する。具体的には、FusionPaintingフレームワークは、マルチモーダルセマンティックセグメンテーションモジュール、適応的アテンションベースのセマンティック融合モジュール、および3Dオブジェクト検出器の3つの主要モジュールで構成されている。まず、2次元および3次元セグメンテーションアプローチに基づいて、2次元画像および3次元LiDAR点群の意味情報を得る。そして、提案するアテンションベースのセマンティック融合モジュールに基づいて、異なるセンサからのセグメンテーション結果を適応的に融合する。最後に、融合されたセマンティックラベルで塗られた点群を3D検出器に送信し、3D検出結果を得る。提案フレームワークの有効性は、大規模なnuScenes検出ベンチマークにおいて3つの異なるベースラインと比較することで検証した。実験の結果,点群のみを用いる手法や、2次元セグメンテーション情報のみで塗られた点群を用いる手法に比べ,この融合戦略は検出性能を大幅に向上できることが示された。さらに、提案手法は、nuScenesテストベンチマークにおいて、他の最先端メソッドよりも優れている。

Accurate detection of obstacles in 3D is an essential task for autonomous driving and intelligent transportation. In this work, we propose a general multimodal fusion framework, FusionPainting, to fuse the 2D RGB image and 3D point clouds at a semantic level for boosting the 3D object detection task. Specifically, the FusionPainting framework consists of three main modules: a multi-modal semantic segmentation module, an adaptive attention-based semantic fusion module, and a 3D object detector. First, semantic information is obtained for 2D images and 3D Lidar point clouds based on 2D and 3D segmentation approaches. Then the segmentation results from different sensors are adaptively fused based on the proposed attention-based semantic fusion module. Finally, the point clouds painted with the fused semantic label are sent to the 3D detector for obtaining the 3D detection results. The effectiveness of the proposed framework has been verified on the large-scale nuScenes detection benchmark by comparing it with three different baselines. The experimental results show that the fusion strategy can significantly improve the detection performance compared to the methods using only point clouds, and the methods using point clouds only painted with 2D segmentation information. Furthermore, the proposed approach outperforms other state-of-the-art methods on the nuScenes testing benchmark.

翻訳日:2021-06-24 15:19:51 公開日:2021-06-23

# 視覚感情分布学習のための円構造表現

A Circular-Structured Representation for Visual Emotion Distribution Learning ( http://arxiv.org/abs/2106.12450v1 )

ライセンス: Link先を確認

Jingyuan Yang, Ji Lie, Leida Li, Xiumei Wang, and Xinbo Gao

(参考訳) 視覚感情分析(vea)は,近年,ソーシャルネットワーク上で画像共有が普及するにつれて注目を浴びている。人間の感情は曖昧で主観的であるため、単一ラベル分類タスクよりもラベル分散学習(LDL)パラダイムでVEAに取り組む方が妥当である。他のLCLタスクと異なり、心理学理論で示されるように、感情とその内固有の特徴の間に固有の関係が存在する。そこで本研究では,視覚的感情分布学習に先立つ知識を活かした,身近な円形構造表現を提案する。具体的には、まず感情圏を構築し、その中の感情状態を統一する。提案した感情圏では、感情分布は感情ベクトルで表され、3つの属性(感情の極性、感情のタイプ、感情の強さ)と2つの特性(類似性、付加性)で定義される。さらに,予測された感情ベクトルとラベル付き感情ベクトルとの相違を粗い方法でペナルティ化する新たなプログレッシブ・サークル(PC)の損失を設計し,さらに感情特異的な学習プロセスを促進させる。公開視覚感情分布データセット上での広範な実験と比較を行い,提案手法が最先端手法よりも優れていることを示す。

Visual Emotion Analysis (VEA) has attracted increasing attention recently with the prevalence of sharing images on social networks. Since human emotions are ambiguous and subjective, it is more reasonable to address VEA in a label distribution learning (LDL) paradigm rather than a single-label classification task. Different from other LDL tasks, there exist intrinsic relationships between emotions and unique characteristics within them, as demonstrated in psychological theories. Inspired by this, we propose a well-grounded circular-structured representation to utilize the prior knowledge for visual emotion distribution learning. To be specific, we first construct an Emotion Circle to unify any emotional state within it. On the proposed Emotion Circle, each emotion distribution is represented with an emotion vector, which is defined with three attributes (i.e., emotion polarity, emotion type, emotion intensity) as well as two properties (i.e., similarity, additivity). Besides, we design a novel Progressive Circular (PC) loss to penalize the dissimilarities between predicted emotion vector and labeled one in a coarse-to-fine manner, which further boosts the learning process in an emotion-specific way. Extensive experiments and comparisons are conducted on public visual emotion distribution datasets, and the results demonstrate that the proposed method outperforms the state-of-the-art methods.

翻訳日:2021-06-24 15:19:33 公開日:2021-06-23

# CharacterChat:チャットボットによる会話とプログレッシブな表現による架空のキャラクターの創造を支援する

CharacterChat: Supporting the Creation of Fictional Characters through Conversation and Progressive Manifestation with a Chatbot ( http://arxiv.org/abs/2106.12314v1 )

ライセンス: Link先を確認

Oliver Schmitt, Daniel Buschek

(参考訳) CharacterChatは、作家が架空のキャラクターを作るのを支援するコンセプトとチャットボットである。具体的には、作家は会話を通じてボットを徐々に想像上のキャラクターに変えていく。作家を対象としたキャラクター作成に関する調査(N=30)に始まり、続く2つの定性的ユーザ調査(N=7、N=8)を通じて、ユーザ中心のアプローチでCharacterChatを反復的に開発した。プロトタイプは2つのモードを組み合わせている。(1) ガイド付きプロンプトは、作家がキャラクターの属性を定義するのを支援する(例: ユーザ「あなたの名前はジェーンです。」)。これには属性の提案(例: ボット「私の主な動機は何ですか?」)や値の提案も含まれ、概念ネットワークを備えたルールベースのシステムとして実現されている。(2) チャットボットとのオープンな会話は、作家がキャラクターを探求しインスピレーションを得るのを支援し、定義済みのキャラクター属性を考慮する言語モデルで実現されている。ユーザ調査では、特にキャラクター作成の初期段階における利点と、会話能力の制限による課題が明らかになった。最後に、得られた教訓と今後の研究のアイデアを述べる。

We present CharacterChat, a concept and chatbot to support writers in creating fictional characters. Concretely, writers progressively turn the bot into their imagined character through conversation. We iteratively developed CharacterChat in a user-centred approach, starting with a survey on character creation with writers (N=30), followed by two qualitative user studies (N=7 and N=8). Our prototype combines two modes: (1) Guided prompts help writers define character attributes (e.g. User: "Your name is Jane."), including suggestions for attributes (e.g. Bot: "What is my main motivation?") and values, realised as a rule-based system with a concept network. (2) Open conversation with the chatbot helps writers explore their character and get inspiration, realised with a language model that takes into account the defined character attributes. Our user studies reveal benefits particularly for early stages of character creation, and challenges due to limited conversational capabilities. We conclude with lessons learned and ideas for future work.

翻訳日:2021-06-24 15:19:10 公開日:2021-06-23

# 模倣学習 : 進歩・分類学・機会

Imitation Learning: Progress, Taxonomies and Opportunities ( http://arxiv.org/abs/2106.12177v1 )

ライセンス: Link先を確認

Boyuan Zheng, Sunny Verma, Jianlong Zhou, Ivor Tsang, Fang Chen

(参考訳) 模倣学習(imitation learning)は、人間の専門家のデモンストレーションから知識を抽出することを目的としている。その成功は、ビデオゲーム、自律運転、ロボットシミュレーション、オブジェクト操作などの分野で実証されている。しかし、この複製プロセスは、性能がデモ品質に大きく依存するなど問題があり、ほとんどの訓練されたエージェントはタスク固有の環境でうまく機能するように制限されている。本研究では,模倣学習に関する体系的考察を行う。まず, 開発史と予備知識から得られた背景知識を紹介し, 模倣学習と分野の重要なマイルストーンの中で異なる分類法を提示する。次に,学習戦略の課題を詳述し,サブオプティマイズや音声指示,その他の関連する最適化手法から学習方針を学ぶための研究機会を提案する。

Imitation learning aims to extract knowledge from human experts' demonstrations or artificially created agents in order to replicate their behaviors. Its success has been demonstrated in areas such as video games, autonomous driving, robotic simulations and object manipulation. However, this replication process can be problematic: performance depends heavily on demonstration quality, and most trained agents perform well only in task-specific environments. In this survey, we provide a systematic review of imitation learning. We first introduce the background knowledge from development history and preliminaries, followed by presenting different taxonomies within imitation learning and key milestones of the field. We then detail challenges in learning strategies and present research opportunities in learning policies from suboptimal demonstrations, voice instructions and other associated optimization schemes.

翻訳日:2021-06-24 15:18:22 公開日:2021-06-23

# マルチバンドVAE:連続学習における知識統合のための潜在空間分割

Multiband VAE: Latent Space Partitioning for Knowledge Consolidation in Continual Learning ( http://arxiv.org/abs/2106.12196v1 )

ライセンス: Link先を確認

Kamil Deja, Paweł Wawrzyński, Daniel Marczak, Wojciech Masarczyk, Tomasz Trzciński

(参考訳) 本稿では,変分オートエンコーダの潜伏空間の分割に依存する生成モデルにおける教師なし連続的知識統合手法を提案する。従来を忘れずに新しいデータサンプルに関する知識を取得することは、継続的な学習の重要な問題である。現在提案されている手法は、既存のモデルを拡張しながら、過去のデータで劣化しないように振舞いを制約することで、この目標を達成している。本研究では,この限界を特定し,知識蓄積タスクとして継続学習の目標を実証する。我々は、異なるタスクで見られるサンプルの表現であるバンドを、それらが含む情報の類似性によって駆動する、遅延空間分割を継続的に調整することで解決する。さらに,遅延帯域に符号化された再構成の質を向上する過去のデータの制御をシンプルかつ効果的に行う方法と,知識統合を改善する潜時空間のゆがみ技術を導入する。標準の連続学習評価ベンチマークに基づいて,本手法を新たな知識統合シナリオで評価し,提案手法がすべてのテストシナリオで最大2倍の性能を発揮することを示す。

We propose a new method for unsupervised continual knowledge consolidation in generative models that relies on the partitioning of Variational Autoencoder's latent space. Acquiring knowledge about new data samples without forgetting previous ones is a critical problem of continual learning. Currently proposed methods achieve this goal by extending the existing model while constraining its behavior not to degrade on the past data, which does not exploit the full potential of relations within the entire training dataset. In this work, we identify this limitation and posit the goal of continual learning as a knowledge accumulation task. We solve it by continuously re-aligning latent space partitions that we call bands which are representations of samples seen in different tasks, driven by the similarity of the information they contain. In addition, we introduce a simple yet effective method for controlled forgetting of past data that improves the quality of reconstructions encoded in latent bands and a latent space disentanglement technique that improves knowledge consolidation. On top of the standard continual learning evaluation benchmarks, we evaluate our method on a new knowledge consolidation scenario and show that the proposed approach outperforms state-of-the-art by up to twofold across all testing scenarios.

翻訳日:2021-06-24 15:18:09 公開日:2021-06-23

# 遅延フィードバックによる学習: 勾配遅延に暗黙的に適応する

Learning Under Delayed Feedback: Implicitly Adapting to Gradient Delays ( http://arxiv.org/abs/2106.12261v1 )

ライセンス: Link先を確認

Rotem Zamir Aviv (1), Ido Hakimi (2), Assaf Schuster (2), Kfir Y. Levy (1 and 3) ((1) Department of Electrical and Computer Engineering, Technion, (2) Department of Computer Science, Technion, (3) A Viterbi Fellow)

(参考訳) 複数のマシンが共通のメモリを共有しながら並列に動作する確率的凸最適化問題を考える。本稿では,更新遅延,客観的な滑らかさ,勾配分散の事前知識に依存しない,制約付き設定のロバストなトレーニング手法を提案し,非漸近収束保証を導出する。逆に、この設定のための既存のメソッドは、クラウドやデータセンターなど、本質的にすべての共有リソース計算環境に不向きな、この事前の知識に依存している。具体的には,従来の手法ではマシンの動的割り当てによる遅延の変化に適応できないが,本手法はそのような変化に暗黙的に適応する。

We consider stochastic convex optimization problems, where several machines act asynchronously in parallel while sharing a common memory. We propose a robust training method for the constrained setting and derive non-asymptotic convergence guarantees that do not depend on prior knowledge of update delays, objective smoothness, and gradient variance. Conversely, existing methods for this setting crucially rely on this prior knowledge, which renders them unsuitable for essentially all shared-resources computational environments, such as clouds and data centers. Concretely, existing approaches are unable to accommodate changes in the delays which result from dynamic allocation of the machines, while our method implicitly adapts to such changes.

翻訳日:2021-06-24 15:17:49 公開日:2021-06-23

# マルウェア行動の説明可能な表現の学習

Learning Explainable Representations of Malware Behavior ( http://arxiv.org/abs/2106.12328v1 )

ライセンス: Link先を確認

Paul Prasse, Jan Brabec, Jan Kohout, Martin Kopp, Lukas Bajer, Tobias Scheffer

(参考訳) 我々は,ネットワークテレメトリログにおけるマルウェア識別の問題に対処し,脅威を識別する行動パターンの理解可能な説明である indicators of compromise を提供する。本システムでは,専用検出器群がネットワークフローデータを第1ステップで理解可能な network events に抽象化する。我々は、この一連のイベントを処理し、特定の脅威、マルウェアファミリー、および広範囲のマルウェアを識別するニューラルネットワークを開発した。次に、integrated-gradients メソッドを使用して、脅威の特徴的行動パターンを共同で構成するイベントをハイライトする。 CNN,LSTM,トランスフォーマーに基づくネットワークアーキテクチャを比較し,大規模テレメトリデータを用いた教師なし事前学習の有効性について検討する。本システムは,行動パターンに基づいて,njRATや他のマルウェアを検出する方法を示す。

We address the problems of identifying malware in network telemetry logs and providing indicators of compromise -- comprehensible explanations of behavioral patterns that identify the threat. In our system, an array of specialized detectors abstracts network-flow data into comprehensible network events in a first step. We develop a neural network that processes this sequence of events and identifies specific threats, malware families and broad categories of malware. We then use the integrated-gradients method to highlight events that jointly constitute the characteristic behavioral pattern of the threat. We compare network architectures based on CNNs, LSTMs, and transformers, and explore the efficacy of unsupervised pre-training experimentally on large-scale telemetry data. We demonstrate how this system detects njRAT and other malware based on behavioral patterns.
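
The integrated-gradients attribution mentioned above can be sketched on a toy model. This is a minimal illustration, assuming a logistic model with hand-written gradients in place of the paper's neural network; the weights, inputs, and function names are hypothetical. Each feature's attribution is its distance from a baseline multiplied by the average gradient along the straight path from the baseline to the input.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def model(x, w, b):
    """Toy stand-in for a detector: logistic score of a feature vector."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def model_grad(x, w, b):
    """Analytic gradient of the logistic score with respect to the input."""
    p = model(x, w, b)
    return [p * (1.0 - p) * wi for wi in w]


def integrated_gradients(x, baseline, w, b, steps=200):
    """Midpoint Riemann-sum approximation of integrated gradients."""
    avg_grad = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [bi + alpha * (xi - bi) for xi, bi in zip(x, baseline)]
        for i, g in enumerate(model_grad(point, w, b)):
            avg_grad[i] += g / steps
    return [(xi - bi) * g for xi, bi, g in zip(x, baseline, avg_grad)]


w, b = [2.0, -1.0, 0.5], -0.5
x, baseline = [1.0, 1.0, 0.0], [0.0, 0.0, 0.0]
attributions = integrated_gradients(x, baseline, w, b)
```

A useful sanity check is the completeness property: the attributions should sum, up to discretization error, to the difference between the model output at the input and at the baseline.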

翻訳日:2021-06-24 15:17:37 公開日:2021-06-23

# 正準相関解析から自己教師付きグラフニューラルネットワークへ

From Canonical Correlation Analysis to Self-supervised Graph Neural Networks ( http://arxiv.org/abs/2106.12484v1 )

ライセンス: Link先を確認

Hengrui Zhang, Qitian Wu, Junchi Yan, David Wipf, Philip S. Yu

(参考訳) 本稿では,グラフデータを用いた自己教師付き表現学習のための概念的単純かつ効果的なモデルを提案する。データ拡張を通じて入力グラフの2つのビューを生成する以前の方法に従う。しかし、インスタンスレベルの識別に焦点を当てた対照的な手法とは異なり、古典的正準相関分析に触発された革新的な特徴レベルの目標を最適化する。他の研究と比較すると、パラメータ化された相互情報推定器、追加のプロジェクタ、非対称構造、そして最も重要なのは、コストがかかる負のサンプルを必要としない。本研究の目的は,1) 不変表現を学習することで拡張不変情報を排除し,2) 異なる次元の特徴をデコレーションすることでデジェネレーションソリューションを回避できることである。また,本理論解析は,情報ボトルネック原理のインスタンス化と同等視できる新しい目的の理解を,自己教師付き設定下でも提供する。その単純さにもかかわらず、この手法は7つのパブリックグラフデータセット上で競合的に実行される。

We introduce a conceptually simple yet effective model for self-supervised representation learning with graph data. It follows previous methods that generate two views of an input graph through data augmentation. However, unlike contrastive methods that focus on instance-level discrimination, we optimize an innovative feature-level objective inspired by classical Canonical Correlation Analysis. Compared with other works, our approach requires no parameterized mutual information estimator, no additional projector, no asymmetric structures, and, most importantly, no negative samples, which can be costly. We show that the new objective essentially 1) aims at discarding augmentation-variant information by learning invariant representations, and 2) can prevent degenerate solutions by decorrelating features in different dimensions. Our theoretical analysis further provides an understanding of the new objective, which can be equivalently seen as an instantiation of the Information Bottleneck Principle under the self-supervised setting. Despite its simplicity, our method performs competitively on seven public graph datasets.
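
The two roles of the objective (invariance across views and decorrelation across feature dimensions) can be sketched as follows. This is a hedged illustration of a CCA-style feature-level loss, not the paper's exact formulation: the normalization, the weight `lam`, and all names are assumptions.

```python
import math


def standardize(z):
    """Centre each column and scale it to unit norm, so the Gram diagonal is 1."""
    n, d = len(z), len(z[0])
    out = [[0.0] * d for _ in range(n)]
    for j in range(d):
        col = [row[j] for row in z]
        mean = sum(col) / n
        norm = math.sqrt(sum((v - mean) ** 2 for v in col)) or 1.0
        for i in range(n):
            out[i][j] = (z[i][j] - mean) / norm
    return out


def decorrelation_penalty(z):
    """Squared distance between the feature Gram matrix Z^T Z and the identity."""
    d = len(z[0])
    total = 0.0
    for a in range(d):
        for b in range(d):
            g = sum(row[a] * row[b] for row in z)
            total += (g - (1.0 if a == b else 0.0)) ** 2
    return total


def cca_style_objective(view1, view2, lam=0.1):
    """Invariance term plus weighted decorrelation terms for two embedded views."""
    z1, z2 = standardize(view1), standardize(view2)
    invariance = sum((a - b) ** 2 for r1, r2 in zip(z1, z2) for a, b in zip(r1, r2))
    decorrelation = decorrelation_penalty(z1) + decorrelation_penalty(z2)
    return {"invariance": invariance, "decorrelation": decorrelation,
            "total": invariance + lam * decorrelation}
```

In this sketch, two identical views with uncorrelated features give a total of zero, while embeddings whose dimensions are copies of each other are penalized by the decorrelation term even though the invariance term is zero. No negative samples appear anywhere in the computation.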

翻訳日:2021-06-24 15:17:23 公開日:2021-06-23

# 相互監督によるマルチモーダルVAEの学習

Learning Multimodal VAEs through Mutual Supervision ( http://arxiv.org/abs/2106.12570v1 )

ライセンス: Link先を確認

Tom Joy, Yuge Shi, Philip H.S. Torr, Tom Rainforth, Sebastian M. Schmon, N. Siddharth

(参考訳) マルチモーダルVAEは、異種データ(例えば、視覚、言語)上の共同分布をモデル化し、そのようなモダリティをまたいだ共有表現も取得しようとする。先行研究は、典型的には、明示的な積、混合、または他のそのような因子化を通じて認識モデル内で直接の慣用的表現を調和させることによって、モダリティからの情報を結合する。本稿では、半教師付きVAEを再利用し、相互監督を通じて暗黙的にモダリティ間の情報を組み合わせることで、このような明示的な組み合わせを避ける新しい選択肢MEMEを紹介する。この定式化は自然に、いくつかのモダリティが完全に欠落する部分的観測されたデータから学習を可能にする。我々は,MNIST-SVHN (image-image) と CUB (image-text) のデータセットを用いた部分的および完全観察スキームにおいて,MEME が標準指標のベースラインよりも優れていることを示す。また、相互監督によって学習される表現の品質を標準的アプローチと対比し、データ間の関連性を捉えた興味深い傾向を観察する。

Multimodal VAEs seek to model the joint distribution over heterogeneous data (e.g., vision, language), whilst also capturing a shared representation across such modalities. Prior work has typically combined information from the modalities by reconciling idiosyncratic representations directly in the recognition model through explicit products, mixtures, or other such factorisations. Here we introduce a novel alternative, the MEME, that avoids such explicit combinations by repurposing semi-supervised VAEs to combine information between modalities implicitly through mutual supervision. This formulation naturally allows learning from partially-observed data where some modalities can be entirely missing -- something that most existing approaches either cannot handle, or do so to a limited extent. We demonstrate that MEME outperforms baselines on standard metrics across both partial and complete observation schemes on the MNIST-SVHN (image-image) and CUB (image-text) datasets. We also contrast the quality of the representations learnt by mutual supervision against standard approaches and observe interesting trends in its ability to capture relatedness between data.

翻訳日:2021-06-24 15:17:06 公開日:2021-06-23

# Weisfeiler and Lehman Go Cellular: CWネットワーク

Weisfeiler and Lehman Go Cellular: CW Networks ( http://arxiv.org/abs/2106.12575v1 )

ライセンス: Link先を確認

Cristian Bodnar, Fabrizio Frasca, Nina Otter, Yu Guang Wang, Pietro Liò, Guido Montúfar, Michael Bronstein

(参考訳) グラフニューラルネットワーク(GNN)は、表現力に制限があり、長距離相互作用に苦慮し、高次構造をモデル化する原則的な方法がない。これらの問題は、計算グラフと入力グラフ構造の間の強い結合によって引き起こされる。最近提案されたMessage Passing Simplicial Networksは、グラフのcliqueコンプレックスにメッセージパッシングを実行することによって、これらの要素を自然に分離する。しかしながら、これらのモデルはSimplicial Complexes (SCs) の厳密な組合せ構造によって厳しく制約されている。本研究では, SC の最近の理論的結果を, SC とグラフを柔軟にサブスムする位相対象である正則なセルコンプレックスに拡張する。この一般化は、グラフ「lifting」変換の強力なセットを提供し、それぞれにユニークな階層的メッセージパッシング手順をもたらす。 CWN(CW Networks)と呼ばれる結果の手法は、WLテストよりも厳格に強力であり、場合によっては3-WLテストに劣らない強力さを持つ。特に,そのようなスキームが分子グラフ問題に適用された場合に,環に基づく効果を示す。提案したアーキテクチャは、一般的に使用されるGNNよりも明らかに大きな表現性、高次信号のモデリングの原則、ノード間の距離の圧縮の利点がある。本モデルにより, 種々の分子データセットの最先端結果が得られた。

Graph Neural Networks (GNNs) are limited in their expressive power, struggle with long-range interactions and lack a principled way to model higher-order structures. These problems can be attributed to the strong coupling between the computational graph and the input graph structure. The recently proposed Message Passing Simplicial Networks naturally decouple these elements by performing message passing on the clique complex of the graph. Nevertheless, these models are severely constrained by the rigid combinatorial structure of Simplicial Complexes (SCs). In this work, we extend recent theoretical results on SCs to regular Cell Complexes, topological objects that flexibly subsume SCs and graphs. We show that this generalisation provides a powerful set of graph "lifting" transformations, each leading to a unique hierarchical message passing procedure. The resulting methods, which we collectively call CW Networks (CWNs), are strictly more powerful than the WL test and, in certain cases, not less powerful than the 3-WL test. In particular, we demonstrate the effectiveness of one such scheme, based on rings, when applied to molecular graph problems. The proposed architecture benefits from provably larger expressivity than commonly used GNNs, principled modelling of higher-order signals and from compressing the distances between nodes. We demonstrate that our model achieves state-of-the-art results on a variety of molecular datasets.
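
As background for the comparison with the WL test, the sketch below implements plain 1-WL colour refinement and runs it on a textbook pair of graphs: a single 6-cycle and two disjoint triangles. The code and the example graphs are illustrative assumptions and are not taken from the paper. Both graphs are 2-regular on six nodes, so 1-WL assigns them identical colour histograms, even though their rings have different sizes (6 versus 3); ring sizes are the kind of structure a ring-based lifting makes explicit.

```python
from collections import Counter


def cycle(nodes):
    """Adjacency lists of a simple cycle through the given nodes."""
    n = len(nodes)
    return {nodes[i]: [nodes[(i - 1) % n], nodes[(i + 1) % n]] for i in range(n)}


def wl_histogram(adjacency, rounds=3):
    """1-WL colour refinement; colours are nested tuples, comparable across graphs."""
    colors = {v: () for v in adjacency}
    for _ in range(rounds):
        colors = {
            v: (colors[v], tuple(sorted(colors[u] for u in adjacency[v])))
            for v in adjacency
        }
    return Counter(colors.values())


hexagon = cycle([0, 1, 2, 3, 4, 5])
two_triangles = {**cycle([0, 1, 2]), **cycle([3, 4, 5])}
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}

same_under_wl = wl_histogram(hexagon) == wl_histogram(two_triangles)
```

The path graph is included as a control: its end nodes have a different degree, so 1-WL does separate it from the hexagon.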

翻訳日:2021-06-24 15:16:45 公開日:2021-06-23

# すべてのユーザが同じではない: シーケンシャルな意思決定問題に対するパーソナライズされた説明を提供すること

Not all users are the same: Providing personalized explanations for sequential decision making problems ( http://arxiv.org/abs/2106.12207v1 )

ライセンス: Link先を確認

Utkarsh Soni, Sarath Sreedharan, Subbarao Kambhampati

(参考訳) 人間と一緒に機能する自律エージェントの設計への関心が高まっている。そのようなエージェントは間違いなく彼らの行動や決定を説明するだろう。説明の生成は積極的に研究されているトピックだが、ほとんどの作品は1つのサイズに適合する説明を生成する方法にフォーカスする傾向がある。ユーザモデルの仕様は、完全に無視されます。説明をユーザーの背景に合わせて調整する作業は、ユーザの特定のモデル(分析モデルや学習したラベル付けモデル)に依存する。本研究の目的は,エージェントが対話できるユーザの種類を学習することから始まる,エンドツーエンドの適応的説明生成システムを提案することである。そして、ターゲットユーザーとのインタラクション中に、フライ上のタイプを特定し、それに応じて説明を調整するタスクが実行される。前者はデータ駆動クラスタリング手法により実現され,後者では説明生成問題をPOMDPにコンパイルする。最先端のPOMDPソルバを用いた2つの領域におけるシステムの有用性を示す。また,人間とロボットのインタラクション設定においてパーソナライズされた説明を提供することのメリットを調査するユーザスタディの結果を報告する。

There is a growing interest in designing autonomous agents that can work alongside humans. Such agents will undoubtedly be expected to explain their behavior and decisions. While generating explanations is an actively researched topic, most works tend to focus on methods that generate one-size-fits-all explanations. That is, the specifics of the user model are completely ignored. The handful of works that look at tailoring their explanation to the user's background rely on having specific models of the users (either analytic models or learned labeling models). The goal of this work is thus to propose an end-to-end adaptive explanation generation system that begins by learning the different types of users that the agent could interact with. Then, during the interaction with the target user, it is tasked with identifying the type on the fly and adjusting its explanations accordingly. The former is achieved by a data-driven clustering approach, while for the latter we compile our explanation generation problem into a POMDP. We demonstrate the usefulness of our system on two domains using state-of-the-art POMDP solvers. We also report the results of a user study that investigates the benefits of providing personalized explanations in a human-robot interaction setting.
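
Identifying the user type on the fly amounts to maintaining a belief over types and updating it after each observed reaction. The sketch below shows only that Bayesian belief update, which is one ingredient of a POMDP formulation; it is not a POMDP solver, and the user types, observations, and probabilities are hypothetical.

```python
def update_belief(belief, likelihoods, observation):
    """Bayes rule: posterior over user types after seeing one observation."""
    unnormalized = {t: belief[t] * likelihoods[t][observation] for t in belief}
    total = sum(unnormalized.values())
    return {t: p / total for t, p in unnormalized.items()}


# Hypothetical user types and how likely each is to react in a given way.
likelihoods = {
    "novice": {"asks_followup": 0.8, "accepts": 0.2},
    "expert": {"asks_followup": 0.2, "accepts": 0.8},
}
belief = {"novice": 0.5, "expert": 0.5}
belief = update_belief(belief, likelihoods, "asks_followup")
```

An agent could then choose the explanation that is best in expectation under the current belief, which is where a POMDP solver would come in.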

翻訳日:2021-06-24 15:16:06 公開日:2021-06-23

# DeepStochLog: ニューラルネットワークの確率論理プログラミング

DeepStochLog: Neural Stochastic Logic Programming ( http://arxiv.org/abs/2106.12574v1 )

ライセンス: Link先を確認

Thomas Winters, Giuseppe Marra, Robin Manhaeve, Luc De Raedt

(参考訳) DeepProbLogのようなニューラルシンボル学習の最近の進歩は、確率論的論理プログラムをニューラル述語で拡張している。グラフィカルモデルと同様に、これらの確率論的論理プログラムは可能な世界上の確率分布を定義する。本稿では,確率論的定節文法に基づく別のニューラルネットワーク記号フレームワークであるDeepStochLogを提案する。より具体的には、ニューラル文法規則を確率論的定節文法に導入し、エンドツーエンドにトレーニング可能なフレームワークを作成する。神経確率論理プログラミングにおける推論と学習は、神経確率論理プログラムよりもはるかに優れていることを示す。さらに,DeepStochLogを用いた実験結果から,ニューラルシンボリック学習課題における最先端の成果が得られた。

Recent advances in neural symbolic learning, such as DeepProbLog, extend probabilistic logic programs with neural predicates. Like graphical models, these probabilistic logic programs define a probability distribution over possible worlds, for which inference is computationally hard. We propose DeepStochLog, an alternative neural symbolic framework based on stochastic definite clause grammars, a type of stochastic logic program, which defines a probability distribution over possible derivations. More specifically, we introduce neural grammar rules into stochastic definite clause grammars to create a framework that can be trained end-to-end. We show that inference and learning in neural stochastic logic programming scale much better than for neural probabilistic logic programs. Furthermore, the experimental evaluation shows that DeepStochLog achieves state-of-the-art results on challenging neural symbolic learning tasks.

翻訳日:2021-06-24 15:15:47 公開日:2021-06-23

# CxSE: COVID-19診断のための胸部X線スローエンコーディングCNN

CxSE: Chest X-ray Slow Encoding CNN for COVID-19 Diagnosis ( http://arxiv.org/abs/2106.12157v1 )

ライセンス: Link先を確認

Thangarajah Akilan

(参考訳) 新型コロナウイルスは指数的なペースで広がる中、私たちの日常生活を混乱させ続けている。さらなる拡散を避けるために、陽性患者を隔離するためには迅速に検出する必要がある。この研究は、'slow Encoding CNN'と呼ばれる新しい畳み込みニューラルネットワーク(CNN)アーキテクチャを提案する。提案されたモデルで最高の性能であるPPV(Positive Predictive Value)は、SP=0.67、PP=0.98、SN=0.96、PN=0.52でAI AGAINST COVID19 - 新型コロナウイルスの検査データサンプルのX線画像のスクリーニングを行う。 SP と PP は COVID-19 陽性クラスの感度と PPV を表し、PN と SN は COVID-19 陰性クラスの感度と PPV を表す。

The coronavirus continues to disrupt our everyday lives as it spreads at an exponential rate. It needs to be detected quickly in order to quarantine positive patients so as to avoid further spread. This work proposes a new convolutional neural network (CNN) architecture called 'slow Encoding CNN'. The proposed model's best performance with respect to sensitivity and positive predictive value (PPV) is SP=0.67, PP=0.98, SN=0.96, and PN=0.52 on the test data samples of the AI AGAINST COVID19 - Screening X-ray images for COVID-19 Infections competition. SP and PP stand for the sensitivity and PPV of the COVID-19 positive class, while SN and PN stand for the sensitivity and PPV of the COVID-19 negative class.
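
The four reported quantities can be read off a binary confusion matrix. The sketch below assumes that the first letter names the metric (S for sensitivity, P for PPV) and the second letter names the class (P for positive, N for negative). The counts are hypothetical and are not the competition data.

```python
def sensitivity_and_ppv(tp, fp, fn, tn):
    """Per-class sensitivity and positive predictive value from confusion counts."""
    return {
        "SP": tp / (tp + fn),  # sensitivity of the positive class (recall)
        "PP": tp / (tp + fp),  # PPV of the positive class (precision)
        "SN": tn / (tn + fp),  # sensitivity of the negative class (specificity)
        "PN": tn / (tn + fn),  # PPV of the negative class (negative predictive value)
    }


# Hypothetical counts: 10 positive cases and 10 negative cases.
metrics = sensitivity_and_ppv(tp=8, fp=1, fn=2, tn=9)
```

A high PP combined with a low PN, as in the reported figures, is what one would expect when positive predictions are rarely wrong but a sizeable share of positive cases is missed and ends up among the negative predictions.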

翻訳日:2021-06-24 15:15:32 公開日:2021-06-23

# Deformed2Self: 動的医用画像のための自己教師付きデノイジング

Deformed2Self: Self-Supervised Denoising for Dynamic Medical Imaging ( http://arxiv.org/abs/2106.12175v1 )

ライセンス: Link先を確認

Junshen Xu, Elfar Adalsteinsson

(参考訳) 画像変性は,疾患診断や下流画像解析のための画像品質を向上させるため,医用画像システムにとって非常に重要である。様々な応用において、動的イメージング技術を用いて被写体の時間変化の特徴を捉え、同じ被写体に対して異なる時間ポイントで複数の画像を取得する。各時間フレームの信号対雑音比は通常、短い取得時間によって制限されるが、異なる時間フレーム間の相関を利用して、時間フレーム間の共有情報による復調結果を改善することができる。コンピュータビジョンにおけるニューラルネットワークの成功により、教師付きディープラーニング手法は、クリーンなvsノイズのイメージペアを持つ大規模なデータセットに依存する、単一イメージの認知において顕著なパフォーマンスを示す。近年, 自己教師付き深層ノイズモデルがいくつか提案されており, クリーン画像のペアワイズ基底真理を必要とせず, 有望な結果が得られた。しかし,マルチイメージ・Denoisingの分野では,自己教師型深層学習法を用いて,複数のスライスから相関情報を抽出する作業はほとんど行われていない。本研究では,ダイナミックイメージングのためのエンドツーエンドの自己教師型ディープラーニングフレームワークDeformed2Selfを提案する。シングルイメージとマルチイメージのデノゲーションを組み合わせて画質を改善し、空間トランスフォーマーネットワークを使用して異なるスライス間の動きをモデル化する。さらに、トレーニングと推論のために異なる時間フレームでいくつかの補助的な観察を行う単一ノイズ画像のみを必要とする。ノイズ統計値の異なるファントムおよび生体内データを用いて評価したところ,本手法は他の最先端の教師なし・自己監督型復調法と同等の性能を示した。

Image denoising is of great importance for medical imaging systems, since it can improve image quality for disease diagnosis and downstream image analyses. In a variety of applications, dynamic imaging techniques are utilized to capture the time-varying features of the subject, where multiple images are acquired for the same subject at different time points. Although the signal-to-noise ratio of each time frame is usually limited by the short acquisition time, the correlation among different time frames can be exploited to improve denoising results with shared information across time frames. With the success of neural networks in computer vision, supervised deep learning methods show prominent performance in single-image denoising, but they rely on large datasets with clean-vs-noisy image pairs. Recently, several self-supervised deep denoising models have been proposed, achieving promising results without needing the pairwise ground truth of clean images. In the field of multi-image denoising, however, very little work has been done on extracting correlated information from multiple slices for denoising using self-supervised deep learning methods. In this work, we propose Deformed2Self, an end-to-end self-supervised deep learning framework for dynamic imaging denoising. It combines single-image and multi-image denoising to improve image quality and uses a spatial transformer network to model motion between different slices. Further, it only requires a single noisy image with a few auxiliary observations at different time frames for training and inference. Evaluations on phantom and in vivo data with different noise statistics show that our method has comparable performance to other state-of-the-art unsupervised or self-supervised denoising methods and outperforms them under high noise levels.

翻訳日:2021-06-24 15:15:16 公開日:2021-06-23

# 複数のスマートフォンのための協調的視覚慣性SLAM

Collaborative Visual Inertial SLAM for Multiple Smart Phones ( http://arxiv.org/abs/2106.12186v1 )

ライセンス: Link先を確認

Jialing Liu, Ruyu Liu, Kaiqi Chen, Jianhua Zhang, Dongyan Guo

(参考訳) マッピングの効率性と正確性は、大規模なシーンと長期的なarアプリケーションにおいて極めて重要です。マルチエージェント協調SLAMはマルチユーザARインタラクションの前提条件である。複数のスマートフォンの連携により、タスク完了の効率性と堅牢性が向上し、単一のエージェントができないタスクを完了することができる。しかし、堅牢な通信、効率的な位置検出、ロバストマッピング、エージェント間の効率的な情報共有に依存している。マルチインテリジェンス・コラボレーティブな単眼視覚-慣性SLAMを,集中型アーキテクチャで複数のiosモバイルデバイスにデプロイする。各エージェントは独立して環境を探索し、視覚的慣性オドメトリーモジュールをオンラインで実行し、高いコンピューティングリソースを持つ中央サーバにすべての計測情報を送信することができる。サーバは受信したすべての情報を管理し、重複領域を検出し、地図をマージして最適化し、必要に応じてエージェントと情報を共有する。我々は,公開データセットと実環境におけるシステムの性能を検証した。提案システムのマッピングと融合の精度は,より高い計算資源を必要とするVINS-Monoに匹敵する。

The efficiency and accuracy of mapping are crucial in large-scene and long-term AR applications. Multi-agent cooperative SLAM is a precondition of multi-user AR interaction. The cooperation of multiple smart phones has the potential to improve the efficiency and robustness of task completion and can complete tasks that a single agent cannot do. However, it depends on robust communication, efficient location detection, robust mapping, and efficient information sharing among agents. We propose a multi-agent collaborative monocular visual-inertial SLAM system deployed on multiple iOS mobile devices with a centralized architecture. Each agent can independently explore the environment, run a visual-inertial odometry module online, and then send all the measurement information to a central server with higher computing resources. The server manages all the information received, detects overlapping areas, merges and optimizes the map, and shares information with the agents when needed. We have verified the performance of the system on public datasets and in real environments. The accuracy of mapping and fusion of the proposed system is comparable to VINS-Mono, which requires higher computing resources.

翻訳日:2021-06-24 15:14:46 公開日:2021-06-23

# Pseudo Lesionから学ぶ : 新型コロナウイルスの自己診断フレームワーク

Learning from Pseudo Lesion: A Self-supervised Framework for COVID-19 Diagnosis ( http://arxiv.org/abs/2106.12313v1 )

ライセンス: Link先を確認

Zhongliang Li, Zhihao Jin, Xuechen Li, Linlin Shen

(参考訳) 2019年12月の最初の報告以来、新型コロナウイルス(COVID-19)は世界中で急速に広がり、胸部CTはその診断の主要なツールの1つとなっている。近年、ディープラーニングに基づくアプローチは、無数の画像認識タスクにおいて顕著なパフォーマンスを示している。しかし、通常、トレーニングには大量の注釈付きデータが必要である。今回我々は,COVID-19患者のCT検査でよく見られるグラウンドグラス・オパシティ(GGO)に触発され,疑似病変の発生と回復に基づく自己監督型事前訓練法を提案した。我々は,勾配雑音に基づく数学的モデルであるPerlin noiseを用いて病変様パターンを生成し,正常なCT画像の肺領域にランダムに貼り付け,擬似的なCOVID-19画像を生成する。正常と偽のCOVID-19イメージのペアは、ラベル付きデータを必要としない画像復元のためにエンコーダデコーダアーキテクチャに基づくU-Netのトレーニングに使用された。事前訓練されたエンコーダは、新型コロナウイルスの診断のためにラベル付きデータを使用して微調整された。 CT画像を用いた2つの公開COVID-19診断データセットを用いて評価を行った。総合的な実験結果から,提案手法は,SARS-CoV-2データセットとJinan COVID-19データセットでそれぞれ6.57%,3.03%の精度で教師付きモデルより優れた特徴表現を抽出できることが確認された。

The Coronavirus disease 2019 (COVID-19) has rapidly spread all over the world since its first report in December 2019, and thoracic computed tomography (CT) has become one of the main tools for its diagnosis. In recent years, deep learning-based approaches have shown impressive performance in myriad image recognition tasks. However, they usually require a large amount of annotated data for training. Inspired by Ground Glass Opacity (GGO), a common finding in COVID-19 patients' CT scans, we propose in this paper a novel self-supervised pretraining method based on pseudo-lesion generation and restoration for COVID-19 diagnosis. We used Perlin noise, a gradient-noise-based mathematical model, to generate lesion-like patterns, which were then randomly pasted onto the lung regions of normal CT images to generate pseudo COVID-19 images. The pairs of normal and pseudo COVID-19 images were then used to train a U-Net with an encoder-decoder architecture for image restoration, which does not require any labelled data. The pretrained encoder was then fine-tuned using labelled data for the COVID-19 diagnosis task. Two public COVID-19 diagnosis datasets made up of CT images were employed for evaluation. Comprehensive experimental results demonstrated that the proposed self-supervised learning approach could extract better feature representations for COVID-19 diagnosis, and the accuracy of the proposed method exceeded that of the supervised model pretrained on large-scale images by 6.57% and 3.03% on the SARS-CoV-2 dataset and the Jinan COVID-19 dataset, respectively.
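
The pseudo-lesion step can be sketched with a small 2D Perlin noise generator and a masked paste. This is an assumption-laden illustration, not the authors' pipeline: the grid size, the threshold and strength parameters, the additive blending, and all function names are hypothetical, and images are plain nested lists rather than CT volumes.

```python
import math
import random


def fade(t):
    """Perlin's quintic smoothing curve 6t^5 - 15t^4 + 10t^3."""
    return t * t * t * (t * (t * 6 - 15) + 10)


def perlin_2d(height, width, cells, seed=0):
    """2D Perlin (gradient) noise on a cells x cells lattice of random unit gradients."""
    rng = random.Random(seed)
    grads = [[(math.cos(a), math.sin(a))
              for a in (rng.uniform(0, 2 * math.pi) for _ in range(cells + 1))]
             for _ in range(cells + 1)]

    def corner(iy, ix, dy, dx):
        gy, gx = grads[iy][ix]
        return gy * dy + gx * dx  # dot product of gradient and offset

    noise = []
    for r in range(height):
        y = r * cells / height
        y0, fy = int(y), y - int(y)
        row = []
        for c in range(width):
            x = c * cells / width
            x0, fx = int(x), x - int(x)
            u, v = fade(fx), fade(fy)
            top = (1 - u) * corner(y0, x0, fy, fx) + u * corner(y0, x0 + 1, fy, fx - 1)
            bottom = ((1 - u) * corner(y0 + 1, x0, fy - 1, fx)
                      + u * corner(y0 + 1, x0 + 1, fy - 1, fx - 1))
            row.append((1 - v) * top + v * bottom)
        noise.append(row)
    return noise


def paste_pseudo_lesion(image, lung_mask, noise, threshold=0.1, strength=0.5):
    """Brighten pixels inside the lung mask wherever the noise exceeds the threshold."""
    out = [row[:] for row in image]
    for r in range(len(image)):
        for c in range(len(image[r])):
            if lung_mask[r][c] and noise[r][c] > threshold:
                out[r][c] = image[r][c] + strength * noise[r][c]
    return out
```

A restoration network would then be trained to map the output of `paste_pseudo_lesion` back to the original image, so that the (normal, pseudo) pairs act as free supervision.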

翻訳日:2021-06-24 15:14:31 公開日:2021-06-23

# ステレオカメラを用いた新しいビデオ合成手法

A new Video Synopsis Based Approach Using Stereo Camera ( http://arxiv.org/abs/2106.12362v1 )

ライセンス: Link先を確認

Talha Dilber, Mehmet Serdar Guzel, Erkan Bostanci

(参考訳) 今日の世界では、各分野で生成されるデータ量は予期せぬレベルで増加している。データの増加に直面したデータ処理の重要性は著しく高まっている。当社のリソーストピックは,データ増加に重要な位置を占めるビデオデータの処理と要約ビデオの生成に関するものです。このリソースの範囲内で,映像要約作成中に,オブジェクトベースの教師なし学習を用いた異常検出手法が開発されている。この方法を用いて、映像データを画素として処理し、ビデオセグメントとして結果を生成する。プロセスフローは、次のように簡単に要約できる。ビデオ上のオブジェクトは、そのタイプに応じて検出され、その後追跡される。そして、オブジェクトのトラッキング履歴データを処理し、そのオブジェクトタイプで分類器を訓練する。この分類器により、物体の異常な挙動を検出する。映像セグメントは、異常動作を含む映像モーメントを処理して決定される。検出されたビデオセグメントを元のビデオから抽出して組み合わせることで、ビデオ要約を作成する。私たちが開発したモデルは、シングルカメラとデュアルカメラシステムで別々にテストされ、検証されています。

In today's world, the amount of data produced in every field has increased at an unexpected rate. In the face of this growth, the importance of data processing has increased remarkably. Our research topic is the processing of video data, which accounts for an important share of this growth, and the production of summary videos. Within the scope of this research, a new method for anomaly detection with object-based unsupervised learning has been developed for use while creating a video summary. With this method, the video data is processed as pixels and the result is produced as video segments. The process flow can be briefly summarized as follows. Objects in the video are detected according to their type and then tracked. Then, the tracking history data of the objects are processed, and a classifier is trained with the object type. With this classifier, anomalous behavior of objects is detected. Video segments are determined by processing the video moments that contain anomalous behavior. The video summary is created by extracting the detected video segments from the original video and combining them. The model we developed has been tested and verified separately for single-camera and dual-camera systems.

翻訳日:2021-06-24 15:14:04 公開日:2021-06-23

# STRESS:自己監督学習を用いた動的胎児MRIの超解像

STRESS: Super-Resolution for Dynamic Fetal MRI using Self-Supervised Learning ( http://arxiv.org/abs/2106.12407v1 )

ライセンス: Link先を確認

Junshen Xu, Esra Abaci Turk, P. Ellen Grant, Polina Golland, Elfar Adalsteinsson

(参考訳) 胎児の運動は、従来のMRIスキャンのスケールでは予測不可能で急速である。したがって、胎児の運動と胎児機能のダイナミックスを捉えることを目的とした動的胎児MRIは、画像品質と解像度の妥協を伴う高速イメージング技術に限られる。特にオーバーサンプリングのための多方向画像スライススタックが利用できず、胎児や胎盤のダイナミックスを記録するための高時間分解能が望まれる場合、動的胎児MRIの超高解像度化は依然として課題である。さらに、胎児の動きは、教師あり学習方法のための高解像度画像を得るのを難しくする。そこで本研究では,動的胎児MRIのための自己監督型超解像フレームワークSTRESS(Spatio-Temporal Resolution Enhancement with Simulated Scans)を提案する。提案手法は,低解像度画像と高解像度画像のペアを生成するために,元々取得したデータの高分解能軸に沿ったインターリーブスライス取得をシミュレートする。そして、MR時系列における空間的相関と時間的相関を利用して、元のデータの解像度を高めることで超解像ネットワークを訓練する。シミュレーションおよび子宮内データによる評価は,提案手法が他の自己教師付き超解像法より優れ,画質が向上し,他の下流タスクや評価に有用であることを示す。

Fetal motion is unpredictable and rapid on the scale of conventional MR scan times. Therefore, dynamic fetal MRI, which aims at capturing fetal motion and dynamics of fetal function, is limited to fast imaging techniques with compromises in image quality and resolution. Super-resolution for dynamic fetal MRI is still a challenge, especially when multi-oriented stacks of image slices for oversampling are not available and high temporal resolution for recording the dynamics of the fetus or placenta is desired. Further, fetal motion makes it difficult to acquire high-resolution images for supervised learning methods. To address this problem, in this work, we propose STRESS (Spatio-Temporal Resolution Enhancement with Simulated Scans), a self-supervised super-resolution framework for dynamic fetal MRI with interleaved slice acquisitions. Our proposed method simulates an interleaved slice acquisition along the high-resolution axis on the originally acquired data to generate pairs of low- and high-resolution images. Then, it trains a super-resolution network by exploiting both spatial and temporal correlations in the MR time series, which is used to enhance the resolution of the original data. Evaluations on both simulated and in utero data show that our proposed method outperforms other self-supervised super-resolution methods and improves image quality, which is beneficial to other downstream tasks and evaluations.

翻訳日:2021-06-24 15:13:50 公開日:2021-06-23

# FoldIt: 大腸内視鏡ビデオにおけるハウストラルフォールドの検出とセグメンテーション

FoldIt: Haustral Folds Detection and Segmentation in Colonoscopy Videos ( http://arxiv.org/abs/2106.12522v1 )

ライセンス: Link先を確認

Shawn Mathew, Saad Nadeem, Arie Kaufman

(参考訳) ホストラル折りたたみ(haustral fold)は、大腸内視鏡検査中に高いポリープミス率を示す大腸壁突起である。正確にセグメンテーションされた場合、ハストラルフォールドは欠損面のより良い推定を可能にし、また前処理の仮想(CT)と光学的大腸内視鏡を登録するための貴重なランドマークとして機能し、前処理のスキャンで見つかった異常へのナビゲーションをガイドする。本稿では,光学的大腸内視鏡映像から仮想大腸内視鏡画像へのハウストラルフォールドオーバーレイを用いた画像変換のための,新しい生成的逆向きネットワークfolditを提案する。新しい推移的損失を導入し,Hustral fold アノテーションと仮想大腸内視鏡的レンダリングの接地真理情報を活用する。そこで本研究では,本モデルの有効性を,実際に挑戦する光大腸内視鏡ビデオや,臨床医が検証したオーストラルフォールドアノテーションを用いたテクスチャ付き仮想大腸内視鏡ビデオに示す。この論文の実験を再現するコードとスクリプトは、https://github.com/nadeemlab/CEPのComputational Endoscopy Platformで公開されます。

Haustral folds are colon wall protrusions implicated in the high polyp miss rate during optical colonoscopy procedures. If segmented accurately, haustral folds can allow for better estimation of missed surface and can also serve as valuable landmarks for registering pre-treatment virtual (CT) and optical colonoscopies, to guide navigation towards the anomalies found in pre-treatment scans. We present a novel generative adversarial network, FoldIt, for feature-consistent image translation of optical colonoscopy videos to virtual colonoscopy renderings with haustral fold overlays. A new transitive loss is introduced in order to leverage ground truth information between haustral fold annotations and virtual colonoscopy renderings. We demonstrate the effectiveness of our model on challenging real optical colonoscopy videos as well as on textured virtual colonoscopy videos with clinician-verified haustral fold annotations. All code and scripts to reproduce the experiments of this paper will be made available via our Computational Endoscopy Platform at https://github.com/nadeemlab/CEP.

翻訳日:2021-06-24 15:13:28 公開日:2021-06-23

# ソーシャルナビゲーションにおける紛争の予防と解決 -調査-

Prevention and Resolution of Conflicts in Social Navigation -- a Survey ( http://arxiv.org/abs/2106.12113v1 )

ライセンス: Link先を確認

Reuth Mirsky and Xuesu Xiao and Justin Hart and Peter Stone

(参考訳) ロボットを共有ロボット環境で協調させるという目標が近づき、このコンテキストにおけるナビゲーションは重要かつ望ましいものとなる。ロボット工学の最近の進歩は、混在するロボット環境をナビゲートする際のいくつかの課題に遭遇し、対処してきたが、近年は、ソーシャルナビゲーションにおけるエージェント間の衝突をどう扱うかという問題に特に焦点を絞った、関連する研究の急増が観察されている。これらの貢献はモデル、アルゴリズム、評価指標を提供するが、この研究領域は本質的に学際的であるため、関連する論文の多くは同等ではなく、研究者の間には標準的な語彙がない。この調査の主な目標は、このような共通言語を提案し、既存の作業を調査し、オープンな問題を強調することで、このギャップを埋めることである。ソーシャルナビゲーションにおける衝突を定義することから始まり、コンポーネントの詳細な分類を提供する。この調査は、提案する分類法の枠組みを用いて論文を議論しながら、既存の研究を地図化する。最後に,現在ソーシャルナビゲーションの最前線にある今後の方向性と課題について,研究の焦点を絞るために提案する。

With the approaching goal of having robots collaborate in shared human-robot environments, navigation in this context becomes both crucial and desirable. Recent developments in robotics have encountered and tackled some of the challenges of navigating in mixed human-robot environments, and in recent years we have observed a surge of related work that specifically targets the question of how to handle conflicts between agents in social navigation. These contributions offer models, algorithms, and evaluation metrics. However, as this research area is inherently interdisciplinary, many of the relevant papers are not comparable and there is no standard vocabulary shared by the researchers. The main goal of this survey is to bridge this gap by proposing such a common language, using it to survey existing work, and highlighting open problems. It starts by defining a conflict in social navigation and offers a detailed taxonomy of its components. The survey then maps existing work while discussing papers using the framing of the proposed taxonomy. Finally, it proposes some future directions and problems that are currently at the frontier of social navigation to help focus research efforts.

翻訳日:2021-06-24 15:13:09 公開日:2021-06-23

# SKIM-FAカーネル:線形時間における高次元可変選択と非線形相互作用発見

The SKIM-FA Kernel: High-Dimensional Variable Selection and Nonlinear Interaction Discovery in Linear Time ( http://arxiv.org/abs/2106.12408v1 )

ライセンス: Link先を確認

Raj Agrawal and Tamara Broderick

(参考訳) 多くの科学的問題は、標的反応に関連する小さな共変体を同定し、その効果を推定する必要がある。これらの効果は、しばしば非線形であり、相互作用を含むため、線形および加法的手法は、推定と変数選択の貧弱につながる。ベイズフレームワークは、階層的モデルにおいてスパーシリティ、非線形性、相互作用を同時に表現する。しかし、この三量体を扱う他のいくつかの方法と同様に、推論は計算的に難解である。本研究では,この計算ボトルネックを解消する。まず,適切なベイズモデルがガウス過程(gps)として表現できることを示す。次に,これらのgpsを用いた計算を,変数選択と推定の両方においてo(# covariates)時間に短縮する方法を示す。我々の結果の適合性は、ヒルベルト空間における回帰関数のスパース直交分解(つまり、機能的ANOVA分解)に対応し、相互作用効果は低次効果によって説明できないすべての変動を表す。様々な合成データセットと実データセットにおいて、当社のアプローチは、大規模で高次元のデータセットに使用される既存の手法よりも優れています。

Many scientific problems require identifying a small set of covariates that are associated with a target response and estimating their effects. Often, these effects are nonlinear and include interactions, so linear and additive methods can lead to poor estimation and variable selection. The Bayesian framework makes it straightforward to simultaneously express sparsity, nonlinearity, and interactions in a hierarchical model. But, as for the few other methods that handle this trifecta, inference is computationally intractable - with runtime at least quadratic in the number of covariates, and often worse. In the present work, we solve this computational bottleneck. We first show that suitable Bayesian models can be represented as Gaussian processes (GPs). We then demonstrate how a kernel trick can reduce computation with these GPs to O(# covariates) time for both variable selection and estimation. Our resulting fit corresponds to a sparse orthogonal decomposition of the regression function in a Hilbert space (i.e., a functional ANOVA decomposition), where interaction effects represent all variation that cannot be explained by lower-order effects. On a variety of synthetic and real datasets, our approach outperforms existing methods used for large, high-dimensional datasets while remaining competitive (or being orders of magnitude faster) in runtime.
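
To see how a sum over all covariate pairs can be evaluated in time linear in the number of covariates, consider the elementary identity sum_{i<j} k_i k_j = ((sum_i k_i)^2 - sum_i k_i^2) / 2, where k_i is a kernel evaluated on covariate i alone. The sketch below checks this identity numerically. It is offered as one standard trick of this flavour, under the assumption of one-dimensional RBF kernels per covariate, and it is not claimed to be the SKIM-FA kernel itself.

```python
import math
from itertools import combinations


def per_covariate_kernels(x, y):
    """One-dimensional RBF kernel value for each covariate of two input vectors."""
    return [math.exp(-0.5 * (a - b) ** 2) for a, b in zip(x, y)]


def pairwise_interaction_bruteforce(k):
    """Sum of k_i * k_j over all pairs i < j: quadratic in the number of covariates."""
    return sum(a * b for a, b in combinations(k, 2))


def pairwise_interaction_linear(k):
    """Same quantity via ((sum k)^2 - sum k^2) / 2: linear in the number of covariates."""
    s1 = sum(k)
    s2 = sum(v * v for v in k)
    return 0.5 * (s1 * s1 - s2)


k = per_covariate_kernels([0.1, 0.4, -1.2, 2.0, 0.0], [0.3, -0.4, -1.0, 1.5, 0.7])
```

The brute-force version touches every pair of covariates, whereas the linear version needs only two passes over the per-covariate kernel values.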

翻訳日:2021-06-24 15:12:50 公開日:2021-06-23

# パスシグネチャを用いた近似ベイズ計算

Approximate Bayesian Computation with Path Signatures ( http://arxiv.org/abs/2106.12555v1 )

ライセンス: Link先を確認

Joel Dyer, Patrick Cannon, Sebastian M Schmon

(参考訳) 科学的な関心のシミュレーションモデルは、しばしば、標準的確率に基づく統計推論に先行して、扱いやすい確率関数を欠いている。シミュレータのパラメータを推定する一般的な帰納法として近似ベイズ計算があり、シミュレータ出力と観測データを比較して近似後段をサンプリングする。しかし,特に高次元で構造的に複雑である時系列データでは,シミュレーションデータと観測データとの密接度を効果的に測定することは一般的に困難である。既存のアプローチは通常、手動で要約統計を構築したり、ドメインの専門知識や実験を必要としたり、idデータのような非現実的な仮定に依存したりする。その他、多変量や不規則にサンプリングされた時系列データのようなより複雑な設定では不適切である。本稿では,近似ベイズ計算アルゴリズムで使用する時系列データ間の距離を構築するための自然候補特徴集合としてパスシグネチャを用いることを提案する。実験により, 従来の時系列モデルよりも高精度なベイズ後方推定が可能であることが示された。

Simulation models of scientific interest often lack a tractable likelihood function, precluding standard likelihood-based statistical inference. A popular likelihood-free method for inferring simulator parameters is approximate Bayesian computation, where an approximate posterior is sampled by comparing simulator output and observed data. However, effective measures of closeness between simulated and observed data are generally difficult to construct, particularly for time series data, which are often high-dimensional and structurally complex. Existing approaches typically involve manually constructing summary statistics, requiring substantial domain expertise and experimentation, or rely on unrealistic assumptions such as i.i.d. data. Others are inappropriate in more complex settings like multivariate or irregularly sampled time series data. In this paper, we introduce the use of path signatures as a natural candidate feature set for constructing distances between time series data for use in approximate Bayesian computation algorithms. Our experiments show that such an approach can generate more accurate approximate Bayesian posteriors than existing techniques for time series models.
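
A truncated path signature and a distance built on it can be sketched in a few lines. The code below computes the depth-2 signature of a piecewise-linear path using Chen's identity and compares two paths by the Euclidean distance between their signature terms. The truncation depth, the time-augmentation helper, and the plain Euclidean distance are assumptions for illustration; the paper's actual choices may differ.

```python
import math


def add_time_channel(series):
    """Turn a scalar time series into a 2D path of (time index, value) points."""
    return [(float(t), float(v)) for t, v in enumerate(series)]


def signature_depth2(path):
    """Levels 1 and 2 of the signature of a piecewise-linear path."""
    d = len(path[0])
    level1 = [0.0] * d
    level2 = [[0.0] * d for _ in range(d)]
    for prev, cur in zip(path, path[1:]):
        delta = [c - p for p, c in zip(prev, cur)]
        # Chen's identity for appending one linear segment to the path so far.
        for i in range(d):
            for j in range(d):
                level2[i][j] += level1[i] * delta[j] + 0.5 * delta[i] * delta[j]
        for i in range(d):
            level1[i] += delta[i]
    return level1, level2


def signature_distance(path_a, path_b):
    """Euclidean distance between the flattened depth-2 signatures of two paths."""
    a1, a2 = signature_depth2(path_a)
    b1, b2 = signature_depth2(path_b)
    flat_a = a1 + [v for row in a2 for v in row]
    flat_b = b1 + [v for row in b2 for v in row]
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(flat_a, flat_b)))


right_then_up = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
up_then_right = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
```

The two example paths share the same endpoints, so their level-1 terms agree, but the level-2 terms record the order in which the coordinates moved. In a rejection-style approximate Bayesian computation loop, a simulated series would be accepted when `signature_distance` to the observed series falls below a tolerance.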

翻訳日:2021-06-24 15:12:28 公開日:2021-06-23

# 不確実性認識モデルに基づく強化学習と自動運転への応用

Uncertainty-Aware Model-Based Reinforcement Learning with Application to Autonomous Driving ( http://arxiv.org/abs/2106.12194v1 )

ライセンス: Link先を確認

Jingda Wu, Zhiyu Huang, Chen Lv

(参考訳) 本稿では、強化学習(RL)の学習効率と性能をさらに向上させるために、新しい不確実性を考慮したモデルベースRL(UA-MBRL)フレームワークを提案する。まず,仮想環境モデルとして不確実性評価能力を有する動作条件アンサンブルモデルを確立する。そして,適応的トランケーションアプローチに基づいて,新たな不確実性を考慮したRLフレームワークを開発し,エージェントと環境モデルの仮想インタラクションを提供し,RLのトレーニング効率と性能を向上させる。開発したアルゴリズムは、エンド・ツー・エンドの自動運転車制御タスクで実装され、様々な運転シナリオで最先端の手法と比較される。その結果,UA-MBRL法は既存のモデルベースおよびモデルフリーRL法を学習効率の観点から上回り,性能が向上した。また,様々な自律運転シナリオにおいて,適応性とロバスト性に関して提案手法の有効性を示す。

To further improve the learning efficiency and performance of reinforcement learning (RL), in this paper we propose a novel uncertainty-aware model-based RL (UA-MBRL) framework, and then implement and validate it in autonomous driving under various task scenarios. First, an action-conditioned ensemble model with the ability of uncertainty assessment is established as the virtual environment model. Then, a novel uncertainty-aware model-based RL framework is developed based on the adaptive truncation approach, providing virtual interactions between the agent and environment model, and improving RL's training efficiency and performance. The developed algorithms are then implemented in end-to-end autonomous vehicle control tasks, validated and compared with state-of-the-art methods under various driving scenarios. The validation results suggest that the proposed UA-MBRL method surpasses the existing model-based and model-free RL approaches, in terms of learning efficiency and achieved performance. The results also demonstrate the good ability of the proposed method with respect to the adaptiveness and robustness, under various autonomous driving scenarios.

翻訳日:2021-06-24 15:11:50 公開日:2021-06-23

# MG-DVD:動的不均一グラフ学習に基づくマルウェア検出のためのリアルタイムフレームワーク

MG-DVD: A Real-time Framework for Malware Variant Detection Based on Dynamic Heterogeneous Graph Learning ( http://arxiv.org/abs/2106.12288v1 )

ライセンス: Link先を確認

Chen Liu, Bo Li, Jun Zhao, Ming Su, Xu-Dong Liu

(参考訳) 新たなマルウェアをリアルタイムで検出することは、サイバーリスクを軽減し、積極的に侵入を阻止するために重要である。本稿では,動的異種グラフ学習に基づく新しい検出フレームワークMG-DVDを提案する。特にmg-dvdは、マルウェア変異体の細かな実行イベントストリームを動的ヘテロジニアスグラフにモデル化し、マルウェアオブジェクト間の実世界のメタグラフを調査し、マルウェアとその変異種間のより識別的な悪意のある進化パターンを効果的に特徴付ける。そして、MG-DVDは2つの動的ウォークに基づく異種グラフ学習法を示し、より包括的なマルウェアの表現を学習し、グラフ再学習のコストを大幅に削減する。その結果、MG-DVDはマルウェアの変種をリアルタイムで検出する機能を備えており、意味のあるメタグラフを導入することにより、より優れた解釈性を示す。大規模サンプルの総合的な実験により,提案したMG-DVDは,有効性と効率の観点から,マルウェアの変異を検出する最先端の手法より優れていることが示された。

Detecting the newly emerging malware variants in real time is crucial for mitigating cyber risks and proactively blocking intrusions. In this paper, we propose MG-DVD, a novel detection framework based on dynamic heterogeneous graph learning, to detect malware variants in real time. Particularly, MG-DVD first models the fine-grained execution event streams of malware variants into dynamic heterogeneous graphs and investigates real-world meta-graphs between malware objects, which can effectively characterize more discriminative malicious evolutionary patterns between malware and their variants. Then, MG-DVD presents two dynamic walk-based heterogeneous graph learning methods to learn more comprehensive representations of malware variants, which significantly reduces the cost of the entire graph retraining. As a result, MG-DVD is equipped with the ability to detect malware variants in real time, and it presents better interpretability by introducing meaningful meta-graphs. Comprehensive experiments on large-scale samples prove that our proposed MG-DVD outperforms state-of-the-art methods in detecting malware variants in terms of effectiveness and efficiency.

翻訳日:2021-06-24 15:11:31 公開日:2021-06-23

# EXPLAINable DGA Multiclass Classification への第一歩

First Step Towards EXPLAINable DGA Multiclass Classification ( http://arxiv.org/abs/2106.12336v1 )

ライセンス: Link先を確認

Arthur Drichel, Nils Faerber, Ulrike Meyer

(参考訳) 多くのマルウェアファミリーは、コマンドとコントロール(C2)サーバーへの接続を確立するためにドメイン生成アルゴリズム(DGA)に依存している。 DGAと対抗して、特定のドメイン名を生成したDGAを識別し、ターゲットの修復措置を誘発する機械学習分類器が提案されている。しかし、提案した最先端分類器はディープラーニングモデルに基づいている。これらのブラックボックスの性質は、その推論を評価するのを難しくしている。その結果、信頼性の欠如により、そのようなモデルの利用は不可能となる。本稿では,機能ベースでコンテキストレスなDGAマルチクラス分類器EXPLAINを提案する。我々は,同じ実世界のデータに基づいて,複数の最先端の分類器に対して,特徴集合とハイパーパラメータの組み合わせを比較検討した。提案するDGAマルチクラス分類器の予測よりも,特徴に遡ることが容易である。

Numerous malware families rely on domain generation algorithms (DGAs) to establish a connection to their command and control (C2) server. Counteracting DGAs, several machine learning classifiers have been proposed enabling the identification of the DGA that generated a specific domain name and thus triggering targeted remediation measures. However, the proposed state-of-the-art classifiers are based on deep learning models. The black box nature of these makes it difficult to evaluate their reasoning. The resulting lack of confidence makes the utilization of such models impracticable. In this paper, we propose EXPLAIN, a feature-based and contextless DGA multiclass classifier. We comparatively evaluate several combinations of feature sets and hyperparameters for our approach against several state-of-the-art classifiers in a unified setting on the same real-world data. Our classifier achieves competitive results, is real-time capable, and its predictions are easier to trace back to features than the predictions made by the DGA multiclass classifiers proposed in related work.

翻訳日:2021-06-24 15:11:10 公開日:2021-06-23

# ハイスタックでフィッシュを見つける: 認証透明性ログのフィッシュ分類のためのパイプライン

Finding Phish in a Haystack: A Pipeline for Phishing Classification on Certificate Transparency Logs ( http://arxiv.org/abs/2106.12343v1 )

ライセンス: Link先を確認

Arthur Drichel, Vincent Drury, Justus von Brandt, Ulrike Meyer

(参考訳) 現在の一般的なフィッシング防止技術は、主に、被害者が保護されていない攻撃者に「機会の窓」を残すリアクティブブロックリストを使用する。このウィンドウを短くする1つの可能なアプローチは、認証透明性(CT)ログを監視して、ウェブサイトの準備中にフィッシング攻撃を早期に検出することである。フィッシング分類のためのCTログデータを扱う以前の試みは存在するが、実際のCTログデータに対する評価は欠如している。本稿では,CTログデータを扱う際の問題に対処し,そのような評価を容易にするパイプラインを提案する。パイプラインにはデータセットの作成、トレーニング、CTログの過去またはライブ分類が含まれている。そのモジュラ構造により、分類器や検証源を簡単に交換し、基底真理ラベルと分類器の比較をサポートすることができる。パイプラインを多数の新しいおよび既存の分類器でテストし、将来このシナリオの分類器を改善する一般的な可能性を見出す。パイプラインと使用するデータセットのソースコードと論文(https://gitlab.com/rwth-itsec/ctl-pipeline)を公開しています。

Current popular phishing prevention techniques mainly utilize reactive blocklists, which leave a "window of opportunity" for attackers during which victims are unprotected. One possible approach to shorten this window aims to detect phishing attacks earlier, during website preparation, by monitoring Certificate Transparency (CT) logs. Previous attempts to work with CT log data for phishing classification exist; however, they lack evaluations on actual CT log data. In this paper, we present a pipeline that facilitates such evaluations by addressing a number of problems when working with CT log data. The pipeline includes dataset creation, training, and past or live classification of CT logs. Its modular structure makes it possible to easily exchange classifiers or verification sources to support ground truth labeling efforts and classifier comparisons. We test the pipeline on a number of new and existing classifiers, and find a general potential to improve classifiers for this scenario in the future. We publish the source code of the pipeline and the used datasets along with this paper (https://gitlab.com/rwth-itsec/ctl-pipeline), thus making future research in this direction more accessible.

翻訳日:2021-06-24 15:10:55 公開日:2021-06-23

# パストレースのためのリアルタイムニューラルネットワークラミアンスキャッシング

Real-time Neural Radiance Caching for Path Tracing ( http://arxiv.org/abs/2106.12372v1 )

ライセンス: Link先を確認

Thomas Müller, Fabrice Rousselle, Jan Novák, Alexander Keller

(参考訳) 本稿では,パストレースによるグローバル照明のためのリアルタイムニューラルネットワークラミアンスキャッシング手法を提案する。我々のシステムは、完全にダイナミックなシーンを扱うように設計されており、照明、幾何学、材料に関する仮定は一切ない。私たちのアプローチのデータ駆動性は、キャッシュポイントの配置、補間、更新など、キャッシュアルゴリズムの多くの難しさを回避します。ニューラルネットワークをトレーニングして新しいものを扱うため、動的シーンは恐ろしい一般化の課題であるので、事前トレーニングを廃止し、適応によって一般化する。レンダリング中にレイディアンスキャッシュを訓練することにしました低ノイズのトレーニングターゲットを提供し、数バウンストレーニング更新を単に繰り返して無限バウンス輸送をシミュレートするために、自己学習を採用している。最新のハードウェアをフル活用したニューラルネットワークのストリーミング実装のおかげで、更新とキャッシュクエリは -- フルhd解像度で約2.6ミリ秒の軽いオーバーヘッドを伴います。バイアスを小さく抑えることで大きなノイズ低減効果を示すとともに,多くの課題に対して最先端のリアルタイム性能を報告した。

We present a real-time neural radiance caching method for path-traced global illumination. Our system is designed to handle fully dynamic scenes, and makes no assumptions about the lighting, geometry, and materials. The data-driven nature of our approach sidesteps many difficulties of caching algorithms, such as locating, interpolating, and updating cache points. Since pretraining neural networks to handle novel, dynamic scenes is a formidable generalization challenge, we do away with pretraining and instead achieve generalization via adaptation, i.e. we opt for training the radiance cache while rendering. We employ self-training to provide low-noise training targets and simulate infinite-bounce transport by merely iterating few-bounce training updates. The updates and cache queries incur a mild overhead -- about 2.6ms on full HD resolution -- thanks to a streaming implementation of the neural network that fully exploits modern hardware. We demonstrate significant noise reduction at the cost of little induced bias, and report state-of-the-art, real-time performance on a number of challenging scenarios.

翻訳日:2021-06-24 15:10:33 公開日:2021-06-23

# 一般化誤差制御による回帰学習のためのトレーニングデータサブセット選択

Training Data Subset Selection for Regression with Controlled Generalization Error ( http://arxiv.org/abs/2106.12491v1 )

ライセンス: Link先を確認

Durga Sivasubramanian, Rishabh Iyer, Ganesh Ramakrishnan, Abir De

(参考訳) 多数のトレーニングインスタンスからのデータサブセット選択は、効率的でコスト効率の良い機械学習へのアプローチとして成功している。しかし、より小さな部分集合で訓練されたモデルは、一般化能力に乏しい。本稿では,トレーニングデータのサブセットを選択するアルゴリズムを設計することで,精度を著しく犠牲にすることなく,モデルを迅速にトレーニングすることを目的とする。より具体的には、l2正規化回帰問題に対するデータサブセット選択に着目し、トレーニング可能なパラメータとトレーニングデータのサブセットの両方に対するトレーニング損失を最小限に抑えることを目的とした新しい問題定式化を提供する。我々はいくつかの技術革新を用いてこの問題に取り組む。まず、この問題を元のトレーニング問題の双対を用いて単純化した制約で表現し、この新しい表現の目的が様々なモデリング選択に対してモノトーンおよびα-部分モジュラー関数であることを示す。このような特性により、トレーニングがトレーニングされたモデルの不完全推定を提供しても近似を保証する、データサブセット選択のための効率的な分極最小化アルゴリズムであるSELCONを開発することができる。最後に、いくつかのデータセットに対する実験により、SELCONは現在の最先端技術よりも精度と効率を効果的に交換することを示した。

Data subset selection from a large number of training instances has been a successful approach toward efficient and cost-effective machine learning. However, models trained on a smaller subset may show poor generalization ability. In this paper, our goal is to design an algorithm for selecting a subset of the training data, so that the model can be trained quickly without significantly sacrificing accuracy. More specifically, we focus on data subset selection for L2 regularized regression problems and provide a novel problem formulation which seeks to minimize the training loss with respect to both the trainable parameters and the subset of training data, subject to error bounds on the validation set. We tackle this problem using several technical innovations. First, we represent this problem with simplified constraints using the dual of the original training problem and show that the objective of this new representation is a monotone and alpha-submodular function for a wide variety of modeling choices. Such properties lead us to develop SELCON, an efficient majorization-minimization algorithm for data subset selection that admits an approximation guarantee even when the training provides an imperfect estimate of the trained model. Finally, our experiments on several datasets show that SELCON trades off accuracy and efficiency more effectively than the current state-of-the-art.

翻訳日:2021-06-24 15:10:16 公開日:2021-06-23

# 戦略分類で誰がリードし、誰がフォローするか?

Who Leads and Who Follows in Strategic Classification? ( http://arxiv.org/abs/2106.12529v1 )

ライセンス: Link先を確認

Tijana Zrnic, Eric Mazumdar, S. Shankar Sastry, Michael I. Jordan

(参考訳) 予測モデルが現実世界にデプロイされるにつれ、彼らはますます戦略的な行動と競合しなくてはならない。戦略分類に関する活動の活発化は、この問題をStackelbergのゲームとして扱う: 意思決定者(deciment-maker)は、モデルをデプロイすることでゲーム内で"リード(leads)"する。重要なのは、このフレーミングでは、学習の負担は意思決定者のみに置かれ、エージェントのベストレスポンスは暗黙的に瞬時に扱われる。本研究では,戦略分類における役割の順序は,意思決定者とエージェントが互いの行動に適応する相対周波数によって決定されると主張している。特に,両プレイヤーが時間とともに学習できるように標準モデルを一般化することにより,エージェントよりも高速に更新を行う意思決定者がプレーの順序を逆転し,エージェントがリードし,意思決定者が従うことを示す。我々は,このような役割の逆転が意思決定者や戦略エージェントにとって望ましいことを,標準的な学習環境で観察する。最後に,更新頻度を自由に選択できる意思決定者は,いずれの順序でもstackelberg equilibriaに収束する学習ダイナミクスを誘導できることを示す。

As predictive models are deployed into the real world, they must increasingly contend with strategic behavior. A growing body of work on strategic classification treats this problem as a Stackelberg game: the decision-maker "leads" in the game by deploying a model, and the strategic agents "follow" by playing their best response to the deployed model. Importantly, in this framing, the burden of learning is placed solely on the decision-maker, while the agents' best responses are implicitly treated as instantaneous. In this work, we argue that the order of play in strategic classification is fundamentally determined by the relative frequencies at which the decision-maker and the agents adapt to each other's actions. In particular, by generalizing the standard model to allow both players to learn over time, we show that a decision-maker that makes updates faster than the agents can reverse the order of play, meaning that the agents lead and the decision-maker follows. We observe in standard learning settings that such a role reversal can be desirable for both the decision-maker and the strategic agents. Finally, we show that a decision-maker with the freedom to choose their update frequency can induce learning dynamics that converge to Stackelberg equilibria with either order of play.

翻訳日:2021-06-24 15:09:56 公開日:2021-06-23

# Bregman勾配ポリシー最適化

Bregman Gradient Policy Optimization ( http://arxiv.org/abs/2106.12112v1 )

ライセンス: Link先を確認

Feihu Huang, Shangqian Gao, Heng Huang

(参考訳) 本稿では,Bregman分散度と運動量に基づく強化学習のための新しいBregman勾配ポリシー最適化フレームワークを設計する。具体的には,基本運動量法とミラー降下反復に基づくBregmanグラデーションポリシー最適化(BGPO)アルゴリズムを提案する。同時に,運動量分散を再現した手法に基づいて,ブレグマン勾配ポリシー最適化(VR-BGPO)アルゴリズムを提案する。さらに,非凸条件下でのブレグマン勾配政策最適化のための収束解析フレームワークを提案する。具体的には、BGPOが各反復で1つの軌道のみを必要とする$\epsilon$-stationary pointを見つけるために$\tilde{O}(\epsilon^{-4})$のサンプル複雑性を達成し、VR-BGPOは各反復で1つの軌道のみを必要とする$\tilde{O}(\epsilon^{-3})$の既知のサンプル複雑さに達することを証明している。特に,Bregmanの相違を利用して,既存の政策最適化アルゴリズムと,既存の(分散還元)政策勾配アルゴリズムや(分散還元)自然政策勾配アルゴリズムなどの新しい変種を統一する。複数の強化学習タスクに関する広範な実験結果から,新しいアルゴリズムの有効性が示された。

In this paper, we design a novel Bregman gradient policy optimization framework for reinforcement learning based on Bregman divergences and momentum techniques. Specifically, we propose a Bregman gradient policy optimization (BGPO) algorithm based on the basic momentum technique and mirror descent iteration. At the same time, we present an accelerated Bregman gradient policy optimization (VR-BGPO) algorithm based on a momentum variance-reduced technique. Moreover, we introduce a convergence analysis framework for our Bregman gradient policy optimization under the nonconvex setting. Specifically, we prove that BGPO achieves a sample complexity of $\tilde{O}(\epsilon^{-4})$ for finding an $\epsilon$-stationary point while requiring only one trajectory at each iteration, and that VR-BGPO reaches the best known sample complexity of $\tilde{O}(\epsilon^{-3})$ for finding an $\epsilon$-stationary point, which also requires only one trajectory at each iteration. In particular, by using different Bregman divergences, our methods unify many existing policy optimization algorithms and their new variants, such as the existing (variance-reduced) policy gradient algorithms and (variance-reduced) natural policy gradient algorithms. Extensive experimental results on multiple reinforcement learning tasks demonstrate the efficiency of our new algorithms.
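
The two building blocks named in this abstract, a momentum gradient estimate and a mirror descent step, can be sketched generically as follows. This is not the BGPO or VR-BGPO update itself: the function names are hypothetical, the steps are written for minimizing a loss, and the two Bregman divergences shown (KL on the probability simplex and squared Euclidean distance) are only common textbook choices that illustrate how changing the divergence changes the update rule.

```python
import math


def momentum_gradient(prev_estimate, new_gradient, beta):
    """Basic momentum estimate: convex combination of old estimate and new gradient."""
    return [(1.0 - beta) * m + beta * g for m, g in zip(prev_estimate, new_gradient)]


def mirror_step_kl(probs, grad, step):
    """Mirror descent step on the simplex with the KL divergence as Bregman divergence.

    The closed form is the exponentiated-gradient update:
    p_new[i] is proportional to p[i] * exp(-step * grad[i]).
    """
    logits = [math.log(p) - step * g for p, g in zip(probs, grad)]
    top = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(l - top) for l in logits]
    total = sum(weights)
    return [w / total for w in weights]


def mirror_step_euclidean(params, grad, step):
    """With the squared Euclidean distance, mirror descent reduces to a gradient step."""
    return [x - step * g for x, g in zip(params, grad)]
```

With the KL divergence the iterate stays on the simplex without an explicit projection, which is one reason this choice is common for distributions over discrete actions.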

翻訳日:2021-06-24 15:08:56 公開日:2021-06-23

# 運動方程式の保守的ニューラルネットワーク解のためのラグランジュ双対フレームワーク

Lagrangian dual framework for conservative neural network solutions of kinetic equations ( http://arxiv.org/abs/2106.12147v1 )

ライセンス: Link先を確認

Hyung Ju Hwang and Hwijae Son

(参考訳) 本稿では,ニューラルネットワークによる運動方程式の解法として,新しい保存的定式化を提案する。より正確には、学習問題を物理的保存則を表す制約付き最適化問題として定式化する。制約はラグランジアン双対性により残留損失関数に対して緩和される。学習問題の制約として解の物理的保存特性を仮定することにより、解の誤差や保存則の観点からのより正確な近似を、速度論的フォッカー・プランク方程式と均質なボルツマン方程式に対して示す。

In this paper, we propose a novel conservative formulation for solving kinetic equations via neural networks. More precisely, we formulate the learning problem as a constrained optimization problem with constraints that represent the physical conservation laws. The constraints are relaxed toward the residual loss function by the Lagrangian duality. By imposing physical conservation properties of the solution as constraints of the learning problem, we demonstrate far more accurate approximations of the solutions in terms of errors and the conservation laws, for the kinetic Fokker-Planck equation and the homogeneous Boltzmann equation.
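
The general mechanism of relaxing a conservation constraint through Lagrangian duality can be illustrated on a toy problem. The sketch below does not involve a neural network or a kinetic equation: it fits a vector to a target under a "conserved total" constraint by alternating a primal minimization with a dual ascent step on the multiplier. The function name, the quadratic objective, and the step size `rho` are assumptions made for illustration.

```python
def dual_ascent_projection(target, total, rho=0.5, n_iters=100):
    """Minimise sum((x_i - target_i)^2) subject to sum(x) == total via dual ascent.

    Lagrangian: sum((x_i - target_i)^2) + lam * (sum(x) - total).
    The iteration contracts when 0 < rho < 4 / len(target).
    """
    lam = 0.0
    x = list(target)
    for _ in range(n_iters):
        # Primal step: closed-form minimiser of the Lagrangian for fixed lam.
        x = [a - 0.5 * lam for a in target]
        # Dual step: move lam in proportion to the constraint violation.
        lam += rho * (sum(x) - total)
    return x, lam
```

For example, `dual_ascent_projection([1.0, 2.0, 3.0], 3.0)` shifts every entry down by the same amount until the entries sum to 3, which is the closest point to the target that satisfies the constraint.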

翻訳日:2021-06-24 15:08:32 公開日:2021-06-23

# エネルギーを考慮した資源配分のための畳み込みニューラルネットワークとGated Recurrent Unitの組み合わせ

Combination of Convolutional Neural Network and Gated Recurrent Unit for Energy Aware Resource Allocation ( http://arxiv.org/abs/2106.12178v1 )

ライセンス: Link先を確認

Zeinab Khodaverdian, Hossein Sadr, Seyed Ahmad Edalatpanah and Mojdeh Nazari Solimandarabi

(参考訳) クラウドコンピューティングサービスモデルは急速に成長し、非効率なリソース利用は、クラウドデータセンターにおける高エネルギー消費の最大の原因の1つとして知られている。仮想マシン (VM) のライブマイグレーションと, 少数の物理マシン (PM) への統合により, エネルギー消費削減を目的としたクラウドデータセンターの資源配分を行った。しかし、移行に適したvmの選択は重要な課題である。この問題を解決するため、ユーザリクエストのパターンに従ってVMを機密性のある、あるいは非機密なクラスに分類し、その後、マイグレーション用に適切なVMを選択することができる。本稿では、Microsoft Azureデータセット内のVMの分類に、畳み込みニューラルネットワーク(CNN)とGRU(Gated Recurrent Unit)の組み合わせを利用する。このデータセットのほとんどのVMは遅延に敏感であるとラベル付けされているため、このグループのVMへの移行はエネルギー消費を減らすだけでなく、サービスレベルアグリーメント(SLA)に違反している。実験結果に基づき,提案モデルでは,既存モデルと比較して,提案モデルが優れていることを示す95.18の精度を得た。

Cloud computing service models have experienced rapid growth, and inefficient resource usage is known as one of the greatest causes of high energy consumption in cloud data centers. Resource allocation in cloud data centers aiming to reduce energy consumption has been conducted using live migration of Virtual Machines (VMs) and their consolidation into a small number of Physical Machines (PMs). However, the selection of the appropriate VM for migration is an important challenge. To solve this issue, VMs can be classified according to the pattern of user requests into latency-sensitive or latency-insensitive classes, and thereafter suitable VMs can be selected for migration. In this paper, the combination of a Convolutional Neural Network (CNN) and a Gated Recurrent Unit (GRU) is utilized for the classification of VMs in the Microsoft Azure dataset. Because the majority of VMs in this dataset are labeled as insensitive to latency, migrating more VMs in this group not only reduces energy consumption but also decreases the violation of Service Level Agreements (SLA). Based on the empirical results, the proposed model obtained an accuracy of 95.18%, which clearly demonstrates the superiority of our proposed model compared to other existing models.

翻訳日:2021-06-24 15:08:23 公開日:2021-06-23

# BiblioDAP: The 1st Workshop on Bibliographic Data Analysis and Processing

BiblioDAP: The 1st Workshop on Bibliographic Data Analysis and Processing ( http://arxiv.org/abs/2106.12320v1 )

ライセンス: Link先を確認

Zeyd Boukhers, Philipp Mayr, Silvio Peroni

(参考訳) 毎年の出版論文数の大幅な増加に対応する必要性と、それに伴う固有の課題から、書誌データの自動処理はデジタル図書館、データサイエンス、機械学習において非常に重要になっている。この処理には、I)PDF文書からの参照の自動抽出、II)正確な引用グラフの構築、III)著者名の曖昧性解消などを含むいくつかの側面がある。書誌データは本質的に異質であり、構造化形式(例: 引用グラフ)と非構造化形式(例: 出版物)の両方で存在する。そのため、その処理と分析にはデータサイエンスと機械学習の技術が必要となる。ここでは、BiblioDAP'21: The 1st Workshop on Bibliographic Data Analysis and Processingを紹介する。

Automatic processing of bibliographic data has become very important in digital libraries, data science and machine learning, owing on the one hand to the need to keep pace with the significant yearly increase in published papers and on the other hand to the inherent challenges of the task. This processing has several aspects including but not limited to I) automatic extraction of references from PDF documents, II) building an accurate citation graph, and III) author name disambiguation. Bibliographic data is heterogeneous by nature and occurs in both structured (e.g. citation graph) and unstructured (e.g. publications) formats. Therefore, it requires data science and machine learning techniques to be processed and analysed. Here we introduce BiblioDAP'21: The 1st Workshop on Bibliographic Data Analysis and Processing.

翻訳日:2021-06-24 15:08:05 公開日:2021-06-23

# ラジオマップを用いたリアルタイム屋外位置推定 : 深層学習アプローチ

Real-time Outdoor Localization Using Radio Maps: A Deep Learning Approach ( http://arxiv.org/abs/2106.12556v1 )

ライセンス: Link先を確認

Çağkan Yapar, Ron Levie, Gitta Kutyniok, Giuseppe Caire

(参考訳) 本稿では,密集した都市シナリオにおけるセルネットワークの局在の問題を扱う。グローバル・ナビゲーション・サテライト・システムは通常、機器と衛星の間の視線条件が低くなる都市環境では性能が悪く、適切な精度のために代替のローカライズ方法が必要となる。本稿では,パスロスのみに基づく局所化のための深層学習手法を提案する。これは,到着時刻や到着角に依存する手法とは異なり,デバイス標準操作に対するユーザデバイスでの計算複雑性の増大を必要としない。無線ネットワークにおいて、ユーザデバイスは、ベースステーションビーコンスロットをスキャンし、ハンドオーバおよびユーザベースステーションアソシエーションのために、数少ない最強のベースステーション信号を特定する。提案手法では,受信した信号強度をクラウド上に位置する中央処理ユニットに簡易に報告する。各基地局に対して、地図上の高密度グリッド内のすべての位置におけるパスロスをよく近似する。この近似は,都市環境におけるパスロス関数の深層学習に基づくシミュレータであるRadioUNetによって提供される。提案した深層学習アルゴリズムは,すべての基地局の推定パスロスラジオマップとそれに対応する信号強度を用いて,ユーザの正確な位置推定を行うことができる。提案手法はLocUNetと呼ばれ,推定無線地図における不正確性が高い。これを数値実験により実演し,最新の結果を得た。

This paper deals with the problem of localization in a cellular network in a dense urban scenario. Global Navigation Satellite Systems typically perform poorly in urban environments, where the likelihood of line-of-sight conditions between the devices and the satellites is low, and thus alternative localization methods are required for good accuracy. We present a deep learning method for localization, based merely on pathloss, which does not require any increase in computational complexity at the user devices with respect to the device standard operations, unlike methods that rely on time of arrival or angle of arrival information. In a wireless network, user devices scan the base station beacon slots and identify the few strongest base station signals for handover and user-base station association purposes. In the proposed method, the user to be localized simply reports such received signal strengths to a central processing unit, which may be located in the cloud. For each base station we have a good approximation of the pathloss at every location in a dense grid in the map. This approximation is provided by RadioUNet, a deep learning-based simulator of pathloss functions in urban environments, that we have previously proposed and published. Using the estimated pathloss radio maps of all base stations and the corresponding reported signal strengths, the proposed deep learning algorithm can extract a very accurate localization of the user. The proposed method, called LocUNet, enjoys high robustness to inaccuracies in the estimated radio maps. We demonstrate this by numerical experiments, which obtain state-of-the-art results.
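
To make the data flow concrete (per-base-station pathloss maps on a grid, plus a short vector of reported signal strengths), here is a deliberately simple stand-in. It is not LocUNet and does not use RadioUNet: the radio maps come from an assumed log-distance pathloss model, and the position estimate is a plain nearest-fingerprint search over the grid rather than a neural network. All function names and parameter values are hypothetical.

```python
import math


def log_distance_pathloss(tx, rx, exponent=3.0, ref_loss_db=40.0):
    """Assumed log-distance model: pathloss in dB between transmitter and receiver."""
    d = max(math.dist(tx, rx), 1.0)  # clamp to the 1 m reference distance
    return ref_loss_db + 10.0 * exponent * math.log10(d)


def build_radio_maps(base_stations, grid):
    """One pathloss map per base station, evaluated at every grid cell."""
    return [[log_distance_pathloss(bs, cell) for cell in grid] for bs in base_stations]


def locate(reported, radio_maps, grid):
    """Return the grid cell whose pathloss fingerprint best matches the report."""
    def mismatch(k):
        return sum((station_map[k] - r) ** 2 for station_map, r in zip(radio_maps, reported))

    best = min(range(len(grid)), key=mismatch)
    return grid[best]
```

A fingerprint search of this kind inherits every error in the radio maps, which helps explain why the abstract stresses robustness to inaccurate maps as a property of the learned approach.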

翻訳日:2021-06-24 15:07:53 公開日:2021-06-23

# (参考訳) STEP-EZ:Syntax Tree Guided semantic ExPlanation for Explainable Zero-shot Modeling of Clinical depression symptoms from text

STEP-EZ: Syntax Tree guided semantic ExPlanation for Explainable Zero-shot modeling of clinical depression symptoms from text ( http://arxiv.org/abs/2106.10928v2 )

ライセンス: CC BY 4.0

Nawshad Farruque, Randy Goebel, Osmar Zaiane, Sudhakar Sivapalan

(参考訳) 我々は,ZSL(Zero-Shot Learning)の様々なアプローチと,データ不足のトレーニングで有名な,重要な教師付き学習課題の説明可能性に焦点をあてる。 Depression Symptoms Detection (DSD) from text (英語) まず、ZSLモデリングの様々な構成要素の総合的な合成と、臨床医の助けを借りて、地上の真理サンプルの分析と抑うつ症状の手がかりのキュレーションプロセスから始める。次に、様々な最先端ZSLモデルの精度と、タスクの潜在的な拡張について分析する。さらに,ZSLを階層的テキストベース説明機構に用いるためのフレームワークをスケッチし,Syntax Tree-Guided Semantic Explanation (STEP) と呼ぶ。最後に,提案する説明可能性指標(ei)を用いて,zslモデルを用いて合理的な正確性と説明可能性を達成する実験をまとめる。この研究は、我々の知る限り、DSDタスクにおけるZSLモデルの有効性を、精度と説明可能性の両方の観点から徹底的に探求する最初の成果である。

We focus on exploring various approaches of Zero-Shot Learning (ZSL) and their explainability for a challenging yet important supervised learning task notorious for training data scarcity, i.e. Depression Symptoms Detection (DSD) from text. We start with a comprehensive synthesis of different components of our ZSL modeling and analysis of our ground truth samples and Depression symptom clues curation process with the help of a practicing clinician. We next analyze the accuracy of various state-of-the-art ZSL models and their potential enhancements for our task. Further, we sketch a framework for the use of ZSL for hierarchical text-based explanation mechanism, which we call, Syntax Tree-Guided Semantic Explanation (STEP). Finally, we summarize experiments from which we conclude that we can use ZSL models and achieve reasonable accuracy and explainability, measured by a proposed Explainability Index (EI). This work is, to our knowledge, the first work to exhaustively explore the efficacy of ZSL models for DSD task, both in terms of accuracy and explainability.

翻訳日:2021-06-24 12:58:23 公開日:2021-06-23

# (参考訳) 差分認識モデルを用いた学習型実測光場画像圧縮

Learning-Based Practical Light Field Image Compression Using A Disparity-Aware Model ( http://arxiv.org/abs/2106.11558v2 )

ライセンス: CC BY 4.0

Mohana Singh and Renu M. Rameshan

(参考訳) 光分野技術は研究コミュニティの注目を集め、多くの応用が期待されている。商用のレンズカメラのレンズレットアレイは、光線の空間情報と角情報の両方を単一の露光で捉えるのに役立つ。光フィールドデータの高次元性により、その優れた機能を実現する一方で、その広範な採用を妨げる。そのため、光電界画像の効率的な圧縮が求められている。既存のソリューションは通常、いくつかの異なるモジュールで構成されており、いくつかは光フィールドデータの特定の構造と品質のために設計されていないかもしれない。これによりコーデックの複雑さが増し、非実用的なデコーディングランタイムが発生する。並列デコーディングが可能な4次元光フィールド画像の圧縮のための,学習に基づく分散支援モデルを提案する。モデルはエンドツーエンドのトレーニングが可能で、手動でモジュールを調整する必要がなく、レートと歪みの同時学習が可能である。格差支援アプローチは、再構成された光場の構造的整合性を保証する。 PSNRとMS-SSIMの指標で比較すると,性能が向上している。また、ランタイムのエンコーディングとデコードにも顕著な利益がある。ソースコードはhttps://moha23.github.io/LF-DAAEで公開されている。

Light field technology has increasingly attracted the attention of the research community with its many possible applications. The lenslet array in commercial plenoptic cameras helps capture both the spatial and angular information of light rays in a single exposure. While the resulting high dimensionality of light field data enables its superior capabilities, it also impedes its extensive adoption. Hence, there is a compelling need for efficient compression of light field images. Existing solutions are commonly composed of several separate modules, some of which may not have been designed for the specific structure and quality of light field data. This increases the complexity of the codec and results in impractical decoding runtimes. We propose a new learning-based, disparity-aided model for compression of 4D light field images capable of parallel decoding. The model is end-to-end trainable, eliminating the need for hand-tuning separate modules and allowing joint learning of rate and distortion. The disparity-aided approach ensures the structural integrity of the reconstructed light fields. Comparisons with the state of the art show encouraging performance in terms of PSNR and MS-SSIM metrics. Also, there is a notable gain in the encoding and decoding runtimes. Source code is available at https://moha23.github.io/LF-DAAE.

翻訳日:2021-06-24 12:39:42 公開日:2021-06-23

# (参考訳) 可視化:物理インフォームドデータ拡張によるデータ駆動型地震インバージョン

Making Invisible Visible: Data-Driven Seismic Inversion with Physics-Informed Data Augmentation ( http://arxiv.org/abs/2106.11892v2 )

ライセンス: CC BY 4.0

Yuxin Yang, Xitong Zhang, Qiang Guan, Youzuo Lin

(参考訳) ディープラーニングとデータ駆動アプローチは、科学的領域において大きな可能性を示しています。データ駆動技術の約束は、大量の高品質なトレーニングデータセットが利用可能であることに依存している。高価な物理実験、機器、シミュレーションを通じてデータを取得するコストが高いため、近年、科学応用のためのデータ拡張技術が科学データを得るための新しい方向として登場した。しかし、コンピュータビジョンに由来する既存のデータ拡張技術は、私たちが関心を持つドメイン問題には役に立たない物理的に受け入れられないデータサンプルを生み出します。本稿では,畳み込みニューラルネットワークを用いた新しい物理情報拡張手法を提案する。特に、生成モデルは、合成データの質を改善するために、異なる物理知識(制御方程式、観測可能な知覚、物理現象など)を利用する。本研究では,データ拡張手法の有効性を検証するために,co$_2$リークデータを用いた地中地震波フルウェーブフォームインバージョン法を適用した。我々の関心は、極小のco$_2$リークを伴う地下速度モデルに逆戻りすることである。本手法の有効性を総合的な数値テストを用いて検証する。比較と解析により,物理インフォームドデータ拡張技術を用いて,データ駆動型地震イメージングを著しく向上させることができることを示す。特に,本手法で得られた拡張学習セットを用いた場合,画像品質は,一般大規模漏洩テストシナリオでは15%向上し,小型リークでは17%向上した。

Deep learning and data-driven approaches have shown great potential in scientific domains. The promise of data-driven techniques relies on the availability of a large volume of high-quality training datasets. Due to the high cost of obtaining data through expensive physical experiments, instruments, and simulations, data augmentation techniques for scientific applications have recently emerged as a new direction for obtaining scientific data. However, existing data augmentation techniques originating from computer vision yield physically unacceptable data samples that are not helpful for the domain problems that we are interested in. In this paper, we develop new physics-informed data augmentation techniques based on convolutional neural networks. Specifically, our generative models leverage different physics knowledge (such as governing equations, observable perception, and physics phenomena) to improve the quality of the synthetic data. To validate the effectiveness of our data augmentation techniques, we apply them to solve a subsurface seismic full-waveform inversion using simulated CO$_2$ leakage data. Our interest is to invert for subsurface velocity models associated with very small CO$_2$ leakage. We validate the performance of our methods using comprehensive numerical tests. Via comparison and analysis, we show that data-driven seismic imaging can be significantly enhanced by using our physics-informed data augmentation techniques. Particularly, the imaging quality has been improved by 15% in test scenarios of general-sized leakage and 17% in small-sized leakage when using an augmented training set obtained with our techniques.

翻訳日:2021-06-24 12:29:08 公開日:2021-06-23

# Trinity: 複雑な空間データセットのためのノーコードAIプラットフォーム

Trinity: A No-Code AI platform for complex spatial datasets ( http://arxiv.org/abs/2106.11756v2 )

ライセンス: Link先を確認

C.V.Krishnakumar Iyer, Feili Hou, Henry Wang, Yonghong Wang, Kay Oh, Swetava Ganguli, Vipul Pandey

(参考訳) 本稿では,機械学習研究者と非技術領域の専門家の両方が,さまざまな複雑な問題を解決するために,ドメイン固有の信号やデータセットを実験可能にすることを目的として,trinityと呼ばれる非コード人工知能(ai)プラットフォームを提案する。この多様な問題を解決する汎用性は、複雑な時空間データセットを変換して、標準的なディープラーニングモデル、この場合、畳み込みニューラルネットワーク(cnns)によって利用しやすくし、標準的な方法で異なる問題を定式化する能力を与えることによって達成される。セマンティクスのセグメンテーション。複雑な機能エンジニアリング、ディープラーニングカーネル、スケーラブルなデータ処理メカニズムのデリバティブをホストする機能ストアである直感的なユーザインターフェースによって、Trinityは、ドメインの専門家がビジネスクリティカルな問題を解決する上で、科学者やエンジニアとステージを共有するための強力なプラットフォームを提供する。迅速なプロトタイピングと迅速な実験を可能にし、モデルの構築とデプロイを標準化することで、生産までの時間を短縮する。本稿では,Trinityとその設計の背景にある私たちのモチベーションとサンプルアプリケーションを展示することで,AIを用いたバーを低くするというアイデアを動機づける。

We present a no-code Artificial Intelligence (AI) platform called Trinity with the main design goal of enabling both machine learning researchers and non-technical geospatial domain experts to experiment with domain-specific signals and datasets for solving a variety of complex problems on their own. This versatility to solve diverse problems is achieved by transforming complex spatio-temporal datasets to make them consumable by standard deep learning models, in this case, Convolutional Neural Networks (CNNs), and giving the ability to formulate disparate problems in a standard way, e.g., semantic segmentation. With an intuitive user interface, a feature store that hosts derivatives of complex feature engineering, a deep learning kernel, and a scalable data processing mechanism, Trinity provides a powerful platform for domain experts to share the stage with scientists and engineers in solving business-critical problems. It enables quick prototyping and rapid experimentation, and reduces the time to production by standardizing model building and deployment. In this paper, we present our motivation behind Trinity and its design along with showcasing sample applications to motivate the idea of lowering the bar to using AI.

翻訳日:2021-06-24 12:08:23 公開日:2021-06-23

# クエリーとしてインスタンスを追跡する

Tracking Instances as Queries ( http://arxiv.org/abs/2106.11963v2 )

ライセンス: Link先を確認

Shusheng Yang, Yuxin Fang, Xinggang Wang, Yu Li, Ying Shan, Bin Feng, Wenyu Liu

(参考訳) 最近、クエリベースのディープネットワークは、エンドツーエンドパイプラインと、オブジェクト検出、セマンティックセグメンテーション、インスタンスセグメンテーションなど、いくつかの基本的なコンピュータビジョンタスクにおける競合結果のために多くの注目を集めている。しかし、エレガントなアーキテクチャと強力なパフォーマンスを備えたクエリベースのビデオインスタンスセグメンテーション(VIS)フレームワークの確立方法はまだ解決されていない。本稿では、QueryInstのインスタンスとクエリの固有の一対一対応をフル活用した統合クエリベースのVISフレームワークであるQueryTrack(クエリとしてのインスタンスの追跡)を提案する。提案手法は,YouTube-VIS-2019 / 2021データセット上で52.7 / 52.3 APを取得し,単一のオンラインエンドツーエンドモデル,単一スケールのテスト,控えめな量のトレーニングデータで,CVPR 2021のYouTube-VIS Challengeで2位を獲得した。また、VISコミュニティのリファレンスとして、YouTube-VIS-2021 val のQueryTrack-ResNet-50ベースライン結果も提供します。

Recently, query based deep networks catch lots of attention owing to their end-to-end pipeline and competitive results on several fundamental computer vision tasks, such as object detection, semantic segmentation, and instance segmentation. However, how to establish a query based video instance segmentation (VIS) framework with elegant architecture and strong performance remains to be settled. In this paper, we present \textbf{QueryTrack} (i.e., tracking instances as queries), a unified query based VIS framework fully leveraging the intrinsic one-to-one correspondence between instances and queries in QueryInst. The proposed method obtains 52.7 / 52.3 AP on YouTube-VIS-2019 / 2021 datasets, which wins the 2-nd place in the YouTube-VIS Challenge at CVPR 2021 \textbf{with a single online end-to-end model, single scale testing \& modest amount of training data}. We also provide QueryTrack-ResNet-50 baseline results on YouTube-VIS-2021 val set as references for the VIS community.

Translated: 2021-06-24 12:08:01 Published: 2021-06-23

# CPM-2: Large-scale Cost-effective Pre-trained Language Models ( http://arxiv.org/abs/2106.10715v2 )

License: see the linked page

Zhengyan Zhang, Yuxian Gu, Xu Han, Shengqi Chen, Chaojun Xiao, Zhenbo Sun, Yuan Yao, Fanchao Qi, Jian Guan, Pei Ke, Yanzheng Cai, Guoyang Zeng, Zhixing Tan, Zhiyuan Liu, Minlie Huang, Wentao Han, Yang Liu, Xiaoyan Zhu, Maosong Sun

In recent years, the size of pre-trained language models (PLMs) has grown by leaps and bounds. However, efficiency issues of these large-scale PLMs limit their utilization in real-world scenarios. We present a suite of cost-effective techniques for the use of PLMs to deal with the efficiency issues of pre-training, fine-tuning, and inference. (1) We introduce knowledge inheritance to accelerate the pre-training process by exploiting existing PLMs instead of training models from scratch. (2) We explore the best practice of prompt tuning with large-scale PLMs. Compared with conventional fine-tuning, prompt tuning significantly reduces the number of task-specific parameters. (3) We implement a new inference toolkit, namely InfMoE, for using large-scale PLMs with limited computational resources. Based on our cost-effective pipeline, we pre-train two models: an encoder-decoder bilingual model with 11 billion parameters (CPM-2) and its corresponding MoE version with 198 billion parameters. In our experiments, we compare CPM-2 with mT5 on downstream tasks. Experimental results show that CPM-2 has excellent general language intelligence. Moreover, we validate the efficiency of InfMoE when conducting inference of large-scale models having tens of billions of parameters on a single GPU. All source code and model parameters are available at https://github.com/TsinghuaAI/CPM.

Translated: 2021-06-24 12:07:38 Published: 2021-06-23

# Ad Text Classification with Transformer-Based Natural Language Processing Methods ( http://arxiv.org/abs/2106.10899v2 )

License: see the linked page

Umut Özdil, Büşra Arslan, D. Emre Taşar, Gökçe Polat, Şükrü Ozan

In this study, a natural language processing-based (NLP-based) method is proposed for the sector-wise automatic classification of ad texts created on online advertising platforms. Our data set consists of approximately 21,000 labeled advertising texts from 12 different sectors. The study uses the Bidirectional Encoder Representations from Transformers (BERT) model, a transformer-based language model that has recently been used for tasks such as text classification in the natural language processing literature. The classification efficiencies obtained using a pre-trained BERT model for the Turkish language are shown in detail.

Translated: 2021-06-24 12:07:18 Published: 2021-06-23

# Fourier Transform Approximation as an Auxiliary Task for Image Classification ( http://arxiv.org/abs/2106.11478v2 )

License: see the linked page

Chen Liu

Image reconstruction is likely the most predominant auxiliary task for image classification. In this paper, we investigate "approximating the Fourier Transform of the input image" as a potential alternative, in the hope that it may further boost performance on the primary task or introduce novel constraints not well covered by image reconstruction. We experimented with five popular classification architectures on the CIFAR-10 dataset, and the empirical results indicated that our proposed auxiliary task generally improves the classification accuracy. More notably, the results showed that in certain cases our proposed auxiliary task may enhance the classifiers' resistance to adversarial attacks generated using the fast gradient sign method.

Translated: 2021-06-24 12:07:10 Published: 2021-06-23
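
As a rough illustration of the auxiliary task described in the abstract above, the sketch below computes the 2-D discrete Fourier transform of a tiny image and adds a mean-squared-error term on a predicted spectrum to a classification loss. The abstract does not say how the two losses are combined, so the weighted sum, the `aux_weight` value, and all function names here are assumptions for illustration only. The transform is a naive one written for clarity; a real pipeline would presumably use an FFT library.

```python
import cmath


def dft2(image):
    """Naive 2-D discrete Fourier transform of a list-of-lists image."""
    h, w = len(image), len(image[0])
    out = [[0j] * w for _ in range(h)]
    for u in range(h):
        for v in range(w):
            total = 0j
            for y in range(h):
                for x in range(w):
                    angle = -2 * cmath.pi * (u * y / h + v * x / w)
                    total += image[y][x] * cmath.exp(1j * angle)
            out[u][v] = total
    return out


def auxiliary_loss(predicted, image):
    """Mean squared error between a predicted spectrum and the true DFT."""
    target = dft2(image)
    h, w = len(image), len(image[0])
    squared = sum(
        abs(predicted[u][v] - target[u][v]) ** 2
        for u in range(h)
        for v in range(w)
    )
    return squared / (h * w)


def total_loss(classification_loss, predicted, image, aux_weight=0.1):
    """Hypothetical combined objective: primary loss plus weighted auxiliary loss."""
    return classification_loss + aux_weight * auxiliary_loss(predicted, image)
```

In this sketch `predicted` stands in for the output of an auxiliary network head; a perfect prediction leaves the classification loss unchanged, while a poor one adds a penalty.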

# BanditMF: Multi-Armed Bandit Based Matrix Factorization Recommender System ( http://arxiv.org/abs/2106.10898v2 )

License: see the linked page

Shenghao Xu

Multi-armed bandits (MAB) provide a principled online learning approach to attain the balance between exploration and exploitation. Owing to their superior performance and their ability to learn from limited feedback without having to learn to act in multiple situations, multi-armed bandits are drawing widespread attention in applications such as recommender systems. Within recommender systems, collaborative filtering (CF) is arguably the earliest and most influential method. Crucially, new users and an ever-changing pool of recommended items are challenges that recommender systems need to address. For collaborative filtering, the classical method is to train the model offline and then test it online, but this approach can no longer handle dynamic changes in user preferences, the so-called cold start. So how can items be effectively recommended to users in the absence of effective information? To address the aforementioned problems, a multi-armed bandit based collaborative filtering recommender system, named BanditMF, has been proposed. BanditMF is designed to address two challenges in multi-armed bandit algorithms and collaborative filtering: (1) how to solve the cold start problem for collaborative filtering under the condition of scarcity of valid information, and (2) how to solve the sub-optimality problem of bandit algorithms in domains with strong social relations, which is caused by independently estimating the unknown parameters associated with each user and ignoring correlations between users.

Translated: 2021-06-24 12:06:59 Published: 2021-06-23
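
The exploration and exploitation trade-off mentioned in the abstract above can be sketched with a generic epsilon-greedy bandit. This is a standard textbook strategy swapped in for illustration, not the algorithm used by BanditMF, which the abstract does not specify; the class name, the `epsilon` default, and the `seed` parameter are all assumptions.

```python
import random


class EpsilonGreedyBandit:
    """Generic epsilon-greedy bandit (illustrative only, not BanditMF itself)."""

    def __init__(self, n_arms, epsilon=0.1, seed=0):
        self.epsilon = epsilon
        self.counts = [0] * n_arms    # number of pulls per arm
        self.values = [0.0] * n_arms  # running mean reward per arm
        self.rng = random.Random(seed)

    def select_arm(self):
        # Explore a random arm with probability epsilon.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))
        # Otherwise exploit the arm with the best estimated reward.
        return max(range(len(self.values)), key=self.values.__getitem__)

    def update(self, arm, reward):
        # Incremental update of the mean reward for the pulled arm.
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]
```

In a recommender setting one might treat each candidate item as an arm and each click or rating as a reward, so that a user with no history still receives recommendations while the estimates are being learned.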

# Categorising Fine-to-Coarse Grained Misinformation: An Empirical Study of COVID-19 Infodemic ( http://arxiv.org/abs/2106.11702v2 )

License: see the linked page

Ye Jiang, Xingyi Song, Carolina Scarton, Ahmet Aker, Kalina Bontcheva

The spread of COVID-19 misinformation over social media has already drawn the attention of many researchers. According to Google Scholar, about 26,000 COVID-19 related misinformation studies have been published to date. Most of these studies focus on (1) detecting and/or (2) analysing the characteristics of COVID-19 related misinformation. However, the study of the social behaviours related to misinformation is often neglected. In this paper, we introduce a fine-grained annotated misinformation tweets dataset including social behaviour annotations (e.g. a comment on or a question about the misinformation). The dataset not only allows social behaviour analysis but is also suitable for both evidence-based and non-evidence-based misinformation classification tasks. In addition, we introduce leave-claim-out validation in our experiments and demonstrate that misinformation classification performance could be significantly different when applied to real-world unseen misinformation.

Translated: 2021-06-24 12:06:33 Published: 2021-06-23

PDF registration status (publication date: 20210623)