Fugu-MT: arxivの論文翻訳

このサイトではarxivの論文のうち、30ページ以下でCreative Commonsライセンス（CC 0, CC BY, CC BY-SA）の論文を日本語訳しています。本文がCCでない論文、長すぎる論文はメタデータのみを翻訳しています。（arxivのメタデータは CC 0です。）翻訳文のライセンスはCC BY-SA 4.0です。翻訳にはFugu-Machine Translatorを利用しています。

本サイトの運営者は本サイト（すべての情報・翻訳含む）の品質を保証せず、本サイト（すべての情報・翻訳含む）を使用して発生したあらゆる結果について一切の責任を負いません。

公開日が20200708となっている論文です。

Title	Authors	Abstract	論文公表日・翻訳日
# 進化論理回路によるオープンエンディングフィットネス景観に関する研究 Investigation into Open-Ended Fitness Landscape through Evolutionary Logical Circuits ( http://arxiv.org/abs/2002.00593v2 ) ライセンス: Link先を確認	Masaki Suyama and Kosuke Sato	(参考訳) 累積的な文化進化は、人類が様々な生態学的・人口的環境で繁栄するきっかけとなった。人間が解決する必要のあるタスクの解決策は、クローズドまたはオープンエンドのフィットネスランドスケープの形をとるタスク空間にマッピングされ、前者は文化進化の研究において後者よりも広範囲にモデル化された。本稿では,前回の試行で構築された回路を用いた論理回路を構築するコンピュータシミュレーションを用いて,オープンエンドフィットネスランドスケープをモデル化したArthur and Polak (2006) によるシミュレーションを修正した。このシミュレーションを用いて、オープンエンドフィットネスランドスケープの性質を明らかにするとともに、グループサイズの増大により文化の蓄積速度が向上するかどうかを調べた。その結果, 群サイズは蓄積速度を増加させたが, 期待以上に制限された。また、2種類の蓄積、発明と改良が区別された場合、両者の性質が異なっていた。改良では, 1つのエージェントの生産性がグループサイズの増加とともに低下する凸関数に追従した。発明では、この軌道は急激な増加の連続したパターンを示し、次いで高原を示した。 Cumulative cultural evolution is what made humanity to thrive in various ecological and demographic environments. Solutions to the tasks that humans needed to solve could be mapped onto a task space which could take the form of either closed or open-ended fitness landscape, with the former being modeled more extensively than the latter in studies of cultural evolution. In this article, we modified a simulation by Arthur and Polak (2006) that modeled open-ended fitness landscape by using a computer simulation that builds logical circuits with circuits that were built in earlier trials. We used this simulation to clarify the nature of open-ended fitness landscape and to investigate whether the speed of accumulation of culture is increased by an increase in group size. The results indicated that group size increased the speed of accumulation but is limited than expected. Also, when two types of accumulation, invention and improvement, were distinguished the nature of the two differed. In improvement, the trajectory followed a convex function with productivity of one agent decreasing as group size increased. In invention, the trajectory showed a continuous pattern of rapid increase followed by a plateau.	翻訳日:2023-06-04 20:50:14 公開日:2020-07-08
# 準確率関数を持つ量子相関の雑音適応試験 Noise-adaptive test of quantum correlations with quasiprobability functions ( http://arxiv.org/abs/2002.05840v2 ) ライセンス: Link先を確認	Seung-Woo Lee, Jaewan Kim, Wonmin Son	(参考訳) 本稿では,雑音の存在下での準確率関数の観点から量子相関をテストする手法を提案する。測定の不完全性と熱環境が量子相関に及ぼす影響を分析し,そのノイズ効果を一般化準確率関数の次数パラメータの変化にうまくカプセル化できることを示した。次に、一般化準確率関数を用いて、ベル型不等式という形で雑音適応型絡み込み証を定式化する。驚くべきことに、厳しい雑音下で量子相関を観測できる。本手法は,連続可変系を用いた近時雑音量子プロセッサにおける量子相関の検証に有用である。 We introduce a method for testing quantum correlations in terms of quasiprobability functions in the presence of noise. We analyze the effects of measurement imperfection and thermal environment on quantum correlations and show that their noise effects can be well encapsulated into the change of the order parameter of the generalized quasiprobability function. We then formulate a noise-adaptive entanglement witness in the form of a Bell-type inequality by using the generalized quasiprobability function. Remarkably, it allows us to observe quantum correlations under severe noise. Our method provides a useful tool to test quantum correlations in near-term noisy quantum processors with continuous-variable systems.	翻訳日:2023-06-03 17:09:34 公開日:2020-07-08
# CFTにおけるナヒトマン理論の一般化 A Generalized Nachtmann Theorem in CFT ( http://arxiv.org/abs/2002.12390v2 ) ライセンス: Link先を確認	Sandipan Kundu	(参考訳) ローレンツ符号におけるユニタリ量子場理論のコリエーターは、ある解析性と肯定的性質に従う。 2次元以上のユニタリ CFT の相互作用に対して、これらの性質は一次作用素の OPE に現れる極小ツイスト作用素の族に一般的な制約を与えることを示す。特に、任意のスカラー一次の反射対称 OPE に偶数スピンが現れる極小ツイスト作用素の族に対して、ツイストはスピンの単調に増大する凸函数でなければならないという凸定理を導いて拡張する。我々の議論は完全に非摂動的であり、ユニタリ cft における非同一のスカラープライマリの ope にも当てはまり、ope に現れる回転作用素のねじれを制限している。最後に、同じ手法が特定のCFT相関器のRegge動作にも制約を課していると論じる。 Correlators of unitary quantum field theories in Lorentzian signature obey certain analyticity and positivity properties. For interacting unitary CFTs in more than two dimensions, we show that these properties impose general constraints on families of minimal twist operators that appear in the OPEs of primary operators. In particular, we rederive and extend the convexity theorem which states that for the family of minimal twist operators with even spins appearing in the reflection-symmetric OPE of any scalar primary, twist must be a monotonically increasing convex function of the spin. Our argument is completely non-perturbative and it also applies to the OPE of nonidentical scalar primaries in unitary CFTs, constraining the twist of spinning operators appearing in the OPE. Finally, we argue that the same methods also impose constraints on the Regge behavior of certain CFT correlators.	翻訳日:2023-06-01 12:29:21 公開日:2020-07-08
# qsw_mpi: 量子確率歩行の並列シミュレーションのためのフレームワーク QSW_MPI: a framework for parallel simulation of quantum stochastic walks ( http://arxiv.org/abs/2003.02450v2 ) ライセンス: Link先を確認	Edric Matwiejew and Jingbo Wang	(参考訳) QSW_MPIは連続時間量子確率ウォークの時系列シミュレーションのために開発されたピソンパッケージである。このモデルは、連続時間ランダムウォークと連続時間量子ウォークの一般化を含むリンドブラッド形式論におけるマルコフ開量子系の研究を可能にする。 QSW_MPIは、並列化されたFortranライブラリにアクセスするピソンインタフェースで、スパースデータ構造を利用するため、大規模並列コンピュータにスケーラブルであり、任意の複雑さのグラフと無方向性グラフ上の幅広いウォークダイナミクスのシミュレーションを可能にする。 QSW_MPI is a python package developed for time-series simulation of continuous-time quantum stochastic walks. This model allows for the study of Markovian open quantum systems in the Lindblad formalism, including a generalisation of the continuous-time random walk and continuous-time quantum walk. Consisting of a python interface accessing parallelised Fortran libraries utilising sparse data structures, QSW_MPI is scalable to massively parallel computers, which makes possible the simulation of a wide range of walk dynamics on directed and undirected graphs of arbitrary complexity.	翻訳日:2023-05-30 11:44:17 公開日:2020-07-08
# 1次元光学格子におけるスピン軌道結合によるスタガードフラックスの有効三角形ラダー Effective triangular ladders with staggered flux from spin-orbit coupling in 1D optical lattices ( http://arxiv.org/abs/2003.04154v3 ) ライセンス: Link先を確認	Josep Cabedo, Joan Claramunt, Jordi Mompart, Ver\`onica Ahufinger and Alessio Celi	(参考訳) 光誘起スピン軌道結合は、超低温原子で量子磁気を研究する柔軟なツールである。本研究では,1次元光学格子中のスピン軌道結合ボース気体を,ハミルトニアンの最低帯域切断後のスタガードフラックスを持つ2脚三角ラダーにマッピング可能であることを示す。有効フラックスとトンネル強度の比は、広範囲の値に独立に調整することができる。ハードコアボソン近似が保持するパラメータの特定の構造を特定し,可変磁束を持つフラストレーション三角形スピンラダーを実現する。実効スピンハミルトニアンの性質を密度行列再正規化法を用いて検討し, 半充填時の位相図を決定する。均一な超流動とボンドオーダー絶縁体という2つの相を示す。後者は低ラマン調律でのみ安定化できる。最後に、予測相転移を横断するSOC系のパラメータ空間にわたって実験可能な軌道を提供する。 Light-induced spin-orbit coupling is a flexible tool to study quantum magnetism with ultracold atoms. In this work we show that spin-orbit coupled Bose gases in a one-dimensional optical lattice can be mapped into a two-leg triangular ladder with staggered flux following a lowest-band truncation of the Hamiltonian. The effective flux and the ratio of the tunneling strengths can be independently adjusted to a wide range of values. We identify a certain regime of parameters where a hard-core boson approximation holds and the system realizes a frustrated triangular spin ladder with tunable flux. We study the properties of the effective spin Hamiltonian using the density-matrix renormalization-group method and determine the phase diagram at half-filling. It displays two phases: a uniform superfluid and a bond-ordered insulator. The latter can be stabilized only for low Raman detuning. Finally, we provide experimentally feasible trajectories across the parameter space of the SOC system that cross the predicted phase transition.	翻訳日:2023-05-30 03:15:06 公開日:2020-07-08
# ドメインウォール非線形量子化 Domain wall nonlinear quantization ( http://arxiv.org/abs/2003.05387v3 ) ライセンス: Link先を確認	M. G. Ivanov	(参考訳) 領域壁の非線形量子化(コディメンション1の相対論的膜)を考える。膜塵の方程式はハミルトン・ヤコビ方程式の類似物と見なされ、量子アナログを構成することができる。結果として得られる方程式は非線形クライン・フォック・ゴルドン方程式の形を持つ。これは量子領域の壁に対する平均場近似と解釈できる。分散関係は(線形近似で)小さな摂動に対して得られる。摂動の群速度は光の速度を超えない。ドメイン壁に沿って伝播する摂動に対して、(古典的な場合のように)質量を持たないモードに加えて、大きなものが現れます。この結果は凝縮物質理論や超弦理論や超重力理論における膜量子化において興味深い。 The nonlinear quantization of the domain wall (relativistic membrane of codimension 1) is considered. The membrane dust equation is considered as an analogue of the Hamilton-Jacobi equation, which allows us to construct its quantum analogue. The resulting equation has the form of a nonlinear Klein-Fock-Gordon equation. It can be interpreted as the mean field approximation for a quantum domain wall. Dispersion relations are obtained for small perturbations (in a linear approximation). The group speed of perturbations does not exceed the speed of light. For perturbations propagating along the domain wall, in addition to the massless mode (as in the classical case), a massive one appears. The result may be interesting in condensed matter theory and in membrane quantization in superstring and supergravity theories.	翻訳日:2023-05-29 11:10:38 公開日:2020-07-08
# 単位非マルコフ量子進化における可逆性緩和 Irreversibility mitigation in unital non-Markovian quantum evolutions ( http://arxiv.org/abs/2004.04619v2 ) ライセンス: Link先を確認	Stefano Gherardini, Stefano Marcantoni, Filippo Caruso	(参考訳) 熱力学的エントロピー生成と非マルコフ進化の関係は、現在の研究課題である。本稿では,開量子系における確率エントロピー生成の挙動について考察する。特に、パウリ流の族について、量子力学がP-分割できないことを仮定して、平均エントロピー生成と分散の両方が特定の時間間隔で減少することを示す。系の力学は全体的に可逆的であるが、この結果は可逆性への過渡的傾向として解釈され、エントロピー生成のデルタピーク分布として0付近で説明される。最後に、量子系ダイナミクスを発生させる発電機のパラメータに関する解析的境界も提供し、対応する非マルコフ進化の可逆的緩和を確実にする。 The relation between the thermodynamic entropy production and non-Markovian evolutions is matter of current research. Here, we study the behavior of the stochastic entropy production in open quantum systems undergoing unital non-Markovian dynamics. In particular, for the family of Pauli channels we show that in some specific time intervals both the average entropy production and the variance can decrease, provided that the quantum dynamics fails to be P-divisible. Although the dynamics of the system is overall irreversible, our result may be interpreted as a transient tendency towards reversibility, described as a delta peaked distribution of entropy production around zero. Finally, we also provide analytical bounds on the parameters in the generator giving rise to the quantum system dynamics, so as to ensure irreversibility mitigation of the corresponding non-Markovian evolution.	翻訳日:2023-05-25 08:42:18 公開日:2020-07-08
# 局在ディラック波動関数のリバウンド運動 Rebound Motion of Localized Dirac Wavefunctions ( http://arxiv.org/abs/2004.07938v2 ) ライセンス: Link先を確認	Domenico P.L. Castrigiano	(参考訳) 有界局所化自由ディラック波動関数のキャリアは無限度から縮小し、その後再び無限度へと拡大する。運動は光の速度で等方的に起こる。その間にはリバウンドの位相があり、これはキャリアの直径の順に最小の延長で時間と空間に制限される。この動きは、空間のあらゆる方向に特定の時間があり、収縮から膨張への変化が瞬時に起こるように、異方的に突然進行する。漸近的に、過去と未来に関しても、位置の確率は、光速で外半径が増加する球殻の中で最大1に集中する。 It is shown that the carrier of a bounded localized free Dirac wavefunction shrinks from infinity and subsequently expands to infinity again. The motion occurs isotropicly at the speed of light. In between there is the phase of rebound, which is limited in time and space in the order of the diameter of the carrier at its minimal extension. This motion proceeds anisotropicly and abruptly as for every direction in space there is a specific time, at which the change from shrinking to expanding happens instantaneously. Asymptotically, regarding the past and the future as well, the probability of position concentrates up to 1 within any spherical shell whose outer radius increases at light speed.	翻訳日:2023-05-23 08:51:10 公開日:2020-07-08
# 有限温度における振動スペクトルのオンザフライ半古典的評価 On-the-fly ab initio semiclassical evaluation of vibronic spectra at finite temperature ( http://arxiv.org/abs/2005.09126v2 ) ライセンス: Link先を確認	Tomislav Begu\v{s}i\'c and Ji\v{r}\'i Van\'i\v{c}ek	(参考訳) 振動分解された電子スペクトルをゼロ温度で計算・解析するため,我々は最近,非調和性,モードモード結合,ヘルツベルグ-テラー効果を考慮し,オン・ザ・フライ ab initio extended thawed gaussian approximation [a. patoz et al., j. phys. chem. lett. 9, 2367 (2018)] を実装した。非零温度でのスペクトル評価のために,本手法を一般化する。熱場力学と並行して、密度行列のコヒーレンス成分のフォン・ノイマン進化を2倍の自由度を持つ拡張空間における波動関数のschr\"{o}dinger進化に変換する。拡張解法ガウス近似の効率により、この座標数の増加は計算コストをほとんど増やさない。より具体的には、元々のゼロ温度アプローチと比較して、有限温度法は追加のab initio電子構造計算を必要としない。同時に、新しいアプローチはスペクトルに対する有限温度、非調和性、ヘルツベルク・テラー効果を明確に区別することができる。モデルモース系において、一般的に用いられる大域高調波法に対する有限温度ソードガウス近似の利点を示し、上記のすべての効果が寄与するベンゼンの対称性禁止吸収スペクトルの評価に応用する。 To compute and analyze vibrationally resolved electronic spectra at zero temperature, we have recently implemented the on-the-fly ab initio extended thawed Gaussian approximation [A. Patoz et al., J. Phys. Chem. Lett. 9, 2367 (2018)], which accounts for anharmonicity, mode-mode coupling, and Herzberg-Teller effects. Here, we generalize this method in order to evaluate spectra at non-zero temperature. In line with thermo-field dynamics, we transform the von Neumann evolution of the coherence component of the density matrix to the Schr\"{o}dinger evolution of a wavefunction in an augmented space with twice as many degrees of freedom. Due to the efficiency of the extended thawed Gaussian approximation, this increase in the number of coordinates results in nearly no additional computational cost. More specifically, compared to the original, zero-temperature approach, the finite-temperature method requires no additional ab initio electronic structure calculations. At the same time, the new approach allows for a clear distinction among finite-temperature, anharmonicity, and Herzberg-Teller effects on spectra. We show, on a model Morse system, the advantages of the finite-temperature thawed Gaussian approximation over the commonly used global harmonic methods and apply it to evaluate the symmetry-forbidden absorption spectrum of benzene, where all of the aforementioned effects contribute.	翻訳日:2023-05-19 10:54:28 公開日:2020-07-08
# 偏光中性子を用いた電界イメージング Electric field imaging using polarized neutrons ( http://arxiv.org/abs/2006.03728v2 ) ライセンス: Link先を確認	Yuan-Yu Jau, Daniel S. Hussey, Thomas R. Gentile, and Wangchun Chen	(参考訳) 実験では、電気的に中立な粒子である中性子を用いて、分離または占有できる標的体積内の静電場を直接可視化できることを実証する。感度ポラリメトリー方式の多色スピン偏光中性子ビームを用いて電界画像を得た。この研究は、他の従来のプローブではアクセスできない物体の空間依存的な電場を撮像することにより、電位、電気分極、電荷分布、誘電率の新たな診断力を可能にする。 We experimentally demonstrate that electrically neutral particles, neutrons, can be used to directly visualize the electrostatic field inside a target volume that can be isolated or occupied. Electric-field images were obtained using a polychromatic, spin-polarized neutron beam with a sensitive polarimetry scheme. This work may enable new diagnostic power of the structure of electric potential, electric polarization, charge distribution, and dielectric constant by imaging spatially dependent electric fields in objects that cannot be accessed by other conventional probes.	翻訳日:2023-05-17 02:00:34 公開日:2020-07-08
# 単光子付加コヒーレント状態の非古典性に及ぼすポストセレクトフォンノイマン測定の影響 Effects of postselected von Neumann measurement on nonclassicality of single-photon-added coherent state ( http://arxiv.org/abs/2006.08081v2 ) ライセンス: Link先を確認	Yusuf Turek	(参考訳) 単光子付加コヒーレント状態(SPACS)の非古典性に対するフォン・ノイマン後測定の影響について検討した。光子数分布,マンデルq_{m}因子,および電界二次のスクイーズパラメータといった空間の種々の場特性に対するフォン・ノイマン測定後の明示的な表現と解析結果について検討した。その結果, 測定後のSPACSの非古典性は初期状態よりも劇的に変化した。この測定により、SPACSはより強いポアソニアン光子統計を、一定の結合強度と、ポストセレクション確率の低い大きな弱い値で保持することができた。 The effects of von Neumann postselected measurement on nonclassicality of single-photon-added coherent state (SPACS) are studied. Explicit expressions and analytical results for various field properties of SPACS such as the photon number distribution, the Mandel Q_{m} factor and the squeezing parameter of field quadrature after postselected von Neumann measurement are investigated. The results showed that the nonclassicality of SPACS after measurement changed dramatically than initial state. The measurement let SPACS possess more strong sub-Poissonian photon statistics in some definite coupling strength regimes and large weak values which accompanied by low postselection probabilities.	翻訳日:2023-05-13 20:36:48 公開日:2020-07-08
# 周期的に焼成された2脚はしごにおける偶整数位相不変量をもつ非エルミート的フロケ位相 Non-Hermitian Floquet phases with even-integer topological invariants in a periodically quenched two-leg ladder ( http://arxiv.org/abs/2006.08897v2 ) ライセンス: Link先を確認	Longwen Zhou	(参考訳) 周期的に駆動される非エルミート系は、独自のトポロジカル、動的、輸送特性を持つエキゾチックな非平衡相を持つことができる。本研究では,拡張cii対称性クラスに属する時間周期的クエンチと非エルミート効果の両方を対象とする実験的に実現可能な2脚ラダーモデルを提案する。駆動と非相互性の相互作用により、豊富な非エルミートフロケット位相相が系内に出現し、それぞれが偶数整数位相不変量 $(w_{0},w_{\pi})\in2\mathbb{z}\times2\mathbb{z}$ によって特徴づけられる。開境界条件の下では、これらの不変量はさらにシステムの端辺に局在するゼロおよび$\pi$-quasienergyモードの数を予測する。最終的に、CII対称性クラスにおける非エルミートフロケ位相の位相不変量に対する動的プローブとして使用できる平均キラル変位の一般化版を構築した。そこで本研究では,非エルミートフロッケ位相問題の新しいタイプを導入し,さらに,駆動開放系におけるトポロジーとダイナミクスの豊かさを明らかにした。 Periodically driven non-Hermitian systems could possess exotic nonequilibrium phases with unique topological, dynamical and transport properties. In this work, we introduce an experimentally realizable two-leg ladder model subjecting to both time-periodic quenches and non-Hermitian effects, which belongs to an extended CII symmetry class. Due to the interplay between drivings and nonreciprocity, rich non-Hermitian Floquet topological phases emerge in the system, with each of them been characterized by a pair of even-integer topological invariants $(w_{0},w_{\pi})\in2\mathbb{Z}\times2\mathbb{Z}$. Under the open boundary condition, these invariants further predict the number of zero- and $\pi$-quasienergy modes localized around the edges of the system. We finally construct a generalized version of the mean chiral displacement, which could be employed as a dynamical probe to the topological invariants of non-Hermitian Floquet phases in the CII symmetry class. Our work thus introduces a new type of non-Hermitian Floquet topological matter, and further reveals the richness of topology and dynamics in driven open systems.	翻訳日:2023-05-13 18:18:21 公開日:2020-07-08
# ランダムテンソルネットワークにおけるpetz再構成 Petz reconstruction in random tensor networks ( http://arxiv.org/abs/2006.12601v2 ) ライセンス: Link先を確認	Hewei Frederic Jia, Mukund Rangamani	(参考訳) ホログラフィーのランダムテンソルネットワーク玩具モデルにおけるバルク再構成の考え方について述べる。具体的には、ペッツ再構成マップが、レプリカのトリックを利用して境界データからバルク演算子を得る方法を示す。また,粗粒化とランダム射影の違いについてもコメントする機会を得た。 We illustrate the ideas of bulk reconstruction in the context of random tensor network toy models of holography. Specifically, we demonstrate how the Petz reconstruction map works to obtain bulk operators from the boundary data by exploiting the replica trick. We also take the opportunity to comment on the differences between coarse-graining and random projections.	翻訳日:2023-05-13 04:39:27 公開日:2020-07-08
# 新型コロナウイルスのリスクを見積もる個人別健康トークン Differentially Private Health Tokens for Estimating COVID-19 Risk ( http://arxiv.org/abs/2006.14329v2 ) ライセンス: Link先を確認	David Butler, Chris Hicks, James Bell, Carsten Maple, Jon Crowcroft	(参考訳) Covid-19との戦いにおいて、多くの政府や企業がいわゆる免疫パスポートを評価し、試行し、実施している。抗体や健康証明書としても知られており、他の人を危険にさらすことなく仕事や混雑した場所に戻れる技術には明確な需要がある。このようなシステムに対する大きな批判の1つは、免疫のない人々に対して不当に差別するために誤用される可能性があり、「免疫特権を持つ」人々の集団を形成することを許している。この作業では、設計によって差別的でない代替の技術的ソリューションを探究する動機があります。特に私たちは、個々のテスト結果がランダム化され、有用な集計リスク見積が計算できるような、ランダム化された健康証明書を提案します。健康トークンは,少人数の利用者による集団感染リスクを推定する有効なメカニズムを示しながら,免疫に基づく差別を緩和できることを示した。我々は、アイデンティティフリーおよびIDバインディングユースケースの文脈において、我々のアプローチの生存可能性を評価し、多くの攻撃の可能性を検討する。実験の結果,500以上のグループでは,平均で0.03 % 以下の誤差があり,複数のアイデンティティフリーな文脈において,集約された結果が有用であることがわかった。最後に,我々のソリューションの実用性を示すオープンソースプロトタイプの結果を示す。 In the fight against Covid-19, many governments and businesses are in the process of evaluating, trialling and even implementing so-called immunity passports. Also known as antibody or health certificates, there is a clear demand for any technology that could allow people to return to work and other crowded places without placing others at risk. One of the major criticisms of such systems is that they could be misused to unfairly discriminate against those without immunity, allowing the formation of an `immuno-privileged' class of people. In this work we are motivated to explore an alternative technical solution that is non-discriminatory by design. In particular we propose health tokens -- randomised health certificates which, using methods from differential privacy, allow individual test results to be randomised whilst still allowing useful aggregate risk estimates to be calculated. We show that health tokens could mitigate immunity-based discrimination whilst still presenting a viable mechanism for estimating the collective transmission risk posed by small groups of users. We evaluate the viability of our approach in the context of identity-free and identity-binding use cases and then consider a number of possible attacks. Our experimental results show that for groups of size 500 or more, the error associated with our method can be as low as 0.03 on average and thus the aggregated results can be useful in a number of identity-free contexts. Finally, we present the results of our open-source prototype which demonstrates the practicality of our solution.	翻訳日:2023-05-12 20:04:38 公開日:2020-07-08
# 開量子システムにおけるkrotovアルゴリズムのコスト関数の単調増加の証明 Proof of monotonic increase in the cost function for Krotov algorithm for open quantum systems ( http://arxiv.org/abs/2006.16817v2 ) ライセンス: Link先を確認	Tejas Shetty	(参考訳) 多くの量子制御論文では、クロトフの単調収束変分制御アルゴリズム(maday and tririnici (2003), tannor et al. (1992), zhu and rabitz (1998)など)の変種の一つを用いている。 N, Suriらによる「オープン量子系変分最適化による熱化の高速化」に関する論文。 [EPJST 227, 203 -216 (2018), arXiv:1711.08776] は、オープン量子系に対して Krotov アルゴリズムを実行する方法を提供する。我々は,論文の定理1を証明し,同一の付録1に記載された簡潔な処理に大きく拡張する。 A great number of quantum control papers have used one of the variants of the monotonically convergent variational control algorithm of Krotov (as described in Maday and Turinici (2003), Tannor et al. (1992), Zhu and Rabitz (1998), etc). The paper "Speeding up Thermalisation via Open Quantum System Variational Optimisation" by N, Suri, et al. [EPJST 227, 203 -216 (2018), arXiv:1711.08776] provides us a way of carrying out Krotov algorithm for open quantum systems. We shall prove the Theorem 1 of the paper, greatly expanding upon the brief treatment given in appendix 1 of the same.	翻訳日:2023-05-12 03:20:04 公開日:2020-07-08
# 信号検出理論を用いた高校環境における槍フィッシング感受性の定量化 Quantifying Susceptibility to Spear Phishing in a High School Environment Using Signal Detection Theory ( http://arxiv.org/abs/2006.16380v2 ) ライセンス: Link先を確認	Ploy Unchit, Sanchari Das, Andrew Kim, L. Jean Camp	(参考訳) スピアフィッシング(英:spear phishing)は、社会工学を用いて標的の被害者を標的とした機密情報を取得する詐欺攻撃である。特定の犠牲者を狙うために社会的手がかりとパーソナライズされた情報を使用することで区別される。スピアフィッシングに対するレジリエンスに関する以前の研究は、学生に不釣り合いに焦点を合わせながら、コンビニエンスサンプルに焦点を当てている。対照的に,本稿では,高校生コミュニティの評価について報告する。信号検出理論(SDT)を用いた研究には,高校生57名と教員(高校生12名,職員45名)が参加した。シナリオベースの分析を通じて、参加者はフィッシングメールと本物のメールを区別する作業を行った。その結果, 技術背景に関わらず, 自己検出における自信の偏りがみられた。これらの知見は,過疎な集団の意思決定を評価する上で重要であり,ヒトの感受性を調べることで,潜在的な槍フィッシング攻撃から人々を保護している。 Spear phishing is a deceptive attack that uses social engineering to obtain confidential information through targeted victimization. It is distinguished by its use of social cues and personalized information to target specific victims. Previous work on resilience to spear phishing has focused on convenience samples, with a disproportionate focus on students. In contrast, here, we report on an evaluation of a high school community. We engaged 57 high school students and faculty members (12 high school students, 45 staff members) as participants in research utilizing signal detection theory (SDT). Through scenario-based analysis, participants tasked with distinguishing phishing emails from authentic emails. The results revealed an overconfidence bias in self-detection from the participants, regardless of their technical background. These findings are critical for evaluating the decision-making of underrepresented populations and protecting people from potential spear phishing attacks by examining human susceptibility.	翻訳日:2023-05-12 03:19:26 公開日:2020-07-08
# コンピュータを信頼できる時(そしてできないとき) When we can trust computers (and when we can't) ( http://arxiv.org/abs/2007.03741v1 ) ライセンス: Link先を確認	Peter V. Coveney and Roger R. Highfield	(参考訳) コンピュータパワーの絶え間ない上昇により、コンピュータは科学の最も圧力のかかる問題を、さらにさらに解決できると広く期待されている。計算モデリングの限界を探求し、比較的単純で理論に固執した科学と工学の領域において、これらの手法は確かに強力である、と結論づける。それでも、コード、データ、ドキュメントの可用性は、検証、検証、不確実性定量化といった様々な技術とともに、コンピュータが生成した発見に対する信頼を構築する上で不可欠である。科学分野の複雑なシステム、特に生物学や医学において、社会科学や人文科学について何も言わずに、理論に固執していない場合、コンピュータは客観性の錯覚を生み出すことができる。また,デジタル手法では解決できない自然界の重要な側面についても論じる。長期的には、デジタル計算に現在置かれている過度な信頼を誘惑するために、アナログ手法に重点を置く必要がある。 With the relentless rise of computer power, there is a widespread expectation that computers can solve the most pressing problems of science, and even more besides. We explore the limits of computational modelling and conclude that, in the domains of science and engineering that are relatively simple and firmly grounded in theory, these methods are indeed powerful. Even so, the availability of code, data and documentation, along with a range of techniques for validation, verification and uncertainty quantification, are essential for building trust in computer generated findings. When it comes to complex systems in domains of science that are less firmly grounded in theory, notably biology and medicine, to say nothing of the social sciences and humanities, computers can create the illusion of objectivity, not least because the rise of big data and machine learning pose new challenges to reproducibility, while lacking true explanatory power. We also discuss important aspects of the natural world which cannot be solved by digital means. In the long-term, renewed emphasis on analogue methods will be necessary to temper the excessive faith currently placed in digital computation.	翻訳日:2023-05-10 23:45:55 公開日:2020-07-08
# ITForensics Managementのためのアジャイルアプローチ Agile Approach for IT Forensics Management ( http://arxiv.org/abs/2007.04125v1 ) ライセンス: Link先を確認	Matthias Schopp, Peter Hillmann	(参考訳) サイバー攻撃とITインシデントに関する法医学的な調査は、複雑化とネットワークの強化によりますます難しくなってきている。特にAdvanced Attacks(AT)では、Advanced Persistent Threatsのようなアジャイルアプローチは不可欠です。複数のシステムが攻撃(マルチホスト攻撃)に関与している。現在の法医学モデルと手続きは、そのような攻撃を分析する過程においてかなりの欠陥を示している。そこで本稿では,アジャイル手法を用いて新たな法医学的管理手法を形成する新しいフラワーモデルを提案する。このように、ATの増大する課題は満たされている。このような攻撃の法医学的な調査では、分析が必要なデータ量のために、ビッグデータの問題を解決する必要がある。提案したモデルは、早期の状態で答えるべき質問を正確に定義し、これらの質問に答えるために必要な裁判所の手続きで利用可能な証拠のみを収集することによって、この要件を満たす。さらに, 調査プロセスの異なる段階に対応する新しいAT花モデルが提示される。 The forensic investigation of cyber attacks and IT incidents is becoming increasingly difficult due to increasing complexity and intensify networking. Especially with Advanced Attacks (AT) like the increasing Advanced Persistent Threats an agile approach is indispensable. Several systems are involved in an attack (multi-host attacks). Current forensic models and procedures show considerable deficits in the process of analyzing such attacks. For this purpose, this paper presents the novel flower model, which uses agile methods and forms a new forensic management approach. In this way, the growing challenges of ATs are met. In the forensic investigation of such attacks, big data problems have to be solved due to the amount of data that needs to be analyzed. The proposed model meets this requirement by precisely defining the questions that need to be answered in an early state and collecting only the evidence usable in court proceedings that is needed to answer these questions. Additionally, the novel flower model for AT is presented that meets the different phases of an investigation process.	翻訳日:2023-05-10 23:40:59 公開日:2020-07-08
# 問題解決スキルとしての計算思考に関する研究--工学と社会科学の知識に基づく比較 Study on Computational Thinking as Problem-solving Skill: Comparison Based on Students Mindset in Engineering and Social Science ( http://arxiv.org/abs/2007.04060v1 ) ライセンス: Link先を確認	Andik Asmara	(参考訳) 21世紀のスキルを強制する能力の1つは、最上位の地位となる批判的思考と問題解決のスキルである。問題解決能力に重点を置くことは子供に教えることができ、特に小学校ではk-12の先行研究に注目する。計算思考(Computational Thinking)は、この10年で広く採用され、研究された問題解決技術である。本研究は, 計算的思考法を活用できる可能性に基づいて, 課題を解決できる学生の能力を検討することを目的とした。この研究の参加者は、台湾で学ぶ6人の国際学生と、工学と社会科学の2つのdeferent sciencesの学生であった。データインタビューの分析には質的手法が用いられ、気候変動という世界的な問題から事例を取り上げている。その結果、新しい環境で生き残ることが、彼らの問題解決スキルの実施の証拠となった。工学と社会科学の両方の学生は違いがあり、アルゴリズムに正確な構造を使う方法がある。 One of the capabilities which 21st-century skill compulsory a person is critical thinking and problem-solving skill that becomes top positions rank. Focus on problem-solving skills can be taught to a child, especially begun in elementary school refer to prior research focus on K-12. Computational thinking was one problem-solving skill that popular to implemented and studied in the current decade. This study was conducted to explore students' capability to be able solving of the problem based on the possibility use the computational thinking way. Participants in this study came from six international students that study in Taiwan and from two deferent sciences disciplines, engineering, and social science. A qualitative method was used to analyze data interviews, took example cases from the global issue that is Climate Change. The result founded that survive in a new environment was become evidence of their implementation of problem-solving skills. Problem-solving mindset both students of engineering and social science had discrepancy, those are how to use precise structure in the algorithm.	翻訳日:2023-05-10 23:40:39 公開日:2020-07-08
# 光渦中の原子の四極子吸収速度と軌道角運動量移動 Quadrupole absorption rate and orbital angular momentum transfer for atoms in optical vortices ( http://arxiv.org/abs/2007.04021v1 ) ライセンス: Link先を確認	Smail Bougouffa and Mohamed Babiker	(参考訳) 四極子遷移における光渦と原子の相互作用に関する最近の実験は、原子の電子状態と光渦場の間の軌道角運動量(oam)の交換を伴うことが示されている。理論と実験の両方による以前の研究は、電気双極子原子遷移における渦OAMの電子的自由度への移動を排除しており、電子運動へのOAMの移動を含む最も低い多極性秩序が電気四極子であることが確認されている。光渦を含む四極子転移は定量化されていないため、セシウム原子が線形偏光渦の場下にあるcsの6^2s_{1/2}\rightarrow 5^2d_{5/2}$を参照して、oam転移に伴う吸収率を評価する。実験により, 適度な光強度の吸収速度は, 四重極自発放射率よりも小さいが, 現代の分光法では測定範囲に留まることが示唆された。 Recent experiments involving the interaction of optical vortices with atoms in quadrupole transitions have been shown to be accompanied by the exchange of orbital angular momentum (OAM) between the electronic states of the atom and the optical vortex field. Earlier work by both theory and experiment had ruled out the transfer of a vortex OAM to the electronic degrees of freedom in an electric dipole atomic transition and it has been confirmed that the lowest multipolar order involving an OAM transfer to the electronic motion is indeed the electric quadrupole. Hitherto, the quadrupole transition involving optical vortices has not been quantified and we have thus set out to evaluate the absorption rate accompanied by an OAM transfer with reference to the $6^2S_{1/2}\rightarrow 5^2D_{5/2}$ in Cs when caesium atoms are subject to the field of a linearly polarized optical vortex. Our results assuming typical experimentally accessible parameters indicate that the absorption rate for moderate light intensities is smaller than the quadrupole spontaneous emission rate, but should still be within the measurement capabilities of modern spectroscopic techniques.	翻訳日:2023-05-10 23:40:09 公開日:2020-07-08
# 量子有限オートマトンによる化学反応のモデル化 A Quantum Finite Automata Approach to Modeling the Chemical Reactions ( http://arxiv.org/abs/2007.03976v1 ) ライセンス: Link先を確認	Amandeep Singh Bhatia, Shenggen Zheng	(参考訳) 近年、モデリングへの関心は分子レベルから原子レベル、量子スケールへと大きく高まっている。計算化学の分野は、原子や分子から工業規模のプロセスまで、システムの操作とシミュレーションのための計算モデルを設計する上で重要な役割を担っている。これは計算能力の大幅な増加とアルゴリズムの効率に影響を受けている。古典的オートマトン理論を用いた化学反応を熱力学的に表現することは、コンピュータ科学に大きな影響を与えた。量子計算モデルを用いた化学情報処理の研究は自然な目標である。本稿では,線形時間で停止する2方向量子有限オートマトンを用いた化学反応のモデル化を行った。さらに、古典的なプッシュダウンオートマトンは、複数のスタックを持つ化学反応のために設計することができる。化学受容/放出シグネチャと量子オートマトンモデルを組み合わせて計算の汎用性を高めることが証明されている。 In recent years, the modeling interest has increased significantly from the molecular level to the atomic and quantum scale. The field of computational chemistry plays a significant role in designing computational models for the operation and simulation of systems ranging from atoms and molecules to industrial-scale processes. It is influenced by a tremendous increase in computing power and the efficiency of algorithms. The representation of chemical reactions using classical automata theory in thermodynamic terms had a great influence on computer science. The study of chemical information processing with quantum computational models is a natural goal. In this paper, we have modeled chemical reactions using two-way quantum finite automata, which are halted in linear time. Additionally, classical pushdown automata can be designed for such chemical reactions with multiple stacks. It has been proven that computational versatility can be increased by combining chemical accept/reject signatures and quantum automata models.	翻訳日:2023-05-10 23:39:01 公開日:2020-07-08
# 量子群対称性を持つ量子チャネル Quantum channels with quantum group symmetry ( http://arxiv.org/abs/2007.03901v1 ) ライセンス: Link先を確認	Hun Hee Lee and Sang-Gyun Youn	(参考訳) 本稿では、任意のコンパクトな量子群が量子チャネルの対称性群として使用できることを証明し、共変チャネルの概念を導出する。そして、同変チャネルの凸集合の構造を、関連する融合則に対する多重性のない条件を仮定して、すべての極点を同定することにより、最近の結果の広範な一般化を提供する。群対称性と対照的な量子群対称性の存在は、量子置換群の例と$SU_q(2)$で強調される。後者の例では、非カック型条件から生じるハイゼンベルク像の必要性を見出す。本論文は、射影表現に関する共変性によって終わり、ワイル共変チャネルとそのフェルミオン的類似性に戻る。 In this paper we will demonstrate that any compact quantum group can be used as symmetry groups for quantum channels, which leads us to the concept of covariant channels. We, then, unearth the structure of the convex set of covariant channels by identifying all extreme points under the assumption of multiplicity-free condition for the associated fusion rule, which provides a wide generalization of some recent results. The presence of quantum group symmetry contrast to the group symmetry will be highlighted in the examples of quantum permutation groups and $SU_q(2)$. In the latter example, we will see the necessity of the Heisenberg picture coming from the non-Kac type condition. This paper ends with the covariance with respect to projective representations, which leads us back to Weyl covariant channels and its fermionic analogue.	翻訳日:2023-05-10 23:38:38 公開日:2020-07-08
# Ge/Siナノワイヤ量子ドットにおける強スピン軌道相互作用とホールスピンの$g$-factor再正規化 Strong spin-orbit interaction and $g$-factor renormalization of hole spins in Ge/Si nanowire quantum dots ( http://arxiv.org/abs/2007.04308v1 ) ライセンス: Link先を確認	F. N. M. Froning, M. J. Ran\v{c}i\'c, B. Het\'enyi, S. Bosco, M. K. Rehmann, A. Li, E. P. A. M. Bakkers, F. A. Zwanenburg, D. Loss, D. M. Zumb\"uhl, F. R. Braakman	(参考訳) スピン軌道相互作用は、量子計算の中心にスピン量子ビット、位相的に非自明な状態の研究、スピントロニクスにおける様々な応用がある。 ge/siコア/シェルナノワイヤのホールスピンは、強くて電気的に調整可能なスピン軌道相互作用を経験し、これらの分野の研究に特に有望なプラットフォームとなっている。 ge/siナノワイヤ内の2重量子ドットに閉じ込められたホールスピンのスピン軌道相互作用の強度をスピンブロック輸送系内におけるスピン混合遷移の測定により実験的に決定する。驚くほど短いスピン軌道長が$\sim$65 nmであり、量子ドット長とインタードット距離に匹敵する。さらに, 印加磁場のホール状態に対する大きな軌道効果を観測し, スピン混合遷移エネルギーの磁場依存性を明らかにした。これらの軌道効果とともに、強いスピン軌道相互作用は、磁場による$g$-factorの顕著な向上を引き起こすが、大きなスピン軌道相互作用強度は、この物質系の予測された直接ラシュバスピン軌道相互作用と一致し、スピン量子ビットの超高速なラビ振動と効率的なクビット量子ビット相互作用、およびマヨラナゼロモードの研究に適したプラットフォームを提供する。 The spin-orbit interaction lies at the heart of quantum computation with spin qubits, research on topologically non-trivial states, and various applications in spintronics. Hole spins in Ge/Si core/shell nanowires experience a spin-orbit interaction that has been predicted to be both strong and electrically tunable, making them a particularly promising platform for research in these fields. We experimentally determine the strength of spin-orbit interaction of hole spins confined to a double quantum dot in a Ge/Si nanowire by measuring spin-mixing transitions inside a regime of spin-blockaded transport. We find a remarkably short spin-orbit length of $\sim$65 nm, comparable to the quantum dot length and the interdot distance. We additionally observe a large orbital effect of the applied magnetic field on the hole states, resulting in a large magnetic field dependence of the spin-mixing transition energies. Strikingly, together with these orbital effects, the strong spin-orbit interaction causes a significant enhancement of the $g$-factor with magnetic field.The large spin-orbit interaction strength demonstrated is consistent with the predicted direct Rashba spin-orbit interaction in this material system and is expected to enable ultrafast Rabi oscillations of spin qubits and efficient qubit-qubit interactions, as well as provide a platform suitable for studying Majorana zero modes.	翻訳日:2023-05-10 23:32:30 公開日:2020-07-08
# ポリエンの高エネルギー三重項ペア状態と分子内一重項分裂における役割 Higher energy triplet-pair states in polyenes and their role in intramolecular singlet fission ( http://arxiv.org/abs/2007.04305v1 ) ライセンス: Link先を確認	Darren J Valentine, Dilhan Manawadu, and William Barford	(参考訳) 明るい状態(1^1B_u^+$/S_2$) を超えるエネルギーを持つ拡張ポリエン系は、一重項分裂によって三重項を生成する。この過程は、2^1A_g^-$/$S_1$状態には関与せず、他の状態が役割を果たすことを示唆している。パリエ・パリル・ピエルス・ハミルトンの密度行列再正規化群 (DMRG) 計算を用いて, 一重項分裂に関与する可能性のある候補状態について検討した。緩和された1^1b_u^-$と3^1a_g^-$ singlet状態と1^5a_g^-$ quintet状態は$s_2$状態以下であることがわかった。 1^1b_u^-$,3^1a_g^-$,1^5a_g^-$状態はすべて三重項三重項を持つと考えられており、三重項状態の積と結合二重化、スピンスピン相関、波動関数の重なりの計算によって確認される。したがって、三重項対と電子ホール特性の両方からなる一重項励起(つまり、$2^1A_g^-$, $1^1B_u^-$, $3^1A_g^-$, $\cdots$)があり、基本的に同じ励起であるが質量中心エネルギーを持つ。この族で最も低いエネルギー元素である2^1A_g^-$状態は一重項核分裂を起こさない。しかし、より高いエネルギーメンバー(例えば3^1a_g^-$)は、運動エネルギーの増加と電子格子緩和の低減により、特定の鎖長に対して一重項分裂を起こすことができる。 Probing extended polyene systems with energy in excess of the bright state ($1^1B_u^+$/$S_2$) band edge generates triplets via singlet fission. This process is not thought to involve the $2^1A_g^-$/$S_1$ state, suggesting that other states play a role. Using density matrix renormalisation group (DMRG) calculations of the Pariser-Parr-Pople-Peierls Hamiltonian, we investigate candidate states that could be involved in singlet fission. We find that the relaxed $1^1B_u^-$, and $3^1A_g^-$ singlet states and $1^5A_g^-$ quintet state lie below the $S_2$ state. The $1^1B_u^-$, $3^1A_g^-$ and $1^5A_g^-$ states are all thought to have triplet-triplet character, which is confirmed by our calculations of bond dimerization, spin-spin correlation and wavefunction overlap with products of triplet states. We thus show that there is a family of singlet excitations(i.e., $2^1A_g^-$, $1^1B_u^-$, $3^1A_g^-$, $\cdots$), composed of both triplet-pair and electron-hole character, which are fundamentally the same excitation, but have different center-of-mass energies. The lowest energy member of this family, the $2^1A_g^-$ state, cannot undergo singlet fission. But higher energy members (e.g., the $3^1A_g^-$) state, owing to their increased kinetic energy and reduced electron-lattice relaxation, can undergo singlet fission for certain chain lengths.	翻訳日:2023-05-10 23:32:03 公開日:2020-07-08
# 量子ファンアウト:回路最適化と技術モデリング Quantum Fan-out: Circuit Optimizations and Technology Modeling ( http://arxiv.org/abs/2007.04246v1 ) ライセンス: Link先を確認	Pranav Gokhale, Samantha Koretsky, Shilin Huang, Swarnadeep Majumder, Andrew Drucker, Kenneth R. Brown, Frederic T. Chong	(参考訳) 命令スケジューリングは、古典計算と同様に、量子コンピューティングにおける重要なコンパイラ最適化である。現在のスケジューラは、キュービットが重複しない限り、命令の同時実行を可能にすることで、データの並列処理を最適化する。しかし、多くの量子ハードウェアプラットフォームでは、重なり合う量子ビットの命令を__globalインタラクション__で同時に実行できる。例えば、従来の量子回路におけるファンアウトは論理レベルで見る場合にのみシーケンシャルに実装できるが、物理的レベルでのグローバルな相互作用はファンアウトを1ステップで達成できる。 NISQ(Noisy Intermediate-Scale Quantum)ワークロードの回路合成を最適化するために,この同時ファンアウトプリミティブを活用する。さらに,ファンアウトに基づく新しい量子メモリアーキテクチャを提案する。我々の研究はファンアウトプリミティブのハードウェア実装にも取り組んでいる。我々は、閉じ込められたイオン量子コンピュータの現実的なシミュレーションを行う。また,超伝導量子ビットを用いたファンアウトの概念実証実験を行った。 NISQアプリケーション回路と量子メモリアーキテクチャに対して,現実的なノイズモデルの下で深度(ランタイム)および忠実度推定を行う。我々のシミュレーションは、実行時の漸近的な利点を伴う有望な結果を示し、7～24%のエラーの低減を示す。 Instruction scheduling is a key compiler optimization in quantum computing, just as it is for classical computing. Current schedulers optimize for data parallelism by allowing simultaneous execution of instructions, as long as their qubits do not overlap. However, on many quantum hardware platforms, instructions on overlapping qubits can be executed simultaneously through __global interactions__. For example, while fan-out in traditional quantum circuits can only be implemented sequentially when viewed at the logical level, global interactions at the physical level allow fan-out to be achieved in one step. We leverage this simultaneous fan-out primitive to optimize circuit synthesis for NISQ (Noisy Intermediate-Scale Quantum) workloads. In addition, we introduce novel quantum memory architectures based on fan-out. Our work also addresses hardware implementation of the fan-out primitive. We perform realistic simulations for trapped ion quantum computers. We also demonstrate experimental proof-of-concept of fan-out with superconducting qubits. We perform depth (runtime) and fidelity estimation for NISQ application circuits and quantum memory architectures under realistic noise models. Our simulations indicate promising results with an asymptotic advantage in runtime, as well as 7--24% reduction in error.	翻訳日:2023-05-10 23:30:54 公開日:2020-07-08
# 時間依存背景におけるロンドンの超伝導アプローチ London superconductivity approach in a time-dependent background ( http://arxiv.org/abs/2007.04230v1 ) ライセンス: Link先を確認	Vanderley Aguiar, Jo\~ao P. G. Nascimento, Ilde Guedes and Raimundo N. Costa Filho	(参考訳) 本論文の主な目的は、ロンドンアプローチを用いて時間依存パラメータを持つ超伝導体における電荷空間の正確な量子解を得ることである。本稿では、lewis and riesenfeld invariant operator法に基づく超伝導体内部電荷の新しい量子化スキームを提案する。得られた波動関数から,時間依存の不確かさとシステムの平均エネルギーを計算した。シャノンエントロピーや複雑性といった情報尺度も得られた。後者は常に時間非依存であり、伝導度にも依存しない。他の量は時間依存関数 \r{ho}(t), c 個の数で書かれ、非線形微分方程式を満たす。 The main goal of this paper is to obtain the exact quantum solutions for charge space in a superconductor with time-dependent parameters using the London approach. We introduce a new quantization scheme for the charge inside a superconductor based on the Lewis and Riesenfeld invariant operator method. From the wave-functions obtained, we calculated the time-dependent uncertainties and the mean energy of the system. Information measures were also obtained, such as Shannon entropy and complexity. The later is always time-independent and also does not depend on conductivity. The others quantities are written in terms of a time-dependent function, \r{ho}(t), c-number quantity satisfying a nonlinear differential equation.	翻訳日:2023-05-10 23:30:38 公開日:2020-07-08
# Nレベル量子スピン系のロバストフィードバック安定化 Robust feedback stabilization of N-level quantum spin systems ( http://arxiv.org/abs/2007.04211v1 ) ライセンス: Link先を確認	Weichao Liang, Nina H. Amini, and Paolo Mason	(参考訳) 本稿では,nレベル量子角運動量系と電磁場との相互作用について検討する。推定された量子状態を表す追加状態を導入する必要があるので、初期状態と物理パラメータの無知を仮定する。量子状態の進化とその推定は、結合確率マスター方程式によって記述される。本稿では,フィードバック制御系の存在下でのシステムの漸近的挙動について検討する。我々は,フィードバック制御器と推定パラメータに十分な条件を与え,結合確率系の指数的安定化を測定者の固有状態に保証する。さらに、対応する収束率を推定する。このような条件を満たすパラメータ化されたフィードバック則も提供します。本研究は, [21] のフィードバック安定化戦略が, 推定状態の不正確な初期化や未知の物理パラメータに対する堅牢性を示すものである。 In this paper, we consider N-level quantum angular momentum systems interacting with electromagnetic fields undergoing continuous-time measurements. We suppose unawareness of the initial state and physical parameters, entailing the introduction of an additional state representing the estimated quantum state. The evolution of the quantum state and its estimation is described by a coupled stochastic master equation. Here, we study the asymptotic behavior of such a system in presence of a feedback controller. We provide sufficient conditions on the feedback controller and on the estimated parameters that guarantee exponential stabilization of the coupled stochastic system towards an eigenstate of the measurement operator. Furthermore, we estimate the corresponding rate of convergence. We also provide parametrized feedback laws satisfying such conditions. Our results show the robustness of the feedback stabilization strategy considered in [21] in case of imprecise initialization of the estimated state and with respect to the unknown physical parameters.	翻訳日:2023-05-10 23:30:10 公開日:2020-07-08
# 世論調査におけるオンライン暗黙の関連テストの利用 Using Online Implicit Association Tests in Opinion Polling ( http://arxiv.org/abs/2007.04183v1 ) ライセンス: Link先を確認	Alan Smeaton and Hyowon Lee and Niamh Morris and David Hanley	(参考訳) 世論調査は、今や私たちの日々のニュースサイクルのデファクトな要素であり、その結果が政府やビジネスに常に明らかな方法で影響を与えているため、社会の非常に重要な要素になっています。しかし、ポーリングは必ずしも正確というわけではないし、1930年代までさかのぼる世界に大きな影響を与えてきた真に不正確なポーリング結果もいくつかある。本稿では,現代の不正確な世論調査の理由の一つとして,社会的に望ましい反応 (shy vote) 現象を分析した。暗黙の連帯試験 (IATs) を通じてそれを公開する方法を説明し、アイルランドのイギリスに対する意見に関する小さな調査において、シャイな有権者効果を示す。従来の世論調査にIATを取り入れることで、これらをオンラインで正確に実施できるという事実を指摘するとともに、世論の世論調査の機会を制限するCovid-19規制時代において、より多種多様な回答者のサンプルにポーリングが到達できるようにする。 Opinion polls have now become a very important component of society because they are now a defacto component of our daily news cycle and because their results influence governments and business in ways which are not always obvious to us. However, polling is not always accurate and there have been some really inaccurate polling results which have had major influences on the world going back to the 1930s but also as recently as just the last 3 or 4 years. In this paper we analyse the phenomenon of socially desirable responding (shy voters) which has emerged as one of the reasons for modern day inaccurate polling. We describe how it can be exposed through implicit association tests (IATs) and we demonstrate the shy voter effect in a small survey on opinions in Ireland towards the United Kingdom. We argue for inclusion of IATs in traditional polling and point to the fact that these can be conducted accurately online, which also allows polling to reach a larger and more diverse sample of respondents in the days of Covid-19 restrictions which restricts the opportunities for poll sampling from the general public.	翻訳日:2023-05-10 23:29:59 公開日:2020-07-08
# NERD: リスクデータストリームの予測のためのニューラルネットワーク NERD: Neural Network for Edict of Risky Data Streams ( http://arxiv.org/abs/2007.07753v1 ) ライセンス: Link先を確認	Sandro Passarelli, Cem G\"undogan, Lars Stiemert, Matthias Schopp, Peter Hillmann	(参考訳) サイバーインシデントは、単純な接続損失から断続的な攻撃まで、幅広い原因を持つ可能性がある。一度サイバーセキュリティのインシデントやシステム障害が特定できれば、どのように進むかを決めることはしばしば複雑になる。特に、実際の原因が直接的詳細決定可能でない場合。そこで我々は,サイバーインシデント対応支援システムのコンセプトを開発した。このシステムには侵入検知システムや監視ツールなど,複数の情報ソースが組み込まれている。同期パッケージ比のような20以上の重要な属性を使用して、潜在的なセキュリティインシデントを特定し、データを異なる優先順位カテゴリに分類する。その後、システムは人工知能を使用してさらなる意思決定プロセスをサポートし、取締役会を簡潔にするために対応するレポートを生成する。この情報から、その原因やトラブルシューティング対策について、適切かつ詳細な提案がなされる。学習プロセスの入力としてラベル付きフローデータを使用することで,問題の解決に関するユーザからのフィードバックを今後の意思決定に含める。プロトタイプは、意思決定が持続的に改善され、サイバーインシデント処理プロセスがより効果的になることを示している。 Cyber incidents can have a wide range of cause from a simple connection loss to an insistent attack. Once a potential cyber security incidents and system failures have been identified, deciding how to proceed is often complex. Especially, if the real cause is not directly in detail determinable. Therefore, we developed the concept of a Cyber Incident Handling Support System. The developed system is enriched with information by multiple sources such as intrusion detection systems and monitoring tools. It uses over twenty key attributes like sync-package ratio to identify potential security incidents and to classify the data into different priority categories. Afterwards, the system uses artificial intelligence to support the further decision-making process and to generate corresponding reports to brief the Board of Directors. Originating from this information, appropriate and detailed suggestions are made regarding the causes and troubleshooting measures. Feedback from users regarding the problem solutions are included into future decision-making by using labelled flow data as input for the learning process. The prototype shows that the decision making can be sustainably improved and the Cyber Incident Handling process becomes much more effective.	翻訳日:2023-05-10 23:22:27 公開日:2020-07-08
# mog-vqe:多目的遺伝的変分量子固有解法 MoG-VQE: Multiobjective genetic variational quantum eigensolver ( http://arxiv.org/abs/2007.04424v1 ) ライセンス: Link先を確認	D. Chivilikhin, A. Samarin, V. Ulyantsev, I. Iorsh, A. R. Oganov, O. Kyriienko	(参考訳) 変分量子固有解法(VQE)は、短期量子コンピュータのための最初の実用的なアルゴリズムとして登場した。その成功は主に選択された変分アンザッツに依存し、ハミルトニアンの近似基底状態を作成する量子回路に対応する。典型的には、高い表現精度(回路深度を犠牲にして)を達成するか、あるいは正確な基底エネルギーへの収束を犠牲にする浅い回路を使用する。本稿では,低深度と精度の向上を両立させる手法を提案し,ハードウェア効率の良いVQEのための遺伝的改良アンサッツを考案した。本手法は多目的遺伝的変分量子固有解法 (MoG-VQE) を多目的パレート最適化に頼り, 非支配的ソート遺伝的アルゴリズム (NSGA-II) を用いて変分アンザッツの位相を最適化する。各回路トポロジに対して、共分散行列適応進化戦略(CMA-ES)を用いて、単一キュービット回転の角度を最適化する。提案プロトコルでは, 得られたエネルギー精度と2ビットゲート数の両面で高い性能を同時に提供する回路を作成できるので, パレート最適解に到達しようと試みる。種々の分子 (H$_2$, H$_4$, H$_6$, BeH$_2$, LiH) に対して実験を行い, 標準のハードウェア効率のアンサッツと比較して, 2量子ゲート数の約10倍の減少を観測した。 12量子ビットのLiHハミルトニアンでは、既に12個のCNOTで化学的精度に達することができる。その結果、アルゴリズムは、短期デバイスに対する基底状態の忠実度を著しく向上させる。 Variational quantum eigensolver (VQE) emerged as a first practical algorithm for near-term quantum computers. Its success largely relies on the chosen variational ansatz, corresponding to a quantum circuit that prepares an approximate ground state of a Hamiltonian. Typically, it either aims to achieve high representation accuracy (at the expense of circuit depth), or uses a shallow circuit sacrificing the convergence to the exact ground state energy. Here, we propose the approach which can combine both low depth and improved precision, capitalizing on a genetically-improved ansatz for hardware-efficient VQE. Our solution, the multiobjective genetic variational quantum eigensolver (MoG-VQE), relies on multiobjective Pareto optimization, where topology of the variational ansatz is optimized using the non-dominated sorting genetic algorithm (NSGA-II). For each circuit topology, we optimize angles of single-qubit rotations using covariance matrix adaptation evolution strategy (CMA-ES) -- a derivative-free approach known to perform well for noisy black-box optimization. Our protocol allows preparing circuits that simultaneously offer high performance in terms of obtained energy precision and the number of two-qubit gates, thus trying to reach Pareto-optimal solutions. Tested for various molecules (H$_2$, H$_4$, H$_6$, BeH$_2$, LiH), we observe nearly ten-fold reduction in the two-qubit gate counts as compared to the standard hardware-efficient ansatz. For 12-qubit LiH Hamiltonian this allows reaching chemical precision already at 12 CNOTs. Consequently, the algorithm shall lead to significant growth of the ground state fidelity for near-term devices.	翻訳日:2023-05-10 23:21:33 公開日:2020-07-08
# 絡み合いレンズによる準粒子の観察 Observing Quasiparticles through the Entanglement Lens ( http://arxiv.org/abs/2007.04318v1 ) ライセンス: Link先を確認	Yizhi You, Elisabeth Wybo, Frank Pollmann, S. L. Sondhi	(参考訳) 相互作用する量子系の低エネルギー物理学は通常、関連する準粒子や低エネルギー励起とその量子数を同定することで理解される。我々は、対応する量子状態における絡み合いの性質を調べるために、これを超える量子情報フレームワークを提案する。我々は、量子数、局所性、分数化を含む準粒子の健全な特徴が、絡み合いスペクトルや相互情報に反映されていると論じる。これらのアイデアを、積分可能性破壊摂動を持つ$d=1$横場イジングモデルの特定の文脈で説明する。 The low energy physics of interacting quantum systems is typically understood through the identification of the relevant quasiparticles or low energy excitations and their quantum numbers. We present a quantum information framework that goes beyond this to examine the nature of the entanglement in the corresponding quantum states. We argue that the salient features of the quasiparticles, including their quantum numbers, locality and fractionalization are reflected in the entanglement spectrum and in the mutual information. We illustrate these ideas in the specific context of the $d=1$ transverse field Ising model with an integrability breaking perturbation.	翻訳日:2023-05-10 23:19:47 公開日:2020-07-08
# MTI-Net:マルチタスク学習のためのマルチスケールタスクインタラクションネットワーク MTI-Net: Multi-Scale Task Interaction Networks for Multi-Task Learning ( http://arxiv.org/abs/2001.06902v5 ) ライセンス: Link先を確認	Simon Vandenhende, Stamatios Georgoulis and Luc Van Gool	(参考訳) 本稿では,マルチタスク学習環境においてタスク情報を蒸留する際に,複数のスケールでタスクインタラクションを検討することの重要性について論じる。共通の信念とは対照的に、特定のスケールで高い親和性を持つタスクは、他のスケールでこの動作を維持することが保証されていない。我々はこの発見を3つの方法で構築する新しいアーキテクチャ MTI-Net を提案する。まず、マルチスケールのマルチモーダル蒸留ユニットを介して、あらゆるスケールでのタスクインタラクションを明示的にモデル化する。第二に、機能伝達モジュールを介して、より低いスケールから高いスケールで蒸留されたタスク情報を伝播する。第3に、すべてのスケールから機能集約ユニットを介して洗練されたタスク特徴を集約し、最終的なタスク毎の予測を生成する。 2つのマルチタスク高密度ラベル付けデータセットに対する大規模な実験により、従来の研究とは異なり、我々のマルチタスクモデルはマルチタスク学習の潜在能力、すなわち、メモリフットプリントが小さくなり、計算回数が減り、シングルタスク学習の性能が向上することを示した。コードは、https://github.com/SimonVandenhende/Multi-Task-Learning-PyTorchで公開されている。 In this paper, we argue about the importance of considering task interactions at multiple scales when distilling task information in a multi-task learning setup. In contrast to common belief, we show that tasks with high affinity at a certain scale are not guaranteed to retain this behaviour at other scales, and vice versa. We propose a novel architecture, namely MTI-Net, that builds upon this finding in three ways. First, it explicitly models task interactions at every scale via a multi-scale multi-modal distillation unit. Second, it propagates distilled task information from lower to higher scales via a feature propagation module. Third, it aggregates the refined task features from all scales via a feature aggregation unit to produce the final per-task predictions. Extensive experiments on two multi-task dense labeling datasets show that, unlike prior work, our multi-task model delivers on the full potential of multi-task learning, that is, smaller memory footprint, reduced number of calculations, and better performance w.r.t. single-task learning. The code is made publicly available: https://github.com/SimonVandenhende/Multi-Task-Learning-PyTorch.	翻訳日:2023-01-08 12:36:28 公開日:2020-07-08
# CNNに基づく高速ソースデバイス識別 CNN-based fast source device identification ( http://arxiv.org/abs/2001.11847v3 ) ライセンス: Link先を確認	Sara Mandelli, Davide Cozzolino, Paolo Bestagini, Luisa Verdoliva, Stefano Tubaro	(参考訳) ソース識別は、画像の原点を追跡することができるため、画像鑑識において重要なトピックである。これは知的財産を請求する上で貴重な情報であり、また違法な資料の著者も明らかにしている。本稿では,センサノイズに基づくデバイス識別の問題に対処し,畳み込みニューラルネットワーク(CNN)を用いた高速かつ正確な解法を提案する。具体的には、カメラ指紋と画像ノイズをパッチレベルで比較する方法を学習する2チャンネルCNNを提案する。提案手法は従来の手法よりもはるかに高速であり,精度の向上が期待できる。このアプローチは、ソーシャルネットワークなど、大規模な画像データベースが分析されるシナリオに特に適している。この例では、ソーシャルメディアにアップロードされた画像は通常、少なくとも2つの圧縮段階にあるため、二重JPEG圧縮画像の調査を含め、常に標準的なアプローチよりも高い精度を報告している。 Source identification is an important topic in image forensics, since it allows to trace back the origin of an image. This represents a precious information to claim intellectual property but also to reveal the authors of illicit materials. In this paper we address the problem of device identification based on sensor noise and propose a fast and accurate solution using convolutional neural networks (CNNs). Specifically, we propose a 2-channel-based CNN that learns a way of comparing camera fingerprint and image noise at patch level. The proposed solution turns out to be much faster than the conventional approach and to ensure an increased accuracy. This makes the approach particularly suitable in scenarios where large databases of images are analyzed, like over social networks. In this vein, since images uploaded on social media usually undergo at least two compression stages, we include investigations on double JPEG compressed images, always reporting higher accuracy than standard approaches.	翻訳日:2023-01-05 06:28:32 公開日:2020-07-08
# 機械学習:logitベースの分類器に対する線形フィルタリング Machine Unlearning: Linear Filtration for Logit-based Classifiers ( http://arxiv.org/abs/2002.02730v2 ) ライセンス: Link先を確認	Thomas Baumhauer and Pascal Sch\"ottle and Matthias Zeppelzauer	(参考訳) 最近制定された法律では、個人が自分の個人データがどんな風に使用されるかを決める権利、特に「忘れられる権利」を付与している。個人がモデルのトレーニングプロセスの一部であるデータを使用する許可を取り除いた場合、どのように進めばいいのか? この質問から、機械学習の分野が生まれ、それは「モデルからトレーニングデータを削除」する方法の調査として広く説明できる。我々の研究は、分類モデル(ディープニューラルネットワークなど)のクラス全体の削除要求の設定に関するこの研究の方向性を補完する。最初のステップとして,直感的で計算効率の良い衛生手法として線形濾過を提案する。本実験は,ナイーブ削除スキームに対する敵意設定の利点を示す。 Recently enacted legislation grants individuals certain rights to decide in what fashion their personal data may be used, and in particular a "right to be forgotten". This poses a challenge to machine learning: how to proceed when an individual retracts permission to use data which has been part of the training process of a model? From this question emerges the field of machine unlearning, which could be broadly described as the investigation of how to "delete training data from models". Our work complements this direction of research for the specific setting of class-wide deletion requests for classification models (e.g. deep neural networks). As a first step, we propose linear filtration as a intuitive, computationally efficient sanitization method. Our experiments demonstrate benefits in an adversarial setting over naive deletion schemes.	翻訳日:2023-01-03 03:51:31 公開日:2020-07-08
# privacyfl: プライバシー保護と安全な連合学習のためのシミュレータ PrivacyFL: A simulator for privacy-preserving and secure federated learning ( http://arxiv.org/abs/2002.08423v2 ) ライセンス: Link先を確認	Vaikkunth Mugunthan, Anton Peraire-Bueno and Lalana Kagal	(参考訳) フェデレーション学習(federated learning)は、分散クライアントがトレーニングデータをローカライズしながら共有機械学習モデルを共同学習できるようにするテクニックである。これは、トレーニングされたモデルの重みやパラメータからトレーニングデータセットに関する情報をリークすることができるため、データプライバシのリスクを低減することができる。フェデレーション学習環境のセットアップ、特にセキュリティとプライバシの保証は、操作可能な多数の設定とパラメータを備えた、時間を要するプロセスである。クライアントがコラボレーションが実現可能であることを保証し、モデル精度を改善するためには、プライバシ保護とセキュアなフェデレーション学習のための実世界のシミュレータが必要である。本稿では,フェデレート学習環境のための拡張可能で,構成が容易でスケーラブルなシミュレータであるPrivacyFLを紹介する。主な機能としては、レイテンシシミュレーション、クライアントからの離脱に対する堅牢性、集中型と分散型の学習のサポート、差分プライバシーとセキュアなマルチパーティ計算に基づく設定可能なプライバシとセキュリティメカニズムなどがある。本稿では,我々の研究を動機付け,シミュレータと関連するプロトコルのアーキテクチャを説明し,その幅広い機能とその利点を浮き彫りにした多数のシナリオにおける評価について論じる。本稿は,様々な状況下での連携型学習環境の実現可能性の検証という,現実的な重要な課題に対処する。病院、銀行、研究機関といった、大量の機密データを持ち、協力したい組織は、プライバシーを守り、セキュアな方法でそれを可能にするシステムを持つことで、大きな利益を享受できるため、実践的な影響も大きい。 Federated learning is a technique that enables distributed clients to collaboratively learn a shared machine learning model while keeping their training data localized. This reduces data privacy risks, however, privacy concerns still exist since it is possible to leak information about the training dataset from the trained model's weights or parameters. Setting up a federated learning environment, especially with security and privacy guarantees, is a time-consuming process with numerous configurations and parameters that can be manipulated. In order to help clients ensure that collaboration is feasible and to check that it improves their model accuracy, a real-world simulator for privacy-preserving and secure federated learning is required. In this paper, we introduce PrivacyFL, which is an extensible, easily configurable and scalable simulator for federated learning environments. Its key features include latency simulation, robustness to client departure, support for both centralized and decentralized learning, and configurable privacy and security mechanisms based on differential privacy and secure multiparty computation. In this paper, we motivate our research, describe the architecture of the simulator and associated protocols, and discuss its evaluation in numerous scenarios that highlight its wide range of functionality and its advantages. Our paper addresses a significant real-world problem: checking the feasibility of participating in a federated learning environment under a variety of circumstances. It also has a strong practical impact because organizations such as hospitals, banks, and research institutes, which have large amounts of sensitive data and would like to collaborate, would greatly benefit from having a system that enables them to do so in a privacy-preserving and secure manner.	翻訳日:2022-12-30 13:44:45 公開日:2020-07-08
# 自己回帰モデルによる予測サンプリング Predictive Sampling with Forecasting Autoregressive Models ( http://arxiv.org/abs/2002.09928v2 ) ライセンス: Link先を確認	Auke Wiggers, Emiel Hoogeboom	(参考訳) 自動回帰モデル(ARM)は現在、画像とオーディオデータの可能性に基づくモデリングにおいて最先端のパフォーマンスを持っている。一般的に、ニューラルネットワークベースのARMは高速な推論を可能にするように設計されている。本稿では,ARMの高速推論特性を利用してサンプリングを高速化する手法である予測サンプリングアルゴリズムを提案する。本稿では,arm固定点反復によるサンプリングと学習予測モジュールの2種類の予測サンプリングを提案する。有効性は2つの設定で示される。 i)二項mnist,svhn,cifar10の明示的確率モデリング及び二 SVHN、CIFAR10、Imagenet32で訓練されたオートエンコーダにおける離散潜時モデリング実験により,ARM推論呼び出し数やサンプリング速度において,ベースラインよりもかなりの改善が見られた。 Autoregressive models (ARMs) currently hold state-of-the-art performance in likelihood-based modeling of image and audio data. Generally, neural network based ARMs are designed to allow fast inference, but sampling from these models is impractically slow. In this paper, we introduce the predictive sampling algorithm: a procedure that exploits the fast inference property of ARMs in order to speed up sampling, while keeping the model intact. We propose two variations of predictive sampling, namely sampling with ARM fixed-point iteration and learned forecasting modules. Their effectiveness is demonstrated in two settings: i) explicit likelihood modeling on binary MNIST, SVHN and CIFAR10, and ii) discrete latent modeling in an autoencoder trained on SVHN, CIFAR10 and Imagenet32. Empirically, we show considerable improvements over baselines in number of ARM inference calls and sampling speed.	翻訳日:2022-12-29 09:19:20 公開日:2020-07-08
# 対人強化学習によるロバスト市場形成 Robust Market Making via Adversarial Reinforcement Learning ( http://arxiv.org/abs/2003.01820v2 ) ライセンス: Link先を確認	Thomas Spooner, Rahul Savani	(参考訳) 本稿では, 対人強化学習(ARL)を用いて, 対人的かつ適応的な市場条件に頑健な市場マーキングエージェントを作成できることを示す。 ARLを適用するために、Avellaneda と Stoikov [2008] のよく研究された単一エージェントモデルを、市場メーカーと敵の間の離散時間ゼロサムゲームに変換する。相手は、市場メーカーの経費で利益を上げたい他の市場参加者の代理として機能する。 2つの従来の単エージェントRLエージェントとARLを経験的に比較し、ARLアプローチが導くことを示す。 1) 制約のないリスク回避行動の出現又はドメイン固有の罰則 2) 試験環境における敵の有無にかかわらず評価された基準指標のセットによる性能の大幅な改善 3) 不確実性をモデル化した。我々は,本手法が一貫して収束することを示す実証実験を行い,単純な単段ゲームにおいて,我々が収束するプロファイルがnash平衡に対応することを証明した。 We show that adversarial reinforcement learning (ARL) can be used to produce market marking agents that are robust to adversarial and adaptively-chosen market conditions. To apply ARL, we turn the well-studied single-agent model of Avellaneda and Stoikov [2008] into a discrete-time zero-sum game between a market maker and adversary. The adversary acts as a proxy for other market participants that would like to profit at the market maker's expense. We empirically compare two conventional single-agent RL agents with ARL, and show that our ARL approach leads to: 1) the emergence of risk-averse behaviour without constraints or domain-specific penalties; 2) significant improvements in performance across a set of standard metrics, evaluated with or without an adversary in the test environment, and; 3) improved robustness to model uncertainty. We empirically demonstrate that our ARL method consistently converges, and we prove for several special cases that the profiles that we converge to correspond to Nash equilibria in a simplified single-stage game.	翻訳日:2022-12-26 21:39:52 公開日:2020-07-08
# anysize gan: イメージワーピング問題の解決策 Anysize GAN: A solution to the image-warping problem ( http://arxiv.org/abs/2003.03233v2 ) ライセンス: Link先を確認	Connah Kendrick, David Gillespie, Moi Hoon Yap	(参考訳) 本稿では,Deep Learningにおける共通問題を解決するために,GAN(General Adversarial Network)の新たなタイプを提案する。我々は,既存の潜在ベクトルベースGAN構造に適用可能な新しいアーキテクチャを開発し,任意のサイズのオンザフライ画像を生成する。画像生成のための既存のGANは、一致する寸法の均一な画像を必要とする。しかし、ImageNetのような公開データセットには数千の異なるサイズが含まれている。画像のサイズ変更は画像データの変形や変化を引き起こすが、ネットワークはこの前処理ステップを必要としない。トレーニングのために任意のサイズの画像をロードできるように、標準的なデータローディング技術に大きな変更を加えています。また、複数の入力と新しい動的リサイズ層を追加することで、ネットワークを2つの方法で修正する。最後に、判別器を複数の解像度で処理するように調整する。これらの変更により、メモリが許せば、リサイズなしで複数の解像度データセットをトレーニングできる。 isic 2019皮膚病変データセットで結果を確認した。提案手法は,特徴的関係を維持しつつ,空間的関係の保存と理解を行なわずに,異なる大きさの現実的な画像を生成することを実証する。論文を受理し、ソースコードを公開します。 We propose a new type of General Adversarial Network (GAN) to resolve a common issue with Deep Learning. We develop a novel architecture that can be applied to existing latent vector based GAN structures that allows them to generate on-the-fly images of any size. Existing GAN for image generation requires uniform images of matching dimensions. However, publicly available datasets, such as ImageNet contain thousands of different sizes. Resizing image causes deformations and changing the image data, whereas as our network does not require this preprocessing step. We make significant changes to the standard data loading techniques to enable any size image to be loaded for training. We also modify the network in two ways, by adding multiple inputs and a novel dynamic resizing layer. Finally we make adjustments to the discriminator to work on multiple resolutions. These changes can allow multiple resolution datasets to be trained on without any resizing, if memory allows. We validate our results on the ISIC 2019 skin lesion dataset. We demonstrate our method can successfully generate realistic images at different sizes without issue, preserving and understanding spatial relationships, while maintaining feature relationships. We will release the source codes upon paper acceptance.	翻訳日:2022-12-26 01:48:14 公開日:2020-07-08
# 環境微生物画像分割のためのマルチスケールCNN-CRFフレームワーク A Multi-scale CNN-CRF Framework for Environmental Microorganism Image Segmentation ( http://arxiv.org/abs/2003.03744v2 ) ライセンス: Link先を確認	Jinghua Zhang, Chen Li, Frank Kulwa, Xin Zhao, Changhao Sun, Zihan Li, Tao Jiang, Hong Li, and Shouliang Qi	(参考訳) 研究者が環境微生物(EM)を効果的に識別するのを支援するために,EM画像セグメンテーションのためのマルチスケールCNN-CRF(MSCC)フレームワークを提案する。 1つは新しいピクセルレベルのセグメンテーションアプローチで、新しく導入された畳み込みニューラルネットワーク(CNN)、すなわち「mU-Net-B3」と高密度条件ランダムフィールド(CRF)後処理を使用する。 2つ目はvgg-16ベースのパッチレベルのセグメンテーション法で、新しい"バッファ"戦略により、emsの詳細のセグメンテーション品質がさらに向上する。実験では、420 EM画像の最先端手法と比較して、提案したMSCC法はメモリ要求を355 MBから103 MBに減らし、総合評価指標(Dice, Jaccard, Recall, Accuracy)を85.24%、77.42%、82.27%、96.76%から87.13%、79.74%、87.12%、96.91%に改善し、22.58%から20.26%に減らした。したがって、MSCC法は、EMセグメンテーション分野において大きなポテンシャルを示す。 To assist researchers to identify Environmental Microorganisms (EMs) effectively, a Multiscale CNN-CRF (MSCC) framework for the EM image segmentation is proposed in this paper. There are two parts in this framework: The first is a novel pixel-level segmentation approach, using a newly introduced Convolutional Neural Network (CNN), namely, "mU-Net-B3", with a dense Conditional Random Field (CRF) postprocessing. The second is a VGG-16 based patch-level segmentation method with a novel "buffer" strategy, which further improves the segmentation quality of the details of the EMs. In the experiment, compared with the state-of-the-art methods on 420 EM images, the proposed MSCC method reduces the memory requirement from 355 MB to 103 MB, improves the overall evaluation indexes (Dice, Jaccard, Recall, Accuracy) from 85.24%, 77.42%, 82.27%, and 96.76% to 87.13%, 79.74%, 87.12%, and 96.91%, respectively, and reduces the volume overlap error from 22.58% to 20.26%. Therefore, the MSCC method shows great potential in the EM segmentation field.	翻訳日:2022-12-25 14:24:37 公開日:2020-07-08
# 実画像復元・強調のための学習強化機能 Learning Enriched Features for Real Image Restoration and Enhancement ( http://arxiv.org/abs/2003.06792v2 ) ライセンス: Link先を確認	Syed Waqas Zamir, Aditya Arora, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan, Ming-Hsuan Yang, Ling Shao	(参考訳) 劣化した画像から高品質の画像コンテンツを回復することを目的として、画像復元は監視、計算写真、医用画像、リモートセンシングなどの多くの応用を享受している。近年,畳み込みニューラルネットワーク(cnns)は,従来の画像復元手法に比べて劇的に改善されている。既存のCNNベースのメソッドは通常、フル解像度またはプログレッシブに低解像度の表現で動作する。前者の場合、空間的に正確だが文脈的に劣る結果が得られ、後者の場合、意味的に信頼できるが空間的に劣る出力が生成される。本稿では,ネットワーク全体を通して空間的にpreciseな高分解能表現を維持し,低解像度表現から強い文脈情報を受け取ることを目的とした,新しいアーキテクチャを提案する。このアプローチのコアは、いくつかのキー要素を含むマルチスケールの残留ブロックである。 (a)マルチスケール特徴抽出のための並列マルチレゾリューション畳み込みストリーム (b)多解像度ストリーム間の情報交換 c) 文脈情報取得のための空間的及びチャネル的注意機構 (d)注意に基づくマルチスケール特徴集約。簡単に言うと、我々は高解像度の空間的詳細を同時に保存しながら、複数のスケールからの文脈情報を組み合わせた豊富な特徴集合を学習する。 5つの実画像ベンチマークデータセットに対する大規模な実験により、我々の手法は、MIRNetと呼ばれ、画像のデノゲーション、超解像、画像強調など、様々な画像処理タスクに対して最先端の結果が得られることを示した。ソースコードと事前訓練されたモデルはhttps://github.com/swz30/MIRNet.comで入手できる。 With the goal of recovering high-quality image content from its degraded version, image restoration enjoys numerous applications, such as in surveillance, computational photography, medical imaging, and remote sensing. Recently, convolutional neural networks (CNNs) have achieved dramatic improvements over conventional approaches for image restoration task. Existing CNN-based methods typically operate either on full-resolution or on progressively low-resolution representations. In the former case, spatially precise but contextually less robust results are achieved, while in the latter case, semantically reliable but spatially less accurate outputs are generated. In this paper, we present a novel architecture with the collective goals of maintaining spatially-precise high-resolution representations through the entire network and receiving strong contextual information from the low-resolution representations. The core of our approach is a multi-scale residual block containing several key elements: (a) parallel multi-resolution convolution streams for extracting multi-scale features, (b) information exchange across the multi-resolution streams, (c) spatial and channel attention mechanisms for capturing contextual information, and (d) attention based multi-scale feature aggregation. In a nutshell, our approach learns an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details. Extensive experiments on five real image benchmark datasets demonstrate that our method, named as MIRNet, achieves state-of-the-art results for a variety of image processing tasks, including image denoising, super-resolution, and image enhancement. The source code and pre-trained models are available at https://github.com/swz30/MIRNet.	翻訳日:2022-12-23 08:56:10 公開日:2020-07-08
# 条件付き生成逆ネットワークによるマルチモーダル形状補完 Multimodal Shape Completion via Conditional Generative Adversarial Networks ( http://arxiv.org/abs/2003.07717v3 ) ライセンス: Link先を確認	Rundi Wu, Xuelin Chen, Yixin Zhuang, Baoquan Chen	(参考訳) 形状取得装置からの部分的データ、すなわち形状に欠ける領域を埋めるために、いくつかの深層学習法が提案されている。しかし、これらの手法は1つの出力で部分的な形状を完遂するだけで、欠落した幾何を推論するあいまいさを無視している。したがって,多モード形状完備化問題として,一対多写像を学習することで,複数の出力で部分形状を完備化しようとする。条件付き学習データを必要としない条件付き生成モデルにより部分的な形状を完遂する最初のマルチモーダル形状補完法を開発した。提案手法は,学習結果のマルチモーダル分布の完備化を条件に,あいまいさを抽出する。様々な形状の不完全性を含む複数のデータセットに対するアプローチを広範に評価し,本手法の基本的な方法と変種を比較し,多様性と品質の両面で部分的な形状を完遂する上での手法のメリットを実証した。 Several deep learning methods have been proposed for completing partial data from shape acquisition setups, i.e., filling the regions that were missing in the shape. These methods, however, only complete the partial shape with a single output, ignoring the ambiguity when reasoning the missing geometry. Hence, we pose a multi-modal shape completion problem, in which we seek to complete the partial shape with multiple outputs by learning a one-to-many mapping. We develop the first multimodal shape completion method that completes the partial shape via conditional generative modeling, without requiring paired training data. Our approach distills the ambiguity by conditioning the completion on a learned multimodal distribution of possible results. We extensively evaluate the approach on several datasets that contain varying forms of shape incompleteness, and compare among several baseline methods and variants of our methods qualitatively and quantitatively, demonstrating the merit of our method in completing partial shapes with both diversity and quality.	翻訳日:2022-12-22 21:30:43 公開日:2020-07-08
# 交換可能なデータのためのエネルギーベースプロセス Energy-Based Processes for Exchangeable Data ( http://arxiv.org/abs/2003.07521v2 ) ライセンス: Link先を確認	Mengjiao Yang, Bo Dai, Hanjun Dai, Dale Schuurmans	(参考訳) 近年,点雲などの交換可能性を持つ集合のモデリングへの関心が高まっている。現在のアプローチの欠点は、考慮される集合の濃度を制限するか、観測されていないデータ上の制限された形式の分布しか表現できないことである。これらの制限を克服するために、エネルギーベースのモデルから交換可能なデータまで拡張し、エネルギー関数のニューラルネットワークパラメータ化を可能にするEnergy-Based Processs (EBP)を導入する。これらのモデルの重要な利点は、集合上のより柔軟な分布を、その濃度を制限することなく表現できることである。我々は,ポイントクラウド生成,分類,デノイジング,画像補完など,さまざまなタスクにおける最先端のパフォーマンスを示す,ebpsの効率的なトレーニング手順を開発した。 Recently there has been growing interest in modeling sets with exchangeability such as point clouds. A shortcoming of current approaches is that they restrict the cardinality of the sets considered or can only express limited forms of distribution over unobserved data. To overcome these limitations, we introduce Energy-Based Processes (EBPs), which extend energy based models to exchangeable data while allowing neural network parameterizations of the energy function. A key advantage of these models is the ability to express more flexible distributions over sets without restricting their cardinality. We develop an efficient training procedure for EBPs that demonstrates state-of-the-art performance on a variety of tasks such as point cloud generation, classification, denoising, and image completion.	翻訳日:2022-12-22 20:38:17 公開日:2020-07-08
# 運動からの非剛体構造に対する重み付き核ノルムの正確な最適化 Accurate Optimization of Weighted Nuclear Norm for Non-Rigid Structure from Motion ( http://arxiv.org/abs/2003.10281v2 ) ライセンス: Link先を確認	Jos\'e Pedro Iglesias, Carl Olsson, Marcus Valtonen \"Ornhag	(参考訳) 与えられたランクの行列を最小二乗の意味でデータに合わせることは、行列の双線型パラメタライゼーションを明示的に最適化することで、レバンス=マルカルトのような2次法を用いて非常に効果的に行うことができる。対照的に、重み付き核ノルム優先のようなより一般的な特異値ペナルティを適用する場合、行列の要素に対する直接最適化が一般的に用いられる。結果の目的関数の非微分性のため、第一次劣次法や分割法が主に用いられる。これらは速いイテレーションを提供するが、ジグザグによって最小に近い非効率になることはよく知られており、実際、近似解の解決を迫られることが多い。本稿では,2次法により多くの場合において,より正確な結果が得られることを示す。我々の主な結果は、重み付き核規範ペナルティを含む一般の正規化子に対して、元の問題と同値な双線型定式化を構築する方法を示している。これらの定式化によって正則化関数は2つの微分可能となり、2次法が適用できる。動作問題からの多くの構造について実験により,本手法が最先端手法より優れていることを示す。 Fitting a matrix of a given rank to data in a least squares sense can be done very effectively using 2nd order methods such as Levenberg-Marquardt by explicitly optimizing over a bilinear parameterization of the matrix. In contrast, when applying more general singular value penalties, such as weighted nuclear norm priors, direct optimization over the elements of the matrix is typically used. Due to non-differentiability of the resulting objective function, first order sub-gradient or splitting methods are predominantly used. While these offer rapid iterations it is well known that they become inefficent near the minimum due to zig-zagging and in practice one is therefore often forced to settle for an approximate solution. In this paper we show that more accurate results can in many cases be achieved with 2nd order methods. Our main result shows how to construct bilinear formulations, for a general class of regularizers including weighted nuclear norm penalties, that are provably equivalent to the original problems. With these formulations the regularizing function becomes twice differentiable and 2nd order methods can be applied. We show experimentally, on a number of structure from motion problems, that our approach outperforms state-of-the-art methods.	翻訳日:2022-12-21 00:43:54 公開日:2020-07-08
# 限られた資源深層学習のためのデータと計算効率 A Data and Compute Efficient Design for Limited-Resources Deep Learning ( http://arxiv.org/abs/2004.09691v2 ) ライセンス: Link先を確認	Mirgahney Mohamed, Gabriele Cesa, Taco S. Cohen and Max Welling	(参考訳) データ効率の改善により、同種のニューラルネットワークはディープラーニングコミュニティへの関心を高めている。医療分野では、データの対称性を効果的に活用して、より正確で堅牢なモデルの構築に成功している。より広い範囲の患者にリーチするために、モバイルでデバイス上での深層学習ソリューションの実装が医療応用のために開発されている。しかし、同変モデルは大規模で計算コストのかかるアーキテクチャを使って一般的に実装されている。本研究では,mobilenetv2の等価バージョンを設計,テストし,モデルの量子化によりさらに最適化することで,より効率的な推論を実現する。我々は,patch camelyon (pcam) の医療データセット上で,より計算効率の高い技術性能を実現する。 Thanks to their improved data efficiency, equivariant neural networks have gained increased interest in the deep learning community. They have been successfully applied in the medical domain where symmetries in the data can be effectively exploited to build more accurate and robust models. To be able to reach a much larger body of patients, mobile, on-device implementations of deep learning solutions have been developed for medical applications. However, equivariant models are commonly implemented using large and computationally expensive architectures, not suitable to run on mobile devices. In this work, we design and test an equivariant version of MobileNetV2 and further optimize it with model quantization to enable more efficient inference. We achieve close-to state of the art performance on the Patch Camelyon (PCam) medical dataset while being more computationally efficient.	翻訳日:2022-12-11 06:20:07 公開日:2020-07-08
# キュラス探査による柔軟かつ効率的な長距離計画 Flexible and Efficient Long-Range Planning Through Curious Exploration ( http://arxiv.org/abs/2004.10876v2 ) ライセンス: Link先を確認	Aidan Curtis, Minjian Xin, Dilip Arumugam, Kevin Feigelis, Daniel Yamins	(参考訳) 時間的拡張型マルチフェーズプランを柔軟かつ効率的に発見するアルゴリズムは、ロボット工学の進歩とモデルに基づく強化学習にとって重要なステップである。長距離計画の核となる問題は、可能なアクションシーケンスのツリーを探索する効率的な方法を見つけることである。タスク・アンド・モーション・プランニング(tamp)による既存の非学習型計画ソリューションは、アクションの効果と前提条件に対する論理的記述の存在に依存している。この制約により、tampメソッドは、ツリー探索問題を効率的に減らすことができるが、隠れない複雑な物理環境に一般化する能力は制限される。対照的に、深層強化学習(DRL)法は、柔軟なニューラルネットワークに基づく関数近似を用いて、自然に見えない状況に一般化するポリシーを発見する。しかし、DRL法は、長距離多段階計画環境に固有の非常にまばらな報酬景観を扱うのに苦労する。本稿では、好奇心誘導サンプリング戦略と模倣学習を組み合わせることで、TAMPとDRLの要素を融合させるCurious Sample Planner(CSP)を提案する。 CSPは、多種多様なリアルな3Dタスクを解くための、興味深く複雑な時間的拡張プランを効率的に発見できることを示す。対照的に、標準的な計画と学習の手法は、これらのタスクを全く解決できなかったり、巨大な、非常に可変なトレーニングサンプルでしかできなかったりします。我々は、CSPで様々な好奇心メトリクスを使用することを検討し、CSPが発見するソリューションの種類を分析する。最後に、CSPはタスク転送をサポートし、あるタスクの経験から学んだ探索ポリシーが関連するタスクの効率向上に役立つことを示す。 Identifying algorithms that flexibly and efficiently discover temporally-extended multi-phase plans is an essential step for the advancement of robotics and model-based reinforcement learning. The core problem of long-range planning is finding an efficient way to search through the tree of possible action sequences. Existing non-learned planning solutions from the Task and Motion Planning (TAMP) literature rely on the existence of logical descriptions for the effects and preconditions for actions. This constraint allows TAMP methods to efficiently reduce the tree search problem but limits their ability to generalize to unseen and complex physical environments. In contrast, deep reinforcement learning (DRL) methods use flexible neural-network-based function approximators to discover policies that generalize naturally to unseen circumstances. However, DRL methods struggle to handle the very sparse reward landscapes inherent to long-range multi-step planning situations. Here, we propose the Curious Sample Planner (CSP), which fuses elements of TAMP and DRL by combining a curiosity-guided sampling strategy with imitation learning to accelerate planning. We show that CSP can efficiently discover interesting and complex temporally-extended plans for solving a wide range of physically realistic 3D tasks. In contrast, standard planning and learning methods often fail to solve these tasks at all or do so only with a huge and highly variable number of training samples. We explore the use of a variety of curiosity metrics with CSP and analyze the types of solutions that CSP discovers. Finally, we show that CSP supports task transfer so that the exploration policies learned during experience with one task can help improve efficiency on related tasks.	翻訳日:2022-12-10 18:31:20 公開日:2020-07-08
# VC次元によるロバスト部分ガウス推定 Robust subgaussian estimation with VC-dimension ( http://arxiv.org/abs/2004.11734v3 ) ライセンス: Link先を確認	Jules Depersin	(参考訳) 中央値法(英語版)(mom)に基づく手続きは、データが重み付きまたは破損している場合でも非漸近的かつ強い偏差境界を提供する。この研究は、MOM推定器の余剰リスクを束縛する新しい一般的な方法を提案する。中心となる技術は、統計複雑性を測定するためにVC次元(ラデマッハの複雑さの代わりに)を用いることである。特に、これはスパース推定のための最初のロバストな推定子を与えることができ、これはいわゆる準ガウジアンレートを、分解されていないデータに対して有限秒のモーメントを仮定するだけで達成する。対照的に、ラデマッハ複素数を用いた以前の研究は、次元と対数的に成長する多くの有限モーメントを必要とした。この手法により、任意のノルムにおける平均推定のための新しいロバストなスガウス境界を導出する。また、L_4-L_2$ノルム同値性のない亜ガウス境界を初めて達成した共分散推定のための新しいロバストな推定器も導出する。 Median-of-means (MOM) based procedures provide non-asymptotic and strong deviation bounds even when data are heavy-tailed and/or corrupted. This work proposes a new general way to bound the excess risk for MOM estimators. The core technique is the use of VC-dimension (instead of Rademacher complexity) to measure the statistical complexity. In particular, this allows to give the first robust estimators for sparse estimation which achieves the so-called subgaussian rate only assuming a finite second moment for the uncorrupted data. By comparison, previous works using Rademacher complexities required a number of finite moments that grows logarithmically with the dimension. With this technique, we derive new robust sugaussian bounds for mean estimation in any norm. We also derive a new robust estimator for covariance estimation that is the first to achieve subgaussian bounds without $L_4-L_2$ norm equivalence.	翻訳日:2022-12-10 03:06:50 公開日:2020-07-08
# AlignShift:3次元異方性ボリュームにおける画像厚のギャップを埋める AlignShift: Bridging the Gap of Imaging Thickness in 3D Anisotropic Volumes ( http://arxiv.org/abs/2005.01969v2 ) ライセンス: Link先を確認	Jiancheng Yang, Yi He, Xiaoyang Huang, Jingwei Xu, Xiaodan Ye, Guangyu Tao, Bingbing Ni	(参考訳) 本稿では,3次元医用画像処理における基本的な課題について述べる。異方性医用ボリュームでは、薄いスライス(1mm)と厚いスライス(5mm)の間に大きなパフォーマンスギャップがある。従来の芸術では、薄いスライスに3Dアプローチ、厚いスライスに2Dアプローチを使う傾向がある。我々は,薄肉および厚肉の医療用ボリュームの統一的アプローチを目指す。ビデオ解析の最近の進歩に触発されて,理論上は任意の2次元事前学習ネットワークを太さ対応の3Dネットワークに変換する新しいパラメータフリー演算子であるAlignShiftを提案する。興味深いことに、変換されたネットワークは薄いスライスでは3Dのように振る舞うが、厚いスライスでは2Dに適応的に縮退する。入力画像厚みに応じてアライメントされた「仮想スライス」をシフト・融合することにより、統一された厚み認識表現学習を実現する。広汎性病変検出のための32k病変からなる,公衆の大規模深部結節ベンチマークに関する広範囲な実験により,前回と比べ,ホイッスルやベルを伴わない有意なマージンで先行する手法の有効性が検証された。さらに重要なことに、この方法は統一フレームワークによって薄いスライスボリュームと厚いスライスボリュームのパフォーマンスギャップを埋める最初の方法です。 PyTorch のコードは https://github.com/M3DV/AlignShift でオープンソース化されている。 This paper addresses a fundamental challenge in 3D medical image processing: how to deal with imaging thickness. For anisotropic medical volumes, there is a significant performance gap between thin-slice (mostly 1mm) and thick-slice (mostly 5mm) volumes. Prior arts tend to use 3D approaches for the thin-slice and 2D approaches for the thick-slice, respectively. We aim at a unified approach for both thin- and thick-slice medical volumes. Inspired by recent advances in video analysis, we propose AlignShift, a novel parameter-free operator to convert theoretically any 2D pretrained network into thickness-aware 3D network. Remarkably, the converted networks behave like 3D for the thin-slice, nevertheless degenerate to 2D for the thick-slice adaptively. The unified thickness-aware representation learning is achieved by shifting and fusing aligned "virtual slices" as per the input imaging thickness. Extensive experiments on public large-scale DeepLesion benchmark, consisting of 32K lesions for universal lesion detection, validate the effectiveness of our method, which outperforms previous state of the art by considerable margins without whistles and bells. More importantly, to our knowledge, this is the first method that bridges the performance gap between thin- and thick-slice volumes by a unified framework. To improve research reproducibility, our code in PyTorch is open source at https://github.com/M3DV/AlignShift.	翻訳日:2022-12-06 13:41:58 公開日:2020-07-08
# 安全な深層強化学習のための確率的保証 Probabilistic Guarantees for Safe Deep Reinforcement Learning ( http://arxiv.org/abs/2005.07073v2 ) ライセンス: Link先を確認	Edoardo Bacci and David Parker	(参考訳) 深層強化学習は多くの制御タスクにうまく適用されているが、安全上重要なシナリオにおけるこれらのエージェントの適用は安全性上の懸念から制限されている。これらのコントローラーの厳密なテストは、特にハードウェアの故障や騒がしいセンサーのため、確率的な環境での運用では困難である。確率的環境下での深部強化学習エージェントの安全性を測定するアルゴリズムMOSAICを提案する。本手法は,環境におけるコントローラの実行の形式的抽象化を反復的に構築し,マルコフ決定過程の確率論的モデルチェックを活用し,有限時間軸上での安全な動作に関する確率論的保証を実現する。異なる初期設定のためにコントローラの安全な操作の確率の境界を生成し、正しい振る舞いが保証される領域を識別する。いくつかのベンチマーク制御問題で訓練されたエージェントに対するアプローチの実装と評価を行った。 Deep reinforcement learning has been successfully applied to many control tasks, but the application of such agents in safety-critical scenarios has been limited due to safety concerns. Rigorous testing of these controllers is challenging, particularly when they operate in probabilistic environments due to, for example, hardware faults or noisy sensors. We propose MOSAIC, an algorithm for measuring the safety of deep reinforcement learning agents in stochastic settings. Our approach is based on the iterative construction of a formal abstraction of a controller's execution in an environment, and leverages probabilistic model checking of Markov decision processes to produce probabilistic guarantees on safe behaviour over a finite time horizon. It produces bounds on the probability of safe operation of the controller for different initial configurations and identifies regions where correct behaviour can be guaranteed. We implement and evaluate our approach on agents trained for several benchmark control problems.	翻訳日:2022-12-03 05:16:26 公開日:2020-07-08
# マルチモーダル表現学習のための適応トランスフォーマー Adaptive Transformers for Learning Multimodal Representations ( http://arxiv.org/abs/2005.07486v3 ) ライセンス: Link先を確認	Prajjwal Bhargava	(参考訳) トランスフォーマーの使用は、言語意味論の学習から有意義なビシオ言語表現の形成へと成長してきた。これらのアーキテクチャはしばしば過度にパラメータ化され、大量の計算を必要とする。本研究では,モデル解釈性と計算効率についてより深く学ぶために適応的アプローチを拡張する。具体的には,注意スパン,スパース,構造化ドロップアウトの手法について検討し,視覚や言語タスクに対する注意のメカニズムがどのように広がるかを理解するのに役立つ。さらに,これらの手法は,ネットワークが入力シーケンスの複雑さ,異なるモダリティに対するスパーシティ・プレファレンス,その他の関連する現象をどのように知覚するかを知る上で有用であることを示す。 The usage of transformers has grown from learning about language semantics to forming meaningful visiolinguistic representations. These architectures are often over-parametrized, requiring large amounts of computation. In this work, we extend adaptive approaches to learn more about model interpretability and computational efficiency. Specifically, we study attention spans, sparse, and structured dropout methods to help understand how their attention mechanism extends for vision and language tasks. We further show that these approaches can help us learn more about how the network perceives the complexity of input sequences, sparsity preferences for different modalities, and other related phenomena.	翻訳日:2022-12-02 22:24:26 公開日:2020-07-08
# 新型コロナウイルス(COVID-19)パンデミックにおける世論と感情:Twitterのトピックモデリングに潜在ディリクレ・アロケーションを用いた Public discourse and sentiment during the COVID-19 pandemic: using Latent Dirichlet Allocation for topic modeling on Twitter ( http://arxiv.org/abs/2005.08817v3 ) ライセンス: Link先を確認	Jia Xue, Junxiang Chen, Chen Chen, Chengda Zheng, Sijia Li, Tingshao Zhu	(参考訳) この研究の目的は、twitterユーザーの談話とcovid-19に対する心理的反応を理解することだ。私たちは、2020年1月23日から3月7日までに収集された新型コロナウイルスに関連する約190万ツイート(英語で書かれた)の分析に機械学習技術を使用します。 11のトピックが識別され、さらに10のテーマに分類され、「確認されたケースに関する更新」、「1919年関連死」、「中国国外のケース(世界規模)」、「韓国でのcovid-19の流行」、「ニューヨークでの流行の兆候」、「ダイアモンド・プリンセス・クルーズ」、「経済への影響」、「予防措置」、「権限」、「サプライチェーン」がある。結果はTwitterで一般的なトピックとして治療や症状に関連するメッセージを明らかにしない。感情分析は、新型コロナウイルスの未知の性質に対する恐怖があらゆるトピックで支配的であることを示している。本研究の意義と限界についても論じる。 The study aims to understand Twitter users' discourse and psychological reactions to COVID-19. We use machine learning techniques to analyze about 1.9 million Tweets (written in English) related to coronavirus collected from January 23 to March 7, 2020. A total of salient 11 topics are identified and then categorized into ten themes, including "updates about confirmed cases," "COVID-19 related death," "cases outside China (worldwide)," "COVID-19 outbreak in South Korea," "early signs of the outbreak in New York," "Diamond Princess cruise," "economic impact," "Preventive measures," "authorities," and "supply chain." Results do not reveal treatments and symptoms related messages as prevalent topics on Twitter. Sentiment analysis shows that fear for the unknown nature of the coronavirus is dominant in all topics. Implications and limitations of the study are also discussed.	翻訳日:2022-12-02 00:15:18 公開日:2020-07-08
# 心臓mriのための高能率・位相認識ビデオ超解像 Efficient and Phase-aware Video Super-resolution for Cardiac MRI ( http://arxiv.org/abs/2005.10626v4 ) ライセンス: Link先を確認	Jhih-Yuan Lin, Yu-Cheng Chang, Winston H. Hsu	(参考訳) 心臓磁気共鳴イメージング(CMR)は、非侵襲的で痛みのない方法で心臓の構造と機能を説明することができるため、広く用いられている。しかし、ハードウェアの制限により高品質なスキャンを得るには時間がかかり、コストがかかる。そこで本研究では,ハードウェアのアップグレードやスキャンプロトコルの変更を伴わずに,CMRビデオの超解像問題を解決するための新しいエンドツーエンドトレーニングネットワークを提案する。我々は,心の知識をモデルに取り入れ,時間的情報の利用を支援する。具体的には,CMRの循環特性を満たすように調整された周期関数として心臓の知識を定式化する。さらに,残差学習方式の残差は,LR-HRマッピングを漸進的改良方式で学習することを容易にする。この機構により、タスクの難易度に応じて改善イテレーションを調整することにより、ネットワークに適応性を持たせることができる。大規模データセットに対する大規模な実験結果から,提案手法の優位性を示した。 Cardiac Magnetic Resonance Imaging (CMR) is widely used since it can illustrate the structure and function of heart in a non-invasive and painless way. However, it is time-consuming and high-cost to acquire the high-quality scans due to the hardware limitation. To this end, we propose a novel end-to-end trainable network to solve CMR video super-resolution problem without the hardware upgrade and the scanning protocol modifications. We incorporate the cardiac knowledge into our model to assist in utilizing the temporal information. Specifically, we formulate the cardiac knowledge as the periodic function, which is tailored to meet the cyclic characteristic of CMR. In addition, the proposed residual of residual learning scheme facilitates the network to learn the LR-HR mapping in a progressive refinement fashion. This mechanism enables the network to have the adaptive capability by adjusting refinement iterations depending on the difficulty of the task. Extensive experimental results on large-scale datasets demonstrate the superiority of the proposed method compared with numerous state-of-the-art methods.	翻訳日:2022-11-30 23:56:05 公開日:2020-07-08
# Web検索と会話エージェントのためのユーザインテント推論 User Intent Inference for Web Search and Conversational Agents ( http://arxiv.org/abs/2005.13808v2 ) ライセンス: Link先を確認	Ali Ahmadvand	(参考訳) ユーザー意図の理解は、会話エージェントと検索エンジンの両方を設計する上で重要なステップである。ユーザの発話やクエリは短く、曖昧で、コンテキストに依存しているため、ユーザ意図の検出や推論は難しい。これらの研究課題に対処するために、私の論文は以下に焦点を当てています。 1)会話エージェントの発話トピックと意図分類 2)eコマース分野に着目したWeb検索エンジンの検索意図のマイニングと分類を行う。最初のトピックに対処するために、エンティティ情報と会話コンテキストの手がかりを組み込んだ新しいモデルを提案し、ユーザの発話の話題と意図の両方を予測する。第2の研究テーマは、web検索意図予測における既存の技術メソッドを、次のとおりeコマースドメインに拡張することです。 1)検索クエリの意図と関連する製品カテゴリを予測する共同学習モデルの構築。 2)新しい隠れユーザの意図を明らかにする。すべてのモデルは、主要なeコマースサイト検索エンジンから入手可能な実際のクエリで評価される。これらの研究の成果は、自然言語理解、クエリスコーピング、クエリ提案、ランキングなど、様々なタスクのパフォーマンスを改善するために利用することができ、結果としてユーザーエクスペリエンスが強化される。 User intent understanding is a crucial step in designing both conversational agents and search engines. Detecting or inferring user intent is challenging, since the user utterances or queries can be short, ambiguous, and contextually dependent. To address these research challenges, my thesis work focuses on: 1) Utterance topic and intent classification for conversational agents 2) Query intent mining and classification for Web search engines, focusing on the e-commerce domain. To address the first topic, I proposed novel models to incorporate entity information and conversation-context clues to predict both topic and intent of the user's utterances. For the second research topic, I plan to extend the existing state of the art methods in Web search intent prediction to the e-commerce domain, via: 1) Developing a joint learning model to predict search queries' intents and the product categories associated with them, 2) Discovering new hidden users' intents. All the models will be evaluated on the real queries available from a major e-commerce site search engine. The results from these studies can be leveraged to improve performance of various tasks such as natural language understanding, query scoping, query suggestion, and ranking, resulting in an enriched user experience.	翻訳日:2022-11-27 05:38:28 公開日:2020-07-08
# 深部言語表現における分離多様体の出現 Emergence of Separable Manifolds in Deep Language Representations ( http://arxiv.org/abs/2006.01095v4 ) ライセンス: Link先を確認	Jonathan Mamou, Hang Le, Miguel Del Rio, Cory Stephenson, Hanlin Tang, Yoon Kim, SueYeon Chung	(参考訳) ディープニューラルネットワーク(DNN)は、様々な認知モダリティの知覚的タスクを解く上で、非常に経験的な成功を示している。最近の研究では、タスク最適化されたDNNから抽出された表現と脳内の神経集団の間にかなりの類似性が報告されている。その後、DNNは複雑な認知機能の基礎となる計算原理を推論する一般的なモデルクラスとなり、神経集団の情報を調べるために開発された手法を応用するための自然なテストベッドとして登場した。本研究では,特徴表現の幾何学とクラスの線形分離性を結びつける計算神経科学の最近の手法である平均場理論多様体解析を用いて,大規模文脈埋め込みモデルから言語表現を分析する。異なるモデルファミリ(bert, roberta, gptなど)からの表現を探索し、特にあいまいなデータ(例えば、複数のpart-of-speechタグを持つ単語、多くの単語を含むpart-of-speechクラス)において、層深度(例えば、part-of-speechタグのための多様体)を越えて言語多様体が出現する証拠を見つける。さらに、これらの多様体における線形分離性の出現は、多様体の半径、次元性、多様体間相関の複合化によって引き起こされる。 Deep neural networks (DNNs) have shown much empirical success in solving perceptual tasks across various cognitive modalities. While they are only loosely inspired by the biological brain, recent studies report considerable similarities between representations extracted from task-optimized DNNs and neural populations in the brain. DNNs have subsequently become a popular model class to infer computational principles underlying complex cognitive functions, and in turn, they have also emerged as a natural testbed for applying methods originally developed to probe information in neural populations. In this work, we utilize mean-field theoretic manifold analysis, a recent technique from computational neuroscience that connects geometry of feature representations with linear separability of classes, to analyze language representations from large-scale contextual embedding models. We explore representations from different model families (BERT, RoBERTa, GPT, etc.) and find evidence for emergence of linguistic manifolds across layer depth (e.g., manifolds for part-of-speech tags), especially in ambiguous data (i.e, words with multiple part-of-speech tags, or part-of-speech classes including many words). In addition, we find that the emergence of linear separability in these manifolds is driven by a combined reduction of manifolds' radius, dimensionality and inter-manifold correlations.	翻訳日:2022-11-26 05:56:04 公開日:2020-07-08
# MRIにおける心筋運動追跡のためのバイオメカニクスインフォームドニューラルネットワーク Biomechanics-informed Neural Networks for Myocardial Motion Tracking in MRI ( http://arxiv.org/abs/2006.04725v3 ) ライセンス: Link先を確認	Chen Qin, Shuo Wang, Chen Chen, Huaqi Qiu, Wenjia Bai and Daniel Rueckert	(参考訳) 画像登録は、しばしば解空間の正規化を必要とする不測の逆問題である。本稿では, 平滑性などの明示的な正規化条件を課す現在のアプローチのほとんどとは対照的に, バイオメカニクスによる正規化を暗黙的に学習できる新しい手法を提案する。このようなアプローチは、アプリケーション固有の事前知識をディープラーニングベースの登録に組み込むことができる。特に, 生体力学的に妥当な変形の多様体を学習するために変分オートエンコーダ(vae)を活用し, 生体力学的シミュレーションを再構成することでその基礎特性を暗黙的に把握する。学習されたvae正規化器は、任意の深層学習ベースの登録ネットワークと結合して、生体力学的に実現可能な解空間を定式化することができる。提案手法は2つの異なるデータセットから得られた2次元心筋mriデータを用いた心筋運動追跡の文脈で検証される。その結果,運動追跡精度の点で他の競合手法よりも優れた性能を示し,非圧縮性やひずみなどの生体力学的特性を学習できることを示した。この手法は、一般的なl2正規化スキームと比較して、未検出領域に対するより良い一般化性も示されている。 Image registration is an ill-posed inverse problem which often requires regularisation on the solution space. In contrast to most of the current approaches which impose explicit regularisation terms such as smoothness, in this paper we propose a novel method that can implicitly learn biomechanics-informed regularisation. Such an approach can incorporate application-specific prior knowledge into deep learning based registration. Particularly, the proposed biomechanics-informed regularisation leverages a variational autoencoder (VAE) to learn a manifold for biomechanically plausible deformations and to implicitly capture their underlying properties via reconstructing biomechanical simulations. The learnt VAE regulariser then can be coupled with any deep learning based registration network to regularise the solution space to be biomechanically plausible. The proposed method is validated in the context of myocardial motion tracking on 2D stacks of cardiac MRI data from two different datasets. The results show that it can achieve better performance against other competing methods in terms of motion tracking accuracy and has the ability to learn biomechanical properties such as incompressibility and strains. The method has also been shown to have better generalisability to unseen domains compared with commonly used L2 regularisation schemes.	翻訳日:2022-11-24 02:28:40 公開日:2020-07-08
# KiU-Net:オーバーコンプリート表現を用いたバイオメディカル画像の正確なセグメンテーションを目指して KiU-Net: Towards Accurate Segmentation of Biomedical Images using Over-complete Representations ( http://arxiv.org/abs/2006.04878v2 ) ライセンス: Link先を確認	Jeya Maria Jose, Vishwanath Sindagi, Ilker Hacihaliloglu, Vishal M. Patel	(参考訳) 優れた性能のため、U-Netは近年でもっとも広く使われているバイオメディカルイメージセグメンテーションのバックボーンアーキテクチャである。しかし,本研究では,より小さな解剖学的ランドマークを不明瞭な境界で検出する場合,かなりの性能低下が観察されている。この問題を詳細に分析し,より高次元(空間的な意味で)にデータを投影するオーバーコンプリートアーキテクチャ(ki-net)を提案することで対処する。このネットワークをU-Netで拡張すると、小さな解剖学的ランドマークとぼやけたノイズ境界を分割する場合には、全体的なパフォーマンスが向上する。さらに、提案するネットワークには、より高速な収束やパラメータの少ないといったメリットがある。本研究は, 早産児の2次元超音波(us)から脳解剖学を分離する作業について検討し, 標準u-netと比較して, dice精度とjaccard indexの点で約4%の改善を達成し, 最近の最良法を2%上回った。コード:https://github.com/jeya-maria-jose/KiU-Net-pytorch Due to its excellent performance, U-Net is the most widely used backbone architecture for biomedical image segmentation in the recent years. However, in our studies, we observe that there is a considerable performance drop in the case of detecting smaller anatomical landmarks with blurred noisy boundaries. We analyze this issue in detail, and address it by proposing an over-complete architecture (Ki-Net) which involves projecting the data onto higher dimensions (in the spatial sense). This network, when augmented with U-Net, results in significant improvements in the case of segmenting small anatomical landmarks and blurred noisy boundaries while obtaining better overall performance. Furthermore, the proposed network has additional benefits like faster convergence and fewer number of parameters. We evaluate the proposed method on the task of brain anatomy segmentation from 2D Ultrasound (US) of preterm neonates, and achieve an improvement of around 4% in terms of the DICE accuracy and Jaccard index as compared to the standard-U-Net, while outperforming the recent best methods by 2%. Code: https://github.com/jeya-maria-jose/KiU-Net-pytorch .	翻訳日:2022-11-24 02:27:29 公開日:2020-07-08
# covid-abs:ソーシャルディスタンシング介入の健康と経済効果をシミュレートするエージェントベースの新型コロナウイルス流行モデル COVID-ABS: An Agent-Based Model of COVID-19 Epidemic to Simulate Health and Economic Effects of Social Distancing Interventions ( http://arxiv.org/abs/2006.10532v2 ) ライセンス: Link先を確認	Petr\^onio C. L. Silva, Paulo V. C. Batista, H\'elder S. Lima, Marcos A. Alves, Frederico G. Guimar\~aes, Rodrigo C. P. Silva	(参考訳) SARS-CoV-2による新型コロナウイルスのパンデミックは、世界中の公衆衛生と経済に直接影響を与えている。この問題を解決するため、各国はウイルスの拡散を抑制するために異なる政策と非薬剤的介入を講じてきた。本稿では,人,企業,政府を模倣するエージェントの社会を用いて,パンデミックのダイナミクスをシミュレートする新たなエージェントベースモデルであるcovid-19-absを提案する。その結果,(1)何もせず,(2)ロックダウン,(3)条件付きロックダウン,(4)垂直隔離,(5)部分隔離,(6)顔マスクの使用,(7)顔マスクの使用,および50%の社会的孤立への密着性,の7つの要因が分析された。ロックダウンによるシナリオの実施が不可能で、死亡者数が最も少なく、経済に最も影響を与える場合、フェイスマスクの使用と部分隔離を組み合わせるシナリオは、社会協力の観点からの実施においてより現実的なものとなり得る。 COVID-ABSモデルはPythonプログラミング言語で実装され、ソースコードが公開されている。このモデルは、入力パラメータを変更したり、様々なシナリオを作成できるようにすることで、他の社会にも容易に拡張できる。そのため、政治家や保健当局が新型コロナウイルス対策を計画する上で有用なツールである。 The COVID-19 pandemic due to the SARS-CoV-2 coronavirus has directly impacted the public health and economy worldwide. To overcome this problem, countries have adopted different policies and non-pharmaceutical interventions for controlling the spread of the virus. This paper proposes the COVID-ABS, a new SEIR (Susceptible-Exposed-Infected-Recovered) agent-based model that aims to simulate the pandemic dynamics using a society of agents emulating people, business and government. Seven different scenarios of social distancing interventions were analyzed, with varying epidemiological and economic effects: (1) do nothing, (2) lockdown, (3) conditional lockdown, (4) vertical isolation, (5) partial isolation, (6) use of face masks, and (7) use of face masks together with 50% of adhesion to social isolation. In the impossibility of implementing scenarios with lockdown, which present the lowest number of deaths and highest impact on the economy, scenarios combining the use of face masks and partial isolation can be the more realistic for implementation in terms of social cooperation. The COVID-ABS model was implemented in Python programming language, with source code publicly available. The model can be easily extended to other societies by changing the input parameters, as well as allowing the creation of a multitude of other scenarios. Therefore, it is a useful tool to assist politicians and health authorities to plan their actions against the COVID-19 epidemic.	翻訳日:2022-11-23 15:22:33 公開日:2020-07-08
# COALA: セマンティックにリッチな音声表現を学習するための協調型オートエンコーダ COALA: Co-Aligned Autoencoders for Learning Semantically Enriched Audio Representations ( http://arxiv.org/abs/2006.08386v2 ) ライセンス: Link先を確認	Xavier Favory, Konstantinos Drossos, Tuomas Virtanen and Xavier Serra	(参考訳) ディープニューラルネットワーク(DNN)に基づく音声表現学習は、手作り機能に代わるアプローチとして登場した。高性能を実現するために、DNNは大量の注釈付きデータを必要とすることが多く、入手が困難でコストがかかる。本稿では,学習した音声および関連タグの潜在表現を整列させて,音声表現を学習する手法を提案する。調整は、音声とタグの潜在表現の一致を最大化し、対照的な損失を用いて行う。その結果,音の音響的・意味的特性を反映した音響埋め込みモデルが得られた。組込みモデルの品質を評価し,3つの異なるタスク(音のイベント認識,音楽ジャンル,楽器分類)で特徴抽出器としての性能を測定し,そのモデルがどのような特徴を捉えているかを検討する。提案手法により得られた埋め込みは,いくつかの音響ディスクリプタとよく相関している。 Audio representation learning based on deep neural networks (DNNs) emerged as an alternative approach to hand-crafted features. For achieving high performance, DNNs often need a large amount of annotated data which can be difficult and costly to obtain. In this paper, we propose a method for learning audio representations, aligning the learned latent representations of audio and associated tags. Aligning is done by maximizing the agreement of the latent representations of audio and tags, using a contrastive loss. The result is an audio embedding model which reflects acoustic and semantic characteristics of sounds. We evaluate the quality of our embedding model, measuring its performance as a feature extractor on three different tasks (namely, sound event recognition, and music genre and musical instrument classification), and investigate what type of characteristics the model captures. Our results are promising, sometimes in par with the state-of-the-art in the considered tasks and the embeddings produced with our method are well correlated with some acoustic descriptors.	翻訳日:2022-11-21 03:51:11 公開日:2020-07-08
# ニューラルモデルにおける構成一般化に関する研究 A Study of Compositional Generalization in Neural Models ( http://arxiv.org/abs/2006.09437v2 ) ライセンス: Link先を確認	Tim Klinger, Dhaval Adjodah, Vincent Marois, Josh Joseph, Matthew Riemer, Alex 'Sandy' Pentland, Murray Campbell	(参考訳) 合成学習とリレーショナル学習は人間の知能の目印であるが、ニューラルモデルに対する課題を示すものである。このようなモデルの開発における難点は、体系的に評価する明確な構成的および関係的なタスク構造を持つベンチマークが欠如していることである。本稿では,論理的なドメイン特化言語を用いて,合成概念と関係概念から画像を生成することを可能にする,conceptworldという環境を提案する。 2x2平方、ペントミノ、シーケンス、これらのオブジェクトを含むシーン、その他のより複雑な概念など、さまざまな構成構造のための画像を生成するために使用します。我々は,それらの引数の合成深度が増加し,置換されるにつれて,合成引数との関係を一般化する標準的なニューラルアーキテクチャの能力をテストする実験を行う。 MLPやCNN,ResNetといった標準的なニューラルネットワークや,WReNやPrediNetといった最先端のリレーショナルネットワークを,マルチクラスの画像分類設定で比較する。単純な問題に対して、全てのモデルは密接な概念にうまく一般化するが、長い構成連鎖に苦しむ。置換性を含むより複雑なテストでは、全てのモデルは短鎖でも苦労する。これらの困難を強調し、さらなる実験を行うための環境を提供することで、構成的、関係的な領域において効果的に一般化できるモデルの開発を奨励したいと考えています。 Compositional and relational learning is a hallmark of human intelligence, but one which presents challenges for neural models. One difficulty in the development of such models is the lack of benchmarks with clear compositional and relational task structure on which to systematically evaluate them. In this paper, we introduce an environment called ConceptWorld, which enables the generation of images from compositional and relational concepts, defined using a logical domain specific language. We use it to generate images for a variety of compositional structures: 2x2 squares, pentominoes, sequences, scenes involving these objects, and other more complex concepts. We perform experiments to test the ability of standard neural architectures to generalize on relations with compositional arguments as the compositional depth of those arguments increases and under substitution. We compare standard neural networks such as MLP, CNN and ResNet, as well as state-of-the-art relational networks including WReN and PrediNet in a multi-class image classification setting. For simple problems, all models generalize well to close concepts but struggle with longer compositional chains. For more complex tests involving substitutivity, all models struggle, even with short chains. In highlighting these difficulties and providing an environment for further experimentation, we hope to encourage the development of models which are able to generalize effectively in compositional, relational domains.	翻訳日:2022-11-20 19:35:36 公開日:2020-07-08
# モデルベース強化学習におけるデルタスキーマネットワーク Delta Schema Network in Model-based Reinforcement Learning ( http://arxiv.org/abs/2006.09950v2 ) ライセンス: Link先を確認	Andrey Gorodetskiy, Alexandra Shlychkova, Aleksandr I. Panov	(参考訳) この研究は、伝達学習の非効率性である人工知能の未解決問題に焦点を当てている。強化学習の分野でこの問題を解決するために用いられるメカニズムの1つはモデルに基づくアプローチである。本稿では,環境データからオブジェクトとアクション間の論理的関係を抽出できるスキーマネットワーク手法を拡張している。我々は、デルタスキーマネットワーク(dsn)のトレーニング、環境の将来の状態の予測、積極的な報酬につながる行動計画のためのアルゴリズムを提案する。 DSNは、古典的なアタリゲーム環境において、転送学習の強い性能を示す。 This work is devoted to unresolved problems of Artificial General Intelligence - the inefficiency of transfer learning. One of the mechanisms that are used to solve this problem in the area of reinforcement learning is a model-based approach. In the paper we are expanding the schema networks method which allows to extract the logical relationships between objects and actions from the environment data. We present algorithms for training a Delta Schema Network (DSN), predicting future states of the environment and planning actions that will lead to positive reward. DSN shows strong performance of transfer learning on the classic Atari game environment.	翻訳日:2022-11-19 19:24:06 公開日:2020-07-08
# 連続時間限におけるメタ学習 Meta Learning in the Continuous Time Limit ( http://arxiv.org/abs/2006.10921v2 ) ライセンス: Link先を確認	Ruitu Xu, Lin Chen, Amin Karbasi	(参考訳) 本稿では,モデル非依存メタラーニング(MAML)の学習力学の基礎となる常微分方程式(ODE)を確立する。この過程の連続時間極限ビューは,手動で選択した勾配降下のステップサイズの影響を取り除き,特定の離散化から生じる特別な場合として,既存の勾配降下訓練アルゴリズムを含む。我々は,MAML損失が非凸である場合でも,MAML損失関数の近似定常点に対する線形収束率を強く凸することを示した。さらに,MAML ODE の解析を通じて,既存のMAML トレーニング手法に付随する計算負担を大幅に軽減する BI-MAML トレーニングアルゴリズムを提案する。理論的な知見を補完するため,提案手法の既存研究に対する優位性を示す実証実験を行った。 In this paper, we establish the ordinary differential equation (ODE) that underlies the training dynamics of Model-Agnostic Meta-Learning (MAML). Our continuous-time limit view of the process eliminates the influence of the manually chosen step size of gradient descent and includes the existing gradient descent training algorithm as a special case that results from a specific discretization. We show that the MAML ODE enjoys a linear convergence rate to an approximate stationary point of the MAML loss function for strongly convex task losses, even when the corresponding MAML loss is non-convex. Moreover, through the analysis of the MAML ODE, we propose a new BI-MAML training algorithm that significantly reduces the computational burden associated with existing MAML training methods. To complement our theoretical findings, we perform empirical experiments to showcase the superiority of our proposed methods with respect to the existing work.	翻訳日:2022-11-19 03:39:22 公開日:2020-07-08
# 深部ニューラルネットワークによる時間集合の予測 Predicting Temporal Sets with Deep Neural Networks ( http://arxiv.org/abs/2006.11483v4 ) ライセンス: Link先を確認	Le Yu, Leilei Sun, Bowen Du, Chuanren Liu, Hui Xiong, Weifeng Lv	(参考訳) 各集合が任意の数の要素を含む集合列が与えられたとき、時間集合予測の問題は、次の集合の要素を予測することを目的としている。実際には、時間的集合予測は時間的事象や時系列の予測モデルよりもはるかに複雑であり、まだ未解決の問題である。時間的集合予測の問題に適応した多くの既存の手法は、通常、まず時間的集合を潜在表現に投影し、次に潜在表現で予測モデルを学ぶことによって2段階の戦略に従う。 2段階のアプローチはしばしば情報損失と不満足な予測性能をもたらす。本稿では,時間的集合予測のためのディープニューラルネットワークに基づく統合解を提案する。このアプローチのユニークな視点は、集合レベルの共起グラフを構築して要素関係を学習し、動的関係グラフ上でグラフ畳み込みを実行することである。さらに,要素や集合の時間依存性を適応的に学習するアテンションベースモジュールを設計する。最後に、異なるシーケンスで隠れた共有パターンを見つけ、静的情報と動的情報を融合して予測性能を向上させるゲート更新機構を提供する。実世界のデータセットに関する実験は、トレーニングデータの一部であっても、我々のアプローチが競争力のあるパフォーマンスを達成でき、既存のメソッドをかなりのマージンで上回ることができることを示している。 Given a sequence of sets, where each set contains an arbitrary number of elements, the problem of temporal sets prediction aims to predict the elements in the subsequent set. In practice, temporal sets prediction is much more complex than predictive modelling of temporal events and time series, and is still an open problem. Many possible existing methods, if adapted for the problem of temporal sets prediction, usually follow a two-step strategy by first projecting temporal sets into latent representations and then learning a predictive model with the latent representations. The two-step approach often leads to information loss and unsatisfactory prediction performance. In this paper, we propose an integrated solution based on the deep neural networks for temporal sets prediction. A unique perspective of our approach is to learn element relationship by constructing set-level co-occurrence graph and then perform graph convolutions on the dynamic relationship graphs. Moreover, we design an attention-based module to adaptively learn the temporal dependency of elements and sets. Finally, we provide a gated updating mechanism to find the hidden shared patterns in different sequences and fuse both static and dynamic information to improve the prediction performance. Experiments on real-world data sets demonstrate that our approach can achieve competitive performances even with a portion of the training data and can outperform existing methods with a significant margin.	翻訳日:2022-11-18 22:01:14 公開日:2020-07-08
# 画像圧縮における損失情報のモデル化 Modeling Lost Information in Lossy Image Compression ( http://arxiv.org/abs/2006.11999v3 ) ライセンス: Link先を確認	Yaolong Wang, Mingqing Xiao, Chang Liu, Shuxin Zheng, Tie-Yan Liu	(参考訳) ロスシー画像圧縮は、デジタル画像の最もよく使われる演算子の1つである。近年提案された深層学習に基づく画像圧縮手法は, オートエンコーダ構造を活用し, この分野で有望な結果を得た。画像はまず低次元の潜伏特徴に符号化され、その後、統計冗長性を利用してエントロピー符号化される。しかし、エンコード中に失われた情報は、残念ながら避けられないため、デコーダが元の画像を再構築する上で大きな課題となる。本研究では,情報損失問題を抑えるために,ILC(Invertible Lossy Compression)と呼ばれる新しい非可逆的フレームワークを提案する。特に、ICCはエンコーダ・デコーダ構造を置き換えるための可逆符号化モジュールを導入し、低次元情報潜在表現を生成する一方で、失われた情報をさらなるコード化や保存を行わない補助潜在変数に変換する。潜伏表現は量子化されビットストリームに符号化され、潜伏変数は特定の分布、すなわち等方ガウス分布に従わざるを得ない。このようにして、サロゲート潜伏変数を容易に描画し、モジュールの逆パスにサンプル変数と復号化潜伏特徴を加えることにより、原画像の復元を可能にする。画像圧縮法におけるオートエンコーダを置き換えた新しいコンポーネントにより、ICCは既存の圧縮アルゴリズムと組み合わせることで、広範囲なベンチマークデータセット上でのベースライン手法を大幅に上回ることを示す。 Lossy image compression is one of the most commonly used operators for digital images. Most recently proposed deep-learning-based image compression methods leverage the auto-encoder structure, and reach a series of promising results in this field. The images are encoded into low dimensional latent features first, and entropy coded subsequently by exploiting the statistical redundancy. However, the information lost during encoding is unfortunately inevitable, which poses a significant challenge to the decoder to reconstruct the original images. In this work, we propose a novel invertible framework called Invertible Lossy Compression (ILC) to largely mitigate the information loss problem. Specifically, ILC introduces an invertible encoding module to replace the encoder-decoder structure to produce the low dimensional informative latent representation, meanwhile, transform the lost information into an auxiliary latent variable that won't be further coded or stored. The latent representation is quantized and encoded into bit-stream, and the latent variable is forced to follow a specified distribution, i.e. isotropic Gaussian distribution. In this way, recovering the original image is made tractable by easily drawing a surrogate latent variable and applying the inverse pass of the module with the sampled variable and decoded latent features. Experimental results demonstrate that with a new component replacing the auto-encoder in image compression methods, ILC can significantly outperform the baseline method on extensive benchmark datasets by combining with the existing compression algorithms.	翻訳日:2022-11-18 05:22:10 公開日:2020-07-08
# ソーシャルボット検出の10年 A Decade of Social Bot Detection ( http://arxiv.org/abs/2007.03604v2 ) ライセンス: Link先を確認	Stefano Cresci	(参考訳) 2016年11月9日朝、世界はアメリカ合衆国大統領選挙の衝撃的な結果に目覚めた: ドナルド・トランプは第45代アメリカ合衆国大統領だった。いまだに世界中に重大な結果をもたらす予期せぬ出来事。今日、少数のソーシャルボット、自動化されたソーシャルメディアアカウントが、分裂したメッセージや偽情報を広めるのに中心的な役割を果たし、おそらくトランプ氏の勝利に寄与したことを私たちは知っている。 2016年のアメリカ合衆国大統領選挙の後、世界はソーシャルメディアにおける広範な詐欺の重大さを認識し始めた。トランプ氏のエクスプロイトを受けて、私たちはボットの検出と削除に対する多くの努力と、これらの悪意ある俳優が我々の社会に与えた影響の増大の間に激しい不協和音の出現を目撃した。このパラドックスは、このソーシャルボットのパンデミックを防ぐために、どのような戦略を強制すべきなのか? 2020年米大統領選への出馬中、この問題はこれまでになく重要視されている。 2016年以降、社会的、政治的、経済的なアナリストが脳卒中を起こしたのは、少なくとも2010年以降、コンピュータ科学者にとって、詐欺と自動化が問題となっている。本稿では,ソーシャルボット検出における最初の10年の研究を簡潔に調査する。縦断的な分析によって、ボットとの戦いにおける研究の主なトレンド、達成された主な成果、そしてこの絶え間ない戦いを困難なものにする要因について論じる。広範な分析から学んだ教訓に乗じて、詐欺や操作に対する上限となる可能性のあるイノベーションを提案します。ソーシャルボット検出における10年間の研究は、戦略的情報操作や政治トロルなど、他の、より最近のオンライン詐欺の影響を検知し緩和するための戦略を知らせることもできる。 On the morning of November 9th 2016, the world woke up to the shocking outcome of the US Presidential elections: Donald Trump was the 45th President of the United States of America. An unexpected event that still has tremendous consequences all over the world. Today, we know that a minority of social bots, automated social media accounts mimicking humans, played a central role in spreading divisive messages and disinformation, possibly contributing to Trump's victory. In the aftermath of the 2016 US elections, the world started to realize the gravity of widespread deception in social media. Following Trump's exploit, we witnessed to the emergence of a strident dissonance between the multitude of efforts for detecting and removing bots, and the increasing effects that these malicious actors seem to have on our societies. This paradox opens a burning question: What strategies should we enforce in order to stop this social bot pandemic? In these times, during the run-up to the 2020 US elections, the question appears as more crucial than ever. What stroke social, political and economic analysts after 2016, deception and automation, has been however a matter of study for computer scientists since at least 2010. In this work, we briefly survey the first decade of research in social bot detection. Via a longitudinal analysis, we discuss the main trends of research in the fight against bots, the major results that were achieved, and the factors that make this never-ending battle so challenging. Capitalizing on lessons learned from our extensive analysis, we suggest possible innovations that could give us the upper hand against deception and manipulation. Studying a decade of endeavours at social bot detection can also inform strategies for detecting and mitigating the effects of other, more recent, forms of online deception, such as strategic information operations and political trolls.	翻訳日:2022-11-18 00:00:27 公開日:2020-07-08
# 注意して聞く: 残学習とガンマタン音声表現に基づく音声キャプションシステム Listen carefully and tell: an audio captioning system based on residual learning and gammatone audio representation ( http://arxiv.org/abs/2006.15406v4 ) ライセンス: Link先を確認	Sergi Perez-Castanos, Javier Naranjo-Alcazar, Pedro Zuccarello and Maximo Cobos	(参考訳) 自動音声キャプションは、自由テキストを用いて音声を記述することを目的としている機械聴取タスクである。音声を入力として受け取り、テキスト記述、すなわち信号のキャプションとして出力するので、自動的な音声キャプションシステムを実装する必要がある。このタスクは、自動コンテンツ記述やマシン間インタラクションなど、多くのアプリケーションで有用である。本研究では,エンコーダフェーズにおける残差学習に基づく音声キャプションの自動生成手法を提案する。エンコーダフェーズは、異なる残留ネットワーク構成によって実装される。デコーダフェーズ(キャプションの作成)は、繰り返しレイヤとアテンション機構を使用して実行される。選ばれた音声表現はガンマトインである。その結果,本研究で提案するフレームワークがベースラインシステムを上回ることが判明した。 Automated audio captioning is machine listening task whose goal is to describe an audio using free text. An automated audio captioning system has to be implemented as it accepts an audio as input and outputs as textual description, that is, the caption of the signal. This task can be useful in many applications such as automatic content description or machine-to-machine interaction. In this work, an automatic audio captioning based on residual learning on the encoder phase is proposed. The encoder phase is implemented via different Residual Networks configurations. The decoder phase (create the caption) is run using recurrent layers plus attention mechanism. The audio representation chosen has been Gammatone. Results show that the framework proposed in this work surpass the baseline system in challenge results.	翻訳日:2022-11-16 08:16:20 公開日:2020-07-08
# グラフニューラルネットワークのための経路積分に基づく畳み込みとプーリング Path Integral Based Convolution and Pooling for Graph Neural Networks ( http://arxiv.org/abs/2006.16811v2 ) ライセンス: Link先を確認	Zheng Ma, Junyu Xuan, Yu Guang Wang, Ming Li, Pietro Lio	(参考訳) グラフニューラルネットワーク(GNN)は、従来のニューラルネットワークの機能をグラフ構造化データに拡張する。 CNNと同様、グラフの畳み込みとプーリングを最適化した設計が成功の鍵である。物理からアイデアを借用し,グラフの分類と回帰処理のための経路積分型グラフニューラルネットワーク(PAN)を提案する。具体的には、メッセージ送信側と受信側を結ぶ全ての経路を経路長に応じて学習可能な重みでリンクし、最大エントロピーランダムウォークに対応する畳み込み演算を考える。これはグラフラプラシアンを、経路積分形式から導かれる極大エントロピー遷移(MET)行列と呼ばれる新しい遷移行列に一般化する。重要なことに、MET行列の対角成分は、部分グラフ中心性に直接関係しており、自然かつ適応的なプーリング機構を提供する。 panは、さまざまなサイズと構造を備えたさまざまなグラフデータ用に調整可能な汎用フレームワークを提供する。既存のほとんどのGNNアーキテクチャはPANの特別なケースとして見ることができます。実験結果から, PANは様々なグラフ分類/回帰タスクにおいて, 物理科学におけるGNNの適用性を高めるために, 統計力学による新しいベンチマークデータセットを含む最先端の性能を達成できることが示唆された。 Graph neural networks (GNNs) extends the functionality of traditional neural networks to graph-structured data. Similar to CNNs, an optimized design of graph convolution and pooling is key to success. Borrowing ideas from physics, we propose a path integral based graph neural networks (PAN) for classification and regression tasks on graphs. Specifically, we consider a convolution operation that involves every path linking the message sender and receiver with learnable weights depending on the path length, which corresponds to the maximal entropy random walk. It generalizes the graph Laplacian to a new transition matrix we call maximal entropy transition (MET) matrix derived from a path integral formalism. Importantly, the diagonal entries of the MET matrix are directly related to the subgraph centrality, thus providing a natural and adaptive pooling mechanism. PAN provides a versatile framework that can be tailored for different graph data with varying sizes and structures. We can view most existing GNN architectures as special cases of PAN. Experimental results show that PAN achieves state-of-the-art performance on various graph classification/regression tasks, including a new benchmark dataset from statistical mechanics we propose to boost applications of GNN in physical sciences.	翻訳日:2022-11-15 14:30:52 公開日:2020-07-08
# 表現的記述論理のための署名に基づくアブダクション -- 技術報告 Signature-Based Abduction for Expressive Description Logics -- Technical Report ( http://arxiv.org/abs/2007.00757v2 ) ライセンス: Link先を確認	Patrick Koopmann, Warren Del-Pinto, Sophie Tourret and Renate A. Schmidt	(参考訳) シグネチャベースのアブダクション(signal-based abduction)は、特定の名前セットであるシグネチャの上に仮説を構築することを目的としている。この種のアブダクションは、観察された症状に使用される語彙が、それらの症状を説明するのに期待される語彙とは異なる、診断などのタスクに有用である。本稿では,TBox と ABox の公理を含む表現的記述論理 ALC で表現された観測に対するシグネチャベース推論を解くための,最初の完全解法を提案する。この方法は有限かつ完全な仮説の集合を計算することが保証され、現実的な知識ベースで評価される。 Signature-based abduction aims at building hypotheses over a specified set of names, the signature, that explain an observation relative to some background knowledge. This type of abduction is useful for tasks such as diagnosis, where the vocabulary used for observed symptoms differs from the vocabulary expected to explain those symptoms. We present the first complete method solving signature-based abduction for observations expressed in the expressive description logic ALC, which can include TBox and ABox axioms, thereby solving the knowledge base abduction problem. The method is guaranteed to compute a finite and complete set of hypotheses, and is evaluated on a set of realistic knowledge bases.	翻訳日:2022-11-14 23:28:30 公開日:2020-07-08
# 衛星画像時系列分類のための軽量時間自己認識 Lightweight Temporal Self-Attention for Classifying Satellite Image Time Series ( http://arxiv.org/abs/2007.00586v3 ) ライセンス: Link先を確認	Vivien Sainte Fare Garnot and Loic Landrieu	(参考訳) 地球観測衛星データのアクセシビリティと精度の向上は、産業や国家のアクターにもかなりの機会を与えている。しかし、これはグローバルスケールで時系列を処理できる効率的な方法を要求する。リモートセンシング時間列の分類にマルチヘッド自己注意機構を用いた最近の研究に基づいて,時間的注意エンコーダの修正を提案する。本ネットワークでは,時間入力のチャネルを並列に動作している複数の小型アテンションヘッドに分散する。各ヘッドは高度に特殊化された時間的特徴を抽出し、その特徴を1つの表現に分解する。提案手法は,オープンアクセス衛星画像データセット上の他の最先端の時系列分類アルゴリズムを上回り,パラメータをかなり少なくし,計算複雑性を低減した。 The increasing accessibility and precision of Earth observation satellite data offers considerable opportunities for industrial and state actors alike. This calls however for efficient methods able to process time-series on a global scale. Building on recent work employing multi-headed self-attention mechanisms to classify remote sensing time sequences, we propose a modification of the Temporal Attention Encoder. In our network, the channels of the temporal inputs are distributed among several compact attention heads operating in parallel. Each head extracts highly-specialized temporal features which are in turn concatenated into a single representation. Our approach outperforms other state-of-the-art time series classification algorithms on an open-access satellite image dataset, while using significantly fewer parameters and with a reduced computational complexity.	翻訳日:2022-11-14 22:27:12 公開日:2020-07-08
# 学習表現の線形識別性について On Linear Identifiability of Learned Representations ( http://arxiv.org/abs/2007.00810v3 ) ライセンス: Link先を確認	Geoffrey Roeder, Luke Metz and Diederik P. Kingma	(参考訳) 同定可能性(identifiability)は、統計モデルの望ましい性質である: 真のモデルパラメータは、十分な計算資源とデータから、任意の所望の精度に推定できることを意味する。表現学習の文脈における識別可能性について検討する: 下流課題に対して最適である非線形データ表現を発見する。ディープニューラルネットワークとしてパラメータ化される場合、そのような表現関数は設計によって過度にパラメータ化されるため、パラメータ空間における識別可能性に欠ける。本稿では, 非線形ICAの最近の進歩を基盤として, 線形不確定性に至るまでの関数空間において, 判別モデルの大きなファミリーが実際に識別可能であることを示すことによって, 識別可能性の回復を目指す。さまざまなドメインで表現学習を行う多くのモデルは、テキスト、画像、音声、出版時の最先端技術など、この意味で識別されている。線形同定可能性の十分条件を導出し,シミュレーションデータと実世界データの両方に対して経験的支援を行う。 Identifiability is a desirable property of a statistical model: it implies that the true model parameters may be estimated to any desired precision, given sufficient computational resources and data. We study identifiability in the context of representation learning: discovering nonlinear data representations that are optimal with respect to some downstream task. When parameterized as deep neural networks, such representation functions typically lack identifiability in parameter space, because they are overparameterized by design. In this paper, building on recent advances in nonlinear ICA, we aim to rehabilitate identifiability by showing that a large family of discriminative models are in fact identifiable in function space, up to a linear indeterminacy. Many models for representation learning in a wide variety of domains have been identifiable in this sense, including text, images and audio, state-of-the-art at time of publication. We derive sufficient conditions for linear identifiability and provide empirical support for the result on both simulated and real-world data.	翻訳日:2022-11-14 22:16:57 公開日:2020-07-08
# 私はブラックボックスエージェントを作るのかブラックボックスエージェントを解釈するのか? Am I Building a White Box Agent or Interpreting a Black Box Agent? ( http://arxiv.org/abs/2007.01187v3 ) ライセンス: Link先を確認	Tom Bewley	(参考訳) ブラックボックス関数の解釈可能なモデルを構築する場合、忠実性に対する最適化は、基礎となるタスクのパフォーマンスを低下させる可能性があり、その逆も同様である。私は、このジレンマが現代における説明可能な人工知能の分野と関連性を再評価し、ブラックボックスが動的環境と相互作用するエージェントであるときにどのように複合化されるかを強調する。次に、ホワイトボックスエージェントの構築とブラックボックスエージェントの解釈という、2つの独立した研究方向について議論します。 The rule extraction literature contains the notion of a fidelity-accuracy dilemma: when building an interpretable model of a black box function, optimising for fidelity is likely to reduce performance on the underlying task, and vice versa. I reassert the relevance of this dilemma for the modern field of explainable artificial intelligence, and highlight how it is compounded when the black box is an agent interacting with a dynamic environment. I then discuss two independent research directions - building white box agents and interpreting black box agents - which are both coherent and worthy of attention, but must not be conflated by researchers embarking on projects in the domain of agent interpretability.	翻訳日:2022-11-14 14:18:37 公開日:2020-07-08
# 観測不能リセットによる時間自動学習 Active learning of timed automata with unobservable resets ( http://arxiv.org/abs/2007.01637v2 ) ライセンス: Link先を確認	L\'eo Henry, Nicolas Markey, Thierry J\'eron	(参考訳) 時間付き言語のアクティブラーニングは、観察された時間付き単語から時間付きオートマトンを推論することに関連する。エージェントは、対象言語における単語のメンバシップを問い合わせたり、候補モデルを提案し、そのターゲットに対する等価性を検証する。このフレームワークの主な難点は、クロックリセットの推論であり、時間付きオートマタのダイナミクスの中心だが、直接観測できない。興味深い最初のステップは、クロックリセットが観測と結びつくイベント記録オートマタのサブクラスに制限することですでに実現されている。一般時間オートマトンを学習するために、この手法をリセットなしイベント記録オートマトン(reset-free event-recording automateda)と呼ばれる新しいクラスに一般化する。これは、可読性のためにイベント記録自動化のシンプルなフレームワークを維持しながら、ジェネリックタイムドオートマトンと同じ課題を提供する。私たちの貢献の中心は、無効性の概念とそれを扱うアルゴリズムとデータ構造であり、一般的なタイムドオートマトンのための効率的な能動的学習手順の鍵である観測に矛盾するリセット仮説のオンザフライ検出とプルーニングを可能にする。 Active learning of timed languages is concerned with the inference of timed automata from observed timed words. The agent can query for the membership of words in the target language, or propose a candidate model and verify its equivalence to the target. The major difficulty of this framework is the inference of clock resets, central to the dynamics of timed automata, but not directly observable. Interesting first steps have already been made by restricting to the subclass of event-recording automata, where clock resets are tied to observations. In order to advance towards learning of general timed automata, we generalize this method to a new class, called reset-free event-recording automata, where some transitions may reset no clocks. This offers the same challenges as generic timed automata while keeping the simpler framework of event-recording automata for the sake of readability. Central to our contribution is the notion of invalidity, and the algorithm and data structures to deal with it, allowing on-the-fly detection and pruning of reset hypotheses that contradict observations, a key to any efficient active-learning procedure for generic timed automata.	翻訳日:2022-11-14 06:18:22 公開日:2020-07-08
# AVP-SLAM:駐車場における自律走行車両のセマンティック視覚マッピングと位置決め AVP-SLAM: Semantic Visual Mapping and Localization for Autonomous Vehicles in the Parking Lot ( http://arxiv.org/abs/2007.01813v2 ) ライセンス: Link先を確認	Tong Qin, Tongqing Chen, Yilun Chen, and Qing Su	(参考訳) 自動駐車は自動運転車の特定の用途である。このタスクでは、車両は狭く、混雑していて、GPSを付加した駐車場で移動する必要がある。正確なローカライゼーション能力は非常に重要です。従来の視覚ベースの手法は、テクスチャのない領域、繰り返し構造、外観の変化によって失われた追跡に悩まされる。本稿では、ロバストなセマンティックな特徴を利用して、駐車場に地図を構築し、車両をローカライズする。セマンティックな特徴には、通常駐車場に現れる案内標識、駐車場、スピードバンプなどが含まれる。従来の特徴と比較して、これらの意味的特徴は長期的な安定であり、視点と照明の変化に対して堅牢である。我々は4つのサラウンドビューカメラを用いて知覚範囲を拡大する。 IMU (Inertial Measurement Unit) とホイールエンコーダ(ホイールエンコーダ)の支援により,提案システムはグローバルな視覚意味マップを生成する。この地図はさらに、車両をセンチメートルレベルでローカライズするために使われる。我々は,システムの精度とリコールを分析し,実実験で他の手法と比較する。さらに,提案システムの実現可能性について,自動駐車による実証実験を行った。 Autonomous valet parking is a specific application for autonomous vehicles. In this task, vehicles need to navigate in narrow, crowded and GPS-denied parking lots. Accurate localization ability is of great importance. Traditional visual-based methods suffer from tracking lost due to texture-less regions, repeated structures, and appearance changes. In this paper, we exploit robust semantic features to build the map and localize vehicles in parking lots. Semantic features contain guide signs, parking lines, speed bumps, etc, which typically appear in parking lots. Compared with traditional features, these semantic features are long-term stable and robust to the perspective and illumination change. We adopt four surround-view cameras to increase the perception range. Assisting by an IMU (Inertial Measurement Unit) and wheel encoders, the proposed system generates a global visual semantic map. This map is further used to localize vehicles at the centimeter level. We analyze the accuracy and recall of our system and compare it against other methods in real experiments. Furthermore, we demonstrate the practicability of the proposed system by the autonomous parking application.	翻訳日:2022-11-14 06:04:08 公開日:2020-07-08
# graph2kernel grid-lstm:適応型近傍学習による歩行者追跡予測のためのマルチキュードモデル Graph2Kernel Grid-LSTM: A Multi-Cued Model for Pedestrian Trajectory Prediction by Learning Adaptive Neighborhoods ( http://arxiv.org/abs/2007.01915v2 ) ライセンス: Link先を確認	Sirin Haddad and Siew Kei Lam	(参考訳) 歩行者軌跡予測は,歩行軌跡の時間的表現にLong Short-Term Memory (LSTM) を広範囲に用い,群集の社会的・文脈的相互作用のモデル化に向けた顕著な研究トラックである。既存のアプローチでは、仮想地区を固定グリッドとして使用し、歩行者の社会的状態をプールし、社会的相互作用の捉え方を制御するチューニングプロセスを提供する。これは特定のシーンにパフォーマンスをカスタマイズするが、アプローチの一般化能力は低下する。本研究では,多次元特徴入力上で動作するLSTMの拡張であるtextit{Grid-LSTM}をデプロイする。本稿では,歩行者近傍がデザインに適応可能となることを提案し,インタラクションモデリングの新しい視点を提案する。エンコーダとして \textit{Grid-LSTM} を用いて, 視覚的境界と空間的境界を考慮し, 将来の歩行者運動への影響を学習する。我々のモデルは、いくつかの公開テストされた監視ビデオに類似した特徴を照合する最先端のアプローチよりも優れています。実験の結果は、シーンの特徴や群集のダイナミクスによって異なるデータセットにまたがるアプローチの一般化を明確に示している。 Pedestrian trajectory prediction is a prominent research track that has advanced towards modelling of crowd social and contextual interactions, with extensive usage of Long Short-Term Memory (LSTM) for temporal representation of walking trajectories. Existing approaches use virtual neighborhoods as a fixed grid for pooling social states of pedestrians with tuning process that controls how social interactions are being captured. This entails performance customization to specific scenes but lowers the generalization capability of the approaches. In our work, we deploy \textit{Grid-LSTM}, a recent extension of LSTM, which operates over multidimensional feature inputs. We present a new perspective to interaction modeling by proposing that pedestrian neighborhoods can become adaptive in design. We use \textit{Grid-LSTM} as an encoder to learn about potential future neighborhoods and their influence on pedestrian motion given the visual and the spatial boundaries. Our model outperforms state-of-the-art approaches that collate resembling features over several publicly-tested surveillance videos. The experiment results clearly illustrate the generalization of our approach across datasets that varies in scene features and crowd dynamics.	翻訳日:2022-11-14 05:57:00 公開日:2020-07-08
# 弱教師付きセマンティクスセグメンテーションのためのクロスイメージセマンティクスのマイニング Mining Cross-Image Semantics for Weakly Supervised Semantic Segmentation ( http://arxiv.org/abs/2007.01947v2 ) ライセンス: Link先を確認	Guolei Sun and Wenguan Wang and Jifeng Dai and Luc Van Gool	(参考訳) 本稿では,画像レベルの監視のみからセマンティックセグメンテーションを学習する問題を考察する。現在のポピュラーなソリューションでは、分類器からのオブジェクトローカライゼーションマップを監視信号として利用し、ローカライゼーションマップがより完全なオブジェクトコンテンツを取得するのに苦労している。画像内情報に重点を置く従来の取り組みよりも、総合的なオブジェクトパターンマイニングのためのクロスイメージ意味関係の価値に対処する。これを実現するために、2つのニューラルコアテンションが分類器に組み込まれ、画像間の意味的類似性と差異を補足的に捉える。特に、一対の訓練画像が与えられた場合、一対のコアテンションは、コアテンティブオブジェクトから共通のセマンティクスを認識するように分類器を強制するが、他方のコアテンションは、コントラッシブコアテンションと呼ばれ、他の非共有オブジェクトから未共有セマンティクスを識別するために分類器を駆動する。これにより、分類器は画像領域におけるより多くのオブジェクトパターンとより良い接地セマンティクスを発見するのに役立つ。オブジェクトパターン学習の促進に加えて、コアテンションは他の関連画像からのコンテキストを活用してローカライゼーションマップ推論を改善し、最終的にはセマンティックセグメンテーション学習の恩恵を受ける。さらに,本アルゴリズムは,(1)正確な画像レベルの監視のみによるWSSS学習,(2)余分な単純な単一ラベルデータ,(3)余分なノイズの多いWebデータといった,WSSS設定をうまく扱う統一的なフレームワークを提供する。これらすべての設定に新たな最先端技術を設定し、その有効性と一般化性を示す。さらに,本手法はCVPR2020 Learning from Imperfect Data Challengeの弱いスーパービジョンのセマンティックセマンティックセグメンテーショントラックで1位にランクインした。 This paper studies the problem of learning semantic segmentation from image-level supervision only. Current popular solutions leverage object localization maps from classifiers as supervision signals, and struggle to make the localization maps capture more complete object content. Rather than previous efforts that primarily focus on intra-image information, we address the value of cross-image semantic relations for comprehensive object pattern mining. To achieve this, two neural co-attentions are incorporated into the classifier to complimentarily capture cross-image semantic similarities and differences. In particular, given a pair of training images, one co-attention enforces the classifier to recognize the common semantics from co-attentive objects, while the other one, called contrastive co-attention, drives the classifier to identify the unshared semantics from the rest, uncommon objects. This helps the classifier discover more object patterns and better ground semantics in image regions. In addition to boosting object pattern learning, the co-attention can leverage context from other related images to improve localization map inference, hence eventually benefiting semantic segmentation learning. More essentially, our algorithm provides a unified framework that handles well different WSSS settings, i.e., learning WSSS with (1) precise image-level supervision only, (2) extra simple single-label data, and (3) extra noisy web data. It sets new state-of-the-arts on all these settings, demonstrating well its efficacy and generalizability. Moreover, our approach ranked 1st place in the Weakly-Supervised Semantic Segmentation Track of CVPR2020 Learning from Imperfect Data Challenge.	翻訳日:2022-11-14 05:20:27 公開日:2020-07-08
# リレーショナル思考に基づく音声認識のためのディープグラフランダム処理 Deep Graph Random Process for Relational-Thinking-Based Speech Recognition ( http://arxiv.org/abs/2007.02126v2 ) ライセンス: Link先を確認	Hengguan Huang, Fuzhao Xue, Hao Wang, Ye Wang	(参考訳) 人間の知性の中心にあるリレーショナル思考は、最初は、新しい感覚信号と事前知識の関係に関する無意識の知覚に依存し、結果としてこれらの知覚の結合と変換を通じて認識可能な概念や物体となる。このようなメンタルプロセスは、会話の自動音声認識(ASR)のような現実的な問題ではモデル化が困難であり、(発話間の関係を示すグラフとしてモデル化されている場合)パーセプションは無数であり、直接観察できない。本稿では,パーセプタを表現する無限個の確率グラフを生成可能な,ディープグラフランダム処理(dgp)と呼ばれるベイズ非パラメトリック深層学習手法を提案する。さらに,音響モデリングのための知覚グラフの結合と変換のための閉形式解を提案する。我々の手法は、訓練中に関係データを用いることなく、発話間の関係を推測できる。 CHiME-2およびCHiME-5を含むASRタスクの実験的評価により,本手法の有効性とメリットが示された。 Lying at the core of human intelligence, relational thinking is characterized by initially relying on innumerable unconscious percepts pertaining to relations between new sensory signals and prior knowledge, consequently becoming a recognizable concept or object through coupling and transformation of these percepts. Such mental processes are difficult to model in real-world problems such as in conversational automatic speech recognition (ASR), as the percepts (if they are modelled as graphs indicating relationships among utterances) are supposed to be innumerable and not directly observable. In this paper, we present a Bayesian nonparametric deep learning method called deep graph random process (DGP) that can generate an infinite number of probabilistic graphs representing percepts. We further provide a closed-form solution for coupling and transformation of these percept graphs for acoustic modeling. Our approach is able to successfully infer relations among utterances without using any relational data during training. Experimental evaluations on ASR tasks including CHiME-2 and CHiME-5 demonstrate the effectiveness and benefits of our method.	翻訳日:2022-11-13 13:19:28 公開日:2020-07-08
# OpenStreetMapのための人間支援人工知能による自然特徴作成技術 Human Assisted Artificial Intelligence Based Technique to Create Natural Features for OpenStreetMap ( http://arxiv.org/abs/2007.02149v2 ) ライセンス: Link先を確認	Piyush Yadav, Dipto Sarkar, Shailesh Deshpande, Edward Curry	(参考訳) 本研究では,ランドサットやセンチネルなどの衛星画像を用いて,人間の編集者がイニシエータやバリデータとして行動するosm上の自然な特徴を創造するaiベースの手法を提案する。この手法は、人間の入力を機械と結合して複雑な問題を効率的に解き、純粋な自律プロセスと比較するインタラクティブ機械学習技術に基づいている。ボトムアップアプローチでは、画像のスペクトルシグネチャを使用してクラスを抽出し、後に編集可能な機能に変換して自然な機能を生成するために、マシンラーニング(ml)パイプラインをエディターとループで使用する。 In this work, we propose an AI-based technique using freely available satellite images like Landsat and Sentinel to create natural features over OSM in congruence with human editors acting as initiators and validators. The method is based on Interactive Machine Learning technique where human inputs are coupled with the machine to solve complex problems efficiently as compare to pure autonomous process. We use a bottom-up approach where a machine learning (ML) pipeline in loop with editors is used to extract classes using spectral signatures of images and later convert them to editable features to create natural features.	翻訳日:2022-11-13 13:09:38 公開日:2020-07-08
# 大規模ターゲット広告システムのための多次元学習 Multi-Manifold Learning for Large-scale Targeted Advertising System ( http://arxiv.org/abs/2007.02334v2 ) ライセンス: Link先を確認	Kyuyong Shin, Young-Jin Park, Kyung-Min Kim, Sunyoung Kwon	(参考訳) messenger広告(ads)は、直接的および個人的ユーザー体験を提供し、高いコンバージョン率と売上をもたらす。しかし、人々は広告に懐疑的であり、時にはスパムだと認識し、最終的にはユーザー満足度が低下する。特定の広告メッセージに興味を示す個人に対して広告を提供するターゲット広告は、強く求められている。正確なユーザーターゲティングの成功の鍵は、埋め込み空間における正確なユーザーと広告表現を学ぶことである。過去の研究の多くはユークリッド空間における表現学習を制限してきたが、近年の研究では、ソーシャルネットワークやレコメンダシステム、広告といった現実のデータセットから生じる複雑なネットワーク特性の異なる射影に対する双曲的多様体学習が提案されている。本稿では,ハイパーボリック空間におけるユーザと広告の階層構造を効果的に学習し,マルチマニフォールド学習に拡張するフレームワークを提案する。学習可能な曲率を持つ複数の双曲多様体を構築し,ユーザとアドの表現を各多様体にマッピングする。各多様体の起源は、各ユーザクラスタのセンタロイドとして設定される。各広告のユーザ嗜好を双曲空間内の2つのエンティティ間の距離を用いて推定し、学習された複数の多様体から算出された値を集約して最終予測を行う。提案手法を,公開ベンチマークデータセットと大規模商用メッセンジャーシステムLINE上で評価し,その性能向上による有効性を示す。 Messenger advertisements (ads) give direct and personal user experience yielding high conversion rates and sales. However, people are skeptical about ads and sometimes perceive them as spam, which eventually leads to a decrease in user satisfaction. Targeted advertising, which serves ads to individuals who may exhibit interest in a particular advertising message, is strongly required. The key to the success of precise user targeting lies in learning the accurate user and ad representation in the embedding space. Most of the previous studies have limited the representation learning in the Euclidean space, but recent studies have suggested hyperbolic manifold learning for the distinct projection of complex network properties emerging from real-world datasets such as social networks, recommender systems, and advertising. We propose a framework that can effectively learn the hierarchical structure in users and ads on the hyperbolic space, and extend to the Multi-Manifold Learning. Our method constructs multiple hyperbolic manifolds with learnable curvatures and maps the representation of user and ad to each manifold. The origin of each manifold is set as the centroid of each user cluster. The user preference for each ad is estimated using the distance between two entities in the hyperbolic space, and the final prediction is determined by aggregating the values calculated from the learned multiple manifolds. We evaluate our method on public benchmark datasets and a large-scale commercial messenger system LINE, and demonstrate its effectiveness through improved performance.	翻訳日:2022-11-13 07:56:16 公開日:2020-07-08
# 機械学習を用いたウェアラブル慣性測定装置による歩行からの接地距離の推定 Estimation of Ground Contacts from Human Gait by a Wearable Inertial Measurement Unit using machine learning ( http://arxiv.org/abs/2007.02433v2 ) ライセンス: Link先を確認	Muhammad Junaid Umer and Qaiser Riaz	(参考訳) 運動障害のリハビリテーションのためのロボティクスシステムと運動支援の意図が高まっている。このシナリオでは、地上接触の推定はロボット工学と医療の研究の活発な分野である。本稿では,健常者歩行における左右足の推定と分類について,胸部および腰部のIMUセンサデータに基づく検討を行った。この目的のために, 胸部, 下肢に2台のスマートフォン, 右足首に1台のスマートウォッチを用いて, 被験者48名のIMUデータを収集した。アプローチデータの堅牢性を示すため、6つの異なる表面(道路タイルカーペットコンクリートと土)で収集した。右足首センサデータに基づいて腰部および胸部センサの記録データを単段に分割し,各分割したステップの時間周波数とウェーブレット領域から計408個の特徴を算出した。分類タスクでは、SVMとRFの2つの機械学習分類器を10倍のクロス検証法で訓練した。個々の表面,硬質表面,軟質表面および全表面の分類実験を行い,98.88%の精度で各表面の精度が最も高かった。さらに、硬質軟質および全表面の分類率はそれぞれ95.60%、94.38%、95.05%である。その結果, 物体の背後と胸部からの角速度と加速度の6次元データを用いて, 異なる面における正常な歩行による接地推定を高精度に行うことができた。 Robotics system for rehabilitation of movement disorders and motion assistance are gaining increased intention. In this scenario estimation of ground contact is an active area of research in robotics and healthcare. This article addresses the estimation and classification of right and left foot during the healthy human gait based on the IMU sensor data of chest and lower back. For this purpose we have collected an IMU data of 48 subjects by using two smartphones at chest and lower back of the human body and one smart watch at right ankle of the body. To show the robustness of our approach data was collected at six different surfaces (road tiles carpet grass concrete and soil). The recorded data of lower back and chest sensor was segmented into single steps on the basis of right ankle sensor data, then we computed a total of 408 features from time frequency and wavelet domain of each segmented step. For classification task we have trained two machine learning classifiers SVM and RF with 10 fold cross validation method. We performed classification experiments at individual surfaces, hard surfaces, soft surfaces and all surfaces, highest accuracy was achieved at individual surfaces with accuracy index of 98.88%. Furthermore, classification rate at hard soft and all surface are 95.60%, 94.38% and 95.05% respectively. The results shows that estimation of ground contact form normal human walk at different surfaces can be performed with high accuracy using 6D data of angular velocities and accelerations from chest and lower back location of the body.	翻訳日:2022-11-13 07:53:20 公開日:2020-07-08
# 大規模物質移動のためのガイドファインチューニング Guided Fine-Tuning for Large-Scale Material Transfer ( http://arxiv.org/abs/2007.03059v2 ) ライセンス: Link先を確認	Valentin Deschaintre, George Drettakis and Adrien Bousseau	(参考訳) 本稿では, SVBRDFの外観を類似材料を表す対象画像に転送する手法を提案する。提案手法は,対象画像からSVBRDF値と類似したSVBRDF値の抽出を学習できるように,提供した例の深い外観キャプチャネットワークを微調整する。このシンプルなアプローチの強みを示す2つの新しい材料キャプチャーと設計ワークフローを導入する。最初のワークフローでは、少数の画像から大規模オブジェクトの可塑性SVBRDFを生成することができる。具体的には、ユーザーは大きな表面の1枚の写真と、その詳細の一部のクローズアップフラッシュ写真だけを撮る必要がある。本手法では, 壁面や床, 家具など, 数メートルの幅の広い表面を軽量に捕捉し, 壁面からSVBRDFパラメータを抽出する手法と, これらのパラメータを表面全体に伝達する手法を用いている。第2のワークフローでは、ユーザが既存のSVBRDFの外観を移譲することで、インターネット画像から大きなSVBRDFを作成する強力な方法を提供する。異なる例を選択すれば、ユーザはターゲット画像に割り当てられた素材を制御でき、深い外観キャプチャによる創造可能性を大幅に向上することができる。 We present a method to transfer the appearance of one or a few exemplar SVBRDFs to a target image representing similar materials. Our solution is extremely simple: we fine-tune a deep appearance-capture network on the provided exemplars, such that it learns to extract similar SVBRDF values from the target image. We introduce two novel material capture and design workflows that demonstrate the strength of this simple approach. Our first workflow allows to produce plausible SVBRDFs of large-scale objects from only a few pictures. Specifically, users only need take a single picture of a large surface and a few close-up flash pictures of some of its details. We use existing methods to extract SVBRDF parameters from the close-ups, and our method to transfer these parameters to the entire surface, enabling the lightweight capture of surfaces several meters wide such as murals, floors and furniture. In our second workflow, we provide a powerful way for users to create large SVBRDFs from internet pictures by transferring the appearance of existing, pre-designed SVBRDFs. By selecting different exemplars, users can control the materials assigned to the target image, greatly enhancing the creative possibilities offered by deep appearance capture.	翻訳日:2022-11-13 03:12:57 公開日:2020-07-08
# 大規模多言語ASR:50言語,1モデル,10億パラメータ Massively Multilingual ASR: 50 Languages, 1 Model, 1 Billion Parameters ( http://arxiv.org/abs/2007.03001v2 ) ライセンス: Link先を確認	Vineel Pratap, Anuroop Sriram, Paden Tomasello, Awni Hannun, Vitaliy Liptchinsky, Gabriel Synnaeve, Ronan Collobert	(参考訳) 我々は,低リソース言語における音声認識(ASR)の性能向上と,多様な言語をサポートするASRシステムの展開を単純化することを目的として,複数の言語を対象とした単一音響モデルの訓練を行った。言語別トレーニングデータ(100時間から1100時間)によって,51言語を対象とした広範なベンチマークを実施した。入力言語を知らずに単一関節モデルから多言語学習の3つの変種を、この情報を用いて複数の頭部(言語クラスタ毎に1つ)と比較する。複数の言語におけるASRモデルの多言語学習は、特に低リソース言語における認識性能を向上させることができることを示す。ジョイントモデルでは20.9%,23%,28.8%,単言語ベースラインでは28.8%,言語入力を伴うジョイントモデルでは20.9%,マルチヘッドモデルでは28.8%であった。私たちの知る限り、これは50以上の言語と16,000時間以上のオーディオを持つ多言語ASRを大規模に研究する最初の作品です。 We study training a single acoustic model for multiple languages with the aim of improving automatic speech recognition (ASR) performance on low-resource languages, and over-all simplifying deployment of ASR systems that support diverse languages. We perform an extensive benchmark on 51 languages, with varying amount of training data by language(from 100 hours to 1100 hours). We compare three variants of multilingual training from a single joint model without knowing the input language, to using this information, to multiple heads (one per language cluster). We show that multilingual training of ASR models on several languages can improve recognition performance, in particular, on low resource languages. We see 20.9%, 23% and 28.8% average WER relative reduction compared to monolingual baselines on joint model, joint model with language input and multi head model respectively. To our knowledge, this is the first work studying multilingual ASR at massive scale, with more than 50 languages and more than 16,000 hours of audio across them.	翻訳日:2022-11-13 03:04:37 公開日:2020-07-08
# 解剖学的構造を正確に抽出する学習 Learning to Segment Anatomical Structures Accurately from One Exemplar ( http://arxiv.org/abs/2007.03052v2 ) ライセンス: Link先を確認	Yuhang Lu, Weijian Li, Kang Zheng, Yirui Wang, Adam P. Harrison, Chihung Lin, Song Wang, Jing Xiao, Le Lu, Chang-Fu Kuo, Shun Miao	(参考訳) 重要な解剖学的構造の正確なセグメンテーションは、医療画像解析の核心にある。主なボトルネックは、必要なエキスパートラベルのイメージアノテーションをスケーラブルに収集することです。大量の注釈付きトレーニング画像を用いることなく、正確な解剖学的構造セグメンテーションを作成できる方法は非常に望ましい。本稿では,自然に組み込まれた人間のループ機構を備えた単発解剖セグメンタであるcontour transformer network(ctn)の新たな貢献を提案する。セグメンテーションはグラフ畳み込みネットワーク(gcns)に基づく輪郭進化過程を学習することによって定式化される。我々のCTNモデルのトレーニングにはラベル付き画像のみが必要であり、輪郭のグローバルな形状と外観の整合性を測定するために新たに導入された損失関数を通じてラベル付きデータを活用する。本手法は,非学習型手法を著しく上回り,最先端の完全教師付き深層学習手法と競合することを実証する。最小限のHuman-in-the-loop編集フィードバックにより、セグメンテーション性能をさらに改善し、オブザーバの望ましい結果に合わせることができる。これにより、臨床医による画像に基づくバイオマーカー評価(パーソナライズされた定量的臨床診断をサポートする)が容易になり、完全に監督された基準を上回ることができる。 Accurate segmentation of critical anatomical structures is at the core of medical image analysis. The main bottleneck lies in gathering the requisite expert-labeled image annotations in a scalable manner. Methods that permit to produce accurate anatomical structure segmentation without using a large amount of fully annotated training images are highly desirable. In this work, we propose a novel contribution of Contour Transformer Network (CTN), a one-shot anatomy segmentor including a naturally built-in human-in-the-loop mechanism. Segmentation is formulated by learning a contour evolution behavior process based on graph convolutional networks (GCNs). Training of our CTN model requires only one labeled image exemplar and leverages additional unlabeled data through newly introduced loss functions that measure the global shape and appearance consistency of contours. We demonstrate that our one-shot learning method significantly outperforms non-learning-based methods and performs competitively to the state-of-the-art fully supervised deep learning approaches. With minimal human-in-the-loop editing feedback, the segmentation performance can be further improved and tailored towards the observer desired outcomes. This can facilitate the clinician designed imaging-based biomarker assessments (to support personalized quantitative clinical diagnosis) and outperforms fully supervised baselines.	翻訳日:2022-11-13 02:54:57 公開日:2020-07-08
# 残留特徴注意深層ニューラルネットワークを用いたリモートセンシング画像のマルチイメージ超解像 Multi-image Super Resolution of Remotely Sensed Images using Residual Feature Attention Deep Neural Networks ( http://arxiv.org/abs/2007.03107v2 ) ライセンス: Link先を確認	Francesco Salvetti, Vittorio Mazzia, Aleem Khaliq, Marcello Chiaberge	(参考訳) 畳み込みニューラルネットワーク(cnns)は、画像スーパーレゾリューション(sr)における最先端の成果を一貫して証明されており、リモートセンシング分野において、キャプチャされたデータからさらなる情報や知識を抽出する絶好の機会である。しかし、文献で発表された作品の多くは、シングルイメージ超解法問題に焦点を当てている。現在、衛星ベースのリモートセンシングプラットフォームは、高時間分解能と低空間分解能の巨大なデータ可用性を提供している。本研究は,マルチイメージ超解像課題に効果的に取り組み,同時に空間的・時間的相関を利用して複数の画像を組み合わせる新しい残像注意モデル(RAMS)を提案する。本研究では3次元畳み込みによる視覚特徴の注意機構を導入し,複数の低解像度画像の認識データ融合と情報抽出を行い,局所的な畳み込み操作の限界を克服する。さらに,同じシーンで複数の入力を複数持つことで,ネステッド残差接続を広範囲に活用し,冗長な低周波信号を流し,より重要な高周波成分に演算を集中させる。単一画像または複数画像の超解像に対して利用可能な他のソリューションに対する大規模な実験と評価を行い、提案した深層学習に基づくソリューションがリモートセンシングアプリケーションにおけるマルチイメージ超解像の最先端とみなすことができることを示した。 Convolutional Neural Networks (CNNs) have been consistently proved state-of-the-art results in image Super-Resolution (SR), representing an exceptional opportunity for the remote sensing field to extract further information and knowledge from captured data. However, most of the works published in the literature have been focusing on the Single-Image Super-Resolution problem so far. At present, satellite based remote sensing platforms offer huge data availability with high temporal resolution and low spatial resolution. In this context, the presented research proposes a novel residual attention model (RAMS) that efficiently tackles the multi-image super-resolution task, simultaneously exploiting spatial and temporal correlations to combine multiple images. We introduce the mechanism of visual feature attention with 3D convolutions in order to obtain an aware data fusion and information extraction of the multiple low-resolution images, transcending limitations of the local region of convolutional operations. Moreover, having multiple inputs with the same scene, our representation learning network makes extensive use of nestled residual connections to let flow redundant low-frequency signals and focus the computation on more important high-frequency components. Extensive experimentation and evaluations against other available solutions, either for single or multi-image super-resolution, have demonstrated that the proposed deep learning-based solution can be considered state-of-the-art for Multi-Image Super-Resolution for remote sensing applications.	翻訳日:2022-11-13 02:18:11 公開日:2020-07-08
# パラメトリックマシン:アーキテクチャ検索への新しいアプローチ Parametric machines: a fresh approach to architecture search ( http://arxiv.org/abs/2007.02777v2 ) ライセンス: Link先を確認	Pietro Vertechi, Patrizio Frosini, Mattia G. Bergomi	(参考訳) カテゴリ理論のツールを使用して、ニューラルネットワークとそのアーキテクチャを形式的に記述できるフレームワークを提供する。まず、一般的な分類学的文脈で機械の概念を定義し、より複雑なものにいかに単純な機械を結合できるかを示す。ニューラルネットワークと神経常微分方程式を一般化した,有限かつ無限大の機械を探索する。関数解析とカーネル法からアイデアを借用し、マシンの完全でノルム化された無限次元空間を構築し、与えられた計算問題を解決するために最適なアーキテクチャとパラメーターを見つける方法について議論する。我々の数値実験では、これらのカーネルにインスパイアされたネットワークは、トレーニングデータセットが小さい場合、古典的なニューラルネットワークより優れている。 Using tools from category theory, we provide a framework where artificial neural networks, and their architectures, can be formally described. We first define the notion of machine in a general categorical context, and show how simple machines can be combined into more complex ones. We explore finite- and infinite-depth machines, which generalize neural networks and neural ordinary differential equations. Borrowing ideas from functional analysis and kernel methods, we build complete, normed, infinite-dimensional spaces of machines, and discuss how to find optimal architectures and parameters -- within those spaces -- to solve a given computational problem. In our numerical experiments, these kernel-inspired networks can outperform classical neural networks when the training dataset is small.	翻訳日:2022-11-13 01:34:16 公開日:2020-07-08
# ビジョンに基づく新型コロナウイルスのソーシャルディスタンシングと臨界密度検出システム A Vision-based Social Distancing and Critical Density Detection System for COVID-19 ( http://arxiv.org/abs/2007.03578v2 ) ライセンス: Link先を確認	Dongfang Yang, Ekim Yurtsever, Vishnu Renganathan, Keith A. Redmill, \"Umit \"Ozg\"uner	(参考訳) 新型コロナウイルス(covid-19)の感染拡大に対する効果的な対策として,ソーシャルディスタンシングが実証されている。しかし、個人は必要な6フィート(約2メートル)の距離を自分と周囲と追跡することができない。個人間の距離を検知し、警告できるアクティブ監視システムは、致命的な病気の拡散を遅らせることができる。さらに、関心領域(roi)における社会的密度の測定と流入の変調は、社会的距離違反の発生機会を減少させる。一方、データの記録や、対策に従わない個人へのラベル付けは、自由社会における個人の権利を侵害する。ここでは,人工知能(AI)に基づくリアルタイムなソーシャルディスタンシング検出・警告システムを提案する。(1)システムはデータの記録・キャッシュを決して行なわないこと,(2)警告は個人を標的にすべきでないこと,(3)人間の監督者は検出・警告ループにいないこと,(4)コードがオープンソースで公開されていること,である。本稿では,この背景に対して,単眼カメラと深層学習に基づくリアルタイム物体検出器を用いてソーシャルディスタンスを測定することを提案する。違反が検出されると、ソーシャルディスタンシング対策に違反した個人を標的にすることなく、侵入的でない音声視覚警告信号を出力する。また、社会密度が臨界値を超えた場合、システムは制御信号を送信してroiへの流入を変調する。提案手法を実世界のデータセットにまたがってテストし,その汎用性と性能を測定した。提案手法はデプロイ可能であり,コードをオープンソースにしています。 Social distancing has been proven as an effective measure against the spread of the infectious COronaVIrus Disease 2019 (COVID-19). However, individuals are not used to tracking the required 6-feet (2-meters) distance between themselves and their surroundings. An active surveillance system capable of detecting distances between individuals and warning them can slow down the spread of the deadly disease. Furthermore, measuring social density in a region of interest (ROI) and modulating inflow can decrease social distancing violation occurrence chance. On the other hand, recording data and labeling individuals who do not follow the measures will breach individuals' rights in free-societies. Here we propose an Artificial Intelligence (AI) based real-time social distancing detection and warning system considering four important ethical factors: (1) the system should never record/cache data, (2) the warnings should not target the individuals, (3) no human supervisor should be in the detection/warning loop, and (4) the code should be open-source and accessible to the public. Against this backdrop, we propose using a monocular camera and deep learning-based real-time object detectors to measure social distancing. If a violation is detected, a non-intrusive audio-visual warning signal is emitted without targeting the individual who breached the social distancing measure. Also, if the social density is over a critical value, the system sends a control signal to modulate inflow into the ROI. We tested the proposed method across real-world datasets to measure its generality and performance. The proposed method is ready for deployment, and our code is open-sourced.	翻訳日:2022-11-12 20:28:47 公開日:2020-07-08
# 胸部CT検査における肺Opacificationの分画 Segmentation of Pulmonary Opacification in Chest CT Scans of COVID-19 Patients ( http://arxiv.org/abs/2007.03643v2 ) ライセンス: Link先を確認	Keegan Lensink, Issam Laradji, Marco Law, Paolo Emilio Barbano, Savvas Nicolaou, William Parker, Eldad Haber	(参考訳) 重症急性呼吸症候群コロナウイルス2(SARS-CoV-2)は急速に世界的なパンデミックに広まっている。患者の肺に不透明な症状として現れる肺炎は、このウイルスに関連する最も一般的な発表であり、これらの変化が患者の死亡や死亡とどのように関係しているかに注目が集まっている。本研究は,胸部CT(CT)スキャンにおける肺閉塞のパターン分類のためのオープンソースモデルであり,感染のさまざまなステージと重症度に相関している。世界中の医療センターから663人の胸部CTスキャンを収集し、肺の6つの異なるパターンを分割する25,000個のスライスでピクセルワイドセグメンテーションラベルを作成しました。データセットでトレーニングされた複数のセグメンテーションモデルに対して、オープンソース実装と事前トレーニングされた重み付けを提供します。最適モデルでは,テストセットで0.76オパシティ・インターセクション・オーバー・ユニオンスコアを達成し,ドメイン適応を成功させ,専門家の1.7%以内のオパシティの容積を予測する。さらに,このタスクに固有のオブザーバ間変動の解析を行い,適切な確率的アプローチのための手法を提案する。 The Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) has rapidly spread into a global pandemic. A form of pneumonia, presenting as opacities with in a patient's lungs, is the most common presentation associated with this virus, and great attention has gone into how these changes relate to patient morbidity and mortality. In this work we provide open source models for the segmentation of patterns of pulmonary opacification on chest Computed Tomography (CT) scans which have been correlated with various stages and severities of infection. We have collected 663 chest CT scans of COVID-19 patients from healthcare centers around the world, and created pixel wise segmentation labels for nearly 25,000 slices that segment 6 different patterns of pulmonary opacification. We provide open source implementations and pre-trained weights for multiple segmentation models trained on our dataset. Our best model achieves an opacity Intersection-Over-Union score of 0.76 on our test set, demonstrates successful domain adaptation, and predicts the volume of opacification within 1.7\% of expert radiologists. Additionally, we present an analysis of the inter-observer variability inherent to this task, and propose methods for appropriate probabilistic approaches.	翻訳日:2022-11-12 20:27:45 公開日:2020-07-08
# HKR for Handwriting Kazakh & Russian Database (英語) HKR For Handwritten Kazakh & Russian Database ( http://arxiv.org/abs/2007.03579v2 ) ライセンス: Link先を確認	Daniyar Nurseitov, Kairat Bostanbekov, Daniyar Kurmankhojayev, Anel Alimova, Abdelrahman Abdallah	(参考訳) 本稿では,オフライン手書き文字認識のための新しいロシア語とカザフ語データベース(ロシア語の95%,カザフ語/文の5%)を提案する。データベースとともにいくつかの前処理と分割手順が開発されている。データベースはキリル文字で書かれており、同じ33文字を共有している。これらの文字に加えて、カザフ文字には9つの特別な文字が含まれている。このデータセットはフォームのコレクションです。データセット内のすべてのフォームのソースは \latex によって生成され、その後手書きの人物によって埋められた。データベースは1400以上のフォームで構成されている。約63,000の文があり、約200の異なる作家によって作られた715699以上の記号がある。ディープラーニングと機械学習を使うことで、手書き認識タスクの分野で研究者に役立てることができる。 In this paper, we present a new Russian and Kazakh database (with about 95% of Russian and 5% of Kazakh words/sentences respectively) for offline handwriting recognition. A few pre-processing and segmentation procedures have been developed together with the database. The database is written in Cyrillic and shares the same 33 characters. Besides these characters, the Kazakh alphabet also contains 9 additional specific characters. This dataset is a collection of forms. The sources of all the forms in the datasets were generated by \LaTeX which subsequently was filled out by persons with their handwriting. The database consists of more than 1400 filled forms. There are approximately 63000 sentences, more than 715699 symbols produced by approximately 200 different writers. It can serve researchers in the field of handwriting recognition tasks by using deep and machine learning.	翻訳日:2022-11-12 20:09:38 公開日:2020-07-08
# 限定ラベルデータから群衆の数え方を学ぶ Learning to Count in the Crowd from Limited Labeled Data ( http://arxiv.org/abs/2007.03195v2 ) ライセンス: Link先を確認	Vishwanath A. Sindagi, Rajeev Yasarla, Deepak Sam Babu, R. Venkatesh Babu, Vishal M. Patel	(参考訳) 最近の群衆カウントアプローチは優れたパフォーマンスを達成しました。しかし、それらは本質的に完全に教師付きパラダイムに基づいており、多数の注釈付きサンプルを必要とする。アノテーションの取得は費用がかかり、労働集約的なプロセスです。本研究では,ラベルなしデータの膨大なプールを活用しながら,限定されたサンプル数から群衆を数えることを学ぶことで,アノテーションの努力を減らすことに注力する。具体的には,非ラベルデータに対する疑似基底真理の推定を含むガウス過程に基づく反復学習機構を提案する。提案手法は上海技術, UCF-QNRF, WorldExpo, UCSDなどのいくつかのデータセットに対して, 半教師付きデータ設定で有効であることが示されている。さらに,提案手法は,実世界のデータセット(合成から現実への転送)をより一般化しながら,合成データセットから学習のネットワークを数えられるように活用できることを実証する。 Recent crowd counting approaches have achieved excellent performance. However, they are essentially based on fully supervised paradigm and require large number of annotated samples. Obtaining annotations is an expensive and labour-intensive process. In this work, we focus on reducing the annotation efforts by learning to count in the crowd from limited number of labeled samples while leveraging a large pool of unlabeled data. Specifically, we propose a Gaussian Process-based iterative learning mechanism that involves estimation of pseudo-ground truth for the unlabeled data, which is then used as supervision for training the network. The proposed method is shown to be effective under the reduced data (semi-supervised) settings for several datasets like ShanghaiTech, UCF-QNRF, WorldExpo, UCSD, etc. Furthermore, we demonstrate that the proposed method can be leveraged to enable the network in learning to count from synthetic dataset while being able to generalize better to real-world datasets (synthetic-to-real transfer).	翻訳日:2022-11-12 19:50:41 公開日:2020-07-08
# アンサンブル分類器は、中間欠性障害の検出と診断に十分強力か? Are Ensemble Classifiers Powerful Enough for the Detection and Diagnosis of Intermediate-Severity Faults? ( http://arxiv.org/abs/2007.03167v2 ) ライセンス: Link先を確認	Baihong Jin, Yingshui Tan, Yuxin Chen, Kameshwar Poolla, Alberto Sangiovanni Vincentelli	(参考訳) 中間重度(IS)断層は、重度断層よりも軽度な症状を示し、正常な手術条件に類似しているため、検出と診断が困難である。トレーニングデータにおけるIS故障例の欠如は、機械学習(ML)技術に基づくフォールト検出・診断(FDD)手法に重大なリスクをもたらす可能性がある。エンサンブルモデルはMLに広く適用されており、アウト・オブ・ディストリビューション(OOD)データを検出するための有望な方法と考えられている。これらのモデルに共通する落とし穴を、2つの実世界のデータセット上のいくつかの一般的なアンサンブルモデルを用いて広範な実験によって同定する。次に,is障害の検出と診断のための,より効率的なアンサンブルモデルの設計方法について述べる。 Intermediate-Severity (IS) faults present milder symptoms compared to severe faults, and are more difficult to detect and diagnose due to their close resemblance to normal operating conditions. The lack of IS fault examples in the training data can pose severe risks to Fault Detection and Diagnosis (FDD) methods that are built upon Machine Learning (ML) techniques, because these faults can be easily mistaken as normal operating conditions. Ensemble models are widely applied in ML and are considered promising methods for detecting out-of-distribution (OOD) data. We identify common pitfalls in these models through extensive experiments with several popular ensemble models on two real-world datasets. Then, we discuss how to design more effective ensemble models for detecting and diagnosing IS faults.	翻訳日:2022-11-12 18:49:58 公開日:2020-07-08
# ニューラルプログラムにおける強一般化と効率性 Strong Generalization and Efficiency in Neural Programs ( http://arxiv.org/abs/2007.03629v2 ) ライセンス: Link先を確認	Yujia Li, Felix Gimeno, Pushmeet Kohli, Oriol Vinyals	(参考訳) 本研究では,神経プログラム誘導の枠組みを一般化した効率的なアルゴリズムを学習する問題について検討する。神経モデルの入力/出力インターフェースを慎重に設計し、模倣することで、任意の入力サイズに対して正しい結果を生成するモデルを学び、強力な一般化を達成することができる。さらに,強化学習を用いることで,プログラム効率の指標を最適化し,模倣に用いる教師を上回る新しいアルゴリズムを探索する。これにより、ソート、順序付きリストの検索、NP完全 0/1 knapsack 問題など、さまざまな問題においてカスタム記述されたソリューションよりも優れた結果が得られる。ハイライトとして、私たちの学習したモデルは、テストした任意の入力データサイズで完全にソートを実行でき、o(n log n)$の複雑さで、手入力されたアルゴリズムよりも優れています。 We study the problem of learning efficient algorithms that strongly generalize in the framework of neural program induction. By carefully designing the input / output interfaces of the neural model and through imitation, we are able to learn models that produce correct results for arbitrary input sizes, achieving strong generalization. Moreover, by using reinforcement learning, we optimize for program efficiency metrics, and discover new algorithms that surpass the teacher used in imitation. With this, our approach can learn to outperform custom-written solutions for a variety of problems, as we tested it on sorting, searching in ordered lists and the NP-complete 0/1 knapsack problem, which sets a notable milestone in the field of Neural Program Induction. As highlights, our learned model can perform sorting perfectly on any input data size we tested on, with $O(n log n)$ complexity, whilst outperforming hand-coded algorithms, including quick sort, in number of operations even for list sizes far beyond those seen during training.	翻訳日:2022-11-12 18:14:23 公開日:2020-07-08
# マルチタスク学習によるX線後方散乱と前方散乱の同時推定 Simultaneous Estimation of X-ray Back-Scatter and Forward-Scatter using Multi-Task Learning ( http://arxiv.org/abs/2007.04018v1 ) ライセンス: Link先を確認	Philipp Roser, Xia Zhong, Annette Birkhold, Alexander Preuhs, Christopher Syben, Elisabeth Hoppe, Norbert Strobel, Markus Kowarschik, Rebecca Fahrig, Andreas Maier	(参考訳) 散乱放射は2つの方法でx線画像誘導の手順に影響を与える主要な関心事である。まず、後方散乱は複雑な介入の際の患者(皮膚)の服用に大きく寄与する。第2に、前方散乱放射は投影画像のコントラストを減少させ、3次元再構成においてアーティファクトを導入する。従来の抗散乱格子はX線を遮断することで画質を向上するが、検出器の抗散乱格子による追加の減衰は高用量で補償する必要がある。これはまた、患者を世話するスタッフに影響する服用量も増加させる。皮膚線量定量化には、予め決定されたスカラーバック散乱因子または線形点拡散関数を患者表面点への一次ケルマ前方射影に適用することにより、バック散乱が考慮される。しかし, 患者形状が異なるため, 従来の方法の一般化は限られている。そこで本研究では,従来の手法と学習に基づく手法を組み合わせることで,検出器に到達した前方散乱と患者皮膚線量に影響を及ぼす後方散乱を同時に推定する手法を提案する。前方散乱を知ればX線投射を補正できるが,後方散乱成分の良好な推定は皮膚線量評価の改善に役立つ。後方散乱と後方散乱を同時に推定するために,X線物理とニューラルネットワークを組み合わせることで,後方散乱と前方散乱の同時推定を行うマルチタスク手法を提案する。理論的には, どちらの場合においても高精度な散乱推定が可能となる。さらに,マルチタスクフレームワークの研究方向と学習に基づく散乱推定を一般論として示す。 Scattered radiation is a major concern impacting X-ray image-guided procedures in two ways. First, back-scatter significantly contributes to patient (skin) dose during complicated interventions. Second, forward-scattered radiation reduces contrast in projection images and introduces artifacts in 3-D reconstructions. While conventionally employed anti-scatter grids improve image quality by blocking X-rays, the additional attenuation due to the anti-scatter grid at the detector needs to be compensated for by a higher patient entrance dose. This also increases the room dose affecting the staff caring for the patient. For skin dose quantification, back-scatter is usually accounted for by applying pre-determined scalar back-scatter factors or linear point spread functions to a primary kerma forward projection onto a patient surface point. However, as patients come in different shapes, the generalization of conventional methods is limited. Here, we propose a novel approach combining conventional techniques with learning-based methods to simultaneously estimate the forward-scatter reaching the detector as well as the back-scatter affecting the patient skin dose. Knowing the forward-scatter, we can correct X-ray projections, while a good estimate of the back-scatter component facilitates an improved skin dose assessment. To simultaneously estimate forward-scatter as well as back-scatter, we propose a multi-task approach for joint back- and forward-scatter estimation by combining X-ray physics with neural networks. We show that, in theory, highly accurate scatter estimation in both cases is possible. In addition, we identify research directions for our multi-task framework and learning-based scatter estimation in general.	翻訳日:2022-11-12 13:54:36 公開日:2020-07-08
# スマートウォッチによるエピデミック露光通知: 近接性に基づくプライバシー保護アプローチ Epidemic Exposure Notification with Smartwatch: A Proximity-Based Privacy-Preserving Approach ( http://arxiv.org/abs/2007.04399v1 ) ライセンス: Link先を確認	Pai Chet Ng, Petros Spachos, Stefano Gregori, Konstantinos Plataniotis	(参考訳) パンデミック後の世界のビジネスは、従業員や顧客の健康と福祉を守る革新的な方法を模索している。無線技術は、接触追跡の補助として重要な役割を担い、局所感染の発生を早急に防ぎ、さらなる拡散を防ぐ。本研究は,ビジネス,ホスピタリティ,レクリエーション施設における安全な物理的距離を助長するスマートウォッチに基づくウェアラブル近接露光通知ソリューションを提案する。近距離ベースのプライバシー保存型コンタクトトレース(p$^3$ct)は、信頼性の高い近接センシングにbluetooth low energy(ble)技術と、アイデンティティを保存するアンビエントシグネチャプロトコルを利用しています。近接センシングは、受信信号強度(rss)を利用してユーザのインタラクションを検出し、感染症と診断された患者に対して、それらを低リスクまたは高リスクに分類する。より正確には、ユーザーは、患者との距離と時間の観点から、彼らの相互作用に基づいて、自分の露出を通知される。我々のプライバシー保護プロトコルは、ユーザーの身元が匿名化されることを保証するために、周囲の署名を使用する。提案手法の有効性を広範囲な実験により実証する。 Businesses planning for the post-pandemic world are looking for innovative ways to protect the health and welfare of their employees and customers. Wireless technologies can play a key role in assisting contact tracing to quickly halt a local infection outbreak and prevent further spread. In this work, we present a wearable proximity and exposure notification solution based on a smartwatch that also promotes safe physical distancing in business, hospitality, or recreational facilities. Our proximity-based privacy-preserving contact tracing (P$^3$CT) leverages the Bluetooth Low Energy (BLE) technology for reliable proximity sensing, and an ambient signature protocol for preserving identity. Proximity sensing exploits the received signal strength (RSS) to detect the user's interaction and thus classifying them into low- or high-risk with respect to a patient diagnosed with an infectious disease. More precisely, a user is notified of their exposure based on their interactions, in terms of distance and time, with a patient. Our privacy-preserving protocol uses the ambient signatures to ensure that users' identities be anonymized. We demonstrate the feasibility of our proposed solution through extensive experimentation.	翻訳日:2022-11-12 13:54:15 公開日:2020-07-08
# 可変 Lebesgue 空間におけるニューラルネットワークの近似 Approximation with Neural Networks in Variable Lebesgue Spaces ( http://arxiv.org/abs/2007.04166v1 ) ライセンス: Link先を確認	\'Angela Capel and Jes\'us Oc\'ariz	(参考訳) 本稿では,可変ルベーグ空間におけるニューラルネットワークの普遍近似特性について述べる。空間の指数関数が有界となると、任意の所望の精度で全ての関数を浅いニューラルネットワークで近似できることを示す。この結果は、指数関数の有界性に依存する近似の普遍性を決定する。さらに、指数が非有界であるときは常に、近似できる関数の部分空間に対するいくつかの特徴づけ結果が得られる。 This paper concerns the universal approximation property with neural networks in variable Lebesgue spaces. We show that, whenever the exponent function of the space is bounded, every function can be approximated with shallow neural networks with any desired accuracy. This result subsequently leads to determine the universality of the approximation depending on the boundedness of the exponent function. Furthermore, whenever the exponent is unbounded, we obtain some characterization results for the subspace of functions that can be approximated.	翻訳日:2022-11-12 13:52:03 公開日:2020-07-08
# オープンワールド機械学習の批判的評価 A Critical Evaluation of Open-World Machine Learning ( http://arxiv.org/abs/2007.04391v1 ) ライセンス: Link先を確認	Liwei Song, Vikash Sehwag, Arjun Nitin Bhagoji, Prateek Mittal	(参考訳) オープンワールド機械学習(ML)は、オフ・オブ・ディストリビューション(OOD)検出器とイン・ディストリビューションデータに基づいてトレーニングされたクローズドワールドモデルを組み合わせる。オープンワールドMLシステムに関するこれまでの研究は、多様でおそらくは敵対的な条件下での信頼性のテストに失敗する。そこで本稿では,システムコンポーネントの変更に対して,最先端のオープンワールドMLシステムがいかにレジリエンスであるかを理解する。 6つのOOD検出器で評価した結果,OOD検出性能には分布内データ,モデルアーキテクチャ,OODデータの選択が強く影響し,70 %以上の偽陽性率を誘導することがわかった。さらに、22の意図しない汚職や敵対的な摂動を伴うOOD入力が、オープンワールドMLシステムに最大100\%の偽陽性率で使用できないことを示す。オープンワールドMLのレジリエンスを高めるため、ロバスト分類器とOOD検出技術を組み合わせて、OOD検出とロバストネスの新たなトレードオフを明らかにする。 Open-world machine learning (ML) combines closed-world models trained on in-distribution data with out-of-distribution (OOD) detectors, which aim to detect and reject OOD inputs. Previous works on open-world ML systems usually fail to test their reliability under diverse, and possibly adversarial conditions. Therefore, in this paper, we seek to understand how resilient are state-of-the-art open-world ML systems to changes in system components? With our evaluation across 6 OOD detectors, we find that the choice of in-distribution data, model architecture and OOD data have a strong impact on OOD detection performance, inducing false positive rates in excess of $70\%$. We further show that OOD inputs with 22 unintentional corruptions or adversarial perturbations render open-world ML systems unusable with false positive rates of up to $100\%$. To increase the resilience of open-world ML, we combine robust classifiers with OOD detection techniques and uncover a new trade-off between OOD detection and robustness.	翻訳日:2022-11-12 13:51:56 公開日:2020-07-08
# ガイドスターフリー画像誘導波面整形 Guidestar-free image-guided wavefront-shaping ( http://arxiv.org/abs/2007.03956v1 ) ライセンス: Link先を確認	Tomer Yeminy and Ori Katz	(参考訳) 散乱媒体による光学イメージングは多くの応用において基本的な課題である。近年, 生体組織を画像化したり, 角を見回したりといった重要なブレークスルーが, 波面形状のアプローチによって得られている。しかし、これらは波面補正、コヒーレント照明の制御、そして多くの場合、形状の焦点をラスター走査するために埋め込まれたガイドスターを必要とする。スペックル相関を利用し、ガイドスターやウェーブフロント制御を回避できる別の新しい計算手法は、メモリ効果相関範囲に含まれる小さな2次元オブジェクトに限られる。そこで本研究では,非侵襲的でガイドスターフリーで広視野の非コヒーレントイメージングを高散乱層を通じて実現し,照明制御を行なわない,画像誘導波面形成という新しい概念を提案する。最も重要なのは、画像品質のメトリクスを盲目的に最適化することで、メモリ効果範囲よりも大きいオブジェクトでもウェーブフロント補正が見つかることです。高散乱層とマルチコアファイバによる拡張物体のイメージングを実演し、顕微鏡から内視鏡まで様々な応用において非侵襲的なイメージングの道を開く。 Optical imaging through scattering media is a fundamental challenge in many applications. Recently, substantial breakthroughs such as imaging through biological tissues and looking around corners have been obtained by the use of wavefront-shaping approaches. However, these require an implanted guide-star for determining the wavefront correction, controlled coherent illumination, and most often raster scanning of the shaped focus. Alternative novel computational approaches that exploit speckle correlations, avoid guide-stars and wavefront control but are limited to small two-dimensional objects contained within the memory-effect correlations range. Here, we present a new concept, image-guided wavefront-shaping, allowing non-invasive, guidestar-free, widefield, incoherent imaging through highly scattering layers, without illumination control. Most importantly, the wavefront-correction is found even for objects that are larger than the memory-effect range, by blindly optimizing image-quality metrics. We demonstrate imaging of extended objects through highly-scattering layers and multi-core fibers, paving the way for non-invasive imaging in various applications, from microscopy to endoscopy.	翻訳日:2022-11-12 13:50:20 公開日:2020-07-08
# 3dポイントクラウドデータ圧縮技術の最近の動向と3d圧縮領域における直接処理の課題 A Quick Review on Recent Trends in 3D Point Cloud Data Compression Techniques and the Challenges of Direct Processing in 3D Compressed Domain ( http://arxiv.org/abs/2007.05038v1 ) ライセンス: Link先を確認	Mohammed Javed and MD Meraz and Pavan Chakraborty	(参考訳) オブジェクトの検出、追跡、セグメンテーションのための3Dポイントクラウドデータの自動処理は、AIとデータサイエンスの分野における最新のトレンド研究である。しかし、(LiDARを使った)3Dポイントクラウドの形で作成されているデータの量は極めて大きく、研究者は現在、生成した大量のデータを処理するために、新しいデータ圧縮アルゴリズムの発明を進めている。しかし、一方の圧縮は、空間要求を克服する利点があるが、他方の処理は、余分な計算資源を注入する減圧のために高価になる。したがって、圧縮されたデータを直接操作・分析できるアルゴリズムを、圧縮と再圧縮の段階を伴わずに開発する(何度も要求されるように、圧縮されたデータを操作または解析する必要がある)。この研究分野はCompressed Domain Processingと呼ばれている。本稿では,LiDARが生成する3Dポイントクラウドデータ圧縮領域における最近の最先端技術開発について概説するとともに,3Dポイントクラウドデータの圧縮ドメイン処理の今後の課題を取り上げる。 Automatic processing of 3D Point Cloud data for object detection, tracking and segmentation is the latest trending research in the field of AI and Data Science, which is specifically aimed at solving different challenges of autonomous driving cars and getting real time performance. However, the amount of data that is being produced in the form of 3D point cloud (with LiDAR) is very huge, due to which the researchers are now on the way inventing new data compression algorithms to handle huge volumes of data thus generated. However, compression on one hand has an advantage in overcoming space requirements, but on the other hand, its processing gets expensive due to the decompression, which indents additional computing resources. Therefore, it would be novel to think of developing algorithms that can operate/analyse directly with the compressed data without involving the stages of decompression and recompression (required as many times, the compressed data needs to be operated or analyzed). This research field is termed as Compressed Domain Processing. In this paper, we will quickly review few of the recent state-of-the-art developments in the area of LiDAR generated 3D point cloud data compression, and highlight the future challenges of compressed domain processing of 3D point cloud data.	翻訳日:2022-11-12 13:49:40 公開日:2020-07-08
# VEC-OFによるオープンソフトウェア定義モビリティエコシステムの実現 Enable an Open Software Defined Mobility Ecosystem through VEC-OF ( http://arxiv.org/abs/2007.03879v1 ) ライセンス: Link先を確認	Sanchu Han, Yong He, Yin Ding	(参考訳) OEMs and new entrants can take the Mobility as a Service market (MaaS) as the entry point, upgrade its E/E (Electric and Electronic) architecture to be C/C (Computing and Communication) architecture, build one open software defined and data driven software platform for its production and service model, use efficient and collaborative ways of vehicles, roads, cloud and network to continuously improve core technologies such as autonomous driving, provide MaaS operators with an affordable and agile platform. 本稿では,VEC-OF(Vehicle-Edge-Cloud Open Framework)という新しいフレームワークを提案する。Vehicle-Edge-Cloud Open Frameworkは,より安全で,より効率的で,接続性が高く,信頼性の高いMaaSを実現するための,新たなデータおよびAI中心の自動車ソフトウェアフレームワークである。 OEMs and new entrants can take the Mobility as a Service market (MaaS) as the entry point, upgrade its E/E (Electric and Electronic) architecture to be C/C (Computing and Communication) architecture, build one open software defined and data driven software platform for its production and service model, use efficient and collaborative ways of vehicles, roads, cloud and network to continuously improve core technologies such as autonomous driving, provide MaaS operators with an affordable and agile platform. In this paper we present one new framework, VEC-OF (Vehicle-Edge-Cloud Open Framework), which is a new data and AI centric vehicle software framework enabling a much safer, more efficient, connected and trusted MaaS through cooperative vehicle, infrastructure and cloud capabilities and intelligence	翻訳日:2022-11-12 13:49:18 公開日:2020-07-08
# インテリジェント車両のためのカメラとクラウドデジタルツイン情報のセンサ融合 Sensor Fusion of Camera and Cloud Digital Twin Information for Intelligent Vehicles ( http://arxiv.org/abs/2007.04350v1 ) ライセンス: Link先を確認	Yongkang Liu, Ziran Wang, Kyungtae Han, Zhenyu Shou, Prashant Tiwari, and John H. L. Hansen	(参考訳) インテリジェントな車両と高度運転支援システム(ADAS)の急速な発展に伴い、交通システムには様々なレベルの人間ドライバーの関与が関与している。この状況下では、ドライバーの視覚誘導は潜在的なリスクを防ぐために不可欠である。本稿では,視覚誘導システムの開発を進めるために,カメラ画像とクラウドからのデジタルツイン知識を統合した新しいセンサ融合手法を提案する。目標車両バウンディングボックスは、エゴ車両上を走行する物体検出器の結果と雲からの位置情報とを組み合わせることで描画・マッチングされる。ユニオン(iou)しきい値の0.7の交点で79.2%の精度で一致し、追加の特徴点として深度画像が得られた。ゲームエンジンベースのシミュレーション結果は、視覚誘導システムがクラウドデジタルツインシステムと大幅に協調して運転安全性を向上させることも明らかにしている。 With the rapid development of intelligent vehicles and Advanced Driving Assistance Systems (ADAS), a mixed level of human driver engagements is involved in the transportation system. Visual guidance for drivers is essential under this situation to prevent potential risks. To advance the development of visual guidance systems, we introduce a novel sensor fusion methodology, integrating camera image and Digital Twin knowledge from the cloud. Target vehicle bounding box is drawn and matched by combining results of object detector running on ego vehicle and position information from the cloud. The best matching result, with a 79.2% accuracy under 0.7 Intersection over Union (IoU) threshold, is obtained with depth image served as an additional feature source. Game engine-based simulation results also reveal that the visual guidance system could improve driving safety significantly cooperate with the cloud Digital Twin system.	翻訳日:2022-11-12 13:42:55 公開日:2020-07-08
# 微小知覚超解像への旅 Journey Towards Tiny Perceptual Super-Resolution ( http://arxiv.org/abs/2007.04356v1 ) ライセンス: Link先を確認	Royson Lee, {\L}ukasz Dudziak, Mohamed Abdelfattah, Stylianos I. Venieris, Hyeji Kim, Hongkai Wen, Nicholas D. Lane	(参考訳) シングルイメージ知覚超解像(SR)における最近の研究は、深層畳み込みネットワークによる現実的なテクスチャの生成において、前例のない性能を示した。しかし、これらの畳み込みモデルはあまりに大きく高価であり、エンドデバイスへの効果的な展開を妨げる。本研究では,nasとgenerative adversarial networks(gans)を統合したニューラルネットワーク探索(nas)手法を提案する。具体的には,生成器と判別器の両方のアーキテクチャを逐次的に探索し,sr最適化判別器を探索するユニークな課題と重要な観察を強調し,既存の判別器アーキテクチャと比較する。我々の小さな知覚的SR(TPSR)モデルは、フル参照知覚計量(LPIPS)と歪み計量(PSNR)の両方でSRGANとEnhanceNetを上回り、それぞれ26.4$\times$よりメモリ効率が良く、33.6$\times$より計算効率が良い。 Recent works in single-image perceptual super-resolution (SR) have demonstrated unprecedented performance in generating realistic textures by means of deep convolutional networks. However, these convolutional models are excessively large and expensive, hindering their effective deployment to end devices. In this work, we propose a neural architecture search (NAS) approach that integrates NAS and generative adversarial networks (GANs) with recent advances in perceptual SR and pushes the efficiency of small perceptual SR models to facilitate on-device execution. Specifically, we search over the architectures of both the generator and the discriminator sequentially, highlighting the unique challenges and key observations of searching for an SR-optimized discriminator and comparing them with existing discriminator architectures in the literature. Our tiny perceptual SR (TPSR) models outperform SRGAN and EnhanceNet on both full-reference perceptual metric (LPIPS) and distortion metric (PSNR) while being up to 26.4$\times$ more memory efficient and 33.6$\times$ more compute efficient respectively.	翻訳日:2022-11-12 13:42:39 公開日:2020-07-08
# アート素材としての言葉:連続GANによる絵画の生成 Words as Art Materials: Generating Paintings with Sequential GANs ( http://arxiv.org/abs/2007.04383v1 ) ライセンス: Link先を確認	Azmi Can \"Ozgen, Haz{\i}m Kemal Ekenel	(参考訳) ジェネレーティブ・アドバイサル・ネットワークを用いた画像へのテキスト記述の変換が研究分野として人気を博している。近年,視覚的に魅力的な画像が生成されている。これらの研究に触発されて,大分散データセット上での芸術的画像の生成について検討した。このデータセットには、形状、色、内容など、バリエーションのあるイメージが含まれている。これらの画像のバリエーションは、芸術的本質の重要な要素である独創性をもたらす。私たちの研究の大きな特徴は、文章ではなく、画像記述としてキーワードを使うことです。ネットワークアーキテクチャとして,逐次生成適応型ネットワークモデルを提案する。この逐次モデルの最初の段階はワードベクトルを処理してベース画像を生成するが、次の段階は単語ベクトルを使わずに高解像度の芸術的なイメージを作成することに焦点を当てる。我々はganの不安定性に対処するため,wasserstein損失,スペクトル正規化,ミニバッチ識別などの混合手法を提案した。最終的には、さまざまなスタイルの絵画画像を生成することができました。 fr\'echetインセプション距離スコアを用いて評価を行い,186名を対象にユーザ調査を行った。 Converting text descriptions into images using Generative Adversarial Networks has become a popular research area. Visually appealing images have been generated successfully in recent years. Inspired by these studies, we investigated the generation of artistic images on a large variance dataset. This dataset includes images with variations, for example, in shape, color, and content. These variations in images provide originality which is an important factor for artistic essence. One major characteristic of our work is that we used keywords as image descriptions, instead of sentences. As the network architecture, we proposed a sequential Generative Adversarial Network model. The first stage of this sequential model processes the word vectors and creates a base image whereas the next stages focus on creating high-resolution artistic-style images without working on word vectors. To deal with the unstable nature of GANs, we proposed a mixture of techniques like Wasserstein loss, spectral normalization, and minibatch discrimination. Ultimately, we were able to generate painting images, which have a variety of styles. We evaluated our results by using the Fr\'echet Inception Distance score and conducted a user study with 186 participants.	翻訳日:2022-11-12 13:42:17 公開日:2020-07-08
# ロボット手術における楽器セグメンテーションのための効率的な構造探索 Searching for Efficient Architecture for Instrument Segmentation in Robotic Surgery ( http://arxiv.org/abs/2007.04449v1 ) ライセンス: Link先を確認	Daniil Pakhomov, Nassir Navab	(参考訳) 手術器具のセグメンテーションはロボット支援手術において重要な問題であり、完全な楽器ポーズ推定への重要なステップであり、手術中の拡張現実オーバーレイのマスキングに直接使用される。ほとんどのアプリケーションは、高精度外科画像の正確なリアルタイムセグメンテーションに依存している。従来の研究は主に高精度なセグメンテーションマスクを提供する手法に焦点を当てていたが、その大半は計算コストのためリアルタイムアプリケーションでは使用できない。本研究では,高解像度画像のリアルタイム推論を行うために,軽量かつ高効率な深部残差アーキテクチャを設計する。検出した重み付きディープ残差ネットワークの精度の低下と、追加の計算負荷の増大を避けるため、ネットワークの残差単位に対する拡張率の差分探索を行う。我々は、EndoVis 2017 Robotic Instrumentsデータセットで発見されたアーキテクチャを検証し、私たちのモデルは、高解像度画像上で125FPSの速度で、スピードと精度のトレードオフの観点から最先端のモデルであることを検証した。 Segmentation of surgical instruments is an important problem in robot-assisted surgery: it is a crucial step towards full instrument pose estimation and is directly used for masking of augmented reality overlays during surgical procedures. Most applications rely on accurate real-time segmentation of high-resolution surgical images. While previous research focused primarily on methods that deliver high accuracy segmentation masks, majority of them can not be used for real-time applications due to their computational cost. In this work, we design a light-weight and highly-efficient deep residual architecture which is tuned to perform real-time inference of high-resolution images. To account for reduced accuracy of the discovered light-weight deep residual network and avoid adding any additional computational burden, we perform a differentiable search over dilation rates for residual units of our network. We test our discovered architecture on the EndoVis 2017 Robotic Instruments dataset and verify that our model is the state-of-the-art in terms of speed and accuracy tradeoff with a speed of up to 125 FPS on high resolution images.	翻訳日:2022-11-12 13:42:00 公開日:2020-07-08
# 深部fiducial inference Deep Fiducial Inference ( http://arxiv.org/abs/2007.04285v1 ) ライセンス: Link先を確認	Gang Li, Jan Hannig	(参考訳) 2000年代中頃から、fiducial inferenceの現代的な修正への関心が復活した。これまで、一般化された分裂分布を抽出する主な計算ツールはマルコフ連鎖モンテカルロ(mcmc)である。本稿では,複雑な状況で使用可能な一般化されたfiducial distributionの計算方法を提案する。特に,非正規化fiducial density (mcmc) の難易度を克服するために,fiducial autoencoder (fae) を設計した。装着されたオートエンコーダを用いて未知パラメータの一般化されたフィデューシャルサンプルを生成する。精度を向上させるために,デコーダに差し込むと観測データを十分に再現できないサンプルを除去し,近似フィデューシャル計算(AFC)アルゴリズムを適用した。数値実験により,faeをベースとする逆解の有効性と,afc補正faeソリューションの精度が向上した。 Since the mid-2000s, there has been a resurrection of interest in modern modifications of fiducial inference. To date, the main computational tool to extract a generalized fiducial distribution is Markov chain Monte Carlo (MCMC). We propose an alternative way of computing a generalized fiducial distribution that could be used in complex situations. In particular, to overcome the difficulty when the unnormalized fiducial density (needed for MCMC), we design a fiducial autoencoder (FAE). The fitted autoencoder is used to generate generalized fiducial samples of the unknown parameters. To increase accuracy, we then apply an approximate fiducial computation (AFC) algorithm, by rejecting samples that when plugged into a decoder do not replicate the observed data well enough. Our numerical experiments show the effectiveness of our FAE-based inverse solution and the excellent coverage performance of the AFC corrected FAE solution.	翻訳日:2022-11-12 13:41:28 公開日:2020-07-08
# ディープラーニングモデルの分散トレーニング--分類学的観点から Distributed Training of Deep Learning Models: A Taxonomic Perspective ( http://arxiv.org/abs/2007.03970v1 ) ライセンス: Link先を確認	Matthias Langer, Zhen He, Wenny Rahayu, and Yanbo Xue	(参考訳) distributed deep learning systems (ddls)は、クラスタの分散リソースを利用してディープニューラルネットワークモデルをトレーニングする。 DDLSの開発者は、選択した環境で特定のワークロードを効率的に処理するための多くの決定をする必要がある。 GPUベースのディープラーニングの出現、データセットとディープニューラルネットワークモデルの絶え間なく増加するサイズ、クラスタ環境に存在する帯域制限と組み合わせることで、DDLSの開発者は、高品質モデルを迅速にトレーニングするために革新的である必要がある。 DDLSを並べて比較するのは、広範な機能リストとアーキテクチャ上の違いのため難しい。我々は、ディープラーニングモデルのトレーニングに関連する一般的な特性を分析し、そのようなワークロードをクラスタに分散して協調的なモデルトレーニングを実現することで、独立したマシンのクラスタ内でディープニューラルネットワークをトレーニングする際の基本的な原則に光を当てることを目指している。そこで,現代DDLSが使用する様々な技術の概要を述べ,その教育過程への影響と意義について論じる。 DDLSを概念化し、比較するために、異なるテクニックをカテゴリに分類し、分散ディープラーニングシステムの分類を確立させる。 Distributed deep learning systems (DDLS) train deep neural network models by utilizing the distributed resources of a cluster. Developers of DDLS are required to make many decisions to process their particular workloads in their chosen environment efficiently. The advent of GPU-based deep learning, the ever-increasing size of datasets and deep neural network models, in combination with the bandwidth constraints that exist in cluster environments require developers of DDLS to be innovative in order to train high quality models quickly. Comparing DDLS side-by-side is difficult due to their extensive feature lists and architectural deviations. We aim to shine some light on the fundamental principles that are at work when training deep neural networks in a cluster of independent machines by analyzing the general properties associated with training deep learning models and how such workloads can be distributed in a cluster to achieve collaborative model training. Thereby we provide an overview of the different techniques that are used by contemporary DDLS and discuss their influence and implications on the training process. To conceptualize and compare DDLS, we group different techniques into categories, thus establishing a taxonomy of distributed deep learning systems.	翻訳日:2022-11-12 13:40:42 公開日:2020-07-08
# 深層学習から見たスプリット製造の攻撃 Attacking Split Manufacturing from a Deep Learning Perspective ( http://arxiv.org/abs/2007.03989v1 ) ライセンス: Link先を確認	Haocheng Li, Satwik Patnaik, Abhrajit Sengupta, Haoyu Yang, Johann Knechtel, Bei Yu, Evangeline F. Y. Young, Ozgur Sinanoglu	(参考訳) フォワード・オブ・ライン(FEOL)とバック・エンド・オブ・ライン(BEOL)の部品を異なるファウンドリーに委譲する集積回路分割製造の概念は、知的財産(IP)の過剰生産、海賊行為、あるいはFEOL施設の敵によるハードウェア・トロイの木馬の侵入を防ぐことである。本研究では,様々なレイアウトレベルの配置とルーティングヒントをベクトルおよび画像に基づく特徴として定式化することにより,スプリット製造のセキュリティ約束に挑戦する。我々は,不足しているbeol接続を高精度に推定可能な,高度な深層ニューラルネットワークを構築した。 ISCAS-85ベンチマークと同様のネットワークフロー攻撃[1]と比較して、M1で分割すると1.21倍、M3で1%以下の動作時間で分割すると1.12倍の精度が得られる。 The notion of integrated circuit split manufacturing which delegates the front-end-of-line (FEOL) and back-end-of-line (BEOL) parts to different foundries, is to prevent overproduction, piracy of the intellectual property (IP), or targeted insertion of hardware Trojans by adversaries in the FEOL facility. In this work, we challenge the security promise of split manufacturing by formulating various layout-level placement and routing hints as vector- and image-based features. We construct a sophisticated deep neural network which can infer the missing BEOL connections with high accuracy. Compared with the publicly available network-flow attack [1], for the same set of ISCAS-85 benchmarks, we achieve 1.21X accuracy when splitting on M1 and 1.12X accuracy when splitting on M3 with less than 1% running time.	翻訳日:2022-11-12 13:40:23 公開日:2020-07-08
# topsisを用いた内部サプライチェーン最適化の戦略評価:コイル巻線機メーカーにおける実証 Strategic Evaluation in Optimizing the Internal Supply Chain Using TOPSIS: Evidence In A Coil Winding Machine Manufacturer ( http://arxiv.org/abs/2007.10121v1 ) ライセンス: Link先を確認	Dilip U Shenoy, Vinay Sharma, Shiva HC Prasad	(参考訳) 製造会社の大半は、付加価値による商品の収益性の向上の観点から、サプライチェーンの最適化を目指している。本研究は、特定の基準に関して、内部サプライチェーンの性能に影響する要因を批判的に検討する。したがって、これらの要因を製造業におけるサプライチェーンのパフォーマンスの重要な側面にランク付けする。企業の意思決定者から回答を集めるために使用される、事前定義された一連の質問に対する半構造化インタビュー。 TOPSISと呼ばれる多基準意思決定ツールを使用して、応答を評価し、要因をランク付けする。この結果から,サプライヤ関係と在庫計画が,製品提供のオンタイム化,生産柔軟性,コスト削減,追加コストに正の影響を与えていることが示唆された。本研究は,客観的および主観的評価手法を用いてプロセスパラメータの同定と最適化を支援する。本研究は,マネージャの思考過程が内部サプライチェーンを最適化する上での複合的影響を抽出したものである。 Most of the manufacturing firm aims to optimize their Supply Chain in terms of improved profitability of its products through value Addition. This study takes a critical look into the factors that affect the Performance of internal supply chain with respect to specific criteria. Accordingly, ranking these factors to get the critical dimensions of supply chain performance in the manufacturing industry. A semi-structured interview with the pre-defined set of questions used to collect the responses from decision makers of the firm. Multi criteria decision-making tool called TOPSIS is used to evaluate the responses and rank the factors. The results of this indicate that supplier relationship and inventory planning were most principal factors positively influencing on-time delivery of the product, production flexibility, cost savings, additional costs. This study helps to identify and optimize the process parameters using objective and subjective evaluation approach. The combined influence of the thought process of the manager to optimize the internal supply chain is extracted in this work.	翻訳日:2022-11-12 13:34:34 公開日:2020-07-08
# 連立言語同定を用いたエンドツーエンドのバイリンガルASRシステム Streaming End-to-End Bilingual ASR Systems with Joint Language Identification ( http://arxiv.org/abs/2007.03900v1 ) ライセンス: Link先を確認	Surabhi Punjabi, Harish Arsikere, Zeynab Raeesy, Chander Chandak, Nikhil Bhave, Ankish Bansal, Markus M\"uller, Sergio Murillo, Ariya Rastrow, Sri Garimella, Roland Maas, Mat Hans, Athanasios Mouchtaris, Siegfried Kunzmann	(参考訳) 多言語ASR技術は、モデルトレーニングとデプロイを単純化するが、その精度は実行時の言語情報の可用性に依存することが知られている。言語のアイデンティティは、現実のシナリオでは事前には知られていないため、最小限のレイテンシでオンザフライで推測する必要がある。さらに音声アクティブなスマートアシスタントシステムでは、ASR出力の下流処理には言語アイデンティティも必要である。本稿では,recurrent neural network transducer(rnn-t)アーキテクチャを用いて,asrと言語識別(lid)の両方を実行するストリーミング,エンドツーエンドのバイリンガルシステムを提案する。入力側では、事前訓練された音響専用LID分類器からの埋め込みを用いて、RNN-Tのトレーニングと推論を誘導する一方、出力側では、言語ターゲットをASRターゲットと共同でモデル化する。提案手法は、アメリカ合衆国で話される英語とスペイン語、インドで話される英語とヒンディーの2つの言語対に適用できる。 ASR-LIDアーキテクチャは英語とスペイン語では単言語ASRと音響のみのLIDの精度に一致している。英語ヒンディー語のより難易度の高い(言語内コードのスイッチングによる)ケースでは、英語のasrとlidメトリクスが劣化している。全体として、ユーザが動的に言語間で切り替えるシナリオでは、提案アーキテクチャは複数のモノリンガル ASR モデルと LID 分類器を並列に実行するよりも、有望な単純化を提供する。 Multilingual ASR technology simplifies model training and deployment, but its accuracy is known to depend on the availability of language information at runtime. Since language identity is seldom known beforehand in real-world scenarios, it must be inferred on-the-fly with minimum latency. Furthermore, in voice-activated smart assistant systems, language identity is also required for downstream processing of ASR output. In this paper, we introduce streaming, end-to-end, bilingual systems that perform both ASR and language identification (LID) using the recurrent neural network transducer (RNN-T) architecture. On the input side, embeddings from pretrained acoustic-only LID classifiers are used to guide RNN-T training and inference, while on the output side, language targets are jointly modeled with ASR targets. The proposed method is applied to two language pairs: English-Spanish as spoken in the United States, and English-Hindi as spoken in India. Experiments show that for English-Spanish, the bilingual joint ASR-LID architecture matches monolingual ASR and acoustic-only LID accuracies. For the more challenging (owing to within-utterance code switching) case of English-Hindi, English ASR and LID metrics show degradation. Overall, in scenarios where users switch dynamically between languages, the proposed architecture offers a promising simplification over running multiple monolingual ASR models and an LID classifier in parallel.	翻訳日:2022-11-12 13:34:18 公開日:2020-07-08
# 金属アーティファクト低減のための低次元多様体制約ディスタングルメントネットワーク Low-dimensional Manifold Constrained Disentanglement Network for Metal Artifact Reduction ( http://arxiv.org/abs/2007.03882v1 ) ライセンス: Link先を確認	Chuang Niu, Wenxiang Cong, Fenglei Fan, Hongming Shan, Mengzhou Li, Jimin Liang, Ge Wang	(参考訳) 深層ニューラルネットワークに基づく手法はctメタルアーティファクトリダクション(mar)に有望な結果をもたらしており、そのほとんどはトレーニングに多くの合成ペアイメージを使用している。 CT画像中の金属人工物は, 臨床像を正確に反映しない可能性があるため, 欠損した臨床像を直接使用し, 臨床データセットに有望な結果をもたらすアーティファクト・ディアンタングメント・ネットワーク(ADN)が提案された。しかし, 十分な監督がなければ, 対向的損失のみに基づいて, アーチファクト影響CT画像の構造的詳細を復元することは困難である。これらの問題を克服するために,パッチ多様体が一般に低次元であることのイメージ特性を活かした低次元多様体(LDM)制約分散ネットワーク(DN)を提案する。具体的には,LDM-DN学習アルゴリズムを設計し,低次元のパッチ多様体上の画像に制約を加えながら,相乗的ネットワーク損失関数を最適化する。さらに、ペアデータとペアデータの両方から学習し、臨床データセットのmar性能をさらに向上させるために、効率的なハイブリッド最適化スキームを提案する。大規模な実験により、LDM-DNアプローチはペアおよび/またはペアなしの学習環境におけるMAR性能を一貫して改善し、合成および臨床データセット上で競合する手法より優れていることが示された。 Deep neural network based methods have achieved promising results for CT metal artifact reduction (MAR), most of which use many synthesized paired images for training. As synthesized metal artifacts in CT images may not accurately reflect the clinical counterparts, an artifact disentanglement network (ADN) was proposed with unpaired clinical images directly, producing promising results on clinical datasets. However, without sufficient supervision, it is difficult for ADN to recover structural details of artifact-affected CT images based on adversarial losses only. To overcome these problems, here we propose a low-dimensional manifold (LDM) constrained disentanglement network (DN), leveraging the image characteristics that the patch manifold is generally low-dimensional. Specifically, we design an LDM-DN learning algorithm to empower the disentanglement network through optimizing the synergistic network loss functions while constraining the recovered images to be on a low-dimensional patch manifold. Moreover, learning from both paired and unpaired data, an efficient hybrid optimization scheme is proposed to further improve the MAR performance on clinical datasets. Extensive experiments demonstrate that the proposed LDM-DN approach can consistently improve the MAR performance in paired and/or unpaired learning settings, outperforming competing methods on synthesized and clinical datasets.	翻訳日:2022-11-12 13:33:54 公開日:2020-07-08
# AUSN: ニューラルネットワークの非一様分布を適応的に重畳した近似量子化 AUSN: Approximately Uniform Quantization by Adaptively Superimposing Non-uniform Distribution for Deep Neural Networks ( http://arxiv.org/abs/2007.03903v1 ) ライセンス: Link先を確認	Liu Fangxin, Zhao Wenbo, Wang Yanzhi, Dai Changzhi, Jiang Li	(参考訳) エッジアプリケーションのDNN推論を単純化するためには量子化が不可欠である。しかし、既存の均一な量子化法と非一様量子化法は、表現範囲と解像度との固有の矛盾を示し、その結果、未使用ビット幅または重要な精度低下をもたらす。さらに、これらの手法には3つの欠点がある。一量子化誤差の原因を詳細に分析するための量的指標がないこと。二画像分類タスクのCNNに基づく限定的な焦点三ビット幅を下げることにより、実際のハードウェア及びエネルギー消費の無意識を低下させる。本稿では,まず,クリッピング誤差と丸め誤差の2つの定量的指標を定義し,量子化誤差分布を解析した。境界と丸みを帯びたエラーは、層、モデル、タスクによって大きく異なる。そこで本研究では,重みと活性化を定量化する新しい量子化法を提案する。鍵となる考え方は、複数の非一様量子化値、すなわち AUSN を適応的に重ね合わせることでユニフォーム量子化を近似することである。 AUSNは、ビット幅を極端まで効率的に活用するデコーダフリーコーディングスキームと、ハードウェア設計の余分な努力なしに異なるDNN層、モデル、タスクに符号化スキームを適応できる重ね合わせ量子化アルゴリズムと、よく知られたビット幅オーバーフローと再量子化の問題を排除するラウンドリングスキームから構成されている。様々なタスクのDNNモデルの理論的解析と精度評価は、AUSNの有効性と一般化を示している。 FPGAの合成〜(Appendix B参照)の結果は、エネルギー消費の削減に2\times$、ハードウェアリソースの削減に2\times$4\times$である。 Quantization is essential to simplify DNN inference in edge applications. Existing uniform and non-uniform quantization methods, however, exhibit an inherent conflict between the representing range and representing resolution, and thereby result in either underutilized bit-width or significant accuracy drop. Moreover, these methods encounter three drawbacks: i) the absence of a quantitative metric for in-depth analysis of the source of the quantization errors; ii) the limited focus on the image classification tasks based on CNNs; iii) the unawareness of the real hardware and energy consumption reduced by lowering the bit-width. In this paper, we first define two quantitative metrics, i.e., the Clipping Error and rounding error, to analyze the quantization error distribution. We observe that the boundary- and rounding- errors vary significantly across layers, models and tasks. Consequently, we propose a novel quantization method to quantize the weight and activation. The key idea is to Approximate the Uniform quantization by Adaptively Superposing multiple Non-uniform quantized values, namely AUSN. AUSN is consist of a decoder-free coding scheme that efficiently exploits the bit-width to its extreme, a superposition quantization algorithm that can adapt the coding scheme to different DNN layers, models and tasks without extra hardware design effort, and a rounding scheme that can eliminate the well-known bit-width overflow and re-quantization issues. Theoretical analysis~(see Appendix A) and accuracy evaluation on various DNN models of different tasks show the effectiveness and generalization of AUSN. The synthesis~(see Appendix B) results on FPGA show $2\times$ reduction of the energy consumption, and $2\times$ to $4\times$ reduction of the hardware resource.	翻訳日:2022-11-12 13:33:26 公開日:2020-07-08
# 簡単な学習モデルによる勝利:オランダ・グロニンゲンの地震検出 Winning with Simple Learning Models: Detecting Earthquakes in Groningen, the Netherlands ( http://arxiv.org/abs/2007.03924v1 ) ライセンス: Link先を確認	Umair bin Waheed, Ahmed Shaheen, Mike Fehler, Ben Fulcher	(参考訳) ディープラーニングは、科学全体の長年の研究課題に対処するための破壊的なツールとして急速に発展しつつある。その成功にもかかわらず、近年のディープラーニングの過剰使用傾向は、多くの機械学習実践者に関係している。近年、地震学者は低等級地震の検出における深層学習アルゴリズムの有効性を実証している。本稿では,地震イベント検出の問題を再考するが,特徴抽出を伴うロジスティック回帰モデルを用いる。我々は,学際的時系列解析手法から収集した時系列操作の膨大なデータベースから,特徴を適切に識別する。トレーニング可能なパラメータを5つしか持たない単純な学習モデルを用いて,グロニンゲンガス田からの低マグニチュード誘発地震を複数検出する。よりシンプルなモデルの利点として、選択された機能は、データセットに存在するノイズやイベントクラスを理解するのに役立ちます。シンプルなモデルは、メンテナンス、デバッグ、理解、トレーニングが容易であるため、よりシンプルな選択肢を慎重に検討することなく、ディープラーニングを使用するのは危険である、という結論に達しています。 Deep learning is fast emerging as a potential disruptive tool to tackle longstanding research problems across the sciences. Notwithstanding its success across disciplines, the recent trend of the overuse of deep learning is concerning to many machine learning practitioners. Recently, seismologists have also demonstrated the efficacy of deep learning algorithms in detecting low magnitude earthquakes. Here, we revisit the problem of seismic event detection but using a logistic regression model with feature extraction. We select well-discriminating features from a huge database of time-series operations collected from interdisciplinary time-series analysis methods. Using a simple learning model with only five trainable parameters, we detect several low-magnitude induced earthquakes from the Groningen gas field that are not present in the catalog. We note that the added advantage of simpler models is that the selected features add to our understanding of the noise and event classes present in the dataset. Since simpler models are easy to maintain, debug, understand, and train, through this study we underscore that it might be a dangerous pursuit to use deep learning without carefully weighing simpler alternatives.	翻訳日:2022-11-12 13:32:55 公開日:2020-07-08
# 画像分割のためのデュアルcnnの設計と学習 Designing and Training of A Dual CNN for Image Denoising ( http://arxiv.org/abs/2007.03951v1 ) ライセンス: Link先を確認	Chunwei Tian, Yong Xu, Wangmeng Zuo, Bo Du, Chia-Wen Lin and David Zhang	(参考訳) 画像復調のための深層畳み込みニューラルネットワーク(CNN)は近年研究の関心を集めている。しかし、平易なネットワークでは、実際のノイズ画像のような複雑なタスクの詳細な詳細を復元できない。本稿ではDudeNet(Dual Denoising Network)を提案し,クリーンな画像の復元を行った。具体的には,機能抽出ブロック,拡張ブロック,圧縮ブロック,再構築ブロックの4つのモジュールで構成される。スパースマカニズムを持つ特徴抽出ブロックは、2つのサブネットワークを介してグローバルおよびローカルな特徴を抽出する。拡張ブロックはグローバルとローカルの機能を収集して融合し、後者のネットワークに補完的な情報を提供する。圧縮ブロックは抽出した情報を洗練し、ネットワークを圧縮する。最後に、復元ブロックを利用して、音像を再構成する。 1) パース機構を持つデュアルネットワークは、デノイザの一般化能力を高めるために補完的な特徴を抽出することができる。 2)大域的特徴と局所的特徴を融合させることで,複雑な雑音画像の細部を復元することができる。 (3)デノイザの複雑さを低減するために小型フィルタを用いる。大規模な実験は、DudeNetが既存の最先端のデノナイジング手法よりも優れていることを示す。 Deep convolutional neural networks (CNNs) for image denoising have recently attracted increasing research interest. However, plain networks cannot recover fine details for a complex task, such as real noisy images. In this paper, we propsoed a Dual denoising Network (DudeNet) to recover a clean image. Specifically, DudeNet consists of four modules: a feature extraction block, an enhancement block, a compression block, and a reconstruction block. The feature extraction block with a sparse machanism extracts global and local features via two sub-networks. The enhancement block gathers and fuses the global and local features to provide complementary information for the latter network. The compression block refines the extracted information and compresses the network. Finally, the reconstruction block is utilized to reconstruct a denoised image. The DudeNet has the following advantages: (1) The dual networks with a parse mechanism can extract complementary features to enhance the generalized ability of denoiser. (2) Fusing global and local features can extract salient features to recover fine details for complex noisy images. (3) A Small-size filter is used to reduce the complexity of denoiser. Extensive experiments demonstrate the superiority of DudeNet over existing current state-of-the-art denoising methods.	翻訳日:2022-11-12 13:32:40 公開日:2020-07-08
# 医用画像評価における予測不確かさの定量化と活用 Quantifying and Leveraging Predictive Uncertainty for Medical Image Assessment ( http://arxiv.org/abs/2007.04258v1 ) ライセンス: Link先を確認	Florin C. Ghesu, Bogdan Georgescu, Awais Mansoor, Youngjin Yoo, Eli Gibson, R.S. Vishwanath, Abishek Balachandran, James M. Balter, Yue Cao, Ramandeep Singh, Subba R. Digumarthy, Mannudeep K. Kalra, Sasa Grbic, Dorin Comaniciu	(参考訳) 医療画像の解釈は難しい課題であり、しばしばアーティファクト、オクルージョン、限られたコントラストなどの存在によって複雑になる。最も注目すべきは胸部x線撮影の症例で、異常の検出と分類において高いレート間変動がある。これは主に、病気の出現に関するデータや主観的な定義の不確定な証拠によるものである。もう一つの例は、2次元超音波画像に基づく解剖学的ビューの分類である。しばしば、フレームでキャプチャされた解剖学的文脈は、基礎となる解剖学を認識するには不十分である。これらの問題の現在の機械学習ソリューションは、通常、限られた情報と高いラベルノイズに対応する基盤となるモデルの能力に依存する確率的予測を提供することに制限されている。しかし実際には、これは不明瞭なデータに対する一般化が不十分な過信システムにつながる。そこで本研究では,分類の確率的推定だけでなく,予測結果におけるシステムの信頼度を捉える明示的不確実性尺度を学習するシステムを提案する。本手法は, 放射線検査, 超音波, 磁気共鳴画像などの異なる放射線検査から得られた医用画像のあいまいさを考慮に入れる上で重要である。本実験では, 予測不確実性に基づく試料の拒絶は, 胸部X線写真における異常の分類において, 25%未満の拒絶率で, ROC-AUCを8%から0.91に向上させることができることを示した。さらに,不確実性に基づくブートストラップをトレーニングデータのフィルタに適用することで,ロバスト性や精度が大幅に向上することを示す。 The interpretation of medical images is a challenging task, often complicated by the presence of artifacts, occlusions, limited contrast and more. Most notable is the case of chest radiography, where there is a high inter-rater variability in the detection and classification of abnormalities. This is largely due to inconclusive evidence in the data or subjective definitions of disease appearance. An additional example is the classification of anatomical views based on 2D Ultrasound images. Often, the anatomical context captured in a frame is not sufficient to recognize the underlying anatomy. Current machine learning solutions for these problems are typically limited to providing probabilistic predictions, relying on the capacity of underlying models to adapt to limited information and the high degree of label noise. In practice, however, this leads to overconfident systems with poor generalization on unseen data. To account for this, we propose a system that learns not only the probabilistic estimate for classification, but also an explicit uncertainty measure which captures the confidence of the system in the predicted output. We argue that this approach is essential to account for the inherent ambiguity characteristic of medical images from different radiologic exams including computed radiography, ultrasonography and magnetic resonance imaging. In our experiments we demonstrate that sample rejection based on the predicted uncertainty can significantly improve the ROC-AUC for various tasks, e.g., by 8% to 0.91 with an expected rejection rate of under 25% for the classification of different abnormalities in chest radiographs. In addition, we show that using uncertainty-driven bootstrapping to filter the training data, one can achieve a significant increase in robustness and accuracy.	翻訳日:2022-11-12 13:32:02 公開日:2020-07-08
# シミュレーションfermi/lat望遠鏡画像を用いたニューラルネットワーク点抽出に関する研究 A study of Neural networks point source extraction on simulated Fermi/LAT Telescope images ( http://arxiv.org/abs/2007.04295v1 ) ライセンス: Link先を確認	Mariia Drozdova, Anton Broilovskiy, Andrey Ustyuzhanin, Denys Malyshev	(参考訳) GeV帯の天体物理画像は、背景と前景の天体物理拡散放射の強い寄与と、現代の宇宙観測装置の比較的広い範囲の拡散関数により、分析が困難である。あるケースでは、画像上の点源を見つけることさえ、非自明な作業になる。本稿では,フェルミ大都市圏望遠鏡の画像を模倣した人工データセット上で学習した畳み込みニューラルネットワーク(cnn)を用いた点源抽出法を提案する。これらの画像は1から10GeVのエネルギーをカバーする10×10度の原数光子マップである。我々は、精度が15%向上し、少なくとも4つの精度改善の要因で推論時間を削減できる様々なcnnアーキテクチャを比較した。 Astrophysical images in the GeV band are challenging to analyze due to the strong contribution of the background and foreground astrophysical diffuse emission and relatively broad point spread function of modern space-based instruments. In certain cases, even finding of point sources on the image becomes a non-trivial task. We present a method for point sources extraction using a convolution neural network (CNN) trained on our own artificial data set which imitates images from the Fermi Large Area Telescope. These images are raw count photon maps of 10x10 degrees covering energies from 1 to 10 GeV. We compare different CNN architectures that demonstrate accuracy increase by ~15% and reduces the inference time by at least the factor of 4 accuracy improvement with respect to a similar state of the art models.	翻訳日:2022-11-12 13:31:35 公開日:2020-07-08
# KIT MOMA: モバイルデバイスのデータセット KIT MOMA: A Mobile Machines Dataset ( http://arxiv.org/abs/2007.04198v1 ) ライセンス: Link先を確認	Yusheng Xiang, Hongzhe Wang, Tianqing Su, Ruoyu Li, Christine Brach, Samuel S. Mao, Marcus Geimer	(参考訳) 通常、クローズドな場所で作業するモバイルマシンは、自動運転技術を利用する可能性が高い。しかし、開発と革新の活発な発展は、主に旅客車の分野で起きている。対照的に、自動運転やモバイルマシンでの作業についても多くの研究があるが、SOTAソリューションに関するコンセンサスはまだ達成されていない。解決すべき最も緊急な問題は、公然と挑戦的なビジュアルデータセットがないことであり、異なる研究の結果と同等である、と私たちは信じています。この問題に対処するため、我々は8種類のモバイルマシンを含むKIT MOMAデータセットを公開し、モバイル構築マシンを検出するためのSOTAアルゴリズムを評価するベンチマークとして使用することができる。収集された画像のビューは、すべての興味深いマシンがクローズドな場所で作業している場合、地上の固定カメラがより適していると考えるので、モバイルマシンの外部にある。 KIT MOMAのイメージのほとんどは実際のシーンにあるが、一部の画像は建設機械メーカーの公式ウェブサイトにある。また、このデータセット上でのYOLO v3の性能を評価し、SOTAコンピュータビジョンアルゴリズムは、特定の作業場での移動体検出に優れた性能を示していることを示す。データセットとともにトレーニングされた重量もアップロードします。これは建設機械業界のエンジニアが直接使用することができます。データセット、トレーニングされたウェイト、アップデートはGithubで確認できます。さらに、デモは私たちのyoutubeで見ることができる。 Mobile machines typically working in a closed site, have a high potential to utilize autonomous driving technology. However, vigorously thriving development and innovation are happening mostly in the area of passenger cars. In contrast, although there are also many research pieces about autonomous driving or working in mobile machines, a consensus about the SOTA solution is still not achieved. We believe that the most urgent problem that should be solved is the absence of a public and challenging visual dataset, which makes the results from different researches comparable. To address the problem, we publish the KIT MOMA dataset, including eight classes of commonly used mobile machines, which can be used as a benchmark to evaluate the SOTA algorithms to detect mobile construction machines. The view of the gathered images is outside of the mobile machines since we believe fixed cameras on the ground are more suitable if all the interesting machines are working in a closed site. Most of the images in KIT MOMA are in a real scene, whereas some of the images are from the official website of top construction machine companies. Also, we have evaluated the performance of YOLO v3 on our dataset, indicating that the SOTA computer vision algorithms already show an excellent performance for detecting the mobile machines in a specific working site. Together with the dataset, we also upload the trained weights, which can be directly used by engineers from the construction machine industry. The dataset, trained weights, and updates can be found on our Github. Moreover, the demo can be found on our Youtube.	翻訳日:2022-11-12 13:24:56 公開日:2020-07-08
# 動的および反復スパンニング森林を用いたスーパーピクセルセグメンテーション Superpixel Segmentation using Dynamic and Iterative Spanning Forest ( http://arxiv.org/abs/2007.04257v1 ) ライセンス: Link先を確認	F.C. Belem and S.J.F. Guimaraes and A.X. Falcao	(参考訳) 画像オブジェクトを構成する部分として、スーパーピクセルはいくつかの高レベルの操作を改善することができる。しかし、画像分割法は、スーパーピクセル数を減らすために精度を著しく損なう可能性がある。我々は,isf(cycleed spanning forest)フレームワークに基づくソリューションを調査した。本稿では、以下のステップをベースとしたDynamic ISF(DISF)について述べる。 (a)所望のスーパーピクセル数よりもかなり多くのピクセルを持つ画像グラフとシードセットから始まります。 b) 種子は互いに競合し, それぞれの種子は最も近縁なピクセルを征服し, 画像分割(スパンニング林)と接続されたスーパーピクセルが形成される。ステップ (c)disFは,超画素解析に基づいて関連値を種子に割り当て,最も無関係な種子を除去する。ステップ (b) (c)は所望のスーパーピクセル数に到達するまで繰り返される。 DISFは、リージョンマージアルゴリズムと比較して、イテレーション毎に関連するエッジを再構築する機会がある。他のシードベースのスーパーピクセル法と比較すると、DIFは関連する種子を見つける傾向にある。さらに,isfフレームワークにおいて,より効率的なスーパーピクセルデライン化のために動的アークウェイト推定を導入し,異なるオブジェクト特性を持つ3つのデータセット上でのすべての結果を示す。 As constituent parts of image objects, superpixels can improve several higher-level operations. However, image segmentation methods might have their accuracy seriously compromised for reduced numbers of superpixels. We have investigated a solution based on the Iterative Spanning Forest (ISF) framework. In this work, we present Dynamic ISF (DISF) -- a method based on the following steps. (a) It starts from an image graph and a seed set with considerably more pixels than the desired number of superpixels. (b) The seeds compete among themselves, and each seed conquers its most closely connected pixels, resulting in an image partition (spanning forest) with connected superpixels. In step (c), DISF assigns relevance values to seeds based on superpixel analysis and removes the most irrelevant ones. Steps (b) and (c) are repeated until the desired number of superpixels is reached. DISF has the chance to reconstruct relevant edges after each iteration, when compared to region merging algorithms. As compared to other seed-based superpixel methods, DISF is more likely to find relevant seeds. It also introduces dynamic arc-weight estimation in the ISF framework for more effective superpixel delineation, and we demonstrate all results on three datasets with distinct object properties.	翻訳日:2022-11-12 13:24:03 公開日:2020-07-08
# 廃棄物オブジェクト分割へのマルチレベルアプローチ A Multi-Level Approach to Waste Object Segmentation ( http://arxiv.org/abs/2007.04259v1 ) ライセンス: Link先を確認	Tao Wang and Yuanzheng Cai and Lingyu Liang and Dongyi Ye	(参考訳) 本稿では,カラー画像から無駄な物体を局所化する問題と,そのような物体とロボットが相互作用する上で重要な知覚成分である奥行き画像について論じる。具体的には,複数の空間的粒度レベルでの強度と深度情報を統合する。まず、シーンレベルのディープネットワークが初期粗いセグメンテーションを生成し、そこでいくつかの潜在的なオブジェクト領域を選択してズームインして細かなセグメンテーションを行う。上記のステップの結果はさらに密結合された条件付きランダムフィールドに統合され、ピクセルレベルの精度で外観、深さ、空間親和性を尊重する。さらに, この領域における今後の研究を促進するために, 新たにRGBD 廃棄物オブジェクト分割データセット MJU-Waste を作成した。本手法の有効性は,MJU-WasteとTrash Annotation in Context (TACO)データセットの両方で検証される。 We address the problem of localizing waste objects from a color image and an optional depth image, which is a key perception component for robotic interaction with such objects. Specifically, our method integrates the intensity and depth information at multiple levels of spatial granularity. Firstly, a scene-level deep network produces an initial coarse segmentation, based on which we select a few potential object regions to zoom in and perform fine segmentation. The results of the above steps are further integrated into a densely connected conditional random field that learns to respect the appearance, depth, and spatial affinities with pixel-level accuracy. In addition, we create a new RGBD waste object segmentation dataset, MJU-Waste, that is made public to facilitate future research in this area. The efficacy of our method is validated on both MJU-Waste and the Trash Annotation in Context (TACO) dataset.	翻訳日:2022-11-12 13:23:42 公開日:2020-07-08
# UU-Net:ビジュアル監視ビデオフットプリントの顔認識 The UU-Net: Reversible Face De-Identification for Visual Surveillance Video Footage ( http://arxiv.org/abs/2007.04316v1 ) ライセンス: Link先を確認	Hugo Proen\c{c}a	(参考訳) そこで本稿では,ランドマークベース技術が使用できない低解像度映像データに対する可逆的顔識別法を提案する。我々のソリューションは、データ保護規則を満たし、最小限のプライバシー制約の下で公開可能な、現実的な非識別ストリームを生成することができる。特に、これらのストリームは、後に元のシーンを再構築するのに必要な全ての情報をカプセル化しており、犯罪捜査など、被写体の識別が最も重要なシナリオに有用である。 2つの主要コンポーネントを共同で最適化する学習プロセスについて述べる。 1) 原データを受信し、ID情報が写実的でシームレスな方法で代理される非識別ストリームを生成する公開モジュール 2) 法・セキュリティ当局のために設計された私的なモジュールで,公開ストリームを分析し,元のシーンを再構築し,現場のすべての被験者の実際のIDを開示する。提案手法はランドマークフリーであり、条件付き生成対向ネットワークを用いて、ポーズ、照明、背景情報、さらには表情を保存した合成顔を生成する。また、生データと非識別データの間で保存されるべきソフトな顔属性のセットを完全に制御できるようにし、このソリューションの応用範囲を広げる。実験は3種類の視覚監視データセット(BIODI, MARS, P-DESTRE)を用いて行った。ソースコードはhttps://github.com/hugomcp/uu-netで入手できる。 We propose a reversible face de-identification method for low resolution video data, where landmark-based techniques cannot be reliably used. Our solution is able to generate a photo realistic de-identified stream that meets the data protection regulations and can be publicly released under minimal privacy constraints. Notably, such stream encapsulates all the information required to later reconstruct the original scene, which is useful for scenarios, such as crime investigation, where the identification of the subjects is of most importance. We describe a learning process that jointly optimizes two main components: 1) a public module, that receives the raw data and generates the de-identified stream, where the ID information is surrogated in a photo-realistic and seamless way; and 2) a private module, designed for legal/security authorities, that analyses the public stream and reconstructs the original scene, disclosing the actual IDs of all the subjects in the scene. The proposed solution is landmarks-free and uses a conditional generative adversarial network to generate synthetic faces that preserve pose, lighting, background information and even facial expressions. Also, we enable full control over the set of soft facial attributes that should be preserved between the raw and de-identified data, which broads the range of applications for this solution. Our experiments were conducted in three different visual surveillance datasets (BIODI, MARS and P-DESTRE) and showed highly encouraging results. The source code is available at https://github.com/hugomcp/uu-net.	翻訳日:2022-11-12 13:23:13 公開日:2020-07-08
# Auto-MAP: DNNワークロードの分散実行計画を探索するDQNフレームワーク Auto-MAP: A DQN Framework for Exploring Distributed Execution Plans for DNN Workloads ( http://arxiv.org/abs/2007.04069v1 ) ライセンス: Link先を確認	Siyu Wang, Yi Rong, Shiqing Fan, Zhen Zheng, LanSong Diao, Guoping Long, Jun Yang, Xiaoyong Liu, Wei Lin	(参考訳) 過去10年間、ディープニューラルネットワークをトレーニングするための計算要件が増加してきた。現在のアプローチ(データ/モデル並列性、パイプライン並列性など)は、トレーニングタスクを複数のデバイスに並列化する。しかしながら、これらのアプローチは常に特定のディープラーニングフレームワークに依存しており、詳細な手作業による設計を必要とするため、異なるタイプのモデルのメンテナンスと共有が難しい。本稿では,DNNワークロードの分散実行計画を探索するフレームワークであるAuto-MAPを提案する。効率的な探索は、強化学習の大きな課題である。 DQNとタスク固有のプルーニング戦略を利用して、最適化された戦略を含む検索空間を効率的に探索する。評価の結果,Auto-MAPは複数のNLPおよび畳み込みモデルにおいて,より優れたスループットを実現しつつ,最適解を2時間以内に見つけることができることがわかった。 The last decade has witnessed growth in the computational requirements for training deep neural networks. Current approaches (e.g., data/model parallelism, pipeline parallelism) parallelize training tasks onto multiple devices. However, these approaches always rely on specific deep learning frameworks and requires elaborate manual design, which make it difficult to maintain and share between different type of models. In this paper, we propose Auto-MAP, a framework for exploring distributed execution plans for DNN workloads, which can automatically discovering fast parallelization strategies through reinforcement learning on IR level of deep learning models. Efficient exploration remains a major challenge for reinforcement learning. We leverage DQN with task-specific pruning strategies to help efficiently explore the search space including optimized strategies. Our evaluation shows that Auto-MAP can find the optimal solution in two hours, while achieving better throughput on several NLP and convolution models.	翻訳日:2022-11-12 13:22:32 公開日:2020-07-08
# 部分教師付きマルチオルガンセグメンテーションにおける限界損失と排除損失 Marginal loss and exclusion loss for partially supervised multi-organ segmentation ( http://arxiv.org/abs/2007.03868v1 ) ライセンス: Link先を確認	Gonglei Shi, Li Xiao, Yang Chen, S. Kevin Zhou	(参考訳) 医用画像に複数の臓器をアノテートすることは費用も時間もかかるため、ラベル付きの既存の複数臓器データセットはしばしばサンプルサイズが低く、主に部分的にラベル付けされている。本稿では,そのようなデータセットの結合から単一マルチ組織セグメンテーションネットワークを学習する方法を検討する。この目的のために,特にこのシナリオ用に設計された2種類の新しい損失関数を提案する。 (一)限界損失、及び (ii)排他的損失。部分ラベル付き画像の背景ラベルは、実際には、すべてのラベル付き臓器の「マージ」ラベルと(フルラベルの意味で)「true」背景であるので、この「マージ」背景ラベルの確率は限界確率であり、マージ前の関連する確率を合計する。この限界確率は、任意の既存の損失関数(例えば、クロスエントロピー損失、ディース損失など)に差し込み、限界損失を形成することができる。臓器が重複しないという事実を生かして,ラベル付き臓器間の相違性と非ラベル付き臓器の推定セグメンテーションを評価するために,除外損失を提案する。肝,脾臓,左右腎,膵の多臓器分節化における5つのベンチマークデータセットの結合実験により,新たに提案した損失関数を用いることで,余分な計算を導入することなく,最先端の手法に顕著な性能向上がもたらされることを示した。 Annotating multiple organs in medical images is both costly and time-consuming; therefore, existing multi-organ datasets with labels are often low in sample size and mostly partially labeled, that is, a dataset has a few organs labeled but not all organs. In this paper, we investigate how to learn a single multi-organ segmentation network from a union of such datasets. To this end, we propose two types of novel loss function, particularly designed for this scenario: (i) marginal loss and (ii) exclusion loss. Because the background label for a partially labeled image is, in fact, a `merged' label of all unlabelled organs and `true' background (in the sense of full labels), the probability of this `merged' background label is a marginal probability, summing the relevant probabilities before merging. This marginal probability can be plugged into any existing loss function (such as cross entropy loss, Dice loss, etc.) to form a marginal loss. Leveraging the fact that the organs are non-overlapping, we propose the exclusion loss to gauge the dissimilarity between labeled organs and the estimated segmentation of unlabelled organs. Experiments on a union of five benchmark datasets in multi-organ segmentation of liver, spleen, left and right kidneys, and pancreas demonstrate that using our newly proposed loss functions brings a conspicuous performance improvement for state-of-the-art methods without introducing any extra computation.	翻訳日:2022-11-12 13:16:11 公開日:2020-07-08
# fetoscopic mosaicking に対する深部胎盤血管セグメンテーション Deep Placental Vessel Segmentation for Fetoscopic Mosaicking ( http://arxiv.org/abs/2007.04349v1 ) ライセンス: Link先を確認	Sophia Bano, Francisco Vasconcelos, Luke M. Shepherd, Emmanuel Vander Poorten, Tom Vercauteren, Sebastien Ourselin, Anna L. David, Jan Deprest and Danail Stoyanov	(参考訳) ツイン・ツー・ツイン・トランスフュージョン症候群(TTTS)の治療中、臨床医は最初に胎盤血管の異常を同定し、両胎児の血流を調節するためにレーザーを照射する。手術は, 環境の移動性, 羊水中の視認性不良, 時折出血し, フェトスコープ視野の制限, 画像品質の制限などにより困難である。理想的には、解剖学的胎盤血管は自動的に同定され、分節化され、レーザーアブレーションのガイドとして拡張された血管地図を作成する。フェトスコープ映像における胎盤血管のセグメンテーションを行うために, u-netアーキテクチャを利用したソリューションを提案する。得られた容器確率マップは、直接強度に基づく手法を用いて連続した容器マップを登録することにより、モザイクアライメントのための十分な手がかりを提供する。 6種類の異なるin vivo fetoscopic video実験により、血管強度に基づく登録は画像強度に基づく登録法より優れ、質的および定量的比較においてより堅牢性を示すことが示された。さらに,400フレームまでのシーケンスにおいてもドリフトの蓄積を無視できるように削減し,地盤の欠落時にドリフト誤差を定量化するためのスキームを組み込んだ。本稿では,第1報 in vivo vessel segmentation と fetoscopic videos dataset をコントリビュートすることにより,胎盤胎盤血管のセグメンテーションと登録のベンチマークを提供する。 During fetoscopic laser photocoagulation, a treatment for twin-to-twin transfusion syndrome (TTTS), the clinician first identifies abnormal placental vascular connections and laser ablates them to regulate blood flow in both fetuses. The procedure is challenging due to the mobility of the environment, poor visibility in amniotic fluid, occasional bleeding, and limitations in the fetoscopic field-of-view and image quality. Ideally, anastomotic placental vessels would be automatically identified, segmented and registered to create expanded vessel maps to guide laser ablation, however, such methods have yet to be clinically adopted. We propose a solution utilising the U-Net architecture for performing placental vessel segmentation in fetoscopic videos. The obtained vessel probability maps provide sufficient cues for mosaicking alignment by registering consecutive vessel maps using the direct intensity-based technique. Experiments on 6 different in vivo fetoscopic videos demonstrate that the vessel intensity-based registration outperformed image intensity-based registration approaches showing better robustness in qualitative and quantitative comparison. We additionally reduce drift accumulation to negligible even for sequences with up to 400 frames and we incorporate a scheme for quantifying drift error in the absence of the ground-truth. Our paper provides a benchmark for fetoscopy placental vessel segmentation and registration by contributing the first in vivo vessel segmentation and fetoscopic videos dataset.	翻訳日:2022-11-12 13:08:01 公開日:2020-07-08
# 背景知識に基づく多次元語句認識アルゴリズムに関する研究 Research on multi-dimensional end-to-end phrase recognition algorithm based on background knowledge ( http://arxiv.org/abs/2007.03860v1 ) ライセンス: Link先を確認	Zheng Li, Gang Tu, Guang Liu, Zhi-Qiang Zhan, Yi-Jian Liu	(参考訳) 現在、教師付き学習に基づくエンド・ツー・エンドの深層手法は、エンティティ認識と依存性分析に使われている。この手法には2つの問題がある: 第一に、背景知識は導入できない;第二に、自然言語の多粒度とネスト特徴は認識できない。これらの問題を解決するために、フレーズウィンドウに基づくアノテーションルールを提案し、それに対応する多次元の語句認識アルゴリズムを設計する。このアノテーション規則は、文を7種類のネスト句に分割し、句間の依存関係を示す。このアルゴリズムは、背景知識を導入するだけでなく、文中のあらゆる種類のネスト句を認識するだけでなく、句間の依存関係を認識する。実験の結果, アノテーションルールは使い易く, あいまいさがないことがわかった。マッチングアルゴリズムは, 従来のエンドツーエンドアルゴリズムよりも, 文法の多粒度や多様性特性に一貫性がある。 CPWDデータセットの実験では、背景知識を導入することにより、エンドツーエンドの手法の精度を1ポイント以上向上する。この手法はCCL 2018の競技に応用され、中国のユーモア型認識において第一位を獲得した。 At present, the deep end-to-end method based on supervised learning is used in entity recognition and dependency analysis. There are two problems in this method: firstly, background knowledge cannot be introduced; secondly, multi granularity and nested features of natural language cannot be recognized. In order to solve these problems, the annotation rules based on phrase window are proposed, and the corresponding multi-dimensional end-to-end phrase recognition algorithm is designed. This annotation rule divides sentences into seven types of nested phrases, and indicates the dependency between phrases. The algorithm can not only introduce background knowledge, recognize all kinds of nested phrases in sentences, but also recognize the dependency between phrases. The experimental results show that the annotation rule is easy to use and has no ambiguity; the matching algorithm is more consistent with the multi granularity and diversity characteristics of syntax than the traditional end-to-end algorithm. The experiment on CPWD dataset, by introducing background knowledge, the new algorithm improves the accuracy of the end-to-end method by more than one point. The corresponding method was applied to the CCL 2018 competition and won the first place in the task of Chinese humor type recognition.	翻訳日:2022-11-12 13:06:47 公開日:2020-07-08
# カービン内会話エージェントにおける客室乗務員の聴覚的理解 Audio-Visual Understanding of Passenger Intents for In-Cabin Conversational Agents ( http://arxiv.org/abs/2007.03876v1 ) ライセンス: Link先を確認	Eda Okur, Shachi H Kumar, Saurav Sahay, Lama Nachman	(参考訳) 車両内状況における多モード対話理解機能の構築は、自律走行車(AV)インタラクションシステムにおける乗客の快適性を高めるために重要である。この目的のために、音声対話と車両ビジョンシステムから乗客の意図を理解することは、avのための文脈的および視覚的な会話エージェントを開発する上で重要な要素である。本研究の目的は、車内エージェントであるAMIE(Automated-vehicle Multimodal In-cabin Experience)を探索することである。本研究では,車内および車外からの言語/言語入力と非言語/音響的・視覚的手がかりを組み込むことにより,車内発話のマルチモーダル理解のメリットについて論じる。実験結果は,マルチモーダルアプローチによる意図検出の性能向上により,テキストのみベースラインを上回った。 Building multimodal dialogue understanding capabilities situated in the in-cabin context is crucial to enhance passenger comfort in autonomous vehicle (AV) interaction systems. To this end, understanding passenger intents from spoken interactions and vehicle vision systems is a crucial component for developing contextual and visually grounded conversational agents for AV. Towards this goal, we explore AMIE (Automated-vehicle Multimodal In-cabin Experience), the in-cabin agent responsible for handling multimodal passenger-vehicle interactions. In this work, we discuss the benefits of a multimodal understanding of in-cabin utterances by incorporating verbal/language input together with the non-verbal/acoustic and visual clues from inside and outside the vehicle. Our experimental results outperformed text-only baselines as we achieved improved performances for intent detection with a multimodal approach.	翻訳日:2022-11-12 13:06:07 公開日:2020-07-08
# n-項関係知識ベースに対するテンソル分解の一般化 Generalizing Tensor Decomposition for N-ary Relational Knowledge Bases ( http://arxiv.org/abs/2007.03988v1 ) ライセンス: Link先を確認	Yu Liu and Quanming Yao and Yong Li	(参考訳) 知識ベース(kbs)の急速な発展に伴い、リンク予測タスク(リンク予測タスク)は、特に、強力なテンソル分解関連手法を持つバイナリリレーショナルkbs(つまり知識グラフ)において広く研究されてきた。しかし、高次関係事実を持つユビキタスなn-aryリレーショナルkbは、既存の翻訳ベースおよびニューラルネットワークベースのアプローチが様々な関係のモデリングにおいて弱い表現力と高い複雑さを持つため、あまり注目されていない。テンソル分解は n-項リレーショナルKB に対しては考慮されていないが、双対リレーショナルKB のテンソル分解関連法を直接 n-項ケースに拡張しても指数モデル複雑性と二項リレーショナルKBの強い仮定により満足な結果が得られない。本研究では,n-aryリレーショナルKBのテンソル分解を一般化するために,タッカー分解とテンソルリング分解に基づく一般化モデルであるGETDを提案する。既存の負サンプリング手法は、GETDのn-aryケースにも一般化される。さらに, GETD が KB を完全表現可能であることを理論的に証明する。 2つの代表的なn-aryリレーショナルkbデータセットの広範な評価は、getdの優れたパフォーマンスを示し、最先端のメソッドを15-%以上改善した。さらにGETDは、ベンチマークバイナリリレーショナルKBデータセットの最先端結果も取得する。 With the rapid development of knowledge bases (KBs), link prediction task, which completes KBs with missing facts, has been broadly studied in especially binary relational KBs (a.k.a knowledge graph) with powerful tensor decomposition related methods. However, the ubiquitous n-ary relational KBs with higher-arity relational facts are paid less attention, in which existing translation based and neural network based approaches have weak expressiveness and high complexity in modeling various relations. Tensor decomposition has not been considered for n-ary relational KBs, while directly extending tensor decomposition related methods of binary relational KBs to the n-ary case does not yield satisfactory results due to exponential model complexity and their strong assumptions on binary relations. To generalize tensor decomposition for n-ary relational KBs, in this work, we propose GETD, a generalized model based on Tucker decomposition and Tensor Ring decomposition. The existing negative sampling technique is also generalized to the n-ary case for GETD. In addition, we theoretically prove that GETD is fully expressive to completely represent any KBs. Extensive evaluations on two representative n-ary relational KB datasets demonstrate the superior performance of GETD, significantly improving the state-of-the-art methods by over 15\%. Moreover, GETD further obtains the state-of-the-art results on the benchmark binary relational KB datasets.	翻訳日:2022-11-12 13:05:52 公開日:2020-07-08
# Citation Recommendationのためのニューラルテキスト表現の学習 Learning Neural Textual Representations for Citation Recommendation ( http://arxiv.org/abs/2007.04070v1 ) ライセンス: Link先を確認	Binh Thanh Kieu, Inigo Jauregi Unanue, Son Bao Pham, Hieu Xuan Phan, Massimo Piccardi	(参考訳) 科学文献の急速な発展に伴い、論文の適切な引用を手作業で選択することはますます困難で時間がかかりつつある。近年,自動引用推薦の手法がいくつか提案されているが,引用推薦のための効果的な文書表現はいまだにかなり解明されていない。そこで本稿では,シームズと三重項ネットワークを組み込んだ文書(センテンス-BERT)をサブモジュラースコアリング関数で逐次表現する,引用レコメンデーションの新しい手法を提案する。私たちの知る限りでは、これは引用推薦のタスクのために深い表現とサブモジュラーの選択を組み合わせる最初のアプローチです。一般的なベンチマークデータセットであるACL Anthology Network corpusを使用して実験が行われ、ベースラインとMRRやF1-at-kスコアといったメトリクスを使用した最先端アプローチに対して評価されている。その結果, 提案手法は, 測定値毎に比較した全ての手法より優れていることがわかった。 With the rapid growth of the scientific literature, manually selecting appropriate citations for a paper is becoming increasingly challenging and time-consuming. While several approaches for automated citation recommendation have been proposed in the recent years, effective document representations for citation recommendation are still elusive to a large extent. For this reason, in this paper we propose a novel approach to citation recommendation which leverages a deep sequential representation of the documents (Sentence-BERT) cascaded with Siamese and triplet networks in a submodular scoring function. To the best of our knowledge, this is the first approach to combine deep representations and submodular selection for a task of citation recommendation. Experiments have been carried out using a popular benchmark dataset - the ACL Anthology Network corpus - and evaluated against baselines and a state-of-the-art approach using metrics such as the MRR and F1-at-k score. The results show that the proposed approach has been able to outperform all the compared approaches in every measured metric.	翻訳日:2022-11-12 13:05:26 公開日:2020-07-08
# 談話のコヒーレンス,参照グラウンド,目標指向対話 Discourse Coherence, Reference Grounding and Goal Oriented Dialogue ( http://arxiv.org/abs/2007.04428v1 ) ライセンス: Link先を確認	Baber Khalid, Malihe Alikhani, Michael Fellner, Brian McMahan, Matthew Stone	(参考訳) 混合開始型ヒューマンコンピュータ参照コミュニケーションを実現するための従来のアプローチは、情報状態または協調的な問題解決アプローチを採用してきた。本稿では,sdrt \cite{asher-lascarides:2003a} のようなコヒーレンスに基づく談話モデルに着想を得た新たなアプローチを議論する。提案手法の実装に向けた第一歩として、談話間の制約を蓄積し、学習確率モデルを用いてそれらを解釈する参照通信領域における単純な対話システムについて述べる。 Prior approaches to realizing mixed-initiative human--computer referential communication have adopted information-state or collaborative problem-solving approaches. In this paper, we argue for a new approach, inspired by coherence-based models of discourse such as SDRT \cite{asher-lascarides:2003a}, in which utterances attach to an evolving discourse structure and the associated knowledge graph of speaker commitments serves as an interface to real-world reasoning and conversational strategy. As first steps towards implementing the approach, we describe a simple dialogue system in a referential communication domain that accumulates constraints across discourse, interprets them using a learned probabilistic model, and plans clarification using reinforcement learning.	翻訳日:2022-11-12 13:05:08 公開日:2020-07-08
# Dungのセマンティクスは攻撃除去単調性を満たす Dung's semantics satisfy attack removal monotonicity ( http://arxiv.org/abs/2007.04221v1 ) ライセンス: Link先を確認	Leila Amgoud, Srdjan Vesic	(参考訳) 攻撃除去の単調性は, 好ましく, 安定し, 完全で, 接地されたセマンティクスが満足していることが示される。これは、b から a への攻撃が取り除かれた場合、a の状態は悪化しないことを意味する。 We show that preferred, stable, complete, and grounded semantics satisfy attack removal monotonicity. This means that if an attack from b to a is removed, the status of a cannot worsen, e.g. if a was skeptically accepted, it cannot become rejected.	翻訳日:2022-11-12 13:04:21 公開日:2020-07-08
# スムースゲームに対する確率的ハミルトン勾配法 Stochastic Hamiltonian Gradient Methods for Smooth Games ( http://arxiv.org/abs/2007.04202v1 ) ライセンス: Link先を確認	Nicolas Loizou, Hugo Berard, Alexia Jolicoeur-Martineau, Pascal Vincent, Simon Lacoste-Julien, Ioannis Mitliagkas	(参考訳) 機械学習における敵意の定式化の成功は、スムーズなゲームに対する新たなモチベーションをもたらした。本研究では,確率的ハミルトニアン手法のクラスに着目し,ある種の確率的滑らかなゲームに対して,最初の収束保証を提供する。確率的ハミルトン勾配勾配(SHGD)の非バイアス推定器を提案し,その利点を明らかにする。最適化文献のツールを用いて,shgd が定常点近傍に線形収束することを示す。厳密な解の収束を保証するため, SHGDをステップサイズを小さくして解析し, 初めての確率分散低減ハミルトン法を提案する。この結果から,非凸な非凸問題を含む「十分両線形」条件を満たす確率的非制約双線型ゲームや,より一般的な確率的ゲームに対して,最初のグローバルな非漸近的最終点収束保証を提供する。我々は,確率的双線形ゲームと十分な双線形ゲームにおいて,我々の理論が厳密であることを示す実験と,単純な対向機械学習の定式化による解析を補完する。 The success of adversarial formulations in machine learning has brought renewed motivation for smooth games. In this work, we focus on the class of stochastic Hamiltonian methods and provide the first convergence guarantees for certain classes of stochastic smooth games. We propose a novel unbiased estimator for the stochastic Hamiltonian gradient descent (SHGD) and highlight its benefits. Using tools from the optimization literature we show that SHGD converges linearly to the neighbourhood of a stationary point. To guarantee convergence to the exact solution, we analyze SHGD with a decreasing step-size and we also present the first stochastic variance reduced Hamiltonian method. Our results provide the first global non-asymptotic last-iterate convergence guarantees for the class of stochastic unconstrained bilinear games and for the more general class of stochastic games that satisfy a "sufficiently bilinear" condition, notably including some non-convex non-concave problems. We supplement our analysis with experiments on stochastic bilinear and sufficiently bilinear games, where our theory is shown to be tight, and on simple adversarial machine learning formulations.	翻訳日:2022-11-12 12:58:23 公開日:2020-07-08
# StructureBoost: 構造カテゴリー変数の効率的なグラディエントブースティング StructureBoost: Efficient Gradient Boosting for Structured Categorical Variables ( http://arxiv.org/abs/2007.04446v1 ) ライセンス: Link先を確認	Brian Lucena	(参考訳) 構造カテゴリー決定木 (SCDT) に基づくグラディエント促進法は, 分類変数が既知の基盤構造を持つ問題に対して, 数値および1ホットエンコーディングより優れることを示した。しかし、SCDTの列挙手順は、低濃度または中等濃度のカテゴリー変数を除いて実現不可能である。計算障害を克服する2つの手法を提案し,実装し,複雑な構造的分類変数の勾配ブースティングを効率的に行う。結果、StructureBoostと呼ばれるパッケージは、洗練された構造を含むカテゴリ予測器の問題で、CatBoostやLightGBMのような確立したパッケージより優れていることが示されている。さらに, 基礎構造に関する知識から, structureboost が未知の分類値を正確に予測できることを実証する。 Gradient boosting methods based on Structured Categorical Decision Trees (SCDT) have been demonstrated to outperform numerical and one-hot-encodings on problems where the categorical variable has a known underlying structure. However, the enumeration procedure in the SCDT is infeasible except for categorical variables with low or moderate cardinality. We propose and implement two methods to overcome the computational obstacles and efficiently perform Gradient Boosting on complex structured categorical variables. The resulting package, called StructureBoost, is shown to outperform established packages such as CatBoost and LightGBM on problems with categorical predictors that contain sophisticated structure. Moreover, we demonstrate that StructureBoost can make accurate predictions on unseen categorical values due to its knowledge of the underlying structure.	翻訳日:2022-11-12 12:57:18 公開日:2020-07-08
# MRIF : 勧告のための多分解能核融合 MRIF: Multi-resolution Interest Fusion for Recommendation ( http://arxiv.org/abs/2007.07084v1 ) ライセンス: Link先を確認	Shihao Li (1), Dekun Yang (1), Bufeng Zhang (1) ((1) Alibaba Inc)	(参考訳) パーソナライズドレコメンデーションの主なタスクは、ユーザーの過去の行動に基づいてユーザーの興味を捉えることである。近年のレコメンデータシステムの進歩のほとんどは、ディープラーニングベースのアプローチを用いてユーザの好みを正確にモデル化することに焦点を当てている。ユーザの興味には2つの重要な特性がある。1つは、ユーザの興味は時間とともに動的で進化し、もう1つは、ユーザの関心は、長期的および短期的な嗜好のような、正確な時間的範囲が異なることである。既存のアプローチでは、異なる時間範囲を考慮せずに、ユーザの関心のドリフトに対処するためにリカレントニューラルネットワーク(RNN)を使用しているか、長期と短期の好みを別々にモデル化するために2つの異なるネットワークを設計している。本稿では,ユーザの利害関係を考慮に入れた多分解能利害融合モデル(MRIF)を提案する。提案モデルでは,ユーザの興味の動的変化を異なる時間範囲で捉えることができ,マルチ解像度のユーザ関心を組み合わせて予測を行う効果的な方法を提供する。実験の結果,提案手法は最先端のレコメンデーション手法よりも優れていた。 The main task of personalized recommendation is capturing users' interests based on their historical behaviors. Most of recent advances in recommender systems mainly focus on modeling users' preferences accurately using deep learning based approaches. There are two important properties of users' interests, one is that users' interests are dynamic and evolve over time, the other is that users' interests have different resolutions, or temporal-ranges to be precise, such as long-term and short-term preferences. Existing approaches either use Recurrent Neural Networks (RNNs) to address the drifts in users' interests without considering different temporal-ranges, or design two different networks to model long-term and short-term preferences separately. This paper presents a multi-resolution interest fusion model (MRIF) that takes both properties of users' interests into consideration. The proposed model is capable to capture the dynamic changes in users' interests at different temporal-ranges, and provides an effective way to combine a group of multi-resolution user interests to make predictions. Experiments show that our method outperforms state-of-the-art recommendation methods consistently.	翻訳日:2022-11-12 12:56:13 公開日:2020-07-08
# フェアクラスタリングは? Whither Fair Clustering? ( http://arxiv.org/abs/2007.07838v1 ) ライセンス: Link先を確認	Deepak P	(参考訳) 分類フェアネス研究に支配されている比較的多忙なフェア機械学習の領域では、クラスタリングにおけるフェアネスが近年注目され始めている。本稿では, フェアクラスタリングにおける既存の作業を評価し, 未調査の方向性がいくつかあることを観察し, フェアクラスタリングにおける最先端技術は, 非常に画期的であることを仮定する。我々は,目標とする規範的な原則を拡大し,目標が完全に達成できない欠点を特徴付けること,下流プロセスの知識を活用すれば,公平なクラスタリング研究における研究の範囲を大きく広げることができると仮定する。クラスタリングと教師なし学習が、人間の生活に重要な決定を下し、影響を及ぼすのにますます使われているとき、公正なクラスタリングの範囲を広げることは、非常に重要であると考えています。 Within the relatively busy area of fair machine learning that has been dominated by classification fairness research, fairness in clustering has started to see some recent attention. In this position paper, we assess the existing work in fair clustering and observe that there are several directions that are yet to be explored, and postulate that the state-of-the-art in fair clustering has been quite parochial in outlook. We posit that widening the normative principles to target for, characterizing shortfalls where the target cannot be achieved fully, and making use of knowledge of downstream processes can significantly widen the scope of research in fair clustering research. At a time when clustering and unsupervised learning are being increasingly used to make and influence decisions that matter significantly to human lives, we believe that widening the ambit of fair clustering is of immense significance.	翻訳日:2022-11-12 12:55:51 公開日:2020-07-08
# 自動微分を用いたモデルベースクラスタリング:ミス種別と高次元データの比較 Model-based Clustering using Automatic Differentiation: Confronting Misspecification and High-Dimensional Data ( http://arxiv.org/abs/2007.12786v1 ) ライセンス: Link先を確認	Siva Rajesh Kasa, Vaibhav Rajan	(参考訳) ガウス混合モデルを用いたモデルベースクラスタリングの実用上重要な2つの事例について検討する:(1)不特定性がある場合、(2)高次元データに基づく場合、自動微分(AD)を用いたグラディエントD(GD)に基づく最適化の最近の進歩を踏まえて。シミュレーションにより,EMのクラスタリング性能は,不特定の場合のGDに比べて向上し,高次元データGDではEMより優れていた。 em と gd はともに高い確率でクラスタ解釈が貧弱な多くの解が存在することを観測する。この問題に対処するため、我々は、適合するコンポーネントのペア間のkullback leiblerの発散に基づく可能性の新たなペナルティ項を設計する。このペナル化確率の勾配の閉形式表現は導出が難しいが、ADを最適化する利点を説明できる。高次元データとモデル選択のためのこのペナルティの拡張について論じる。合成および実データセットに関する数値実験により,提案手法を用いたクラスタリングの有効性が示された。 We study two practically important cases of model based clustering using Gaussian Mixture Models: (1) when there is misspecification and (2) on high dimensional data, in the light of recent advances in Gradient Descent (GD) based optimization using Automatic Differentiation (AD). Our simulation studies show that EM has better clustering performance, measured by Adjusted Rand Index, compared to GD in cases of misspecification, whereas on high dimensional data GD outperforms EM. We observe that both with EM and GD there are many solutions with high likelihood but poor cluster interpretation. To address this problem we design a new penalty term for the likelihood based on the Kullback Leibler divergence between pairs of fitted components. Closed form expressions for the gradients of this penalized likelihood are difficult to derive but AD can be done effortlessly, illustrating the advantage of AD-based optimization. Extensions of this penalty for high dimensional data and for model selection are discussed. Numerical experiments on synthetic and real datasets demonstrate the efficacy of clustering using the proposed penalized likelihood approach.	翻訳日:2022-11-12 12:55:35 公開日:2020-07-08
# SiENet:画像外挿のためのシームズ拡張ネットワーク SiENet: Siamese Expansion Network for Image Extrapolation ( http://arxiv.org/abs/2007.03851v1 ) ライセンス: Link先を確認	Xiaofeng Zhang, Feng Chen, Cailing Wang, Songsong Wu, Ming Tao and Guoping Jiang	(参考訳) 画像の塗布と異なり、画像の露光はイメージセンタ内のコンテキストが比較的小さく、画像境界でより多くのコンテンツをキャプチャして予測する。したがって、既存のメソッドの古典的なエンコーダ・デコーダパイプラインは、拡張された未知のコンテンツを正確に予測することはできない。本稿では,Siamese Expansion Network (SiENet) と呼ばれる,画像外挿のための2段階逆解析モデルを提案する。 2つの段階において、適応充填畳み込み(adaptive fill convolution)と呼ばれる新しい境界感度畳み込みは、エンコーダが未知のコンテンツを予測するように設計され、デコーダの負担を軽減する。さらに,ネットワークに事前知識を導入し,エンコーダの推論能力を強化するため,サーム逆数機構を設計し,未発見画像の特徴量に対する被覆長範囲特徴量の分布をモデル化する。 4つのデータセットの結果から,本手法は既存の最先端技術よりも優れ,現実的な結果が得られることが示された。 Different from image inpainting, image outpainting has relative less context in the image center to capture and more content at the image border to predict. Therefore, classical encoder-decoder pipeline of existing methods may not predict the outstretched unknown content perfectly. In this paper, a novel two-stage siamese adversarial model for image extrapolation, named Siamese Expansion Network (SiENet) is proposed. In two stages, a novel border sensitive convolution named adaptive filling convolution is designed for allowing encoder to predict the unknown content, alleviating the burden of decoder. Besides, to introduce prior knowledge to network and reinforce the inferring ability of encoder, siamese adversarial mechanism is designed to enable our network to model the distribution of covered long range feature for that of uncovered image feature. The results on four datasets has demonstrated that our method outperforms existing state-of-the-arts and could produce realistic results.	翻訳日:2022-11-12 12:55:16 公開日:2020-07-08
# 非負関数の非パラメトリックモデル Non-parametric Models for Non-negative Functions ( http://arxiv.org/abs/2007.03926v1 ) ライセンス: Link先を確認	Ulysse Marteau-Ferey (PSL, DI-ENS, SIERRA), Francis Bach (PSL, DI-ENS, SIERRA), Alessandro Rudi (PSL, DI-ENS, SIERRA)	(参考訳) 線形モデルは、機械学習、信号処理、統計など、多くの分野で大きな効果と柔軟性を示している。それらは、使用する最適化問題の凸性を維持しながら、関数のリッチな空間を表現でき、評価、差別化、統合が容易である。しかし、教師なし学習、密度推定、非パラメトリックベイズ法に不可欠な非負関数のモデル化では、線形モデルは直接適用されない。さらに、一般化線形モデルのような現在の最先端モデルは、非凸最適化問題につながるか、容易に統合できない。本稿では、線形モデルの同じ良い性質の恩恵を受ける非負関数に対する最初のモデルを提供する。特に、表現定理を認め、凸問題に対する効率的な二重定式化を提供することを証明している。その表現力について研究し、結果として得られる函数の空間が一般化線型モデルのそれよりも厳密にリッチであることを示す。最後に、モデルと理論結果を凸錐の出力を持つ関数に拡張する。本論文は, 定式化, アルゴリズムによる導出, 実用的結果, 密度推定問題, ヘテロシドスティック誤差を伴う回帰問題, および多変量回帰問題における有効性を示すモデルの実験的評価によって補完された。 Linear models have shown great effectiveness and flexibility in many fields such as machine learning, signal processing and statistics. They can represent rich spaces of functions while preserving the convexity of the optimization problems where they are used, and are simple to evaluate, differentiate and integrate. However, for modeling non-negative functions, which are crucial for unsupervised learning, density estimation, or non-parametric Bayesian methods, linear models are not applicable directly. Moreover, current state-of-the-art models like generalized linear models either lead to non-convex optimization problems, or cannot be easily integrated. In this paper we provide the first model for non-negative functions which benefits from the same good properties of linear models. In particular, we prove that it admits a representer theorem and provide an efficient dual formulation for convex problems. We study its representation power, showing that the resulting space of functions is strictly richer than that of generalized linear models. Finally we extend the model and the theoretical results to functions with outputs in convex cones. The paper is complemented by an experimental evaluation of the model showing its effectiveness in terms of formulation, algorithmic derivation and practical results on the problems of density estimation, regression with heteroscedastic errors, and multiple quantile regression.	翻訳日:2022-11-12 12:49:22 公開日:2020-07-08
# PIDラグランジアン法による強化学習における応答安全 Responsive Safety in Reinforcement Learning by PID Lagrangian Methods ( http://arxiv.org/abs/2007.03964v1 ) ライセンス: Link先を確認	Adam Stooke, Joshua Achiam, and Pieter Abbeel	(参考訳) ラグランジアン法は制約付き最適化問題のアルゴリズムとして広く用いられているが、その学習力学は振動やオーバーシュートを示し、安全強化学習に適用するとエージェントトレーニング中に制約違反行動を引き起こす。本稿では,制約関数の微分を利用した新しいラグランジュ乗算器更新法を提案する。我々は、従来のラグランジュ乗算器更新が \emph{integral} 制御として振る舞う制御の観点を採り、我々の用語は \emph{proportional} と \emph{derivative} 制御を導入し、減衰と予測手段によって良好な学習ダイナミクスを達成する。我々はPIDラグランジアン法を深部RLに適用し、安全RLベンチマークであるSafety Gymにおける新しい技術状態を設定する。最後に,報奨とコストの相対的な数値スケールに対する不変性を提供することにより,コントローラのチューニングを容易にする新しい手法を提案する。我々のアルゴリズムは従来のラグランジアンアプローチと同様に、導出と実装がほとんど簡単であり、性能とハイパーパラメータの堅牢性が改善された。 Lagrangian methods are widely used algorithms for constrained optimization problems, but their learning dynamics exhibit oscillations and overshoot which, when applied to safe reinforcement learning, leads to constraint-violating behavior during agent training. We address this shortcoming by proposing a novel Lagrange multiplier update method that utilizes derivatives of the constraint function. We take a controls perspective, wherein the traditional Lagrange multiplier update behaves as \emph{integral} control; our terms introduce \emph{proportional} and \emph{derivative} control, achieving favorable learning dynamics through damping and predictive measures. We apply our PID Lagrangian methods in deep RL, setting a new state of the art in Safety Gym, a safe RL benchmark. Lastly, we introduce a new method to ease controller tuning by providing invariance to the relative numerical scales of reward and cost. Our extensive experiments demonstrate improved performance and hyperparameter robustness, while our algorithms remain nearly as simple to derive and implement as the traditional Lagrangian approach.	翻訳日:2022-11-12 12:49:00 公開日:2020-07-08
# ヒューマンアクティビティ認識のための効率的なデータインプテーション手法 An Efficient Data Imputation Technique for Human Activity Recognition ( http://arxiv.org/abs/2007.04456v1 ) ライセンス: Link先を確認	Ivan Miguel Pires, Faisal Hussain, Nuno M. Garcia, Eftim Zdravevski	(参考訳) 人間の行動認識の膨大な応用は、健康モニタリングシステムからバーチャルリアリティーアプリケーションまで幅広く利用されている。このように、多くの応用において日常生活活動の自動認識が重要になっている。近年,人間の日常生活活動の効率的なモニタリングと認識のために,機械学習モデルを訓練するためのデータセットが多数提案されている。しかし、データセットに不完全なアクティビティがある場合、すなわちデータセットキャプチャーにサンプルが欠けている場合、アクティビティ認識における機械学習モデルのパフォーマンスは重要な影響を受ける。そこで本研究では,人間の日常生活活動をよりよく認識するために,データセットの欠落サンプルを外挿する手法を提案する。提案手法は,k-nearest neighbors (knn) インプテーション手法を用いて,データキャプチャにおける欠落サンプルの抽出を行う。提案手法は,実際のデータセットと類似したアクティビティパターンをエレガントに推定した。 The tremendous applications of human activity recognition are surging its span from health monitoring systems to virtual reality applications. Thus, the automatic recognition of daily life activities has become significant for numerous applications. In recent years, many datasets have been proposed to train the machine learning models for efficient monitoring and recognition of human daily living activities. However, the performance of machine learning models in activity recognition is crucially affected when there are incomplete activities in a dataset, i.e., having missing samples in dataset captures. Therefore, in this work, we propose a methodology for extrapolating the missing samples of a dataset to better recognize the human daily living activities. The proposed method efficiently pre-processes the data captures and utilizes the k-Nearest Neighbors (KNN) imputation technique to extrapolate the missing samples in dataset captures. The proposed methodology elegantly extrapolated a similar pattern of activities as they were in the real dataset.	翻訳日:2022-11-12 12:48:37 公開日:2020-07-08
# BlockFLow: フェデレーション学習のための説明責任とプライバシ保護ソリューション BlockFLow: An Accountable and Privacy-Preserving Solution for Federated Learning ( http://arxiv.org/abs/2007.03856v1 ) ライセンス: Link先を確認	Vaikkunth Mugunthan, Ravi Rahman and Lalana Kagal	(参考訳) 連合学習は、基礎となるデータを共有する必要なしに、協調エージェント間の機械学習モデルの開発を可能にする。しかし、ランダムなデータでトレーニングする悪意のあるエージェント、あるいは結果クラスが反転したデータセットでは、組み合わせたモデルを弱める可能性がある。 BlockFLowは、完全な分散化とプライバシ保護を備えた、説明可能な連邦学習システムである。その主な目標は、基盤となるデータセットのプライバシ保護と悪意のある敵に対する耐性を確保しながら、コントリビューションの品質に比例するエージェントに報酬を与えることである。具体的には、blockflowはディファレンシャルプライバシを取り入れ、モデルコントリビュートのための新しい監査メカニズムを導入し、ethereumスマートコントラクトを使用して優れた振る舞いをインセンティブ化する。フェデレートされた学習システムに対する既存の監査やアカウンタビリティ手法とは異なり、我々のシステムは中央集権的なテストデータセットを必要とせず、エージェント間でデータセットを共有するか、あるいは1つ以上の信頼できる監査者間でデータセットを共有する。パブリックなEthereumブロックチェーン上で実行する場合、BlockFLowは監査の結果を使用して、コントリビューションの品質に基づいた暗号通貨の報酬を行う。ロジスティック回帰モデルによって解決可能な分類タスクを提供する2つのデータセット上のblockflowを評価した。その結果, 評価スコアは, 正直なエージェントのデータセットの品質を反映していることがわかった。また、不正エージェントのスコアは、正直エージェントのスコアよりも統計的に低い。これらの結果は、合理的なブロックチェーンコストとともに、説明可能なフェデレーション学習システムとしてのBlockFLowの有効性を示している。 Federated learning enables the development of a machine learning model among collaborating agents without requiring them to share their underlying data. However, malicious agents who train on random data, or worse, on datasets with the result classes inverted, can weaken the combined model. BlockFLow is an accountable federated learning system that is fully decentralized and privacy-preserving. Its primary goal is to reward agents proportional to the quality of their contribution while protecting the privacy of the underlying datasets and being resilient to malicious adversaries. Specifically, BlockFLow incorporates differential privacy, introduces a novel auditing mechanism for model contribution, and uses Ethereum smart contracts to incentivize good behavior. Unlike existing auditing and accountability methods for federated learning systems, our system does not require a centralized test dataset, sharing of datasets between the agents, or one or more trusted auditors; it is fully decentralized and resilient up to a 50% collusion attack in a malicious trust model. When run on the public Ethereum blockchain, BlockFLow uses the results from the audit to reward parties with cryptocurrency based on the quality of their contribution. We evaluated BlockFLow on two datasets that offer classification tasks solvable via logistic regression models. Our results show that the resultant auditing scores reflect the quality of the honest agents' datasets. Moreover, the scores from dishonest agents are statistically lower than those from the honest agents. These results, along with the reasonable blockchain costs, demonstrate the effectiveness of BlockFLow as an accountable federated learning system.	翻訳日:2022-11-12 12:47:58 公開日:2020-07-08
# ニューラルSDEによるロバスト価格とヘッジ Robust pricing and hedging via neural SDEs ( http://arxiv.org/abs/2007.04154v1 ) ライセンス: Link先を確認	Patryk Gierjatowicz and Marc Sabate-Vidales and David \v{S}i\v{s}ka and Lukasz Szpruch and \v{Z}an \v{Z}uri\v{c}	(参考訳) 数学的モデリングは金融業界に広く浸透しており、重要な意思決定プロセスを動かしている。任意のモデルが現実に粗悪な近似を与えるだけであり、不適切なモデルを使用することのリスクは検出と定量化が難しい。対照的に、現代のデータサイエンス技術は、より堅牢でデータ駆動のモデル選択メカニズムへの扉を開く。しかしながら、ほとんどの機械学習モデルは、個々のパラメータが意味のある解釈を持たないため、"ブラックボックス"である。本稿の目的は,上記の2つの世界のベストを達成するアプローチを組み合わせることである。ニューラルネットワークと古典確率微分方程式(SDE)に基づくリスクモデルを組み合わせることで、デリバティブの価格とそれに対応するヘッジ戦略の堅牢な境界を見つけ、関連する市場データを取り込む。ニューラルSDEと呼ばれる結果は生成モデルのインスタンス化であり、因果最適輸送の理論と密接に関連している。ニューラルSDEはリスクニュートラルと現実世界の両方で一貫した校正を可能にする。したがって、モデルはリスクプロファイルやヘッジ戦略を評価するのに必要な市場シナリオをシミュレートするために使用できる。我々は,ニューラルSDEの効率的な利用に必要な新しいアルゴリズムを開発し,分析する。局所的および確率的ボラティリティモデルを用いて数値実験によるアプローチを検証する。 Mathematical modelling is ubiquitous in the financial industry and drives key decision processes. Any given model provides only a crude approximation to reality and the risk of using an inadequate model is hard to detect and quantify. By contrast, modern data science techniques are opening the door to more robust and data-driven model selection mechanisms. However, most machine learning models are "black-boxes" as individual parameters do not have meaningful interpretation. The aim of this paper is to combine the above approaches achieving the best of both worlds. Combining neural networks with risk models based on classical stochastic differential equations (SDEs), we find robust bounds for prices of derivatives and the corresponding hedging strategies while incorporating relevant market data. The resulting model called neural SDE is an instantiation of generative models and is closely linked with the theory of causal optimal transport. Neural SDEs allow consistent calibration under both the risk-neutral and the real-world measures. Thus the model can be used to simulate market scenarios needed for assessing risk profiles and hedging strategies. We develop and analyse novel algorithms needed for efficient use of neural SDEs. We validate our approach with numerical experiments using both local and stochastic volatility models.	翻訳日:2022-11-12 12:46:35 公開日:2020-07-08
# サンプリングによるDPPからの学習:HKPVと対称性を超えて Learning from DPPs via Sampling: Beyond HKPV and symmetry ( http://arxiv.org/abs/2007.04287v1 ) ライセンス: Link先を確認	R\'emi Bardenet and Subhroshekhar Ghosh	(参考訳) 決定点プロセス(DPP)は,これらの確率的モデルの本質的な能力を生かして,サンプルの多様性を促進する,レコメンデーションシステム,特徴選択,要約抽出のための重要なツールとなっている。 DPPからサンプルを採取する能力は、これらのモデルの実証的研究に最重要である。ほとんどの正確なサンプルは、Hough、Krishnapur、Peres、Vir\'ag (henceforth HKPV)によるスペクトルメタアルゴリズムの変種である。対称カーネルを持つDPPでは、スケーラブルなHKPVサンプリング器が提案されており、まずはアイテムの基底セットをダウンサンプルするか、Nystr\"om型分解を用いてカーネルをローランクにする。本研究では,HKPVとは大きく異なるアプローチを提案する。 DPP(いわゆる線形統計学)の重要な可観測値だけをサンプリングすることで、多くの統計的および学習目的が効果的に達成できるという事実が発覚し、そのような可観測値のラプラス変換の式を1つの行列式として呼び出す。従来の低ランク近似手法とラプラス逆解析を組み合わせることで,dppの線形統計量の分布関数を直接近似する方法を示す。この分布関数は、要求に従って仮説テストや実際に線形統計学をサンプリングするのに使うことができる。我々のアプローチはスケーラブルであり、従来の対称カーネルを超えて非常に一般的なDPPに適用できる。 Determinantal point processes (DPPs) have become a significant tool for recommendation systems, feature selection, or summary extraction, harnessing the intrinsic ability of these probabilistic models to facilitate sample diversity. The ability to sample from DPPs is paramount to the empirical investigation of these models. Most exact samplers are variants of a spectral meta-algorithm due to Hough, Krishnapur, Peres and Vir\'ag (henceforth HKPV), which is in general time and resource intensive. For DPPs with symmetric kernels, scalable HKPV samplers have been proposed that either first downsample the ground set of items, or force the kernel to be low-rank, using e.g. Nystr\"om-type decompositions. In the present work, we contribute a radically different approach than HKPV. Exploiting the fact that many statistical and learning objectives can be effectively accomplished by only sampling certain key observables of a DPP (so-called linear statistics), we invoke an expression for the Laplace transform of such an observable as a single determinant, which holds in complete generality. Combining traditional low-rank approximation techniques with Laplace inversion algorithms from numerical analysis, we show how to directly approximate the distribution function of a linear statistic of a DPP. This distribution function can then be used in hypothesis testing or to actually sample the linear statistic, as per requirement. Our approach is scalable and applies to very general DPPs, beyond traditional symmetric kernels.	翻訳日:2022-11-12 12:42:04 公開日:2020-07-08
# 楽観的スコア比を用いたロバストベイズ分類 Robust Bayesian Classification Using an Optimistic Score Ratio ( http://arxiv.org/abs/2007.04458v1 ) ライセンス: Link先を確認	Viet Anh Nguyen and Nian Si and Jose Blanchet	(参考訳) 我々は,クラス条件,あるいは文脈分布に関する情報が限られている場合,頑健なバイナリ分類のための楽観的スコア比を用いたベイズ文脈分類モデルを構築する。楽観的なスコアは、平均ベクトルと基礎となる文脈分布の共分散行列に制限された構造的制約を用いて規定される文脈曖昧性集合に属するすべての分布のうち、テストサンプルの観測結果を説明する最も有効な分布を探索する。楽観的スコア比を用いたベイズ分類器は,概念的に魅力的であり,統計的保証がしっかりでき,計算も容易である。提案する楽観的スコア比分類器のパワーを合成データと実験データの両方に示す。 We build a Bayesian contextual classification model using an optimistic score ratio for robust binary classification when there is limited information on the class-conditional, or contextual, distribution. The optimistic score searches for the distribution that is most plausible to explain the observed outcomes in the testing sample among all distributions belonging to the contextual ambiguity set which is prescribed using a limited structural constraint on the mean vector and the covariance matrix of the underlying contextual distribution. We show that the Bayesian classifier using the optimistic score ratio is conceptually attractive, delivers solid statistical guarantees and is computationally tractable. We showcase the power of the proposed optimistic score ratio classifier on both synthetic and empirical data.	翻訳日:2022-11-12 12:41:03 公開日:2020-07-08
# URSABench:ディープニューラルネットワークのための近似ベイズ推論法の総合ベンチマーク URSABench: Comprehensive Benchmarking of Approximate Bayesian Inference Methods for Deep Neural Networks ( http://arxiv.org/abs/2007.04466v1 ) ライセンス: Link先を確認	Meet P. Vadera, Adam D. Cobb, Brian Jalaian, Benjamin M. Marlin	(参考訳) ディープラーニング手法は、幅広いアプリケーションドメインにおける予測精度の向上を続けているが、不確実性とその堅牢性など、パフォーマンスの他の面においても重要な問題が残っている。ベイズ近似の最近の進歩はこれらの問題に対処する上で大きな可能性を秘めているが、これらの手法の計算スケーラビリティは大規模モデルに適用した場合に問題となる可能性がある。本稿では,深層学習に基づく分類タスクに着目した近似ベイズ推定手法の包括的評価のためのベンチマークツールのオープンソーススイートであるofursabench(the uncertainty, robustness, scalability, and accu-racy benchmark)の開発に関する初期研究について述べる。 While deep learning methods continue to improve in predictive accuracy on a wide range of application domains, significant issues remain with other aspects of their performance including their ability to quantify uncertainty and their robustness. Recent advances in approximate Bayesian inference hold significant promise for addressing these concerns, but the computational scalability of these methods can be problematic when applied to large-scale models. In this paper, we describe initial work on the development ofURSABench(the Uncertainty, Robustness, Scalability, and Accu-racy Benchmark), an open-source suite of bench-marking tools for comprehensive assessment of approximate Bayesian inference methods with a focus on deep learning-based classification tasks	翻訳日:2022-11-12 12:40:50 公開日:2020-07-08
# 深層学習に基づくidsにおけるニューラルネットワークの異なるタイプの敵訓練の評価 Evaluation of Adversarial Training on Different Types of Neural Networks in Deep Learning-based IDSs ( http://arxiv.org/abs/2007.04472v1 ) ライセンス: Link先を確認	Rana Abou Khamis and Ashraf Matrawy	(参考訳) ディープニューラルネットワークの侵入検知システムを含むネットワークセキュリティアプリケーションは、異常活動の検出タスクをより正確かつ堅牢にするために急速に増加している。 DNNの利用が急速に増加し、システム内を移動するデータ量が増えると、敵の攻撃の種類が増えていることが深刻な課題となっている。本稿では,ニューラルネットワーク(convolutional neural networks, cnn)やrecurrent neural networks(rnn)など,さまざまなニューラルネットワークを用いた,さまざまな回避攻撃の有効性と,レジリエンスに基づくディープラーニングidのトレーニング方法について検討する。 min-maxアプローチを用いて,2つのベンチマークデータセットを用いて,対向例に対するロバストidのトレーニング問題を定式化する。異なるディープラーニングアルゴリズムと異なるベンチマークデータセットに関する実験により、敵の訓練に基づくmin-maxアプローチによる防御が、よく知られた5つの敵の攻撃方法に対する堅牢性を向上することを示した。 Network security applications, including intrusion detection systems of deep neural networks, are increasing rapidly to make detection task of anomaly activities more accurate and robust. With the rapid increase of using DNN and the volume of data traveling through systems, different growing types of adversarial attacks to defeat them create a severe challenge. In this paper, we focus on investigating the effectiveness of different evasion attacks and how to train a resilience deep learning-based IDS using different Neural networks, e.g., convolutional neural networks (CNN) and recurrent neural networks (RNN). We use the min-max approach to formulate the problem of training robust IDS against adversarial examples using two benchmark datasets. Our experiments on different deep learning algorithms and different benchmark datasets demonstrate that defense using an adversarial training-based min-max approach improves the robustness against the five well-known adversarial attack methods.	翻訳日:2022-11-12 12:40:37 公開日:2020-07-08
# 多視点知識蒸留によるロバスト再同定 Robust Re-Identification by Multiple Views Knowledge Distillation ( http://arxiv.org/abs/2007.04174v1 ) ライセンス: Link先を確認	Angelo Porrello, Luca Bergamini, Simone Calderara	(参考訳) 再同定におけるロバスト性を実現するため、標準手法では追跡情報をビデオ対ビデオ方式で活用する。しかし、これらのソリューションは、単一の画像クエリ(例えば、画像からビデオへの設定)のパフォーマンスが大幅に低下する。近年の研究では,映像ベースネットワークから画像ベースネットワークへ時間情報を転送することで,この深刻な劣化に対処している。本研究は,対象対象を描写した一組の視点から,優れた知識の伝達を可能にするトレーニング戦略を考案する。本提案では,教師がより少ない視点を観察する生徒を教育する教師・学生の枠組みにおいて,この視覚的多様性を監督信号として捉える。その結果、学生は教師だけでなく、映像対ビデオの最先端技術も大きく上回っている(火星では6.3%、デューク=ビデオ=リードでは8.6%、ヴェリ-776では5%)。人, 乗り物, 動物リidの徹底分析により, vkdの特性を定性的, 定量的に検討した。コードはhttps://github.com/aimagelab/VKD.comで入手できる。 To achieve robustness in Re-Identification, standard methods leverage tracking information in a Video-To-Video fashion. However, these solutions face a large drop in performance for single image queries (e.g., Image-To-Video setting). Recent works address this severe degradation by transferring temporal information from a Video-based network to an Image-based one. In this work, we devise a training strategy that allows the transfer of a superior knowledge, arising from a set of views depicting the target object. Our proposal - Views Knowledge Distillation (VKD) - pins this visual variety as a supervision signal within a teacher-student framework, where the teacher educates a student who observes fewer views. As a result, the student outperforms not only its teacher but also the current state-of-the-art in Image-To-Video by a wide margin (6.3% mAP on MARS, 8.6% on Duke-Video-ReId and 5% on VeRi-776). A thorough analysis - on Person, Vehicle and Animal Re-ID - investigates the properties of VKD from a qualitatively and quantitatively perspective. Code is available at https://github.com/aimagelab/VKD.	翻訳日:2022-11-12 12:38:43 公開日:2020-07-08
# 第四カプセルネットワーク Quaternion Capsule Networks ( http://arxiv.org/abs/2007.04389v1 ) ライセンス: Link先を確認	Bar{\i}\c{s} \"Ozcan, Furkan K{\i}nl{\i}, Furkan K{\i}ra\c{c}	(参考訳) カプセルはニューロンのグループ化であり、ポーズや特徴といった視覚的な実体の洗練された情報を表現できる。この特性の観点から、Capsule Networksは、オブジェクト認識のような困難なタスクにおいてCNNよりも優れており、これは、ポーズ情報の高次元表現の助けを借りて、オブジェクトとその部分間の変換を学習することによって達成される。本稿では、カプセルのポーズ情報とその変換を四元数で表現する四元数カプセル(QCN)について述べる。四元系はジンバルロックに免疫があり、カプセルの回転表現の正則化が容易であり、行列よりもパラメータの数が少ない。実験の結果、qcnsは、パラメータの少ない新しい視点に一般化し、よく知られたベンチマークデータセット上の最先端のカプセルアーキテクチャで、ほぼあるいはより優れたパフォーマンスを達成することが示された。 Capsules are grouping of neurons that allow to represent sophisticated information of a visual entity such as pose and features. In the view of this property, Capsule Networks outperform CNNs in challenging tasks like object recognition in unseen viewpoints, and this is achieved by learning the transformations between the object and its parts with the help of high dimensional representation of pose information. In this paper, we present Quaternion Capsules (QCN) where pose information of capsules and their transformations are represented by quaternions. Quaternions are immune to the gimbal lock, have straightforward regularization of the rotation representation for capsules, and require less number of parameters than matrices. The experimental results show that QCNs generalize better to novel viewpoints with fewer parameters, and also achieve on-par or better performances with the state-of-the-art Capsule architectures on well-known benchmarking datasets.	翻訳日:2022-11-12 12:38:22 公開日:2020-07-08
# 職場でよく使われる性差別的言明の自動検出 Automatic Detection of Sexist Statements Commonly Used at the Workplace ( http://arxiv.org/abs/2007.04181v1 ) ライセンス: Link先を確認	Dylan Grosz, Patricia Conde-Cespedes	(参考訳) 職場でのヘイトスピーチの検出は、社会的コンテキストが従来のヘイトスピーチの微妙なバージョンを意味するため、ユニークな分類タスクである。最先端の職場性差別検出モデルに関するアプリケーションには、ヒューマンリソース部門の支援、AIチャットボット、感情分析などがある。既存のヘイトスピーチ検出手法のほとんどは、堅牢で正確だが、ソーシャルメディア、特にTwitterで見られるヘイトスピーチに焦点を当てている。ソーシャルメディアの文脈は職場よりもはるかに匿名であるため、セクシズムのより攻撃的で「敵対的な」バージョンに結びつく傾向がある。したがって、大量の"敵対的"性差別を持つデータセットは、"敵対的"性差別的ステートメントが、文脈に関係なく、モデルにセクシストであることを示唆する2つの単語をヒンジできるため、少し簡単に検出できる。本稿では,職場で語られる可能性が高い性差別的発言のデータセットと,最先端の成果を得られる深層学習モデルを提案する。これまでの研究は、単に集約されたtwitterのデータに基づいて「敵意」と「善意」のセクシズムを区別するための最先端のモデルを作ってきた。我々のディープラーニング手法は、GloVeやランダムな単語埋め込みで初期化され、LSTMを使用して、より多様なフィルタリングされたデータセットでこれらのモデルをパフォーマンスし、職場の性差別をより対象とし、F1スコアが0.88になる。 Detecting hate speech in the workplace is a unique classification task, as the underlying social context implies a subtler version of conventional hate speech. Applications regarding a state-of the-art workplace sexism detection model include aids for Human Resources departments, AI chatbots and sentiment analysis. Most existing hate speech detection methods, although robust and accurate, focus on hate speech found on social media, specifically Twitter. The context of social media is much more anonymous than the workplace, therefore it tends to lend itself to more aggressive and "hostile" versions of sexism. Therefore, datasets with large amounts of "hostile" sexism have a slightly easier detection task since "hostile" sexist statements can hinge on a couple words that, regardless of context, tip the model off that a statement is sexist. In this paper we present a dataset of sexist statements that are more likely to be said in the workplace as well as a deep learning model that can achieve state-of-the art results. Previous research has created state-of-the-art models to distinguish "hostile" and "benevolent" sexism based simply on aggregated Twitter data. Our deep learning methods, initialized with GloVe or random word embeddings, use LSTMs with attention mechanisms to outperform those models on a more diverse, filtered dataset that is more targeted towards workplace sexism, leading to an F1 score of 0.88.	翻訳日:2022-11-12 12:37:51 公開日:2020-07-08
# 良性はどんなに良性に満ちているか? How benign is benign overfitting? ( http://arxiv.org/abs/2007.04028v1 ) ライセンス: Link先を確認	Amartya Sanyal, Puneet K Dokania, Varun Kanade, Philip H.S. Torr	(参考訳) 深層ニューラルネットワークにおける敵意的脆弱性の原因として,悪いデータと(おそらく)訓練されたモデルについて検討する。 sgdでトレーニングすると、深層ニューラルネットワークはラベルノイズの存在下でも、基本的にトレーニングエラーをゼロにすると同時に、良性オーバーフィット(benign overfitting)と呼ばれる自然テストデータに対して優れた一般化を示す。しかし、これらのモデルは敵の攻撃に弱い。我々は,ラベルノイズを敵の脆弱性の原因の一つとみなし,これを支持する理論的・実証的な証拠を提供する。驚くべきことに、MNISTやCIFARといったデータセットでラベルノイズのいくつかの例が見つかり、堅牢にトレーニングされたモデルがこれらのいくつかでトレーニングエラーを引き起こしている。しかし、ノイズの多いラベルを除去するだけでは、敵の堅牢性を達成できない。標準的なトレーニング手順は、ニューラルネットワークを"単純な"分類境界の学習に偏らせる。敵の訓練がより複雑な決定境界を生み出すことを観察する。複雑な決定境界の必要性の一部は、準最適表現学習から生じると推測する。単純な玩具の例を用いて、表現の選択が敵の強靭性に大きな影響を与えるか理論的に示す。 We investigate two causes for adversarial vulnerability in deep neural networks: bad data and (poorly) trained models. When trained with SGD, deep neural networks essentially achieve zero training error, even in the presence of label noise, while also exhibiting good generalization on natural test data, something referred to as benign overfitting [2, 10]. However, these models are vulnerable to adversarial attacks. We identify label noise as one of the causes for adversarial vulnerability, and provide theoretical and empirical evidence in support of this. Surprisingly, we find several instances of label noise in datasets such as MNIST and CIFAR, and that robustly trained models incur training error on some of these, i.e. they don't fit the noise. However, removing noisy labels alone does not suffice to achieve adversarial robustness. Standard training procedures bias neural networks towards learning "simple" classification boundaries, which may be less robust than more complex ones. We observe that adversarial training does produce more complex decision boundaries. We conjecture that in part the need for complex decision boundaries arises from sub-optimal representation learning. By means of simple toy examples, we show theoretically how the choice of representation can drastically affect adversarial robustness.	翻訳日:2022-11-12 10:12:02 公開日:2020-07-08
# モデル同定における構造制約に関する事前知識の導入 Incorporating prior knowledge about structural constraints in model identification ( http://arxiv.org/abs/2007.04030v1 ) ライセンス: Link先を確認	Deepak Maurya, Sivadurgaprasad Chinta, Abhishek Sivaram and Raghunathan Rengaswamy	(参考訳) モデル同定は化学産業において重要な問題である。近年,関心システムに関する部分的知識を活用したデータ駆動モデル学習への関心が高まっている。モデル識別のためのほとんどの技術は、モデルの構造のような部分的な情報を組み込む自由を与えていない。本稿では,そのような部分的情報を利用してより良い推定値を生成するモデル同定手法を提案する。具体的には,本モデルに関する本質的な構造情報を利用して,pcaなどの既存手法を改良した構造主成分分析(spca)を提案する。既存の手法や近縁な手法のほとんどは、計算コストのかかる空間的制約を用いる。提案手法は構造情報を利用するためのPCAの賢明な修正である。提案手法の有効性を合成および工業ケーススタディーを用いて実証した。 Model identification is a crucial problem in chemical industries. In recent years, there has been increasing interest in learning data-driven models utilizing partial knowledge about the system of interest. Most techniques for model identification do not provide the freedom to incorporate any partial information such as the structure of the model. In this article, we propose model identification techniques that could leverage such partial information to produce better estimates. Specifically, we propose Structural Principal Component Analysis (SPCA) which improvises over existing methods like PCA by utilizing the essential structural information about the model. Most of the existing methods or closely related methods use sparsity constraints which could be computationally expensive. Our proposed method is a wise modification of PCA to utilize structural information. The efficacy of the proposed approach is demonstrated using synthetic and industrial case-studies.	翻訳日:2022-11-12 10:11:42 公開日:2020-07-08
# Diverse Ensemblesは校正を改善する Diverse Ensembles Improve Calibration ( http://arxiv.org/abs/2007.04206v1 ) ライセンス: Link先を確認	Asa Cooper Stickland and Iain Murray	(参考訳) 現代のディープニューラルネットワークは、特に列車とテストの分布が不一致している場合に、校正された予測を生じることがある。モデルの集合をトレーニングし、予測を平均化することは、これらの問題を緩和するのに役立ちます。アンサンブルメンバーごとに異なるデータ拡張を用いてキャリブレーションを改善するための簡単な手法を提案する。さらに、'mixing'という未拡張および拡張入力のアイデアを用いて、テストとトレーニングの分布が同じであるときに校正を改善する。これらの単純なテクニックは、CIFAR10とCIFAR100ベンチマークの強いベースラインに対する校正と精度を改善し、その破損したバージョンのドメイン外データを改善する。 Modern deep neural networks can produce badly calibrated predictions, especially when train and test distributions are mismatched. Training an ensemble of models and averaging their predictions can help alleviate these issues. We propose a simple technique to improve calibration, using a different data augmentation for each ensemble member. We additionally use the idea of `mixing' un-augmented and augmented inputs to improve calibration when test and training distributions are the same. These simple techniques improve calibration and accuracy over strong baselines on the CIFAR10 and CIFAR100 benchmarks, and out-of-domain data from their corrupted versions.	翻訳日:2022-11-12 10:10:07 公開日:2020-07-08
# 適応部分モジュラー最大化のための線形時間アルゴリズム Linear-Time Algorithms for Adaptive Submodular Maximization ( http://arxiv.org/abs/2007.04214v1 ) ライセンス: Link先を確認	Shaojie Tang	(参考訳) 本稿では,2つの確率的部分モジュラー最大化問題に対する高速アルゴリズムを提案する。まず,濃度制約を満たした適応部分モジュラー最大化問題から始める。近似比(1-1/e-\epsilon)$の線形時間アルゴリズムを開発した。特に、我々のアルゴリズムの時間複雑性は$O(n\log\frac{1}{\epsilon})$(関数評価の数)であり、これは濃度制約とは独立であり、$n$は基底集合のサイズである。次に,完全適応部分モジュラリティの概念を導入し,分割マトロイド制約を受ける完全適応部分モジュラリティ関数を最大化する線形時間アルゴリズムを開発した。 1-1/e-\epsilon}{4-2/e-2\epsilon}$の近似比をo(n\log\frac{1}{\epsilon})$の関数評価のみを用いて達成することを示す。 In this paper, we develop fast algorithms for two stochastic submodular maximization problems. We start with the well-studied adaptive submodular maximization problem subject to a cardinality constraint. We develop the first linear-time algorithm which achieves a $(1-1/e-\epsilon)$ approximation ratio. Notably, the time complexity of our algorithm is $O(n\log\frac{1}{\epsilon})$ (number of function evaluations) which is independent of the cardinality constraint, where $n$ is the size of the ground set. Then we introduce the concept of fully adaptive submodularity, and develop a linear-time algorithm for maximizing a fully adaptive submoudular function subject to a partition matroid constraint. We show that our algorithm achieves a $\frac{1-1/e-\epsilon}{4-2/e-2\epsilon}$ approximation ratio using only $O(n\log\frac{1}{\epsilon})$ number of function evaluations.	翻訳日:2022-11-12 10:09:57 公開日:2020-07-08
# RicciNets:Ricci Flowを用いた高速ニューラルネットワークの曲率誘導プルーニング RicciNets: Curvature-guided Pruning of High-performance Neural Networks Using Ricci Flow ( http://arxiv.org/abs/2007.04216v1 ) ライセンス: Link先を確認	Samuel Glass, Simeon Spasov, Pietro Li\`o	(参考訳) トレーニング前にランダムに配線されたニューラルネットワーク内の有線計算経路を同定する新しい手法を提案する。この計算グラフは、局所グラフ測度で定義されるノード質量確率関数に基づいてプルーニングされ、強化学習ベースの制御ニューラルネットワークによって生成されるハイパーパラメータによって重み付けされる。計算グラフをニューラルネットワークにマッピングする前に,リッチ曲率の定義を用いて,重要度の低いエッジを除去する。我々は,1パスあたりの浮動小数点演算数(flops)の約35\%$の削減を示し,性能の低下はみられなかった。さらに,本手法は,ランダムに繋がったニューラルネットワークを純粋に構造的特性に基づいて正規化することを可能にし,一方のネットワークで識別される好適な特性が他のネットワークに一般化できることを見いだすことができる。この方法では、低マグニチュード重みで刈り取られたものと類似した圧縮下で、より優れた性能のネットワークを生成する。我々の知る限り、これはランダムに配線されたニューラルネットワークをプルーニングする最初の研究であり、プルーニング機構でリッチ曲率のトポロジカル測度を利用する最初の研究である。 A novel method to identify salient computational paths within randomly wired neural networks before training is proposed. The computational graph is pruned based on a node mass probability function defined by local graph measures and weighted by hyperparameters produced by a reinforcement learning-based controller neural network. We use the definition of Ricci curvature to remove edges of low importance before mapping the computational graph to a neural network. We show a reduction of almost $35\%$ in the number of floating-point operations (FLOPs) per pass, with no degradation in performance. Further, our method can successfully regularize randomly wired neural networks based on purely structural properties, and also find that the favourable characteristics identified in one network generalise to other networks. The method produces networks with better performance under similar compression to those pruned by lowest-magnitude weights. To our best knowledge, this is the first work on pruning randomly wired neural networks, as well as the first to utilize the topological measure of Ricci curvature in the pruning mechanism.	翻訳日:2022-11-12 10:09:40 公開日:2020-07-08
# 感情認識のための音声・視覚の時間的アグリゲーション Temporal aggregation of audio-visual modalities for emotion recognition ( http://arxiv.org/abs/2007.04364v1 ) ライセンス: Link先を確認	Andreea Birhala, Catalin Nicolae Ristea, Anamaria Radoi, Liviu Cristian Dutu	(参考訳) 感情認識は感情コンピューティングや人間とコンピュータの相互作用において重要な役割を担っている。現在の技術進歩は、人の感情状態に関するデータを収集する可能性を高めている。一般に、被写体が伝達する感情に関する人間の知覚は、被写体との最初の秒間で収集された声と視覚情報に基づいている。その結果、感情認識に対する現在のアプローチのほとんどにおいて、言語的(つまり、音声)と非言語的(すなわち、画像)の情報の統合が望ましい選択であるように思われる。本稿では,各モダリティに対する時間的オフセットの異なる時間窓からの音声と視覚のモダリティを組み合わせた感情認識のためのマルチモーダル融合手法を提案する。提案手法は,文献や精度評価から,他の手法よりも優れていることを示す。実験は、オープンアクセスマルチモーダルデータセットCREMA-D上で実施される。 Emotion recognition has a pivotal role in affective computing and in human-computer interaction. The current technological developments lead to increased possibilities of collecting data about the emotional state of a person. In general, human perception regarding the emotion transmitted by a subject is based on vocal and visual information collected in the first seconds of interaction with the subject. As a consequence, the integration of verbal (i.e., speech) and non-verbal (i.e., image) information seems to be the preferred choice in most of the current approaches towards emotion recognition. In this paper, we propose a multimodal fusion technique for emotion recognition based on combining audio-visual modalities from a temporal window with different temporal offsets for each modality. We show that our proposed method outperforms other methods from the literature and human accuracy rating. The experiments are conducted over the open-access multimodal dataset CREMA-D.	翻訳日:2022-11-12 10:03:12 公開日:2020-07-08
# 知識グラフに基づく意味融合による会話レコメンダシステムの改善 Improving Conversational Recommender Systems via Knowledge Graph based Semantic Fusion ( http://arxiv.org/abs/2007.04032v1 ) ライセンス: Link先を確認	Kun Zhou, Wayne Xin Zhao, Shuqing Bian, Yuanhang Zhou, Ji-Rong Wen, Jingsong Yu	(参考訳) 対話型推薦システム(CRS)は,対話型対話を通じて高品質な項目をユーザに推薦することを目的としている。 CRSの取り組みはいくつかあるが、2つの大きな問題はまだ解決されていない。まず、会話データ自体にユーザの好みを正確に理解するための十分なコンテキスト情報がない。第二に、自然言語表現とアイテムレベルのユーザ嗜好の間には意味的なギャップがある。これらの問題に対処するために、単語指向とエンティティ指向の知識グラフ(kg)の両方を組み込んでcrssのデータ表現を強化し、単語レベルとエンティティレベルの意味空間を整合させるために相互情報最大化を採用する。協調したセマンティック表現に基づいて、正確なレコメンデーションを行うためのKG強化レコメンデーションコンポーネントと、応答テキストに情報的キーワードやエンティティを生成するKG強化ダイアログコンポーネントをさらに発展させる。提案手法の有効性を実証し,提案手法の有効性を検証した。 Conversational recommender systems (CRS) aim to recommend high-quality items to users through interactive conversations. Although several efforts have been made for CRS, two major issues still remain to be solved. First, the conversation data itself lacks of sufficient contextual information for accurately understanding users' preference. Second, there is a semantic gap between natural language expression and item-level user preference. To address these issues, we incorporate both word-oriented and entity-oriented knowledge graphs (KG) to enhance the data representations in CRSs, and adopt Mutual Information Maximization to align the word-level and entity-level semantic spaces. Based on the aligned semantic representations, we further develop a KG-enhanced recommender component for making accurate recommendations, and a KG-enhanced dialog component that can generate informative keywords or entities in the response text. Extensive experiments have demonstrated the effectiveness of our approach in yielding better performance on both recommendation and conversation tasks.	翻訳日:2022-11-12 10:02:59 公開日:2020-07-08
# 敵対的摂動に頑健な深層ニューラルネットワークの高速学習 Fast Training of Deep Neural Networks Robust to Adversarial Perturbations ( http://arxiv.org/abs/2007.03832v1 ) ライセンス: Link先を確認	Justin Goodwin, Olivia Brown, Victoria Helus	(参考訳) ディープニューラルネットワークは、多くの領域で高速にトレーニングし、うまく一般化することができる。その有望な性能にもかかわらず、ディープネットワークは入力の摂動に対する感性(例えば、敵の例)を示しており、学習した特徴表現はしばしば解釈が困難であり、真の能力と信頼性に関する懸念を提起している。近年の対人訓練における研究は、モデルが対人例に対して最適化される頑健な最適化の形で、摂動に対する性能感性を改善し、より解釈可能な特徴表現を得る能力を示している。しかし、敵対的なトレーニングは、標準(すなわち非ロバスト)トレーニングよりも計算コストが増大し、大規模問題での使用には非現実的になる。最近の研究は、敵の訓練に対する迅速な近似が、無限大規範に縛られた摂動の存在下でトレーニング時間を短縮し、堅牢性を維持することを約束していることを示している。本研究では,本手法がユークリッドのノルムにまで拡張され,ロバストモデルに共通する人間による特徴表現が保たれることを示す。さらに,分散学習方式を用いることで,堅牢な深層ネットワークをトレーニングする時間をさらに短縮できることを示す。高速対人トレーニングは、堅牢な最適化が非現実的と考えられていた機械学習アプリケーションにおいて、セキュリティと説明可能性の向上を提供する、有望なアプローチである。 Deep neural networks are capable of training fast and generalizing well within many domains. Despite their promising performance, deep networks have shown sensitivities to perturbations of their inputs (e.g., adversarial examples) and their learned feature representations are often difficult to interpret, raising concerns about their true capability and trustworthiness. Recent work in adversarial training, a form of robust optimization in which the model is optimized against adversarial examples, demonstrates the ability to improve performance sensitivities to perturbations and yield feature representations that are more interpretable. Adversarial training, however, comes with an increased computational cost over that of standard (i.e., nonrobust) training, rendering it impractical for use in large-scale problems. Recent work suggests that a fast approximation to adversarial training shows promise for reducing training time and maintaining robustness in the presence of perturbations bounded by the infinity norm. In this work, we demonstrate that this approach extends to the Euclidean norm and preserves the human-aligned feature representations that are common for robust models. Additionally, we show that using a distributed training scheme can further reduce the time to train robust deep networks. Fast adversarial training is a promising approach that will provide increased security and explainability in machine learning applications for which robust optimization was previously thought to be impractical.	翻訳日:2022-11-12 10:02:17 公開日:2020-07-08
# 非線形性を持つリニアテンソル投影 Linear Tensor Projection Revealing Nonlinearity ( http://arxiv.org/abs/2007.03912v1 ) ライセンス: Link先を確認	Koji Maruhashi, Heewon Park, Rui Yamaguchi, Satoru Miyano	(参考訳) 次元の縮小は高次元データの学習に有効な方法であり、人間の可読低次元部分空間における決定境界をよりよく理解することができる。主成分分析や線形判別分析のような線形手法は、多くの変数間の相関を捉えることができるが、データ予測において重要な相関を捉えることは保証されていない。さらに、決定境界が強い非線形性を持つ場合、保証はますます困難になる。この問題は、データが変数間の関係を表す行列またはテンソルであるときに悪化する。本研究は,サブスペース内の予測モデルが強い非線形性を持つ場合でも,元のデータ情報を可能な限り保持しつつ,予測精度を最大化する部分空間を探索する学習手法を提案する。これにより、ユーザが知りたがっている予測問題の背後にある変数のグループのメカニズムを、容易に解釈できる。本手法は, 行列やテンソルを含む各種データに適用することにより, 有効性を示す。 Dimensionality reduction is an effective method for learning high-dimensional data, which can provide better understanding of decision boundaries in human-readable low-dimensional subspace. Linear methods, such as principal component analysis and linear discriminant analysis, make it possible to capture the correlation between many variables; however, there is no guarantee that the correlations that are important in predicting data can be captured. Moreover, if the decision boundary has strong nonlinearity, the guarantee becomes increasingly difficult. This problem is exacerbated when the data are matrices or tensors that represent relationships between variables. We propose a learning method that searches for a subspace that maximizes the prediction accuracy while retaining as much of the original data information as possible, even if the prediction model in the subspace has strong nonlinearity. This makes it easier to interpret the mechanism of the group of variables behind the prediction problem that the user wants to know. We show the effectiveness of our method by applying it to various types of data including matrices and tensors.	翻訳日:2022-11-12 10:01:42 公開日:2020-07-08
# binary stochastic filter: 特徴の選択とその先 Binary Stochastic Filtering: feature selection and beyond ( http://arxiv.org/abs/2007.03920v1 ) ライセンス: Link先を確認	Andrii Trelin and Ale\v{s} Proch\'azka	(参考訳) 特徴選択は、データと機械学習モデルを理解する上で最も決定的なツールの1つである。他の方法では、$l^{1}$ペナルティによって引き起こされるスパーシティは、この問題に対する最も単純で最もよく研究されたアプローチの1つである。このような正規化は、重みの空間性やユニットアクティベーションを達成するためにニューラルネットワークで頻繁に使用されるが、特徴選択問題にどのように適用できるかは不明である。この研究は、階層重みの代わりに統計的に機能関与を罰することで、空間規則化をどのように使用できるかを再考することで、ニューラルネットワークを自動で特徴の選択能力で拡張することを目的としている。提案手法は,数種類の古典的手法と比較して高い効率を示し,計算オーバーヘッドを最小限に抑え,既存のアーキテクチャに直接適用できることを示した。さらに、この方法はニューロンのプルーニングやスペクトルデータの重要領域の選択に容易に一般化できる。 Feature selection is one of the most decisive tools in understanding data and machine learning models. Among other methods, sparsity induced by $L^{1}$ penalty is one of the simplest and best studied approaches to this problem. Although such regularization is frequently used in neural networks to achieve sparsity of weights or unit activations, it is unclear how it can be employed in the feature selection problem. This work aims at extending the neural network with ability to automatically select features by rethinking how the sparsity regularization can be used, namely, by stochastically penalizing feature involvement instead of the layer weights. The proposed method has demonstrated superior efficiency when compared to a few classical methods, achieved with minimal or no computational overhead, and can be directly applied to any existing architecture. Furthermore, the method is easily generalizable for neuron pruning and selection of regions of importance for spectral data.	翻訳日:2022-11-12 10:01:28 公開日:2020-07-08
# スクリーニングテストによるスパースベイズ学習の高速化とその応用 Accelerated Sparse Bayesian Learning via Screening Test and Its Applications ( http://arxiv.org/abs/2007.04006v1 ) ライセンス: Link先を確認	Yiping Jiang, Tianshi Chen	(参考訳) 高次元の設定では、スパース構造はメモリと計算の複雑さの点で効率上重要である。線形系では、直交する特徴の過剰完備な辞書が提供される最も簡単な解を見つけることは、通常NPハードであり、代替の近似法を考える必要がある。本稿では,経験的ベイズアプローチとして,LASSOのような固定された先行手法よりも,解の空間性を促進するためにパラメータ化を事前に用いた,疎ベイズ学習を選択する。しかし、スクリーニングテストは、最適な解において係数がゼロであることが保証された特徴のサブセットを迅速に識別することを目的としており、より小さく、より簡単に解決できる問題を得るために、完全な辞書から安全に取り除くことができる。次に、より小さな問題を解き、その後、小さな解をゼロでパディングすることで元の問題の解を復元する。提案手法の性能は,様々なデータセットやアプリケーションで検討する。 In high-dimensional settings, sparse structures are critical for efficiency in term of memory and computation complexity. For a linear system, to find the sparsest solution provided with an over-complete dictionary of features directly is typically NP-hard, and thus alternative approximate methods should be considered. In this paper, our choice for alternative method is sparse Bayesian learning, which, as empirical Bayesian approaches, uses a parameterized prior to encourage sparsity in solution, rather than the other methods with fixed priors such as LASSO. Screening test, however, aims at quickly identifying a subset of features whose coefficients are guaranteed to be zero in the optimal solution, and then can be safely removed from the complete dictionary to obtain a smaller, more easily solved problem. Next, we solve the smaller problem, after which the solution of the original problem can be recovered by padding the smaller solution with zeros. The performance of the proposed method will be examined on various data sets and applications.	翻訳日:2022-11-12 10:00:40 公開日:2020-07-08
# フリーハンド超音波における自動プローブ運動誘導法 Automatic Probe Movement Guidance for Freehand Obstetric Ultrasound ( http://arxiv.org/abs/2007.04480v1 ) ライセンス: Link先を確認	Richard Droste, Lior Drukker, Aris T. Papageorghiou, J. Alison Noble	(参考訳) そこで本研究では, 標準平面獲得のためのリアルタイムプローブ移動指導を行う最初のシステムを提案する。このようなシステムは、オペレーターの専門知識のレベルを低くすることで、世界中の産科超音波スキャンの展開に寄与することができる。本システムは、プローブに取り付けられた慣性測定ユニット(IMU)の超音波ビデオ信号と運動信号を受信し、誘導信号を予測する人工ニューラルネットワークを用いる。 US-GuideNetと呼ばれるネットワークは、標準平面位置への移動(ゴール予測)または専門家のソノグラフィーが実行する次の動き(アクション予測)を予測する。他の超音波応用のための既存のモデルはシミュレーションやファントムで訓練されているが、実際の超音波ビデオを用いてモデルを訓練し、17人のソノグラフィーによる464の定期的な臨床スキャンからプローブ運動データを収集する。 3種類の標準平面に対する評価は、このモデルが目標予測に88.8%、行動予測に90.9%の精度で有用な誘導信号を提供することを示している。 We present the first system that provides real-time probe movement guidance for acquiring standard planes in routine freehand obstetric ultrasound scanning. Such a system can contribute to the worldwide deployment of obstetric ultrasound scanning by lowering the required level of operator expertise. The system employs an artificial neural network that receives the ultrasound video signal and the motion signal of an inertial measurement unit (IMU) that is attached to the probe, and predicts a guidance signal. The network termed US-GuideNet predicts either the movement towards the standard plane position (goal prediction), or the next movement that an expert sonographer would perform (action prediction). While existing models for other ultrasound applications are trained with simulations or phantoms, we train our model with real-world ultrasound video and probe motion data from 464 routine clinical scans by 17 accredited sonographers. Evaluations for 3 standard plane types show that the model provides a useful guidance signal with an accuracy of 88.8% for goal prediction and 90.9% for action prediction.	翻訳日:2022-11-12 09:54:20 公開日:2020-07-08
# ダウンサイドリスク制約を持つ自然アクター批判アルゴリズム A Natural Actor-Critic Algorithm with Downside Risk Constraints ( http://arxiv.org/abs/2007.04203v1 ) ライセンス: Link先を確認	Thomas Spooner and Rahul Savani	(参考訳) リスクに敏感な強化学習に関する既存の研究は、対称とダウンサイドのリスク対策の両方において、政策勾配の直接モンテカルロ推定を用いている。このアプローチは偏りのない勾配推定をもたらすが、時間微分法に比べて高い分散とサンプル効率の低下に苦しむ。本稿では,回帰の下位部分モーメントを指標とした負のリスク回避による予測と制御について検討する。我々は,その非線形性を回避し,下部部分モーメントを上限とする新しいベルマン方程式を導入する。下位部分モーメントに対するこのプロキシが縮小であることを証明し、分散分解によるアルゴリズムの安定性に対する直観を与える。これにより、サンプル効率が良く、部分モーメントのオンライン推定が可能になる。リスクに敏感な制御では、制約されたポリシーを見つけるための近年のアクタークリティカルな手法であるReward Constrained Policy Optimizationを、より低い部分モーメントのプロキシでインスタンス化する。提案手法を自然政策勾配に拡張し,リスクに敏感な強化学習のための3つのベンチマーク問題に対するアプローチの有効性を示す。 Existing work on risk-sensitive reinforcement learning - both for symmetric and downside risk measures - has typically used direct Monte-Carlo estimation of policy gradients. While this approach yields unbiased gradient estimates, it also suffers from high variance and decreased sample efficiency compared to temporal-difference methods. In this paper, we study prediction and control with aversion to downside risk which we gauge by the lower partial moment of the return. We introduce a new Bellman equation that upper bounds the lower partial moment, circumventing its non-linearity. We prove that this proxy for the lower partial moment is a contraction, and provide intuition into the stability of the algorithm by variance decomposition. This allows sample-efficient, on-line estimation of partial moments. For risk-sensitive control, we instantiate Reward Constrained Policy Optimization, a recent actor-critic method for finding constrained policies, with our proxy for the lower partial moment. We extend the method to use natural policy gradients and demonstrate the effectiveness of our approach on three benchmark problems for risk-sensitive reinforcement learning.	翻訳日:2022-11-12 09:54:03 公開日:2020-07-08
# 共同視聴による生音声からの音声表現の学習 Learning Speech Representations from Raw Audio by Joint Audiovisual Self-Supervision ( http://arxiv.org/abs/2007.04134v1 ) ライセンス: Link先を確認	Abhinav Shukla, Stavros Petridis, Maja Pantic	(参考訳) 音声と視覚的モダリティの直感的な相互作用は、クロスモーダルな自己教師付き学習に有用である。この概念は、ビデオのアクション認識や音響シーンの分類といった一般的なオーディオビジュアルタスクで実証されている。しかし、セルフスーパービジョンは視聴覚音声については未検討のままである。生音声波形から自己教師付き音声表現を学習する手法を提案する。音声のみの自己スーパービジョン(情報的オーディオ属性の予測)と視覚的自己スーパービジョン(音声から発話顔を生成する)を組み合わせることで生音声エンコーダを訓練する。 visual pretextタスクは、音声表現を駆動して、唇の動きに関連する情報をキャプチャする。これにより、オーディオエンコーダを視覚情報に富み、エンコーダを視覚的モダリティなしで評価することができる。本手法は,確立された単語分類ベンチマークにおいて,既存の自己教師型音声特徴に対して,競合性能を達成し,ラベルの少ない学習において,他の手法よりも大幅に優れる。また,本手法は教師あり訓練よりも優れており,音声関連タスクの強力な初期化を実現している。本研究は,音声表現を学習するための視聴覚音声におけるマルチモーダル自己スーパービジョンの可能性を示す。 The intuitive interaction between the audio and visual modalities is valuable for cross-modal self-supervised learning. This concept has been demonstrated for generic audiovisual tasks like video action recognition and acoustic scene classification. However, self-supervision remains under-explored for audiovisual speech. We propose a method to learn self-supervised speech representations from the raw audio waveform. We train a raw audio encoder by combining audio-only self-supervision (by predicting informative audio attributes) with visual self-supervision (by generating talking faces from audio). The visual pretext task drives the audio representations to capture information related to lip movements. This enriches the audio encoder with visual information and the encoder can be used for evaluation without the visual modality. Our method attains competitive performance with respect to existing self-supervised audio features on established isolated word classification benchmarks, and significantly outperforms other methods at learning from fewer labels. Notably, our method also outperforms fully supervised training, thus providing a strong initialization for speech related tasks. Our results demonstrate the potential of multimodal self-supervision in audiovisual speech for learning good audio representations.	翻訳日:2022-11-12 09:53:10 公開日:2020-07-08
# 小データセットにおける音素表現学習のための予測符号化モデルの解析 Analysis of Predictive Coding Models for Phonemic Representation Learning in Small Datasets ( http://arxiv.org/abs/2007.04205v1 ) ライセンス: Link先を確認	Mar\'ia Andrea Cruz Bland\'on and Okko R\"as\"anen	(参考訳) 予測符号化を用いたニューラルネットワークモデルは、人間の言語獲得の計算モデルの観点から興味深い。この文献ではいくつかの有望な予測型コーディングベースの学習アルゴリズムが提案されているが、どのように異なる言語に一般化し、データセットサイズをトレーニングするかは現在不明である。また,これらのモデルが効果的な音韻特徴学習者であることを示す一方で,これらのモデルの予測損失関数の最小化が最適音素様表現につながるかどうかも不明である。本研究では,データセットサイズが異なる2つの言語に対する音素識別タスク(abxタスク)における,自己回帰型予測符号化と対比型予測符号化の2つの予測符号化モデルの挙動について検討した。実験では,2つのデータセットとの自己回帰的損失と音素識別スコアとの間に強い相関が認められた。しかし驚いたことに、CPCモデルはトレーニングデータを渡した後既に急速に収束しており、平均すると、その表現は両方の言語でのAPCよりも優れています。 Neural network models using predictive coding are interesting from the viewpoint of computational modelling of human language acquisition, where the objective is to understand how linguistic units could be learned from speech without any labels. Even though several promising predictive coding -based learning algorithms have been proposed in the literature, it is currently unclear how well they generalise to different languages and training dataset sizes. In addition, despite that such models have shown to be effective phonemic feature learners, it is unclear whether minimisation of the predictive loss functions of these models also leads to optimal phoneme-like representations. The present study investigates the behaviour of two predictive coding models, Autoregressive Predictive Coding and Contrastive Predictive Coding, in a phoneme discrimination task (ABX task) for two languages with different dataset sizes. Our experiments show a strong correlation between the autoregressive loss and the phoneme discrimination scores with the two datasets. However, to our surprise, the CPC model shows rapid convergence already after one pass over the training data, and, on average, its representations outperform those of APC on both languages.	翻訳日:2022-11-12 09:52:52 公開日:2020-07-08
# iq-vqa: インテリジェントな視覚的質問応答 IQ-VQA: Intelligent Visual Question Answering ( http://arxiv.org/abs/2007.04422v1 ) ライセンス: Link先を確認	Vatsal Goel, Mohit Chandak, Ashish Anand and Prithwijit Guha	(参考訳) 視覚的な質問に対する回答の分野には大きな進歩があったが、今日のモデルには一貫性がなく、不安定な傾向がある。そこで本研究では,任意のVQAアーキテクチャの一貫性と堅牢性を高めるモデル独立巡回フレームワークを提案する。モデルに元の質問に答えるようにトレーニングし、回答に基づいて含意を生成し、生成した含意に正しく答えることを学びます。循環的枠組みの一部として,任意の問答対から示唆された質問を生成できる新しい含意生成器を提案する。一貫性に関する今後の研究のベースラインとして、新しい人間の注釈付きVQA-Implicationsデータセットを提供する。データセットは、VQA v2.0バリデーションデータセットから作成された3つのタイプの含意(論理的等価性、必要条件、相互排他)を含む約30万の質問で構成されている。 VQAモデルの一貫性をルールベースデータセットで約15%向上し、VQA-Implicationsデータセットで約7%向上し、パフォーマンスを劣化させることなくロバストネスを約2%向上することを示す。さらに,視覚と言語に対するマルチモーダル理解の向上を強調したアテンションマップの改良も定量的に示す。 Even though there has been tremendous progress in the field of Visual Question Answering, models today still tend to be inconsistent and brittle. To this end, we propose a model-independent cyclic framework which increases consistency and robustness of any VQA architecture. We train our models to answer the original question, generate an implication based on the answer and then also learn to answer the generated implication correctly. As a part of the cyclic framework, we propose a novel implication generator which can generate implied questions from any question-answer pair. As a baseline for future works on consistency, we provide a new human annotated VQA-Implications dataset. The dataset consists of ~30k questions containing implications of 3 types - Logical Equivalence, Necessary Condition and Mutual Exclusion - made from the VQA v2.0 validation dataset. We show that our framework improves consistency of VQA models by ~15% on the rule-based dataset, ~7% on VQA-Implications dataset and robustness by ~2%, without degrading their performance. In addition, we also quantitatively show improvement in attention maps which highlights better multi-modal understanding of vision and language.	翻訳日:2022-11-12 09:52:09 公開日:2020-07-08
# autolr: 学習率政策への進化的アプローチ AutoLR: An Evolutionary Approach to Learning Rate Policies ( http://arxiv.org/abs/2007.04223v1 ) ライセンス: Link先を確認	Pedro Carvalho, Nuno Louren\c{c}o, Filipe Assun\c{c}\~ao, Penousal Machado	(参考訳) 適切な学習率の選択は、優れたニューラルネットワークのトレーニングとパフォーマンスにとって最重要である。これまでは、適切な学習率を見つけるためには、経験と試行錯誤に頼る必要があった。現在では、優れた学習率の探索を容易にするような技術自動手法が多数存在する。これらの手法は有効であり、長年にわたって良い結果をもたらしてきたが、一般的な解決策である。つまり、特定のネットワークトポロジに対する学習率の最適化は、ほとんど未調査のままである。本稿では,構造化文法進化を用いたニューラルネットワークアーキテクチャのための学習率スケジューラを進化させるフレームワークであるautolrを提案する。このシステムは、一般的な学習率のベースライン値と比較された学習率ポリシーを発展させるために使用された。その結果、ある進化したポリシーを用いたトレーニングは確立されたベースラインよりも効率的であり、このアプローチはニューラルネットワークのパフォーマンスを改善する有効な手段であることが示唆された。 The choice of a proper learning rate is paramount for good Artificial Neural Network training and performance. In the past, one had to rely on experience and trial-and-error to find an adequate learning rate. Presently, a plethora of state of the art automatic methods exist that make the search for a good learning rate easier. While these techniques are effective and have yielded good results over the years, they are general solutions. This means the optimization of learning rate for specific network topologies remains largely unexplored. This work presents AutoLR, a framework that evolves Learning Rate Schedulers for a specific Neural Network Architecture using Structured Grammatical Evolution. The system was used to evolve learning rate policies that were compared with a commonly used baseline value for learning rate. Results show that training performed using certain evolved policies is more efficient than the established baseline and suggest that this approach is a viable means of improving a neural network's performance.	翻訳日:2022-11-12 09:46:12 公開日:2020-07-08
# Decolonial AI: 人工知能の社会技術的展望としてのDecolonial Theory Decolonial AI: Decolonial Theory as Sociotechnical Foresight in Artificial Intelligence ( http://arxiv.org/abs/2007.04068v1 ) ライセンス: Link先を確認	Shakir Mohamed, Marie-Therese Png, William Isaac	(参考訳) 本稿では,人工知能の進歩の理解と形成におけるクリティカルサイエンス,特にポストコロニアル理論とデコロニアル理論の重要な役割について考察する。人工知能(AI)は、現代社会とその関係を再形成する技術進歩の1つと見なされている。継続的に適応するシステムの設計とデプロイは、極端にポジティブな変化の約束を果たすが、同時に、特にすでに脆弱な人々に対して重大なリスクを生じさせる。価値と権力はこの議論の中心である。デコロニアル理論は、我々の知的、政治的、経済的、社会的世界を形成する権力のパターンを説明するために、歴史的な後見を用いる。 aiコミュニティは、その技術的実践にデコロニアルな批判的アプローチを組み込むことで、研究と技術開発を確立された倫理原則とよりよく一致させることができる先見と戦術を発達させ、イノベーションと科学的進歩の否定的な影響に耐え続ける脆弱な人々を遠ざけることができる。我々は、植民地化の事例である問題のあるアプリケーションを強調し、デコロニアルレンズを使用して、人工知能のデコロニアル分野を形成する3つの戦術を提出する。今後は、AI研究が推進する新たな科学的ブレークスルーとテクノロジーの波が到来し、AIコミュニティが倫理的見地と、私たちにとって利用可能な知的視点の多種多元性を通じて社会的契約を強化することとなり、究極的には、利益と正義を目標に、より大きな幸福を可能にする将来の技術をサポートする。 This paper explores the important role of critical science, and in particular of post-colonial and decolonial theories, in understanding and shaping the ongoing advances in artificial intelligence. Artificial Intelligence (AI) is viewed as amongst the technological advances that will reshape modern societies and their relations. Whilst the design and deployment of systems that continually adapt holds the promise of far-reaching positive change, they simultaneously pose significant risks, especially to already vulnerable peoples. Values and power are central to this discussion. Decolonial theories use historical hindsight to explain patterns of power that shape our intellectual, political, economic, and social world. By embedding a decolonial critical approach within its technical practice, AI communities can develop foresight and tactics that can better align research and technology development with established ethical principles, centring vulnerable peoples who continue to bear the brunt of negative impacts of innovation and scientific progress. We highlight problematic applications that are instances of coloniality, and using a decolonial lens, submit three tactics that can form a decolonial field of artificial intelligence: creating a critical technical practice of AI, seeking reverse tutelage and reverse pedagogies, and the renewal of affective and political communities. The years ahead will usher in a wave of new scientific breakthroughs and technologies driven by AI research, making it incumbent upon AI communities to strengthen the social contract through ethical foresight and the multiplicity of intellectual perspectives available to us; ultimately supporting future technologies that enable greater well-being, with the goal of beneficence and justice for all.	翻訳日:2022-11-12 09:45:57 公開日:2020-07-08
# 散乱合成学習者:対象の発見,属性,アナロジカル推論における関連性 The Scattering Compositional Learner: Discovering Objects, Attributes, Relationships in Analogical Reasoning ( http://arxiv.org/abs/2007.04212v1 ) ライセンス: Link先を確認	Yuhuai Wu, Honghua Dong, Roger Grosse, Jimmy Ba	(参考訳) 本稿では,RPM(Raven's Progressive Matrices)という,リッチな構成構造を含む類似推論タスクに着目した。そこで本研究では,データの構成構造を明らかにするために,ニューラルネットワークを逐次的に構成するscl(slicant compositional learner)を提案する。我々のSCLは2つのRPMデータセットで最先端のパフォーマンスを達成し、48.7%が Balanced-RAVENを、26.4%がPGMを改善した。さらに,本モデルでは,オブジェクトの属性(例えば,形状色,サイズ)とそれらの関係(例えば,進行,結合)の合成表現を検出する。また、合成表現により、SCLはテスト時間領域シフトに対して著しく堅牢になり、以前は見つからなかったアナログに対してゼロショットの一般化を大幅に改善する。 In this work, we focus on an analogical reasoning task that contains rich compositional structures, Raven's Progressive Matrices (RPM). To discover compositional structures of the data, we propose the Scattering Compositional Learner (SCL), an architecture that composes neural networks in a sequence. Our SCL achieves state-of-the-art performance on two RPM datasets, with a 48.7% relative improvement on Balanced-RAVEN and 26.4% on PGM over the previous state-of-the-art. We additionally show that our model discovers compositional representations of objects' attributes (e.g., shape color, size), and their relationships (e.g., progression, union). We also find that the compositional representation makes the SCL significantly more robust to test-time domain shifts and greatly improves zero-shot generalization to previously unseen analogies.	翻訳日:2022-11-12 09:44:52 公開日:2020-07-08
# Few-Shot分類器の精度予測 Predicting the Accuracy of a Few-Shot Classifier ( http://arxiv.org/abs/2007.04238v1 ) ライセンス: Link先を確認	Myriam Bontonou, Louis B\'ethune, Vincent Gripon	(参考訳) 少数ショット学習の文脈では、ラベル付きサンプルの少ないため、検証セットを用いて訓練された分類器の一般化能力を測定することはできない。本稿では,これまで見つからなかったデータに対して,私の分類器は十分に一般化されているか? まず,一般化性能の変動要因を分析した。次に、転送ベースのソリューションの使用事例を調査し、3つの設定を検討する。一数個のラベル付きサンプルしかアクセスできない場所を監督すること。二数個のラベル付きサンプルとラベルなしサンプルのセットの両方にアクセスすることができる半監督 iii) ラベルなしのサンプルしかアクセスできない場所を監督していないこと。各設定に対して,検討された分類器の一般化能力と実証的に相関する合理的な尺度を提案する。また,この単純な尺度を用いて,信頼度の高い一般化を予測できることを示した。標準的な数ショットビジョンデータセットで実験を行います。 In the context of few-shot learning, one cannot measure the generalization ability of a trained classifier using validation sets, due to the small number of labeled samples. In this paper, we are interested in finding alternatives to answer the question: is my classifier generalizing well to previously unseen data? We first analyze the reasons for the variability of generalization performances. We then investigate the case of using transfer-based solutions, and consider three settings: i) supervised where we only have access to a few labeled samples, ii) semi-supervised where we have access to both a few labeled samples and a set of unlabeled samples and iii) unsupervised where we only have access to unlabeled samples. For each setting, we propose reasonable measures that we empirically demonstrate to be correlated with the generalization ability of considered classifiers. We also show that these simple measures can be used to predict generalization up to a certain confidence. We conduct our experiments on standard few-shot vision datasets.	翻訳日:2022-11-12 09:42:50 公開日:2020-07-08

Title

Authors

Abstract

論文公表日・翻訳日

# 進化論理回路によるオープンエンディングフィットネス景観に関する研究

Investigation into Open-Ended Fitness Landscape through Evolutionary Logical Circuits ( http://arxiv.org/abs/2002.00593v2 )

ライセンス: Link先を確認

Masaki Suyama and Kosuke Sato

(参考訳) 累積的な文化進化は、人類が様々な生態学的・人口的環境で繁栄するきっかけとなった。人間が解決する必要のあるタスクの解決策は、クローズドまたはオープンエンドのフィットネスランドスケープの形をとるタスク空間にマッピングされ、前者は文化進化の研究において後者よりも広範囲にモデル化された。本稿では,前回の試行で構築された回路を用いた論理回路を構築するコンピュータシミュレーションを用いて,オープンエンドフィットネスランドスケープをモデル化したArthur and Polak (2006) によるシミュレーションを修正した。このシミュレーションを用いて、オープンエンドフィットネスランドスケープの性質を明らかにするとともに、グループサイズの増大により文化の蓄積速度が向上するかどうかを調べた。その結果, 群サイズは蓄積速度を増加させたが, 期待以上に制限された。また、2種類の蓄積、発明と改良が区別された場合、両者の性質が異なっていた。改良では, 1つのエージェントの生産性がグループサイズの増加とともに低下する凸関数に追従した。発明では、この軌道は急激な増加の連続したパターンを示し、次いで高原を示した。

Cumulative cultural evolution is what made humanity to thrive in various ecological and demographic environments. Solutions to the tasks that humans needed to solve could be mapped onto a task space which could take the form of either closed or open-ended fitness landscape, with the former being modeled more extensively than the latter in studies of cultural evolution. In this article, we modified a simulation by Arthur and Polak (2006) that modeled open-ended fitness landscape by using a computer simulation that builds logical circuits with circuits that were built in earlier trials. We used this simulation to clarify the nature of open-ended fitness landscape and to investigate whether the speed of accumulation of culture is increased by an increase in group size. The results indicated that group size increased the speed of accumulation but is limited than expected. Also, when two types of accumulation, invention and improvement, were distinguished the nature of the two differed. In improvement, the trajectory followed a convex function with productivity of one agent decreasing as group size increased. In invention, the trajectory showed a continuous pattern of rapid increase followed by a plateau.

翻訳日:2023-06-04 20:50:14 公開日:2020-07-08

# 準確率関数を持つ量子相関の雑音適応試験

Noise-adaptive test of quantum correlations with quasiprobability functions ( http://arxiv.org/abs/2002.05840v2 )

ライセンス: Link先を確認

Seung-Woo Lee, Jaewan Kim, Wonmin Son

(参考訳) 本稿では,雑音の存在下での準確率関数の観点から量子相関をテストする手法を提案する。測定の不完全性と熱環境が量子相関に及ぼす影響を分析し,そのノイズ効果を一般化準確率関数の次数パラメータの変化にうまくカプセル化できることを示した。次に、一般化準確率関数を用いて、ベル型不等式という形で雑音適応型絡み込み証を定式化する。驚くべきことに、厳しい雑音下で量子相関を観測できる。本手法は,連続可変系を用いた近時雑音量子プロセッサにおける量子相関の検証に有用である。

We introduce a method for testing quantum correlations in terms of quasiprobability functions in the presence of noise. We analyze the effects of measurement imperfection and thermal environment on quantum correlations and show that their noise effects can be well encapsulated into the change of the order parameter of the generalized quasiprobability function. We then formulate a noise-adaptive entanglement witness in the form of a Bell-type inequality by using the generalized quasiprobability function. Remarkably, it allows us to observe quantum correlations under severe noise. Our method provides a useful tool to test quantum correlations in near-term noisy quantum processors with continuous-variable systems.

翻訳日:2023-06-03 17:09:34 公開日:2020-07-08

# CFTにおけるナヒトマン理論の一般化

A Generalized Nachtmann Theorem in CFT ( http://arxiv.org/abs/2002.12390v2 )

ライセンス: Link先を確認

Sandipan Kundu

(参考訳) ローレンツ符号におけるユニタリ量子場理論のコリエーターは、ある解析性と肯定的性質に従う。 2次元以上のユニタリ CFT の相互作用に対して、これらの性質は一次作用素の OPE に現れる極小ツイスト作用素の族に一般的な制約を与えることを示す。特に、任意のスカラー一次の反射対称 OPE に偶数スピンが現れる極小ツイスト作用素の族に対して、ツイストはスピンの単調に増大する凸函数でなければならないという凸定理を導いて拡張する。我々の議論は完全に非摂動的であり、ユニタリ cft における非同一のスカラープライマリの ope にも当てはまり、ope に現れる回転作用素のねじれを制限している。最後に、同じ手法が特定のCFT相関器のRegge動作にも制約を課していると論じる。

Correlators of unitary quantum field theories in Lorentzian signature obey certain analyticity and positivity properties. For interacting unitary CFTs in more than two dimensions, we show that these properties impose general constraints on families of minimal twist operators that appear in the OPEs of primary operators. In particular, we rederive and extend the convexity theorem which states that for the family of minimal twist operators with even spins appearing in the reflection-symmetric OPE of any scalar primary, twist must be a monotonically increasing convex function of the spin. Our argument is completely non-perturbative and it also applies to the OPE of nonidentical scalar primaries in unitary CFTs, constraining the twist of spinning operators appearing in the OPE. Finally, we argue that the same methods also impose constraints on the Regge behavior of certain CFT correlators.

翻訳日:2023-06-01 12:29:21 公開日:2020-07-08

# qsw_mpi: 量子確率歩行の並列シミュレーションのためのフレームワーク

QSW_MPI: a framework for parallel simulation of quantum stochastic walks ( http://arxiv.org/abs/2003.02450v2 )

ライセンス: Link先を確認

Edric Matwiejew and Jingbo Wang

(参考訳) QSW_MPIは連続時間量子確率ウォークの時系列シミュレーションのために開発されたピソンパッケージである。このモデルは、連続時間ランダムウォークと連続時間量子ウォークの一般化を含むリンドブラッド形式論におけるマルコフ開量子系の研究を可能にする。 QSW_MPIは、並列化されたFortranライブラリにアクセスするピソンインタフェースで、スパースデータ構造を利用するため、大規模並列コンピュータにスケーラブルであり、任意の複雑さのグラフと無方向性グラフ上の幅広いウォークダイナミクスのシミュレーションを可能にする。

QSW_MPI is a python package developed for time-series simulation of continuous-time quantum stochastic walks. This model allows for the study of Markovian open quantum systems in the Lindblad formalism, including a generalisation of the continuous-time random walk and continuous-time quantum walk. Consisting of a python interface accessing parallelised Fortran libraries utilising sparse data structures, QSW_MPI is scalable to massively parallel computers, which makes possible the simulation of a wide range of walk dynamics on directed and undirected graphs of arbitrary complexity.

翻訳日:2023-05-30 11:44:17 公開日:2020-07-08

# 1次元光学格子におけるスピン軌道結合によるスタガードフラックスの有効三角形ラダー

Effective triangular ladders with staggered flux from spin-orbit coupling in 1D optical lattices ( http://arxiv.org/abs/2003.04154v3 )

ライセンス: Link先を確認

Josep Cabedo, Joan Claramunt, Jordi Mompart, Ver\`onica Ahufinger and Alessio Celi

(参考訳) 光誘起スピン軌道結合は、超低温原子で量子磁気を研究する柔軟なツールである。本研究では,1次元光学格子中のスピン軌道結合ボース気体を,ハミルトニアンの最低帯域切断後のスタガードフラックスを持つ2脚三角ラダーにマッピング可能であることを示す。有効フラックスとトンネル強度の比は、広範囲の値に独立に調整することができる。ハードコアボソン近似が保持するパラメータの特定の構造を特定し,可変磁束を持つフラストレーション三角形スピンラダーを実現する。実効スピンハミルトニアンの性質を密度行列再正規化法を用いて検討し, 半充填時の位相図を決定する。均一な超流動とボンドオーダー絶縁体という2つの相を示す。後者は低ラマン調律でのみ安定化できる。最後に、予測相転移を横断するSOC系のパラメータ空間にわたって実験可能な軌道を提供する。

Light-induced spin-orbit coupling is a flexible tool to study quantum magnetism with ultracold atoms. In this work we show that spin-orbit coupled Bose gases in a one-dimensional optical lattice can be mapped into a two-leg triangular ladder with staggered flux following a lowest-band truncation of the Hamiltonian. The effective flux and the ratio of the tunneling strengths can be independently adjusted to a wide range of values. We identify a certain regime of parameters where a hard-core boson approximation holds and the system realizes a frustrated triangular spin ladder with tunable flux. We study the properties of the effective spin Hamiltonian using the density-matrix renormalization-group method and determine the phase diagram at half-filling. It displays two phases: a uniform superfluid and a bond-ordered insulator. The latter can be stabilized only for low Raman detuning. Finally, we provide experimentally feasible trajectories across the parameter space of the SOC system that cross the predicted phase transition.

翻訳日:2023-05-30 03:15:06 公開日:2020-07-08

# ドメインウォール非線形量子化

Domain wall nonlinear quantization ( http://arxiv.org/abs/2003.05387v3 )

ライセンス: Link先を確認

M. G. Ivanov

(参考訳) 領域壁の非線形量子化(コディメンション1の相対論的膜)を考える。膜塵の方程式はハミルトン・ヤコビ方程式の類似物と見なされ、量子アナログを構成することができる。結果として得られる方程式は非線形クライン・フォック・ゴルドン方程式の形を持つ。これは量子領域の壁に対する平均場近似と解釈できる。分散関係は(線形近似で)小さな摂動に対して得られる。摂動の群速度は光の速度を超えない。ドメイン壁に沿って伝播する摂動に対して、(古典的な場合のように)質量を持たないモードに加えて、大きなものが現れます。この結果は凝縮物質理論や超弦理論や超重力理論における膜量子化において興味深い。

The nonlinear quantization of the domain wall (relativistic membrane of codimension 1) is considered. The membrane dust equation is considered as an analogue of the Hamilton-Jacobi equation, which allows us to construct its quantum analogue. The resulting equation has the form of a nonlinear Klein-Fock-Gordon equation. It can be interpreted as the mean field approximation for a quantum domain wall. Dispersion relations are obtained for small perturbations (in a linear approximation). The group speed of perturbations does not exceed the speed of light. For perturbations propagating along the domain wall, in addition to the massless mode (as in the classical case), a massive one appears. The result may be interesting in condensed matter theory and in membrane quantization in superstring and supergravity theories.

翻訳日:2023-05-29 11:10:38 公開日:2020-07-08

# 単位非マルコフ量子進化における可逆性緩和

Irreversibility mitigation in unital non-Markovian quantum evolutions ( http://arxiv.org/abs/2004.04619v2 )

ライセンス: Link先を確認

Stefano Gherardini, Stefano Marcantoni, Filippo Caruso

(参考訳) 熱力学的エントロピー生成と非マルコフ進化の関係は、現在の研究課題である。本稿では,開量子系における確率エントロピー生成の挙動について考察する。特に、パウリ流の族について、量子力学がP-分割できないことを仮定して、平均エントロピー生成と分散の両方が特定の時間間隔で減少することを示す。系の力学は全体的に可逆的であるが、この結果は可逆性への過渡的傾向として解釈され、エントロピー生成のデルタピーク分布として0付近で説明される。最後に、量子系ダイナミクスを発生させる発電機のパラメータに関する解析的境界も提供し、対応する非マルコフ進化の可逆的緩和を確実にする。

The relation between the thermodynamic entropy production and non-Markovian evolutions is matter of current research. Here, we study the behavior of the stochastic entropy production in open quantum systems undergoing unital non-Markovian dynamics. In particular, for the family of Pauli channels we show that in some specific time intervals both the average entropy production and the variance can decrease, provided that the quantum dynamics fails to be P-divisible. Although the dynamics of the system is overall irreversible, our result may be interpreted as a transient tendency towards reversibility, described as a delta peaked distribution of entropy production around zero. Finally, we also provide analytical bounds on the parameters in the generator giving rise to the quantum system dynamics, so as to ensure irreversibility mitigation of the corresponding non-Markovian evolution.

翻訳日:2023-05-25 08:42:18 公開日:2020-07-08

# 局在ディラック波動関数のリバウンド運動

Rebound Motion of Localized Dirac Wavefunctions ( http://arxiv.org/abs/2004.07938v2 )

ライセンス: Link先を確認

Domenico P.L. Castrigiano

(参考訳) 有界局所化自由ディラック波動関数のキャリアは無限度から縮小し、その後再び無限度へと拡大する。運動は光の速度で等方的に起こる。その間にはリバウンドの位相があり、これはキャリアの直径の順に最小の延長で時間と空間に制限される。この動きは、空間のあらゆる方向に特定の時間があり、収縮から膨張への変化が瞬時に起こるように、異方的に突然進行する。漸近的に、過去と未来に関しても、位置の確率は、光速で外半径が増加する球殻の中で最大1に集中する。

It is shown that the carrier of a bounded localized free Dirac wavefunction shrinks from infinity and subsequently expands to infinity again. The motion occurs isotropicly at the speed of light. In between there is the phase of rebound, which is limited in time and space in the order of the diameter of the carrier at its minimal extension. This motion proceeds anisotropicly and abruptly as for every direction in space there is a specific time, at which the change from shrinking to expanding happens instantaneously. Asymptotically, regarding the past and the future as well, the probability of position concentrates up to 1 within any spherical shell whose outer radius increases at light speed.

翻訳日:2023-05-23 08:51:10 公開日:2020-07-08

# 有限温度における振動スペクトルのオンザフライ半古典的評価

On-the-fly ab initio semiclassical evaluation of vibronic spectra at finite temperature ( http://arxiv.org/abs/2005.09126v2 )

ライセンス: Link先を確認

Tomislav Begu\v{s}i\'c and Ji\v{r}\'i Van\'i\v{c}ek

(参考訳) 振動分解された電子スペクトルをゼロ温度で計算・解析するため,我々は最近,非調和性,モードモード結合,ヘルツベルグ-テラー効果を考慮し,オン・ザ・フライ ab initio extended thawed gaussian approximation [a. patoz et al., j. phys. chem. lett. 9, 2367 (2018)] を実装した。非零温度でのスペクトル評価のために,本手法を一般化する。熱場力学と並行して、密度行列のコヒーレンス成分のフォン・ノイマン進化を2倍の自由度を持つ拡張空間における波動関数のschr\"{o}dinger進化に変換する。拡張解法ガウス近似の効率により、この座標数の増加は計算コストをほとんど増やさない。より具体的には、元々のゼロ温度アプローチと比較して、有限温度法は追加のab initio電子構造計算を必要としない。同時に、新しいアプローチはスペクトルに対する有限温度、非調和性、ヘルツベルク・テラー効果を明確に区別することができる。モデルモース系において、一般的に用いられる大域高調波法に対する有限温度ソードガウス近似の利点を示し、上記のすべての効果が寄与するベンゼンの対称性禁止吸収スペクトルの評価に応用する。

To compute and analyze vibrationally resolved electronic spectra at zero temperature, we have recently implemented the on-the-fly ab initio extended thawed Gaussian approximation [A. Patoz et al., J. Phys. Chem. Lett. 9, 2367 (2018)], which accounts for anharmonicity, mode-mode coupling, and Herzberg-Teller effects. Here, we generalize this method in order to evaluate spectra at non-zero temperature. In line with thermo-field dynamics, we transform the von Neumann evolution of the coherence component of the density matrix to the Schr\"{o}dinger evolution of a wavefunction in an augmented space with twice as many degrees of freedom. Due to the efficiency of the extended thawed Gaussian approximation, this increase in the number of coordinates results in nearly no additional computational cost. More specifically, compared to the original, zero-temperature approach, the finite-temperature method requires no additional ab initio electronic structure calculations. At the same time, the new approach allows for a clear distinction among finite-temperature, anharmonicity, and Herzberg-Teller effects on spectra. We show, on a model Morse system, the advantages of the finite-temperature thawed Gaussian approximation over the commonly used global harmonic methods and apply it to evaluate the symmetry-forbidden absorption spectrum of benzene, where all of the aforementioned effects contribute.

翻訳日:2023-05-19 10:54:28 公開日:2020-07-08

# 偏光中性子を用いた電界イメージング

Electric field imaging using polarized neutrons ( http://arxiv.org/abs/2006.03728v2 )

ライセンス: Link先を確認

Yuan-Yu Jau, Daniel S. Hussey, Thomas R. Gentile, and Wangchun Chen

(参考訳) 実験では、電気的に中立な粒子である中性子を用いて、分離または占有できる標的体積内の静電場を直接可視化できることを実証する。感度ポラリメトリー方式の多色スピン偏光中性子ビームを用いて電界画像を得た。この研究は、他の従来のプローブではアクセスできない物体の空間依存的な電場を撮像することにより、電位、電気分極、電荷分布、誘電率の新たな診断力を可能にする。

We experimentally demonstrate that electrically neutral particles, neutrons, can be used to directly visualize the electrostatic field inside a target volume that can be isolated or occupied. Electric-field images were obtained using a polychromatic, spin-polarized neutron beam with a sensitive polarimetry scheme. This work may enable new diagnostic power of the structure of electric potential, electric polarization, charge distribution, and dielectric constant by imaging spatially dependent electric fields in objects that cannot be accessed by other conventional probes.

翻訳日:2023-05-17 02:00:34 公開日:2020-07-08

# 単光子付加コヒーレント状態の非古典性に及ぼすポストセレクトフォンノイマン測定の影響

Effects of postselected von Neumann measurement on nonclassicality of single-photon-added coherent state ( http://arxiv.org/abs/2006.08081v2 )

ライセンス: Link先を確認

Yusuf Turek

(参考訳) 単光子付加コヒーレント状態(SPACS)の非古典性に対するフォン・ノイマン後測定の影響について検討した。光子数分布,マンデルq_{m}因子,および電界二次のスクイーズパラメータといった空間の種々の場特性に対するフォン・ノイマン測定後の明示的な表現と解析結果について検討した。その結果, 測定後のSPACSの非古典性は初期状態よりも劇的に変化した。この測定により、SPACSはより強いポアソニアン光子統計を、一定の結合強度と、ポストセレクション確率の低い大きな弱い値で保持することができた。

The effects of von Neumann postselected measurement on nonclassicality of single-photon-added coherent state (SPACS) are studied. Explicit expressions and analytical results for various field properties of SPACS such as the photon number distribution, the Mandel Q_{m} factor and the squeezing parameter of field quadrature after postselected von Neumann measurement are investigated. The results showed that the nonclassicality of SPACS after measurement changed dramatically than initial state. The measurement let SPACS possess more strong sub-Poissonian photon statistics in some definite coupling strength regimes and large weak values which accompanied by low postselection probabilities.

翻訳日:2023-05-13 20:36:48 公開日:2020-07-08

# 周期的に焼成された2脚はしごにおける偶整数位相不変量をもつ非エルミート的フロケ位相

Non-Hermitian Floquet phases with even-integer topological invariants in a periodically quenched two-leg ladder ( http://arxiv.org/abs/2006.08897v2 )

ライセンス: Link先を確認

Longwen Zhou

(参考訳) 周期的に駆動される非エルミート系は、独自のトポロジカル、動的、輸送特性を持つエキゾチックな非平衡相を持つことができる。本研究では,拡張cii対称性クラスに属する時間周期的クエンチと非エルミート効果の両方を対象とする実験的に実現可能な2脚ラダーモデルを提案する。駆動と非相互性の相互作用により、豊富な非エルミートフロケット位相相が系内に出現し、それぞれが偶数整数位相不変量 $(w_{0},w_{\pi})\in2\mathbb{z}\times2\mathbb{z}$ によって特徴づけられる。開境界条件の下では、これらの不変量はさらにシステムの端辺に局在するゼロおよび$\pi$-quasienergyモードの数を予測する。最終的に、CII対称性クラスにおける非エルミートフロケ位相の位相不変量に対する動的プローブとして使用できる平均キラル変位の一般化版を構築した。そこで本研究では,非エルミートフロッケ位相問題の新しいタイプを導入し,さらに,駆動開放系におけるトポロジーとダイナミクスの豊かさを明らかにした。

Periodically driven non-Hermitian systems could possess exotic nonequilibrium phases with unique topological, dynamical and transport properties. In this work, we introduce an experimentally realizable two-leg ladder model subjecting to both time-periodic quenches and non-Hermitian effects, which belongs to an extended CII symmetry class. Due to the interplay between drivings and nonreciprocity, rich non-Hermitian Floquet topological phases emerge in the system, with each of them been characterized by a pair of even-integer topological invariants $(w_{0},w_{\pi})\in2\mathbb{Z}\times2\mathbb{Z}$. Under the open boundary condition, these invariants further predict the number of zero- and $\pi$-quasienergy modes localized around the edges of the system. We finally construct a generalized version of the mean chiral displacement, which could be employed as a dynamical probe to the topological invariants of non-Hermitian Floquet phases in the CII symmetry class. Our work thus introduces a new type of non-Hermitian Floquet topological matter, and further reveals the richness of topology and dynamics in driven open systems.

翻訳日:2023-05-13 18:18:21 公開日:2020-07-08

# ランダムテンソルネットワークにおけるpetz再構成

Petz reconstruction in random tensor networks ( http://arxiv.org/abs/2006.12601v2 )

ライセンス: Link先を確認

Hewei Frederic Jia, Mukund Rangamani

(参考訳) ホログラフィーのランダムテンソルネットワーク玩具モデルにおけるバルク再構成の考え方について述べる。具体的には、ペッツ再構成マップが、レプリカのトリックを利用して境界データからバルク演算子を得る方法を示す。また,粗粒化とランダム射影の違いについてもコメントする機会を得た。

We illustrate the ideas of bulk reconstruction in the context of random tensor network toy models of holography. Specifically, we demonstrate how the Petz reconstruction map works to obtain bulk operators from the boundary data by exploiting the replica trick. We also take the opportunity to comment on the differences between coarse-graining and random projections.

翻訳日:2023-05-13 04:39:27 公開日:2020-07-08

# 新型コロナウイルスのリスクを見積もる個人別健康トークン

Differentially Private Health Tokens for Estimating COVID-19 Risk ( http://arxiv.org/abs/2006.14329v2 )

ライセンス: Link先を確認

David Butler, Chris Hicks, James Bell, Carsten Maple, Jon Crowcroft

(参考訳) Covid-19との戦いにおいて、多くの政府や企業がいわゆる免疫パスポートを評価し、試行し、実施している。抗体や健康証明書としても知られており、他の人を危険にさらすことなく仕事や混雑した場所に戻れる技術には明確な需要がある。このようなシステムに対する大きな批判の1つは、免疫のない人々に対して不当に差別するために誤用される可能性があり、「免疫特権を持つ」人々の集団を形成することを許している。この作業では、設計によって差別的でない代替の技術的ソリューションを探究する動機があります。特に私たちは、個々のテスト結果がランダム化され、有用な集計リスク見積が計算できるような、ランダム化された健康証明書を提案します。健康トークンは,少人数の利用者による集団感染リスクを推定する有効なメカニズムを示しながら,免疫に基づく差別を緩和できることを示した。我々は、アイデンティティフリーおよびIDバインディングユースケースの文脈において、我々のアプローチの生存可能性を評価し、多くの攻撃の可能性を検討する。実験の結果,500以上のグループでは,平均で0.03 % 以下の誤差があり,複数のアイデンティティフリーな文脈において,集約された結果が有用であることがわかった。最後に,我々のソリューションの実用性を示すオープンソースプロトタイプの結果を示す。

In the fight against Covid-19, many governments and businesses are in the process of evaluating, trialling and even implementing so-called immunity passports. Also known as antibody or health certificates, there is a clear demand for any technology that could allow people to return to work and other crowded places without placing others at risk. One of the major criticisms of such systems is that they could be misused to unfairly discriminate against those without immunity, allowing the formation of an `immuno-privileged' class of people. In this work we are motivated to explore an alternative technical solution that is non-discriminatory by design. In particular we propose health tokens -- randomised health certificates which, using methods from differential privacy, allow individual test results to be randomised whilst still allowing useful aggregate risk estimates to be calculated. We show that health tokens could mitigate immunity-based discrimination whilst still presenting a viable mechanism for estimating the collective transmission risk posed by small groups of users. We evaluate the viability of our approach in the context of identity-free and identity-binding use cases and then consider a number of possible attacks. Our experimental results show that for groups of size 500 or more, the error associated with our method can be as low as 0.03 on average and thus the aggregated results can be useful in a number of identity-free contexts. Finally, we present the results of our open-source prototype which demonstrates the practicality of our solution.

翻訳日:2023-05-12 20:04:38 公開日:2020-07-08

# 開量子システムにおけるkrotovアルゴリズムのコスト関数の単調増加の証明

Proof of monotonic increase in the cost function for Krotov algorithm for open quantum systems ( http://arxiv.org/abs/2006.16817v2 )

ライセンス: Link先を確認

Tejas Shetty

(参考訳) 多くの量子制御論文では、クロトフの単調収束変分制御アルゴリズム(maday and tririnici (2003), tannor et al. (1992), zhu and rabitz (1998)など)の変種の一つを用いている。 N, Suriらによる「オープン量子系変分最適化による熱化の高速化」に関する論文。 [EPJST 227, 203 -216 (2018), arXiv:1711.08776] は、オープン量子系に対して Krotov アルゴリズムを実行する方法を提供する。我々は,論文の定理1を証明し,同一の付録1に記載された簡潔な処理に大きく拡張する。

A great number of quantum control papers have used one of the variants of the monotonically convergent variational control algorithm of Krotov (as described in Maday and Turinici (2003), Tannor et al. (1992), Zhu and Rabitz (1998), etc). The paper "Speeding up Thermalisation via Open Quantum System Variational Optimisation" by N, Suri, et al. [EPJST 227, 203 -216 (2018), arXiv:1711.08776] provides us a way of carrying out Krotov algorithm for open quantum systems. We shall prove the Theorem 1 of the paper, greatly expanding upon the brief treatment given in appendix 1 of the same.

翻訳日:2023-05-12 03:20:04 公開日:2020-07-08

# 信号検出理論を用いた高校環境における槍フィッシング感受性の定量化

Quantifying Susceptibility to Spear Phishing in a High School Environment Using Signal Detection Theory ( http://arxiv.org/abs/2006.16380v2 )

ライセンス: Link先を確認

Ploy Unchit, Sanchari Das, Andrew Kim, L. Jean Camp

(参考訳) スピアフィッシング(英:spear phishing)は、社会工学を用いて標的の被害者を標的とした機密情報を取得する詐欺攻撃である。特定の犠牲者を狙うために社会的手がかりとパーソナライズされた情報を使用することで区別される。スピアフィッシングに対するレジリエンスに関する以前の研究は、学生に不釣り合いに焦点を合わせながら、コンビニエンスサンプルに焦点を当てている。対照的に,本稿では,高校生コミュニティの評価について報告する。信号検出理論(SDT)を用いた研究には,高校生57名と教員(高校生12名,職員45名)が参加した。シナリオベースの分析を通じて、参加者はフィッシングメールと本物のメールを区別する作業を行った。その結果, 技術背景に関わらず, 自己検出における自信の偏りがみられた。これらの知見は,過疎な集団の意思決定を評価する上で重要であり,ヒトの感受性を調べることで,潜在的な槍フィッシング攻撃から人々を保護している。

Spear phishing is a deceptive attack that uses social engineering to obtain confidential information through targeted victimization. It is distinguished by its use of social cues and personalized information to target specific victims. Previous work on resilience to spear phishing has focused on convenience samples, with a disproportionate focus on students. In contrast, here, we report on an evaluation of a high school community. We engaged 57 high school students and faculty members (12 high school students, 45 staff members) as participants in research utilizing signal detection theory (SDT). Through scenario-based analysis, participants tasked with distinguishing phishing emails from authentic emails. The results revealed an overconfidence bias in self-detection from the participants, regardless of their technical background. These findings are critical for evaluating the decision-making of underrepresented populations and protecting people from potential spear phishing attacks by examining human susceptibility.

翻訳日:2023-05-12 03:19:26 公開日:2020-07-08

# コンピュータを信頼できる時(そしてできないとき)

When we can trust computers (and when we can't) ( http://arxiv.org/abs/2007.03741v1 )

ライセンス: Link先を確認

Peter V. Coveney and Roger R. Highfield

(参考訳) コンピュータパワーの絶え間ない上昇により、コンピュータは科学の最も圧力のかかる問題を、さらにさらに解決できると広く期待されている。計算モデリングの限界を探求し、比較的単純で理論に固執した科学と工学の領域において、これらの手法は確かに強力である、と結論づける。それでも、コード、データ、ドキュメントの可用性は、検証、検証、不確実性定量化といった様々な技術とともに、コンピュータが生成した発見に対する信頼を構築する上で不可欠である。科学分野の複雑なシステム、特に生物学や医学において、社会科学や人文科学について何も言わずに、理論に固執していない場合、コンピュータは客観性の錯覚を生み出すことができる。また,デジタル手法では解決できない自然界の重要な側面についても論じる。長期的には、デジタル計算に現在置かれている過度な信頼を誘惑するために、アナログ手法に重点を置く必要がある。

With the relentless rise of computer power, there is a widespread expectation that computers can solve the most pressing problems of science, and even more besides. We explore the limits of computational modelling and conclude that, in the domains of science and engineering that are relatively simple and firmly grounded in theory, these methods are indeed powerful. Even so, the availability of code, data and documentation, along with a range of techniques for validation, verification and uncertainty quantification, are essential for building trust in computer generated findings. When it comes to complex systems in domains of science that are less firmly grounded in theory, notably biology and medicine, to say nothing of the social sciences and humanities, computers can create the illusion of objectivity, not least because the rise of big data and machine learning pose new challenges to reproducibility, while lacking true explanatory power. We also discuss important aspects of the natural world which cannot be solved by digital means. In the long-term, renewed emphasis on analogue methods will be necessary to temper the excessive faith currently placed in digital computation.

翻訳日:2023-05-10 23:45:55 公開日:2020-07-08

# ITForensics Managementのためのアジャイルアプローチ

Agile Approach for IT Forensics Management ( http://arxiv.org/abs/2007.04125v1 )

ライセンス: Link先を確認

Matthias Schopp, Peter Hillmann

(参考訳) サイバー攻撃とITインシデントに関する法医学的な調査は、複雑化とネットワークの強化によりますます難しくなってきている。特にAdvanced Attacks(AT)では、Advanced Persistent Threatsのようなアジャイルアプローチは不可欠です。複数のシステムが攻撃(マルチホスト攻撃)に関与している。現在の法医学モデルと手続きは、そのような攻撃を分析する過程においてかなりの欠陥を示している。そこで本稿では,アジャイル手法を用いて新たな法医学的管理手法を形成する新しいフラワーモデルを提案する。このように、ATの増大する課題は満たされている。このような攻撃の法医学的な調査では、分析が必要なデータ量のために、ビッグデータの問題を解決する必要がある。提案したモデルは、早期の状態で答えるべき質問を正確に定義し、これらの質問に答えるために必要な裁判所の手続きで利用可能な証拠のみを収集することによって、この要件を満たす。さらに, 調査プロセスの異なる段階に対応する新しいAT花モデルが提示される。

The forensic investigation of cyber attacks and IT incidents is becoming increasingly difficult due to increasing complexity and intensify networking. Especially with Advanced Attacks (AT) like the increasing Advanced Persistent Threats an agile approach is indispensable. Several systems are involved in an attack (multi-host attacks). Current forensic models and procedures show considerable deficits in the process of analyzing such attacks. For this purpose, this paper presents the novel flower model, which uses agile methods and forms a new forensic management approach. In this way, the growing challenges of ATs are met. In the forensic investigation of such attacks, big data problems have to be solved due to the amount of data that needs to be analyzed. The proposed model meets this requirement by precisely defining the questions that need to be answered in an early state and collecting only the evidence usable in court proceedings that is needed to answer these questions. Additionally, the novel flower model for AT is presented that meets the different phases of an investigation process.

翻訳日:2023-05-10 23:40:59 公開日:2020-07-08

# 問題解決スキルとしての計算思考に関する研究--工学と社会科学の知識に基づく比較

Study on Computational Thinking as Problem-solving Skill: Comparison Based on Students Mindset in Engineering and Social Science ( http://arxiv.org/abs/2007.04060v1 )

ライセンス: Link先を確認

Andik Asmara

(参考訳) 21世紀のスキルを強制する能力の1つは、最上位の地位となる批判的思考と問題解決のスキルである。問題解決能力に重点を置くことは子供に教えることができ、特に小学校ではk-12の先行研究に注目する。計算思考(Computational Thinking)は、この10年で広く採用され、研究された問題解決技術である。本研究は, 計算的思考法を活用できる可能性に基づいて, 課題を解決できる学生の能力を検討することを目的とした。この研究の参加者は、台湾で学ぶ6人の国際学生と、工学と社会科学の2つのdeferent sciencesの学生であった。データインタビューの分析には質的手法が用いられ、気候変動という世界的な問題から事例を取り上げている。その結果、新しい環境で生き残ることが、彼らの問題解決スキルの実施の証拠となった。工学と社会科学の両方の学生は違いがあり、アルゴリズムに正確な構造を使う方法がある。

One of the capabilities which 21st-century skill compulsory a person is critical thinking and problem-solving skill that becomes top positions rank. Focus on problem-solving skills can be taught to a child, especially begun in elementary school refer to prior research focus on K-12. Computational thinking was one problem-solving skill that popular to implemented and studied in the current decade. This study was conducted to explore students' capability to be able solving of the problem based on the possibility use the computational thinking way. Participants in this study came from six international students that study in Taiwan and from two deferent sciences disciplines, engineering, and social science. A qualitative method was used to analyze data interviews, took example cases from the global issue that is Climate Change. The result founded that survive in a new environment was become evidence of their implementation of problem-solving skills. Problem-solving mindset both students of engineering and social science had discrepancy, those are how to use precise structure in the algorithm.

翻訳日:2023-05-10 23:40:39 公開日:2020-07-08

# 光渦中の原子の四極子吸収速度と軌道角運動量移動

Quadrupole absorption rate and orbital angular momentum transfer for atoms in optical vortices ( http://arxiv.org/abs/2007.04021v1 )

ライセンス: Link先を確認

Smail Bougouffa and Mohamed Babiker

(参考訳) 四極子遷移における光渦と原子の相互作用に関する最近の実験は、原子の電子状態と光渦場の間の軌道角運動量(oam)の交換を伴うことが示されている。理論と実験の両方による以前の研究は、電気双極子原子遷移における渦OAMの電子的自由度への移動を排除しており、電子運動へのOAMの移動を含む最も低い多極性秩序が電気四極子であることが確認されている。光渦を含む四極子転移は定量化されていないため、セシウム原子が線形偏光渦の場下にあるcsの6^2s_{1/2}\rightarrow 5^2d_{5/2}$を参照して、oam転移に伴う吸収率を評価する。実験により, 適度な光強度の吸収速度は, 四重極自発放射率よりも小さいが, 現代の分光法では測定範囲に留まることが示唆された。

Recent experiments involving the interaction of optical vortices with atoms in quadrupole transitions have been shown to be accompanied by the exchange of orbital angular momentum (OAM) between the electronic states of the atom and the optical vortex field. Earlier work by both theory and experiment had ruled out the transfer of a vortex OAM to the electronic degrees of freedom in an electric dipole atomic transition and it has been confirmed that the lowest multipolar order involving an OAM transfer to the electronic motion is indeed the electric quadrupole. Hitherto, the quadrupole transition involving optical vortices has not been quantified and we have thus set out to evaluate the absorption rate accompanied by an OAM transfer with reference to the $6^2S_{1/2}\rightarrow 5^2D_{5/2}$ in Cs when caesium atoms are subject to the field of a linearly polarized optical vortex. Our results assuming typical experimentally accessible parameters indicate that the absorption rate for moderate light intensities is smaller than the quadrupole spontaneous emission rate, but should still be within the measurement capabilities of modern spectroscopic techniques.

翻訳日:2023-05-10 23:40:09 公開日:2020-07-08

# 量子有限オートマトンによる化学反応のモデル化

A Quantum Finite Automata Approach to Modeling the Chemical Reactions ( http://arxiv.org/abs/2007.03976v1 )

ライセンス: Link先を確認

Amandeep Singh Bhatia, Shenggen Zheng

(参考訳) 近年、モデリングへの関心は分子レベルから原子レベル、量子スケールへと大きく高まっている。計算化学の分野は、原子や分子から工業規模のプロセスまで、システムの操作とシミュレーションのための計算モデルを設計する上で重要な役割を担っている。これは計算能力の大幅な増加とアルゴリズムの効率に影響を受けている。古典的オートマトン理論を用いた化学反応を熱力学的に表現することは、コンピュータ科学に大きな影響を与えた。量子計算モデルを用いた化学情報処理の研究は自然な目標である。本稿では,線形時間で停止する2方向量子有限オートマトンを用いた化学反応のモデル化を行った。さらに、古典的なプッシュダウンオートマトンは、複数のスタックを持つ化学反応のために設計することができる。化学受容/放出シグネチャと量子オートマトンモデルを組み合わせて計算の汎用性を高めることが証明されている。

In recent years, the modeling interest has increased significantly from the molecular level to the atomic and quantum scale. The field of computational chemistry plays a significant role in designing computational models for the operation and simulation of systems ranging from atoms and molecules to industrial-scale processes. It is influenced by a tremendous increase in computing power and the efficiency of algorithms. The representation of chemical reactions using classical automata theory in thermodynamic terms had a great influence on computer science. The study of chemical information processing with quantum computational models is a natural goal. In this paper, we have modeled chemical reactions using two-way quantum finite automata, which are halted in linear time. Additionally, classical pushdown automata can be designed for such chemical reactions with multiple stacks. It has been proven that computational versatility can be increased by combining chemical accept/reject signatures and quantum automata models.

翻訳日:2023-05-10 23:39:01 公開日:2020-07-08

# 量子群対称性を持つ量子チャネル

Quantum channels with quantum group symmetry ( http://arxiv.org/abs/2007.03901v1 )

ライセンス: Link先を確認

Hun Hee Lee and Sang-Gyun Youn

(参考訳) 本稿では、任意のコンパクトな量子群が量子チャネルの対称性群として使用できることを証明し、共変チャネルの概念を導出する。そして、同変チャネルの凸集合の構造を、関連する融合則に対する多重性のない条件を仮定して、すべての極点を同定することにより、最近の結果の広範な一般化を提供する。群対称性と対照的な量子群対称性の存在は、量子置換群の例と$SU_q(2)$で強調される。後者の例では、非カック型条件から生じるハイゼンベルク像の必要性を見出す。本論文は、射影表現に関する共変性によって終わり、ワイル共変チャネルとそのフェルミオン的類似性に戻る。

In this paper we will demonstrate that any compact quantum group can be used as symmetry groups for quantum channels, which leads us to the concept of covariant channels. We, then, unearth the structure of the convex set of covariant channels by identifying all extreme points under the assumption of multiplicity-free condition for the associated fusion rule, which provides a wide generalization of some recent results. The presence of quantum group symmetry contrast to the group symmetry will be highlighted in the examples of quantum permutation groups and $SU_q(2)$. In the latter example, we will see the necessity of the Heisenberg picture coming from the non-Kac type condition. This paper ends with the covariance with respect to projective representations, which leads us back to Weyl covariant channels and its fermionic analogue.

翻訳日:2023-05-10 23:38:38 公開日:2020-07-08

# Ge/Siナノワイヤ量子ドットにおける強スピン軌道相互作用とホールスピンの$g$-factor再正規化

Strong spin-orbit interaction and $g$-factor renormalization of hole spins in Ge/Si nanowire quantum dots ( http://arxiv.org/abs/2007.04308v1 )

ライセンス: Link先を確認

F. N. M. Froning, M. J. Ran\v{c}i\'c, B. Het\'enyi, S. Bosco, M. K. Rehmann, A. Li, E. P. A. M. Bakkers, F. A. Zwanenburg, D. Loss, D. M. Zumb\"uhl, F. R. Braakman

(参考訳) スピン軌道相互作用は、量子計算の中心にスピン量子ビット、位相的に非自明な状態の研究、スピントロニクスにおける様々な応用がある。 ge/siコア/シェルナノワイヤのホールスピンは、強くて電気的に調整可能なスピン軌道相互作用を経験し、これらの分野の研究に特に有望なプラットフォームとなっている。 ge/siナノワイヤ内の2重量子ドットに閉じ込められたホールスピンのスピン軌道相互作用の強度をスピンブロック輸送系内におけるスピン混合遷移の測定により実験的に決定する。驚くほど短いスピン軌道長が$\sim$65 nmであり、量子ドット長とインタードット距離に匹敵する。さらに, 印加磁場のホール状態に対する大きな軌道効果を観測し, スピン混合遷移エネルギーの磁場依存性を明らかにした。これらの軌道効果とともに、強いスピン軌道相互作用は、磁場による$g$-factorの顕著な向上を引き起こすが、大きなスピン軌道相互作用強度は、この物質系の予測された直接ラシュバスピン軌道相互作用と一致し、スピン量子ビットの超高速なラビ振動と効率的なクビット量子ビット相互作用、およびマヨラナゼロモードの研究に適したプラットフォームを提供する。

The spin-orbit interaction lies at the heart of quantum computation with spin qubits, research on topologically non-trivial states, and various applications in spintronics. Hole spins in Ge/Si core/shell nanowires experience a spin-orbit interaction that has been predicted to be both strong and electrically tunable, making them a particularly promising platform for research in these fields. We experimentally determine the strength of spin-orbit interaction of hole spins confined to a double quantum dot in a Ge/Si nanowire by measuring spin-mixing transitions inside a regime of spin-blockaded transport. We find a remarkably short spin-orbit length of $\sim$65 nm, comparable to the quantum dot length and the interdot distance. We additionally observe a large orbital effect of the applied magnetic field on the hole states, resulting in a large magnetic field dependence of the spin-mixing transition energies. Strikingly, together with these orbital effects, the strong spin-orbit interaction causes a significant enhancement of the $g$-factor with magnetic field.The large spin-orbit interaction strength demonstrated is consistent with the predicted direct Rashba spin-orbit interaction in this material system and is expected to enable ultrafast Rabi oscillations of spin qubits and efficient qubit-qubit interactions, as well as provide a platform suitable for studying Majorana zero modes.

翻訳日:2023-05-10 23:32:30 公開日:2020-07-08

# ポリエンの高エネルギー三重項ペア状態と分子内一重項分裂における役割

Higher energy triplet-pair states in polyenes and their role in intramolecular singlet fission ( http://arxiv.org/abs/2007.04305v1 )

ライセンス: Link先を確認

Darren J Valentine, Dilhan Manawadu, and William Barford

(参考訳) 明るい状態(1^1B_u^+$/S_2$) を超えるエネルギーを持つ拡張ポリエン系は、一重項分裂によって三重項を生成する。この過程は、2^1A_g^-$/$S_1$状態には関与せず、他の状態が役割を果たすことを示唆している。パリエ・パリル・ピエルス・ハミルトンの密度行列再正規化群 (DMRG) 計算を用いて, 一重項分裂に関与する可能性のある候補状態について検討した。緩和された1^1b_u^-$と3^1a_g^-$ singlet状態と1^5a_g^-$ quintet状態は$s_2$状態以下であることがわかった。 1^1b_u^-$,3^1a_g^-$,1^5a_g^-$状態はすべて三重項三重項を持つと考えられており、三重項状態の積と結合二重化、スピンスピン相関、波動関数の重なりの計算によって確認される。したがって、三重項対と電子ホール特性の両方からなる一重項励起(つまり、$2^1A_g^-$, $1^1B_u^-$, $3^1A_g^-$, $\cdots$)があり、基本的に同じ励起であるが質量中心エネルギーを持つ。この族で最も低いエネルギー元素である2^1A_g^-$状態は一重項核分裂を起こさない。しかし、より高いエネルギーメンバー(例えば3^1a_g^-$)は、運動エネルギーの増加と電子格子緩和の低減により、特定の鎖長に対して一重項分裂を起こすことができる。

Probing extended polyene systems with energy in excess of the bright state ($1^1B_u^+$/$S_2$) band edge generates triplets via singlet fission. This process is not thought to involve the $2^1A_g^-$/$S_1$ state, suggesting that other states play a role. Using density matrix renormalisation group (DMRG) calculations of the Pariser-Parr-Pople-Peierls Hamiltonian, we investigate candidate states that could be involved in singlet fission. We find that the relaxed $1^1B_u^-$, and $3^1A_g^-$ singlet states and $1^5A_g^-$ quintet state lie below the $S_2$ state. The $1^1B_u^-$, $3^1A_g^-$ and $1^5A_g^-$ states are all thought to have triplet-triplet character, which is confirmed by our calculations of bond dimerization, spin-spin correlation and wavefunction overlap with products of triplet states. We thus show that there is a family of singlet excitations(i.e., $2^1A_g^-$, $1^1B_u^-$, $3^1A_g^-$, $\cdots$), composed of both triplet-pair and electron-hole character, which are fundamentally the same excitation, but have different center-of-mass energies. The lowest energy member of this family, the $2^1A_g^-$ state, cannot undergo singlet fission. But higher energy members (e.g., the $3^1A_g^-$) state, owing to their increased kinetic energy and reduced electron-lattice relaxation, can undergo singlet fission for certain chain lengths.

翻訳日:2023-05-10 23:32:03 公開日:2020-07-08

# 量子ファンアウト:回路最適化と技術モデリング

Quantum Fan-out: Circuit Optimizations and Technology Modeling ( http://arxiv.org/abs/2007.04246v1 )

ライセンス: Link先を確認

Pranav Gokhale, Samantha Koretsky, Shilin Huang, Swarnadeep Majumder, Andrew Drucker, Kenneth R. Brown, Frederic T. Chong

(参考訳) 命令スケジューリングは、古典計算と同様に、量子コンピューティングにおける重要なコンパイラ最適化である。現在のスケジューラは、キュービットが重複しない限り、命令の同時実行を可能にすることで、データの並列処理を最適化する。しかし、多くの量子ハードウェアプラットフォームでは、重なり合う量子ビットの命令を__globalインタラクション__で同時に実行できる。例えば、従来の量子回路におけるファンアウトは論理レベルで見る場合にのみシーケンシャルに実装できるが、物理的レベルでのグローバルな相互作用はファンアウトを1ステップで達成できる。 NISQ(Noisy Intermediate-Scale Quantum)ワークロードの回路合成を最適化するために,この同時ファンアウトプリミティブを活用する。さらに,ファンアウトに基づく新しい量子メモリアーキテクチャを提案する。我々の研究はファンアウトプリミティブのハードウェア実装にも取り組んでいる。我々は、閉じ込められたイオン量子コンピュータの現実的なシミュレーションを行う。また,超伝導量子ビットを用いたファンアウトの概念実証実験を行った。 NISQアプリケーション回路と量子メモリアーキテクチャに対して,現実的なノイズモデルの下で深度(ランタイム)および忠実度推定を行う。我々のシミュレーションは、実行時の漸近的な利点を伴う有望な結果を示し、7～24%のエラーの低減を示す。

Instruction scheduling is a key compiler optimization in quantum computing, just as it is for classical computing. Current schedulers optimize for data parallelism by allowing simultaneous execution of instructions, as long as their qubits do not overlap. However, on many quantum hardware platforms, instructions on overlapping qubits can be executed simultaneously through __global interactions__. For example, while fan-out in traditional quantum circuits can only be implemented sequentially when viewed at the logical level, global interactions at the physical level allow fan-out to be achieved in one step. We leverage this simultaneous fan-out primitive to optimize circuit synthesis for NISQ (Noisy Intermediate-Scale Quantum) workloads. In addition, we introduce novel quantum memory architectures based on fan-out. Our work also addresses hardware implementation of the fan-out primitive. We perform realistic simulations for trapped ion quantum computers. We also demonstrate experimental proof-of-concept of fan-out with superconducting qubits. We perform depth (runtime) and fidelity estimation for NISQ application circuits and quantum memory architectures under realistic noise models. Our simulations indicate promising results with an asymptotic advantage in runtime, as well as 7--24% reduction in error.

翻訳日:2023-05-10 23:30:54 公開日:2020-07-08

# 時間依存背景におけるロンドンの超伝導アプローチ

London superconductivity approach in a time-dependent background ( http://arxiv.org/abs/2007.04230v1 )

ライセンス: Link先を確認

Vanderley Aguiar, Jo\~ao P. G. Nascimento, Ilde Guedes and Raimundo N. Costa Filho

(参考訳) 本論文の主な目的は、ロンドンアプローチを用いて時間依存パラメータを持つ超伝導体における電荷空間の正確な量子解を得ることである。本稿では、lewis and riesenfeld invariant operator法に基づく超伝導体内部電荷の新しい量子化スキームを提案する。得られた波動関数から,時間依存の不確かさとシステムの平均エネルギーを計算した。シャノンエントロピーや複雑性といった情報尺度も得られた。後者は常に時間非依存であり、伝導度にも依存しない。他の量は時間依存関数 \r{ho}(t), c 個の数で書かれ、非線形微分方程式を満たす。

The main goal of this paper is to obtain the exact quantum solutions for charge space in a superconductor with time-dependent parameters using the London approach. We introduce a new quantization scheme for the charge inside a superconductor based on the Lewis and Riesenfeld invariant operator method. From the wave-functions obtained, we calculated the time-dependent uncertainties and the mean energy of the system. Information measures were also obtained, such as Shannon entropy and complexity. The later is always time-independent and also does not depend on conductivity. The others quantities are written in terms of a time-dependent function, \r{ho}(t), c-number quantity satisfying a nonlinear differential equation.

翻訳日:2023-05-10 23:30:38 公開日:2020-07-08

# Nレベル量子スピン系のロバストフィードバック安定化

Robust feedback stabilization of N-level quantum spin systems ( http://arxiv.org/abs/2007.04211v1 )

ライセンス: Link先を確認

Weichao Liang, Nina H. Amini, and Paolo Mason

(参考訳) 本稿では,nレベル量子角運動量系と電磁場との相互作用について検討する。推定された量子状態を表す追加状態を導入する必要があるので、初期状態と物理パラメータの無知を仮定する。量子状態の進化とその推定は、結合確率マスター方程式によって記述される。本稿では,フィードバック制御系の存在下でのシステムの漸近的挙動について検討する。我々は,フィードバック制御器と推定パラメータに十分な条件を与え,結合確率系の指数的安定化を測定者の固有状態に保証する。さらに、対応する収束率を推定する。このような条件を満たすパラメータ化されたフィードバック則も提供します。本研究は, [21] のフィードバック安定化戦略が, 推定状態の不正確な初期化や未知の物理パラメータに対する堅牢性を示すものである。

In this paper, we consider N-level quantum angular momentum systems interacting with electromagnetic fields undergoing continuous-time measurements. We suppose unawareness of the initial state and physical parameters, entailing the introduction of an additional state representing the estimated quantum state. The evolution of the quantum state and its estimation is described by a coupled stochastic master equation. Here, we study the asymptotic behavior of such a system in presence of a feedback controller. We provide sufficient conditions on the feedback controller and on the estimated parameters that guarantee exponential stabilization of the coupled stochastic system towards an eigenstate of the measurement operator. Furthermore, we estimate the corresponding rate of convergence. We also provide parametrized feedback laws satisfying such conditions. Our results show the robustness of the feedback stabilization strategy considered in [21] in case of imprecise initialization of the estimated state and with respect to the unknown physical parameters.

翻訳日:2023-05-10 23:30:10 公開日:2020-07-08

# 世論調査におけるオンライン暗黙の関連テストの利用

Using Online Implicit Association Tests in Opinion Polling ( http://arxiv.org/abs/2007.04183v1 )

ライセンス: Link先を確認

Alan Smeaton and Hyowon Lee and Niamh Morris and David Hanley

(参考訳) 世論調査は、今や私たちの日々のニュースサイクルのデファクトな要素であり、その結果が政府やビジネスに常に明らかな方法で影響を与えているため、社会の非常に重要な要素になっています。しかし、ポーリングは必ずしも正確というわけではないし、1930年代までさかのぼる世界に大きな影響を与えてきた真に不正確なポーリング結果もいくつかある。本稿では,現代の不正確な世論調査の理由の一つとして,社会的に望ましい反応 (shy vote) 現象を分析した。暗黙の連帯試験 (IATs) を通じてそれを公開する方法を説明し、アイルランドのイギリスに対する意見に関する小さな調査において、シャイな有権者効果を示す。従来の世論調査にIATを取り入れることで、これらをオンラインで正確に実施できるという事実を指摘するとともに、世論の世論調査の機会を制限するCovid-19規制時代において、より多種多様な回答者のサンプルにポーリングが到達できるようにする。

Opinion polls have now become a very important component of society because they are now a defacto component of our daily news cycle and because their results influence governments and business in ways which are not always obvious to us. However, polling is not always accurate and there have been some really inaccurate polling results which have had major influences on the world going back to the 1930s but also as recently as just the last 3 or 4 years. In this paper we analyse the phenomenon of socially desirable responding (shy voters) which has emerged as one of the reasons for modern day inaccurate polling. We describe how it can be exposed through implicit association tests (IATs) and we demonstrate the shy voter effect in a small survey on opinions in Ireland towards the United Kingdom. We argue for inclusion of IATs in traditional polling and point to the fact that these can be conducted accurately online, which also allows polling to reach a larger and more diverse sample of respondents in the days of Covid-19 restrictions which restricts the opportunities for poll sampling from the general public.

翻訳日:2023-05-10 23:29:59 公開日:2020-07-08

# NERD: リスクデータストリームの予測のためのニューラルネットワーク

NERD: Neural Network for Edict of Risky Data Streams ( http://arxiv.org/abs/2007.07753v1 )

ライセンス: Link先を確認

Sandro Passarelli, Cem G\"undogan, Lars Stiemert, Matthias Schopp, Peter Hillmann

(参考訳) サイバーインシデントは、単純な接続損失から断続的な攻撃まで、幅広い原因を持つ可能性がある。一度サイバーセキュリティのインシデントやシステム障害が特定できれば、どのように進むかを決めることはしばしば複雑になる。特に、実際の原因が直接的詳細決定可能でない場合。そこで我々は,サイバーインシデント対応支援システムのコンセプトを開発した。このシステムには侵入検知システムや監視ツールなど,複数の情報ソースが組み込まれている。同期パッケージ比のような20以上の重要な属性を使用して、潜在的なセキュリティインシデントを特定し、データを異なる優先順位カテゴリに分類する。その後、システムは人工知能を使用してさらなる意思決定プロセスをサポートし、取締役会を簡潔にするために対応するレポートを生成する。この情報から、その原因やトラブルシューティング対策について、適切かつ詳細な提案がなされる。学習プロセスの入力としてラベル付きフローデータを使用することで,問題の解決に関するユーザからのフィードバックを今後の意思決定に含める。プロトタイプは、意思決定が持続的に改善され、サイバーインシデント処理プロセスがより効果的になることを示している。

Cyber incidents can have a wide range of cause from a simple connection loss to an insistent attack. Once a potential cyber security incidents and system failures have been identified, deciding how to proceed is often complex. Especially, if the real cause is not directly in detail determinable. Therefore, we developed the concept of a Cyber Incident Handling Support System. The developed system is enriched with information by multiple sources such as intrusion detection systems and monitoring tools. It uses over twenty key attributes like sync-package ratio to identify potential security incidents and to classify the data into different priority categories. Afterwards, the system uses artificial intelligence to support the further decision-making process and to generate corresponding reports to brief the Board of Directors. Originating from this information, appropriate and detailed suggestions are made regarding the causes and troubleshooting measures. Feedback from users regarding the problem solutions are included into future decision-making by using labelled flow data as input for the learning process. The prototype shows that the decision making can be sustainably improved and the Cyber Incident Handling process becomes much more effective.

翻訳日:2023-05-10 23:22:27 公開日:2020-07-08

# mog-vqe:多目的遺伝的変分量子固有解法

MoG-VQE: Multiobjective genetic variational quantum eigensolver ( http://arxiv.org/abs/2007.04424v1 )

ライセンス: Link先を確認

D. Chivilikhin, A. Samarin, V. Ulyantsev, I. Iorsh, A. R. Oganov, O. Kyriienko

(参考訳) 変分量子固有解法(VQE)は、短期量子コンピュータのための最初の実用的なアルゴリズムとして登場した。その成功は主に選択された変分アンザッツに依存し、ハミルトニアンの近似基底状態を作成する量子回路に対応する。典型的には、高い表現精度(回路深度を犠牲にして)を達成するか、あるいは正確な基底エネルギーへの収束を犠牲にする浅い回路を使用する。本稿では,低深度と精度の向上を両立させる手法を提案し,ハードウェア効率の良いVQEのための遺伝的改良アンサッツを考案した。本手法は多目的遺伝的変分量子固有解法 (MoG-VQE) を多目的パレート最適化に頼り, 非支配的ソート遺伝的アルゴリズム (NSGA-II) を用いて変分アンザッツの位相を最適化する。各回路トポロジに対して、共分散行列適応進化戦略(CMA-ES)を用いて、単一キュービット回転の角度を最適化する。提案プロトコルでは, 得られたエネルギー精度と2ビットゲート数の両面で高い性能を同時に提供する回路を作成できるので, パレート最適解に到達しようと試みる。種々の分子 (H$_2$, H$_4$, H$_6$, BeH$_2$, LiH) に対して実験を行い, 標準のハードウェア効率のアンサッツと比較して, 2量子ゲート数の約10倍の減少を観測した。 12量子ビットのLiHハミルトニアンでは、既に12個のCNOTで化学的精度に達することができる。その結果、アルゴリズムは、短期デバイスに対する基底状態の忠実度を著しく向上させる。

Variational quantum eigensolver (VQE) emerged as a first practical algorithm for near-term quantum computers. Its success largely relies on the chosen variational ansatz, corresponding to a quantum circuit that prepares an approximate ground state of a Hamiltonian. Typically, it either aims to achieve high representation accuracy (at the expense of circuit depth), or uses a shallow circuit sacrificing the convergence to the exact ground state energy. Here, we propose the approach which can combine both low depth and improved precision, capitalizing on a genetically-improved ansatz for hardware-efficient VQE. Our solution, the multiobjective genetic variational quantum eigensolver (MoG-VQE), relies on multiobjective Pareto optimization, where topology of the variational ansatz is optimized using the non-dominated sorting genetic algorithm (NSGA-II). For each circuit topology, we optimize angles of single-qubit rotations using covariance matrix adaptation evolution strategy (CMA-ES) -- a derivative-free approach known to perform well for noisy black-box optimization. Our protocol allows preparing circuits that simultaneously offer high performance in terms of obtained energy precision and the number of two-qubit gates, thus trying to reach Pareto-optimal solutions. Tested for various molecules (H$_2$, H$_4$, H$_6$, BeH$_2$, LiH), we observe nearly ten-fold reduction in the two-qubit gate counts as compared to the standard hardware-efficient ansatz. For 12-qubit LiH Hamiltonian this allows reaching chemical precision already at 12 CNOTs. Consequently, the algorithm shall lead to significant growth of the ground state fidelity for near-term devices.

翻訳日:2023-05-10 23:21:33 公開日:2020-07-08

# 絡み合いレンズによる準粒子の観察

Observing Quasiparticles through the Entanglement Lens ( http://arxiv.org/abs/2007.04318v1 )

ライセンス: Link先を確認

Yizhi You, Elisabeth Wybo, Frank Pollmann, S. L. Sondhi

(参考訳) 相互作用する量子系の低エネルギー物理学は通常、関連する準粒子や低エネルギー励起とその量子数を同定することで理解される。我々は、対応する量子状態における絡み合いの性質を調べるために、これを超える量子情報フレームワークを提案する。我々は、量子数、局所性、分数化を含む準粒子の健全な特徴が、絡み合いスペクトルや相互情報に反映されていると論じる。これらのアイデアを、積分可能性破壊摂動を持つ$d=1$横場イジングモデルの特定の文脈で説明する。

The low energy physics of interacting quantum systems is typically understood through the identification of the relevant quasiparticles or low energy excitations and their quantum numbers. We present a quantum information framework that goes beyond this to examine the nature of the entanglement in the corresponding quantum states. We argue that the salient features of the quasiparticles, including their quantum numbers, locality and fractionalization are reflected in the entanglement spectrum and in the mutual information. We illustrate these ideas in the specific context of the $d=1$ transverse field Ising model with an integrability breaking perturbation.

翻訳日:2023-05-10 23:19:47 公開日:2020-07-08

# MTI-Net:マルチタスク学習のためのマルチスケールタスクインタラクションネットワーク

MTI-Net: Multi-Scale Task Interaction Networks for Multi-Task Learning ( http://arxiv.org/abs/2001.06902v5 )

ライセンス: Link先を確認

Simon Vandenhende, Stamatios Georgoulis and Luc Van Gool

(参考訳) 本稿では,マルチタスク学習環境においてタスク情報を蒸留する際に,複数のスケールでタスクインタラクションを検討することの重要性について論じる。共通の信念とは対照的に、特定のスケールで高い親和性を持つタスクは、他のスケールでこの動作を維持することが保証されていない。我々はこの発見を3つの方法で構築する新しいアーキテクチャ MTI-Net を提案する。まず、マルチスケールのマルチモーダル蒸留ユニットを介して、あらゆるスケールでのタスクインタラクションを明示的にモデル化する。第二に、機能伝達モジュールを介して、より低いスケールから高いスケールで蒸留されたタスク情報を伝播する。第3に、すべてのスケールから機能集約ユニットを介して洗練されたタスク特徴を集約し、最終的なタスク毎の予測を生成する。 2つのマルチタスク高密度ラベル付けデータセットに対する大規模な実験により、従来の研究とは異なり、我々のマルチタスクモデルはマルチタスク学習の潜在能力、すなわち、メモリフットプリントが小さくなり、計算回数が減り、シングルタスク学習の性能が向上することを示した。コードは、https://github.com/SimonVandenhende/Multi-Task-Learning-PyTorchで公開されている。

In this paper, we argue about the importance of considering task interactions at multiple scales when distilling task information in a multi-task learning setup. In contrast to common belief, we show that tasks with high affinity at a certain scale are not guaranteed to retain this behaviour at other scales, and vice versa. We propose a novel architecture, namely MTI-Net, that builds upon this finding in three ways. First, it explicitly models task interactions at every scale via a multi-scale multi-modal distillation unit. Second, it propagates distilled task information from lower to higher scales via a feature propagation module. Third, it aggregates the refined task features from all scales via a feature aggregation unit to produce the final per-task predictions. Extensive experiments on two multi-task dense labeling datasets show that, unlike prior work, our multi-task model delivers on the full potential of multi-task learning, that is, smaller memory footprint, reduced number of calculations, and better performance w.r.t. single-task learning. The code is made publicly available: https://github.com/SimonVandenhende/Multi-Task-Learning-PyTorch.

翻訳日:2023-01-08 12:36:28 公開日:2020-07-08

# CNNに基づく高速ソースデバイス識別

CNN-based fast source device identification ( http://arxiv.org/abs/2001.11847v3 )

ライセンス: Link先を確認

Sara Mandelli, Davide Cozzolino, Paolo Bestagini, Luisa Verdoliva, Stefano Tubaro

(参考訳) ソース識別は、画像の原点を追跡することができるため、画像鑑識において重要なトピックである。これは知的財産を請求する上で貴重な情報であり、また違法な資料の著者も明らかにしている。本稿では,センサノイズに基づくデバイス識別の問題に対処し,畳み込みニューラルネットワーク(CNN)を用いた高速かつ正確な解法を提案する。具体的には、カメラ指紋と画像ノイズをパッチレベルで比較する方法を学習する2チャンネルCNNを提案する。提案手法は従来の手法よりもはるかに高速であり,精度の向上が期待できる。このアプローチは、ソーシャルネットワークなど、大規模な画像データベースが分析されるシナリオに特に適している。この例では、ソーシャルメディアにアップロードされた画像は通常、少なくとも2つの圧縮段階にあるため、二重JPEG圧縮画像の調査を含め、常に標準的なアプローチよりも高い精度を報告している。

Source identification is an important topic in image forensics, since it allows to trace back the origin of an image. This represents a precious information to claim intellectual property but also to reveal the authors of illicit materials. In this paper we address the problem of device identification based on sensor noise and propose a fast and accurate solution using convolutional neural networks (CNNs). Specifically, we propose a 2-channel-based CNN that learns a way of comparing camera fingerprint and image noise at patch level. The proposed solution turns out to be much faster than the conventional approach and to ensure an increased accuracy. This makes the approach particularly suitable in scenarios where large databases of images are analyzed, like over social networks. In this vein, since images uploaded on social media usually undergo at least two compression stages, we include investigations on double JPEG compressed images, always reporting higher accuracy than standard approaches.

翻訳日:2023-01-05 06:28:32 公開日:2020-07-08

# 機械学習:logitベースの分類器に対する線形フィルタリング

Machine Unlearning: Linear Filtration for Logit-based Classifiers ( http://arxiv.org/abs/2002.02730v2 )

ライセンス: Link先を確認

Thomas Baumhauer and Pascal Sch\"ottle and Matthias Zeppelzauer

(参考訳) 最近制定された法律では、個人が自分の個人データがどんな風に使用されるかを決める権利、特に「忘れられる権利」を付与している。個人がモデルのトレーニングプロセスの一部であるデータを使用する許可を取り除いた場合、どのように進めばいいのか? この質問から、機械学習の分野が生まれ、それは「モデルからトレーニングデータを削除」する方法の調査として広く説明できる。我々の研究は、分類モデル(ディープニューラルネットワークなど)のクラス全体の削除要求の設定に関するこの研究の方向性を補完する。最初のステップとして,直感的で計算効率の良い衛生手法として線形濾過を提案する。本実験は,ナイーブ削除スキームに対する敵意設定の利点を示す。

Recently enacted legislation grants individuals certain rights to decide in what fashion their personal data may be used, and in particular a "right to be forgotten". This poses a challenge to machine learning: how to proceed when an individual retracts permission to use data which has been part of the training process of a model? From this question emerges the field of machine unlearning, which could be broadly described as the investigation of how to "delete training data from models". Our work complements this direction of research for the specific setting of class-wide deletion requests for classification models (e.g. deep neural networks). As a first step, we propose linear filtration as a intuitive, computationally efficient sanitization method. Our experiments demonstrate benefits in an adversarial setting over naive deletion schemes.

翻訳日:2023-01-03 03:51:31 公開日:2020-07-08

# privacyfl: プライバシー保護と安全な連合学習のためのシミュレータ

PrivacyFL: A simulator for privacy-preserving and secure federated learning ( http://arxiv.org/abs/2002.08423v2 )

ライセンス: Link先を確認

Vaikkunth Mugunthan, Anton Peraire-Bueno and Lalana Kagal

(参考訳) フェデレーション学習(federated learning)は、分散クライアントがトレーニングデータをローカライズしながら共有機械学習モデルを共同学習できるようにするテクニックである。これは、トレーニングされたモデルの重みやパラメータからトレーニングデータセットに関する情報をリークすることができるため、データプライバシのリスクを低減することができる。フェデレーション学習環境のセットアップ、特にセキュリティとプライバシの保証は、操作可能な多数の設定とパラメータを備えた、時間を要するプロセスである。クライアントがコラボレーションが実現可能であることを保証し、モデル精度を改善するためには、プライバシ保護とセキュアなフェデレーション学習のための実世界のシミュレータが必要である。本稿では,フェデレート学習環境のための拡張可能で,構成が容易でスケーラブルなシミュレータであるPrivacyFLを紹介する。主な機能としては、レイテンシシミュレーション、クライアントからの離脱に対する堅牢性、集中型と分散型の学習のサポート、差分プライバシーとセキュアなマルチパーティ計算に基づく設定可能なプライバシとセキュリティメカニズムなどがある。本稿では,我々の研究を動機付け,シミュレータと関連するプロトコルのアーキテクチャを説明し,その幅広い機能とその利点を浮き彫りにした多数のシナリオにおける評価について論じる。本稿は,様々な状況下での連携型学習環境の実現可能性の検証という,現実的な重要な課題に対処する。病院、銀行、研究機関といった、大量の機密データを持ち、協力したい組織は、プライバシーを守り、セキュアな方法でそれを可能にするシステムを持つことで、大きな利益を享受できるため、実践的な影響も大きい。

Federated learning is a technique that enables distributed clients to collaboratively learn a shared machine learning model while keeping their training data localized. This reduces data privacy risks, however, privacy concerns still exist since it is possible to leak information about the training dataset from the trained model's weights or parameters. Setting up a federated learning environment, especially with security and privacy guarantees, is a time-consuming process with numerous configurations and parameters that can be manipulated. In order to help clients ensure that collaboration is feasible and to check that it improves their model accuracy, a real-world simulator for privacy-preserving and secure federated learning is required. In this paper, we introduce PrivacyFL, which is an extensible, easily configurable and scalable simulator for federated learning environments. Its key features include latency simulation, robustness to client departure, support for both centralized and decentralized learning, and configurable privacy and security mechanisms based on differential privacy and secure multiparty computation. In this paper, we motivate our research, describe the architecture of the simulator and associated protocols, and discuss its evaluation in numerous scenarios that highlight its wide range of functionality and its advantages. Our paper addresses a significant real-world problem: checking the feasibility of participating in a federated learning environment under a variety of circumstances. It also has a strong practical impact because organizations such as hospitals, banks, and research institutes, which have large amounts of sensitive data and would like to collaborate, would greatly benefit from having a system that enables them to do so in a privacy-preserving and secure manner.

翻訳日:2022-12-30 13:44:45 公開日:2020-07-08

# 自己回帰モデルによる予測サンプリング

Predictive Sampling with Forecasting Autoregressive Models ( http://arxiv.org/abs/2002.09928v2 )

ライセンス: Link先を確認

Auke Wiggers, Emiel Hoogeboom

(参考訳) 自動回帰モデル(ARM)は現在、画像とオーディオデータの可能性に基づくモデリングにおいて最先端のパフォーマンスを持っている。一般的に、ニューラルネットワークベースのARMは高速な推論を可能にするように設計されている。本稿では,ARMの高速推論特性を利用してサンプリングを高速化する手法である予測サンプリングアルゴリズムを提案する。本稿では,arm固定点反復によるサンプリングと学習予測モジュールの2種類の予測サンプリングを提案する。有効性は2つの設定で示される。 i)二項mnist,svhn,cifar10の明示的確率モデリング及び二 SVHN、CIFAR10、Imagenet32で訓練されたオートエンコーダにおける離散潜時モデリング実験により,ARM推論呼び出し数やサンプリング速度において,ベースラインよりもかなりの改善が見られた。

Autoregressive models (ARMs) currently hold state-of-the-art performance in likelihood-based modeling of image and audio data. Generally, neural network based ARMs are designed to allow fast inference, but sampling from these models is impractically slow. In this paper, we introduce the predictive sampling algorithm: a procedure that exploits the fast inference property of ARMs in order to speed up sampling, while keeping the model intact. We propose two variations of predictive sampling, namely sampling with ARM fixed-point iteration and learned forecasting modules. Their effectiveness is demonstrated in two settings: i) explicit likelihood modeling on binary MNIST, SVHN and CIFAR10, and ii) discrete latent modeling in an autoencoder trained on SVHN, CIFAR10 and Imagenet32. Empirically, we show considerable improvements over baselines in number of ARM inference calls and sampling speed.

翻訳日:2022-12-29 09:19:20 公開日:2020-07-08

# 対人強化学習によるロバスト市場形成

Robust Market Making via Adversarial Reinforcement Learning ( http://arxiv.org/abs/2003.01820v2 )

ライセンス: Link先を確認

Thomas Spooner, Rahul Savani

(参考訳) 本稿では, 対人強化学習(ARL)を用いて, 対人的かつ適応的な市場条件に頑健な市場マーキングエージェントを作成できることを示す。 ARLを適用するために、Avellaneda と Stoikov [2008] のよく研究された単一エージェントモデルを、市場メーカーと敵の間の離散時間ゼロサムゲームに変換する。相手は、市場メーカーの経費で利益を上げたい他の市場参加者の代理として機能する。 2つの従来の単エージェントRLエージェントとARLを経験的に比較し、ARLアプローチが導くことを示す。 1) 制約のないリスク回避行動の出現又はドメイン固有の罰則 2) 試験環境における敵の有無にかかわらず評価された基準指標のセットによる性能の大幅な改善 3) 不確実性をモデル化した。我々は,本手法が一貫して収束することを示す実証実験を行い,単純な単段ゲームにおいて,我々が収束するプロファイルがnash平衡に対応することを証明した。

We show that adversarial reinforcement learning (ARL) can be used to produce market marking agents that are robust to adversarial and adaptively-chosen market conditions. To apply ARL, we turn the well-studied single-agent model of Avellaneda and Stoikov [2008] into a discrete-time zero-sum game between a market maker and adversary. The adversary acts as a proxy for other market participants that would like to profit at the market maker's expense. We empirically compare two conventional single-agent RL agents with ARL, and show that our ARL approach leads to: 1) the emergence of risk-averse behaviour without constraints or domain-specific penalties; 2) significant improvements in performance across a set of standard metrics, evaluated with or without an adversary in the test environment, and; 3) improved robustness to model uncertainty. We empirically demonstrate that our ARL method consistently converges, and we prove for several special cases that the profiles that we converge to correspond to Nash equilibria in a simplified single-stage game.

翻訳日:2022-12-26 21:39:52 公開日:2020-07-08

# anysize gan: イメージワーピング問題の解決策

Anysize GAN: A solution to the image-warping problem ( http://arxiv.org/abs/2003.03233v2 )

ライセンス: Link先を確認

Connah Kendrick, David Gillespie, Moi Hoon Yap

(参考訳) 本稿では,Deep Learningにおける共通問題を解決するために,GAN(General Adversarial Network)の新たなタイプを提案する。我々は,既存の潜在ベクトルベースGAN構造に適用可能な新しいアーキテクチャを開発し,任意のサイズのオンザフライ画像を生成する。画像生成のための既存のGANは、一致する寸法の均一な画像を必要とする。しかし、ImageNetのような公開データセットには数千の異なるサイズが含まれている。画像のサイズ変更は画像データの変形や変化を引き起こすが、ネットワークはこの前処理ステップを必要としない。トレーニングのために任意のサイズの画像をロードできるように、標準的なデータローディング技術に大きな変更を加えています。また、複数の入力と新しい動的リサイズ層を追加することで、ネットワークを2つの方法で修正する。最後に、判別器を複数の解像度で処理するように調整する。これらの変更により、メモリが許せば、リサイズなしで複数の解像度データセットをトレーニングできる。 isic 2019皮膚病変データセットで結果を確認した。提案手法は,特徴的関係を維持しつつ,空間的関係の保存と理解を行なわずに,異なる大きさの現実的な画像を生成することを実証する。論文を受理し、ソースコードを公開します。

We propose a new type of General Adversarial Network (GAN) to resolve a common issue with Deep Learning. We develop a novel architecture that can be applied to existing latent vector based GAN structures that allows them to generate on-the-fly images of any size. Existing GAN for image generation requires uniform images of matching dimensions. However, publicly available datasets, such as ImageNet contain thousands of different sizes. Resizing image causes deformations and changing the image data, whereas as our network does not require this preprocessing step. We make significant changes to the standard data loading techniques to enable any size image to be loaded for training. We also modify the network in two ways, by adding multiple inputs and a novel dynamic resizing layer. Finally we make adjustments to the discriminator to work on multiple resolutions. These changes can allow multiple resolution datasets to be trained on without any resizing, if memory allows. We validate our results on the ISIC 2019 skin lesion dataset. We demonstrate our method can successfully generate realistic images at different sizes without issue, preserving and understanding spatial relationships, while maintaining feature relationships. We will release the source codes upon paper acceptance.

翻訳日:2022-12-26 01:48:14 公開日:2020-07-08

# 環境微生物画像分割のためのマルチスケールCNN-CRFフレームワーク

A Multi-scale CNN-CRF Framework for Environmental Microorganism Image Segmentation ( http://arxiv.org/abs/2003.03744v2 )

ライセンス: Link先を確認

Jinghua Zhang, Chen Li, Frank Kulwa, Xin Zhao, Changhao Sun, Zihan Li, Tao Jiang, Hong Li, and Shouliang Qi

(参考訳) 研究者が環境微生物(EM)を効果的に識別するのを支援するために,EM画像セグメンテーションのためのマルチスケールCNN-CRF(MSCC)フレームワークを提案する。 1つは新しいピクセルレベルのセグメンテーションアプローチで、新しく導入された畳み込みニューラルネットワーク(CNN)、すなわち「mU-Net-B3」と高密度条件ランダムフィールド(CRF)後処理を使用する。 2つ目はvgg-16ベースのパッチレベルのセグメンテーション法で、新しい"バッファ"戦略により、emsの詳細のセグメンテーション品質がさらに向上する。実験では、420 EM画像の最先端手法と比較して、提案したMSCC法はメモリ要求を355 MBから103 MBに減らし、総合評価指標(Dice, Jaccard, Recall, Accuracy)を85.24%、77.42%、82.27%、96.76%から87.13%、79.74%、87.12%、96.91%に改善し、22.58%から20.26%に減らした。したがって、MSCC法は、EMセグメンテーション分野において大きなポテンシャルを示す。

To assist researchers to identify Environmental Microorganisms (EMs) effectively, a Multiscale CNN-CRF (MSCC) framework for the EM image segmentation is proposed in this paper. There are two parts in this framework: The first is a novel pixel-level segmentation approach, using a newly introduced Convolutional Neural Network (CNN), namely, "mU-Net-B3", with a dense Conditional Random Field (CRF) postprocessing. The second is a VGG-16 based patch-level segmentation method with a novel "buffer" strategy, which further improves the segmentation quality of the details of the EMs. In the experiment, compared with the state-of-the-art methods on 420 EM images, the proposed MSCC method reduces the memory requirement from 355 MB to 103 MB, improves the overall evaluation indexes (Dice, Jaccard, Recall, Accuracy) from 85.24%, 77.42%, 82.27%, and 96.76% to 87.13%, 79.74%, 87.12%, and 96.91%, respectively, and reduces the volume overlap error from 22.58% to 20.26%. Therefore, the MSCC method shows great potential in the EM segmentation field.

翻訳日:2022-12-25 14:24:37 公開日:2020-07-08

# 実画像復元・強調のための学習強化機能

Learning Enriched Features for Real Image Restoration and Enhancement ( http://arxiv.org/abs/2003.06792v2 )

ライセンス: Link先を確認

Syed Waqas Zamir, Aditya Arora, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan, Ming-Hsuan Yang, Ling Shao

(参考訳) 劣化した画像から高品質の画像コンテンツを回復することを目的として、画像復元は監視、計算写真、医用画像、リモートセンシングなどの多くの応用を享受している。近年,畳み込みニューラルネットワーク(cnns)は,従来の画像復元手法に比べて劇的に改善されている。既存のCNNベースのメソッドは通常、フル解像度またはプログレッシブに低解像度の表現で動作する。前者の場合、空間的に正確だが文脈的に劣る結果が得られ、後者の場合、意味的に信頼できるが空間的に劣る出力が生成される。本稿では,ネットワーク全体を通して空間的にpreciseな高分解能表現を維持し,低解像度表現から強い文脈情報を受け取ることを目的とした,新しいアーキテクチャを提案する。このアプローチのコアは、いくつかのキー要素を含むマルチスケールの残留ブロックである。 (a)マルチスケール特徴抽出のための並列マルチレゾリューション畳み込みストリーム (b)多解像度ストリーム間の情報交換 c) 文脈情報取得のための空間的及びチャネル的注意機構 (d)注意に基づくマルチスケール特徴集約。簡単に言うと、我々は高解像度の空間的詳細を同時に保存しながら、複数のスケールからの文脈情報を組み合わせた豊富な特徴集合を学習する。 5つの実画像ベンチマークデータセットに対する大規模な実験により、我々の手法は、MIRNetと呼ばれ、画像のデノゲーション、超解像、画像強調など、様々な画像処理タスクに対して最先端の結果が得られることを示した。ソースコードと事前訓練されたモデルはhttps://github.com/swz30/MIRNet.comで入手できる。

With the goal of recovering high-quality image content from its degraded version, image restoration enjoys numerous applications, such as in surveillance, computational photography, medical imaging, and remote sensing. Recently, convolutional neural networks (CNNs) have achieved dramatic improvements over conventional approaches for image restoration task. Existing CNN-based methods typically operate either on full-resolution or on progressively low-resolution representations. In the former case, spatially precise but contextually less robust results are achieved, while in the latter case, semantically reliable but spatially less accurate outputs are generated. In this paper, we present a novel architecture with the collective goals of maintaining spatially-precise high-resolution representations through the entire network and receiving strong contextual information from the low-resolution representations. The core of our approach is a multi-scale residual block containing several key elements: (a) parallel multi-resolution convolution streams for extracting multi-scale features, (b) information exchange across the multi-resolution streams, (c) spatial and channel attention mechanisms for capturing contextual information, and (d) attention based multi-scale feature aggregation. In a nutshell, our approach learns an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details. Extensive experiments on five real image benchmark datasets demonstrate that our method, named as MIRNet, achieves state-of-the-art results for a variety of image processing tasks, including image denoising, super-resolution, and image enhancement. The source code and pre-trained models are available at https://github.com/swz30/MIRNet.

翻訳日:2022-12-23 08:56:10 公開日:2020-07-08

# 条件付き生成逆ネットワークによるマルチモーダル形状補完

Multimodal Shape Completion via Conditional Generative Adversarial Networks ( http://arxiv.org/abs/2003.07717v3 )

ライセンス: Link先を確認

Rundi Wu, Xuelin Chen, Yixin Zhuang, Baoquan Chen

(参考訳) 形状取得装置からの部分的データ、すなわち形状に欠ける領域を埋めるために、いくつかの深層学習法が提案されている。しかし、これらの手法は1つの出力で部分的な形状を完遂するだけで、欠落した幾何を推論するあいまいさを無視している。したがって,多モード形状完備化問題として,一対多写像を学習することで,複数の出力で部分形状を完備化しようとする。条件付き学習データを必要としない条件付き生成モデルにより部分的な形状を完遂する最初のマルチモーダル形状補完法を開発した。提案手法は,学習結果のマルチモーダル分布の完備化を条件に,あいまいさを抽出する。様々な形状の不完全性を含む複数のデータセットに対するアプローチを広範に評価し,本手法の基本的な方法と変種を比較し,多様性と品質の両面で部分的な形状を完遂する上での手法のメリットを実証した。

Several deep learning methods have been proposed for completing partial data from shape acquisition setups, i.e., filling the regions that were missing in the shape. These methods, however, only complete the partial shape with a single output, ignoring the ambiguity when reasoning the missing geometry. Hence, we pose a multi-modal shape completion problem, in which we seek to complete the partial shape with multiple outputs by learning a one-to-many mapping. We develop the first multimodal shape completion method that completes the partial shape via conditional generative modeling, without requiring paired training data. Our approach distills the ambiguity by conditioning the completion on a learned multimodal distribution of possible results. We extensively evaluate the approach on several datasets that contain varying forms of shape incompleteness, and compare among several baseline methods and variants of our methods qualitatively and quantitatively, demonstrating the merit of our method in completing partial shapes with both diversity and quality.

翻訳日:2022-12-22 21:30:43 公開日:2020-07-08

# 交換可能なデータのためのエネルギーベースプロセス

Energy-Based Processes for Exchangeable Data ( http://arxiv.org/abs/2003.07521v2 )

ライセンス: Link先を確認

Mengjiao Yang, Bo Dai, Hanjun Dai, Dale Schuurmans

(参考訳) 近年,点雲などの交換可能性を持つ集合のモデリングへの関心が高まっている。現在のアプローチの欠点は、考慮される集合の濃度を制限するか、観測されていないデータ上の制限された形式の分布しか表現できないことである。これらの制限を克服するために、エネルギーベースのモデルから交換可能なデータまで拡張し、エネルギー関数のニューラルネットワークパラメータ化を可能にするEnergy-Based Processs (EBP)を導入する。これらのモデルの重要な利点は、集合上のより柔軟な分布を、その濃度を制限することなく表現できることである。我々は,ポイントクラウド生成,分類,デノイジング,画像補完など,さまざまなタスクにおける最先端のパフォーマンスを示す,ebpsの効率的なトレーニング手順を開発した。

Recently there has been growing interest in modeling sets with exchangeability such as point clouds. A shortcoming of current approaches is that they restrict the cardinality of the sets considered or can only express limited forms of distribution over unobserved data. To overcome these limitations, we introduce Energy-Based Processes (EBPs), which extend energy based models to exchangeable data while allowing neural network parameterizations of the energy function. A key advantage of these models is the ability to express more flexible distributions over sets without restricting their cardinality. We develop an efficient training procedure for EBPs that demonstrates state-of-the-art performance on a variety of tasks such as point cloud generation, classification, denoising, and image completion.

翻訳日:2022-12-22 20:38:17 公開日:2020-07-08

# 運動からの非剛体構造に対する重み付き核ノルムの正確な最適化

Accurate Optimization of Weighted Nuclear Norm for Non-Rigid Structure from Motion ( http://arxiv.org/abs/2003.10281v2 )

ライセンス: Link先を確認

Jos\'e Pedro Iglesias, Carl Olsson, Marcus Valtonen \"Ornhag

(参考訳) 与えられたランクの行列を最小二乗の意味でデータに合わせることは、行列の双線型パラメタライゼーションを明示的に最適化することで、レバンス=マルカルトのような2次法を用いて非常に効果的に行うことができる。対照的に、重み付き核ノルム優先のようなより一般的な特異値ペナルティを適用する場合、行列の要素に対する直接最適化が一般的に用いられる。結果の目的関数の非微分性のため、第一次劣次法や分割法が主に用いられる。これらは速いイテレーションを提供するが、ジグザグによって最小に近い非効率になることはよく知られており、実際、近似解の解決を迫られることが多い。本稿では,2次法により多くの場合において,より正確な結果が得られることを示す。我々の主な結果は、重み付き核規範ペナルティを含む一般の正規化子に対して、元の問題と同値な双線型定式化を構築する方法を示している。これらの定式化によって正則化関数は2つの微分可能となり、2次法が適用できる。動作問題からの多くの構造について実験により,本手法が最先端手法より優れていることを示す。

Fitting a matrix of a given rank to data in a least squares sense can be done very effectively using 2nd order methods such as Levenberg-Marquardt by explicitly optimizing over a bilinear parameterization of the matrix. In contrast, when applying more general singular value penalties, such as weighted nuclear norm priors, direct optimization over the elements of the matrix is typically used. Due to non-differentiability of the resulting objective function, first order sub-gradient or splitting methods are predominantly used. While these offer rapid iterations it is well known that they become inefficent near the minimum due to zig-zagging and in practice one is therefore often forced to settle for an approximate solution. In this paper we show that more accurate results can in many cases be achieved with 2nd order methods. Our main result shows how to construct bilinear formulations, for a general class of regularizers including weighted nuclear norm penalties, that are provably equivalent to the original problems. With these formulations the regularizing function becomes twice differentiable and 2nd order methods can be applied. We show experimentally, on a number of structure from motion problems, that our approach outperforms state-of-the-art methods.

翻訳日:2022-12-21 00:43:54 公開日:2020-07-08

# 限られた資源深層学習のためのデータと計算効率

A Data and Compute Efficient Design for Limited-Resources Deep Learning ( http://arxiv.org/abs/2004.09691v2 )

ライセンス: Link先を確認

Mirgahney Mohamed, Gabriele Cesa, Taco S. Cohen and Max Welling

(参考訳) データ効率の改善により、同種のニューラルネットワークはディープラーニングコミュニティへの関心を高めている。医療分野では、データの対称性を効果的に活用して、より正確で堅牢なモデルの構築に成功している。より広い範囲の患者にリーチするために、モバイルでデバイス上での深層学習ソリューションの実装が医療応用のために開発されている。しかし、同変モデルは大規模で計算コストのかかるアーキテクチャを使って一般的に実装されている。本研究では,mobilenetv2の等価バージョンを設計,テストし,モデルの量子化によりさらに最適化することで,より効率的な推論を実現する。我々は,patch camelyon (pcam) の医療データセット上で,より計算効率の高い技術性能を実現する。

Thanks to their improved data efficiency, equivariant neural networks have gained increased interest in the deep learning community. They have been successfully applied in the medical domain where symmetries in the data can be effectively exploited to build more accurate and robust models. To be able to reach a much larger body of patients, mobile, on-device implementations of deep learning solutions have been developed for medical applications. However, equivariant models are commonly implemented using large and computationally expensive architectures, not suitable to run on mobile devices. In this work, we design and test an equivariant version of MobileNetV2 and further optimize it with model quantization to enable more efficient inference. We achieve close-to state of the art performance on the Patch Camelyon (PCam) medical dataset while being more computationally efficient.

翻訳日:2022-12-11 06:20:07 公開日:2020-07-08

# キュラス探査による柔軟かつ効率的な長距離計画

Flexible and Efficient Long-Range Planning Through Curious Exploration ( http://arxiv.org/abs/2004.10876v2 )

ライセンス: Link先を確認

Aidan Curtis, Minjian Xin, Dilip Arumugam, Kevin Feigelis, Daniel Yamins

(参考訳) 時間的拡張型マルチフェーズプランを柔軟かつ効率的に発見するアルゴリズムは、ロボット工学の進歩とモデルに基づく強化学習にとって重要なステップである。長距離計画の核となる問題は、可能なアクションシーケンスのツリーを探索する効率的な方法を見つけることである。タスク・アンド・モーション・プランニング(tamp)による既存の非学習型計画ソリューションは、アクションの効果と前提条件に対する論理的記述の存在に依存している。この制約により、tampメソッドは、ツリー探索問題を効率的に減らすことができるが、隠れない複雑な物理環境に一般化する能力は制限される。対照的に、深層強化学習(DRL)法は、柔軟なニューラルネットワークに基づく関数近似を用いて、自然に見えない状況に一般化するポリシーを発見する。しかし、DRL法は、長距離多段階計画環境に固有の非常にまばらな報酬景観を扱うのに苦労する。本稿では、好奇心誘導サンプリング戦略と模倣学習を組み合わせることで、TAMPとDRLの要素を融合させるCurious Sample Planner(CSP)を提案する。 CSPは、多種多様なリアルな3Dタスクを解くための、興味深く複雑な時間的拡張プランを効率的に発見できることを示す。対照的に、標準的な計画と学習の手法は、これらのタスクを全く解決できなかったり、巨大な、非常に可変なトレーニングサンプルでしかできなかったりします。我々は、CSPで様々な好奇心メトリクスを使用することを検討し、CSPが発見するソリューションの種類を分析する。最後に、CSPはタスク転送をサポートし、あるタスクの経験から学んだ探索ポリシーが関連するタスクの効率向上に役立つことを示す。

Identifying algorithms that flexibly and efficiently discover temporally-extended multi-phase plans is an essential step for the advancement of robotics and model-based reinforcement learning. The core problem of long-range planning is finding an efficient way to search through the tree of possible action sequences. Existing non-learned planning solutions from the Task and Motion Planning (TAMP) literature rely on the existence of logical descriptions for the effects and preconditions for actions. This constraint allows TAMP methods to efficiently reduce the tree search problem but limits their ability to generalize to unseen and complex physical environments. In contrast, deep reinforcement learning (DRL) methods use flexible neural-network-based function approximators to discover policies that generalize naturally to unseen circumstances. However, DRL methods struggle to handle the very sparse reward landscapes inherent to long-range multi-step planning situations. Here, we propose the Curious Sample Planner (CSP), which fuses elements of TAMP and DRL by combining a curiosity-guided sampling strategy with imitation learning to accelerate planning. We show that CSP can efficiently discover interesting and complex temporally-extended plans for solving a wide range of physically realistic 3D tasks. In contrast, standard planning and learning methods often fail to solve these tasks at all or do so only with a huge and highly variable number of training samples. We explore the use of a variety of curiosity metrics with CSP and analyze the types of solutions that CSP discovers. Finally, we show that CSP supports task transfer so that the exploration policies learned during experience with one task can help improve efficiency on related tasks.

翻訳日:2022-12-10 18:31:20 公開日:2020-07-08

# VC次元によるロバスト部分ガウス推定

Robust subgaussian estimation with VC-dimension ( http://arxiv.org/abs/2004.11734v3 )

ライセンス: Link先を確認

Jules Depersin

(参考訳) 中央値法(英語版)(mom)に基づく手続きは、データが重み付きまたは破損している場合でも非漸近的かつ強い偏差境界を提供する。この研究は、MOM推定器の余剰リスクを束縛する新しい一般的な方法を提案する。中心となる技術は、統計複雑性を測定するためにVC次元(ラデマッハの複雑さの代わりに)を用いることである。特に、これはスパース推定のための最初のロバストな推定子を与えることができ、これはいわゆる準ガウジアンレートを、分解されていないデータに対して有限秒のモーメントを仮定するだけで達成する。対照的に、ラデマッハ複素数を用いた以前の研究は、次元と対数的に成長する多くの有限モーメントを必要とした。この手法により、任意のノルムにおける平均推定のための新しいロバストなスガウス境界を導出する。また、L_4-L_2$ノルム同値性のない亜ガウス境界を初めて達成した共分散推定のための新しいロバストな推定器も導出する。

Median-of-means (MOM) based procedures provide non-asymptotic and strong deviation bounds even when data are heavy-tailed and/or corrupted. This work proposes a new general way to bound the excess risk for MOM estimators. The core technique is the use of VC-dimension (instead of Rademacher complexity) to measure the statistical complexity. In particular, this allows to give the first robust estimators for sparse estimation which achieves the so-called subgaussian rate only assuming a finite second moment for the uncorrupted data. By comparison, previous works using Rademacher complexities required a number of finite moments that grows logarithmically with the dimension. With this technique, we derive new robust sugaussian bounds for mean estimation in any norm. We also derive a new robust estimator for covariance estimation that is the first to achieve subgaussian bounds without $L_4-L_2$ norm equivalence.

翻訳日:2022-12-10 03:06:50 公開日:2020-07-08

# AlignShift:3次元異方性ボリュームにおける画像厚のギャップを埋める

AlignShift: Bridging the Gap of Imaging Thickness in 3D Anisotropic Volumes ( http://arxiv.org/abs/2005.01969v2 )

ライセンス: Link先を確認

Jiancheng Yang, Yi He, Xiaoyang Huang, Jingwei Xu, Xiaodan Ye, Guangyu Tao, Bingbing Ni

(参考訳) 本稿では,3次元医用画像処理における基本的な課題について述べる。異方性医用ボリュームでは、薄いスライス(1mm)と厚いスライス(5mm)の間に大きなパフォーマンスギャップがある。従来の芸術では、薄いスライスに3Dアプローチ、厚いスライスに2Dアプローチを使う傾向がある。我々は,薄肉および厚肉の医療用ボリュームの統一的アプローチを目指す。ビデオ解析の最近の進歩に触発されて,理論上は任意の2次元事前学習ネットワークを太さ対応の3Dネットワークに変換する新しいパラメータフリー演算子であるAlignShiftを提案する。興味深いことに、変換されたネットワークは薄いスライスでは3Dのように振る舞うが、厚いスライスでは2Dに適応的に縮退する。入力画像厚みに応じてアライメントされた「仮想スライス」をシフト・融合することにより、統一された厚み認識表現学習を実現する。広汎性病変検出のための32k病変からなる,公衆の大規模深部結節ベンチマークに関する広範囲な実験により,前回と比べ,ホイッスルやベルを伴わない有意なマージンで先行する手法の有効性が検証された。さらに重要なことに、この方法は統一フレームワークによって薄いスライスボリュームと厚いスライスボリュームのパフォーマンスギャップを埋める最初の方法です。 PyTorch のコードは https://github.com/M3DV/AlignShift でオープンソース化されている。

This paper addresses a fundamental challenge in 3D medical image processing: how to deal with imaging thickness. For anisotropic medical volumes, there is a significant performance gap between thin-slice (mostly 1mm) and thick-slice (mostly 5mm) volumes. Prior arts tend to use 3D approaches for the thin-slice and 2D approaches for the thick-slice, respectively. We aim at a unified approach for both thin- and thick-slice medical volumes. Inspired by recent advances in video analysis, we propose AlignShift, a novel parameter-free operator to convert theoretically any 2D pretrained network into thickness-aware 3D network. Remarkably, the converted networks behave like 3D for the thin-slice, nevertheless degenerate to 2D for the thick-slice adaptively. The unified thickness-aware representation learning is achieved by shifting and fusing aligned "virtual slices" as per the input imaging thickness. Extensive experiments on public large-scale DeepLesion benchmark, consisting of 32K lesions for universal lesion detection, validate the effectiveness of our method, which outperforms previous state of the art by considerable margins without whistles and bells. More importantly, to our knowledge, this is the first method that bridges the performance gap between thin- and thick-slice volumes by a unified framework. To improve research reproducibility, our code in PyTorch is open source at https://github.com/M3DV/AlignShift.

翻訳日:2022-12-06 13:41:58 公開日:2020-07-08

# 安全な深層強化学習のための確率的保証

Probabilistic Guarantees for Safe Deep Reinforcement Learning ( http://arxiv.org/abs/2005.07073v2 )

ライセンス: Link先を確認

Edoardo Bacci and David Parker

(参考訳) 深層強化学習は多くの制御タスクにうまく適用されているが、安全上重要なシナリオにおけるこれらのエージェントの適用は安全性上の懸念から制限されている。これらのコントローラーの厳密なテストは、特にハードウェアの故障や騒がしいセンサーのため、確率的な環境での運用では困難である。確率的環境下での深部強化学習エージェントの安全性を測定するアルゴリズムMOSAICを提案する。本手法は,環境におけるコントローラの実行の形式的抽象化を反復的に構築し,マルコフ決定過程の確率論的モデルチェックを活用し,有限時間軸上での安全な動作に関する確率論的保証を実現する。異なる初期設定のためにコントローラの安全な操作の確率の境界を生成し、正しい振る舞いが保証される領域を識別する。いくつかのベンチマーク制御問題で訓練されたエージェントに対するアプローチの実装と評価を行った。

Deep reinforcement learning has been successfully applied to many control tasks, but the application of such agents in safety-critical scenarios has been limited due to safety concerns. Rigorous testing of these controllers is challenging, particularly when they operate in probabilistic environments due to, for example, hardware faults or noisy sensors. We propose MOSAIC, an algorithm for measuring the safety of deep reinforcement learning agents in stochastic settings. Our approach is based on the iterative construction of a formal abstraction of a controller's execution in an environment, and leverages probabilistic model checking of Markov decision processes to produce probabilistic guarantees on safe behaviour over a finite time horizon. It produces bounds on the probability of safe operation of the controller for different initial configurations and identifies regions where correct behaviour can be guaranteed. We implement and evaluate our approach on agents trained for several benchmark control problems.

翻訳日:2022-12-03 05:16:26 公開日:2020-07-08

# マルチモーダル表現学習のための適応トランスフォーマー

Adaptive Transformers for Learning Multimodal Representations ( http://arxiv.org/abs/2005.07486v3 )

ライセンス: Link先を確認

Prajjwal Bhargava

(参考訳) トランスフォーマーの使用は、言語意味論の学習から有意義なビシオ言語表現の形成へと成長してきた。これらのアーキテクチャはしばしば過度にパラメータ化され、大量の計算を必要とする。本研究では,モデル解釈性と計算効率についてより深く学ぶために適応的アプローチを拡張する。具体的には,注意スパン,スパース,構造化ドロップアウトの手法について検討し,視覚や言語タスクに対する注意のメカニズムがどのように広がるかを理解するのに役立つ。さらに,これらの手法は,ネットワークが入力シーケンスの複雑さ,異なるモダリティに対するスパーシティ・プレファレンス,その他の関連する現象をどのように知覚するかを知る上で有用であることを示す。

The usage of transformers has grown from learning about language semantics to forming meaningful visiolinguistic representations. These architectures are often over-parametrized, requiring large amounts of computation. In this work, we extend adaptive approaches to learn more about model interpretability and computational efficiency. Specifically, we study attention spans, sparse, and structured dropout methods to help understand how their attention mechanism extends for vision and language tasks. We further show that these approaches can help us learn more about how the network perceives the complexity of input sequences, sparsity preferences for different modalities, and other related phenomena.

翻訳日:2022-12-02 22:24:26 公開日:2020-07-08

# 新型コロナウイルス(COVID-19)パンデミックにおける世論と感情:Twitterのトピックモデリングに潜在ディリクレ・アロケーションを用いた

Public discourse and sentiment during the COVID-19 pandemic: using Latent Dirichlet Allocation for topic modeling on Twitter ( http://arxiv.org/abs/2005.08817v3 )

ライセンス: Link先を確認

Jia Xue, Junxiang Chen, Chen Chen, Chengda Zheng, Sijia Li, Tingshao Zhu

(参考訳) この研究の目的は、twitterユーザーの談話とcovid-19に対する心理的反応を理解することだ。私たちは、2020年1月23日から3月7日までに収集された新型コロナウイルスに関連する約190万ツイート(英語で書かれた)の分析に機械学習技術を使用します。 11のトピックが識別され、さらに10のテーマに分類され、「確認されたケースに関する更新」、「1919年関連死」、「中国国外のケース(世界規模)」、「韓国でのcovid-19の流行」、「ニューヨークでの流行の兆候」、「ダイアモンド・プリンセス・クルーズ」、「経済への影響」、「予防措置」、「権限」、「サプライチェーン」がある。結果はTwitterで一般的なトピックとして治療や症状に関連するメッセージを明らかにしない。感情分析は、新型コロナウイルスの未知の性質に対する恐怖があらゆるトピックで支配的であることを示している。本研究の意義と限界についても論じる。

The study aims to understand Twitter users' discourse and psychological reactions to COVID-19. We use machine learning techniques to analyze about 1.9 million Tweets (written in English) related to coronavirus collected from January 23 to March 7, 2020. A total of salient 11 topics are identified and then categorized into ten themes, including "updates about confirmed cases," "COVID-19 related death," "cases outside China (worldwide)," "COVID-19 outbreak in South Korea," "early signs of the outbreak in New York," "Diamond Princess cruise," "economic impact," "Preventive measures," "authorities," and "supply chain." Results do not reveal treatments and symptoms related messages as prevalent topics on Twitter. Sentiment analysis shows that fear for the unknown nature of the coronavirus is dominant in all topics. Implications and limitations of the study are also discussed.

翻訳日:2022-12-02 00:15:18 公開日:2020-07-08

# 心臓mriのための高能率・位相認識ビデオ超解像

Efficient and Phase-aware Video Super-resolution for Cardiac MRI ( http://arxiv.org/abs/2005.10626v4 )

ライセンス: Link先を確認

Jhih-Yuan Lin, Yu-Cheng Chang, Winston H. Hsu

(参考訳) 心臓磁気共鳴イメージング(CMR)は、非侵襲的で痛みのない方法で心臓の構造と機能を説明することができるため、広く用いられている。しかし、ハードウェアの制限により高品質なスキャンを得るには時間がかかり、コストがかかる。そこで本研究では,ハードウェアのアップグレードやスキャンプロトコルの変更を伴わずに,CMRビデオの超解像問題を解決するための新しいエンドツーエンドトレーニングネットワークを提案する。我々は,心の知識をモデルに取り入れ,時間的情報の利用を支援する。具体的には,CMRの循環特性を満たすように調整された周期関数として心臓の知識を定式化する。さらに,残差学習方式の残差は,LR-HRマッピングを漸進的改良方式で学習することを容易にする。この機構により、タスクの難易度に応じて改善イテレーションを調整することにより、ネットワークに適応性を持たせることができる。大規模データセットに対する大規模な実験結果から,提案手法の優位性を示した。

Cardiac Magnetic Resonance Imaging (CMR) is widely used since it can illustrate the structure and function of heart in a non-invasive and painless way. However, it is time-consuming and high-cost to acquire the high-quality scans due to the hardware limitation. To this end, we propose a novel end-to-end trainable network to solve CMR video super-resolution problem without the hardware upgrade and the scanning protocol modifications. We incorporate the cardiac knowledge into our model to assist in utilizing the temporal information. Specifically, we formulate the cardiac knowledge as the periodic function, which is tailored to meet the cyclic characteristic of CMR. In addition, the proposed residual of residual learning scheme facilitates the network to learn the LR-HR mapping in a progressive refinement fashion. This mechanism enables the network to have the adaptive capability by adjusting refinement iterations depending on the difficulty of the task. Extensive experimental results on large-scale datasets demonstrate the superiority of the proposed method compared with numerous state-of-the-art methods.

翻訳日:2022-11-30 23:56:05 公開日:2020-07-08

# Web検索と会話エージェントのためのユーザインテント推論

User Intent Inference for Web Search and Conversational Agents ( http://arxiv.org/abs/2005.13808v2 )

ライセンス: Link先を確認

Ali Ahmadvand

(参考訳) ユーザー意図の理解は、会話エージェントと検索エンジンの両方を設計する上で重要なステップである。ユーザの発話やクエリは短く、曖昧で、コンテキストに依存しているため、ユーザ意図の検出や推論は難しい。これらの研究課題に対処するために、私の論文は以下に焦点を当てています。 1)会話エージェントの発話トピックと意図分類 2)eコマース分野に着目したWeb検索エンジンの検索意図のマイニングと分類を行う。最初のトピックに対処するために、エンティティ情報と会話コンテキストの手がかりを組み込んだ新しいモデルを提案し、ユーザの発話の話題と意図の両方を予測する。第2の研究テーマは、web検索意図予測における既存の技術メソッドを、次のとおりeコマースドメインに拡張することです。 1)検索クエリの意図と関連する製品カテゴリを予測する共同学習モデルの構築。 2)新しい隠れユーザの意図を明らかにする。すべてのモデルは、主要なeコマースサイト検索エンジンから入手可能な実際のクエリで評価される。これらの研究の成果は、自然言語理解、クエリスコーピング、クエリ提案、ランキングなど、様々なタスクのパフォーマンスを改善するために利用することができ、結果としてユーザーエクスペリエンスが強化される。

User intent understanding is a crucial step in designing both conversational agents and search engines. Detecting or inferring user intent is challenging, since the user utterances or queries can be short, ambiguous, and contextually dependent. To address these research challenges, my thesis work focuses on: 1) Utterance topic and intent classification for conversational agents 2) Query intent mining and classification for Web search engines, focusing on the e-commerce domain. To address the first topic, I proposed novel models to incorporate entity information and conversation-context clues to predict both topic and intent of the user's utterances. For the second research topic, I plan to extend the existing state of the art methods in Web search intent prediction to the e-commerce domain, via: 1) Developing a joint learning model to predict search queries' intents and the product categories associated with them, 2) Discovering new hidden users' intents. All the models will be evaluated on the real queries available from a major e-commerce site search engine. The results from these studies can be leveraged to improve performance of various tasks such as natural language understanding, query scoping, query suggestion, and ranking, resulting in an enriched user experience.

翻訳日:2022-11-27 05:38:28 公開日:2020-07-08

# 深部言語表現における分離多様体の出現

Emergence of Separable Manifolds in Deep Language Representations ( http://arxiv.org/abs/2006.01095v4 )

ライセンス: Link先を確認

Jonathan Mamou, Hang Le, Miguel Del Rio, Cory Stephenson, Hanlin Tang, Yoon Kim, SueYeon Chung

(参考訳) ディープニューラルネットワーク(DNN)は、様々な認知モダリティの知覚的タスクを解く上で、非常に経験的な成功を示している。最近の研究では、タスク最適化されたDNNから抽出された表現と脳内の神経集団の間にかなりの類似性が報告されている。その後、DNNは複雑な認知機能の基礎となる計算原理を推論する一般的なモデルクラスとなり、神経集団の情報を調べるために開発された手法を応用するための自然なテストベッドとして登場した。本研究では,特徴表現の幾何学とクラスの線形分離性を結びつける計算神経科学の最近の手法である平均場理論多様体解析を用いて,大規模文脈埋め込みモデルから言語表現を分析する。異なるモデルファミリ(bert, roberta, gptなど)からの表現を探索し、特にあいまいなデータ(例えば、複数のpart-of-speechタグを持つ単語、多くの単語を含むpart-of-speechクラス)において、層深度(例えば、part-of-speechタグのための多様体)を越えて言語多様体が出現する証拠を見つける。さらに、これらの多様体における線形分離性の出現は、多様体の半径、次元性、多様体間相関の複合化によって引き起こされる。

Deep neural networks (DNNs) have shown much empirical success in solving perceptual tasks across various cognitive modalities. While they are only loosely inspired by the biological brain, recent studies report considerable similarities between representations extracted from task-optimized DNNs and neural populations in the brain. DNNs have subsequently become a popular model class to infer computational principles underlying complex cognitive functions, and in turn, they have also emerged as a natural testbed for applying methods originally developed to probe information in neural populations. In this work, we utilize mean-field theoretic manifold analysis, a recent technique from computational neuroscience that connects geometry of feature representations with linear separability of classes, to analyze language representations from large-scale contextual embedding models. We explore representations from different model families (BERT, RoBERTa, GPT, etc.) and find evidence for emergence of linguistic manifolds across layer depth (e.g., manifolds for part-of-speech tags), especially in ambiguous data (i.e, words with multiple part-of-speech tags, or part-of-speech classes including many words). In addition, we find that the emergence of linear separability in these manifolds is driven by a combined reduction of manifolds' radius, dimensionality and inter-manifold correlations.

翻訳日:2022-11-26 05:56:04 公開日:2020-07-08

# MRIにおける心筋運動追跡のためのバイオメカニクスインフォームドニューラルネットワーク

Biomechanics-informed Neural Networks for Myocardial Motion Tracking in MRI ( http://arxiv.org/abs/2006.04725v3 )

ライセンス: Link先を確認

Chen Qin, Shuo Wang, Chen Chen, Huaqi Qiu, Wenjia Bai and Daniel Rueckert

(参考訳) 画像登録は、しばしば解空間の正規化を必要とする不測の逆問題である。本稿では, 平滑性などの明示的な正規化条件を課す現在のアプローチのほとんどとは対照的に, バイオメカニクスによる正規化を暗黙的に学習できる新しい手法を提案する。このようなアプローチは、アプリケーション固有の事前知識をディープラーニングベースの登録に組み込むことができる。特に, 生体力学的に妥当な変形の多様体を学習するために変分オートエンコーダ(vae)を活用し, 生体力学的シミュレーションを再構成することでその基礎特性を暗黙的に把握する。学習されたvae正規化器は、任意の深層学習ベースの登録ネットワークと結合して、生体力学的に実現可能な解空間を定式化することができる。提案手法は2つの異なるデータセットから得られた2次元心筋mriデータを用いた心筋運動追跡の文脈で検証される。その結果,運動追跡精度の点で他の競合手法よりも優れた性能を示し,非圧縮性やひずみなどの生体力学的特性を学習できることを示した。この手法は、一般的なl2正規化スキームと比較して、未検出領域に対するより良い一般化性も示されている。

Image registration is an ill-posed inverse problem which often requires regularisation on the solution space. In contrast to most of the current approaches which impose explicit regularisation terms such as smoothness, in this paper we propose a novel method that can implicitly learn biomechanics-informed regularisation. Such an approach can incorporate application-specific prior knowledge into deep learning based registration. Particularly, the proposed biomechanics-informed regularisation leverages a variational autoencoder (VAE) to learn a manifold for biomechanically plausible deformations and to implicitly capture their underlying properties via reconstructing biomechanical simulations. The learnt VAE regulariser then can be coupled with any deep learning based registration network to regularise the solution space to be biomechanically plausible. The proposed method is validated in the context of myocardial motion tracking on 2D stacks of cardiac MRI data from two different datasets. The results show that it can achieve better performance against other competing methods in terms of motion tracking accuracy and has the ability to learn biomechanical properties such as incompressibility and strains. The method has also been shown to have better generalisability to unseen domains compared with commonly used L2 regularisation schemes.

翻訳日:2022-11-24 02:28:40 公開日:2020-07-08

# KiU-Net:オーバーコンプリート表現を用いたバイオメディカル画像の正確なセグメンテーションを目指して

KiU-Net: Towards Accurate Segmentation of Biomedical Images using Over-complete Representations ( http://arxiv.org/abs/2006.04878v2 )

ライセンス: Link先を確認

Jeya Maria Jose, Vishwanath Sindagi, Ilker Hacihaliloglu, Vishal M. Patel

(参考訳) 優れた性能のため、U-Netは近年でもっとも広く使われているバイオメディカルイメージセグメンテーションのバックボーンアーキテクチャである。しかし,本研究では,より小さな解剖学的ランドマークを不明瞭な境界で検出する場合,かなりの性能低下が観察されている。この問題を詳細に分析し,より高次元(空間的な意味で)にデータを投影するオーバーコンプリートアーキテクチャ(ki-net)を提案することで対処する。このネットワークをU-Netで拡張すると、小さな解剖学的ランドマークとぼやけたノイズ境界を分割する場合には、全体的なパフォーマンスが向上する。さらに、提案するネットワークには、より高速な収束やパラメータの少ないといったメリットがある。本研究は, 早産児の2次元超音波(us)から脳解剖学を分離する作業について検討し, 標準u-netと比較して, dice精度とjaccard indexの点で約4%の改善を達成し, 最近の最良法を2%上回った。コード:https://github.com/jeya-maria-jose/KiU-Net-pytorch

Due to its excellent performance, U-Net is the most widely used backbone architecture for biomedical image segmentation in the recent years. However, in our studies, we observe that there is a considerable performance drop in the case of detecting smaller anatomical landmarks with blurred noisy boundaries. We analyze this issue in detail, and address it by proposing an over-complete architecture (Ki-Net) which involves projecting the data onto higher dimensions (in the spatial sense). This network, when augmented with U-Net, results in significant improvements in the case of segmenting small anatomical landmarks and blurred noisy boundaries while obtaining better overall performance. Furthermore, the proposed network has additional benefits like faster convergence and fewer number of parameters. We evaluate the proposed method on the task of brain anatomy segmentation from 2D Ultrasound (US) of preterm neonates, and achieve an improvement of around 4% in terms of the DICE accuracy and Jaccard index as compared to the standard-U-Net, while outperforming the recent best methods by 2%. Code: https://github.com/jeya-maria-jose/KiU-Net-pytorch .

翻訳日:2022-11-24 02:27:29 公開日:2020-07-08

# covid-abs:ソーシャルディスタンシング介入の健康と経済効果をシミュレートするエージェントベースの新型コロナウイルス流行モデル

COVID-ABS: An Agent-Based Model of COVID-19 Epidemic to Simulate Health and Economic Effects of Social Distancing Interventions ( http://arxiv.org/abs/2006.10532v2 )

ライセンス: Link先を確認

Petr\^onio C. L. Silva, Paulo V. C. Batista, H\'elder S. Lima, Marcos A. Alves, Frederico G. Guimar\~aes, Rodrigo C. P. Silva

(参考訳) SARS-CoV-2による新型コロナウイルスのパンデミックは、世界中の公衆衛生と経済に直接影響を与えている。この問題を解決するため、各国はウイルスの拡散を抑制するために異なる政策と非薬剤的介入を講じてきた。本稿では,人,企業,政府を模倣するエージェントの社会を用いて,パンデミックのダイナミクスをシミュレートする新たなエージェントベースモデルであるcovid-19-absを提案する。その結果,(1)何もせず,(2)ロックダウン,(3)条件付きロックダウン,(4)垂直隔離,(5)部分隔離,(6)顔マスクの使用,(7)顔マスクの使用,および50%の社会的孤立への密着性,の7つの要因が分析された。ロックダウンによるシナリオの実施が不可能で、死亡者数が最も少なく、経済に最も影響を与える場合、フェイスマスクの使用と部分隔離を組み合わせるシナリオは、社会協力の観点からの実施においてより現実的なものとなり得る。 COVID-ABSモデルはPythonプログラミング言語で実装され、ソースコードが公開されている。このモデルは、入力パラメータを変更したり、様々なシナリオを作成できるようにすることで、他の社会にも容易に拡張できる。そのため、政治家や保健当局が新型コロナウイルス対策を計画する上で有用なツールである。

The COVID-19 pandemic due to the SARS-CoV-2 coronavirus has directly impacted the public health and economy worldwide. To overcome this problem, countries have adopted different policies and non-pharmaceutical interventions for controlling the spread of the virus. This paper proposes the COVID-ABS, a new SEIR (Susceptible-Exposed-Infected-Recovered) agent-based model that aims to simulate the pandemic dynamics using a society of agents emulating people, business and government. Seven different scenarios of social distancing interventions were analyzed, with varying epidemiological and economic effects: (1) do nothing, (2) lockdown, (3) conditional lockdown, (4) vertical isolation, (5) partial isolation, (6) use of face masks, and (7) use of face masks together with 50% of adhesion to social isolation. In the impossibility of implementing scenarios with lockdown, which present the lowest number of deaths and highest impact on the economy, scenarios combining the use of face masks and partial isolation can be the more realistic for implementation in terms of social cooperation. The COVID-ABS model was implemented in Python programming language, with source code publicly available. The model can be easily extended to other societies by changing the input parameters, as well as allowing the creation of a multitude of other scenarios. Therefore, it is a useful tool to assist politicians and health authorities to plan their actions against the COVID-19 epidemic.

翻訳日:2022-11-23 15:22:33 公開日:2020-07-08

# COALA: セマンティックにリッチな音声表現を学習するための協調型オートエンコーダ

COALA: Co-Aligned Autoencoders for Learning Semantically Enriched Audio Representations ( http://arxiv.org/abs/2006.08386v2 )

ライセンス: Link先を確認

Xavier Favory, Konstantinos Drossos, Tuomas Virtanen and Xavier Serra

(参考訳) ディープニューラルネットワーク(DNN)に基づく音声表現学習は、手作り機能に代わるアプローチとして登場した。高性能を実現するために、DNNは大量の注釈付きデータを必要とすることが多く、入手が困難でコストがかかる。本稿では,学習した音声および関連タグの潜在表現を整列させて,音声表現を学習する手法を提案する。調整は、音声とタグの潜在表現の一致を最大化し、対照的な損失を用いて行う。その結果,音の音響的・意味的特性を反映した音響埋め込みモデルが得られた。組込みモデルの品質を評価し,3つの異なるタスク(音のイベント認識,音楽ジャンル,楽器分類)で特徴抽出器としての性能を測定し,そのモデルがどのような特徴を捉えているかを検討する。提案手法により得られた埋め込みは,いくつかの音響ディスクリプタとよく相関している。

Audio representation learning based on deep neural networks (DNNs) emerged as an alternative approach to hand-crafted features. For achieving high performance, DNNs often need a large amount of annotated data which can be difficult and costly to obtain. In this paper, we propose a method for learning audio representations, aligning the learned latent representations of audio and associated tags. Aligning is done by maximizing the agreement of the latent representations of audio and tags, using a contrastive loss. The result is an audio embedding model which reflects acoustic and semantic characteristics of sounds. We evaluate the quality of our embedding model, measuring its performance as a feature extractor on three different tasks (namely, sound event recognition, and music genre and musical instrument classification), and investigate what type of characteristics the model captures. Our results are promising, sometimes in par with the state-of-the-art in the considered tasks and the embeddings produced with our method are well correlated with some acoustic descriptors.

翻訳日:2022-11-21 03:51:11 公開日:2020-07-08

# ニューラルモデルにおける構成一般化に関する研究

A Study of Compositional Generalization in Neural Models ( http://arxiv.org/abs/2006.09437v2 )

ライセンス: Link先を確認

Tim Klinger, Dhaval Adjodah, Vincent Marois, Josh Joseph, Matthew Riemer, Alex 'Sandy' Pentland, Murray Campbell

(参考訳) 合成学習とリレーショナル学習は人間の知能の目印であるが、ニューラルモデルに対する課題を示すものである。このようなモデルの開発における難点は、体系的に評価する明確な構成的および関係的なタスク構造を持つベンチマークが欠如していることである。本稿では,論理的なドメイン特化言語を用いて,合成概念と関係概念から画像を生成することを可能にする,conceptworldという環境を提案する。 2x2平方、ペントミノ、シーケンス、これらのオブジェクトを含むシーン、その他のより複雑な概念など、さまざまな構成構造のための画像を生成するために使用します。我々は,それらの引数の合成深度が増加し,置換されるにつれて,合成引数との関係を一般化する標準的なニューラルアーキテクチャの能力をテストする実験を行う。 MLPやCNN,ResNetといった標準的なニューラルネットワークや,WReNやPrediNetといった最先端のリレーショナルネットワークを,マルチクラスの画像分類設定で比較する。単純な問題に対して、全てのモデルは密接な概念にうまく一般化するが、長い構成連鎖に苦しむ。置換性を含むより複雑なテストでは、全てのモデルは短鎖でも苦労する。これらの困難を強調し、さらなる実験を行うための環境を提供することで、構成的、関係的な領域において効果的に一般化できるモデルの開発を奨励したいと考えています。

Compositional and relational learning is a hallmark of human intelligence, but one which presents challenges for neural models. One difficulty in the development of such models is the lack of benchmarks with clear compositional and relational task structure on which to systematically evaluate them. In this paper, we introduce an environment called ConceptWorld, which enables the generation of images from compositional and relational concepts, defined using a logical domain specific language. We use it to generate images for a variety of compositional structures: 2x2 squares, pentominoes, sequences, scenes involving these objects, and other more complex concepts. We perform experiments to test the ability of standard neural architectures to generalize on relations with compositional arguments as the compositional depth of those arguments increases and under substitution. We compare standard neural networks such as MLP, CNN and ResNet, as well as state-of-the-art relational networks including WReN and PrediNet in a multi-class image classification setting. For simple problems, all models generalize well to close concepts but struggle with longer compositional chains. For more complex tests involving substitutivity, all models struggle, even with short chains. In highlighting these difficulties and providing an environment for further experimentation, we hope to encourage the development of models which are able to generalize effectively in compositional, relational domains.

翻訳日:2022-11-20 19:35:36 公開日:2020-07-08

# モデルベース強化学習におけるデルタスキーマネットワーク

Delta Schema Network in Model-based Reinforcement Learning ( http://arxiv.org/abs/2006.09950v2 )

ライセンス: Link先を確認

Andrey Gorodetskiy, Alexandra Shlychkova, Aleksandr I. Panov

(参考訳) この研究は、伝達学習の非効率性である人工知能の未解決問題に焦点を当てている。強化学習の分野でこの問題を解決するために用いられるメカニズムの1つはモデルに基づくアプローチである。本稿では,環境データからオブジェクトとアクション間の論理的関係を抽出できるスキーマネットワーク手法を拡張している。我々は、デルタスキーマネットワーク(dsn)のトレーニング、環境の将来の状態の予測、積極的な報酬につながる行動計画のためのアルゴリズムを提案する。 DSNは、古典的なアタリゲーム環境において、転送学習の強い性能を示す。

This work is devoted to unresolved problems of Artificial General Intelligence - the inefficiency of transfer learning. One of the mechanisms that are used to solve this problem in the area of reinforcement learning is a model-based approach. In the paper we are expanding the schema networks method which allows to extract the logical relationships between objects and actions from the environment data. We present algorithms for training a Delta Schema Network (DSN), predicting future states of the environment and planning actions that will lead to positive reward. DSN shows strong performance of transfer learning on the classic Atari game environment.

翻訳日:2022-11-19 19:24:06 公開日:2020-07-08

# 連続時間限におけるメタ学習

Meta Learning in the Continuous Time Limit ( http://arxiv.org/abs/2006.10921v2 )

ライセンス: Link先を確認

Ruitu Xu, Lin Chen, Amin Karbasi

(参考訳) 本稿では,モデル非依存メタラーニング(MAML)の学習力学の基礎となる常微分方程式(ODE)を確立する。この過程の連続時間極限ビューは,手動で選択した勾配降下のステップサイズの影響を取り除き,特定の離散化から生じる特別な場合として,既存の勾配降下訓練アルゴリズムを含む。我々は,MAML損失が非凸である場合でも,MAML損失関数の近似定常点に対する線形収束率を強く凸することを示した。さらに,MAML ODE の解析を通じて,既存のMAML トレーニング手法に付随する計算負担を大幅に軽減する BI-MAML トレーニングアルゴリズムを提案する。理論的な知見を補完するため,提案手法の既存研究に対する優位性を示す実証実験を行った。

In this paper, we establish the ordinary differential equation (ODE) that underlies the training dynamics of Model-Agnostic Meta-Learning (MAML). Our continuous-time limit view of the process eliminates the influence of the manually chosen step size of gradient descent and includes the existing gradient descent training algorithm as a special case that results from a specific discretization. We show that the MAML ODE enjoys a linear convergence rate to an approximate stationary point of the MAML loss function for strongly convex task losses, even when the corresponding MAML loss is non-convex. Moreover, through the analysis of the MAML ODE, we propose a new BI-MAML training algorithm that significantly reduces the computational burden associated with existing MAML training methods. To complement our theoretical findings, we perform empirical experiments to showcase the superiority of our proposed methods with respect to the existing work.

翻訳日:2022-11-19 03:39:22 公開日:2020-07-08

# 深部ニューラルネットワークによる時間集合の予測

Predicting Temporal Sets with Deep Neural Networks ( http://arxiv.org/abs/2006.11483v4 )

ライセンス: Link先を確認

Le Yu, Leilei Sun, Bowen Du, Chuanren Liu, Hui Xiong, Weifeng Lv

(参考訳) 各集合が任意の数の要素を含む集合列が与えられたとき、時間集合予測の問題は、次の集合の要素を予測することを目的としている。実際には、時間的集合予測は時間的事象や時系列の予測モデルよりもはるかに複雑であり、まだ未解決の問題である。時間的集合予測の問題に適応した多くの既存の手法は、通常、まず時間的集合を潜在表現に投影し、次に潜在表現で予測モデルを学ぶことによって2段階の戦略に従う。 2段階のアプローチはしばしば情報損失と不満足な予測性能をもたらす。本稿では,時間的集合予測のためのディープニューラルネットワークに基づく統合解を提案する。このアプローチのユニークな視点は、集合レベルの共起グラフを構築して要素関係を学習し、動的関係グラフ上でグラフ畳み込みを実行することである。さらに,要素や集合の時間依存性を適応的に学習するアテンションベースモジュールを設計する。最後に、異なるシーケンスで隠れた共有パターンを見つけ、静的情報と動的情報を融合して予測性能を向上させるゲート更新機構を提供する。実世界のデータセットに関する実験は、トレーニングデータの一部であっても、我々のアプローチが競争力のあるパフォーマンスを達成でき、既存のメソッドをかなりのマージンで上回ることができることを示している。

Given a sequence of sets, where each set contains an arbitrary number of elements, the problem of temporal sets prediction aims to predict the elements in the subsequent set. In practice, temporal sets prediction is much more complex than predictive modelling of temporal events and time series, and is still an open problem. Many possible existing methods, if adapted for the problem of temporal sets prediction, usually follow a two-step strategy by first projecting temporal sets into latent representations and then learning a predictive model with the latent representations. The two-step approach often leads to information loss and unsatisfactory prediction performance. In this paper, we propose an integrated solution based on the deep neural networks for temporal sets prediction. A unique perspective of our approach is to learn element relationship by constructing set-level co-occurrence graph and then perform graph convolutions on the dynamic relationship graphs. Moreover, we design an attention-based module to adaptively learn the temporal dependency of elements and sets. Finally, we provide a gated updating mechanism to find the hidden shared patterns in different sequences and fuse both static and dynamic information to improve the prediction performance. Experiments on real-world data sets demonstrate that our approach can achieve competitive performances even with a portion of the training data and can outperform existing methods with a significant margin.

翻訳日:2022-11-18 22:01:14 公開日:2020-07-08

# 画像圧縮における損失情報のモデル化

Modeling Lost Information in Lossy Image Compression ( http://arxiv.org/abs/2006.11999v3 )

ライセンス: Link先を確認

Yaolong Wang, Mingqing Xiao, Chang Liu, Shuxin Zheng, Tie-Yan Liu

(参考訳) ロスシー画像圧縮は、デジタル画像の最もよく使われる演算子の1つである。近年提案された深層学習に基づく画像圧縮手法は, オートエンコーダ構造を活用し, この分野で有望な結果を得た。画像はまず低次元の潜伏特徴に符号化され、その後、統計冗長性を利用してエントロピー符号化される。しかし、エンコード中に失われた情報は、残念ながら避けられないため、デコーダが元の画像を再構築する上で大きな課題となる。本研究では,情報損失問題を抑えるために,ILC(Invertible Lossy Compression)と呼ばれる新しい非可逆的フレームワークを提案する。特に、ICCはエンコーダ・デコーダ構造を置き換えるための可逆符号化モジュールを導入し、低次元情報潜在表現を生成する一方で、失われた情報をさらなるコード化や保存を行わない補助潜在変数に変換する。潜伏表現は量子化されビットストリームに符号化され、潜伏変数は特定の分布、すなわち等方ガウス分布に従わざるを得ない。このようにして、サロゲート潜伏変数を容易に描画し、モジュールの逆パスにサンプル変数と復号化潜伏特徴を加えることにより、原画像の復元を可能にする。画像圧縮法におけるオートエンコーダを置き換えた新しいコンポーネントにより、ICCは既存の圧縮アルゴリズムと組み合わせることで、広範囲なベンチマークデータセット上でのベースライン手法を大幅に上回ることを示す。

Lossy image compression is one of the most commonly used operators for digital images. Most recently proposed deep-learning-based image compression methods leverage the auto-encoder structure, and reach a series of promising results in this field. The images are encoded into low dimensional latent features first, and entropy coded subsequently by exploiting the statistical redundancy. However, the information lost during encoding is unfortunately inevitable, which poses a significant challenge to the decoder to reconstruct the original images. In this work, we propose a novel invertible framework called Invertible Lossy Compression (ILC) to largely mitigate the information loss problem. Specifically, ILC introduces an invertible encoding module to replace the encoder-decoder structure to produce the low dimensional informative latent representation, meanwhile, transform the lost information into an auxiliary latent variable that won't be further coded or stored. The latent representation is quantized and encoded into bit-stream, and the latent variable is forced to follow a specified distribution, i.e. isotropic Gaussian distribution. In this way, recovering the original image is made tractable by easily drawing a surrogate latent variable and applying the inverse pass of the module with the sampled variable and decoded latent features. Experimental results demonstrate that with a new component replacing the auto-encoder in image compression methods, ILC can significantly outperform the baseline method on extensive benchmark datasets by combining with the existing compression algorithms.

翻訳日:2022-11-18 05:22:10 公開日:2020-07-08

# ソーシャルボット検出の10年

A Decade of Social Bot Detection ( http://arxiv.org/abs/2007.03604v2 )

ライセンス: Link先を確認

Stefano Cresci

(参考訳) 2016年11月9日朝、世界はアメリカ合衆国大統領選挙の衝撃的な結果に目覚めた: ドナルド・トランプは第45代アメリカ合衆国大統領だった。いまだに世界中に重大な結果をもたらす予期せぬ出来事。今日、少数のソーシャルボット、自動化されたソーシャルメディアアカウントが、分裂したメッセージや偽情報を広めるのに中心的な役割を果たし、おそらくトランプ氏の勝利に寄与したことを私たちは知っている。 2016年のアメリカ合衆国大統領選挙の後、世界はソーシャルメディアにおける広範な詐欺の重大さを認識し始めた。トランプ氏のエクスプロイトを受けて、私たちはボットの検出と削除に対する多くの努力と、これらの悪意ある俳優が我々の社会に与えた影響の増大の間に激しい不協和音の出現を目撃した。このパラドックスは、このソーシャルボットのパンデミックを防ぐために、どのような戦略を強制すべきなのか? 2020年米大統領選への出馬中、この問題はこれまでになく重要視されている。 2016年以降、社会的、政治的、経済的なアナリストが脳卒中を起こしたのは、少なくとも2010年以降、コンピュータ科学者にとって、詐欺と自動化が問題となっている。本稿では,ソーシャルボット検出における最初の10年の研究を簡潔に調査する。縦断的な分析によって、ボットとの戦いにおける研究の主なトレンド、達成された主な成果、そしてこの絶え間ない戦いを困難なものにする要因について論じる。広範な分析から学んだ教訓に乗じて、詐欺や操作に対する上限となる可能性のあるイノベーションを提案します。ソーシャルボット検出における10年間の研究は、戦略的情報操作や政治トロルなど、他の、より最近のオンライン詐欺の影響を検知し緩和するための戦略を知らせることもできる。

On the morning of November 9th 2016, the world woke up to the shocking outcome of the US Presidential elections: Donald Trump was the 45th President of the United States of America. An unexpected event that still has tremendous consequences all over the world. Today, we know that a minority of social bots, automated social media accounts mimicking humans, played a central role in spreading divisive messages and disinformation, possibly contributing to Trump's victory. In the aftermath of the 2016 US elections, the world started to realize the gravity of widespread deception in social media. Following Trump's exploit, we witnessed to the emergence of a strident dissonance between the multitude of efforts for detecting and removing bots, and the increasing effects that these malicious actors seem to have on our societies. This paradox opens a burning question: What strategies should we enforce in order to stop this social bot pandemic? In these times, during the run-up to the 2020 US elections, the question appears as more crucial than ever. What stroke social, political and economic analysts after 2016, deception and automation, has been however a matter of study for computer scientists since at least 2010. In this work, we briefly survey the first decade of research in social bot detection. Via a longitudinal analysis, we discuss the main trends of research in the fight against bots, the major results that were achieved, and the factors that make this never-ending battle so challenging. Capitalizing on lessons learned from our extensive analysis, we suggest possible innovations that could give us the upper hand against deception and manipulation. Studying a decade of endeavours at social bot detection can also inform strategies for detecting and mitigating the effects of other, more recent, forms of online deception, such as strategic information operations and political trolls.

翻訳日:2022-11-18 00:00:27 公開日:2020-07-08

# 注意して聞く: 残学習とガンマタン音声表現に基づく音声キャプションシステム

Listen carefully and tell: an audio captioning system based on residual learning and gammatone audio representation ( http://arxiv.org/abs/2006.15406v4 )

ライセンス: Link先を確認

Sergi Perez-Castanos, Javier Naranjo-Alcazar, Pedro Zuccarello and Maximo Cobos

(参考訳) 自動音声キャプションは、自由テキストを用いて音声を記述することを目的としている機械聴取タスクである。音声を入力として受け取り、テキスト記述、すなわち信号のキャプションとして出力するので、自動的な音声キャプションシステムを実装する必要がある。このタスクは、自動コンテンツ記述やマシン間インタラクションなど、多くのアプリケーションで有用である。本研究では,エンコーダフェーズにおける残差学習に基づく音声キャプションの自動生成手法を提案する。エンコーダフェーズは、異なる残留ネットワーク構成によって実装される。デコーダフェーズ(キャプションの作成)は、繰り返しレイヤとアテンション機構を使用して実行される。選ばれた音声表現はガンマトインである。その結果,本研究で提案するフレームワークがベースラインシステムを上回ることが判明した。

Automated audio captioning is machine listening task whose goal is to describe an audio using free text. An automated audio captioning system has to be implemented as it accepts an audio as input and outputs as textual description, that is, the caption of the signal. This task can be useful in many applications such as automatic content description or machine-to-machine interaction. In this work, an automatic audio captioning based on residual learning on the encoder phase is proposed. The encoder phase is implemented via different Residual Networks configurations. The decoder phase (create the caption) is run using recurrent layers plus attention mechanism. The audio representation chosen has been Gammatone. Results show that the framework proposed in this work surpass the baseline system in challenge results.

翻訳日:2022-11-16 08:16:20 公開日:2020-07-08

# グラフニューラルネットワークのための経路積分に基づく畳み込みとプーリング

Path Integral Based Convolution and Pooling for Graph Neural Networks ( http://arxiv.org/abs/2006.16811v2 )

ライセンス: Link先を確認

Zheng Ma, Junyu Xuan, Yu Guang Wang, Ming Li, Pietro Lio

(参考訳) グラフニューラルネットワーク(GNN)は、従来のニューラルネットワークの機能をグラフ構造化データに拡張する。 CNNと同様、グラフの畳み込みとプーリングを最適化した設計が成功の鍵である。物理からアイデアを借用し,グラフの分類と回帰処理のための経路積分型グラフニューラルネットワーク(PAN)を提案する。具体的には、メッセージ送信側と受信側を結ぶ全ての経路を経路長に応じて学習可能な重みでリンクし、最大エントロピーランダムウォークに対応する畳み込み演算を考える。これはグラフラプラシアンを、経路積分形式から導かれる極大エントロピー遷移(MET)行列と呼ばれる新しい遷移行列に一般化する。重要なことに、MET行列の対角成分は、部分グラフ中心性に直接関係しており、自然かつ適応的なプーリング機構を提供する。 panは、さまざまなサイズと構造を備えたさまざまなグラフデータ用に調整可能な汎用フレームワークを提供する。既存のほとんどのGNNアーキテクチャはPANの特別なケースとして見ることができます。実験結果から, PANは様々なグラフ分類/回帰タスクにおいて, 物理科学におけるGNNの適用性を高めるために, 統計力学による新しいベンチマークデータセットを含む最先端の性能を達成できることが示唆された。

Graph neural networks (GNNs) extends the functionality of traditional neural networks to graph-structured data. Similar to CNNs, an optimized design of graph convolution and pooling is key to success. Borrowing ideas from physics, we propose a path integral based graph neural networks (PAN) for classification and regression tasks on graphs. Specifically, we consider a convolution operation that involves every path linking the message sender and receiver with learnable weights depending on the path length, which corresponds to the maximal entropy random walk. It generalizes the graph Laplacian to a new transition matrix we call maximal entropy transition (MET) matrix derived from a path integral formalism. Importantly, the diagonal entries of the MET matrix are directly related to the subgraph centrality, thus providing a natural and adaptive pooling mechanism. PAN provides a versatile framework that can be tailored for different graph data with varying sizes and structures. We can view most existing GNN architectures as special cases of PAN. Experimental results show that PAN achieves state-of-the-art performance on various graph classification/regression tasks, including a new benchmark dataset from statistical mechanics we propose to boost applications of GNN in physical sciences.

翻訳日:2022-11-15 14:30:52 公開日:2020-07-08

# 表現的記述論理のための署名に基づくアブダクション -- 技術報告

Signature-Based Abduction for Expressive Description Logics -- Technical Report ( http://arxiv.org/abs/2007.00757v2 )

ライセンス: Link先を確認

Patrick Koopmann, Warren Del-Pinto, Sophie Tourret and Renate A. Schmidt

(参考訳) シグネチャベースのアブダクション(signal-based abduction)は、特定の名前セットであるシグネチャの上に仮説を構築することを目的としている。この種のアブダクションは、観察された症状に使用される語彙が、それらの症状を説明するのに期待される語彙とは異なる、診断などのタスクに有用である。本稿では,TBox と ABox の公理を含む表現的記述論理 ALC で表現された観測に対するシグネチャベース推論を解くための,最初の完全解法を提案する。この方法は有限かつ完全な仮説の集合を計算することが保証され、現実的な知識ベースで評価される。

Signature-based abduction aims at building hypotheses over a specified set of names, the signature, that explain an observation relative to some background knowledge. This type of abduction is useful for tasks such as diagnosis, where the vocabulary used for observed symptoms differs from the vocabulary expected to explain those symptoms. We present the first complete method solving signature-based abduction for observations expressed in the expressive description logic ALC, which can include TBox and ABox axioms, thereby solving the knowledge base abduction problem. The method is guaranteed to compute a finite and complete set of hypotheses, and is evaluated on a set of realistic knowledge bases.

翻訳日:2022-11-14 23:28:30 公開日:2020-07-08

# 衛星画像時系列分類のための軽量時間自己認識

Lightweight Temporal Self-Attention for Classifying Satellite Image Time Series ( http://arxiv.org/abs/2007.00586v3 )

ライセンス: Link先を確認

Vivien Sainte Fare Garnot and Loic Landrieu

(参考訳) 地球観測衛星データのアクセシビリティと精度の向上は、産業や国家のアクターにもかなりの機会を与えている。しかし、これはグローバルスケールで時系列を処理できる効率的な方法を要求する。リモートセンシング時間列の分類にマルチヘッド自己注意機構を用いた最近の研究に基づいて,時間的注意エンコーダの修正を提案する。本ネットワークでは,時間入力のチャネルを並列に動作している複数の小型アテンションヘッドに分散する。各ヘッドは高度に特殊化された時間的特徴を抽出し、その特徴を1つの表現に分解する。提案手法は,オープンアクセス衛星画像データセット上の他の最先端の時系列分類アルゴリズムを上回り,パラメータをかなり少なくし,計算複雑性を低減した。

The increasing accessibility and precision of Earth observation satellite data offers considerable opportunities for industrial and state actors alike. This calls however for efficient methods able to process time-series on a global scale. Building on recent work employing multi-headed self-attention mechanisms to classify remote sensing time sequences, we propose a modification of the Temporal Attention Encoder. In our network, the channels of the temporal inputs are distributed among several compact attention heads operating in parallel. Each head extracts highly-specialized temporal features which are in turn concatenated into a single representation. Our approach outperforms other state-of-the-art time series classification algorithms on an open-access satellite image dataset, while using significantly fewer parameters and with a reduced computational complexity.

翻訳日:2022-11-14 22:27:12 公開日:2020-07-08

# 学習表現の線形識別性について

On Linear Identifiability of Learned Representations ( http://arxiv.org/abs/2007.00810v3 )

ライセンス: Link先を確認

Geoffrey Roeder, Luke Metz and Diederik P. Kingma

(参考訳) 同定可能性(identifiability)は、統計モデルの望ましい性質である: 真のモデルパラメータは、十分な計算資源とデータから、任意の所望の精度に推定できることを意味する。表現学習の文脈における識別可能性について検討する: 下流課題に対して最適である非線形データ表現を発見する。ディープニューラルネットワークとしてパラメータ化される場合、そのような表現関数は設計によって過度にパラメータ化されるため、パラメータ空間における識別可能性に欠ける。本稿では, 非線形ICAの最近の進歩を基盤として, 線形不確定性に至るまでの関数空間において, 判別モデルの大きなファミリーが実際に識別可能であることを示すことによって, 識別可能性の回復を目指す。さまざまなドメインで表現学習を行う多くのモデルは、テキスト、画像、音声、出版時の最先端技術など、この意味で識別されている。線形同定可能性の十分条件を導出し,シミュレーションデータと実世界データの両方に対して経験的支援を行う。

Identifiability is a desirable property of a statistical model: it implies that the true model parameters may be estimated to any desired precision, given sufficient computational resources and data. We study identifiability in the context of representation learning: discovering nonlinear data representations that are optimal with respect to some downstream task. When parameterized as deep neural networks, such representation functions typically lack identifiability in parameter space, because they are overparameterized by design. In this paper, building on recent advances in nonlinear ICA, we aim to rehabilitate identifiability by showing that a large family of discriminative models are in fact identifiable in function space, up to a linear indeterminacy. Many models for representation learning in a wide variety of domains have been identifiable in this sense, including text, images and audio, state-of-the-art at time of publication. We derive sufficient conditions for linear identifiability and provide empirical support for the result on both simulated and real-world data.

翻訳日:2022-11-14 22:16:57 公開日:2020-07-08

# 私はブラックボックスエージェントを作るのかブラックボックスエージェントを解釈するのか?

Am I Building a White Box Agent or Interpreting a Black Box Agent? ( http://arxiv.org/abs/2007.01187v3 )

ライセンス: Link先を確認

Tom Bewley

(参考訳) ブラックボックス関数の解釈可能なモデルを構築する場合、忠実性に対する最適化は、基礎となるタスクのパフォーマンスを低下させる可能性があり、その逆も同様である。私は、このジレンマが現代における説明可能な人工知能の分野と関連性を再評価し、ブラックボックスが動的環境と相互作用するエージェントであるときにどのように複合化されるかを強調する。次に、ホワイトボックスエージェントの構築とブラックボックスエージェントの解釈という、2つの独立した研究方向について議論します。

The rule extraction literature contains the notion of a fidelity-accuracy dilemma: when building an interpretable model of a black box function, optimising for fidelity is likely to reduce performance on the underlying task, and vice versa. I reassert the relevance of this dilemma for the modern field of explainable artificial intelligence, and highlight how it is compounded when the black box is an agent interacting with a dynamic environment. I then discuss two independent research directions - building white box agents and interpreting black box agents - which are both coherent and worthy of attention, but must not be conflated by researchers embarking on projects in the domain of agent interpretability.

翻訳日:2022-11-14 14:18:37 公開日:2020-07-08

# 観測不能リセットによる時間自動学習

Active learning of timed automata with unobservable resets ( http://arxiv.org/abs/2007.01637v2 )

ライセンス: Link先を確認

L\'eo Henry, Nicolas Markey, Thierry J\'eron

(参考訳) 時間付き言語のアクティブラーニングは、観察された時間付き単語から時間付きオートマトンを推論することに関連する。エージェントは、対象言語における単語のメンバシップを問い合わせたり、候補モデルを提案し、そのターゲットに対する等価性を検証する。このフレームワークの主な難点は、クロックリセットの推論であり、時間付きオートマタのダイナミクスの中心だが、直接観測できない。興味深い最初のステップは、クロックリセットが観測と結びつくイベント記録オートマタのサブクラスに制限することですでに実現されている。一般時間オートマトンを学習するために、この手法をリセットなしイベント記録オートマトン(reset-free event-recording automateda)と呼ばれる新しいクラスに一般化する。これは、可読性のためにイベント記録自動化のシンプルなフレームワークを維持しながら、ジェネリックタイムドオートマトンと同じ課題を提供する。私たちの貢献の中心は、無効性の概念とそれを扱うアルゴリズムとデータ構造であり、一般的なタイムドオートマトンのための効率的な能動的学習手順の鍵である観測に矛盾するリセット仮説のオンザフライ検出とプルーニングを可能にする。

Active learning of timed languages is concerned with the inference of timed automata from observed timed words. The agent can query for the membership of words in the target language, or propose a candidate model and verify its equivalence to the target. The major difficulty of this framework is the inference of clock resets, central to the dynamics of timed automata, but not directly observable. Interesting first steps have already been made by restricting to the subclass of event-recording automata, where clock resets are tied to observations. In order to advance towards learning of general timed automata, we generalize this method to a new class, called reset-free event-recording automata, where some transitions may reset no clocks. This offers the same challenges as generic timed automata while keeping the simpler framework of event-recording automata for the sake of readability. Central to our contribution is the notion of invalidity, and the algorithm and data structures to deal with it, allowing on-the-fly detection and pruning of reset hypotheses that contradict observations, a key to any efficient active-learning procedure for generic timed automata.

翻訳日:2022-11-14 06:18:22 公開日:2020-07-08

# AVP-SLAM:駐車場における自律走行車両のセマンティック視覚マッピングと位置決め

AVP-SLAM: Semantic Visual Mapping and Localization for Autonomous Vehicles in the Parking Lot ( http://arxiv.org/abs/2007.01813v2 )

ライセンス: Link先を確認

Tong Qin, Tongqing Chen, Yilun Chen, and Qing Su

(参考訳) 自動駐車は自動運転車の特定の用途である。このタスクでは、車両は狭く、混雑していて、GPSを付加した駐車場で移動する必要がある。正確なローカライゼーション能力は非常に重要です。従来の視覚ベースの手法は、テクスチャのない領域、繰り返し構造、外観の変化によって失われた追跡に悩まされる。本稿では、ロバストなセマンティックな特徴を利用して、駐車場に地図を構築し、車両をローカライズする。セマンティックな特徴には、通常駐車場に現れる案内標識、駐車場、スピードバンプなどが含まれる。従来の特徴と比較して、これらの意味的特徴は長期的な安定であり、視点と照明の変化に対して堅牢である。我々は4つのサラウンドビューカメラを用いて知覚範囲を拡大する。 IMU (Inertial Measurement Unit) とホイールエンコーダ(ホイールエンコーダ)の支援により,提案システムはグローバルな視覚意味マップを生成する。この地図はさらに、車両をセンチメートルレベルでローカライズするために使われる。我々は,システムの精度とリコールを分析し,実実験で他の手法と比較する。さらに,提案システムの実現可能性について,自動駐車による実証実験を行った。

Autonomous valet parking is a specific application for autonomous vehicles. In this task, vehicles need to navigate in narrow, crowded and GPS-denied parking lots. Accurate localization ability is of great importance. Traditional visual-based methods suffer from tracking lost due to texture-less regions, repeated structures, and appearance changes. In this paper, we exploit robust semantic features to build the map and localize vehicles in parking lots. Semantic features contain guide signs, parking lines, speed bumps, etc, which typically appear in parking lots. Compared with traditional features, these semantic features are long-term stable and robust to the perspective and illumination change. We adopt four surround-view cameras to increase the perception range. Assisting by an IMU (Inertial Measurement Unit) and wheel encoders, the proposed system generates a global visual semantic map. This map is further used to localize vehicles at the centimeter level. We analyze the accuracy and recall of our system and compare it against other methods in real experiments. Furthermore, we demonstrate the practicability of the proposed system by the autonomous parking application.

翻訳日:2022-11-14 06:04:08 公開日:2020-07-08

# graph2kernel grid-lstm:適応型近傍学習による歩行者追跡予測のためのマルチキュードモデル

Graph2Kernel Grid-LSTM: A Multi-Cued Model for Pedestrian Trajectory Prediction by Learning Adaptive Neighborhoods ( http://arxiv.org/abs/2007.01915v2 )

ライセンス: Link先を確認

Sirin Haddad and Siew Kei Lam

(参考訳) 歩行者軌跡予測は,歩行軌跡の時間的表現にLong Short-Term Memory (LSTM) を広範囲に用い,群集の社会的・文脈的相互作用のモデル化に向けた顕著な研究トラックである。既存のアプローチでは、仮想地区を固定グリッドとして使用し、歩行者の社会的状態をプールし、社会的相互作用の捉え方を制御するチューニングプロセスを提供する。これは特定のシーンにパフォーマンスをカスタマイズするが、アプローチの一般化能力は低下する。本研究では,多次元特徴入力上で動作するLSTMの拡張であるtextit{Grid-LSTM}をデプロイする。本稿では,歩行者近傍がデザインに適応可能となることを提案し,インタラクションモデリングの新しい視点を提案する。エンコーダとして \textit{Grid-LSTM} を用いて, 視覚的境界と空間的境界を考慮し, 将来の歩行者運動への影響を学習する。我々のモデルは、いくつかの公開テストされた監視ビデオに類似した特徴を照合する最先端のアプローチよりも優れています。実験の結果は、シーンの特徴や群集のダイナミクスによって異なるデータセットにまたがるアプローチの一般化を明確に示している。

Pedestrian trajectory prediction is a prominent research track that has advanced towards modelling of crowd social and contextual interactions, with extensive usage of Long Short-Term Memory (LSTM) for temporal representation of walking trajectories. Existing approaches use virtual neighborhoods as a fixed grid for pooling social states of pedestrians with tuning process that controls how social interactions are being captured. This entails performance customization to specific scenes but lowers the generalization capability of the approaches. In our work, we deploy \textit{Grid-LSTM}, a recent extension of LSTM, which operates over multidimensional feature inputs. We present a new perspective to interaction modeling by proposing that pedestrian neighborhoods can become adaptive in design. We use \textit{Grid-LSTM} as an encoder to learn about potential future neighborhoods and their influence on pedestrian motion given the visual and the spatial boundaries. Our model outperforms state-of-the-art approaches that collate resembling features over several publicly-tested surveillance videos. The experiment results clearly illustrate the generalization of our approach across datasets that varies in scene features and crowd dynamics.

翻訳日:2022-11-14 05:57:00 公開日:2020-07-08

# 弱教師付きセマンティクスセグメンテーションのためのクロスイメージセマンティクスのマイニング

Mining Cross-Image Semantics for Weakly Supervised Semantic Segmentation ( http://arxiv.org/abs/2007.01947v2 )

ライセンス: Link先を確認

Guolei Sun and Wenguan Wang and Jifeng Dai and Luc Van Gool

(参考訳) 本稿では,画像レベルの監視のみからセマンティックセグメンテーションを学習する問題を考察する。現在のポピュラーなソリューションでは、分類器からのオブジェクトローカライゼーションマップを監視信号として利用し、ローカライゼーションマップがより完全なオブジェクトコンテンツを取得するのに苦労している。画像内情報に重点を置く従来の取り組みよりも、総合的なオブジェクトパターンマイニングのためのクロスイメージ意味関係の価値に対処する。これを実現するために、2つのニューラルコアテンションが分類器に組み込まれ、画像間の意味的類似性と差異を補足的に捉える。特に、一対の訓練画像が与えられた場合、一対のコアテンションは、コアテンティブオブジェクトから共通のセマンティクスを認識するように分類器を強制するが、他方のコアテンションは、コントラッシブコアテンションと呼ばれ、他の非共有オブジェクトから未共有セマンティクスを識別するために分類器を駆動する。これにより、分類器は画像領域におけるより多くのオブジェクトパターンとより良い接地セマンティクスを発見するのに役立つ。オブジェクトパターン学習の促進に加えて、コアテンションは他の関連画像からのコンテキストを活用してローカライゼーションマップ推論を改善し、最終的にはセマンティックセグメンテーション学習の恩恵を受ける。さらに,本アルゴリズムは,(1)正確な画像レベルの監視のみによるWSSS学習,(2)余分な単純な単一ラベルデータ,(3)余分なノイズの多いWebデータといった,WSSS設定をうまく扱う統一的なフレームワークを提供する。これらすべての設定に新たな最先端技術を設定し、その有効性と一般化性を示す。さらに,本手法はCVPR2020 Learning from Imperfect Data Challengeの弱いスーパービジョンのセマンティックセマンティックセグメンテーショントラックで1位にランクインした。

This paper studies the problem of learning semantic segmentation from image-level supervision only. Current popular solutions leverage object localization maps from classifiers as supervision signals, and struggle to make the localization maps capture more complete object content. Rather than previous efforts that primarily focus on intra-image information, we address the value of cross-image semantic relations for comprehensive object pattern mining. To achieve this, two neural co-attentions are incorporated into the classifier to complimentarily capture cross-image semantic similarities and differences. In particular, given a pair of training images, one co-attention enforces the classifier to recognize the common semantics from co-attentive objects, while the other one, called contrastive co-attention, drives the classifier to identify the unshared semantics from the rest, uncommon objects. This helps the classifier discover more object patterns and better ground semantics in image regions. In addition to boosting object pattern learning, the co-attention can leverage context from other related images to improve localization map inference, hence eventually benefiting semantic segmentation learning. More essentially, our algorithm provides a unified framework that handles well different WSSS settings, i.e., learning WSSS with (1) precise image-level supervision only, (2) extra simple single-label data, and (3) extra noisy web data. It sets new state-of-the-arts on all these settings, demonstrating well its efficacy and generalizability. Moreover, our approach ranked 1st place in the Weakly-Supervised Semantic Segmentation Track of CVPR2020 Learning from Imperfect Data Challenge.

翻訳日:2022-11-14 05:20:27 公開日:2020-07-08

# リレーショナル思考に基づく音声認識のためのディープグラフランダム処理

Deep Graph Random Process for Relational-Thinking-Based Speech Recognition ( http://arxiv.org/abs/2007.02126v2 )

ライセンス: Link先を確認

Hengguan Huang, Fuzhao Xue, Hao Wang, Ye Wang

(参考訳) 人間の知性の中心にあるリレーショナル思考は、最初は、新しい感覚信号と事前知識の関係に関する無意識の知覚に依存し、結果としてこれらの知覚の結合と変換を通じて認識可能な概念や物体となる。このようなメンタルプロセスは、会話の自動音声認識(ASR)のような現実的な問題ではモデル化が困難であり、(発話間の関係を示すグラフとしてモデル化されている場合)パーセプションは無数であり、直接観察できない。本稿では,パーセプタを表現する無限個の確率グラフを生成可能な,ディープグラフランダム処理(dgp)と呼ばれるベイズ非パラメトリック深層学習手法を提案する。さらに,音響モデリングのための知覚グラフの結合と変換のための閉形式解を提案する。我々の手法は、訓練中に関係データを用いることなく、発話間の関係を推測できる。 CHiME-2およびCHiME-5を含むASRタスクの実験的評価により,本手法の有効性とメリットが示された。

Lying at the core of human intelligence, relational thinking is characterized by initially relying on innumerable unconscious percepts pertaining to relations between new sensory signals and prior knowledge, consequently becoming a recognizable concept or object through coupling and transformation of these percepts. Such mental processes are difficult to model in real-world problems such as in conversational automatic speech recognition (ASR), as the percepts (if they are modelled as graphs indicating relationships among utterances) are supposed to be innumerable and not directly observable. In this paper, we present a Bayesian nonparametric deep learning method called deep graph random process (DGP) that can generate an infinite number of probabilistic graphs representing percepts. We further provide a closed-form solution for coupling and transformation of these percept graphs for acoustic modeling. Our approach is able to successfully infer relations among utterances without using any relational data during training. Experimental evaluations on ASR tasks including CHiME-2 and CHiME-5 demonstrate the effectiveness and benefits of our method.

翻訳日:2022-11-13 13:19:28 公開日:2020-07-08

# OpenStreetMapのための人間支援人工知能による自然特徴作成技術

Human Assisted Artificial Intelligence Based Technique to Create Natural Features for OpenStreetMap ( http://arxiv.org/abs/2007.02149v2 )

ライセンス: Link先を確認

Piyush Yadav, Dipto Sarkar, Shailesh Deshpande, Edward Curry

(参考訳) 本研究では,ランドサットやセンチネルなどの衛星画像を用いて,人間の編集者がイニシエータやバリデータとして行動するosm上の自然な特徴を創造するaiベースの手法を提案する。この手法は、人間の入力を機械と結合して複雑な問題を効率的に解き、純粋な自律プロセスと比較するインタラクティブ機械学習技術に基づいている。ボトムアップアプローチでは、画像のスペクトルシグネチャを使用してクラスを抽出し、後に編集可能な機能に変換して自然な機能を生成するために、マシンラーニング(ml)パイプラインをエディターとループで使用する。

In this work, we propose an AI-based technique using freely available satellite images like Landsat and Sentinel to create natural features over OSM in congruence with human editors acting as initiators and validators. The method is based on Interactive Machine Learning technique where human inputs are coupled with the machine to solve complex problems efficiently as compare to pure autonomous process. We use a bottom-up approach where a machine learning (ML) pipeline in loop with editors is used to extract classes using spectral signatures of images and later convert them to editable features to create natural features.

翻訳日:2022-11-13 13:09:38 公開日:2020-07-08

# 大規模ターゲット広告システムのための多次元学習

Multi-Manifold Learning for Large-scale Targeted Advertising System ( http://arxiv.org/abs/2007.02334v2 )

ライセンス: Link先を確認

Kyuyong Shin, Young-Jin Park, Kyung-Min Kim, Sunyoung Kwon

(参考訳) messenger広告(ads)は、直接的および個人的ユーザー体験を提供し、高いコンバージョン率と売上をもたらす。しかし、人々は広告に懐疑的であり、時にはスパムだと認識し、最終的にはユーザー満足度が低下する。特定の広告メッセージに興味を示す個人に対して広告を提供するターゲット広告は、強く求められている。正確なユーザーターゲティングの成功の鍵は、埋め込み空間における正確なユーザーと広告表現を学ぶことである。過去の研究の多くはユークリッド空間における表現学習を制限してきたが、近年の研究では、ソーシャルネットワークやレコメンダシステム、広告といった現実のデータセットから生じる複雑なネットワーク特性の異なる射影に対する双曲的多様体学習が提案されている。本稿では,ハイパーボリック空間におけるユーザと広告の階層構造を効果的に学習し,マルチマニフォールド学習に拡張するフレームワークを提案する。学習可能な曲率を持つ複数の双曲多様体を構築し,ユーザとアドの表現を各多様体にマッピングする。各多様体の起源は、各ユーザクラスタのセンタロイドとして設定される。各広告のユーザ嗜好を双曲空間内の2つのエンティティ間の距離を用いて推定し、学習された複数の多様体から算出された値を集約して最終予測を行う。提案手法を,公開ベンチマークデータセットと大規模商用メッセンジャーシステムLINE上で評価し,その性能向上による有効性を示す。

Messenger advertisements (ads) give direct and personal user experience yielding high conversion rates and sales. However, people are skeptical about ads and sometimes perceive them as spam, which eventually leads to a decrease in user satisfaction. Targeted advertising, which serves ads to individuals who may exhibit interest in a particular advertising message, is strongly required. The key to the success of precise user targeting lies in learning the accurate user and ad representation in the embedding space. Most of the previous studies have limited the representation learning in the Euclidean space, but recent studies have suggested hyperbolic manifold learning for the distinct projection of complex network properties emerging from real-world datasets such as social networks, recommender systems, and advertising. We propose a framework that can effectively learn the hierarchical structure in users and ads on the hyperbolic space, and extend to the Multi-Manifold Learning. Our method constructs multiple hyperbolic manifolds with learnable curvatures and maps the representation of user and ad to each manifold. The origin of each manifold is set as the centroid of each user cluster. The user preference for each ad is estimated using the distance between two entities in the hyperbolic space, and the final prediction is determined by aggregating the values calculated from the learned multiple manifolds. We evaluate our method on public benchmark datasets and a large-scale commercial messenger system LINE, and demonstrate its effectiveness through improved performance.

翻訳日:2022-11-13 07:56:16 公開日:2020-07-08

# 機械学習を用いたウェアラブル慣性測定装置による歩行からの接地距離の推定

Estimation of Ground Contacts from Human Gait by a Wearable Inertial Measurement Unit using machine learning ( http://arxiv.org/abs/2007.02433v2 )

ライセンス: Link先を確認

Muhammad Junaid Umer and Qaiser Riaz

(参考訳) 運動障害のリハビリテーションのためのロボティクスシステムと運動支援の意図が高まっている。このシナリオでは、地上接触の推定はロボット工学と医療の研究の活発な分野である。本稿では,健常者歩行における左右足の推定と分類について,胸部および腰部のIMUセンサデータに基づく検討を行った。この目的のために, 胸部, 下肢に2台のスマートフォン, 右足首に1台のスマートウォッチを用いて, 被験者48名のIMUデータを収集した。アプローチデータの堅牢性を示すため、6つの異なる表面(道路タイルカーペットコンクリートと土)で収集した。右足首センサデータに基づいて腰部および胸部センサの記録データを単段に分割し,各分割したステップの時間周波数とウェーブレット領域から計408個の特徴を算出した。分類タスクでは、SVMとRFの2つの機械学習分類器を10倍のクロス検証法で訓練した。個々の表面,硬質表面,軟質表面および全表面の分類実験を行い,98.88%の精度で各表面の精度が最も高かった。さらに、硬質軟質および全表面の分類率はそれぞれ95.60%、94.38%、95.05%である。その結果, 物体の背後と胸部からの角速度と加速度の6次元データを用いて, 異なる面における正常な歩行による接地推定を高精度に行うことができた。

Robotics system for rehabilitation of movement disorders and motion assistance are gaining increased intention. In this scenario estimation of ground contact is an active area of research in robotics and healthcare. This article addresses the estimation and classification of right and left foot during the healthy human gait based on the IMU sensor data of chest and lower back. For this purpose we have collected an IMU data of 48 subjects by using two smartphones at chest and lower back of the human body and one smart watch at right ankle of the body. To show the robustness of our approach data was collected at six different surfaces (road tiles carpet grass concrete and soil). The recorded data of lower back and chest sensor was segmented into single steps on the basis of right ankle sensor data, then we computed a total of 408 features from time frequency and wavelet domain of each segmented step. For classification task we have trained two machine learning classifiers SVM and RF with 10 fold cross validation method. We performed classification experiments at individual surfaces, hard surfaces, soft surfaces and all surfaces, highest accuracy was achieved at individual surfaces with accuracy index of 98.88%. Furthermore, classification rate at hard soft and all surface are 95.60%, 94.38% and 95.05% respectively. The results shows that estimation of ground contact form normal human walk at different surfaces can be performed with high accuracy using 6D data of angular velocities and accelerations from chest and lower back location of the body.

翻訳日:2022-11-13 07:53:20 公開日:2020-07-08

# 大規模物質移動のためのガイドファインチューニング

Guided Fine-Tuning for Large-Scale Material Transfer ( http://arxiv.org/abs/2007.03059v2 )

ライセンス: Link先を確認

Valentin Deschaintre, George Drettakis and Adrien Bousseau

(参考訳) 本稿では, SVBRDFの外観を類似材料を表す対象画像に転送する手法を提案する。提案手法は,対象画像からSVBRDF値と類似したSVBRDF値の抽出を学習できるように,提供した例の深い外観キャプチャネットワークを微調整する。このシンプルなアプローチの強みを示す2つの新しい材料キャプチャーと設計ワークフローを導入する。最初のワークフローでは、少数の画像から大規模オブジェクトの可塑性SVBRDFを生成することができる。具体的には、ユーザーは大きな表面の1枚の写真と、その詳細の一部のクローズアップフラッシュ写真だけを撮る必要がある。本手法では, 壁面や床, 家具など, 数メートルの幅の広い表面を軽量に捕捉し, 壁面からSVBRDFパラメータを抽出する手法と, これらのパラメータを表面全体に伝達する手法を用いている。第2のワークフローでは、ユーザが既存のSVBRDFの外観を移譲することで、インターネット画像から大きなSVBRDFを作成する強力な方法を提供する。異なる例を選択すれば、ユーザはターゲット画像に割り当てられた素材を制御でき、深い外観キャプチャによる創造可能性を大幅に向上することができる。

We present a method to transfer the appearance of one or a few exemplar SVBRDFs to a target image representing similar materials. Our solution is extremely simple: we fine-tune a deep appearance-capture network on the provided exemplars, such that it learns to extract similar SVBRDF values from the target image. We introduce two novel material capture and design workflows that demonstrate the strength of this simple approach. Our first workflow allows to produce plausible SVBRDFs of large-scale objects from only a few pictures. Specifically, users only need take a single picture of a large surface and a few close-up flash pictures of some of its details. We use existing methods to extract SVBRDF parameters from the close-ups, and our method to transfer these parameters to the entire surface, enabling the lightweight capture of surfaces several meters wide such as murals, floors and furniture. In our second workflow, we provide a powerful way for users to create large SVBRDFs from internet pictures by transferring the appearance of existing, pre-designed SVBRDFs. By selecting different exemplars, users can control the materials assigned to the target image, greatly enhancing the creative possibilities offered by deep appearance capture.

翻訳日:2022-11-13 03:12:57 公開日:2020-07-08

# 大規模多言語ASR:50言語,1モデル,10億パラメータ

Massively Multilingual ASR: 50 Languages, 1 Model, 1 Billion Parameters ( http://arxiv.org/abs/2007.03001v2 )

ライセンス: Link先を確認

Vineel Pratap, Anuroop Sriram, Paden Tomasello, Awni Hannun, Vitaliy Liptchinsky, Gabriel Synnaeve, Ronan Collobert

(参考訳) 我々は,低リソース言語における音声認識(ASR)の性能向上と,多様な言語をサポートするASRシステムの展開を単純化することを目的として,複数の言語を対象とした単一音響モデルの訓練を行った。言語別トレーニングデータ(100時間から1100時間)によって,51言語を対象とした広範なベンチマークを実施した。入力言語を知らずに単一関節モデルから多言語学習の3つの変種を、この情報を用いて複数の頭部(言語クラスタ毎に1つ)と比較する。複数の言語におけるASRモデルの多言語学習は、特に低リソース言語における認識性能を向上させることができることを示す。ジョイントモデルでは20.9%,23%,28.8%,単言語ベースラインでは28.8%,言語入力を伴うジョイントモデルでは20.9%,マルチヘッドモデルでは28.8%であった。私たちの知る限り、これは50以上の言語と16,000時間以上のオーディオを持つ多言語ASRを大規模に研究する最初の作品です。

We study training a single acoustic model for multiple languages with the aim of improving automatic speech recognition (ASR) performance on low-resource languages, and over-all simplifying deployment of ASR systems that support diverse languages. We perform an extensive benchmark on 51 languages, with varying amount of training data by language(from 100 hours to 1100 hours). We compare three variants of multilingual training from a single joint model without knowing the input language, to using this information, to multiple heads (one per language cluster). We show that multilingual training of ASR models on several languages can improve recognition performance, in particular, on low resource languages. We see 20.9%, 23% and 28.8% average WER relative reduction compared to monolingual baselines on joint model, joint model with language input and multi head model respectively. To our knowledge, this is the first work studying multilingual ASR at massive scale, with more than 50 languages and more than 16,000 hours of audio across them.

翻訳日:2022-11-13 03:04:37 公開日:2020-07-08

# 解剖学的構造を正確に抽出する学習

Learning to Segment Anatomical Structures Accurately from One Exemplar ( http://arxiv.org/abs/2007.03052v2 )

ライセンス: Link先を確認

Yuhang Lu, Weijian Li, Kang Zheng, Yirui Wang, Adam P. Harrison, Chihung Lin, Song Wang, Jing Xiao, Le Lu, Chang-Fu Kuo, Shun Miao

(参考訳) 重要な解剖学的構造の正確なセグメンテーションは、医療画像解析の核心にある。主なボトルネックは、必要なエキスパートラベルのイメージアノテーションをスケーラブルに収集することです。大量の注釈付きトレーニング画像を用いることなく、正確な解剖学的構造セグメンテーションを作成できる方法は非常に望ましい。本稿では,自然に組み込まれた人間のループ機構を備えた単発解剖セグメンタであるcontour transformer network(ctn)の新たな貢献を提案する。セグメンテーションはグラフ畳み込みネットワーク(gcns)に基づく輪郭進化過程を学習することによって定式化される。我々のCTNモデルのトレーニングにはラベル付き画像のみが必要であり、輪郭のグローバルな形状と外観の整合性を測定するために新たに導入された損失関数を通じてラベル付きデータを活用する。本手法は,非学習型手法を著しく上回り,最先端の完全教師付き深層学習手法と競合することを実証する。最小限のHuman-in-the-loop編集フィードバックにより、セグメンテーション性能をさらに改善し、オブザーバの望ましい結果に合わせることができる。これにより、臨床医による画像に基づくバイオマーカー評価(パーソナライズされた定量的臨床診断をサポートする)が容易になり、完全に監督された基準を上回ることができる。

Accurate segmentation of critical anatomical structures is at the core of medical image analysis. The main bottleneck lies in gathering the requisite expert-labeled image annotations in a scalable manner. Methods that permit to produce accurate anatomical structure segmentation without using a large amount of fully annotated training images are highly desirable. In this work, we propose a novel contribution of Contour Transformer Network (CTN), a one-shot anatomy segmentor including a naturally built-in human-in-the-loop mechanism. Segmentation is formulated by learning a contour evolution behavior process based on graph convolutional networks (GCNs). Training of our CTN model requires only one labeled image exemplar and leverages additional unlabeled data through newly introduced loss functions that measure the global shape and appearance consistency of contours. We demonstrate that our one-shot learning method significantly outperforms non-learning-based methods and performs competitively to the state-of-the-art fully supervised deep learning approaches. With minimal human-in-the-loop editing feedback, the segmentation performance can be further improved and tailored towards the observer desired outcomes. This can facilitate the clinician designed imaging-based biomarker assessments (to support personalized quantitative clinical diagnosis) and outperforms fully supervised baselines.

翻訳日:2022-11-13 02:54:57 公開日:2020-07-08

# 残留特徴注意深層ニューラルネットワークを用いたリモートセンシング画像のマルチイメージ超解像

Multi-image Super Resolution of Remotely Sensed Images using Residual Feature Attention Deep Neural Networks ( http://arxiv.org/abs/2007.03107v2 )

ライセンス: Link先を確認

Francesco Salvetti, Vittorio Mazzia, Aleem Khaliq, Marcello Chiaberge

(参考訳) 畳み込みニューラルネットワーク(cnns)は、画像スーパーレゾリューション(sr)における最先端の成果を一貫して証明されており、リモートセンシング分野において、キャプチャされたデータからさらなる情報や知識を抽出する絶好の機会である。しかし、文献で発表された作品の多くは、シングルイメージ超解法問題に焦点を当てている。現在、衛星ベースのリモートセンシングプラットフォームは、高時間分解能と低空間分解能の巨大なデータ可用性を提供している。本研究は,マルチイメージ超解像課題に効果的に取り組み,同時に空間的・時間的相関を利用して複数の画像を組み合わせる新しい残像注意モデル(RAMS)を提案する。本研究では3次元畳み込みによる視覚特徴の注意機構を導入し,複数の低解像度画像の認識データ融合と情報抽出を行い,局所的な畳み込み操作の限界を克服する。さらに,同じシーンで複数の入力を複数持つことで,ネステッド残差接続を広範囲に活用し,冗長な低周波信号を流し,より重要な高周波成分に演算を集中させる。単一画像または複数画像の超解像に対して利用可能な他のソリューションに対する大規模な実験と評価を行い、提案した深層学習に基づくソリューションがリモートセンシングアプリケーションにおけるマルチイメージ超解像の最先端とみなすことができることを示した。

Convolutional Neural Networks (CNNs) have been consistently proved state-of-the-art results in image Super-Resolution (SR), representing an exceptional opportunity for the remote sensing field to extract further information and knowledge from captured data. However, most of the works published in the literature have been focusing on the Single-Image Super-Resolution problem so far. At present, satellite based remote sensing platforms offer huge data availability with high temporal resolution and low spatial resolution. In this context, the presented research proposes a novel residual attention model (RAMS) that efficiently tackles the multi-image super-resolution task, simultaneously exploiting spatial and temporal correlations to combine multiple images. We introduce the mechanism of visual feature attention with 3D convolutions in order to obtain an aware data fusion and information extraction of the multiple low-resolution images, transcending limitations of the local region of convolutional operations. Moreover, having multiple inputs with the same scene, our representation learning network makes extensive use of nestled residual connections to let flow redundant low-frequency signals and focus the computation on more important high-frequency components. Extensive experimentation and evaluations against other available solutions, either for single or multi-image super-resolution, have demonstrated that the proposed deep learning-based solution can be considered state-of-the-art for Multi-Image Super-Resolution for remote sensing applications.

翻訳日:2022-11-13 02:18:11 公開日:2020-07-08

# パラメトリックマシン:アーキテクチャ検索への新しいアプローチ

Parametric machines: a fresh approach to architecture search ( http://arxiv.org/abs/2007.02777v2 )

ライセンス: Link先を確認

Pietro Vertechi, Patrizio Frosini, Mattia G. Bergomi

(参考訳) カテゴリ理論のツールを使用して、ニューラルネットワークとそのアーキテクチャを形式的に記述できるフレームワークを提供する。まず、一般的な分類学的文脈で機械の概念を定義し、より複雑なものにいかに単純な機械を結合できるかを示す。ニューラルネットワークと神経常微分方程式を一般化した,有限かつ無限大の機械を探索する。関数解析とカーネル法からアイデアを借用し、マシンの完全でノルム化された無限次元空間を構築し、与えられた計算問題を解決するために最適なアーキテクチャとパラメーターを見つける方法について議論する。我々の数値実験では、これらのカーネルにインスパイアされたネットワークは、トレーニングデータセットが小さい場合、古典的なニューラルネットワークより優れている。

Using tools from category theory, we provide a framework where artificial neural networks, and their architectures, can be formally described. We first define the notion of machine in a general categorical context, and show how simple machines can be combined into more complex ones. We explore finite- and infinite-depth machines, which generalize neural networks and neural ordinary differential equations. Borrowing ideas from functional analysis and kernel methods, we build complete, normed, infinite-dimensional spaces of machines, and discuss how to find optimal architectures and parameters -- within those spaces -- to solve a given computational problem. In our numerical experiments, these kernel-inspired networks can outperform classical neural networks when the training dataset is small.

翻訳日:2022-11-13 01:34:16 公開日:2020-07-08

# ビジョンに基づく新型コロナウイルスのソーシャルディスタンシングと臨界密度検出システム

A Vision-based Social Distancing and Critical Density Detection System for COVID-19 ( http://arxiv.org/abs/2007.03578v2 )

ライセンス: Link先を確認

Dongfang Yang, Ekim Yurtsever, Vishnu Renganathan, Keith A. Redmill, \"Umit \"Ozg\"uner

(参考訳) 新型コロナウイルス(covid-19)の感染拡大に対する効果的な対策として,ソーシャルディスタンシングが実証されている。しかし、個人は必要な6フィート(約2メートル)の距離を自分と周囲と追跡することができない。個人間の距離を検知し、警告できるアクティブ監視システムは、致命的な病気の拡散を遅らせることができる。さらに、関心領域(roi)における社会的密度の測定と流入の変調は、社会的距離違反の発生機会を減少させる。一方、データの記録や、対策に従わない個人へのラベル付けは、自由社会における個人の権利を侵害する。ここでは,人工知能(AI)に基づくリアルタイムなソーシャルディスタンシング検出・警告システムを提案する。(1)システムはデータの記録・キャッシュを決して行なわないこと,(2)警告は個人を標的にすべきでないこと,(3)人間の監督者は検出・警告ループにいないこと,(4)コードがオープンソースで公開されていること,である。本稿では,この背景に対して,単眼カメラと深層学習に基づくリアルタイム物体検出器を用いてソーシャルディスタンスを測定することを提案する。違反が検出されると、ソーシャルディスタンシング対策に違反した個人を標的にすることなく、侵入的でない音声視覚警告信号を出力する。また、社会密度が臨界値を超えた場合、システムは制御信号を送信してroiへの流入を変調する。提案手法を実世界のデータセットにまたがってテストし,その汎用性と性能を測定した。提案手法はデプロイ可能であり,コードをオープンソースにしています。

Social distancing has been proven as an effective measure against the spread of the infectious COronaVIrus Disease 2019 (COVID-19). However, individuals are not used to tracking the required 6-feet (2-meters) distance between themselves and their surroundings. An active surveillance system capable of detecting distances between individuals and warning them can slow down the spread of the deadly disease. Furthermore, measuring social density in a region of interest (ROI) and modulating inflow can decrease social distancing violation occurrence chance. On the other hand, recording data and labeling individuals who do not follow the measures will breach individuals' rights in free-societies. Here we propose an Artificial Intelligence (AI) based real-time social distancing detection and warning system considering four important ethical factors: (1) the system should never record/cache data, (2) the warnings should not target the individuals, (3) no human supervisor should be in the detection/warning loop, and (4) the code should be open-source and accessible to the public. Against this backdrop, we propose using a monocular camera and deep learning-based real-time object detectors to measure social distancing. If a violation is detected, a non-intrusive audio-visual warning signal is emitted without targeting the individual who breached the social distancing measure. Also, if the social density is over a critical value, the system sends a control signal to modulate inflow into the ROI. We tested the proposed method across real-world datasets to measure its generality and performance. The proposed method is ready for deployment, and our code is open-sourced.

翻訳日:2022-11-12 20:28:47 公開日:2020-07-08

# 胸部CT検査における肺Opacificationの分画

Segmentation of Pulmonary Opacification in Chest CT Scans of COVID-19 Patients ( http://arxiv.org/abs/2007.03643v2 )

ライセンス: Link先を確認

Keegan Lensink, Issam Laradji, Marco Law, Paolo Emilio Barbano, Savvas Nicolaou, William Parker, Eldad Haber

(参考訳) 重症急性呼吸症候群コロナウイルス2(SARS-CoV-2)は急速に世界的なパンデミックに広まっている。患者の肺に不透明な症状として現れる肺炎は、このウイルスに関連する最も一般的な発表であり、これらの変化が患者の死亡や死亡とどのように関係しているかに注目が集まっている。本研究は,胸部CT(CT)スキャンにおける肺閉塞のパターン分類のためのオープンソースモデルであり,感染のさまざまなステージと重症度に相関している。世界中の医療センターから663人の胸部CTスキャンを収集し、肺の6つの異なるパターンを分割する25,000個のスライスでピクセルワイドセグメンテーションラベルを作成しました。データセットでトレーニングされた複数のセグメンテーションモデルに対して、オープンソース実装と事前トレーニングされた重み付けを提供します。最適モデルでは,テストセットで0.76オパシティ・インターセクション・オーバー・ユニオンスコアを達成し,ドメイン適応を成功させ,専門家の1.7%以内のオパシティの容積を予測する。さらに,このタスクに固有のオブザーバ間変動の解析を行い,適切な確率的アプローチのための手法を提案する。

The Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) has rapidly spread into a global pandemic. A form of pneumonia, presenting as opacities with in a patient's lungs, is the most common presentation associated with this virus, and great attention has gone into how these changes relate to patient morbidity and mortality. In this work we provide open source models for the segmentation of patterns of pulmonary opacification on chest Computed Tomography (CT) scans which have been correlated with various stages and severities of infection. We have collected 663 chest CT scans of COVID-19 patients from healthcare centers around the world, and created pixel wise segmentation labels for nearly 25,000 slices that segment 6 different patterns of pulmonary opacification. We provide open source implementations and pre-trained weights for multiple segmentation models trained on our dataset. Our best model achieves an opacity Intersection-Over-Union score of 0.76 on our test set, demonstrates successful domain adaptation, and predicts the volume of opacification within 1.7\% of expert radiologists. Additionally, we present an analysis of the inter-observer variability inherent to this task, and propose methods for appropriate probabilistic approaches.

翻訳日:2022-11-12 20:27:45 公開日:2020-07-08

# HKR for Handwriting Kazakh & Russian Database (英語)

HKR For Handwritten Kazakh & Russian Database ( http://arxiv.org/abs/2007.03579v2 )

ライセンス: Link先を確認

Daniyar Nurseitov, Kairat Bostanbekov, Daniyar Kurmankhojayev, Anel Alimova, Abdelrahman Abdallah

(参考訳) 本稿では,オフライン手書き文字認識のための新しいロシア語とカザフ語データベース(ロシア語の95%,カザフ語/文の5%)を提案する。データベースとともにいくつかの前処理と分割手順が開発されている。データベースはキリル文字で書かれており、同じ33文字を共有している。これらの文字に加えて、カザフ文字には9つの特別な文字が含まれている。このデータセットはフォームのコレクションです。データセット内のすべてのフォームのソースは \latex によって生成され、その後手書きの人物によって埋められた。データベースは1400以上のフォームで構成されている。約63,000の文があり、約200の異なる作家によって作られた715699以上の記号がある。ディープラーニングと機械学習を使うことで、手書き認識タスクの分野で研究者に役立てることができる。

In this paper, we present a new Russian and Kazakh database (with about 95% of Russian and 5% of Kazakh words/sentences respectively) for offline handwriting recognition. A few pre-processing and segmentation procedures have been developed together with the database. The database is written in Cyrillic and shares the same 33 characters. Besides these characters, the Kazakh alphabet also contains 9 additional specific characters. This dataset is a collection of forms. The sources of all the forms in the datasets were generated by \LaTeX which subsequently was filled out by persons with their handwriting. The database consists of more than 1400 filled forms. There are approximately 63000 sentences, more than 715699 symbols produced by approximately 200 different writers. It can serve researchers in the field of handwriting recognition tasks by using deep and machine learning.

翻訳日:2022-11-12 20:09:38 公開日:2020-07-08

# 限定ラベルデータから群衆の数え方を学ぶ

Learning to Count in the Crowd from Limited Labeled Data ( http://arxiv.org/abs/2007.03195v2 )

ライセンス: Link先を確認

Vishwanath A. Sindagi, Rajeev Yasarla, Deepak Sam Babu, R. Venkatesh Babu, Vishal M. Patel

(参考訳) 最近の群衆カウントアプローチは優れたパフォーマンスを達成しました。しかし、それらは本質的に完全に教師付きパラダイムに基づいており、多数の注釈付きサンプルを必要とする。アノテーションの取得は費用がかかり、労働集約的なプロセスです。本研究では,ラベルなしデータの膨大なプールを活用しながら,限定されたサンプル数から群衆を数えることを学ぶことで,アノテーションの努力を減らすことに注力する。具体的には,非ラベルデータに対する疑似基底真理の推定を含むガウス過程に基づく反復学習機構を提案する。提案手法は上海技術, UCF-QNRF, WorldExpo, UCSDなどのいくつかのデータセットに対して, 半教師付きデータ設定で有効であることが示されている。さらに,提案手法は,実世界のデータセット(合成から現実への転送)をより一般化しながら,合成データセットから学習のネットワークを数えられるように活用できることを実証する。

Recent crowd counting approaches have achieved excellent performance. However, they are essentially based on fully supervised paradigm and require large number of annotated samples. Obtaining annotations is an expensive and labour-intensive process. In this work, we focus on reducing the annotation efforts by learning to count in the crowd from limited number of labeled samples while leveraging a large pool of unlabeled data. Specifically, we propose a Gaussian Process-based iterative learning mechanism that involves estimation of pseudo-ground truth for the unlabeled data, which is then used as supervision for training the network. The proposed method is shown to be effective under the reduced data (semi-supervised) settings for several datasets like ShanghaiTech, UCF-QNRF, WorldExpo, UCSD, etc. Furthermore, we demonstrate that the proposed method can be leveraged to enable the network in learning to count from synthetic dataset while being able to generalize better to real-world datasets (synthetic-to-real transfer).

翻訳日:2022-11-12 19:50:41 公開日:2020-07-08

# アンサンブル分類器は、中間欠性障害の検出と診断に十分強力か?

Are Ensemble Classifiers Powerful Enough for the Detection and Diagnosis of Intermediate-Severity Faults? ( http://arxiv.org/abs/2007.03167v2 )

ライセンス: Link先を確認

Baihong Jin, Yingshui Tan, Yuxin Chen, Kameshwar Poolla, Alberto Sangiovanni Vincentelli

(参考訳) 中間重度(IS)断層は、重度断層よりも軽度な症状を示し、正常な手術条件に類似しているため、検出と診断が困難である。トレーニングデータにおけるIS故障例の欠如は、機械学習(ML)技術に基づくフォールト検出・診断(FDD)手法に重大なリスクをもたらす可能性がある。エンサンブルモデルはMLに広く適用されており、アウト・オブ・ディストリビューション(OOD)データを検出するための有望な方法と考えられている。これらのモデルに共通する落とし穴を、2つの実世界のデータセット上のいくつかの一般的なアンサンブルモデルを用いて広範な実験によって同定する。次に,is障害の検出と診断のための,より効率的なアンサンブルモデルの設計方法について述べる。

Intermediate-Severity (IS) faults present milder symptoms compared to severe faults, and are more difficult to detect and diagnose due to their close resemblance to normal operating conditions. The lack of IS fault examples in the training data can pose severe risks to Fault Detection and Diagnosis (FDD) methods that are built upon Machine Learning (ML) techniques, because these faults can be easily mistaken as normal operating conditions. Ensemble models are widely applied in ML and are considered promising methods for detecting out-of-distribution (OOD) data. We identify common pitfalls in these models through extensive experiments with several popular ensemble models on two real-world datasets. Then, we discuss how to design more effective ensemble models for detecting and diagnosing IS faults.

翻訳日:2022-11-12 18:49:58 公開日:2020-07-08

# ニューラルプログラムにおける強一般化と効率性

Strong Generalization and Efficiency in Neural Programs ( http://arxiv.org/abs/2007.03629v2 )

ライセンス: Link先を確認

Yujia Li, Felix Gimeno, Pushmeet Kohli, Oriol Vinyals

(参考訳) 本研究では,神経プログラム誘導の枠組みを一般化した効率的なアルゴリズムを学習する問題について検討する。神経モデルの入力/出力インターフェースを慎重に設計し、模倣することで、任意の入力サイズに対して正しい結果を生成するモデルを学び、強力な一般化を達成することができる。さらに,強化学習を用いることで,プログラム効率の指標を最適化し,模倣に用いる教師を上回る新しいアルゴリズムを探索する。これにより、ソート、順序付きリストの検索、NP完全 0/1 knapsack 問題など、さまざまな問題においてカスタム記述されたソリューションよりも優れた結果が得られる。ハイライトとして、私たちの学習したモデルは、テストした任意の入力データサイズで完全にソートを実行でき、o(n log n)$の複雑さで、手入力されたアルゴリズムよりも優れています。

We study the problem of learning efficient algorithms that strongly generalize in the framework of neural program induction. By carefully designing the input / output interfaces of the neural model and through imitation, we are able to learn models that produce correct results for arbitrary input sizes, achieving strong generalization. Moreover, by using reinforcement learning, we optimize for program efficiency metrics, and discover new algorithms that surpass the teacher used in imitation. With this, our approach can learn to outperform custom-written solutions for a variety of problems, as we tested it on sorting, searching in ordered lists and the NP-complete 0/1 knapsack problem, which sets a notable milestone in the field of Neural Program Induction. As highlights, our learned model can perform sorting perfectly on any input data size we tested on, with $O(n log n)$ complexity, whilst outperforming hand-coded algorithms, including quick sort, in number of operations even for list sizes far beyond those seen during training.

翻訳日:2022-11-12 18:14:23 公開日:2020-07-08

# マルチタスク学習によるX線後方散乱と前方散乱の同時推定

Simultaneous Estimation of X-ray Back-Scatter and Forward-Scatter using Multi-Task Learning ( http://arxiv.org/abs/2007.04018v1 )

ライセンス: Link先を確認

Philipp Roser, Xia Zhong, Annette Birkhold, Alexander Preuhs, Christopher Syben, Elisabeth Hoppe, Norbert Strobel, Markus Kowarschik, Rebecca Fahrig, Andreas Maier

(参考訳) 散乱放射は2つの方法でx線画像誘導の手順に影響を与える主要な関心事である。まず、後方散乱は複雑な介入の際の患者(皮膚)の服用に大きく寄与する。第2に、前方散乱放射は投影画像のコントラストを減少させ、3次元再構成においてアーティファクトを導入する。従来の抗散乱格子はX線を遮断することで画質を向上するが、検出器の抗散乱格子による追加の減衰は高用量で補償する必要がある。これはまた、患者を世話するスタッフに影響する服用量も増加させる。皮膚線量定量化には、予め決定されたスカラーバック散乱因子または線形点拡散関数を患者表面点への一次ケルマ前方射影に適用することにより、バック散乱が考慮される。しかし, 患者形状が異なるため, 従来の方法の一般化は限られている。そこで本研究では,従来の手法と学習に基づく手法を組み合わせることで,検出器に到達した前方散乱と患者皮膚線量に影響を及ぼす後方散乱を同時に推定する手法を提案する。前方散乱を知ればX線投射を補正できるが,後方散乱成分の良好な推定は皮膚線量評価の改善に役立つ。後方散乱と後方散乱を同時に推定するために,X線物理とニューラルネットワークを組み合わせることで,後方散乱と前方散乱の同時推定を行うマルチタスク手法を提案する。理論的には, どちらの場合においても高精度な散乱推定が可能となる。さらに,マルチタスクフレームワークの研究方向と学習に基づく散乱推定を一般論として示す。

Scattered radiation is a major concern impacting X-ray image-guided procedures in two ways. First, back-scatter significantly contributes to patient (skin) dose during complicated interventions. Second, forward-scattered radiation reduces contrast in projection images and introduces artifacts in 3-D reconstructions. While conventionally employed anti-scatter grids improve image quality by blocking X-rays, the additional attenuation due to the anti-scatter grid at the detector needs to be compensated for by a higher patient entrance dose. This also increases the room dose affecting the staff caring for the patient. For skin dose quantification, back-scatter is usually accounted for by applying pre-determined scalar back-scatter factors or linear point spread functions to a primary kerma forward projection onto a patient surface point. However, as patients come in different shapes, the generalization of conventional methods is limited. Here, we propose a novel approach combining conventional techniques with learning-based methods to simultaneously estimate the forward-scatter reaching the detector as well as the back-scatter affecting the patient skin dose. Knowing the forward-scatter, we can correct X-ray projections, while a good estimate of the back-scatter component facilitates an improved skin dose assessment. To simultaneously estimate forward-scatter as well as back-scatter, we propose a multi-task approach for joint back- and forward-scatter estimation by combining X-ray physics with neural networks. We show that, in theory, highly accurate scatter estimation in both cases is possible. In addition, we identify research directions for our multi-task framework and learning-based scatter estimation in general.

翻訳日:2022-11-12 13:54:36 公開日:2020-07-08

# スマートウォッチによるエピデミック露光通知: 近接性に基づくプライバシー保護アプローチ

Epidemic Exposure Notification with Smartwatch: A Proximity-Based Privacy-Preserving Approach ( http://arxiv.org/abs/2007.04399v1 )

ライセンス: Link先を確認

Pai Chet Ng, Petros Spachos, Stefano Gregori, Konstantinos Plataniotis

(参考訳) パンデミック後の世界のビジネスは、従業員や顧客の健康と福祉を守る革新的な方法を模索している。無線技術は、接触追跡の補助として重要な役割を担い、局所感染の発生を早急に防ぎ、さらなる拡散を防ぐ。本研究は,ビジネス,ホスピタリティ,レクリエーション施設における安全な物理的距離を助長するスマートウォッチに基づくウェアラブル近接露光通知ソリューションを提案する。近距離ベースのプライバシー保存型コンタクトトレース(p$^3$ct)は、信頼性の高い近接センシングにbluetooth low energy(ble)技術と、アイデンティティを保存するアンビエントシグネチャプロトコルを利用しています。近接センシングは、受信信号強度(rss)を利用してユーザのインタラクションを検出し、感染症と診断された患者に対して、それらを低リスクまたは高リスクに分類する。より正確には、ユーザーは、患者との距離と時間の観点から、彼らの相互作用に基づいて、自分の露出を通知される。我々のプライバシー保護プロトコルは、ユーザーの身元が匿名化されることを保証するために、周囲の署名を使用する。提案手法の有効性を広範囲な実験により実証する。

Businesses planning for the post-pandemic world are looking for innovative ways to protect the health and welfare of their employees and customers. Wireless technologies can play a key role in assisting contact tracing to quickly halt a local infection outbreak and prevent further spread. In this work, we present a wearable proximity and exposure notification solution based on a smartwatch that also promotes safe physical distancing in business, hospitality, or recreational facilities. Our proximity-based privacy-preserving contact tracing (P$^3$CT) leverages the Bluetooth Low Energy (BLE) technology for reliable proximity sensing, and an ambient signature protocol for preserving identity. Proximity sensing exploits the received signal strength (RSS) to detect the user's interaction and thus classifying them into low- or high-risk with respect to a patient diagnosed with an infectious disease. More precisely, a user is notified of their exposure based on their interactions, in terms of distance and time, with a patient. Our privacy-preserving protocol uses the ambient signatures to ensure that users' identities be anonymized. We demonstrate the feasibility of our proposed solution through extensive experimentation.

翻訳日:2022-11-12 13:54:15 公開日:2020-07-08

# 可変 Lebesgue 空間におけるニューラルネットワークの近似

Approximation with Neural Networks in Variable Lebesgue Spaces ( http://arxiv.org/abs/2007.04166v1 )

ライセンス: Link先を確認

\'Angela Capel and Jes\'us Oc\'ariz

(参考訳) 本稿では,可変ルベーグ空間におけるニューラルネットワークの普遍近似特性について述べる。空間の指数関数が有界となると、任意の所望の精度で全ての関数を浅いニューラルネットワークで近似できることを示す。この結果は、指数関数の有界性に依存する近似の普遍性を決定する。さらに、指数が非有界であるときは常に、近似できる関数の部分空間に対するいくつかの特徴づけ結果が得られる。

This paper concerns the universal approximation property with neural networks in variable Lebesgue spaces. We show that, whenever the exponent function of the space is bounded, every function can be approximated with shallow neural networks with any desired accuracy. This result subsequently leads to determine the universality of the approximation depending on the boundedness of the exponent function. Furthermore, whenever the exponent is unbounded, we obtain some characterization results for the subspace of functions that can be approximated.

翻訳日:2022-11-12 13:52:03 公開日:2020-07-08

# オープンワールド機械学習の批判的評価

A Critical Evaluation of Open-World Machine Learning ( http://arxiv.org/abs/2007.04391v1 )

ライセンス: Link先を確認

Liwei Song, Vikash Sehwag, Arjun Nitin Bhagoji, Prateek Mittal

(参考訳) オープンワールド機械学習(ML)は、オフ・オブ・ディストリビューション(OOD)検出器とイン・ディストリビューションデータに基づいてトレーニングされたクローズドワールドモデルを組み合わせる。オープンワールドMLシステムに関するこれまでの研究は、多様でおそらくは敵対的な条件下での信頼性のテストに失敗する。そこで本稿では,システムコンポーネントの変更に対して,最先端のオープンワールドMLシステムがいかにレジリエンスであるかを理解する。 6つのOOD検出器で評価した結果,OOD検出性能には分布内データ,モデルアーキテクチャ,OODデータの選択が強く影響し,70 %以上の偽陽性率を誘導することがわかった。さらに、22の意図しない汚職や敵対的な摂動を伴うOOD入力が、オープンワールドMLシステムに最大100\%の偽陽性率で使用できないことを示す。オープンワールドMLのレジリエンスを高めるため、ロバスト分類器とOOD検出技術を組み合わせて、OOD検出とロバストネスの新たなトレードオフを明らかにする。

Open-world machine learning (ML) combines closed-world models trained on in-distribution data with out-of-distribution (OOD) detectors, which aim to detect and reject OOD inputs. Previous works on open-world ML systems usually fail to test their reliability under diverse, and possibly adversarial conditions. Therefore, in this paper, we seek to understand how resilient are state-of-the-art open-world ML systems to changes in system components? With our evaluation across 6 OOD detectors, we find that the choice of in-distribution data, model architecture and OOD data have a strong impact on OOD detection performance, inducing false positive rates in excess of $70\%$. We further show that OOD inputs with 22 unintentional corruptions or adversarial perturbations render open-world ML systems unusable with false positive rates of up to $100\%$. To increase the resilience of open-world ML, we combine robust classifiers with OOD detection techniques and uncover a new trade-off between OOD detection and robustness.

翻訳日:2022-11-12 13:51:56 公開日:2020-07-08

# ガイドスターフリー画像誘導波面整形

Guidestar-free image-guided wavefront-shaping ( http://arxiv.org/abs/2007.03956v1 )

ライセンス: Link先を確認

Tomer Yeminy and Ori Katz

(参考訳) 散乱媒体による光学イメージングは多くの応用において基本的な課題である。近年, 生体組織を画像化したり, 角を見回したりといった重要なブレークスルーが, 波面形状のアプローチによって得られている。しかし、これらは波面補正、コヒーレント照明の制御、そして多くの場合、形状の焦点をラスター走査するために埋め込まれたガイドスターを必要とする。スペックル相関を利用し、ガイドスターやウェーブフロント制御を回避できる別の新しい計算手法は、メモリ効果相関範囲に含まれる小さな2次元オブジェクトに限られる。そこで本研究では,非侵襲的でガイドスターフリーで広視野の非コヒーレントイメージングを高散乱層を通じて実現し,照明制御を行なわない,画像誘導波面形成という新しい概念を提案する。最も重要なのは、画像品質のメトリクスを盲目的に最適化することで、メモリ効果範囲よりも大きいオブジェクトでもウェーブフロント補正が見つかることです。高散乱層とマルチコアファイバによる拡張物体のイメージングを実演し、顕微鏡から内視鏡まで様々な応用において非侵襲的なイメージングの道を開く。

Optical imaging through scattering media is a fundamental challenge in many applications. Recently, substantial breakthroughs such as imaging through biological tissues and looking around corners have been obtained by the use of wavefront-shaping approaches. However, these require an implanted guide-star for determining the wavefront correction, controlled coherent illumination, and most often raster scanning of the shaped focus. Alternative novel computational approaches that exploit speckle correlations, avoid guide-stars and wavefront control but are limited to small two-dimensional objects contained within the memory-effect correlations range. Here, we present a new concept, image-guided wavefront-shaping, allowing non-invasive, guidestar-free, widefield, incoherent imaging through highly scattering layers, without illumination control. Most importantly, the wavefront-correction is found even for objects that are larger than the memory-effect range, by blindly optimizing image-quality metrics. We demonstrate imaging of extended objects through highly-scattering layers and multi-core fibers, paving the way for non-invasive imaging in various applications, from microscopy to endoscopy.

翻訳日:2022-11-12 13:50:20 公開日:2020-07-08

# 3dポイントクラウドデータ圧縮技術の最近の動向と3d圧縮領域における直接処理の課題

A Quick Review on Recent Trends in 3D Point Cloud Data Compression Techniques and the Challenges of Direct Processing in 3D Compressed Domain ( http://arxiv.org/abs/2007.05038v1 )

ライセンス: Link先を確認

Mohammed Javed and MD Meraz and Pavan Chakraborty

(参考訳) オブジェクトの検出、追跡、セグメンテーションのための3Dポイントクラウドデータの自動処理は、AIとデータサイエンスの分野における最新のトレンド研究である。しかし、(LiDARを使った)3Dポイントクラウドの形で作成されているデータの量は極めて大きく、研究者は現在、生成した大量のデータを処理するために、新しいデータ圧縮アルゴリズムの発明を進めている。しかし、一方の圧縮は、空間要求を克服する利点があるが、他方の処理は、余分な計算資源を注入する減圧のために高価になる。したがって、圧縮されたデータを直接操作・分析できるアルゴリズムを、圧縮と再圧縮の段階を伴わずに開発する(何度も要求されるように、圧縮されたデータを操作または解析する必要がある)。この研究分野はCompressed Domain Processingと呼ばれている。本稿では,LiDARが生成する3Dポイントクラウドデータ圧縮領域における最近の最先端技術開発について概説するとともに,3Dポイントクラウドデータの圧縮ドメイン処理の今後の課題を取り上げる。

Automatic processing of 3D Point Cloud data for object detection, tracking and segmentation is the latest trending research in the field of AI and Data Science, which is specifically aimed at solving different challenges of autonomous driving cars and getting real time performance. However, the amount of data that is being produced in the form of 3D point cloud (with LiDAR) is very huge, due to which the researchers are now on the way inventing new data compression algorithms to handle huge volumes of data thus generated. However, compression on one hand has an advantage in overcoming space requirements, but on the other hand, its processing gets expensive due to the decompression, which indents additional computing resources. Therefore, it would be novel to think of developing algorithms that can operate/analyse directly with the compressed data without involving the stages of decompression and recompression (required as many times, the compressed data needs to be operated or analyzed). This research field is termed as Compressed Domain Processing. In this paper, we will quickly review few of the recent state-of-the-art developments in the area of LiDAR generated 3D point cloud data compression, and highlight the future challenges of compressed domain processing of 3D point cloud data.

翻訳日:2022-11-12 13:49:40 公開日:2020-07-08

# VEC-OFによるオープンソフトウェア定義モビリティエコシステムの実現

Enable an Open Software Defined Mobility Ecosystem through VEC-OF ( http://arxiv.org/abs/2007.03879v1 )

ライセンス: Link先を確認

Sanchu Han, Yong He, Yin Ding

(参考訳) OEMs and new entrants can take the Mobility as a Service market (MaaS) as the entry point, upgrade its E/E (Electric and Electronic) architecture to be C/C (Computing and Communication) architecture, build one open software defined and data driven software platform for its production and service model, use efficient and collaborative ways of vehicles, roads, cloud and network to continuously improve core technologies such as autonomous driving, provide MaaS operators with an affordable and agile platform. 本稿では,VEC-OF(Vehicle-Edge-Cloud Open Framework)という新しいフレームワークを提案する。Vehicle-Edge-Cloud Open Frameworkは,より安全で,より効率的で,接続性が高く,信頼性の高いMaaSを実現するための,新たなデータおよびAI中心の自動車ソフトウェアフレームワークである。

OEMs and new entrants can take the Mobility as a Service market (MaaS) as the entry point, upgrade its E/E (Electric and Electronic) architecture to be C/C (Computing and Communication) architecture, build one open software defined and data driven software platform for its production and service model, use efficient and collaborative ways of vehicles, roads, cloud and network to continuously improve core technologies such as autonomous driving, provide MaaS operators with an affordable and agile platform. In this paper we present one new framework, VEC-OF (Vehicle-Edge-Cloud Open Framework), which is a new data and AI centric vehicle software framework enabling a much safer, more efficient, connected and trusted MaaS through cooperative vehicle, infrastructure and cloud capabilities and intelligence

翻訳日:2022-11-12 13:49:18 公開日:2020-07-08

# インテリジェント車両のためのカメラとクラウドデジタルツイン情報のセンサ融合

Sensor Fusion of Camera and Cloud Digital Twin Information for Intelligent Vehicles ( http://arxiv.org/abs/2007.04350v1 )

ライセンス: Link先を確認

Yongkang Liu, Ziran Wang, Kyungtae Han, Zhenyu Shou, Prashant Tiwari, and John H. L. Hansen

(参考訳) インテリジェントな車両と高度運転支援システム(ADAS)の急速な発展に伴い、交通システムには様々なレベルの人間ドライバーの関与が関与している。この状況下では、ドライバーの視覚誘導は潜在的なリスクを防ぐために不可欠である。本稿では,視覚誘導システムの開発を進めるために,カメラ画像とクラウドからのデジタルツイン知識を統合した新しいセンサ融合手法を提案する。目標車両バウンディングボックスは、エゴ車両上を走行する物体検出器の結果と雲からの位置情報とを組み合わせることで描画・マッチングされる。ユニオン(iou)しきい値の0.7の交点で79.2%の精度で一致し、追加の特徴点として深度画像が得られた。ゲームエンジンベースのシミュレーション結果は、視覚誘導システムがクラウドデジタルツインシステムと大幅に協調して運転安全性を向上させることも明らかにしている。

With the rapid development of intelligent vehicles and Advanced Driving Assistance Systems (ADAS), a mixed level of human driver engagements is involved in the transportation system. Visual guidance for drivers is essential under this situation to prevent potential risks. To advance the development of visual guidance systems, we introduce a novel sensor fusion methodology, integrating camera image and Digital Twin knowledge from the cloud. Target vehicle bounding box is drawn and matched by combining results of object detector running on ego vehicle and position information from the cloud. The best matching result, with a 79.2% accuracy under 0.7 Intersection over Union (IoU) threshold, is obtained with depth image served as an additional feature source. Game engine-based simulation results also reveal that the visual guidance system could improve driving safety significantly cooperate with the cloud Digital Twin system.

翻訳日:2022-11-12 13:42:55 公開日:2020-07-08

# 微小知覚超解像への旅

Journey Towards Tiny Perceptual Super-Resolution ( http://arxiv.org/abs/2007.04356v1 )

ライセンス: Link先を確認

Royson Lee, {\L}ukasz Dudziak, Mohamed Abdelfattah, Stylianos I. Venieris, Hyeji Kim, Hongkai Wen, Nicholas D. Lane

(参考訳) シングルイメージ知覚超解像(SR)における最近の研究は、深層畳み込みネットワークによる現実的なテクスチャの生成において、前例のない性能を示した。しかし、これらの畳み込みモデルはあまりに大きく高価であり、エンドデバイスへの効果的な展開を妨げる。本研究では,nasとgenerative adversarial networks(gans)を統合したニューラルネットワーク探索(nas)手法を提案する。具体的には,生成器と判別器の両方のアーキテクチャを逐次的に探索し,sr最適化判別器を探索するユニークな課題と重要な観察を強調し,既存の判別器アーキテクチャと比較する。我々の小さな知覚的SR(TPSR)モデルは、フル参照知覚計量(LPIPS)と歪み計量(PSNR)の両方でSRGANとEnhanceNetを上回り、それぞれ26.4$\times$よりメモリ効率が良く、33.6$\times$より計算効率が良い。

Recent works in single-image perceptual super-resolution (SR) have demonstrated unprecedented performance in generating realistic textures by means of deep convolutional networks. However, these convolutional models are excessively large and expensive, hindering their effective deployment to end devices. In this work, we propose a neural architecture search (NAS) approach that integrates NAS and generative adversarial networks (GANs) with recent advances in perceptual SR and pushes the efficiency of small perceptual SR models to facilitate on-device execution. Specifically, we search over the architectures of both the generator and the discriminator sequentially, highlighting the unique challenges and key observations of searching for an SR-optimized discriminator and comparing them with existing discriminator architectures in the literature. Our tiny perceptual SR (TPSR) models outperform SRGAN and EnhanceNet on both full-reference perceptual metric (LPIPS) and distortion metric (PSNR) while being up to 26.4$\times$ more memory efficient and 33.6$\times$ more compute efficient respectively.

翻訳日:2022-11-12 13:42:39 公開日:2020-07-08

# アート素材としての言葉:連続GANによる絵画の生成

Words as Art Materials: Generating Paintings with Sequential GANs ( http://arxiv.org/abs/2007.04383v1 )

ライセンス: Link先を確認

Azmi Can \"Ozgen, Haz{\i}m Kemal Ekenel

(参考訳) ジェネレーティブ・アドバイサル・ネットワークを用いた画像へのテキスト記述の変換が研究分野として人気を博している。近年,視覚的に魅力的な画像が生成されている。これらの研究に触発されて,大分散データセット上での芸術的画像の生成について検討した。このデータセットには、形状、色、内容など、バリエーションのあるイメージが含まれている。これらの画像のバリエーションは、芸術的本質の重要な要素である独創性をもたらす。私たちの研究の大きな特徴は、文章ではなく、画像記述としてキーワードを使うことです。ネットワークアーキテクチャとして,逐次生成適応型ネットワークモデルを提案する。この逐次モデルの最初の段階はワードベクトルを処理してベース画像を生成するが、次の段階は単語ベクトルを使わずに高解像度の芸術的なイメージを作成することに焦点を当てる。我々はganの不安定性に対処するため,wasserstein損失,スペクトル正規化,ミニバッチ識別などの混合手法を提案した。最終的には、さまざまなスタイルの絵画画像を生成することができました。 fr\'echetインセプション距離スコアを用いて評価を行い,186名を対象にユーザ調査を行った。

Converting text descriptions into images using Generative Adversarial Networks has become a popular research area. Visually appealing images have been generated successfully in recent years. Inspired by these studies, we investigated the generation of artistic images on a large variance dataset. This dataset includes images with variations, for example, in shape, color, and content. These variations in images provide originality which is an important factor for artistic essence. One major characteristic of our work is that we used keywords as image descriptions, instead of sentences. As the network architecture, we proposed a sequential Generative Adversarial Network model. The first stage of this sequential model processes the word vectors and creates a base image whereas the next stages focus on creating high-resolution artistic-style images without working on word vectors. To deal with the unstable nature of GANs, we proposed a mixture of techniques like Wasserstein loss, spectral normalization, and minibatch discrimination. Ultimately, we were able to generate painting images, which have a variety of styles. We evaluated our results by using the Fr\'echet Inception Distance score and conducted a user study with 186 participants.

翻訳日:2022-11-12 13:42:17 公開日:2020-07-08

# ロボット手術における楽器セグメンテーションのための効率的な構造探索

Searching for Efficient Architecture for Instrument Segmentation in Robotic Surgery ( http://arxiv.org/abs/2007.04449v1 )

ライセンス: Link先を確認

Daniil Pakhomov, Nassir Navab

(参考訳) 手術器具のセグメンテーションはロボット支援手術において重要な問題であり、完全な楽器ポーズ推定への重要なステップであり、手術中の拡張現実オーバーレイのマスキングに直接使用される。ほとんどのアプリケーションは、高精度外科画像の正確なリアルタイムセグメンテーションに依存している。従来の研究は主に高精度なセグメンテーションマスクを提供する手法に焦点を当てていたが、その大半は計算コストのためリアルタイムアプリケーションでは使用できない。本研究では,高解像度画像のリアルタイム推論を行うために,軽量かつ高効率な深部残差アーキテクチャを設計する。検出した重み付きディープ残差ネットワークの精度の低下と、追加の計算負荷の増大を避けるため、ネットワークの残差単位に対する拡張率の差分探索を行う。我々は、EndoVis 2017 Robotic Instrumentsデータセットで発見されたアーキテクチャを検証し、私たちのモデルは、高解像度画像上で125FPSの速度で、スピードと精度のトレードオフの観点から最先端のモデルであることを検証した。

Segmentation of surgical instruments is an important problem in robot-assisted surgery: it is a crucial step towards full instrument pose estimation and is directly used for masking of augmented reality overlays during surgical procedures. Most applications rely on accurate real-time segmentation of high-resolution surgical images. While previous research focused primarily on methods that deliver high accuracy segmentation masks, majority of them can not be used for real-time applications due to their computational cost. In this work, we design a light-weight and highly-efficient deep residual architecture which is tuned to perform real-time inference of high-resolution images. To account for reduced accuracy of the discovered light-weight deep residual network and avoid adding any additional computational burden, we perform a differentiable search over dilation rates for residual units of our network. We test our discovered architecture on the EndoVis 2017 Robotic Instruments dataset and verify that our model is the state-of-the-art in terms of speed and accuracy tradeoff with a speed of up to 125 FPS on high resolution images.

翻訳日:2022-11-12 13:42:00 公開日:2020-07-08

# 深部fiducial inference

Deep Fiducial Inference ( http://arxiv.org/abs/2007.04285v1 )

ライセンス: Link先を確認

Gang Li, Jan Hannig

(参考訳) 2000年代中頃から、fiducial inferenceの現代的な修正への関心が復活した。これまで、一般化された分裂分布を抽出する主な計算ツールはマルコフ連鎖モンテカルロ(mcmc)である。本稿では,複雑な状況で使用可能な一般化されたfiducial distributionの計算方法を提案する。特に,非正規化fiducial density (mcmc) の難易度を克服するために,fiducial autoencoder (fae) を設計した。装着されたオートエンコーダを用いて未知パラメータの一般化されたフィデューシャルサンプルを生成する。精度を向上させるために,デコーダに差し込むと観測データを十分に再現できないサンプルを除去し,近似フィデューシャル計算(AFC)アルゴリズムを適用した。数値実験により,faeをベースとする逆解の有効性と,afc補正faeソリューションの精度が向上した。

Since the mid-2000s, there has been a resurrection of interest in modern modifications of fiducial inference. To date, the main computational tool to extract a generalized fiducial distribution is Markov chain Monte Carlo (MCMC). We propose an alternative way of computing a generalized fiducial distribution that could be used in complex situations. In particular, to overcome the difficulty when the unnormalized fiducial density (needed for MCMC), we design a fiducial autoencoder (FAE). The fitted autoencoder is used to generate generalized fiducial samples of the unknown parameters. To increase accuracy, we then apply an approximate fiducial computation (AFC) algorithm, by rejecting samples that when plugged into a decoder do not replicate the observed data well enough. Our numerical experiments show the effectiveness of our FAE-based inverse solution and the excellent coverage performance of the AFC corrected FAE solution.

翻訳日:2022-11-12 13:41:28 公開日:2020-07-08

# ディープラーニングモデルの分散トレーニング--分類学的観点から

Distributed Training of Deep Learning Models: A Taxonomic Perspective ( http://arxiv.org/abs/2007.03970v1 )

ライセンス: Link先を確認

Matthias Langer, Zhen He, Wenny Rahayu, and Yanbo Xue

(参考訳) distributed deep learning systems (ddls)は、クラスタの分散リソースを利用してディープニューラルネットワークモデルをトレーニングする。 DDLSの開発者は、選択した環境で特定のワークロードを効率的に処理するための多くの決定をする必要がある。 GPUベースのディープラーニングの出現、データセットとディープニューラルネットワークモデルの絶え間なく増加するサイズ、クラスタ環境に存在する帯域制限と組み合わせることで、DDLSの開発者は、高品質モデルを迅速にトレーニングするために革新的である必要がある。 DDLSを並べて比較するのは、広範な機能リストとアーキテクチャ上の違いのため難しい。我々は、ディープラーニングモデルのトレーニングに関連する一般的な特性を分析し、そのようなワークロードをクラスタに分散して協調的なモデルトレーニングを実現することで、独立したマシンのクラスタ内でディープニューラルネットワークをトレーニングする際の基本的な原則に光を当てることを目指している。そこで,現代DDLSが使用する様々な技術の概要を述べ,その教育過程への影響と意義について論じる。 DDLSを概念化し、比較するために、異なるテクニックをカテゴリに分類し、分散ディープラーニングシステムの分類を確立させる。

Distributed deep learning systems (DDLS) train deep neural network models by utilizing the distributed resources of a cluster. Developers of DDLS are required to make many decisions to process their particular workloads in their chosen environment efficiently. The advent of GPU-based deep learning, the ever-increasing size of datasets and deep neural network models, in combination with the bandwidth constraints that exist in cluster environments require developers of DDLS to be innovative in order to train high quality models quickly. Comparing DDLS side-by-side is difficult due to their extensive feature lists and architectural deviations. We aim to shine some light on the fundamental principles that are at work when training deep neural networks in a cluster of independent machines by analyzing the general properties associated with training deep learning models and how such workloads can be distributed in a cluster to achieve collaborative model training. Thereby we provide an overview of the different techniques that are used by contemporary DDLS and discuss their influence and implications on the training process. To conceptualize and compare DDLS, we group different techniques into categories, thus establishing a taxonomy of distributed deep learning systems.

翻訳日:2022-11-12 13:40:42 公開日:2020-07-08

# 深層学習から見たスプリット製造の攻撃

Attacking Split Manufacturing from a Deep Learning Perspective ( http://arxiv.org/abs/2007.03989v1 )

ライセンス: Link先を確認

Haocheng Li, Satwik Patnaik, Abhrajit Sengupta, Haoyu Yang, Johann Knechtel, Bei Yu, Evangeline F. Y. Young, Ozgur Sinanoglu

(参考訳) フォワード・オブ・ライン(FEOL)とバック・エンド・オブ・ライン(BEOL)の部品を異なるファウンドリーに委譲する集積回路分割製造の概念は、知的財産(IP)の過剰生産、海賊行為、あるいはFEOL施設の敵によるハードウェア・トロイの木馬の侵入を防ぐことである。本研究では,様々なレイアウトレベルの配置とルーティングヒントをベクトルおよび画像に基づく特徴として定式化することにより,スプリット製造のセキュリティ約束に挑戦する。我々は,不足しているbeol接続を高精度に推定可能な,高度な深層ニューラルネットワークを構築した。 ISCAS-85ベンチマークと同様のネットワークフロー攻撃[1]と比較して、M1で分割すると1.21倍、M3で1%以下の動作時間で分割すると1.12倍の精度が得られる。

The notion of integrated circuit split manufacturing which delegates the front-end-of-line (FEOL) and back-end-of-line (BEOL) parts to different foundries, is to prevent overproduction, piracy of the intellectual property (IP), or targeted insertion of hardware Trojans by adversaries in the FEOL facility. In this work, we challenge the security promise of split manufacturing by formulating various layout-level placement and routing hints as vector- and image-based features. We construct a sophisticated deep neural network which can infer the missing BEOL connections with high accuracy. Compared with the publicly available network-flow attack [1], for the same set of ISCAS-85 benchmarks, we achieve 1.21X accuracy when splitting on M1 and 1.12X accuracy when splitting on M3 with less than 1% running time.

翻訳日:2022-11-12 13:40:23 公開日:2020-07-08

# topsisを用いた内部サプライチェーン最適化の戦略評価:コイル巻線機メーカーにおける実証

Strategic Evaluation in Optimizing the Internal Supply Chain Using TOPSIS: Evidence In A Coil Winding Machine Manufacturer ( http://arxiv.org/abs/2007.10121v1 )

ライセンス: Link先を確認

Dilip U Shenoy, Vinay Sharma, Shiva HC Prasad

(参考訳) 製造会社の大半は、付加価値による商品の収益性の向上の観点から、サプライチェーンの最適化を目指している。本研究は、特定の基準に関して、内部サプライチェーンの性能に影響する要因を批判的に検討する。したがって、これらの要因を製造業におけるサプライチェーンのパフォーマンスの重要な側面にランク付けする。企業の意思決定者から回答を集めるために使用される、事前定義された一連の質問に対する半構造化インタビュー。 TOPSISと呼ばれる多基準意思決定ツールを使用して、応答を評価し、要因をランク付けする。この結果から,サプライヤ関係と在庫計画が,製品提供のオンタイム化,生産柔軟性,コスト削減,追加コストに正の影響を与えていることが示唆された。本研究は,客観的および主観的評価手法を用いてプロセスパラメータの同定と最適化を支援する。本研究は,マネージャの思考過程が内部サプライチェーンを最適化する上での複合的影響を抽出したものである。

Most of the manufacturing firm aims to optimize their Supply Chain in terms of improved profitability of its products through value Addition. This study takes a critical look into the factors that affect the Performance of internal supply chain with respect to specific criteria. Accordingly, ranking these factors to get the critical dimensions of supply chain performance in the manufacturing industry. A semi-structured interview with the pre-defined set of questions used to collect the responses from decision makers of the firm. Multi criteria decision-making tool called TOPSIS is used to evaluate the responses and rank the factors. The results of this indicate that supplier relationship and inventory planning were most principal factors positively influencing on-time delivery of the product, production flexibility, cost savings, additional costs. This study helps to identify and optimize the process parameters using objective and subjective evaluation approach. The combined influence of the thought process of the manager to optimize the internal supply chain is extracted in this work.

翻訳日:2022-11-12 13:34:34 公開日:2020-07-08

# 連立言語同定を用いたエンドツーエンドのバイリンガルASRシステム

Streaming End-to-End Bilingual ASR Systems with Joint Language Identification ( http://arxiv.org/abs/2007.03900v1 )

ライセンス: Link先を確認

Surabhi Punjabi, Harish Arsikere, Zeynab Raeesy, Chander Chandak, Nikhil Bhave, Ankish Bansal, Markus M\"uller, Sergio Murillo, Ariya Rastrow, Sri Garimella, Roland Maas, Mat Hans, Athanasios Mouchtaris, Siegfried Kunzmann

(参考訳) 多言語ASR技術は、モデルトレーニングとデプロイを単純化するが、その精度は実行時の言語情報の可用性に依存することが知られている。言語のアイデンティティは、現実のシナリオでは事前には知られていないため、最小限のレイテンシでオンザフライで推測する必要がある。さらに音声アクティブなスマートアシスタントシステムでは、ASR出力の下流処理には言語アイデンティティも必要である。本稿では,recurrent neural network transducer(rnn-t)アーキテクチャを用いて,asrと言語識別(lid)の両方を実行するストリーミング,エンドツーエンドのバイリンガルシステムを提案する。入力側では、事前訓練された音響専用LID分類器からの埋め込みを用いて、RNN-Tのトレーニングと推論を誘導する一方、出力側では、言語ターゲットをASRターゲットと共同でモデル化する。提案手法は、アメリカ合衆国で話される英語とスペイン語、インドで話される英語とヒンディーの2つの言語対に適用できる。 ASR-LIDアーキテクチャは英語とスペイン語では単言語ASRと音響のみのLIDの精度に一致している。英語ヒンディー語のより難易度の高い(言語内コードのスイッチングによる)ケースでは、英語のasrとlidメトリクスが劣化している。全体として、ユーザが動的に言語間で切り替えるシナリオでは、提案アーキテクチャは複数のモノリンガル ASR モデルと LID 分類器を並列に実行するよりも、有望な単純化を提供する。

Multilingual ASR technology simplifies model training and deployment, but its accuracy is known to depend on the availability of language information at runtime. Since language identity is seldom known beforehand in real-world scenarios, it must be inferred on-the-fly with minimum latency. Furthermore, in voice-activated smart assistant systems, language identity is also required for downstream processing of ASR output. In this paper, we introduce streaming, end-to-end, bilingual systems that perform both ASR and language identification (LID) using the recurrent neural network transducer (RNN-T) architecture. On the input side, embeddings from pretrained acoustic-only LID classifiers are used to guide RNN-T training and inference, while on the output side, language targets are jointly modeled with ASR targets. The proposed method is applied to two language pairs: English-Spanish as spoken in the United States, and English-Hindi as spoken in India. Experiments show that for English-Spanish, the bilingual joint ASR-LID architecture matches monolingual ASR and acoustic-only LID accuracies. For the more challenging (owing to within-utterance code switching) case of English-Hindi, English ASR and LID metrics show degradation. Overall, in scenarios where users switch dynamically between languages, the proposed architecture offers a promising simplification over running multiple monolingual ASR models and an LID classifier in parallel.

翻訳日:2022-11-12 13:34:18 公開日:2020-07-08

# 金属アーティファクト低減のための低次元多様体制約ディスタングルメントネットワーク

Low-dimensional Manifold Constrained Disentanglement Network for Metal Artifact Reduction ( http://arxiv.org/abs/2007.03882v1 )

ライセンス: Link先を確認

Chuang Niu, Wenxiang Cong, Fenglei Fan, Hongming Shan, Mengzhou Li, Jimin Liang, Ge Wang

(参考訳) 深層ニューラルネットワークに基づく手法はctメタルアーティファクトリダクション(mar)に有望な結果をもたらしており、そのほとんどはトレーニングに多くの合成ペアイメージを使用している。 CT画像中の金属人工物は, 臨床像を正確に反映しない可能性があるため, 欠損した臨床像を直接使用し, 臨床データセットに有望な結果をもたらすアーティファクト・ディアンタングメント・ネットワーク(ADN)が提案された。しかし, 十分な監督がなければ, 対向的損失のみに基づいて, アーチファクト影響CT画像の構造的詳細を復元することは困難である。これらの問題を克服するために,パッチ多様体が一般に低次元であることのイメージ特性を活かした低次元多様体(LDM)制約分散ネットワーク(DN)を提案する。具体的には,LDM-DN学習アルゴリズムを設計し,低次元のパッチ多様体上の画像に制約を加えながら,相乗的ネットワーク損失関数を最適化する。さらに、ペアデータとペアデータの両方から学習し、臨床データセットのmar性能をさらに向上させるために、効率的なハイブリッド最適化スキームを提案する。大規模な実験により、LDM-DNアプローチはペアおよび/またはペアなしの学習環境におけるMAR性能を一貫して改善し、合成および臨床データセット上で競合する手法より優れていることが示された。

Deep neural network based methods have achieved promising results for CT metal artifact reduction (MAR), most of which use many synthesized paired images for training. As synthesized metal artifacts in CT images may not accurately reflect the clinical counterparts, an artifact disentanglement network (ADN) was proposed with unpaired clinical images directly, producing promising results on clinical datasets. However, without sufficient supervision, it is difficult for ADN to recover structural details of artifact-affected CT images based on adversarial losses only. To overcome these problems, here we propose a low-dimensional manifold (LDM) constrained disentanglement network (DN), leveraging the image characteristics that the patch manifold is generally low-dimensional. Specifically, we design an LDM-DN learning algorithm to empower the disentanglement network through optimizing the synergistic network loss functions while constraining the recovered images to be on a low-dimensional patch manifold. Moreover, learning from both paired and unpaired data, an efficient hybrid optimization scheme is proposed to further improve the MAR performance on clinical datasets. Extensive experiments demonstrate that the proposed LDM-DN approach can consistently improve the MAR performance in paired and/or unpaired learning settings, outperforming competing methods on synthesized and clinical datasets.

翻訳日:2022-11-12 13:33:54 公開日:2020-07-08

# AUSN: ニューラルネットワークの非一様分布を適応的に重畳した近似量子化

AUSN: Approximately Uniform Quantization by Adaptively Superimposing Non-uniform Distribution for Deep Neural Networks ( http://arxiv.org/abs/2007.03903v1 )

ライセンス: Link先を確認

Liu Fangxin, Zhao Wenbo, Wang Yanzhi, Dai Changzhi, Jiang Li

(参考訳) エッジアプリケーションのDNN推論を単純化するためには量子化が不可欠である。しかし、既存の均一な量子化法と非一様量子化法は、表現範囲と解像度との固有の矛盾を示し、その結果、未使用ビット幅または重要な精度低下をもたらす。さらに、これらの手法には3つの欠点がある。一量子化誤差の原因を詳細に分析するための量的指標がないこと。二画像分類タスクのCNNに基づく限定的な焦点三ビット幅を下げることにより、実際のハードウェア及びエネルギー消費の無意識を低下させる。本稿では,まず,クリッピング誤差と丸め誤差の2つの定量的指標を定義し,量子化誤差分布を解析した。境界と丸みを帯びたエラーは、層、モデル、タスクによって大きく異なる。そこで本研究では,重みと活性化を定量化する新しい量子化法を提案する。鍵となる考え方は、複数の非一様量子化値、すなわち AUSN を適応的に重ね合わせることでユニフォーム量子化を近似することである。 AUSNは、ビット幅を極端まで効率的に活用するデコーダフリーコーディングスキームと、ハードウェア設計の余分な努力なしに異なるDNN層、モデル、タスクに符号化スキームを適応できる重ね合わせ量子化アルゴリズムと、よく知られたビット幅オーバーフローと再量子化の問題を排除するラウンドリングスキームから構成されている。様々なタスクのDNNモデルの理論的解析と精度評価は、AUSNの有効性と一般化を示している。 FPGAの合成〜(Appendix B参照)の結果は、エネルギー消費の削減に2\times$、ハードウェアリソースの削減に2\times$4\times$である。

Quantization is essential to simplify DNN inference in edge applications. Existing uniform and non-uniform quantization methods, however, exhibit an inherent conflict between the representing range and representing resolution, and thereby result in either underutilized bit-width or significant accuracy drop. Moreover, these methods encounter three drawbacks: i) the absence of a quantitative metric for in-depth analysis of the source of the quantization errors; ii) the limited focus on the image classification tasks based on CNNs; iii) the unawareness of the real hardware and energy consumption reduced by lowering the bit-width. In this paper, we first define two quantitative metrics, i.e., the Clipping Error and rounding error, to analyze the quantization error distribution. We observe that the boundary- and rounding- errors vary significantly across layers, models and tasks. Consequently, we propose a novel quantization method to quantize the weight and activation. The key idea is to Approximate the Uniform quantization by Adaptively Superposing multiple Non-uniform quantized values, namely AUSN. AUSN is consist of a decoder-free coding scheme that efficiently exploits the bit-width to its extreme, a superposition quantization algorithm that can adapt the coding scheme to different DNN layers, models and tasks without extra hardware design effort, and a rounding scheme that can eliminate the well-known bit-width overflow and re-quantization issues. Theoretical analysis~(see Appendix A) and accuracy evaluation on various DNN models of different tasks show the effectiveness and generalization of AUSN. The synthesis~(see Appendix B) results on FPGA show $2\times$ reduction of the energy consumption, and $2\times$ to $4\times$ reduction of the hardware resource.

翻訳日:2022-11-12 13:33:26 公開日:2020-07-08

# 簡単な学習モデルによる勝利:オランダ・グロニンゲンの地震検出

Winning with Simple Learning Models: Detecting Earthquakes in Groningen, the Netherlands ( http://arxiv.org/abs/2007.03924v1 )

ライセンス: Link先を確認

Umair bin Waheed, Ahmed Shaheen, Mike Fehler, Ben Fulcher

(参考訳) ディープラーニングは、科学全体の長年の研究課題に対処するための破壊的なツールとして急速に発展しつつある。その成功にもかかわらず、近年のディープラーニングの過剰使用傾向は、多くの機械学習実践者に関係している。近年、地震学者は低等級地震の検出における深層学習アルゴリズムの有効性を実証している。本稿では,地震イベント検出の問題を再考するが,特徴抽出を伴うロジスティック回帰モデルを用いる。我々は,学際的時系列解析手法から収集した時系列操作の膨大なデータベースから,特徴を適切に識別する。トレーニング可能なパラメータを5つしか持たない単純な学習モデルを用いて,グロニンゲンガス田からの低マグニチュード誘発地震を複数検出する。よりシンプルなモデルの利点として、選択された機能は、データセットに存在するノイズやイベントクラスを理解するのに役立ちます。シンプルなモデルは、メンテナンス、デバッグ、理解、トレーニングが容易であるため、よりシンプルな選択肢を慎重に検討することなく、ディープラーニングを使用するのは危険である、という結論に達しています。

Deep learning is fast emerging as a potential disruptive tool to tackle longstanding research problems across the sciences. Notwithstanding its success across disciplines, the recent trend of the overuse of deep learning is concerning to many machine learning practitioners. Recently, seismologists have also demonstrated the efficacy of deep learning algorithms in detecting low magnitude earthquakes. Here, we revisit the problem of seismic event detection but using a logistic regression model with feature extraction. We select well-discriminating features from a huge database of time-series operations collected from interdisciplinary time-series analysis methods. Using a simple learning model with only five trainable parameters, we detect several low-magnitude induced earthquakes from the Groningen gas field that are not present in the catalog. We note that the added advantage of simpler models is that the selected features add to our understanding of the noise and event classes present in the dataset. Since simpler models are easy to maintain, debug, understand, and train, through this study we underscore that it might be a dangerous pursuit to use deep learning without carefully weighing simpler alternatives.

翻訳日:2022-11-12 13:32:55 公開日:2020-07-08

# 画像分割のためのデュアルcnnの設計と学習

Designing and Training of A Dual CNN for Image Denoising ( http://arxiv.org/abs/2007.03951v1 )

ライセンス: Link先を確認

Chunwei Tian, Yong Xu, Wangmeng Zuo, Bo Du, Chia-Wen Lin and David Zhang

(参考訳) 画像復調のための深層畳み込みニューラルネットワーク(CNN)は近年研究の関心を集めている。しかし、平易なネットワークでは、実際のノイズ画像のような複雑なタスクの詳細な詳細を復元できない。本稿ではDudeNet(Dual Denoising Network)を提案し,クリーンな画像の復元を行った。具体的には,機能抽出ブロック,拡張ブロック,圧縮ブロック,再構築ブロックの4つのモジュールで構成される。スパースマカニズムを持つ特徴抽出ブロックは、2つのサブネットワークを介してグローバルおよびローカルな特徴を抽出する。拡張ブロックはグローバルとローカルの機能を収集して融合し、後者のネットワークに補完的な情報を提供する。圧縮ブロックは抽出した情報を洗練し、ネットワークを圧縮する。最後に、復元ブロックを利用して、音像を再構成する。 1) パース機構を持つデュアルネットワークは、デノイザの一般化能力を高めるために補完的な特徴を抽出することができる。 2)大域的特徴と局所的特徴を融合させることで,複雑な雑音画像の細部を復元することができる。 (3)デノイザの複雑さを低減するために小型フィルタを用いる。大規模な実験は、DudeNetが既存の最先端のデノナイジング手法よりも優れていることを示す。

Deep convolutional neural networks (CNNs) for image denoising have recently attracted increasing research interest. However, plain networks cannot recover fine details for a complex task, such as real noisy images. In this paper, we propsoed a Dual denoising Network (DudeNet) to recover a clean image. Specifically, DudeNet consists of four modules: a feature extraction block, an enhancement block, a compression block, and a reconstruction block. The feature extraction block with a sparse machanism extracts global and local features via two sub-networks. The enhancement block gathers and fuses the global and local features to provide complementary information for the latter network. The compression block refines the extracted information and compresses the network. Finally, the reconstruction block is utilized to reconstruct a denoised image. The DudeNet has the following advantages: (1) The dual networks with a parse mechanism can extract complementary features to enhance the generalized ability of denoiser. (2) Fusing global and local features can extract salient features to recover fine details for complex noisy images. (3) A Small-size filter is used to reduce the complexity of denoiser. Extensive experiments demonstrate the superiority of DudeNet over existing current state-of-the-art denoising methods.

翻訳日:2022-11-12 13:32:40 公開日:2020-07-08

# 医用画像評価における予測不確かさの定量化と活用

Quantifying and Leveraging Predictive Uncertainty for Medical Image Assessment ( http://arxiv.org/abs/2007.04258v1 )

ライセンス: Link先を確認

Florin C. Ghesu, Bogdan Georgescu, Awais Mansoor, Youngjin Yoo, Eli Gibson, R.S. Vishwanath, Abishek Balachandran, James M. Balter, Yue Cao, Ramandeep Singh, Subba R. Digumarthy, Mannudeep K. Kalra, Sasa Grbic, Dorin Comaniciu

(参考訳) 医療画像の解釈は難しい課題であり、しばしばアーティファクト、オクルージョン、限られたコントラストなどの存在によって複雑になる。最も注目すべきは胸部x線撮影の症例で、異常の検出と分類において高いレート間変動がある。これは主に、病気の出現に関するデータや主観的な定義の不確定な証拠によるものである。もう一つの例は、2次元超音波画像に基づく解剖学的ビューの分類である。しばしば、フレームでキャプチャされた解剖学的文脈は、基礎となる解剖学を認識するには不十分である。これらの問題の現在の機械学習ソリューションは、通常、限られた情報と高いラベルノイズに対応する基盤となるモデルの能力に依存する確率的予測を提供することに制限されている。しかし実際には、これは不明瞭なデータに対する一般化が不十分な過信システムにつながる。そこで本研究では,分類の確率的推定だけでなく,予測結果におけるシステムの信頼度を捉える明示的不確実性尺度を学習するシステムを提案する。本手法は, 放射線検査, 超音波, 磁気共鳴画像などの異なる放射線検査から得られた医用画像のあいまいさを考慮に入れる上で重要である。本実験では, 予測不確実性に基づく試料の拒絶は, 胸部X線写真における異常の分類において, 25%未満の拒絶率で, ROC-AUCを8%から0.91に向上させることができることを示した。さらに,不確実性に基づくブートストラップをトレーニングデータのフィルタに適用することで,ロバスト性や精度が大幅に向上することを示す。

The interpretation of medical images is a challenging task, often complicated by the presence of artifacts, occlusions, limited contrast and more. Most notable is the case of chest radiography, where there is a high inter-rater variability in the detection and classification of abnormalities. This is largely due to inconclusive evidence in the data or subjective definitions of disease appearance. An additional example is the classification of anatomical views based on 2D Ultrasound images. Often, the anatomical context captured in a frame is not sufficient to recognize the underlying anatomy. Current machine learning solutions for these problems are typically limited to providing probabilistic predictions, relying on the capacity of underlying models to adapt to limited information and the high degree of label noise. In practice, however, this leads to overconfident systems with poor generalization on unseen data. To account for this, we propose a system that learns not only the probabilistic estimate for classification, but also an explicit uncertainty measure which captures the confidence of the system in the predicted output. We argue that this approach is essential to account for the inherent ambiguity characteristic of medical images from different radiologic exams including computed radiography, ultrasonography and magnetic resonance imaging. In our experiments we demonstrate that sample rejection based on the predicted uncertainty can significantly improve the ROC-AUC for various tasks, e.g., by 8% to 0.91 with an expected rejection rate of under 25% for the classification of different abnormalities in chest radiographs. In addition, we show that using uncertainty-driven bootstrapping to filter the training data, one can achieve a significant increase in robustness and accuracy.

翻訳日:2022-11-12 13:32:02 公開日:2020-07-08

# シミュレーションfermi/lat望遠鏡画像を用いたニューラルネットワーク点抽出に関する研究

A study of Neural networks point source extraction on simulated Fermi/LAT Telescope images ( http://arxiv.org/abs/2007.04295v1 )

ライセンス: Link先を確認

Mariia Drozdova, Anton Broilovskiy, Andrey Ustyuzhanin, Denys Malyshev

(参考訳) GeV帯の天体物理画像は、背景と前景の天体物理拡散放射の強い寄与と、現代の宇宙観測装置の比較的広い範囲の拡散関数により、分析が困難である。あるケースでは、画像上の点源を見つけることさえ、非自明な作業になる。本稿では,フェルミ大都市圏望遠鏡の画像を模倣した人工データセット上で学習した畳み込みニューラルネットワーク(cnn)を用いた点源抽出法を提案する。これらの画像は1から10GeVのエネルギーをカバーする10×10度の原数光子マップである。我々は、精度が15%向上し、少なくとも4つの精度改善の要因で推論時間を削減できる様々なcnnアーキテクチャを比較した。

Astrophysical images in the GeV band are challenging to analyze due to the strong contribution of the background and foreground astrophysical diffuse emission and relatively broad point spread function of modern space-based instruments. In certain cases, even finding of point sources on the image becomes a non-trivial task. We present a method for point sources extraction using a convolution neural network (CNN) trained on our own artificial data set which imitates images from the Fermi Large Area Telescope. These images are raw count photon maps of 10x10 degrees covering energies from 1 to 10 GeV. We compare different CNN architectures that demonstrate accuracy increase by ~15% and reduces the inference time by at least the factor of 4 accuracy improvement with respect to a similar state of the art models.

翻訳日:2022-11-12 13:31:35 公開日:2020-07-08

# KIT MOMA: モバイルデバイスのデータセット

KIT MOMA: A Mobile Machines Dataset ( http://arxiv.org/abs/2007.04198v1 )

ライセンス: Link先を確認

Yusheng Xiang, Hongzhe Wang, Tianqing Su, Ruoyu Li, Christine Brach, Samuel S. Mao, Marcus Geimer

(参考訳) 通常、クローズドな場所で作業するモバイルマシンは、自動運転技術を利用する可能性が高い。しかし、開発と革新の活発な発展は、主に旅客車の分野で起きている。対照的に、自動運転やモバイルマシンでの作業についても多くの研究があるが、SOTAソリューションに関するコンセンサスはまだ達成されていない。解決すべき最も緊急な問題は、公然と挑戦的なビジュアルデータセットがないことであり、異なる研究の結果と同等である、と私たちは信じています。この問題に対処するため、我々は8種類のモバイルマシンを含むKIT MOMAデータセットを公開し、モバイル構築マシンを検出するためのSOTAアルゴリズムを評価するベンチマークとして使用することができる。収集された画像のビューは、すべての興味深いマシンがクローズドな場所で作業している場合、地上の固定カメラがより適していると考えるので、モバイルマシンの外部にある。 KIT MOMAのイメージのほとんどは実際のシーンにあるが、一部の画像は建設機械メーカーの公式ウェブサイトにある。また、このデータセット上でのYOLO v3の性能を評価し、SOTAコンピュータビジョンアルゴリズムは、特定の作業場での移動体検出に優れた性能を示していることを示す。データセットとともにトレーニングされた重量もアップロードします。これは建設機械業界のエンジニアが直接使用することができます。データセット、トレーニングされたウェイト、アップデートはGithubで確認できます。さらに、デモは私たちのyoutubeで見ることができる。

Mobile machines typically working in a closed site, have a high potential to utilize autonomous driving technology. However, vigorously thriving development and innovation are happening mostly in the area of passenger cars. In contrast, although there are also many research pieces about autonomous driving or working in mobile machines, a consensus about the SOTA solution is still not achieved. We believe that the most urgent problem that should be solved is the absence of a public and challenging visual dataset, which makes the results from different researches comparable. To address the problem, we publish the KIT MOMA dataset, including eight classes of commonly used mobile machines, which can be used as a benchmark to evaluate the SOTA algorithms to detect mobile construction machines. The view of the gathered images is outside of the mobile machines since we believe fixed cameras on the ground are more suitable if all the interesting machines are working in a closed site. Most of the images in KIT MOMA are in a real scene, whereas some of the images are from the official website of top construction machine companies. Also, we have evaluated the performance of YOLO v3 on our dataset, indicating that the SOTA computer vision algorithms already show an excellent performance for detecting the mobile machines in a specific working site. Together with the dataset, we also upload the trained weights, which can be directly used by engineers from the construction machine industry. The dataset, trained weights, and updates can be found on our Github. Moreover, the demo can be found on our Youtube.

翻訳日:2022-11-12 13:24:56 公開日:2020-07-08

# 動的および反復スパンニング森林を用いたスーパーピクセルセグメンテーション

Superpixel Segmentation using Dynamic and Iterative Spanning Forest ( http://arxiv.org/abs/2007.04257v1 )

ライセンス: Link先を確認

F.C. Belem and S.J.F. Guimaraes and A.X. Falcao

(参考訳) 画像オブジェクトを構成する部分として、スーパーピクセルはいくつかの高レベルの操作を改善することができる。しかし、画像分割法は、スーパーピクセル数を減らすために精度を著しく損なう可能性がある。我々は,isf(cycleed spanning forest)フレームワークに基づくソリューションを調査した。本稿では、以下のステップをベースとしたDynamic ISF(DISF)について述べる。 (a)所望のスーパーピクセル数よりもかなり多くのピクセルを持つ画像グラフとシードセットから始まります。 b) 種子は互いに競合し, それぞれの種子は最も近縁なピクセルを征服し, 画像分割(スパンニング林)と接続されたスーパーピクセルが形成される。ステップ (c)disFは,超画素解析に基づいて関連値を種子に割り当て,最も無関係な種子を除去する。ステップ (b) (c)は所望のスーパーピクセル数に到達するまで繰り返される。 DISFは、リージョンマージアルゴリズムと比較して、イテレーション毎に関連するエッジを再構築する機会がある。他のシードベースのスーパーピクセル法と比較すると、DIFは関連する種子を見つける傾向にある。さらに,isfフレームワークにおいて,より効率的なスーパーピクセルデライン化のために動的アークウェイト推定を導入し,異なるオブジェクト特性を持つ3つのデータセット上でのすべての結果を示す。

As constituent parts of image objects, superpixels can improve several higher-level operations. However, image segmentation methods might have their accuracy seriously compromised for reduced numbers of superpixels. We have investigated a solution based on the Iterative Spanning Forest (ISF) framework. In this work, we present Dynamic ISF (DISF) -- a method based on the following steps. (a) It starts from an image graph and a seed set with considerably more pixels than the desired number of superpixels. (b) The seeds compete among themselves, and each seed conquers its most closely connected pixels, resulting in an image partition (spanning forest) with connected superpixels. In step (c), DISF assigns relevance values to seeds based on superpixel analysis and removes the most irrelevant ones. Steps (b) and (c) are repeated until the desired number of superpixels is reached. DISF has the chance to reconstruct relevant edges after each iteration, when compared to region merging algorithms. As compared to other seed-based superpixel methods, DISF is more likely to find relevant seeds. It also introduces dynamic arc-weight estimation in the ISF framework for more effective superpixel delineation, and we demonstrate all results on three datasets with distinct object properties.

翻訳日:2022-11-12 13:24:03 公開日:2020-07-08

# 廃棄物オブジェクト分割へのマルチレベルアプローチ

A Multi-Level Approach to Waste Object Segmentation ( http://arxiv.org/abs/2007.04259v1 )

ライセンス: Link先を確認

Tao Wang and Yuanzheng Cai and Lingyu Liang and Dongyi Ye

(参考訳) 本稿では,カラー画像から無駄な物体を局所化する問題と,そのような物体とロボットが相互作用する上で重要な知覚成分である奥行き画像について論じる。具体的には,複数の空間的粒度レベルでの強度と深度情報を統合する。まず、シーンレベルのディープネットワークが初期粗いセグメンテーションを生成し、そこでいくつかの潜在的なオブジェクト領域を選択してズームインして細かなセグメンテーションを行う。上記のステップの結果はさらに密結合された条件付きランダムフィールドに統合され、ピクセルレベルの精度で外観、深さ、空間親和性を尊重する。さらに, この領域における今後の研究を促進するために, 新たにRGBD 廃棄物オブジェクト分割データセット MJU-Waste を作成した。本手法の有効性は,MJU-WasteとTrash Annotation in Context (TACO)データセットの両方で検証される。

We address the problem of localizing waste objects from a color image and an optional depth image, which is a key perception component for robotic interaction with such objects. Specifically, our method integrates the intensity and depth information at multiple levels of spatial granularity. Firstly, a scene-level deep network produces an initial coarse segmentation, based on which we select a few potential object regions to zoom in and perform fine segmentation. The results of the above steps are further integrated into a densely connected conditional random field that learns to respect the appearance, depth, and spatial affinities with pixel-level accuracy. In addition, we create a new RGBD waste object segmentation dataset, MJU-Waste, that is made public to facilitate future research in this area. The efficacy of our method is validated on both MJU-Waste and the Trash Annotation in Context (TACO) dataset.

翻訳日:2022-11-12 13:23:42 公開日:2020-07-08

# UU-Net:ビジュアル監視ビデオフットプリントの顔認識

The UU-Net: Reversible Face De-Identification for Visual Surveillance Video Footage ( http://arxiv.org/abs/2007.04316v1 )

ライセンス: Link先を確認

Hugo Proen\c{c}a

(参考訳) そこで本稿では,ランドマークベース技術が使用できない低解像度映像データに対する可逆的顔識別法を提案する。我々のソリューションは、データ保護規則を満たし、最小限のプライバシー制約の下で公開可能な、現実的な非識別ストリームを生成することができる。特に、これらのストリームは、後に元のシーンを再構築するのに必要な全ての情報をカプセル化しており、犯罪捜査など、被写体の識別が最も重要なシナリオに有用である。 2つの主要コンポーネントを共同で最適化する学習プロセスについて述べる。 1) 原データを受信し、ID情報が写実的でシームレスな方法で代理される非識別ストリームを生成する公開モジュール 2) 法・セキュリティ当局のために設計された私的なモジュールで,公開ストリームを分析し,元のシーンを再構築し,現場のすべての被験者の実際のIDを開示する。提案手法はランドマークフリーであり、条件付き生成対向ネットワークを用いて、ポーズ、照明、背景情報、さらには表情を保存した合成顔を生成する。また、生データと非識別データの間で保存されるべきソフトな顔属性のセットを完全に制御できるようにし、このソリューションの応用範囲を広げる。実験は3種類の視覚監視データセット(BIODI, MARS, P-DESTRE)を用いて行った。ソースコードはhttps://github.com/hugomcp/uu-netで入手できる。

We propose a reversible face de-identification method for low resolution video data, where landmark-based techniques cannot be reliably used. Our solution is able to generate a photo realistic de-identified stream that meets the data protection regulations and can be publicly released under minimal privacy constraints. Notably, such stream encapsulates all the information required to later reconstruct the original scene, which is useful for scenarios, such as crime investigation, where the identification of the subjects is of most importance. We describe a learning process that jointly optimizes two main components: 1) a public module, that receives the raw data and generates the de-identified stream, where the ID information is surrogated in a photo-realistic and seamless way; and 2) a private module, designed for legal/security authorities, that analyses the public stream and reconstructs the original scene, disclosing the actual IDs of all the subjects in the scene. The proposed solution is landmarks-free and uses a conditional generative adversarial network to generate synthetic faces that preserve pose, lighting, background information and even facial expressions. Also, we enable full control over the set of soft facial attributes that should be preserved between the raw and de-identified data, which broads the range of applications for this solution. Our experiments were conducted in three different visual surveillance datasets (BIODI, MARS and P-DESTRE) and showed highly encouraging results. The source code is available at https://github.com/hugomcp/uu-net.

翻訳日:2022-11-12 13:23:13 公開日:2020-07-08

# Auto-MAP: DNNワークロードの分散実行計画を探索するDQNフレームワーク

Auto-MAP: A DQN Framework for Exploring Distributed Execution Plans for DNN Workloads ( http://arxiv.org/abs/2007.04069v1 )

ライセンス: Link先を確認

Siyu Wang, Yi Rong, Shiqing Fan, Zhen Zheng, LanSong Diao, Guoping Long, Jun Yang, Xiaoyong Liu, Wei Lin

(参考訳) 過去10年間、ディープニューラルネットワークをトレーニングするための計算要件が増加してきた。現在のアプローチ(データ/モデル並列性、パイプライン並列性など)は、トレーニングタスクを複数のデバイスに並列化する。しかしながら、これらのアプローチは常に特定のディープラーニングフレームワークに依存しており、詳細な手作業による設計を必要とするため、異なるタイプのモデルのメンテナンスと共有が難しい。本稿では,DNNワークロードの分散実行計画を探索するフレームワークであるAuto-MAPを提案する。効率的な探索は、強化学習の大きな課題である。 DQNとタスク固有のプルーニング戦略を利用して、最適化された戦略を含む検索空間を効率的に探索する。評価の結果,Auto-MAPは複数のNLPおよび畳み込みモデルにおいて,より優れたスループットを実現しつつ,最適解を2時間以内に見つけることができることがわかった。

The last decade has witnessed growth in the computational requirements for training deep neural networks. Current approaches (e.g., data/model parallelism, pipeline parallelism) parallelize training tasks onto multiple devices. However, these approaches always rely on specific deep learning frameworks and requires elaborate manual design, which make it difficult to maintain and share between different type of models. In this paper, we propose Auto-MAP, a framework for exploring distributed execution plans for DNN workloads, which can automatically discovering fast parallelization strategies through reinforcement learning on IR level of deep learning models. Efficient exploration remains a major challenge for reinforcement learning. We leverage DQN with task-specific pruning strategies to help efficiently explore the search space including optimized strategies. Our evaluation shows that Auto-MAP can find the optimal solution in two hours, while achieving better throughput on several NLP and convolution models.

翻訳日:2022-11-12 13:22:32 公開日:2020-07-08

# 部分教師付きマルチオルガンセグメンテーションにおける限界損失と排除損失

Marginal loss and exclusion loss for partially supervised multi-organ segmentation ( http://arxiv.org/abs/2007.03868v1 )

ライセンス: Link先を確認

Gonglei Shi, Li Xiao, Yang Chen, S. Kevin Zhou

(参考訳) 医用画像に複数の臓器をアノテートすることは費用も時間もかかるため、ラベル付きの既存の複数臓器データセットはしばしばサンプルサイズが低く、主に部分的にラベル付けされている。本稿では,そのようなデータセットの結合から単一マルチ組織セグメンテーションネットワークを学習する方法を検討する。この目的のために,特にこのシナリオ用に設計された2種類の新しい損失関数を提案する。 (一)限界損失、及び (ii)排他的損失。部分ラベル付き画像の背景ラベルは、実際には、すべてのラベル付き臓器の「マージ」ラベルと(フルラベルの意味で)「true」背景であるので、この「マージ」背景ラベルの確率は限界確率であり、マージ前の関連する確率を合計する。この限界確率は、任意の既存の損失関数(例えば、クロスエントロピー損失、ディース損失など)に差し込み、限界損失を形成することができる。臓器が重複しないという事実を生かして,ラベル付き臓器間の相違性と非ラベル付き臓器の推定セグメンテーションを評価するために,除外損失を提案する。肝,脾臓,左右腎,膵の多臓器分節化における5つのベンチマークデータセットの結合実験により,新たに提案した損失関数を用いることで,余分な計算を導入することなく,最先端の手法に顕著な性能向上がもたらされることを示した。

Annotating multiple organs in medical images is both costly and time-consuming; therefore, existing multi-organ datasets with labels are often low in sample size and mostly partially labeled, that is, a dataset has a few organs labeled but not all organs. In this paper, we investigate how to learn a single multi-organ segmentation network from a union of such datasets. To this end, we propose two types of novel loss function, particularly designed for this scenario: (i) marginal loss and (ii) exclusion loss. Because the background label for a partially labeled image is, in fact, a `merged' label of all unlabelled organs and `true' background (in the sense of full labels), the probability of this `merged' background label is a marginal probability, summing the relevant probabilities before merging. This marginal probability can be plugged into any existing loss function (such as cross entropy loss, Dice loss, etc.) to form a marginal loss. Leveraging the fact that the organs are non-overlapping, we propose the exclusion loss to gauge the dissimilarity between labeled organs and the estimated segmentation of unlabelled organs. Experiments on a union of five benchmark datasets in multi-organ segmentation of liver, spleen, left and right kidneys, and pancreas demonstrate that using our newly proposed loss functions brings a conspicuous performance improvement for state-of-the-art methods without introducing any extra computation.

翻訳日:2022-11-12 13:16:11 公開日:2020-07-08

# fetoscopic mosaicking に対する深部胎盤血管セグメンテーション

Deep Placental Vessel Segmentation for Fetoscopic Mosaicking ( http://arxiv.org/abs/2007.04349v1 )

ライセンス: Link先を確認

Sophia Bano, Francisco Vasconcelos, Luke M. Shepherd, Emmanuel Vander Poorten, Tom Vercauteren, Sebastien Ourselin, Anna L. David, Jan Deprest and Danail Stoyanov

(参考訳) ツイン・ツー・ツイン・トランスフュージョン症候群(TTTS)の治療中、臨床医は最初に胎盤血管の異常を同定し、両胎児の血流を調節するためにレーザーを照射する。手術は, 環境の移動性, 羊水中の視認性不良, 時折出血し, フェトスコープ視野の制限, 画像品質の制限などにより困難である。理想的には、解剖学的胎盤血管は自動的に同定され、分節化され、レーザーアブレーションのガイドとして拡張された血管地図を作成する。フェトスコープ映像における胎盤血管のセグメンテーションを行うために, u-netアーキテクチャを利用したソリューションを提案する。得られた容器確率マップは、直接強度に基づく手法を用いて連続した容器マップを登録することにより、モザイクアライメントのための十分な手がかりを提供する。 6種類の異なるin vivo fetoscopic video実験により、血管強度に基づく登録は画像強度に基づく登録法より優れ、質的および定量的比較においてより堅牢性を示すことが示された。さらに,400フレームまでのシーケンスにおいてもドリフトの蓄積を無視できるように削減し,地盤の欠落時にドリフト誤差を定量化するためのスキームを組み込んだ。本稿では,第1報 in vivo vessel segmentation と fetoscopic videos dataset をコントリビュートすることにより,胎盤胎盤血管のセグメンテーションと登録のベンチマークを提供する。

During fetoscopic laser photocoagulation, a treatment for twin-to-twin transfusion syndrome (TTTS), the clinician first identifies abnormal placental vascular connections and laser ablates them to regulate blood flow in both fetuses. The procedure is challenging due to the mobility of the environment, poor visibility in amniotic fluid, occasional bleeding, and limitations in the fetoscopic field-of-view and image quality. Ideally, anastomotic placental vessels would be automatically identified, segmented and registered to create expanded vessel maps to guide laser ablation, however, such methods have yet to be clinically adopted. We propose a solution utilising the U-Net architecture for performing placental vessel segmentation in fetoscopic videos. The obtained vessel probability maps provide sufficient cues for mosaicking alignment by registering consecutive vessel maps using the direct intensity-based technique. Experiments on 6 different in vivo fetoscopic videos demonstrate that the vessel intensity-based registration outperformed image intensity-based registration approaches showing better robustness in qualitative and quantitative comparison. We additionally reduce drift accumulation to negligible even for sequences with up to 400 frames and we incorporate a scheme for quantifying drift error in the absence of the ground-truth. Our paper provides a benchmark for fetoscopy placental vessel segmentation and registration by contributing the first in vivo vessel segmentation and fetoscopic videos dataset.

翻訳日:2022-11-12 13:08:01 公開日:2020-07-08

# 背景知識に基づく多次元語句認識アルゴリズムに関する研究

Research on multi-dimensional end-to-end phrase recognition algorithm based on background knowledge ( http://arxiv.org/abs/2007.03860v1 )

ライセンス: Link先を確認

Zheng Li, Gang Tu, Guang Liu, Zhi-Qiang Zhan, Yi-Jian Liu

(参考訳) 現在、教師付き学習に基づくエンド・ツー・エンドの深層手法は、エンティティ認識と依存性分析に使われている。この手法には2つの問題がある: 第一に、背景知識は導入できない;第二に、自然言語の多粒度とネスト特徴は認識できない。これらの問題を解決するために、フレーズウィンドウに基づくアノテーションルールを提案し、それに対応する多次元の語句認識アルゴリズムを設計する。このアノテーション規則は、文を7種類のネスト句に分割し、句間の依存関係を示す。このアルゴリズムは、背景知識を導入するだけでなく、文中のあらゆる種類のネスト句を認識するだけでなく、句間の依存関係を認識する。実験の結果, アノテーションルールは使い易く, あいまいさがないことがわかった。マッチングアルゴリズムは, 従来のエンドツーエンドアルゴリズムよりも, 文法の多粒度や多様性特性に一貫性がある。 CPWDデータセットの実験では、背景知識を導入することにより、エンドツーエンドの手法の精度を1ポイント以上向上する。この手法はCCL 2018の競技に応用され、中国のユーモア型認識において第一位を獲得した。

At present, the deep end-to-end method based on supervised learning is used in entity recognition and dependency analysis. There are two problems in this method: firstly, background knowledge cannot be introduced; secondly, multi granularity and nested features of natural language cannot be recognized. In order to solve these problems, the annotation rules based on phrase window are proposed, and the corresponding multi-dimensional end-to-end phrase recognition algorithm is designed. This annotation rule divides sentences into seven types of nested phrases, and indicates the dependency between phrases. The algorithm can not only introduce background knowledge, recognize all kinds of nested phrases in sentences, but also recognize the dependency between phrases. The experimental results show that the annotation rule is easy to use and has no ambiguity; the matching algorithm is more consistent with the multi granularity and diversity characteristics of syntax than the traditional end-to-end algorithm. The experiment on CPWD dataset, by introducing background knowledge, the new algorithm improves the accuracy of the end-to-end method by more than one point. The corresponding method was applied to the CCL 2018 competition and won the first place in the task of Chinese humor type recognition.

翻訳日:2022-11-12 13:06:47 公開日:2020-07-08

# カービン内会話エージェントにおける客室乗務員の聴覚的理解

Audio-Visual Understanding of Passenger Intents for In-Cabin Conversational Agents ( http://arxiv.org/abs/2007.03876v1 )

ライセンス: Link先を確認

Eda Okur, Shachi H Kumar, Saurav Sahay, Lama Nachman

(参考訳) 車両内状況における多モード対話理解機能の構築は、自律走行車(AV)インタラクションシステムにおける乗客の快適性を高めるために重要である。この目的のために、音声対話と車両ビジョンシステムから乗客の意図を理解することは、avのための文脈的および視覚的な会話エージェントを開発する上で重要な要素である。本研究の目的は、車内エージェントであるAMIE(Automated-vehicle Multimodal In-cabin Experience)を探索することである。本研究では,車内および車外からの言語/言語入力と非言語/音響的・視覚的手がかりを組み込むことにより,車内発話のマルチモーダル理解のメリットについて論じる。実験結果は,マルチモーダルアプローチによる意図検出の性能向上により,テキストのみベースラインを上回った。

Building multimodal dialogue understanding capabilities situated in the in-cabin context is crucial to enhance passenger comfort in autonomous vehicle (AV) interaction systems. To this end, understanding passenger intents from spoken interactions and vehicle vision systems is a crucial component for developing contextual and visually grounded conversational agents for AV. Towards this goal, we explore AMIE (Automated-vehicle Multimodal In-cabin Experience), the in-cabin agent responsible for handling multimodal passenger-vehicle interactions. In this work, we discuss the benefits of a multimodal understanding of in-cabin utterances by incorporating verbal/language input together with the non-verbal/acoustic and visual clues from inside and outside the vehicle. Our experimental results outperformed text-only baselines as we achieved improved performances for intent detection with a multimodal approach.

翻訳日:2022-11-12 13:06:07 公開日:2020-07-08

# n-項関係知識ベースに対するテンソル分解の一般化

Generalizing Tensor Decomposition for N-ary Relational Knowledge Bases ( http://arxiv.org/abs/2007.03988v1 )

ライセンス: Link先を確認

Yu Liu and Quanming Yao and Yong Li

(参考訳) 知識ベース(kbs)の急速な発展に伴い、リンク予測タスク(リンク予測タスク)は、特に、強力なテンソル分解関連手法を持つバイナリリレーショナルkbs(つまり知識グラフ)において広く研究されてきた。しかし、高次関係事実を持つユビキタスなn-aryリレーショナルkbは、既存の翻訳ベースおよびニューラルネットワークベースのアプローチが様々な関係のモデリングにおいて弱い表現力と高い複雑さを持つため、あまり注目されていない。テンソル分解は n-項リレーショナルKB に対しては考慮されていないが、双対リレーショナルKB のテンソル分解関連法を直接 n-項ケースに拡張しても指数モデル複雑性と二項リレーショナルKBの強い仮定により満足な結果が得られない。本研究では,n-aryリレーショナルKBのテンソル分解を一般化するために,タッカー分解とテンソルリング分解に基づく一般化モデルであるGETDを提案する。既存の負サンプリング手法は、GETDのn-aryケースにも一般化される。さらに, GETD が KB を完全表現可能であることを理論的に証明する。 2つの代表的なn-aryリレーショナルkbデータセットの広範な評価は、getdの優れたパフォーマンスを示し、最先端のメソッドを15-%以上改善した。さらにGETDは、ベンチマークバイナリリレーショナルKBデータセットの最先端結果も取得する。

With the rapid development of knowledge bases (KBs), link prediction task, which completes KBs with missing facts, has been broadly studied in especially binary relational KBs (a.k.a knowledge graph) with powerful tensor decomposition related methods. However, the ubiquitous n-ary relational KBs with higher-arity relational facts are paid less attention, in which existing translation based and neural network based approaches have weak expressiveness and high complexity in modeling various relations. Tensor decomposition has not been considered for n-ary relational KBs, while directly extending tensor decomposition related methods of binary relational KBs to the n-ary case does not yield satisfactory results due to exponential model complexity and their strong assumptions on binary relations. To generalize tensor decomposition for n-ary relational KBs, in this work, we propose GETD, a generalized model based on Tucker decomposition and Tensor Ring decomposition. The existing negative sampling technique is also generalized to the n-ary case for GETD. In addition, we theoretically prove that GETD is fully expressive to completely represent any KBs. Extensive evaluations on two representative n-ary relational KB datasets demonstrate the superior performance of GETD, significantly improving the state-of-the-art methods by over 15\%. Moreover, GETD further obtains the state-of-the-art results on the benchmark binary relational KB datasets.

翻訳日:2022-11-12 13:05:52 公開日:2020-07-08

# Citation Recommendationのためのニューラルテキスト表現の学習

Learning Neural Textual Representations for Citation Recommendation ( http://arxiv.org/abs/2007.04070v1 )

ライセンス: Link先を確認

Binh Thanh Kieu, Inigo Jauregi Unanue, Son Bao Pham, Hieu Xuan Phan, Massimo Piccardi

(参考訳) 科学文献の急速な発展に伴い、論文の適切な引用を手作業で選択することはますます困難で時間がかかりつつある。近年,自動引用推薦の手法がいくつか提案されているが,引用推薦のための効果的な文書表現はいまだにかなり解明されていない。そこで本稿では,シームズと三重項ネットワークを組み込んだ文書(センテンス-BERT)をサブモジュラースコアリング関数で逐次表現する,引用レコメンデーションの新しい手法を提案する。私たちの知る限りでは、これは引用推薦のタスクのために深い表現とサブモジュラーの選択を組み合わせる最初のアプローチです。一般的なベンチマークデータセットであるACL Anthology Network corpusを使用して実験が行われ、ベースラインとMRRやF1-at-kスコアといったメトリクスを使用した最先端アプローチに対して評価されている。その結果, 提案手法は, 測定値毎に比較した全ての手法より優れていることがわかった。

With the rapid growth of the scientific literature, manually selecting appropriate citations for a paper is becoming increasingly challenging and time-consuming. While several approaches for automated citation recommendation have been proposed in the recent years, effective document representations for citation recommendation are still elusive to a large extent. For this reason, in this paper we propose a novel approach to citation recommendation which leverages a deep sequential representation of the documents (Sentence-BERT) cascaded with Siamese and triplet networks in a submodular scoring function. To the best of our knowledge, this is the first approach to combine deep representations and submodular selection for a task of citation recommendation. Experiments have been carried out using a popular benchmark dataset - the ACL Anthology Network corpus - and evaluated against baselines and a state-of-the-art approach using metrics such as the MRR and F1-at-k score. The results show that the proposed approach has been able to outperform all the compared approaches in every measured metric.

翻訳日:2022-11-12 13:05:26 公開日:2020-07-08

# 談話のコヒーレンス,参照グラウンド,目標指向対話

Discourse Coherence, Reference Grounding and Goal Oriented Dialogue ( http://arxiv.org/abs/2007.04428v1 )

ライセンス: Link先を確認

Baber Khalid, Malihe Alikhani, Michael Fellner, Brian McMahan, Matthew Stone

(参考訳) 混合開始型ヒューマンコンピュータ参照コミュニケーションを実現するための従来のアプローチは、情報状態または協調的な問題解決アプローチを採用してきた。本稿では,sdrt \cite{asher-lascarides:2003a} のようなコヒーレンスに基づく談話モデルに着想を得た新たなアプローチを議論する。提案手法の実装に向けた第一歩として、談話間の制約を蓄積し、学習確率モデルを用いてそれらを解釈する参照通信領域における単純な対話システムについて述べる。

Prior approaches to realizing mixed-initiative human--computer referential communication have adopted information-state or collaborative problem-solving approaches. In this paper, we argue for a new approach, inspired by coherence-based models of discourse such as SDRT \cite{asher-lascarides:2003a}, in which utterances attach to an evolving discourse structure and the associated knowledge graph of speaker commitments serves as an interface to real-world reasoning and conversational strategy. As first steps towards implementing the approach, we describe a simple dialogue system in a referential communication domain that accumulates constraints across discourse, interprets them using a learned probabilistic model, and plans clarification using reinforcement learning.

翻訳日:2022-11-12 13:05:08 公開日:2020-07-08

# Dungのセマンティクスは攻撃除去単調性を満たす

Dung's semantics satisfy attack removal monotonicity ( http://arxiv.org/abs/2007.04221v1 )

ライセンス: Link先を確認

Leila Amgoud, Srdjan Vesic

(参考訳) 攻撃除去の単調性は, 好ましく, 安定し, 完全で, 接地されたセマンティクスが満足していることが示される。これは、b から a への攻撃が取り除かれた場合、a の状態は悪化しないことを意味する。

We show that preferred, stable, complete, and grounded semantics satisfy attack removal monotonicity. This means that if an attack from b to a is removed, the status of a cannot worsen, e.g. if a was skeptically accepted, it cannot become rejected.

翻訳日:2022-11-12 13:04:21 公開日:2020-07-08

# スムースゲームに対する確率的ハミルトン勾配法

Stochastic Hamiltonian Gradient Methods for Smooth Games ( http://arxiv.org/abs/2007.04202v1 )

ライセンス: Link先を確認

Nicolas Loizou, Hugo Berard, Alexia Jolicoeur-Martineau, Pascal Vincent, Simon Lacoste-Julien, Ioannis Mitliagkas

(参考訳) 機械学習における敵意の定式化の成功は、スムーズなゲームに対する新たなモチベーションをもたらした。本研究では,確率的ハミルトニアン手法のクラスに着目し,ある種の確率的滑らかなゲームに対して,最初の収束保証を提供する。確率的ハミルトン勾配勾配(SHGD)の非バイアス推定器を提案し,その利点を明らかにする。最適化文献のツールを用いて,shgd が定常点近傍に線形収束することを示す。厳密な解の収束を保証するため, SHGDをステップサイズを小さくして解析し, 初めての確率分散低減ハミルトン法を提案する。この結果から,非凸な非凸問題を含む「十分両線形」条件を満たす確率的非制約双線型ゲームや,より一般的な確率的ゲームに対して,最初のグローバルな非漸近的最終点収束保証を提供する。我々は,確率的双線形ゲームと十分な双線形ゲームにおいて,我々の理論が厳密であることを示す実験と,単純な対向機械学習の定式化による解析を補完する。

The success of adversarial formulations in machine learning has brought renewed motivation for smooth games. In this work, we focus on the class of stochastic Hamiltonian methods and provide the first convergence guarantees for certain classes of stochastic smooth games. We propose a novel unbiased estimator for the stochastic Hamiltonian gradient descent (SHGD) and highlight its benefits. Using tools from the optimization literature we show that SHGD converges linearly to the neighbourhood of a stationary point. To guarantee convergence to the exact solution, we analyze SHGD with a decreasing step-size and we also present the first stochastic variance reduced Hamiltonian method. Our results provide the first global non-asymptotic last-iterate convergence guarantees for the class of stochastic unconstrained bilinear games and for the more general class of stochastic games that satisfy a "sufficiently bilinear" condition, notably including some non-convex non-concave problems. We supplement our analysis with experiments on stochastic bilinear and sufficiently bilinear games, where our theory is shown to be tight, and on simple adversarial machine learning formulations.

翻訳日:2022-11-12 12:58:23 公開日:2020-07-08

# StructureBoost: 構造カテゴリー変数の効率的なグラディエントブースティング

StructureBoost: Efficient Gradient Boosting for Structured Categorical Variables ( http://arxiv.org/abs/2007.04446v1 )

ライセンス: Link先を確認

Brian Lucena

(参考訳) 構造カテゴリー決定木 (SCDT) に基づくグラディエント促進法は, 分類変数が既知の基盤構造を持つ問題に対して, 数値および1ホットエンコーディングより優れることを示した。しかし、SCDTの列挙手順は、低濃度または中等濃度のカテゴリー変数を除いて実現不可能である。計算障害を克服する2つの手法を提案し,実装し,複雑な構造的分類変数の勾配ブースティングを効率的に行う。結果、StructureBoostと呼ばれるパッケージは、洗練された構造を含むカテゴリ予測器の問題で、CatBoostやLightGBMのような確立したパッケージより優れていることが示されている。さらに, 基礎構造に関する知識から, structureboost が未知の分類値を正確に予測できることを実証する。

Gradient boosting methods based on Structured Categorical Decision Trees (SCDT) have been demonstrated to outperform numerical and one-hot-encodings on problems where the categorical variable has a known underlying structure. However, the enumeration procedure in the SCDT is infeasible except for categorical variables with low or moderate cardinality. We propose and implement two methods to overcome the computational obstacles and efficiently perform Gradient Boosting on complex structured categorical variables. The resulting package, called StructureBoost, is shown to outperform established packages such as CatBoost and LightGBM on problems with categorical predictors that contain sophisticated structure. Moreover, we demonstrate that StructureBoost can make accurate predictions on unseen categorical values due to its knowledge of the underlying structure.

翻訳日:2022-11-12 12:57:18 公開日:2020-07-08

# MRIF : 勧告のための多分解能核融合

MRIF: Multi-resolution Interest Fusion for Recommendation ( http://arxiv.org/abs/2007.07084v1 )

ライセンス: Link先を確認

Shihao Li (1), Dekun Yang (1), Bufeng Zhang (1) ((1) Alibaba Inc)

(参考訳) パーソナライズドレコメンデーションの主なタスクは、ユーザーの過去の行動に基づいてユーザーの興味を捉えることである。近年のレコメンデータシステムの進歩のほとんどは、ディープラーニングベースのアプローチを用いてユーザの好みを正確にモデル化することに焦点を当てている。ユーザの興味には2つの重要な特性がある。1つは、ユーザの興味は時間とともに動的で進化し、もう1つは、ユーザの関心は、長期的および短期的な嗜好のような、正確な時間的範囲が異なることである。既存のアプローチでは、異なる時間範囲を考慮せずに、ユーザの関心のドリフトに対処するためにリカレントニューラルネットワーク(RNN)を使用しているか、長期と短期の好みを別々にモデル化するために2つの異なるネットワークを設計している。本稿では,ユーザの利害関係を考慮に入れた多分解能利害融合モデル(MRIF)を提案する。提案モデルでは,ユーザの興味の動的変化を異なる時間範囲で捉えることができ,マルチ解像度のユーザ関心を組み合わせて予測を行う効果的な方法を提供する。実験の結果,提案手法は最先端のレコメンデーション手法よりも優れていた。

The main task of personalized recommendation is capturing users' interests based on their historical behaviors. Most of recent advances in recommender systems mainly focus on modeling users' preferences accurately using deep learning based approaches. There are two important properties of users' interests, one is that users' interests are dynamic and evolve over time, the other is that users' interests have different resolutions, or temporal-ranges to be precise, such as long-term and short-term preferences. Existing approaches either use Recurrent Neural Networks (RNNs) to address the drifts in users' interests without considering different temporal-ranges, or design two different networks to model long-term and short-term preferences separately. This paper presents a multi-resolution interest fusion model (MRIF) that takes both properties of users' interests into consideration. The proposed model is capable to capture the dynamic changes in users' interests at different temporal-ranges, and provides an effective way to combine a group of multi-resolution user interests to make predictions. Experiments show that our method outperforms state-of-the-art recommendation methods consistently.

翻訳日:2022-11-12 12:56:13 公開日:2020-07-08

# フェアクラスタリングは?

Whither Fair Clustering? ( http://arxiv.org/abs/2007.07838v1 )

ライセンス: Link先を確認

Deepak P

(参考訳) 分類フェアネス研究に支配されている比較的多忙なフェア機械学習の領域では、クラスタリングにおけるフェアネスが近年注目され始めている。本稿では, フェアクラスタリングにおける既存の作業を評価し, 未調査の方向性がいくつかあることを観察し, フェアクラスタリングにおける最先端技術は, 非常に画期的であることを仮定する。我々は,目標とする規範的な原則を拡大し,目標が完全に達成できない欠点を特徴付けること,下流プロセスの知識を活用すれば,公平なクラスタリング研究における研究の範囲を大きく広げることができると仮定する。クラスタリングと教師なし学習が、人間の生活に重要な決定を下し、影響を及ぼすのにますます使われているとき、公正なクラスタリングの範囲を広げることは、非常に重要であると考えています。

Within the relatively busy area of fair machine learning that has been dominated by classification fairness research, fairness in clustering has started to see some recent attention. In this position paper, we assess the existing work in fair clustering and observe that there are several directions that are yet to be explored, and postulate that the state-of-the-art in fair clustering has been quite parochial in outlook. We posit that widening the normative principles to target for, characterizing shortfalls where the target cannot be achieved fully, and making use of knowledge of downstream processes can significantly widen the scope of research in fair clustering research. At a time when clustering and unsupervised learning are being increasingly used to make and influence decisions that matter significantly to human lives, we believe that widening the ambit of fair clustering is of immense significance.

翻訳日:2022-11-12 12:55:51 公開日:2020-07-08

# 自動微分を用いたモデルベースクラスタリング:ミス種別と高次元データの比較

Model-based Clustering using Automatic Differentiation: Confronting Misspecification and High-Dimensional Data ( http://arxiv.org/abs/2007.12786v1 )

ライセンス: Link先を確認

Siva Rajesh Kasa, Vaibhav Rajan

(参考訳) ガウス混合モデルを用いたモデルベースクラスタリングの実用上重要な2つの事例について検討する:(1)不特定性がある場合、(2)高次元データに基づく場合、自動微分(AD)を用いたグラディエントD(GD)に基づく最適化の最近の進歩を踏まえて。シミュレーションにより,EMのクラスタリング性能は,不特定の場合のGDに比べて向上し,高次元データGDではEMより優れていた。 em と gd はともに高い確率でクラスタ解釈が貧弱な多くの解が存在することを観測する。この問題に対処するため、我々は、適合するコンポーネントのペア間のkullback leiblerの発散に基づく可能性の新たなペナルティ項を設計する。このペナル化確率の勾配の閉形式表現は導出が難しいが、ADを最適化する利点を説明できる。高次元データとモデル選択のためのこのペナルティの拡張について論じる。合成および実データセットに関する数値実験により,提案手法を用いたクラスタリングの有効性が示された。

We study two practically important cases of model based clustering using Gaussian Mixture Models: (1) when there is misspecification and (2) on high dimensional data, in the light of recent advances in Gradient Descent (GD) based optimization using Automatic Differentiation (AD). Our simulation studies show that EM has better clustering performance, measured by Adjusted Rand Index, compared to GD in cases of misspecification, whereas on high dimensional data GD outperforms EM. We observe that both with EM and GD there are many solutions with high likelihood but poor cluster interpretation. To address this problem we design a new penalty term for the likelihood based on the Kullback Leibler divergence between pairs of fitted components. Closed form expressions for the gradients of this penalized likelihood are difficult to derive but AD can be done effortlessly, illustrating the advantage of AD-based optimization. Extensions of this penalty for high dimensional data and for model selection are discussed. Numerical experiments on synthetic and real datasets demonstrate the efficacy of clustering using the proposed penalized likelihood approach.

翻訳日:2022-11-12 12:55:35 公開日:2020-07-08

# SiENet:画像外挿のためのシームズ拡張ネットワーク

SiENet: Siamese Expansion Network for Image Extrapolation ( http://arxiv.org/abs/2007.03851v1 )

ライセンス: Link先を確認

Xiaofeng Zhang, Feng Chen, Cailing Wang, Songsong Wu, Ming Tao and Guoping Jiang

(参考訳) 画像の塗布と異なり、画像の露光はイメージセンタ内のコンテキストが比較的小さく、画像境界でより多くのコンテンツをキャプチャして予測する。したがって、既存のメソッドの古典的なエンコーダ・デコーダパイプラインは、拡張された未知のコンテンツを正確に予測することはできない。本稿では,Siamese Expansion Network (SiENet) と呼ばれる,画像外挿のための2段階逆解析モデルを提案する。 2つの段階において、適応充填畳み込み(adaptive fill convolution)と呼ばれる新しい境界感度畳み込みは、エンコーダが未知のコンテンツを予測するように設計され、デコーダの負担を軽減する。さらに,ネットワークに事前知識を導入し,エンコーダの推論能力を強化するため,サーム逆数機構を設計し,未発見画像の特徴量に対する被覆長範囲特徴量の分布をモデル化する。 4つのデータセットの結果から,本手法は既存の最先端技術よりも優れ,現実的な結果が得られることが示された。

Different from image inpainting, image outpainting has relative less context in the image center to capture and more content at the image border to predict. Therefore, classical encoder-decoder pipeline of existing methods may not predict the outstretched unknown content perfectly. In this paper, a novel two-stage siamese adversarial model for image extrapolation, named Siamese Expansion Network (SiENet) is proposed. In two stages, a novel border sensitive convolution named adaptive filling convolution is designed for allowing encoder to predict the unknown content, alleviating the burden of decoder. Besides, to introduce prior knowledge to network and reinforce the inferring ability of encoder, siamese adversarial mechanism is designed to enable our network to model the distribution of covered long range feature for that of uncovered image feature. The results on four datasets has demonstrated that our method outperforms existing state-of-the-arts and could produce realistic results.

翻訳日:2022-11-12 12:55:16 公開日:2020-07-08

# 非負関数の非パラメトリックモデル

Non-parametric Models for Non-negative Functions ( http://arxiv.org/abs/2007.03926v1 )

ライセンス: Link先を確認

Ulysse Marteau-Ferey (PSL, DI-ENS, SIERRA), Francis Bach (PSL, DI-ENS, SIERRA), Alessandro Rudi (PSL, DI-ENS, SIERRA)

(参考訳) 線形モデルは、機械学習、信号処理、統計など、多くの分野で大きな効果と柔軟性を示している。それらは、使用する最適化問題の凸性を維持しながら、関数のリッチな空間を表現でき、評価、差別化、統合が容易である。しかし、教師なし学習、密度推定、非パラメトリックベイズ法に不可欠な非負関数のモデル化では、線形モデルは直接適用されない。さらに、一般化線形モデルのような現在の最先端モデルは、非凸最適化問題につながるか、容易に統合できない。本稿では、線形モデルの同じ良い性質の恩恵を受ける非負関数に対する最初のモデルを提供する。特に、表現定理を認め、凸問題に対する効率的な二重定式化を提供することを証明している。その表現力について研究し、結果として得られる函数の空間が一般化線型モデルのそれよりも厳密にリッチであることを示す。最後に、モデルと理論結果を凸錐の出力を持つ関数に拡張する。本論文は, 定式化, アルゴリズムによる導出, 実用的結果, 密度推定問題, ヘテロシドスティック誤差を伴う回帰問題, および多変量回帰問題における有効性を示すモデルの実験的評価によって補完された。

Linear models have shown great effectiveness and flexibility in many fields such as machine learning, signal processing and statistics. They can represent rich spaces of functions while preserving the convexity of the optimization problems where they are used, and are simple to evaluate, differentiate and integrate. However, for modeling non-negative functions, which are crucial for unsupervised learning, density estimation, or non-parametric Bayesian methods, linear models are not applicable directly. Moreover, current state-of-the-art models like generalized linear models either lead to non-convex optimization problems, or cannot be easily integrated. In this paper we provide the first model for non-negative functions which benefits from the same good properties of linear models. In particular, we prove that it admits a representer theorem and provide an efficient dual formulation for convex problems. We study its representation power, showing that the resulting space of functions is strictly richer than that of generalized linear models. Finally we extend the model and the theoretical results to functions with outputs in convex cones. The paper is complemented by an experimental evaluation of the model showing its effectiveness in terms of formulation, algorithmic derivation and practical results on the problems of density estimation, regression with heteroscedastic errors, and multiple quantile regression.

翻訳日:2022-11-12 12:49:22 公開日:2020-07-08

# PIDラグランジアン法による強化学習における応答安全

Responsive Safety in Reinforcement Learning by PID Lagrangian Methods ( http://arxiv.org/abs/2007.03964v1 )

ライセンス: Link先を確認

Adam Stooke, Joshua Achiam, and Pieter Abbeel

(参考訳) ラグランジアン法は制約付き最適化問題のアルゴリズムとして広く用いられているが、その学習力学は振動やオーバーシュートを示し、安全強化学習に適用するとエージェントトレーニング中に制約違反行動を引き起こす。本稿では,制約関数の微分を利用した新しいラグランジュ乗算器更新法を提案する。我々は、従来のラグランジュ乗算器更新が \emph{integral} 制御として振る舞う制御の観点を採り、我々の用語は \emph{proportional} と \emph{derivative} 制御を導入し、減衰と予測手段によって良好な学習ダイナミクスを達成する。我々はPIDラグランジアン法を深部RLに適用し、安全RLベンチマークであるSafety Gymにおける新しい技術状態を設定する。最後に,報奨とコストの相対的な数値スケールに対する不変性を提供することにより,コントローラのチューニングを容易にする新しい手法を提案する。我々のアルゴリズムは従来のラグランジアンアプローチと同様に、導出と実装がほとんど簡単であり、性能とハイパーパラメータの堅牢性が改善された。

Lagrangian methods are widely used algorithms for constrained optimization problems, but their learning dynamics exhibit oscillations and overshoot which, when applied to safe reinforcement learning, leads to constraint-violating behavior during agent training. We address this shortcoming by proposing a novel Lagrange multiplier update method that utilizes derivatives of the constraint function. We take a controls perspective, wherein the traditional Lagrange multiplier update behaves as \emph{integral} control; our terms introduce \emph{proportional} and \emph{derivative} control, achieving favorable learning dynamics through damping and predictive measures. We apply our PID Lagrangian methods in deep RL, setting a new state of the art in Safety Gym, a safe RL benchmark. Lastly, we introduce a new method to ease controller tuning by providing invariance to the relative numerical scales of reward and cost. Our extensive experiments demonstrate improved performance and hyperparameter robustness, while our algorithms remain nearly as simple to derive and implement as the traditional Lagrangian approach.

翻訳日:2022-11-12 12:49:00 公開日:2020-07-08

# ヒューマンアクティビティ認識のための効率的なデータインプテーション手法

An Efficient Data Imputation Technique for Human Activity Recognition ( http://arxiv.org/abs/2007.04456v1 )

ライセンス: Link先を確認

Ivan Miguel Pires, Faisal Hussain, Nuno M. Garcia, Eftim Zdravevski

(参考訳) 人間の行動認識の膨大な応用は、健康モニタリングシステムからバーチャルリアリティーアプリケーションまで幅広く利用されている。このように、多くの応用において日常生活活動の自動認識が重要になっている。近年,人間の日常生活活動の効率的なモニタリングと認識のために,機械学習モデルを訓練するためのデータセットが多数提案されている。しかし、データセットに不完全なアクティビティがある場合、すなわちデータセットキャプチャーにサンプルが欠けている場合、アクティビティ認識における機械学習モデルのパフォーマンスは重要な影響を受ける。そこで本研究では,人間の日常生活活動をよりよく認識するために,データセットの欠落サンプルを外挿する手法を提案する。提案手法は,k-nearest neighbors (knn) インプテーション手法を用いて,データキャプチャにおける欠落サンプルの抽出を行う。提案手法は,実際のデータセットと類似したアクティビティパターンをエレガントに推定した。

The tremendous applications of human activity recognition are surging its span from health monitoring systems to virtual reality applications. Thus, the automatic recognition of daily life activities has become significant for numerous applications. In recent years, many datasets have been proposed to train the machine learning models for efficient monitoring and recognition of human daily living activities. However, the performance of machine learning models in activity recognition is crucially affected when there are incomplete activities in a dataset, i.e., having missing samples in dataset captures. Therefore, in this work, we propose a methodology for extrapolating the missing samples of a dataset to better recognize the human daily living activities. The proposed method efficiently pre-processes the data captures and utilizes the k-Nearest Neighbors (KNN) imputation technique to extrapolate the missing samples in dataset captures. The proposed methodology elegantly extrapolated a similar pattern of activities as they were in the real dataset.

翻訳日:2022-11-12 12:48:37 公開日:2020-07-08

# BlockFLow: フェデレーション学習のための説明責任とプライバシ保護ソリューション

BlockFLow: An Accountable and Privacy-Preserving Solution for Federated Learning ( http://arxiv.org/abs/2007.03856v1 )

ライセンス: Link先を確認

Vaikkunth Mugunthan, Ravi Rahman and Lalana Kagal

(参考訳) 連合学習は、基礎となるデータを共有する必要なしに、協調エージェント間の機械学習モデルの開発を可能にする。しかし、ランダムなデータでトレーニングする悪意のあるエージェント、あるいは結果クラスが反転したデータセットでは、組み合わせたモデルを弱める可能性がある。 BlockFLowは、完全な分散化とプライバシ保護を備えた、説明可能な連邦学習システムである。その主な目標は、基盤となるデータセットのプライバシ保護と悪意のある敵に対する耐性を確保しながら、コントリビューションの品質に比例するエージェントに報酬を与えることである。具体的には、blockflowはディファレンシャルプライバシを取り入れ、モデルコントリビュートのための新しい監査メカニズムを導入し、ethereumスマートコントラクトを使用して優れた振る舞いをインセンティブ化する。フェデレートされた学習システムに対する既存の監査やアカウンタビリティ手法とは異なり、我々のシステムは中央集権的なテストデータセットを必要とせず、エージェント間でデータセットを共有するか、あるいは1つ以上の信頼できる監査者間でデータセットを共有する。パブリックなEthereumブロックチェーン上で実行する場合、BlockFLowは監査の結果を使用して、コントリビューションの品質に基づいた暗号通貨の報酬を行う。ロジスティック回帰モデルによって解決可能な分類タスクを提供する2つのデータセット上のblockflowを評価した。その結果, 評価スコアは, 正直なエージェントのデータセットの品質を反映していることがわかった。また、不正エージェントのスコアは、正直エージェントのスコアよりも統計的に低い。これらの結果は、合理的なブロックチェーンコストとともに、説明可能なフェデレーション学習システムとしてのBlockFLowの有効性を示している。

Federated learning enables the development of a machine learning model among collaborating agents without requiring them to share their underlying data. However, malicious agents who train on random data, or worse, on datasets with the result classes inverted, can weaken the combined model. BlockFLow is an accountable federated learning system that is fully decentralized and privacy-preserving. Its primary goal is to reward agents proportional to the quality of their contribution while protecting the privacy of the underlying datasets and being resilient to malicious adversaries. Specifically, BlockFLow incorporates differential privacy, introduces a novel auditing mechanism for model contribution, and uses Ethereum smart contracts to incentivize good behavior. Unlike existing auditing and accountability methods for federated learning systems, our system does not require a centralized test dataset, sharing of datasets between the agents, or one or more trusted auditors; it is fully decentralized and resilient up to a 50% collusion attack in a malicious trust model. When run on the public Ethereum blockchain, BlockFLow uses the results from the audit to reward parties with cryptocurrency based on the quality of their contribution. We evaluated BlockFLow on two datasets that offer classification tasks solvable via logistic regression models. Our results show that the resultant auditing scores reflect the quality of the honest agents' datasets. Moreover, the scores from dishonest agents are statistically lower than those from the honest agents. These results, along with the reasonable blockchain costs, demonstrate the effectiveness of BlockFLow as an accountable federated learning system.

翻訳日:2022-11-12 12:47:58 公開日:2020-07-08

# ニューラルSDEによるロバスト価格とヘッジ

Robust pricing and hedging via neural SDEs ( http://arxiv.org/abs/2007.04154v1 )

ライセンス: Link先を確認

Patryk Gierjatowicz and Marc Sabate-Vidales and David \v{S}i\v{s}ka and Lukasz Szpruch and \v{Z}an \v{Z}uri\v{c}

(参考訳) 数学的モデリングは金融業界に広く浸透しており、重要な意思決定プロセスを動かしている。任意のモデルが現実に粗悪な近似を与えるだけであり、不適切なモデルを使用することのリスクは検出と定量化が難しい。対照的に、現代のデータサイエンス技術は、より堅牢でデータ駆動のモデル選択メカニズムへの扉を開く。しかしながら、ほとんどの機械学習モデルは、個々のパラメータが意味のある解釈を持たないため、"ブラックボックス"である。本稿の目的は,上記の2つの世界のベストを達成するアプローチを組み合わせることである。ニューラルネットワークと古典確率微分方程式(SDE)に基づくリスクモデルを組み合わせることで、デリバティブの価格とそれに対応するヘッジ戦略の堅牢な境界を見つけ、関連する市場データを取り込む。ニューラルSDEと呼ばれる結果は生成モデルのインスタンス化であり、因果最適輸送の理論と密接に関連している。ニューラルSDEはリスクニュートラルと現実世界の両方で一貫した校正を可能にする。したがって、モデルはリスクプロファイルやヘッジ戦略を評価するのに必要な市場シナリオをシミュレートするために使用できる。我々は,ニューラルSDEの効率的な利用に必要な新しいアルゴリズムを開発し,分析する。局所的および確率的ボラティリティモデルを用いて数値実験によるアプローチを検証する。

Mathematical modelling is ubiquitous in the financial industry and drives key decision processes. Any given model provides only a crude approximation to reality and the risk of using an inadequate model is hard to detect and quantify. By contrast, modern data science techniques are opening the door to more robust and data-driven model selection mechanisms. However, most machine learning models are "black-boxes" as individual parameters do not have meaningful interpretation. The aim of this paper is to combine the above approaches achieving the best of both worlds. Combining neural networks with risk models based on classical stochastic differential equations (SDEs), we find robust bounds for prices of derivatives and the corresponding hedging strategies while incorporating relevant market data. The resulting model called neural SDE is an instantiation of generative models and is closely linked with the theory of causal optimal transport. Neural SDEs allow consistent calibration under both the risk-neutral and the real-world measures. Thus the model can be used to simulate market scenarios needed for assessing risk profiles and hedging strategies. We develop and analyse novel algorithms needed for efficient use of neural SDEs. We validate our approach with numerical experiments using both local and stochastic volatility models.

翻訳日:2022-11-12 12:46:35 公開日:2020-07-08

# サンプリングによるDPPからの学習:HKPVと対称性を超えて

Learning from DPPs via Sampling: Beyond HKPV and symmetry ( http://arxiv.org/abs/2007.04287v1 )

ライセンス: Link先を確認

R\'emi Bardenet and Subhroshekhar Ghosh

(参考訳) 決定点プロセス(DPP)は,これらの確率的モデルの本質的な能力を生かして,サンプルの多様性を促進する,レコメンデーションシステム,特徴選択,要約抽出のための重要なツールとなっている。 DPPからサンプルを採取する能力は、これらのモデルの実証的研究に最重要である。ほとんどの正確なサンプルは、Hough、Krishnapur、Peres、Vir\'ag (henceforth HKPV)によるスペクトルメタアルゴリズムの変種である。対称カーネルを持つDPPでは、スケーラブルなHKPVサンプリング器が提案されており、まずはアイテムの基底セットをダウンサンプルするか、Nystr\"om型分解を用いてカーネルをローランクにする。本研究では,HKPVとは大きく異なるアプローチを提案する。 DPP(いわゆる線形統計学)の重要な可観測値だけをサンプリングすることで、多くの統計的および学習目的が効果的に達成できるという事実が発覚し、そのような可観測値のラプラス変換の式を1つの行列式として呼び出す。従来の低ランク近似手法とラプラス逆解析を組み合わせることで,dppの線形統計量の分布関数を直接近似する方法を示す。この分布関数は、要求に従って仮説テストや実際に線形統計学をサンプリングするのに使うことができる。我々のアプローチはスケーラブルであり、従来の対称カーネルを超えて非常に一般的なDPPに適用できる。

Determinantal point processes (DPPs) have become a significant tool for recommendation systems, feature selection, or summary extraction, harnessing the intrinsic ability of these probabilistic models to facilitate sample diversity. The ability to sample from DPPs is paramount to the empirical investigation of these models. Most exact samplers are variants of a spectral meta-algorithm due to Hough, Krishnapur, Peres and Vir\'ag (henceforth HKPV), which is in general time and resource intensive. For DPPs with symmetric kernels, scalable HKPV samplers have been proposed that either first downsample the ground set of items, or force the kernel to be low-rank, using e.g. Nystr\"om-type decompositions. In the present work, we contribute a radically different approach than HKPV. Exploiting the fact that many statistical and learning objectives can be effectively accomplished by only sampling certain key observables of a DPP (so-called linear statistics), we invoke an expression for the Laplace transform of such an observable as a single determinant, which holds in complete generality. Combining traditional low-rank approximation techniques with Laplace inversion algorithms from numerical analysis, we show how to directly approximate the distribution function of a linear statistic of a DPP. This distribution function can then be used in hypothesis testing or to actually sample the linear statistic, as per requirement. Our approach is scalable and applies to very general DPPs, beyond traditional symmetric kernels.

翻訳日:2022-11-12 12:42:04 公開日:2020-07-08

# 楽観的スコア比を用いたロバストベイズ分類

Robust Bayesian Classification Using an Optimistic Score Ratio ( http://arxiv.org/abs/2007.04458v1 )

ライセンス: Link先を確認

Viet Anh Nguyen and Nian Si and Jose Blanchet

(参考訳) 我々は,クラス条件,あるいは文脈分布に関する情報が限られている場合,頑健なバイナリ分類のための楽観的スコア比を用いたベイズ文脈分類モデルを構築する。楽観的なスコアは、平均ベクトルと基礎となる文脈分布の共分散行列に制限された構造的制約を用いて規定される文脈曖昧性集合に属するすべての分布のうち、テストサンプルの観測結果を説明する最も有効な分布を探索する。楽観的スコア比を用いたベイズ分類器は,概念的に魅力的であり,統計的保証がしっかりでき,計算も容易である。提案する楽観的スコア比分類器のパワーを合成データと実験データの両方に示す。

We build a Bayesian contextual classification model using an optimistic score ratio for robust binary classification when there is limited information on the class-conditional, or contextual, distribution. The optimistic score searches for the distribution that is most plausible to explain the observed outcomes in the testing sample among all distributions belonging to the contextual ambiguity set which is prescribed using a limited structural constraint on the mean vector and the covariance matrix of the underlying contextual distribution. We show that the Bayesian classifier using the optimistic score ratio is conceptually attractive, delivers solid statistical guarantees and is computationally tractable. We showcase the power of the proposed optimistic score ratio classifier on both synthetic and empirical data.

翻訳日:2022-11-12 12:41:03 公開日:2020-07-08

# URSABench:ディープニューラルネットワークのための近似ベイズ推論法の総合ベンチマーク

URSABench: Comprehensive Benchmarking of Approximate Bayesian Inference Methods for Deep Neural Networks ( http://arxiv.org/abs/2007.04466v1 )

ライセンス: Link先を確認

Meet P. Vadera, Adam D. Cobb, Brian Jalaian, Benjamin M. Marlin

(参考訳) ディープラーニング手法は、幅広いアプリケーションドメインにおける予測精度の向上を続けているが、不確実性とその堅牢性など、パフォーマンスの他の面においても重要な問題が残っている。ベイズ近似の最近の進歩はこれらの問題に対処する上で大きな可能性を秘めているが、これらの手法の計算スケーラビリティは大規模モデルに適用した場合に問題となる可能性がある。本稿では,深層学習に基づく分類タスクに着目した近似ベイズ推定手法の包括的評価のためのベンチマークツールのオープンソーススイートであるofursabench(the uncertainty, robustness, scalability, and accu-racy benchmark)の開発に関する初期研究について述べる。

While deep learning methods continue to improve in predictive accuracy on a wide range of application domains, significant issues remain with other aspects of their performance including their ability to quantify uncertainty and their robustness. Recent advances in approximate Bayesian inference hold significant promise for addressing these concerns, but the computational scalability of these methods can be problematic when applied to large-scale models. In this paper, we describe initial work on the development ofURSABench(the Uncertainty, Robustness, Scalability, and Accu-racy Benchmark), an open-source suite of bench-marking tools for comprehensive assessment of approximate Bayesian inference methods with a focus on deep learning-based classification tasks

翻訳日:2022-11-12 12:40:50 公開日:2020-07-08

# 深層学習に基づくidsにおけるニューラルネットワークの異なるタイプの敵訓練の評価

Evaluation of Adversarial Training on Different Types of Neural Networks in Deep Learning-based IDSs ( http://arxiv.org/abs/2007.04472v1 )

ライセンス: Link先を確認

Rana Abou Khamis and Ashraf Matrawy

(参考訳) ディープニューラルネットワークの侵入検知システムを含むネットワークセキュリティアプリケーションは、異常活動の検出タスクをより正確かつ堅牢にするために急速に増加している。 DNNの利用が急速に増加し、システム内を移動するデータ量が増えると、敵の攻撃の種類が増えていることが深刻な課題となっている。本稿では,ニューラルネットワーク(convolutional neural networks, cnn)やrecurrent neural networks(rnn)など,さまざまなニューラルネットワークを用いた,さまざまな回避攻撃の有効性と,レジリエンスに基づくディープラーニングidのトレーニング方法について検討する。 min-maxアプローチを用いて,2つのベンチマークデータセットを用いて,対向例に対するロバストidのトレーニング問題を定式化する。異なるディープラーニングアルゴリズムと異なるベンチマークデータセットに関する実験により、敵の訓練に基づくmin-maxアプローチによる防御が、よく知られた5つの敵の攻撃方法に対する堅牢性を向上することを示した。

Network security applications, including intrusion detection systems of deep neural networks, are increasing rapidly to make detection task of anomaly activities more accurate and robust. With the rapid increase of using DNN and the volume of data traveling through systems, different growing types of adversarial attacks to defeat them create a severe challenge. In this paper, we focus on investigating the effectiveness of different evasion attacks and how to train a resilience deep learning-based IDS using different Neural networks, e.g., convolutional neural networks (CNN) and recurrent neural networks (RNN). We use the min-max approach to formulate the problem of training robust IDS against adversarial examples using two benchmark datasets. Our experiments on different deep learning algorithms and different benchmark datasets demonstrate that defense using an adversarial training-based min-max approach improves the robustness against the five well-known adversarial attack methods.

翻訳日:2022-11-12 12:40:37 公開日:2020-07-08

# 多視点知識蒸留によるロバスト再同定

Robust Re-Identification by Multiple Views Knowledge Distillation ( http://arxiv.org/abs/2007.04174v1 )

ライセンス: Link先を確認

Angelo Porrello, Luca Bergamini, Simone Calderara

(参考訳) 再同定におけるロバスト性を実現するため、標準手法では追跡情報をビデオ対ビデオ方式で活用する。しかし、これらのソリューションは、単一の画像クエリ(例えば、画像からビデオへの設定)のパフォーマンスが大幅に低下する。近年の研究では,映像ベースネットワークから画像ベースネットワークへ時間情報を転送することで,この深刻な劣化に対処している。本研究は,対象対象を描写した一組の視点から,優れた知識の伝達を可能にするトレーニング戦略を考案する。本提案では,教師がより少ない視点を観察する生徒を教育する教師・学生の枠組みにおいて,この視覚的多様性を監督信号として捉える。その結果、学生は教師だけでなく、映像対ビデオの最先端技術も大きく上回っている(火星では6.3%、デューク=ビデオ=リードでは8.6%、ヴェリ-776では5%)。人, 乗り物, 動物リidの徹底分析により, vkdの特性を定性的, 定量的に検討した。コードはhttps://github.com/aimagelab/VKD.comで入手できる。

To achieve robustness in Re-Identification, standard methods leverage tracking information in a Video-To-Video fashion. However, these solutions face a large drop in performance for single image queries (e.g., Image-To-Video setting). Recent works address this severe degradation by transferring temporal information from a Video-based network to an Image-based one. In this work, we devise a training strategy that allows the transfer of a superior knowledge, arising from a set of views depicting the target object. Our proposal - Views Knowledge Distillation (VKD) - pins this visual variety as a supervision signal within a teacher-student framework, where the teacher educates a student who observes fewer views. As a result, the student outperforms not only its teacher but also the current state-of-the-art in Image-To-Video by a wide margin (6.3% mAP on MARS, 8.6% on Duke-Video-ReId and 5% on VeRi-776). A thorough analysis - on Person, Vehicle and Animal Re-ID - investigates the properties of VKD from a qualitatively and quantitatively perspective. Code is available at https://github.com/aimagelab/VKD.

翻訳日:2022-11-12 12:38:43 公開日:2020-07-08

# 第四カプセルネットワーク

Quaternion Capsule Networks ( http://arxiv.org/abs/2007.04389v1 )

ライセンス: Link先を確認

Bar{\i}\c{s} \"Ozcan, Furkan K{\i}nl{\i}, Furkan K{\i}ra\c{c}

(参考訳) カプセルはニューロンのグループ化であり、ポーズや特徴といった視覚的な実体の洗練された情報を表現できる。この特性の観点から、Capsule Networksは、オブジェクト認識のような困難なタスクにおいてCNNよりも優れており、これは、ポーズ情報の高次元表現の助けを借りて、オブジェクトとその部分間の変換を学習することによって達成される。本稿では、カプセルのポーズ情報とその変換を四元数で表現する四元数カプセル(QCN)について述べる。四元系はジンバルロックに免疫があり、カプセルの回転表現の正則化が容易であり、行列よりもパラメータの数が少ない。実験の結果、qcnsは、パラメータの少ない新しい視点に一般化し、よく知られたベンチマークデータセット上の最先端のカプセルアーキテクチャで、ほぼあるいはより優れたパフォーマンスを達成することが示された。

Capsules are grouping of neurons that allow to represent sophisticated information of a visual entity such as pose and features. In the view of this property, Capsule Networks outperform CNNs in challenging tasks like object recognition in unseen viewpoints, and this is achieved by learning the transformations between the object and its parts with the help of high dimensional representation of pose information. In this paper, we present Quaternion Capsules (QCN) where pose information of capsules and their transformations are represented by quaternions. Quaternions are immune to the gimbal lock, have straightforward regularization of the rotation representation for capsules, and require less number of parameters than matrices. The experimental results show that QCNs generalize better to novel viewpoints with fewer parameters, and also achieve on-par or better performances with the state-of-the-art Capsule architectures on well-known benchmarking datasets.

翻訳日:2022-11-12 12:38:22 公開日:2020-07-08

# 職場でよく使われる性差別的言明の自動検出

Automatic Detection of Sexist Statements Commonly Used at the Workplace ( http://arxiv.org/abs/2007.04181v1 )

ライセンス: Link先を確認

Dylan Grosz, Patricia Conde-Cespedes

(参考訳) 職場でのヘイトスピーチの検出は、社会的コンテキストが従来のヘイトスピーチの微妙なバージョンを意味するため、ユニークな分類タスクである。最先端の職場性差別検出モデルに関するアプリケーションには、ヒューマンリソース部門の支援、AIチャットボット、感情分析などがある。既存のヘイトスピーチ検出手法のほとんどは、堅牢で正確だが、ソーシャルメディア、特にTwitterで見られるヘイトスピーチに焦点を当てている。ソーシャルメディアの文脈は職場よりもはるかに匿名であるため、セクシズムのより攻撃的で「敵対的な」バージョンに結びつく傾向がある。したがって、大量の"敵対的"性差別を持つデータセットは、"敵対的"性差別的ステートメントが、文脈に関係なく、モデルにセクシストであることを示唆する2つの単語をヒンジできるため、少し簡単に検出できる。本稿では,職場で語られる可能性が高い性差別的発言のデータセットと,最先端の成果を得られる深層学習モデルを提案する。これまでの研究は、単に集約されたtwitterのデータに基づいて「敵意」と「善意」のセクシズムを区別するための最先端のモデルを作ってきた。我々のディープラーニング手法は、GloVeやランダムな単語埋め込みで初期化され、LSTMを使用して、より多様なフィルタリングされたデータセットでこれらのモデルをパフォーマンスし、職場の性差別をより対象とし、F1スコアが0.88になる。

Detecting hate speech in the workplace is a unique classification task, as the underlying social context implies a subtler version of conventional hate speech. Applications regarding a state-of the-art workplace sexism detection model include aids for Human Resources departments, AI chatbots and sentiment analysis. Most existing hate speech detection methods, although robust and accurate, focus on hate speech found on social media, specifically Twitter. The context of social media is much more anonymous than the workplace, therefore it tends to lend itself to more aggressive and "hostile" versions of sexism. Therefore, datasets with large amounts of "hostile" sexism have a slightly easier detection task since "hostile" sexist statements can hinge on a couple words that, regardless of context, tip the model off that a statement is sexist. In this paper we present a dataset of sexist statements that are more likely to be said in the workplace as well as a deep learning model that can achieve state-of-the art results. Previous research has created state-of-the-art models to distinguish "hostile" and "benevolent" sexism based simply on aggregated Twitter data. Our deep learning methods, initialized with GloVe or random word embeddings, use LSTMs with attention mechanisms to outperform those models on a more diverse, filtered dataset that is more targeted towards workplace sexism, leading to an F1 score of 0.88.

翻訳日:2022-11-12 12:37:51 公開日:2020-07-08

# 良性はどんなに良性に満ちているか?

How benign is benign overfitting? ( http://arxiv.org/abs/2007.04028v1 )

ライセンス: Link先を確認

Amartya Sanyal, Puneet K Dokania, Varun Kanade, Philip H.S. Torr

(参考訳) 深層ニューラルネットワークにおける敵意的脆弱性の原因として,悪いデータと(おそらく)訓練されたモデルについて検討する。 sgdでトレーニングすると、深層ニューラルネットワークはラベルノイズの存在下でも、基本的にトレーニングエラーをゼロにすると同時に、良性オーバーフィット(benign overfitting)と呼ばれる自然テストデータに対して優れた一般化を示す。しかし、これらのモデルは敵の攻撃に弱い。我々は,ラベルノイズを敵の脆弱性の原因の一つとみなし,これを支持する理論的・実証的な証拠を提供する。驚くべきことに、MNISTやCIFARといったデータセットでラベルノイズのいくつかの例が見つかり、堅牢にトレーニングされたモデルがこれらのいくつかでトレーニングエラーを引き起こしている。しかし、ノイズの多いラベルを除去するだけでは、敵の堅牢性を達成できない。標準的なトレーニング手順は、ニューラルネットワークを"単純な"分類境界の学習に偏らせる。敵の訓練がより複雑な決定境界を生み出すことを観察する。複雑な決定境界の必要性の一部は、準最適表現学習から生じると推測する。単純な玩具の例を用いて、表現の選択が敵の強靭性に大きな影響を与えるか理論的に示す。

We investigate two causes for adversarial vulnerability in deep neural networks: bad data and (poorly) trained models. When trained with SGD, deep neural networks essentially achieve zero training error, even in the presence of label noise, while also exhibiting good generalization on natural test data, something referred to as benign overfitting [2, 10]. However, these models are vulnerable to adversarial attacks. We identify label noise as one of the causes for adversarial vulnerability, and provide theoretical and empirical evidence in support of this. Surprisingly, we find several instances of label noise in datasets such as MNIST and CIFAR, and that robustly trained models incur training error on some of these, i.e. they don't fit the noise. However, removing noisy labels alone does not suffice to achieve adversarial robustness. Standard training procedures bias neural networks towards learning "simple" classification boundaries, which may be less robust than more complex ones. We observe that adversarial training does produce more complex decision boundaries. We conjecture that in part the need for complex decision boundaries arises from sub-optimal representation learning. By means of simple toy examples, we show theoretically how the choice of representation can drastically affect adversarial robustness.

翻訳日:2022-11-12 10:12:02 公開日:2020-07-08

# モデル同定における構造制約に関する事前知識の導入

Incorporating prior knowledge about structural constraints in model identification ( http://arxiv.org/abs/2007.04030v1 )

ライセンス: Link先を確認

Deepak Maurya, Sivadurgaprasad Chinta, Abhishek Sivaram and Raghunathan Rengaswamy

(参考訳) モデル同定は化学産業において重要な問題である。近年,関心システムに関する部分的知識を活用したデータ駆動モデル学習への関心が高まっている。モデル識別のためのほとんどの技術は、モデルの構造のような部分的な情報を組み込む自由を与えていない。本稿では,そのような部分的情報を利用してより良い推定値を生成するモデル同定手法を提案する。具体的には,本モデルに関する本質的な構造情報を利用して,pcaなどの既存手法を改良した構造主成分分析(spca)を提案する。既存の手法や近縁な手法のほとんどは、計算コストのかかる空間的制約を用いる。提案手法は構造情報を利用するためのPCAの賢明な修正である。提案手法の有効性を合成および工業ケーススタディーを用いて実証した。

Model identification is a crucial problem in chemical industries. In recent years, there has been increasing interest in learning data-driven models utilizing partial knowledge about the system of interest. Most techniques for model identification do not provide the freedom to incorporate any partial information such as the structure of the model. In this article, we propose model identification techniques that could leverage such partial information to produce better estimates. Specifically, we propose Structural Principal Component Analysis (SPCA) which improvises over existing methods like PCA by utilizing the essential structural information about the model. Most of the existing methods or closely related methods use sparsity constraints which could be computationally expensive. Our proposed method is a wise modification of PCA to utilize structural information. The efficacy of the proposed approach is demonstrated using synthetic and industrial case-studies.

翻訳日:2022-11-12 10:11:42 公開日:2020-07-08

# Diverse Ensemblesは校正を改善する

Diverse Ensembles Improve Calibration ( http://arxiv.org/abs/2007.04206v1 )

ライセンス: Link先を確認

Asa Cooper Stickland and Iain Murray

(参考訳) 現代のディープニューラルネットワークは、特に列車とテストの分布が不一致している場合に、校正された予測を生じることがある。モデルの集合をトレーニングし、予測を平均化することは、これらの問題を緩和するのに役立ちます。アンサンブルメンバーごとに異なるデータ拡張を用いてキャリブレーションを改善するための簡単な手法を提案する。さらに、'mixing'という未拡張および拡張入力のアイデアを用いて、テストとトレーニングの分布が同じであるときに校正を改善する。これらの単純なテクニックは、CIFAR10とCIFAR100ベンチマークの強いベースラインに対する校正と精度を改善し、その破損したバージョンのドメイン外データを改善する。

Modern deep neural networks can produce badly calibrated predictions, especially when train and test distributions are mismatched. Training an ensemble of models and averaging their predictions can help alleviate these issues. We propose a simple technique to improve calibration, using a different data augmentation for each ensemble member. We additionally use the idea of `mixing' un-augmented and augmented inputs to improve calibration when test and training distributions are the same. These simple techniques improve calibration and accuracy over strong baselines on the CIFAR10 and CIFAR100 benchmarks, and out-of-domain data from their corrupted versions.

翻訳日:2022-11-12 10:10:07 公開日:2020-07-08

# 適応部分モジュラー最大化のための線形時間アルゴリズム

Linear-Time Algorithms for Adaptive Submodular Maximization ( http://arxiv.org/abs/2007.04214v1 )

ライセンス: Link先を確認

Shaojie Tang

(参考訳) 本稿では,2つの確率的部分モジュラー最大化問題に対する高速アルゴリズムを提案する。まず,濃度制約を満たした適応部分モジュラー最大化問題から始める。近似比(1-1/e-\epsilon)$の線形時間アルゴリズムを開発した。特に、我々のアルゴリズムの時間複雑性は$O(n\log\frac{1}{\epsilon})$(関数評価の数)であり、これは濃度制約とは独立であり、$n$は基底集合のサイズである。次に,完全適応部分モジュラリティの概念を導入し,分割マトロイド制約を受ける完全適応部分モジュラリティ関数を最大化する線形時間アルゴリズムを開発した。 1-1/e-\epsilon}{4-2/e-2\epsilon}$の近似比をo(n\log\frac{1}{\epsilon})$の関数評価のみを用いて達成することを示す。

In this paper, we develop fast algorithms for two stochastic submodular maximization problems. We start with the well-studied adaptive submodular maximization problem subject to a cardinality constraint. We develop the first linear-time algorithm which achieves a $(1-1/e-\epsilon)$ approximation ratio. Notably, the time complexity of our algorithm is $O(n\log\frac{1}{\epsilon})$ (number of function evaluations) which is independent of the cardinality constraint, where $n$ is the size of the ground set. Then we introduce the concept of fully adaptive submodularity, and develop a linear-time algorithm for maximizing a fully adaptive submoudular function subject to a partition matroid constraint. We show that our algorithm achieves a $\frac{1-1/e-\epsilon}{4-2/e-2\epsilon}$ approximation ratio using only $O(n\log\frac{1}{\epsilon})$ number of function evaluations.

翻訳日:2022-11-12 10:09:57 公開日:2020-07-08

# RicciNets:Ricci Flowを用いた高速ニューラルネットワークの曲率誘導プルーニング

RicciNets: Curvature-guided Pruning of High-performance Neural Networks Using Ricci Flow ( http://arxiv.org/abs/2007.04216v1 )

ライセンス: Link先を確認

Samuel Glass, Simeon Spasov, Pietro Li\`o

(参考訳) トレーニング前にランダムに配線されたニューラルネットワーク内の有線計算経路を同定する新しい手法を提案する。この計算グラフは、局所グラフ測度で定義されるノード質量確率関数に基づいてプルーニングされ、強化学習ベースの制御ニューラルネットワークによって生成されるハイパーパラメータによって重み付けされる。計算グラフをニューラルネットワークにマッピングする前に,リッチ曲率の定義を用いて,重要度の低いエッジを除去する。我々は,1パスあたりの浮動小数点演算数(flops)の約35\%$の削減を示し,性能の低下はみられなかった。さらに,本手法は,ランダムに繋がったニューラルネットワークを純粋に構造的特性に基づいて正規化することを可能にし,一方のネットワークで識別される好適な特性が他のネットワークに一般化できることを見いだすことができる。この方法では、低マグニチュード重みで刈り取られたものと類似した圧縮下で、より優れた性能のネットワークを生成する。我々の知る限り、これはランダムに配線されたニューラルネットワークをプルーニングする最初の研究であり、プルーニング機構でリッチ曲率のトポロジカル測度を利用する最初の研究である。

A novel method to identify salient computational paths within randomly wired neural networks before training is proposed. The computational graph is pruned based on a node mass probability function defined by local graph measures and weighted by hyperparameters produced by a reinforcement learning-based controller neural network. We use the definition of Ricci curvature to remove edges of low importance before mapping the computational graph to a neural network. We show a reduction of almost $35\%$ in the number of floating-point operations (FLOPs) per pass, with no degradation in performance. Further, our method can successfully regularize randomly wired neural networks based on purely structural properties, and also find that the favourable characteristics identified in one network generalise to other networks. The method produces networks with better performance under similar compression to those pruned by lowest-magnitude weights. To our best knowledge, this is the first work on pruning randomly wired neural networks, as well as the first to utilize the topological measure of Ricci curvature in the pruning mechanism.

翻訳日:2022-11-12 10:09:40 公開日:2020-07-08

# 感情認識のための音声・視覚の時間的アグリゲーション

Temporal aggregation of audio-visual modalities for emotion recognition ( http://arxiv.org/abs/2007.04364v1 )

ライセンス: Link先を確認

Andreea Birhala, Catalin Nicolae Ristea, Anamaria Radoi, Liviu Cristian Dutu

(参考訳) 感情認識は感情コンピューティングや人間とコンピュータの相互作用において重要な役割を担っている。現在の技術進歩は、人の感情状態に関するデータを収集する可能性を高めている。一般に、被写体が伝達する感情に関する人間の知覚は、被写体との最初の秒間で収集された声と視覚情報に基づいている。その結果、感情認識に対する現在のアプローチのほとんどにおいて、言語的(つまり、音声)と非言語的(すなわち、画像)の情報の統合が望ましい選択であるように思われる。本稿では,各モダリティに対する時間的オフセットの異なる時間窓からの音声と視覚のモダリティを組み合わせた感情認識のためのマルチモーダル融合手法を提案する。提案手法は,文献や精度評価から,他の手法よりも優れていることを示す。実験は、オープンアクセスマルチモーダルデータセットCREMA-D上で実施される。

Emotion recognition has a pivotal role in affective computing and in human-computer interaction. The current technological developments lead to increased possibilities of collecting data about the emotional state of a person. In general, human perception regarding the emotion transmitted by a subject is based on vocal and visual information collected in the first seconds of interaction with the subject. As a consequence, the integration of verbal (i.e., speech) and non-verbal (i.e., image) information seems to be the preferred choice in most of the current approaches towards emotion recognition. In this paper, we propose a multimodal fusion technique for emotion recognition based on combining audio-visual modalities from a temporal window with different temporal offsets for each modality. We show that our proposed method outperforms other methods from the literature and human accuracy rating. The experiments are conducted over the open-access multimodal dataset CREMA-D.

翻訳日:2022-11-12 10:03:12 公開日:2020-07-08

# 知識グラフに基づく意味融合による会話レコメンダシステムの改善

Improving Conversational Recommender Systems via Knowledge Graph based Semantic Fusion ( http://arxiv.org/abs/2007.04032v1 )

ライセンス: Link先を確認

Kun Zhou, Wayne Xin Zhao, Shuqing Bian, Yuanhang Zhou, Ji-Rong Wen, Jingsong Yu

(参考訳) 対話型推薦システム(CRS)は,対話型対話を通じて高品質な項目をユーザに推薦することを目的としている。 CRSの取り組みはいくつかあるが、2つの大きな問題はまだ解決されていない。まず、会話データ自体にユーザの好みを正確に理解するための十分なコンテキスト情報がない。第二に、自然言語表現とアイテムレベルのユーザ嗜好の間には意味的なギャップがある。これらの問題に対処するために、単語指向とエンティティ指向の知識グラフ(kg)の両方を組み込んでcrssのデータ表現を強化し、単語レベルとエンティティレベルの意味空間を整合させるために相互情報最大化を採用する。協調したセマンティック表現に基づいて、正確なレコメンデーションを行うためのKG強化レコメンデーションコンポーネントと、応答テキストに情報的キーワードやエンティティを生成するKG強化ダイアログコンポーネントをさらに発展させる。提案手法の有効性を実証し,提案手法の有効性を検証した。

Conversational recommender systems (CRS) aim to recommend high-quality items to users through interactive conversations. Although several efforts have been made for CRS, two major issues still remain to be solved. First, the conversation data itself lacks of sufficient contextual information for accurately understanding users' preference. Second, there is a semantic gap between natural language expression and item-level user preference. To address these issues, we incorporate both word-oriented and entity-oriented knowledge graphs (KG) to enhance the data representations in CRSs, and adopt Mutual Information Maximization to align the word-level and entity-level semantic spaces. Based on the aligned semantic representations, we further develop a KG-enhanced recommender component for making accurate recommendations, and a KG-enhanced dialog component that can generate informative keywords or entities in the response text. Extensive experiments have demonstrated the effectiveness of our approach in yielding better performance on both recommendation and conversation tasks.

翻訳日:2022-11-12 10:02:59 公開日:2020-07-08

# 敵対的摂動に頑健な深層ニューラルネットワークの高速学習

Fast Training of Deep Neural Networks Robust to Adversarial Perturbations ( http://arxiv.org/abs/2007.03832v1 )

ライセンス: Link先を確認

Justin Goodwin, Olivia Brown, Victoria Helus

(参考訳) ディープニューラルネットワークは、多くの領域で高速にトレーニングし、うまく一般化することができる。その有望な性能にもかかわらず、ディープネットワークは入力の摂動に対する感性(例えば、敵の例)を示しており、学習した特徴表現はしばしば解釈が困難であり、真の能力と信頼性に関する懸念を提起している。近年の対人訓練における研究は、モデルが対人例に対して最適化される頑健な最適化の形で、摂動に対する性能感性を改善し、より解釈可能な特徴表現を得る能力を示している。しかし、敵対的なトレーニングは、標準(すなわち非ロバスト)トレーニングよりも計算コストが増大し、大規模問題での使用には非現実的になる。最近の研究は、敵の訓練に対する迅速な近似が、無限大規範に縛られた摂動の存在下でトレーニング時間を短縮し、堅牢性を維持することを約束していることを示している。本研究では,本手法がユークリッドのノルムにまで拡張され,ロバストモデルに共通する人間による特徴表現が保たれることを示す。さらに,分散学習方式を用いることで,堅牢な深層ネットワークをトレーニングする時間をさらに短縮できることを示す。高速対人トレーニングは、堅牢な最適化が非現実的と考えられていた機械学習アプリケーションにおいて、セキュリティと説明可能性の向上を提供する、有望なアプローチである。

Deep neural networks are capable of training fast and generalizing well within many domains. Despite their promising performance, deep networks have shown sensitivities to perturbations of their inputs (e.g., adversarial examples) and their learned feature representations are often difficult to interpret, raising concerns about their true capability and trustworthiness. Recent work in adversarial training, a form of robust optimization in which the model is optimized against adversarial examples, demonstrates the ability to improve performance sensitivities to perturbations and yield feature representations that are more interpretable. Adversarial training, however, comes with an increased computational cost over that of standard (i.e., nonrobust) training, rendering it impractical for use in large-scale problems. Recent work suggests that a fast approximation to adversarial training shows promise for reducing training time and maintaining robustness in the presence of perturbations bounded by the infinity norm. In this work, we demonstrate that this approach extends to the Euclidean norm and preserves the human-aligned feature representations that are common for robust models. Additionally, we show that using a distributed training scheme can further reduce the time to train robust deep networks. Fast adversarial training is a promising approach that will provide increased security and explainability in machine learning applications for which robust optimization was previously thought to be impractical.

翻訳日:2022-11-12 10:02:17 公開日:2020-07-08

# 非線形性を持つリニアテンソル投影

Linear Tensor Projection Revealing Nonlinearity ( http://arxiv.org/abs/2007.03912v1 )

ライセンス: Link先を確認

Koji Maruhashi, Heewon Park, Rui Yamaguchi, Satoru Miyano

(参考訳) 次元の縮小は高次元データの学習に有効な方法であり、人間の可読低次元部分空間における決定境界をよりよく理解することができる。主成分分析や線形判別分析のような線形手法は、多くの変数間の相関を捉えることができるが、データ予測において重要な相関を捉えることは保証されていない。さらに、決定境界が強い非線形性を持つ場合、保証はますます困難になる。この問題は、データが変数間の関係を表す行列またはテンソルであるときに悪化する。本研究は,サブスペース内の予測モデルが強い非線形性を持つ場合でも,元のデータ情報を可能な限り保持しつつ,予測精度を最大化する部分空間を探索する学習手法を提案する。これにより、ユーザが知りたがっている予測問題の背後にある変数のグループのメカニズムを、容易に解釈できる。本手法は, 行列やテンソルを含む各種データに適用することにより, 有効性を示す。

Dimensionality reduction is an effective method for learning high-dimensional data, which can provide better understanding of decision boundaries in human-readable low-dimensional subspace. Linear methods, such as principal component analysis and linear discriminant analysis, make it possible to capture the correlation between many variables; however, there is no guarantee that the correlations that are important in predicting data can be captured. Moreover, if the decision boundary has strong nonlinearity, the guarantee becomes increasingly difficult. This problem is exacerbated when the data are matrices or tensors that represent relationships between variables. We propose a learning method that searches for a subspace that maximizes the prediction accuracy while retaining as much of the original data information as possible, even if the prediction model in the subspace has strong nonlinearity. This makes it easier to interpret the mechanism of the group of variables behind the prediction problem that the user wants to know. We show the effectiveness of our method by applying it to various types of data including matrices and tensors.

翻訳日:2022-11-12 10:01:42 公開日:2020-07-08

# binary stochastic filter: 特徴の選択とその先

Binary Stochastic Filtering: feature selection and beyond ( http://arxiv.org/abs/2007.03920v1 )

ライセンス: Link先を確認

Andrii Trelin and Ale\v{s} Proch\'azka

(参考訳) 特徴選択は、データと機械学習モデルを理解する上で最も決定的なツールの1つである。他の方法では、$l^{1}$ペナルティによって引き起こされるスパーシティは、この問題に対する最も単純で最もよく研究されたアプローチの1つである。このような正規化は、重みの空間性やユニットアクティベーションを達成するためにニューラルネットワークで頻繁に使用されるが、特徴選択問題にどのように適用できるかは不明である。この研究は、階層重みの代わりに統計的に機能関与を罰することで、空間規則化をどのように使用できるかを再考することで、ニューラルネットワークを自動で特徴の選択能力で拡張することを目的としている。提案手法は,数種類の古典的手法と比較して高い効率を示し,計算オーバーヘッドを最小限に抑え,既存のアーキテクチャに直接適用できることを示した。さらに、この方法はニューロンのプルーニングやスペクトルデータの重要領域の選択に容易に一般化できる。

Feature selection is one of the most decisive tools in understanding data and machine learning models. Among other methods, sparsity induced by $L^{1}$ penalty is one of the simplest and best studied approaches to this problem. Although such regularization is frequently used in neural networks to achieve sparsity of weights or unit activations, it is unclear how it can be employed in the feature selection problem. This work aims at extending the neural network with ability to automatically select features by rethinking how the sparsity regularization can be used, namely, by stochastically penalizing feature involvement instead of the layer weights. The proposed method has demonstrated superior efficiency when compared to a few classical methods, achieved with minimal or no computational overhead, and can be directly applied to any existing architecture. Furthermore, the method is easily generalizable for neuron pruning and selection of regions of importance for spectral data.

翻訳日:2022-11-12 10:01:28 公開日:2020-07-08

# スクリーニングテストによるスパースベイズ学習の高速化とその応用

Accelerated Sparse Bayesian Learning via Screening Test and Its Applications ( http://arxiv.org/abs/2007.04006v1 )

ライセンス: Link先を確認

Yiping Jiang, Tianshi Chen

(参考訳) 高次元の設定では、スパース構造はメモリと計算の複雑さの点で効率上重要である。線形系では、直交する特徴の過剰完備な辞書が提供される最も簡単な解を見つけることは、通常NPハードであり、代替の近似法を考える必要がある。本稿では,経験的ベイズアプローチとして,LASSOのような固定された先行手法よりも,解の空間性を促進するためにパラメータ化を事前に用いた,疎ベイズ学習を選択する。しかし、スクリーニングテストは、最適な解において係数がゼロであることが保証された特徴のサブセットを迅速に識別することを目的としており、より小さく、より簡単に解決できる問題を得るために、完全な辞書から安全に取り除くことができる。次に、より小さな問題を解き、その後、小さな解をゼロでパディングすることで元の問題の解を復元する。提案手法の性能は,様々なデータセットやアプリケーションで検討する。

In high-dimensional settings, sparse structures are critical for efficiency in term of memory and computation complexity. For a linear system, to find the sparsest solution provided with an over-complete dictionary of features directly is typically NP-hard, and thus alternative approximate methods should be considered. In this paper, our choice for alternative method is sparse Bayesian learning, which, as empirical Bayesian approaches, uses a parameterized prior to encourage sparsity in solution, rather than the other methods with fixed priors such as LASSO. Screening test, however, aims at quickly identifying a subset of features whose coefficients are guaranteed to be zero in the optimal solution, and then can be safely removed from the complete dictionary to obtain a smaller, more easily solved problem. Next, we solve the smaller problem, after which the solution of the original problem can be recovered by padding the smaller solution with zeros. The performance of the proposed method will be examined on various data sets and applications.

翻訳日:2022-11-12 10:00:40 公開日:2020-07-08

# フリーハンド超音波における自動プローブ運動誘導法

Automatic Probe Movement Guidance for Freehand Obstetric Ultrasound ( http://arxiv.org/abs/2007.04480v1 )

ライセンス: Link先を確認

Richard Droste, Lior Drukker, Aris T. Papageorghiou, J. Alison Noble

(参考訳) そこで本研究では, 標準平面獲得のためのリアルタイムプローブ移動指導を行う最初のシステムを提案する。このようなシステムは、オペレーターの専門知識のレベルを低くすることで、世界中の産科超音波スキャンの展開に寄与することができる。本システムは、プローブに取り付けられた慣性測定ユニット(IMU)の超音波ビデオ信号と運動信号を受信し、誘導信号を予測する人工ニューラルネットワークを用いる。 US-GuideNetと呼ばれるネットワークは、標準平面位置への移動(ゴール予測)または専門家のソノグラフィーが実行する次の動き(アクション予測)を予測する。他の超音波応用のための既存のモデルはシミュレーションやファントムで訓練されているが、実際の超音波ビデオを用いてモデルを訓練し、17人のソノグラフィーによる464の定期的な臨床スキャンからプローブ運動データを収集する。 3種類の標準平面に対する評価は、このモデルが目標予測に88.8%、行動予測に90.9%の精度で有用な誘導信号を提供することを示している。

We present the first system that provides real-time probe movement guidance for acquiring standard planes in routine freehand obstetric ultrasound scanning. Such a system can contribute to the worldwide deployment of obstetric ultrasound scanning by lowering the required level of operator expertise. The system employs an artificial neural network that receives the ultrasound video signal and the motion signal of an inertial measurement unit (IMU) that is attached to the probe, and predicts a guidance signal. The network termed US-GuideNet predicts either the movement towards the standard plane position (goal prediction), or the next movement that an expert sonographer would perform (action prediction). While existing models for other ultrasound applications are trained with simulations or phantoms, we train our model with real-world ultrasound video and probe motion data from 464 routine clinical scans by 17 accredited sonographers. Evaluations for 3 standard plane types show that the model provides a useful guidance signal with an accuracy of 88.8% for goal prediction and 90.9% for action prediction.

翻訳日:2022-11-12 09:54:20 公開日:2020-07-08

# ダウンサイドリスク制約を持つ自然アクター批判アルゴリズム

A Natural Actor-Critic Algorithm with Downside Risk Constraints ( http://arxiv.org/abs/2007.04203v1 )

ライセンス: Link先を確認

Thomas Spooner and Rahul Savani

(参考訳) リスクに敏感な強化学習に関する既存の研究は、対称とダウンサイドのリスク対策の両方において、政策勾配の直接モンテカルロ推定を用いている。このアプローチは偏りのない勾配推定をもたらすが、時間微分法に比べて高い分散とサンプル効率の低下に苦しむ。本稿では,回帰の下位部分モーメントを指標とした負のリスク回避による予測と制御について検討する。我々は,その非線形性を回避し,下部部分モーメントを上限とする新しいベルマン方程式を導入する。下位部分モーメントに対するこのプロキシが縮小であることを証明し、分散分解によるアルゴリズムの安定性に対する直観を与える。これにより、サンプル効率が良く、部分モーメントのオンライン推定が可能になる。リスクに敏感な制御では、制約されたポリシーを見つけるための近年のアクタークリティカルな手法であるReward Constrained Policy Optimizationを、より低い部分モーメントのプロキシでインスタンス化する。提案手法を自然政策勾配に拡張し,リスクに敏感な強化学習のための3つのベンチマーク問題に対するアプローチの有効性を示す。

Existing work on risk-sensitive reinforcement learning - both for symmetric and downside risk measures - has typically used direct Monte-Carlo estimation of policy gradients. While this approach yields unbiased gradient estimates, it also suffers from high variance and decreased sample efficiency compared to temporal-difference methods. In this paper, we study prediction and control with aversion to downside risk which we gauge by the lower partial moment of the return. We introduce a new Bellman equation that upper bounds the lower partial moment, circumventing its non-linearity. We prove that this proxy for the lower partial moment is a contraction, and provide intuition into the stability of the algorithm by variance decomposition. This allows sample-efficient, on-line estimation of partial moments. For risk-sensitive control, we instantiate Reward Constrained Policy Optimization, a recent actor-critic method for finding constrained policies, with our proxy for the lower partial moment. We extend the method to use natural policy gradients and demonstrate the effectiveness of our approach on three benchmark problems for risk-sensitive reinforcement learning.

翻訳日:2022-11-12 09:54:03 公開日:2020-07-08

# 共同視聴による生音声からの音声表現の学習

Learning Speech Representations from Raw Audio by Joint Audiovisual Self-Supervision ( http://arxiv.org/abs/2007.04134v1 )

ライセンス: Link先を確認

Abhinav Shukla, Stavros Petridis, Maja Pantic

(参考訳) 音声と視覚的モダリティの直感的な相互作用は、クロスモーダルな自己教師付き学習に有用である。この概念は、ビデオのアクション認識や音響シーンの分類といった一般的なオーディオビジュアルタスクで実証されている。しかし、セルフスーパービジョンは視聴覚音声については未検討のままである。生音声波形から自己教師付き音声表現を学習する手法を提案する。音声のみの自己スーパービジョン(情報的オーディオ属性の予測)と視覚的自己スーパービジョン(音声から発話顔を生成する)を組み合わせることで生音声エンコーダを訓練する。 visual pretextタスクは、音声表現を駆動して、唇の動きに関連する情報をキャプチャする。これにより、オーディオエンコーダを視覚情報に富み、エンコーダを視覚的モダリティなしで評価することができる。本手法は,確立された単語分類ベンチマークにおいて,既存の自己教師型音声特徴に対して,競合性能を達成し,ラベルの少ない学習において,他の手法よりも大幅に優れる。また,本手法は教師あり訓練よりも優れており,音声関連タスクの強力な初期化を実現している。本研究は,音声表現を学習するための視聴覚音声におけるマルチモーダル自己スーパービジョンの可能性を示す。

The intuitive interaction between the audio and visual modalities is valuable for cross-modal self-supervised learning. This concept has been demonstrated for generic audiovisual tasks like video action recognition and acoustic scene classification. However, self-supervision remains under-explored for audiovisual speech. We propose a method to learn self-supervised speech representations from the raw audio waveform. We train a raw audio encoder by combining audio-only self-supervision (by predicting informative audio attributes) with visual self-supervision (by generating talking faces from audio). The visual pretext task drives the audio representations to capture information related to lip movements. This enriches the audio encoder with visual information and the encoder can be used for evaluation without the visual modality. Our method attains competitive performance with respect to existing self-supervised audio features on established isolated word classification benchmarks, and significantly outperforms other methods at learning from fewer labels. Notably, our method also outperforms fully supervised training, thus providing a strong initialization for speech related tasks. Our results demonstrate the potential of multimodal self-supervision in audiovisual speech for learning good audio representations.

翻訳日:2022-11-12 09:53:10 公開日:2020-07-08

# 小データセットにおける音素表現学習のための予測符号化モデルの解析

Analysis of Predictive Coding Models for Phonemic Representation Learning in Small Datasets ( http://arxiv.org/abs/2007.04205v1 )

ライセンス: Link先を確認

Mar\'ia Andrea Cruz Bland\'on and Okko R\"as\"anen

(参考訳) 予測符号化を用いたニューラルネットワークモデルは、人間の言語獲得の計算モデルの観点から興味深い。この文献ではいくつかの有望な予測型コーディングベースの学習アルゴリズムが提案されているが、どのように異なる言語に一般化し、データセットサイズをトレーニングするかは現在不明である。また,これらのモデルが効果的な音韻特徴学習者であることを示す一方で,これらのモデルの予測損失関数の最小化が最適音素様表現につながるかどうかも不明である。本研究では,データセットサイズが異なる2つの言語に対する音素識別タスク(abxタスク)における,自己回帰型予測符号化と対比型予測符号化の2つの予測符号化モデルの挙動について検討した。実験では,2つのデータセットとの自己回帰的損失と音素識別スコアとの間に強い相関が認められた。しかし驚いたことに、CPCモデルはトレーニングデータを渡した後既に急速に収束しており、平均すると、その表現は両方の言語でのAPCよりも優れています。

Neural network models using predictive coding are interesting from the viewpoint of computational modelling of human language acquisition, where the objective is to understand how linguistic units could be learned from speech without any labels. Even though several promising predictive coding -based learning algorithms have been proposed in the literature, it is currently unclear how well they generalise to different languages and training dataset sizes. In addition, despite that such models have shown to be effective phonemic feature learners, it is unclear whether minimisation of the predictive loss functions of these models also leads to optimal phoneme-like representations. The present study investigates the behaviour of two predictive coding models, Autoregressive Predictive Coding and Contrastive Predictive Coding, in a phoneme discrimination task (ABX task) for two languages with different dataset sizes. Our experiments show a strong correlation between the autoregressive loss and the phoneme discrimination scores with the two datasets. However, to our surprise, the CPC model shows rapid convergence already after one pass over the training data, and, on average, its representations outperform those of APC on both languages.

翻訳日:2022-11-12 09:52:52 公開日:2020-07-08

# iq-vqa: インテリジェントな視覚的質問応答

IQ-VQA: Intelligent Visual Question Answering ( http://arxiv.org/abs/2007.04422v1 )

ライセンス: Link先を確認

Vatsal Goel, Mohit Chandak, Ashish Anand and Prithwijit Guha

(参考訳) 視覚的な質問に対する回答の分野には大きな進歩があったが、今日のモデルには一貫性がなく、不安定な傾向がある。そこで本研究では,任意のVQAアーキテクチャの一貫性と堅牢性を高めるモデル独立巡回フレームワークを提案する。モデルに元の質問に答えるようにトレーニングし、回答に基づいて含意を生成し、生成した含意に正しく答えることを学びます。循環的枠組みの一部として,任意の問答対から示唆された質問を生成できる新しい含意生成器を提案する。一貫性に関する今後の研究のベースラインとして、新しい人間の注釈付きVQA-Implicationsデータセットを提供する。データセットは、VQA v2.0バリデーションデータセットから作成された3つのタイプの含意(論理的等価性、必要条件、相互排他)を含む約30万の質問で構成されている。 VQAモデルの一貫性をルールベースデータセットで約15%向上し、VQA-Implicationsデータセットで約7%向上し、パフォーマンスを劣化させることなくロバストネスを約2%向上することを示す。さらに,視覚と言語に対するマルチモーダル理解の向上を強調したアテンションマップの改良も定量的に示す。

Even though there has been tremendous progress in the field of Visual Question Answering, models today still tend to be inconsistent and brittle. To this end, we propose a model-independent cyclic framework which increases consistency and robustness of any VQA architecture. We train our models to answer the original question, generate an implication based on the answer and then also learn to answer the generated implication correctly. As a part of the cyclic framework, we propose a novel implication generator which can generate implied questions from any question-answer pair. As a baseline for future works on consistency, we provide a new human annotated VQA-Implications dataset. The dataset consists of ~30k questions containing implications of 3 types - Logical Equivalence, Necessary Condition and Mutual Exclusion - made from the VQA v2.0 validation dataset. We show that our framework improves consistency of VQA models by ~15% on the rule-based dataset, ~7% on VQA-Implications dataset and robustness by ~2%, without degrading their performance. In addition, we also quantitatively show improvement in attention maps which highlights better multi-modal understanding of vision and language.

翻訳日:2022-11-12 09:52:09 公開日:2020-07-08

# autolr: 学習率政策への進化的アプローチ

AutoLR: An Evolutionary Approach to Learning Rate Policies ( http://arxiv.org/abs/2007.04223v1 )

ライセンス: Link先を確認

Pedro Carvalho, Nuno Louren\c{c}o, Filipe Assun\c{c}\~ao, Penousal Machado

(参考訳) 適切な学習率の選択は、優れたニューラルネットワークのトレーニングとパフォーマンスにとって最重要である。これまでは、適切な学習率を見つけるためには、経験と試行錯誤に頼る必要があった。現在では、優れた学習率の探索を容易にするような技術自動手法が多数存在する。これらの手法は有効であり、長年にわたって良い結果をもたらしてきたが、一般的な解決策である。つまり、特定のネットワークトポロジに対する学習率の最適化は、ほとんど未調査のままである。本稿では,構造化文法進化を用いたニューラルネットワークアーキテクチャのための学習率スケジューラを進化させるフレームワークであるautolrを提案する。このシステムは、一般的な学習率のベースライン値と比較された学習率ポリシーを発展させるために使用された。その結果、ある進化したポリシーを用いたトレーニングは確立されたベースラインよりも効率的であり、このアプローチはニューラルネットワークのパフォーマンスを改善する有効な手段であることが示唆された。

The choice of a proper learning rate is paramount for good Artificial Neural Network training and performance. In the past, one had to rely on experience and trial-and-error to find an adequate learning rate. Presently, a plethora of state of the art automatic methods exist that make the search for a good learning rate easier. While these techniques are effective and have yielded good results over the years, they are general solutions. This means the optimization of learning rate for specific network topologies remains largely unexplored. This work presents AutoLR, a framework that evolves Learning Rate Schedulers for a specific Neural Network Architecture using Structured Grammatical Evolution. The system was used to evolve learning rate policies that were compared with a commonly used baseline value for learning rate. Results show that training performed using certain evolved policies is more efficient than the established baseline and suggest that this approach is a viable means of improving a neural network's performance.

翻訳日:2022-11-12 09:46:12 公開日:2020-07-08

# Decolonial AI: 人工知能の社会技術的展望としてのDecolonial Theory

Decolonial AI: Decolonial Theory as Sociotechnical Foresight in Artificial Intelligence ( http://arxiv.org/abs/2007.04068v1 )

ライセンス: Link先を確認

Shakir Mohamed, Marie-Therese Png, William Isaac

(参考訳) 本稿では,人工知能の進歩の理解と形成におけるクリティカルサイエンス,特にポストコロニアル理論とデコロニアル理論の重要な役割について考察する。人工知能(AI)は、現代社会とその関係を再形成する技術進歩の1つと見なされている。継続的に適応するシステムの設計とデプロイは、極端にポジティブな変化の約束を果たすが、同時に、特にすでに脆弱な人々に対して重大なリスクを生じさせる。価値と権力はこの議論の中心である。デコロニアル理論は、我々の知的、政治的、経済的、社会的世界を形成する権力のパターンを説明するために、歴史的な後見を用いる。 aiコミュニティは、その技術的実践にデコロニアルな批判的アプローチを組み込むことで、研究と技術開発を確立された倫理原則とよりよく一致させることができる先見と戦術を発達させ、イノベーションと科学的進歩の否定的な影響に耐え続ける脆弱な人々を遠ざけることができる。我々は、植民地化の事例である問題のあるアプリケーションを強調し、デコロニアルレンズを使用して、人工知能のデコロニアル分野を形成する3つの戦術を提出する。今後は、AI研究が推進する新たな科学的ブレークスルーとテクノロジーの波が到来し、AIコミュニティが倫理的見地と、私たちにとって利用可能な知的視点の多種多元性を通じて社会的契約を強化することとなり、究極的には、利益と正義を目標に、より大きな幸福を可能にする将来の技術をサポートする。

This paper explores the important role of critical science, and in particular of post-colonial and decolonial theories, in understanding and shaping the ongoing advances in artificial intelligence. Artificial Intelligence (AI) is viewed as amongst the technological advances that will reshape modern societies and their relations. Whilst the design and deployment of systems that continually adapt holds the promise of far-reaching positive change, they simultaneously pose significant risks, especially to already vulnerable peoples. Values and power are central to this discussion. Decolonial theories use historical hindsight to explain patterns of power that shape our intellectual, political, economic, and social world. By embedding a decolonial critical approach within its technical practice, AI communities can develop foresight and tactics that can better align research and technology development with established ethical principles, centring vulnerable peoples who continue to bear the brunt of negative impacts of innovation and scientific progress. We highlight problematic applications that are instances of coloniality, and using a decolonial lens, submit three tactics that can form a decolonial field of artificial intelligence: creating a critical technical practice of AI, seeking reverse tutelage and reverse pedagogies, and the renewal of affective and political communities. The years ahead will usher in a wave of new scientific breakthroughs and technologies driven by AI research, making it incumbent upon AI communities to strengthen the social contract through ethical foresight and the multiplicity of intellectual perspectives available to us; ultimately supporting future technologies that enable greater well-being, with the goal of beneficence and justice for all.

翻訳日:2022-11-12 09:45:57 公開日:2020-07-08

# 散乱合成学習者:対象の発見,属性,アナロジカル推論における関連性

The Scattering Compositional Learner: Discovering Objects, Attributes, Relationships in Analogical Reasoning ( http://arxiv.org/abs/2007.04212v1 )

ライセンス: Link先を確認

Yuhuai Wu, Honghua Dong, Roger Grosse, Jimmy Ba

(参考訳) 本稿では,RPM(Raven's Progressive Matrices)という,リッチな構成構造を含む類似推論タスクに着目した。そこで本研究では,データの構成構造を明らかにするために,ニューラルネットワークを逐次的に構成するscl(slicant compositional learner)を提案する。我々のSCLは2つのRPMデータセットで最先端のパフォーマンスを達成し、48.7%が Balanced-RAVENを、26.4%がPGMを改善した。さらに,本モデルでは,オブジェクトの属性(例えば,形状色,サイズ)とそれらの関係(例えば,進行,結合)の合成表現を検出する。また、合成表現により、SCLはテスト時間領域シフトに対して著しく堅牢になり、以前は見つからなかったアナログに対してゼロショットの一般化を大幅に改善する。

In this work, we focus on an analogical reasoning task that contains rich compositional structures, Raven's Progressive Matrices (RPM). To discover compositional structures of the data, we propose the Scattering Compositional Learner (SCL), an architecture that composes neural networks in a sequence. Our SCL achieves state-of-the-art performance on two RPM datasets, with a 48.7% relative improvement on Balanced-RAVEN and 26.4% on PGM over the previous state-of-the-art. We additionally show that our model discovers compositional representations of objects' attributes (e.g., shape color, size), and their relationships (e.g., progression, union). We also find that the compositional representation makes the SCL significantly more robust to test-time domain shifts and greatly improves zero-shot generalization to previously unseen analogies.

翻訳日:2022-11-12 09:44:52 公開日:2020-07-08

# Few-Shot分類器の精度予測

Predicting the Accuracy of a Few-Shot Classifier ( http://arxiv.org/abs/2007.04238v1 )

ライセンス: Link先を確認

Myriam Bontonou, Louis B\'ethune, Vincent Gripon

(参考訳) 少数ショット学習の文脈では、ラベル付きサンプルの少ないため、検証セットを用いて訓練された分類器の一般化能力を測定することはできない。本稿では,これまで見つからなかったデータに対して,私の分類器は十分に一般化されているか? まず,一般化性能の変動要因を分析した。次に、転送ベースのソリューションの使用事例を調査し、3つの設定を検討する。一数個のラベル付きサンプルしかアクセスできない場所を監督すること。二数個のラベル付きサンプルとラベルなしサンプルのセットの両方にアクセスすることができる半監督 iii) ラベルなしのサンプルしかアクセスできない場所を監督していないこと。各設定に対して,検討された分類器の一般化能力と実証的に相関する合理的な尺度を提案する。また,この単純な尺度を用いて,信頼度の高い一般化を予測できることを示した。標準的な数ショットビジョンデータセットで実験を行います。

In the context of few-shot learning, one cannot measure the generalization ability of a trained classifier using validation sets, due to the small number of labeled samples. In this paper, we are interested in finding alternatives to answer the question: is my classifier generalizing well to previously unseen data? We first analyze the reasons for the variability of generalization performances. We then investigate the case of using transfer-based solutions, and consider three settings: i) supervised where we only have access to a few labeled samples, ii) semi-supervised where we have access to both a few labeled samples and a set of unlabeled samples and iii) unsupervised where we only have access to unlabeled samples. For each setting, we propose reasonable measures that we empirically demonstrate to be correlated with the generalization ability of considered classifiers. We also show that these simple measures can be used to predict generalization up to a certain confidence. We conduct our experiments on standard few-shot vision datasets.

翻訳日:2022-11-12 09:42:50 公開日:2020-07-08

PDF登録状況（公開日: 20200708）