Fugu-MT: arXiv paper translations

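This site provides Japanese translations of arXiv papers that are 30 pages or fewer and released under a Creative Commons license (CC 0, CC BY, CC BY-SA). For papers whose body is not CC-licensed, or that are too long, only the metadata is translated. (arXiv metadata is CC 0.) The translations are licensed under CC BY-SA 4.0. The translations are produced with Fugu-Machine Translator.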

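The operator of this site does not guarantee the quality of this site (including all information and translations) and accepts no responsibility for any consequences arising from the use of this site (including all information and translations).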

Papers with a publication date of 2020-12-04.

Title	Authors	Abstract	Publication date / translation date
# Scheme for the generation of hybrid entanglement between time-bin and wavelike encodings ( http://arxiv.org/abs/2002.04450v3 ) License: see link	Élie Gouzien, Floriane Brunel, Sébastien Tanzilli, Virginia D'Auria	We propose a scheme for the generation of hybrid states entangling a single-photon time-bin qubit with a coherent-state qubit encoded on phases. Compared to other reported solutions, time-bin encoding makes hybrid entanglement particularly well adapted to applications involving long-distance propagation in optical fibers. This makes our proposal a promising resource for future out-of-the-laboratory quantum communication. In this perspective, we analyze our scheme by taking into account realistic experimental resources and discuss the impact of their imperfections on the quality of the obtained hybrid state.	Translated: 2023-06-03 23:31:13 Published: 2020-12-04
# Anomalous Diffusion in Dipole- and Higher-Moment Conserving Systems ( http://arxiv.org/abs/2004.00635v2 ) License: see link	Johannes Feldmeier, Pablo Sala, Giuseppe de Tomasi, Frank Pollmann, Michael Knap	The presence of global conserved quantities in interacting systems generically leads to diffusive transport at late times. Here, we show that systems conserving the dipole moment of an associated global charge, or even higher-moment generalizations thereof, escape this scenario, displaying subdiffusive decay instead. Modelling the time evolution as cellular automata for specific cases of dipole and quadrupole conservation, we numerically find distinct anomalous exponents of the late-time relaxation. We explain these findings by analytically constructing a general hydrodynamic model that results in a series of exponents depending on the number of conserved moments, yielding an accurate description of the scaling form of charge correlation functions. We analyze the spatial profile of the correlations and discuss potential experimentally relevant signatures of higher-moment conservation.	Translated: 2023-05-27 05:21:56 Published: 2020-12-04
# Certifying quantum signatures in thermodynamics and metrology via contextuality of quantum linear response ( http://arxiv.org/abs/2004.01213v3 ) License: see link	Matteo Lostaglio	We identify a fundamental difference between classical and quantum dynamics in the linear response regime by showing that the latter is in general contextual. This allows us to provide an example of a quantum engine whose favorable power output scaling unavoidably requires nonclassical effects in the form of contextuality. Furthermore, we describe contextual advantages for local metrology. Given the ubiquity of linear response theory, we anticipate that these tools will allow one to certify the nonclassicality of a wide array of quantum phenomena.	Translated: 2023-05-27 03:05:10 Published: 2020-12-04
# Floquet engineering and simulating exceptional rings with a quantum spin system ( http://arxiv.org/abs/2005.02703v3 ) License: see link	Peng He, Ze-Hao Huang	Time-periodic driving in the form of coherent radiation provides a powerful tool for the manipulation of topological materials or synthetic quantum matter. In this paper we propose a scheme to realize non-Hermitian semimetals exhibiting exceptional rings in the spectra through Floquet engineering. A transition from a concentric pair of the rings to a dipolar pair is observed. The concentric pair carries only a quantized Berry phase while the dipolar pair possesses opposite Chern numbers in addition, signaling a topological Lifshitz transition of the Fermi surface. The transport properties of the system are addressed, and we find that this transition process is accompanied by the emergence of a nontrivial Hall conductivity. Furthermore, we explore the quantum simulation of non-Hermitian semimetals with a quantum spin system and the characterization of the topology via the long-time dynamics.	Translated: 2023-05-21 00:48:33 Published: 2020-12-04
# Enhancing the effect of quantum many-body scars on dynamics by minimising the effective dimension ( http://arxiv.org/abs/2006.03099v2 ) License: see link	Shane Dooley, Graham Kells	Quantum many-body scarring is believed to be the mechanism behind long-lived coherent oscillations in interacting Rydberg atom chains. These persistent oscillations are due to the large overlap of the many-body scars with certain initial states. We show that the "effective dimension" is a useful measure for identifying non-thermalising initial states in many-body scarred systems. By minimising the effective dimension we find physically reasonable initial states of the Rydberg chain that lead to more pronounced and longer-lived oscillations, accentuating the effect of the many-body scars on the dynamics.	Translated: 2023-05-17 04:01:42 Published: 2020-12-04
# Equilibrium of Blockchain Miners with Dynamic Asset Allocation ( http://arxiv.org/abs/2006.08016v2 ) License: see link	Go Yamamoto, Aron Laszka, Fuhito Kojima	We model and analyze blockchain miners who seek to maximize the compound return of their mining businesses. The analysis of the optimal strategies finds a new equilibrium point among the miners and the mining pools, which predicts the market share of each miner or mining pool. The cost of mining determines the share of each miner or mining pool at equilibrium. We conclude that neither miners nor mining pools who seek to maximize their compound return will have a financial incentive to occupy more than 50% of the hash rate if the cost of mining is at the same level for all. However, if there is an outstandingly cost-efficient miner, then the market share of this miner may exceed 50% in the equilibrium, which can threaten the viability of the entire ecosystem.	Translated: 2023-05-14 19:02:42 Published: 2020-12-04
# Multi-GHz repetition rate, multi-watt average power, ultraviolet laser pulses for fast trapped-ion entanglement operations ( http://arxiv.org/abs/2007.03404v2 ) License: see link	M. I. Hussain, D. Heinrich, M. Guevara-Bertsch, E. Torrontegui, J. J. García-Ripoll, C. F. Roos, and R. Blatt	The conventional approach to performing two-qubit gate operations in trapped ions relies on exciting the ions on motional sidebands with laser light, which is an inherently slow process. One way to implement a fast entangling gate protocol requires a suitable pulsed laser to increase the gate speed by orders of magnitude. However, the realization of such a fast entangling gate operation presents a big technical challenge, as the required laser source is not available off the shelf. For this, we have engineered an ultrafast entangling gate source based on a frequency comb. The source generates bursts of several hundred mode-locked pulses with pulse energy $\sim$800 pJ at 5 GHz repetition rate at 393.3 nm and complies with all requirements for implementing a fast two-qubit gate operation. Using a single, chirped ultraviolet pulse, we demonstrate a rapid adiabatic passage in a Ca$^+$ ion. To verify the applicability and projected performance of the laser system for inducing entangling gates we run simulations based on our source parameters. The gate time can be faster than a trap period with an error approaching $10^{-4}$.	Translated: 2023-05-11 02:00:41 Published: 2020-12-04
# Measurement of one-dimensional matter-wave quantum breather ( http://arxiv.org/abs/2007.08365v2 ) License: see link	Piotr Staroń, Andrzej Syrwid, Krzysztof Sacha	Employing the Bethe ansatz approach and numerical simulations of measurements of particles' positions, we investigate the post-quench many-body dynamics of attractively interacting bosons on a ring, which in the mean-field approach corresponds to the so-called breather solution. Despite the fact that the initial many-body ground state is translationally invariant, the measurements reveal breather dynamics if quantum fluctuations of the center of mass of the system are extracted. Moreover, the analysis of the many-body evolution shows signatures of dissociation of the solitons that form the breather.	Translated: 2023-05-09 07:01:32 Published: 2020-12-04
# Coherent Optical Memory Based on A Laser-written On-chip Waveguide ( http://arxiv.org/abs/2008.12901v2 ) License: see link	Tian-Xiang Zhu, Chao Liu, Liang Zheng, Zong-Quan Zhou, Chuan-Feng Li, Guang-Can Guo	Quantum memory is the core device for the construction of large-scale quantum networks. For scalable and convenient practical applications, integrated optical memories, especially on-chip optical memories, are crucial requirements because they can be easily integrated with other on-chip devices. Here, we report a coherent optical memory based on a type-IV waveguide fabricated on the surface of a rare-earth ion-doped crystal (i.e. $\mathrm{Eu^{3+}}$:$\mathrm{Y_2SiO_5}$). The properties of the optical transition ($\mathrm{{^7}F{_0}\rightarrow{^5}D{_0}}$) of the $\mathrm{Eu^{3+}}$ ions inside the surface waveguide are well preserved compared to those of the bulk crystal. Spin-wave atomic frequency comb storage is demonstrated inside the type-IV waveguide. The reliability of this device is confirmed by the high interference visibility of ${97\pm 1\%}$ between the retrieval pulse and the reference pulse. The developed on-chip optical memory paves the way towards integrated quantum nodes.	Translated: 2023-05-04 11:20:28 Published: 2020-12-04
# Compact spin qubits using the common gate structure of fin field-effect transistors ( http://arxiv.org/abs/2009.04620v2 ) License: see link	Tetsufumi Tanamoto, Keiji Ono	The sizes of commercial transistors are of nanometer order, and there have already been many proposals of spin qubits using conventional complementary metal oxide semiconductor (CMOS) transistors. However, the previously proposed spin qubits require many wires to control a small number of qubits. This causes a significant 'jungle of wires' problem when the qubits are integrated into a chip. Herein, to reduce the complicated wiring, we theoretically consider spin qubits embedded into fin field-effect transistor (FinFET) devices such that the spin qubits share the common gate electrode of the FinFET. The interactions between qubits occur via the Ruderman-Kittel-Kasuya-Yosida (RKKY) interaction via the channel of the FinFET. The compensation for the compact implementation requires high-density current lines in a small space. The possibility of a quantum annealing machine is discussed in addition to the quantum computers of the current proposals.	Translated: 2023-05-03 00:55:31 Published: 2020-12-04
# Contextuality of quantum fluctuations characterized by conditional weak values of entangled states ( http://arxiv.org/abs/2009.06145v2 ) License: see link	Holger F. Hofmann	The quantum fluctuations of a physical property can be observed in the measurement statistics of any measurement that is at least partially sensitive to that physical property. Quantum theory indicates that the effective distribution of values taken by the physical property depends on the specific measurement context based on which these values are determined, and weak values have been identified as the contextual values describing this dependence of quantum fluctuations on the measurement context. Here, the relation between classical statistics and quantum contextuality is explored by considering systems entangled with a quantum reference. The quantum fluctuations of the system can then be steered by precise projective measurements of the reference, resulting in different contextual values of the quantum fluctuations depending on the effective state preparation context determined by the measurement of the reference. The results show that mixed state statistics are consistent with a wide range of potential contexts, indicating that the precise definition of a context requires maximal quantum coherence in both state preparation and measurement.	Translated: 2023-05-02 06:43:15 Published: 2020-12-04
# Efimov-like states and quantum funneling effects on synthetic hyperbolic surfaces ( http://arxiv.org/abs/2010.05135v2 ) License: see link	Ren Zhang, Chenwei Lv, Yangqian Yan, and Qi Zhou	Engineering lattice models with tailored inter-site tunnelings and onsite energies could synthesize essentially arbitrary Riemannian surfaces with highly tunable local curvatures. Here, we point out that discrete synthetic Poincaré half-planes and Poincaré disks, which are created by lattices in flat planes, support infinitely degenerate eigenstates for any nonzero eigenenergies. Such Efimov-like states exhibit a discrete scaling symmetry and imply an unprecedented apparatus for studying quantum anomaly using hyperbolic surfaces. Furthermore, all eigenstates are exponentially localized in the hyperbolic coordinates, signifying the first example of quantum funneling effects in Hermitian systems. As such, any initial wave packet travels towards the edge of the Poincaré half-plane or its equivalent on the Poincaré disk, delivering an efficient scheme to harvest light and atoms in two dimensions. Our findings unfold the intriguing properties of hyperbolic spaces and suggest that Efimov states may be regarded as a projection from a curved space with an extra dimension.	Translated: 2023-04-29 11:13:51 Published: 2020-12-04
# Ideal theory in AI ethics ( http://arxiv.org/abs/2011.02279v2 ) License: see link	Daniel Estrada	This paper addresses the ways AI ethics research operates on an ideology of ideal theory, in the sense discussed by Mills (2005) and recently applied to AI ethics by Fazelpour & Lipton (2020). I address the structural and methodological conditions that attract AI ethics researchers to ideal theorizing, and the consequences this approach has for the quality and future of our research community. Finally, I discuss the possibilities for a nonideal future in AI ethics.	Translated: 2023-04-26 05:41:23 Published: 2020-12-04
# Local Topological Markers in Odd Dimensions ( http://arxiv.org/abs/2011.04771v2 ) License: see link	Joseph Sykes and Ryan Barnett	Local topological markers have proven to be a valuable tool for investigating systems with topologically non-trivial bands. Due to their local nature, such markers can treat translationally invariant systems and spatially inhomogeneous systems on an equal footing. Among the most prevalent of these is the so-called Chern marker, which is available for systems in two spatial dimensions. In this paper, we describe how to generalize this marker to 1d and 3d systems, by showing that the relevant expressions accurately describe the phenomenon of topological pumping given by the first and second Chern numbers in 1d and 3d respectively. In addition to providing general derivations, we verify the markers by numerically considering model Hamiltonians. These results will open the door for future studies including the influence of disorder on topological pumping and topological phase transitions in odd-dimensional systems.	Translated: 2023-04-24 21:09:49 Published: 2020-12-04
# Nonequilibrium Quantum Free Energy and Effective Temperature, Generating Functional and Influence Action ( http://arxiv.org/abs/2011.10468v2 ) License: see link	Jen-Tsung Hsiang and B. L. Hu	A definition of nonequilibrium free energy $\mathcal{F}_{\textsc{s}}$ is proposed for dynamical Gaussian quantum open systems strongly coupled to a heat bath and a formal derivation is provided by way of the generating functional in terms of the coarse-grained effective action and the influence action. For Gaussian open quantum systems exemplified by the quantum Brownian motion model studied here, a time-varying effective temperature can be introduced in a natural way, and with it, the nonequilibrium free energy $\mathcal{F}_{\textsc{s}}$, von Neumann entropy $\mathcal{S}_{vN}$ and internal energy $\mathcal{U}_{\textsc{s}}$ of the reduced system ($S$) can be defined accordingly. In contrast to the nonequilibrium free energy found in the literature which references the bath temperature, the nonequilibrium thermodynamic functions we find here obey the familiar relation $\mathcal{F}_{\textsc{s}}(t)=\mathcal{U}_{\textsc{s}}(t)- T_{\textsc{eff}} (t)\,\mathcal{S}_{vN}(t)$ at any and all moments of time in the system's fully nonequilibrium evolution history. After the system equilibrates they coincide, in the weak coupling limit, with their counterparts in conventional equilibrium thermodynamics. Since the effective temperature captures both the state of the system and its interaction with the bath, upon the system's equilibration, it approaches a value slightly higher than the initial bath temperature. Notably, it remains nonzero for a zero-temperature bath, signaling the existence of system-bath entanglement. Reasonably, at high bath temperatures and under ultra-weak couplings, it becomes indistinguishable from the bath temperature. The nonequilibrium thermodynamic functions and relations discovered here for dynamical Gaussian quantum systems should open up useful pathways toward establishing meaningful theories of nonequilibrium quantum thermodynamics.	Translated: 2023-04-23 14:56:25 Published: 2020-12-04
# COVID-19 Contact Tracing and Privacy: A Longitudinal Study of Public Opinion ( http://arxiv.org/abs/2012.01553v2 ) License: see link	Lucy Simko, Jack Lucas Chang, Maggie Jiang, Ryan Calo, Franziska Roesner, Tadayoshi Kohno	There is growing use of technology-enabled contact tracing, the process of identifying potentially infected COVID-19 patients by notifying all recent contacts of an infected person. Governments, technology companies, and research groups alike have been working towards releasing smartphone apps, using IoT devices, and distributing wearable technology to automatically track "close contacts" and identify prior contacts in the event an individual tests positive. However, there has been significant public discussion about the tensions between effective technology-based contact tracing and the privacy of individuals. To inform this discussion, we present the results of seven months of online surveys focused on contact tracing and privacy, each with 100 participants. Our first surveys were on April 1 and 3, before the first peak of the virus in the US, and we continued to conduct the surveys weekly for 10 weeks (through June), and then fortnightly through November, adding topical questions to reflect current discussions about contact tracing and COVID-19. Our results present the diversity of public opinion and can inform policy makers, technologists, researchers, and public health experts on whether and how to leverage technology to reduce the spread of COVID-19, while considering potential privacy concerns. We are continuing to conduct longitudinal measurements and will update this report over time; citations to this version of the report should reference Report Version 2.0, December 4, 2020.	Translated: 2023-04-22 07:39:53 Published: 2020-12-04
# An optimal quantum sampling regression algorithm for variational eigensolving in the low qubit number regime ( http://arxiv.org/abs/2012.02338v1 ) License: see link	Pedro Rivero, Ian C. Cloët, Zack Sullivan	The VQE algorithm has turned out to be quite expensive to run given the way we currently access quantum processors (i.e. over the cloud). In order to alleviate this issue, we introduce Quantum Sampling Regression (QSR), an alternative hybrid quantum-classical algorithm, and analyze some of its use cases based on time complexity in the low qubit number regime. In exchange for some extra classical resources, this novel strategy is proved to be optimal in terms of the number of samples it requires from the quantum processor. We develop a simple analytical model to evaluate when this algorithm is more efficient than VQE, and, from the same theoretical considerations, establish a threshold above which quantum advantage can occur. Finally, we demonstrate the efficacy of our algorithm for a benchmark problem.	Translated: 2023-04-22 03:18:10 Published: 2020-12-04
# Improving phase estimation using the number-conserving operations ( http://arxiv.org/abs/2012.02441v1 ) License: see link	Huan Zhang, Wei Ye, Chaoping Wei, Cunjin Liu, Zeyang Liao, Liyun Hu	We propose a theoretical scheme to improve the resolution and precision of phase measurement with parity detection in the Mach-Zehnder interferometer by using a nonclassical input state which is generated by applying a number-conserving generalized superposition of products (GSP) operation, $(s a a^{\dagger} + t a^{\dagger} a)^{m}$ with $s^2+t^2=1$, on a two-mode squeezed vacuum (TMSV) state. The nonclassical properties of the proposed GSP-TMSV are investigated via average photon number (APN), anti-bunching effect, and degrees of two-mode squeezing. Particularly, our results show that both a higher-order $m$ GSP operation and a smaller parameter $s$ can increase the total APN, which leads to the improvement of quantum Fisher information. In addition, we also compare the phase measurement precision with and without photon losses between our scheme and the previous photon subtraction/addition schemes. It is found that our scheme, especially for the case of $s=0$, has the best performance via the enhanced phase resolution and sensitivity compared to those previous schemes, even in the presence of photon losses. Interestingly, without losses, the standard quantum-noise limit (SQL) can always be surpassed in our scheme and the Heisenberg limit (HL) can even be achieved when $s=0.5,1$ with small total APNs. However, in the presence of photon losses, the HL cannot be beaten, but the SQL can still be overcome, particularly in the large total APN regimes. Our results here can find important applications in quantum metrology.	Translated: 2023-04-22 03:14:23 Published: 2020-12-04
# Quantum Circuits for Collective Amplitude Damping in Two-Qubit Systems ( http://arxiv.org/abs/2012.02410v1 ) License: see link	Yusuke Hama	Quantum computers have now appeared in our society and are utilized for the investigation of science and engineering. At present, they have been built as intermediate-size computers containing about fifty qubits and are weak against noise effects. Hence, they are called noisy intermediate-scale quantum devices. In order to accomplish efficient quantum computation using these machines, a key issue is going to be the coherent control of individual and collective quantum noises. In this work, we focus on the latter type and investigate formulations of the collective quantum noises represented as quantum circuits. To simplify our discussions and make them concrete, we analyze collective amplitude damping processes in two-qubit systems. As verifications of our formalisms and the quantum circuits, we demonstrate digital quantum simulations of the collective amplitude damping by examining six different initial conditions while varying the number of executions of the overall operation in our quantum simulations. We observe that our results show good numerical matching with the solution of the quantum master equation for the two-qubit systems as we increase this number. In addition, we explain the essence of the way to extend our formalisms to analyze the collective amplitude damping in larger qubit systems. These results pave the way for establishing systematic approaches to control the quantum noises and designing large-scale quantum computers.	Translated: 2023-04-22 03:13:25 Published: 2020-12-04
# Biased Programmers? Or Biased Data? A Field Experiment in Operationalizing AI Ethics ( http://arxiv.org/abs/2012.02394v1 ) License: see link	Bo Cowgill, Fabrizio Dell'Acqua, Samuel Deng, Daniel Hsu, Nakul Verma and Augustin Chaintreau	Why do biased predictions arise? What interventions can prevent them? We evaluate 8.2 million algorithmic predictions of math performance from $\approx$400 AI engineers, each of whom developed an algorithm under a randomly assigned experimental condition. Our treatment arms modified programmers' incentives, training data, awareness, and/or technical knowledge of AI ethics. We then assess out-of-sample predictions from their algorithms using randomized audit manipulations of algorithm inputs and ground-truth math performance for 20K subjects. We find that biased predictions are mostly caused by biased training data. However, one-third of the benefit of better training data comes through a novel economic mechanism: Engineers exert greater effort and are more responsive to incentives when given better training data. We also assess how performance varies with programmers' demographic characteristics, and their performance on a psychological test of implicit bias (IAT) concerning gender and careers. We find no evidence that female, minority and low-IAT engineers exhibit lower bias or discrimination in their code. However, we do find that prediction errors are correlated within demographic groups, which creates performance improvements through cross-demographic averaging. Finally, we quantify the benefits and tradeoffs of practical managerial or policy interventions such as technical advice, simple reminders, and improved incentives for decreasing algorithmic bias.	Translated: 2023-04-22 03:13:07 Published: 2020-12-04
# アルゴリズム的公正行動主義の経営的効果 The Managerial Effects of Algorithmic Fairness Activism ( http://arxiv.org/abs/2012.02393v1 ) ライセンス: Link先を確認	Bo Cowgill, Fabrizio Dell'Acqua and Sandra Matz	(参考訳) 倫理的議論はビジネスにおけるAIの採用にどのように影響しますか? AIフェアネスアクティビズムで使用される議論に対して、ビジネス上の意思決定者がランダムに公開します。アルゴリズムバイアスマネージャがAIを放棄して人間による手動によるレビューを行い、訴訟やネガティブなPRに対するさらなる期待を報告することの難しさを強調している。これらの効果は、AIが性別と人種格差を減らし、AIフェアネスに対処するためのエンジニアリング投資が実現可能であったとしても継続する。ステータスクオ比較の強調は、反対の効果をもたらす。また、AI倫理論における「科学的ベニア」の効果も測定する。 scientific veneerは管理行動を変えるが、有利な(批判的な)aiアクティビズムには非対称に利益がない。 How do ethical arguments affect AI adoption in business? We randomly expose business decision-makers to arguments used in AI fairness activism. Arguments emphasizing the inescapability of algorithmic bias lead managers to abandon AI for manual review by humans and report greater expectations about lawsuits and negative PR. These effects persist even when AI lowers gender and racial disparities and when engineering investments to address AI fairness are feasible. Emphasis on status quo comparisons yields opposite effects. We also measure the effects of "scientific veneer" in AI ethics arguments. Scientific veneer changes managerial behavior but does not asymmetrically benefit favorable (versus critical) AI activism.	翻訳日:2023-04-22 03:12:44 公開日:2020-12-04
# 2重ボース-ハバード鎖における$\mathbb{Z}_2$相とMajorana分光 $\mathbb{Z}_2$ phases and Majorana spectroscopy in paired Bose-Hubbard chains ( http://arxiv.org/abs/2012.02380v1 ) ライセンス: Link先を確認	Smitha Vishveshwara and David M. Weld	(参考訳) 最寄り-近距離対存在下でボース-ハバード鎖を調べる。ペアリング項は、数ゆらぎを持つが外対角長距離順序を持たない、異常なギャップを持つ$\mathbb{Z}_2$イジング相をもたらす。この相は、ギャップ保護されたマクロキュービットである強相関多体二重縮退基底状態を有する。強い相互作用の極限において、系は異方性逆スピン鎖に写像され、この系は、ゼロエネルギーマヨラナ境界状態を持つボース=ハッバード鎖のよく知られたフェルミオン姉妹に写像される。フェルミオン系とボソニック系の対応する位相は極めて異なる波動関数を持つが、同じエネルギースペクトルを共有している。貯留層誘起対を持つバイアスドジグザグ光学格子におけるボース・ハバード模型の低温原子化の可能性について述べ、実験的キタエフ連鎖分光法への道を開く。 We investigate the Bose-Hubbard chain in the presence of nearest-neighbor pairing. The pairing term gives rise to an unusual gapped $\mathbb{Z}_2$ Ising phase that has number fluctuation but no off-diagonal long range order. This phase has a strongly correlated many-body doubly degenerate ground state which is effectively a gap-protected macroscopic qubit. In the strongly interacting limit, the system can be mapped onto an anisotropic transverse spin chain, which in turn can be mapped to the better-known fermionic sister of the paired Bose-Hubbard chain: the Kitaev chain which hosts zero-energy Majorana bound states. While corresponding phases in the fermionic and bosonic systems have starkly different wavefunctions, they share identical energy spectra. We describe a possible cold-atom realization of the paired Bose-Hubbard model in a biased zig-zag optical lattice with reservoir-induced pairing, opening a possible route towards experimental Kitaev chain spectroscopy.	翻訳日:2023-04-22 03:12:33 公開日:2020-12-04
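
The chain of mappings in the entry above ends at the Kitaev chain and its zero-energy Majorana bound states. The following is a minimal sketch of that last step only, not the paper's calculation: it builds the Bogoliubov-de Gennes matrix of an open Kitaev chain in plain Python and checks that, at the assumed parameter point $t=\Delta$, $\mu=0$, the two edge Majorana vectors are annihilated exactly. The function names, the chain length, and the sign conventions are illustrative assumptions.

```python
def kitaev_bdg(n, t, delta, mu):
    """BdG matrix [[h, D], [-D, -h]] of an open Kitaev chain with real parameters."""
    H = [[0.0] * (2 * n) for _ in range(2 * n)]
    for j in range(n):
        H[j][j] = -mu          # particle block: chemical potential
        H[n + j][n + j] = mu   # hole block: opposite sign
    for j in range(n - 1):
        # Hopping: h[j][j+1] = h[j+1][j] = -t; the hole block carries -h.
        H[j][j + 1] = H[j + 1][j] = -t
        H[n + j][n + j + 1] = H[n + j + 1][n + j] = t
        # Antisymmetric pairing block D: D[j][j+1] = -delta, D[j+1][j] = +delta.
        H[j][n + j + 1] = -delta
        H[j + 1][n + j] = delta
        # Lower-left block is -D.
        H[n + j][j + 1] = delta
        H[n + j + 1][j] = -delta
    return H


def residual(H, vec):
    """Largest absolute component of the matrix-vector product H @ vec."""
    return max(abs(sum(a * x for a, x in zip(row, vec))) for row in H)


def edge_majoranas(n):
    """Candidate zero modes in the basis (c_1..c_n, c_1^dagger..c_n^dagger)."""
    left = [0.0] * (2 * n)
    left[0], left[n] = 1.0, 1.0                  # c_1 + c_1^dagger
    right = [0.0] * (2 * n)
    right[n - 1], right[2 * n - 1] = 1.0, -1.0   # c_n - c_n^dagger
    return left, right


n = 6
left, right = edge_majoranas(n)
sweet = kitaev_bdg(n, t=1.0, delta=1.0, mu=0.0)    # assumed sweet spot
detuned = kitaev_bdg(n, t=1.0, delta=1.0, mu=0.5)  # chemical potential switched on

print("sweet spot residuals:", residual(sweet, left), residual(sweet, right))
print("detuned residuals:   ", residual(detuned, left), residual(detuned, right))
```

Away from the sweet spot the same vectors are no longer exact eigenvectors; the true edge modes then decay into the bulk, which this sketch does not attempt to compute.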
# 量子チャネルの絡み合いのワンショット操作 One-shot manipulation of entanglement for quantum channels ( http://arxiv.org/abs/2012.02631v1 ) ライセンス: Link先を確認	Ho-Joon Kim, Soojoon Lee, Ludovico Lami, Martin B. Plenio	(参考訳) 量子エンタングルメントの動的資源理論は超チャネル理論を用いて定式化できることを示す。この定式化において,チャネル分離性を自由資源として保持する分離チャネルと自由スーパーチャネルのクラスを同定し,スワップチャネルを動的絡み合い黄金単位として選択する。最初の結果は、自由スーパーチャネルの下の2成分量子チャネルの1ショットの動的絡み合いコストが、チャネルの標準的なログロバストネスによって制限されることである。自由超チャネルの下でのバイパルタイト量子チャネルの1ショット蒸留可能な動的絡み合いは、分離可能なチャネルを最小化したチャネルの仮説テスト相対エントロピーから構築した資源単調によって境界づけられる。また、2部量子チャネルの1ショット触媒的動的絡み合わせコストを、漸近的に無視できない動的絡み合わせを生じさせるような、より大規模な自由超チャネルのクラスの下で解決する。 We show that the dynamic resource theory of quantum entanglement can be formulated using the superchannel theory. In this formulation, we identify the separable channels and the class of free superchannels that preserve channel separability as free resources, and choose the swap channels as dynamic entanglement golden units. Our first result is that the one-shot dynamic entanglement cost of a bipartite quantum channel under the free superchannels is bounded by the standard log-robustness of channels. The one-shot distillable dynamic entanglement of a bipartite quantum channel under the free superchannels is found to be bounded by a resource monotone that we construct from the hypothesis-testing relative entropy of channels with minimization over separable channels. We also address the one-shot catalytic dynamic entanglement cost of a bipartite quantum channel under a larger class of free superchannels that could generate the dynamic entanglement which is asymptotically negligible; it is bounded by the generalized log-robustness of channels.	翻訳日:2023-04-22 03:05:18 公開日:2020-12-04
# 二原子分子における原子間相互作用を研究するための新しい一般化モース様ポテンシャル New Generalized Morse-Like Potential for Studying the Atomic Interaction in Diatomic Molecules ( http://arxiv.org/abs/2012.02581v1 ) ライセンス: Link先を確認	C. M. Ekpo, Ephraim P. Inyang, P. O. Okoi, T. O. Magu, E. P. Agbo, K O Okorie, Etido P. Inyang	(参考訳) 本研究では,任意の次元における新たな一般化モース様ポテンシャルに対する動径シュレーディンガー方程式の近似解析解を,ニキフォロフ・ウヴァロフ法を用いて求めた。エネルギー固有値と対応する固有関数は解析的に得られる。いくつかの二原子分子の回転振動エネルギー固有値は、いくつかの分光パラメータの助けを借りて計算される。 Attractive Radial や Deng-Fan potential などのポテンシャルに対するエネルギー方程式も、いくつかのポテンシャルパラメータを変化させることで得られる。我々の結果は既存の文献とよく一致している。 In this study, we obtain approximate analytical solutions of the radial Schrödinger equation for the New Generalized Morse-Like Potential in arbitrary dimensions by using the Nikiforov-Uvarov method. The energy eigenvalues and corresponding eigenfunctions are obtained analytically. The rotational-vibrational energy eigenvalues for some diatomic molecules are computed with the aid of spectroscopic parameters. The energy equations for potentials such as the Attractive Radial and Deng-Fan potentials are also obtained by varying some potential parameters. Our results agree excellently with the existing literature.	翻訳日:2023-04-22 03:04:29 公開日:2020-12-04
# 積層導波路支援アレー検出チャネルを用いた磁力計統合プラットフォーム An integrated magnetometry platform with stackable waveguide-assisted detection channels for sensing arrays ( http://arxiv.org/abs/2012.02560v1 ) ライセンス: Link先を確認	Michael Hoese, Michael K. Koch, Vibhav Bharadwaj, Johannes Lang, John P. Hadden, Reina Yoshizaki, Argyro N. Giakoumaki, Roberta Ramponi, Fedor Jelezko, Shane M. Eaton, Alexander Kubanek	(参考訳) ダイヤモンドの負電荷NV$^-$中心はナノスケール、高感度磁気メトリーにおいて大きな成功を収めた。感度向上には効率的な蛍光検出が不可欠である。さらに、統合デバイスは実用的なセンサーを可能にする。ここでは, ダイヤモンド表面下数ナノメートルのnv$^-$中心を生成できる新しいアーキテクチャを提案するとともに, フェムト秒レーザーによるtype-ii導波路のモード場最大値についても述べる。結合効率を実験的に検証し、導波路を介して磁気共鳴信号の検出を示し、磁場および温度センシングにおける第一原理実証実験を行う。センシングタスクは、試料を通して直接光を照射することなく導波路を介して操作することができ、これは光に弱い生体システムにおける磁気測定の重要なステップである。将来的には,空間的および時間的相関のある磁気計測を容易にする2次元センシングアレイの開発が期待できる。 The negatively-charged NV$^-$-center in diamond has shown great success in nanoscale, high-sensitivity magnetometry. Efficient fluorescence detection is crucial for improving the sensitivity. Furthermore, integrated devices enable practicable sensors. Here, we present a novel architecture which allows us to create NV$^-$-centers a few nanometers below the diamond surface, and at the same time in the mode field maximum of femtosecond-laser-written type-II waveguides. We experimentally verify the coupling efficiency, showcase the detection of magnetic resonance signals through the waveguides and perform first proof-of-principle experiments in magnetic field and temperature sensing. The sensing task can be operated via the waveguide without direct light illumination through the sample, which marks an important step for magnetometry in biological systems which are fragile to light. In the future, our approach will enable the development of two-dimensional sensing arrays facilitating spatially and temporally correlated magnetometry.	翻訳日:2023-04-22 03:04:21 公開日:2020-12-04
# 超ラジアントナノレーザーの量子ランジュバンアプローチ Quantum Langevin approach for superradiant nanolasers ( http://arxiv.org/abs/2012.02533v1 ) ライセンス: Link先を確認	Igor Protsenko, Alexander Uskov, Emil C. Andr\'e, Jesper M{\o}rk and Martijn Wubs	(参考訳) 量子非線形ランジュバン方程式を解析的に解く新しい手法を提案し、集団効果が重要な役割を果たす超ラジアントレーザーのスペクトルの計算に適用した。任意のポンプレートの発振スペクトルを計算し、閾値領域をまたいでレーザーライン幅のポンプ依存性などのよく知られた結果を回収する。我々は,大きな緩和振動を持つ超ラジアントレーザーのスペクトルにおける新しいサイドバンドピークと,弱いポンプ速度に対する発振スペクトルの新しい非線形構造を予測する。提案手法は,レーザーライン幅の狭さ,発振スペクトルの構造,コヒーレント操作への移行における人口変動の重要性に新たな光を当てるものである。 A new approach for analytically solving quantum nonlinear Langevin equations is proposed and applied to calculations of spectra of superradiant lasers where collective effects play an important role. We calculate lasing spectra for arbitrary pump rates and recover well-known results such as the pump dependence of the laser linewidth across the threshold region. We predict new sideband peaks in the spectrum of superradiant lasers with large relaxation oscillations as well as new nonlinear structures in the lasing spectra for weak pump rates. Our approach sheds new light on the importance of population fluctuations in the narrowing of the laser linewidth, in the structure of the lasing spectrum, and in the transition to coherent operation.	翻訳日:2023-04-22 03:03:59 公開日:2020-12-04
# 有限サイズの半導体マイクロワイヤにおけるエキシトン-ポラリトンソリトン Exciton-polariton solitons in a semiconductor microwire of finite size ( http://arxiv.org/abs/2012.02477v1 ) ライセンス: Link先を確認	E. Nji Nde Aboringong, I. Ngek Ndifon and Alain M. Dikand\'e	(参考訳) エクシトン-ポラリトンソリトン(Exciton- Polariton Soliton)は、光と物質との相互作用により結合した励起子-光子状態からなる強い非線形準粒子である。半導体マイクロワイヤやナノワイヤのような半導体マイクロキャビティシステムでは、ポラリトンは反発的な非線形励起子-励起子相互作用と結合すると明るいポラリトンソリトンを生成する負の質量によって特徴づけられる。本研究では, 外部ポンピングによる放射損失を仮定した有限サイズの微小導波路における励起子-偏光子ソリトンダイナミクスについて検討する。ポラリトンパルスの周期列からなるモデル運動方程式に対する正確な明るいソリトン解は、ジャコビ楕円関数の項で得られる。パルス列の光子成分と励起子成分の両方のエネルギーに対応する正確な解析式を見いだした。その結果、マイクロワイヤ導波路のサイズ(すなわち長さ)は、媒体中に伝播するポラリトンソリトンによって伝達されるエネルギーの定量的な推定を得る上で重要な役割を担っていることが示唆された。 Exciton-polariton solitons are strongly nonlinear quasiparticles composed of coupled exciton-photon states due to the interaction of light with matter. In semiconductor microcavity systems such as semiconductor micro and nanowires, polaritons are characterized by a negative mass which when combined with the repulsive nonlinear exciton-exciton interaction, leads to the generation of bright polariton solitons. In this work we investigate the dynamics of bright exciton-polariton solitons in a finite-size microcavity waveguide, for which radiative losses are assumed balanced by the external pumping. An exact bright-soliton solution to the model equations of motion, consisting of a periodic train of polariton pulses, is obtained in terms of Jacobi elliptic functions. Exact analytical expressions corresponding to the energies of both photonic and excitonic components of the pulse train are found. Results suggest that the size (i.e. the length) of a microwire waveguide plays a relevant role in obtaining a quantitative estimate of the energy that could be conveyed by polariton solitons propagating in the medium.	翻訳日:2023-04-22 03:03:39 公開日:2020-12-04
# パラメータ化された$\phi^4$モデルにおけるkink-antikink散乱誘起呼吸境界状態とオシロン Kink-antikink scattering-induced breathing bound states and oscillons in a parametrized $\phi^4$ model ( http://arxiv.org/abs/2012.02470v1 ) ライセンス: Link先を確認	F. Naha Nzoupe, Alain M. Dikand\'e and C. Tchawoua	(参考訳) 近年の研究では、標準の$\phi^4$フィールドと同じクラスのスカラーフィールドモデルの形状変形性が、オシヨンと呼ばれる特定の種類の呼吸境界状態の生成を制御できる重要な役割を強調している。宇宙論の文脈では、オシヨンの内蔵機構は、スカラー超軽量ダークマターの標準的な画像に影響を与えることを示唆している。本研究は,古典的$\phi^4$場を漸近的極限として認める双安定系のパラメトリゼーションモデルにおいて,真空近傍のスカラー場の長寿命低振幅ほぼ調和振動の形成に着目して検討した。パラメトリズドモデルの特徴は、電位壁の急勾配のみを変化させる形状変形パラメータを持つ二重ウェルポテンシャルであり、したがってポテンシャルバリアのハンプの平坦さが2つの縮退ミニマとバリア高さに影響を与えない点にある。変形性パラメータの変動は、キンク-フォノン散乱電位のいくつかの追加振動モードを促進し、キンク-アンティキンク散乱における2バウンスウィンドウの抑制とオシロンの生成を引き起こす。数値的な結果から,2重井戸系におけるオシロン生成の主要な要因は,フラットバリアハンプを特徴とする電位障壁の非調和性であることが示唆された。 Recent studies have emphasized the important role that a shape deformability of scalar-field models pertaining to the same class with the standard $\phi^4$ field, can play in controlling the production of a specific type of breathing bound states so-called oscillons. In the context of cosmology, the built-in mechanism of oscillons suggests that they can affect the standard picture of scalar ultra-light dark matter. In the present work kink scatterings are investigated in a parametrized model of bistable system admitting the classical $\phi^4$ field as an asymptotic limit, with focus on the formation of long-lived low-amplitude almost harmonic oscillations of the scalar field around a vacuum. The parametrized model is characterized by a double-well potential with a shape-deformation parameter that changes only the steepness of the potential walls, and hence the flatness of the hump of the potential barrier, leaving unaffected the two degenerate minima and the barrier height. 
It is found that the variation of the deformability parameter promotes several additional vibrational modes in the kink-phonon scattering potential, leading to suppression of the two-bounce windows in kink-antikink scatterings and the production of oscillons. Numerical results suggest that the anharmonicity of the potential barrier, characterized by a flat barrier hump, is the main determinant factor for the production of oscillons in double-well systems.	翻訳日:2023-04-22 03:03:19 公開日:2020-12-04
# ワンショット動的資源理論 One-shot dynamical resource theory ( http://arxiv.org/abs/2012.02781v1 ) ライセンス: Link先を確認	Xiao Yuan, Pei Zeng, Minbo Gao and Qi Zhao	(参考訳) 資源理論の根本的な問題は資源の操作を研究することである。ここでは、量子チャネルの一般的な動的資源理論に着目し、資源の単一コピーによる1ショットの蒸留と希釈のタスクについて考察する。ユニタリチャネルや純粋な状態準備チャネルの任意のターゲットに対して、任意のリソースとターゲットの間で変換されるレートの上限と下限を決定するための普遍的な戦略を確立する。本研究では, チャネルのロバスト性およびチャネル仮説テストエントロピーに基づく資源対策と, 対象資源対策の正規化要因との関連性を示す。この戦略はチャネルロバスト性が有限であれば収束境界で最適となり、ターゲットリソースの測度は同じ値に崩壊する。シングルショットの結果はまた、漸近的リソース変換率を得るために、チャネルの漸近的並列操作にも適用される。我々は、純粋性、古典的容量、量子容量、非一様性、コヒーレンス、量子チャネルの絡み合いなど、いくつかの動的資源の例を示す。この結果は、量子通信、フォールトトレラント量子コンピューティング、量子熱力学に応用可能な一般的な動的資源理論に適用できる。 A fundamental problem in resource theory is to study the manipulation of the resource. Focusing on a general dynamical resource theory of quantum channels, here we consider tasks of one-shot resource distillation and dilution with a single copy of the resource. For any target of unitary channel or pure state preparation channel, we establish a universal strategy to determine upper and lower bounds on rates that convert between any given resource and the target. We show that the rates are related to resource measures based on the channel robustness and the channel hypothesis testing entropy, with regularization factors of the target resource measures. The strategy becomes optimal with converged bounds when the channel robustness is finite and measures of the target resource collapse to the same value. The single-shot result also applies to asymptotic parallel manipulation of channels to obtain asymptotic resource conversion rates. We give several examples of dynamical resources, including the purity, classical capacity, quantum capacity, non-uniformity, coherence, and entanglement of quantum channels. Our results are applicable to general dynamical resource theories with potential applications in quantum communication, fault-tolerant quantum computing, and quantum thermodynamics.	翻訳日:2023-04-22 02:55:24 公開日:2020-12-04
# 化学気相沈着グラフェン層および境界における異方性スピンダイナミクスとのロバストスピンインターコネクション Robust Spin Interconnect with Isotropic Spin Dynamics in Chemical Vapour Deposited Graphene Layers and Boundaries ( http://arxiv.org/abs/2012.02674v1 ) ライセンス: Link先を確認	Dmitrii Khokhriakov, Bogdan Karpiak, Anamul Md. Hoque, Bing Zhao, Subir Parui, Saroj P. Dash	(参考訳) 化学気相沈着(CVD)により成長する大面積グラフェンの利用は、全スピンメモリおよび論理回路におけるスケーラブルなスピン配線の開発に不可欠である。しかし、多層グラフェンパッチの存在とその境界がスピンダイナミクスに与える影響は、まだ解決されていないため、ロバストなスピンインターコネクトの基本的な理解と応用には必要である。ここでは, 特別に考案された単一層, 二重層および三層グラフェンチャネルにおける普遍的なスピン輸送と動的特性と, CVDグラフェン試料中に存在するそれらの層の境界と折り畳みについて報告する。グラフェン層の配向が異なるスピンに対して等方性スピン緩和を施した均一スピン寿命を室温で観察した。すべての不均一グラフェンチャネルにおいて、スピン偏極した平面外および面内スピンのスピン寿命異方性比は、一様に近いと測定される。解析の結果,多層チャネルにおけるエリオット・ヤフェットとヤコノフ・ペルルの機構の重要性が示され,後者の役割が高まった。大規模不均質CVDグラフェンの多層パッチとその境界と室温での折り畳みによる普遍的および等方的スピン輸送は、そのスピン相互結合性を証明し、スケーラブルなスピントロニクス回路の開発に有用である。 The utilization of large-area graphene grown by chemical vapour deposition (CVD) is crucial for the development of scalable spin interconnects in all-spin-based memory and logic circuits. However, the fundamental influence of the presence of multilayer graphene patches and their boundaries on spin dynamics has not been addressed yet, which is necessary for basic understanding and application of robust spin interconnects. Here, we report universal spin transport and dynamic properties in specially devised single layer, bi-layer, and tri-layer graphene channels and their layer boundaries and folds that are usually present in CVD graphene samples. We observe uniform spin lifetime with isotropic spin relaxation for spins with different orientations in graphene layers and their boundaries at room temperature. In all the inhomogeneous graphene channels, the spin lifetime anisotropy ratios for spins polarized out-of-plane and in-plane are measured to be close to unity. Our analysis shows the importance of both Elliott-Yafet and Dyakonov-Perel mechanisms, with an increasing role of the latter mechanism in multilayer channels. 
These results of universal and isotropic spin transport on large-area inhomogeneous CVD graphene with multilayer patches and their boundaries and folds at room temperature prove its outstanding spin interconnect functionality, beneficial for the development of scalable spintronic circuits.	翻訳日:2023-04-22 02:53:46 公開日:2020-12-04
# 気象正常化を用いた大学寮の電力消費に及ぼすCOVID-19の影響調査 Investigation of the Impacts of COVID-19 on the Electricity Consumption of a University Dormitory Using Weather Normalization ( http://arxiv.org/abs/2012.07748v1 ) ライセンス: Link先を確認	Zhihong Pang, Fan Feng, Zheng O'Neill	(参考訳) 本研究では,米国南部にある大学寮ビルの電力消費に対する新型コロナウイルス(COVID-19)パンデミックの影響を調査した。 2017年1月1日から2020年7月31日までに収集されたこの大学寮の過去の電力消費データと、キャンパス内気象観測所の気象データを用いて分析を行った。 4つの逆データ駆動予測モデル、すなわち、人工ニューラルネットワーク、長・短期記憶(LSTM)リカレントニューラルネットワーク、eXtreme Gradient Boosting、Light Gradient Boosting Machineを用いて、気象条件の影響を考慮した。その結果、新型コロナウイルスによるキャンパス閉鎖時の予測値と比較して、対象建物の総電力消費量は41%(約276,000 kWh (942 MMBtu))減少した。また, 日負荷比 (DLR) も有意に変化した。概して、DLRは2020年3月後半に80%から40%近くまで徐々に減少し、2020年4月、5月、6月に30%から60%まで比較的安定した水準を維持し、2020年7月には徐々に正常な能力の80%まで回復した。 This study investigated the impacts of the COVID-19 pandemic on the electricity consumption of a university dormitory building in the southern United States. Historical electricity consumption data for the building and weather data from an on-campus weather station, collected from January 1, 2017 to July 31, 2020, were used for the analysis. Four inverse data-driven prediction models, i.e., Artificial Neural Network, Long Short-Term Memory Recurrent Neural Network, eXtreme Gradient Boosting, and Light Gradient Boosting Machine, were used to account for the influence of the weather conditions. The results suggested that, during the campus shutdown due to COVID-19, the total electricity consumption of the target building decreased by nearly 41% (about 276,000 kWh (942 MMBtu)) compared with the predicted value. In addition, the daily load ratio (DLR) varied significantly. In general, the DLR decreased gradually from 80% to nearly 40% in the second half of March 2020, remained at a relatively stable level between 30% and 60% in April, May, and June 2020, and then slowly recovered to 80% of the normal capacity in July 2020.	翻訳日:2023-04-22 02:47:24 公開日:2020-12-04
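
The figures quoted in the entry above can be cross-checked with simple arithmetic. The sketch below converts the reported saving from kWh to MMBtu using the standard factor of about 3412.14 Btu per kWh, and back-computes the weather-normalized baseline implied by a 41% reduction. The baseline is an inference from the two reported numbers, not a value given in the abstract, and the function names are illustrative.

```python
BTU_PER_KWH = 3412.14  # standard conversion factor (Btu per kWh)


def kwh_to_mmbtu(kwh):
    """Convert kilowatt-hours to million Btu."""
    return kwh * BTU_PER_KWH / 1_000_000


def percent_reduction(predicted_kwh, actual_kwh):
    """Reduction of actual consumption relative to the model prediction, in percent."""
    return 100.0 * (predicted_kwh - actual_kwh) / predicted_kwh


saving_kwh = 276_000                        # reported saving
implied_baseline = saving_kwh / 0.41        # inferred, assuming exactly 41%
implied_actual = implied_baseline - saving_kwh

print("saving in MMBtu:", round(kwh_to_mmbtu(saving_kwh)))
print("implied predicted consumption (kWh):", round(implied_baseline))
print("reduction (%):", round(percent_reduction(implied_baseline, implied_actual), 1))
```

The conversion reproduces the 942 MMBtu quoted in the abstract, which suggests the two reported units are consistent with each other.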
# 都市内業務におけるオンライン食品配送データの利用と住宅移動度検出とキャラクタリゼーション Exploring the Usage of Online Food Delivery Data for Intra-Urban Job and Housing Mobility Detection and Characterization ( http://arxiv.org/abs/2012.03739v1 ) ライセンス: Link先を確認	Yawen Zhang, Seth Spielman, Qi Liu, Si Shen, Jason Shuo Zhang, Qin Lv	(参考訳) ヒトのモビリティは都市計画や政策立案において重要な役割を担っている。しかし、ある空間的・時間的解像度では、例えば仕事や住宅の移動などの追跡は非常に困難である。本研究では,仕事や住宅の移動性を検出するために,オンラインフードデリバリーデータであるデータセットの新しいモダリティの利用について検討する。中国の北京で人気のオンライン食品注文・配送サービスから何百万もの注文を受け付けることで、従来のデータソースよりもはるかに高い空間的・時間的解像度で、仕事や住宅の移動を検出できるのです。一般的な動きの季節や起源・運命はよく特定できる。より重要なことは、検出された動きをマクロとマイクロレベルの両方の要素に合わせ、仕事と住宅のダイナミクスを特徴づける。以上の結果から,通勤距離は仕事や住宅移動の大きな要因であることが示唆された。また,(1)住宅移動者の場合,都市空間構造を考えると,住宅コストの低減と通勤距離の短縮との間にはトレードオフがあり,(2)就業ホッパーの場合,残業頻度が高い場合,仕事の切り替えによって労働時間を短縮する傾向が強い。この新しいデータセットのモダリティには制限があるが、異なる特徴を持つ複数のデータセットのマッシュアップがジョブやハウジングのダイナミクスをより包括的に表現できるような、アンサンブルアプローチは有望だと考えています。本研究は,雇用・住宅の移動性の検出・分析に食品配送データを活用することの有効性を実証し,アンサンブルに基づくアプローチの潜在可能性の実現に寄与する。 Human mobility plays a critical role in urban planning and policy-making. However, at certain spatial and temporal resolutions, it is very challenging to track, for example, job and housing mobility. In this study, we explore the usage of a new modality of dataset, online food delivery data, to detect job and housing mobility. By leveraging millions of meal orders from a popular online food ordering and delivery service in Beijing, China, we are able to detect job and housing moves at much higher spatial and temporal resolutions than using traditional data sources. Popular moving seasons and origins/destinations can be well identified. More importantly, we match the detected moves to both macro- and micro-level factors so as to characterize job and housing dynamics. Our findings suggest that commuting distance is a major factor for job and housing mobility. 
We also observe that: (1) For home movers, there is a trade-off between lower housing cost and shorter commuting distance given the urban spatial structure; (2) For job hoppers, those who frequently work overtime are more likely to reduce their working hours by switching jobs. While this new modality of dataset has its limitations, we believe that ensemble approaches would be promising, where a mash-up of multiple datasets with different characteristic limitations can provide a more comprehensive picture of job and housing dynamics. Our work demonstrates the effectiveness of utilizing food delivery data to detect and analyze job and housing mobility, and contributes to realizing the full potential of ensemble-based approaches.	翻訳日:2023-04-22 02:45:40 公開日:2020-12-04
# 公開鍵暗号を用いた健康診断の検証 Verifiable Proof of Health using Public Key Cryptography ( http://arxiv.org/abs/2012.02885v1 ) ライセンス: Link先を確認	Abhishek Singh, Ramesh Raskar	(参考訳) 現在のパンデミックでは、検疫や接触追跡などの健康関連の介入を行うため、検査は病気の拡散や早期診断を監視・抑制するための最も重要なツールであり続けている。したがって、公の場が安全にオープンする準備ができているため、テストステータスの検証能力が重要となる。暗号ツールの最近の進歩により、セキュアで弾力性のあるデジタルidシステムの構築が可能になった。本稿では,テスト結果検証システムの相互運用可能な層を設計する上で,プライバシや計算,その他の実用上の懸念を考慮に入れて,より厳格で選択的なロックダウンを可能にするためのエンドツーエンドの新型コロナウイルス結果検証プロトコルを構築することを提案する。また,提案システムのセキュリティ,プライバシ,倫理,公平性に関する様々な懸念についても論じる。 In the current pandemic, testing continues to be the most important tool for monitoring and curbing the disease spread and early identification of the disease to perform health-related interventions like quarantine, contact tracing and etc. Therefore, the ability to verify the testing status is pertinent as public places prepare to safely open. Recent advances in cryptographic tools have made it possible to build a secure and resilient digital-id system. In this work, we propose to build an end to end COVID-19 results verification protocol that takes privacy, computation, and other practical concerns into account for designing an inter-operable layer of testing results verification system that could potentially enable less stringent and more selective lockdowns. We also discuss various concerns encompassing the security, privacy, ethics and equity aspect of the proposed system.	翻訳日:2023-04-22 02:45:10 公開日:2020-12-04
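
The abstract above does not specify which signature scheme the protocol uses, so the following is only a toy illustration of the general idea of a verifiable test result: an issuer signs a record with a private key and anyone holding the public key can check it. It uses textbook RSA with two well-known Mersenne primes, which is insecure and chosen purely so the example runs with the standard library. The record format and all names are hypothetical.

```python
import hashlib

# Toy key material: two known Mersenne primes. Insecure, for illustration only.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                           # public modulus
E = 65537                           # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent (needs Python 3.8+)


def digest(message: str) -> int:
    """SHA-256 of the message, reduced modulo N so it fits the toy key."""
    raw = hashlib.sha256(message.encode("utf-8")).digest()
    return int.from_bytes(raw, "big") % N


def sign(message: str) -> int:
    """Issuer side: textbook RSA signature of the message digest."""
    return pow(digest(message), D, N)


def verify(message: str, signature: int) -> bool:
    """Verifier side: needs only the public values N and E."""
    return pow(signature, E, N) == digest(message)


record = "result=negative;date=2020-12-04;subject=hypothetical-id-123"
signature = sign(record)
tampered = record.replace("negative", "positive")

print("original verifies:", verify(record, signature))
print("tampered verifies:", verify(tampered, signature))
```

A real deployment would use a vetted library and a padded or elliptic-curve scheme, and would likely sign a commitment rather than raw personal data to address the privacy concerns the abstract raises.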
# 動的偏極核環境と相互作用するNV中心の客観性 Appearance of objectivity for NV centers interacting with dynamically polarized nuclear environment ( http://arxiv.org/abs/2012.02855v1 ) ライセンス: Link先を確認	Damian Kwiatkowski, {\L}ukasz Cywi\'nski, Jaros{\l}aw K. Korbicz	(参考訳) 量子から古典への遷移は、まだ完全に理解できない。その多面的側面から、最近注目を集めているのが、量子から客観的な世界の出現である。特に、客観性は、スペクトル放送構造(SBS)として知られる進化中の特定の量子状態構造の形成によって現れる。この最強かつ最も基本的な客観性に関する研究がすでに行われているにもかかわらず、具体的な物理媒体での実践的実現は今のところ分析されていない。本研究では, ダイヤモンド中の窒素空孔中心を用いて, SBS形成による客観化過程をシミュレートする可能性について検討した。動的偏極技術の達成可能な限界を仮定すると、高いが実験的に実現可能な核スピン偏極($p>0.5$)と$\approx 20$ガウス以下の磁場に対して、NV中心の状態とその最近傍の偏極環境がSBS状態に十分よく近づくことを示す。 The quantum-to-classical transition still eludes full understanding. Among its many aspects, one has recently gained increased attention: the appearance of an objective world out of the quantum one. One particular idea is that objectivity appears thanks to the formation, during the evolution, of specific quantum state structures known as Spectrum Broadcast Structures (SBS). Although considerable research has already been performed on this strongest and most fundamental form of objectivity, its practical realization in a concrete physical medium has not been analyzed so far. In this work, we study the possibility of simulating the objectivization process via SBS formation using widely studied nitrogen-vacancy (NV) centers in diamond. Assuming achievable limits of the dynamical polarization technique, we show that for high but experimentally viable polarizations ($p>0.5$) of nuclear spins and for magnetic fields lower than $\approx 20$ Gauss, the state of the NV center and its nearest polarized environment approaches an SBS state reasonably well.	翻訳日:2023-04-22 02:44:57 公開日:2020-12-04
# 拡散写像を用いた量子相転移の教師なし機械学習 Unsupervised machine learning of quantum phase transitions using diffusion maps ( http://arxiv.org/abs/2003.07399v2 ) ライセンス: Link先を確認	Alexander Lidiak and Zhexuan Gong	(参考訳) 実験的な量子シミュレータは巨大で複雑になり、膨大な量の計測データから新しい物理学を発見することは、特にシミュレーションモデルの理論的理解がほとんどない場合、非常に困難である。教師なしの機械学習手法はこの課題を克服する上で特に有望である。量子相転移を学習する特定のタスクのために、教師なし機械学習法は主に単純な順序パラメータによって特徴づけられる相転移のために開発されてきた。しかし、そのような方法はしばしば不連続相、原子価結合固体、位相次数、多体局在など、より複雑な相転移では失敗する。測定データの非線形次元減少とスペクトルクラスタリングを行う拡散写像法は,そのような複雑な位相遷移を教師なしで学習する上で有意なポテンシャルを有することを示す。本手法は、局所観測性の測定を単一基底で行うため、様々な量子位相や相転移を学習するための汎用ツールとして、多くの実験量子シミュレータに容易に適用できる。 Experimental quantum simulators have become large and complex enough that discovering new physics from the huge amount of measurement data can be quite challenging, especially when little theoretical understanding of the simulated model is available. Unsupervised machine learning methods are particularly promising in overcoming this challenge. For the specific task of learning quantum phase transitions, unsupervised machine learning methods have primarily been developed for phase transitions characterized by simple order parameters, typically linear in the measured observables. However, such methods often fail for more complicated phase transitions, such as those involving incommensurate phases, valence-bond solids, topological order, and many-body localization. We show that the diffusion map method, which performs nonlinear dimensionality reduction and spectral clustering of the measurement data, has significant potential for learning such complex phase transitions unsupervised. This method works for measurements of local observables in a single basis and is thus readily applicable to many experimental quantum simulators as a versatile tool for learning various quantum phases and phase transitions.	翻訳日:2022-12-23 03:58:41 公開日:2020-12-04
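
The diffusion map method named in the entry above can be sketched in a few lines. This is a minimal pure-Python illustration under strong simplifying assumptions, not the authors' pipeline: the "measurement data" are made-up one-dimensional samples drawn from two regimes, the kernel width `eps` is chosen by hand, and the first nontrivial diffusion coordinate is found by power iteration on the symmetrized Markov matrix.

```python
import math


def diffusion_coordinate(points, eps, iters=500):
    """Return (eigenvalue, first nontrivial diffusion coordinate) for 1D samples."""
    n = len(points)
    # Gaussian kernel on pairwise squared distances.
    K = [[math.exp(-((a - b) ** 2) / eps) for b in points] for a in points]
    deg = [sum(row) for row in K]
    # S = D^{-1/2} K D^{-1/2} shares its eigenvalues with the Markov matrix P = D^{-1} K.
    S = [[K[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    # The trivial top eigenvector of S (eigenvalue 1) is proportional to sqrt(deg).
    norm0 = math.sqrt(sum(deg))
    top = [math.sqrt(d) / norm0 for d in deg]
    v = [float(i + 1) for i in range(n)]  # deterministic start vector
    lam = 0.0
    for _ in range(iters):
        # Project out the trivial eigenvector, then apply S once.
        overlap = sum(a * b for a, b in zip(v, top))
        v = [a - overlap * b for a, b in zip(v, top)]
        w = [sum(S[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm_v = math.sqrt(sum(x * x for x in v))
        norm_w = math.sqrt(sum(x * x for x in w))
        lam = norm_w / norm_v
        v = [x / norm_w for x in w]
    # Map back to a right eigenvector of P and scale by the eigenvalue.
    return lam, [lam * v[i] / math.sqrt(deg[i]) for i in range(n)]


# Made-up samples of a single observable in two regimes.
samples = [-1.1, -1.0, -0.9, -0.95, 0.9, 1.0, 1.05, 1.1]
eigenvalue, coords = diffusion_coordinate(samples, eps=0.5)
labels = [c * coords[0] > 0 for c in coords]  # True = same cluster as first sample

print("second eigenvalue:", round(eigenvalue, 4))
print("cluster labels:", labels)
```

The sign of the first nontrivial coordinate separates the two groups without any labels, which is the sense in which the method is unsupervised. Real data would be high-dimensional measurement snapshots and would need a proper eigensolver.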
# きめ細かい表情操作に向けて Toward Fine-grained Facial Expression Manipulation ( http://arxiv.org/abs/2004.03132v2 ) ライセンス: Link先を確認	Jun Ling, Han Xue, Li Song, Shuhui Yang, Rong Xie, Xiao Gu	(参考訳) 表情操作は、所定の条件で表情を編集することを目的としている。従来の方法は、個別の感情ラベルまたは絶対状態(例えば、顔の動き単位)のガイダンスの下で入力画像を編集し、所望の表現を保持する。しかし、これらの手法は条件非関連領域の変更に悩まされるか、きめ細かい編集に非効率である。本研究では,これら2つの目的を考察し,新しい手法を提案する。まず,連続絶対条件を相対条件,特に相対作用単位に置き換える。相対作用単位を用いて、生成器は非ゼロ値の相対AUによって指定される関心領域のみを変換することを学ぶ。第2に、我々のジェネレータはU-Net上に構築されているが、高品質な表現編集のためのマルチスケール特徴融合(MSF)機構によって強化されている。定量的評価と定性評価の両面での広範囲な実験により,提案手法の改良が示された。コードは \url{https://github.com/junleen/expression-manipulator} で入手できる。 Facial expression manipulation aims at editing facial expression with a given condition. Previous methods edit an input image under the guidance of a discrete emotion label or absolute condition (e.g., facial action units) to possess the desired expression. However, these methods either suffer from changing condition-irrelevant regions or are inefficient for fine-grained editing. In this study, we take these two objectives into consideration and propose a novel method. First, we replace continuous absolute condition with relative condition, specifically, relative action units. With relative action units, the generator learns to only transform regions of interest which are specified by non-zero-valued relative AUs. Second, our generator is built on U-Net but strengthened by Multi-Scale Feature Fusion (MSF) mechanism for high-quality expression editing purposes. Extensive experiments on both quantitative and qualitative evaluation demonstrate the improvements of our proposed approach compared to the state-of-the-art expression editing methods. Code is available at \url{https://github.com/junleen/Expression-manipulator}.	翻訳日:2022-12-16 00:17:13 公開日:2020-12-04
# 混合密度条件生成逆ネットワークモデル(MD-CGAN) Mixture Density Conditional Generative Adversarial Network Models (MD-CGAN) ( http://arxiv.org/abs/2004.03797v3 ) ライセンス: Link先を確認	Jaleh Zand and Stephen Roberts	(参考訳) 近年,GAN (Generative Adversarial Networks) が注目されている。しかし、そのような例と比較して、予測を含む時系列モデリングへのGANの応用はより限られている。本稿では,時系列予測に着目した混合密度条件付き生成逆解析モデル(md-cgan)を提案する。本モデルでは,予測よりも確率的後続分布を推定できることを示すとともに,一連のベンチマーク手法と比較して,特にノイズが観測時系列の重要な成分である状況において,MD-CGANモデルが良好に動作することを示す。さらに、出力分布としてガウス混合モデルを用いることで、MD-CGANは非ガウス的な後続予測を提供する。 Generative Adversarial Networks (GANs) have gained significant attention in recent years, with impressive applications highlighted in computer vision in particular. Compared to such examples, however, there have been more limited applications of GANs to time series modelling, including forecasting. In this work, we present the Mixture Density Conditional Generative Adversarial Model (MD-CGAN), with a focus on time series forecasting. We show that our model is capable of estimating a probabilistic posterior distribution over forecasts and that, in comparison to a set of benchmark methods, the MD-CGAN model performs well, particularly in situations where noise is a significant component of the observed time series. Further, by using a Gaussian mixture model as the output distribution, MD-CGAN offers posterior predictions that are non-Gaussian.	翻訳日:2022-12-15 08:19:25 公開日:2020-12-04
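
The last sentence of the entry above notes that a Gaussian mixture output gives non-Gaussian predictions. The sketch below shows only that output layer, with hand-picked mixture parameters standing in for what a trained generator might emit for one forecast step. The parameter values and function names are assumptions for illustration, and no GAN training is shown.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a univariate normal distribution."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def mixture_pdf(x, weights, means, stds):
    """Density of a Gaussian mixture with the given component parameters."""
    return sum(w * gaussian_pdf(x, m, s) for w, m, s in zip(weights, means, stds))


def mixture_nll(x, weights, means, stds):
    """Negative log-likelihood of one observation under the mixture."""
    return -math.log(mixture_pdf(x, weights, means, stds))


# Hypothetical mixture parameters for a single forecast step (two regimes).
weights, means, stds = [0.6, 0.4], [-1.0, 2.0], [0.5, 0.5]
mix_mean = sum(w * m for w, m in zip(weights, means))

print("mixture mean:", round(mix_mean, 3))
print("density at mixture mean:", round(mixture_pdf(mix_mean, weights, means, stds), 4))
print("density at each mode:",
      [round(mixture_pdf(m, weights, means, stds), 4) for m in means])
```

The density at the mixture mean is far lower than at either mode, so a single Gaussian centred on the mean would describe this forecast poorly.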
# 深層学習を用いた道路網上の温室効果ガス排出予測 Greenhouse Gas Emission Prediction on Road Network using Deep Sequence Learning ( http://arxiv.org/abs/2004.08286v2 ) ライセンス: Link先を確認	Lama Alfaseeh, Ran Tu, Bilal Farooq, and Marianne Hatzopoulou	(参考訳) 交通システムの環境への影響を緩和することが最重要課題である。したがって、温室効果ガス(GHG)排出量の予測は、特に知的輸送システム(ITS)の出現において重要なトピックの1つである。本研究では,従来の時間ステップの速度,密度,GHG ERなど,最も代表的な予測値に基づいて,リンクレベルのGHG排出率(ER)を予測するディープラーニングフレームワークを開発する。特に,外因性変数を持つlong-short term memory(lstm)ネットワークの諸仕様を,クラスタリングおよび外因性変数を用いた自己回帰的統合移動平均(arima)モデルと比較した。トロント中心街の道路網はケーススタディとして利用され、校正交通マイクロシミュレーションとMOVESを用いて詳細なデータを合成する。 LSTM仕様では,3分間の速度,密度,GHG ER,リンク内速度が2層を隠蔽し,過度パラメータを体系的に調整した場合に最適であることがわかった。 30秒の更新間隔を採用すると、真のGHG ERと予測されたGHG ERとの相関はわずかに改善されるが、増大したルート平均二乗誤差(RMSE)値に反映されるように予測精度に悪影響を及ぼす。温暖化への悪影響を軽減するために,データ要求の少ない高頻度でのghg排出量の効率的な予測は,大規模道路網における非筋電性エコルーティングへの道を開く Mitigating the substantial undesirable impact of transportation systems on the environment is paramount. Thus, predicting Greenhouse Gas (GHG) emissions is one of the profound topics, especially with the emergence of intelligent transportation systems (ITS). We develop a deep learning framework to predict link-level GHG emission rate (ER) (in CO2eq gram/second) based on the most representative predictors, such as speed, density, and the GHG ER of previous time steps. In particular, various specifications of the long-short term memory (LSTM) networks with exogenous variables are examined and compared with clustering and the autoregressive integrated moving average (ARIMA) model with exogenous variables. The downtown Toronto road network is used as the case study and highly detailed data are synthesized using a calibrated traffic microsimulation and MOVES. It is found that LSTM specification with speed, density, GHG ER, and in-links speed from three previous minutes performs the best while adopting 2 hidden layers and when the hyper-parameters are systematically tuned. 
Adopting a 30 second updating interval slightly improves the correlation between true and predicted GHG ERs, but contributes negatively to the prediction accuracy as reflected in the increased root mean square error (RMSE) value. Efficiently predicting GHG emissions at a higher frequency with lower data requirements will pave the way to non-myopic eco-routing on large-scale road networks to alleviate the adverse impact on global warming.	翻訳日:2022-12-12 21:18:30 公開日:2020-12-04
# Twitterの談話におけるバイアスによる目標情報操作の自動評価 Automatically Characterizing Targeted Information Operations Through Biases Present in Discourse on Twitter ( http://arxiv.org/abs/2004.08726v3 ) ライセンス: Link先を確認	Autumn Toney, Akshat Pandey, Wei Guo, David Broniatowski, Aylin Caliskan	(参考訳) 本稿では、人工知能による新たな情報操作と関連する可能性のある全体的な態度やバイアスを自動的に特徴付ける問題を検討する。これらの新興トピックの正確な分析には、新しいトピックのバイアスを特定するために何百万ものツイートを注釈付けするために、専門家による精巧な手動分析が必要である。本稿では,CaliskanらによるWord Embedding Association Testの新たなドメインへの拡張について紹介する(Caliskan, 2017)。本手法は,情報操作におけるバイアスの定量化に有効である。本手法は,Twitterの透明性レポートからの既知の情報操作関連ツイートを用いて検証する。我々は、新型コロナウイルスパンデミックに関するケーススタディを行い、未ラベルのTwitterデータ上での方法のパフォーマンスを評価し、新興ドメインにおけるそのユーザビリティを実証した。 This paper considers the problem of automatically characterizing overall attitudes and biases that may be associated with emerging information operations via artificial intelligence. Accurate analysis of these emerging topics usually requires laborious, manual analysis by experts to annotate millions of tweets to identify biases in new topics. We introduce extensions of the Word Embedding Association Test from Caliskan et al. to a new domain (Caliskan, 2017). Our practical and unsupervised method is used to quantify biases promoted in information operations. We validate our method using known information operation-related tweets from Twitter's Transparency Report. We perform a case study on the COVID-19 pandemic to evaluate our method's performance on non-labeled Twitter data, demonstrating its usability in emerging domains.	翻訳日:2022-12-12 05:08:21 公開日:2020-12-04
# 持続可能な消費をいかに活用できるか How Value-Sensitive Design Can Empower Sustainable Consumption ( http://arxiv.org/abs/2004.09180v4 ) ライセンス: Link先を確認	Thomas Asikis, Johannes Klinglmayr, Dirk Helbing, Evangelos Pournaras	(参考訳) いわゆる人口過多の世界では,持続的消費は存在的に重要である。しかしながら,製品選択のスペクトルの拡大と生産の複雑さは,消費者に情報と価値に敏感な意思決定を迫られる。最近の(個人化された)心理的操作に基づくアプローチは、しばしば不透明で、プライバシーを侵害し、(情報的な)自己決定と矛盾する。対照的に、情報的選択に基づく責任ある消費は、人間の認知能力に圧倒されがちな程度に推論を必要とする。その結果、持続可能な消費への集団的シフトは依然として大きな課題である。ここでは,価値に敏感なデザインをサポートし,サステナビリティ意識を活用する,透明な製品情報と説明可能な製品評価に専門家の知識と“群衆の知恵”を用いて,スマートフォンアプリとして実装された新しいパーソナルショッピングアシスタントを紹介する。 2つのスーパーマーケットにおける実世界のフィールド実験は、より高い持続可能性意識とボトムアップによるより持続可能な消費への行動シフトを確認している。これらの結果は、消費者の好みとより高い持続可能性と倫理的に一致した、小売業者と生産者のための新しいビジネスモデルを奨励する。 In a so-called overpopulated world, sustainable consumption is of existential importance. However, the expanding spectrum of product choices and their production complexity challenge consumers to make informed and value-sensitive decisions. Recent approaches based on (personalized) psychological manipulation are often intransparent, potentially privacy-invasive and inconsistent with (informational) self-determination. In contrast, responsible consumption based on informed choices currently requires reasoning to an extent that tends to overwhelm human cognitive capacity. As a result, a collective shift towards sustainable consumption remains a grand challenge. Here we demonstrate a novel personal shopping assistant implemented as a smart phone app that supports a value-sensitive design and leverages sustainability awareness, using experts' knowledge and "wisdom of the crowd" for transparent product information and explainable product ratings. Real-world field experiments in two supermarkets confirm higher sustainability awareness and a bottom-up behavioral shift towards more sustainable consumption. These results encourage novel business models for retailers and producers, ethically aligned with consumer preferences and with higher sustainability.	翻訳日:2022-12-11 19:29:19 公開日:2020-12-04
# EfficientPose: スケーラブルなシングルパーソンポーズ推定 EfficientPose: Scalable single-person pose estimation ( http://arxiv.org/abs/2004.12186v2 ) ライセンス: Link先を確認	Daniel Groos, Heri Ramampiaro, Espen A. F. Ihlen	(参考訳) 一人称人間のポーズ推定は、スポーツにおけるマーカーレス運動分析と臨床応用を促進する。それでも、人間のポーズ推定の最先端モデルは、一般に実際の応用の要件を満たしていない。深層学習技術の普及は、多くの先進的なアプローチを生み出した。しかし、この分野の進展に伴い、より複雑で非効率なモデルも導入され、計算要求が大幅に増加した。このような複雑で非効率な課題に対処するため,我々は,最近提案されている効率性ネットを活用し,効率的かつスケーラブルな一人称ポーズ推定を実現する,新しい畳み込みニューラルネットワークアーキテクチャである efficientpose を提案する。 efficientposeは、モバイル逆ボトルネック畳み込みを用いた効果的なマルチスケール特徴抽出器と計算効率の高い検出ブロックを活用したモデル群であると同時に、ポーズ構成の精度も向上している。複雑さと効率が低いため、EfficientPoseはメモリフットプリントと計算コストを制限し、エッジデバイス上の現実世界のアプリケーションを可能にする。実験の結果,MPIIシングルパーソンベンチマークを用いた結果,提案したEfficientPoseモデルは,精度と計算効率の両面で広く使用されているOpenPoseモデルより大幅に優れていることがわかった。特に,我々のトップパフォーマンスモデルでは,低複雑さのConvNetを用いて,シングルパーソンMPIIにおける最先端の精度を実現している。 Single-person human pose estimation facilitates markerless movement analysis in sports, as well as in clinical applications. Still, state-of-the-art models for human pose estimation generally do not meet the requirements of real-life applications. The proliferation of deep learning techniques has resulted in the development of many advanced approaches. However, with the progresses in the field, more complex and inefficient models have also been introduced, which have caused tremendous increases in computational demands. To cope with these complexity and inefficiency challenges, we propose a novel convolutional neural network architecture, called EfficientPose, which exploits recently proposed EfficientNets in order to deliver efficient and scalable single-person pose estimation. EfficientPose is a family of models harnessing an effective multi-scale feature extractor and computationally efficient detection blocks using mobile inverted bottleneck convolutions, while at the same time ensuring that the precision of the pose configurations is still improved. 
Due to its low complexity and efficiency, EfficientPose enables real-world applications on edge devices by limiting the memory footprint and computational cost. The results from our experiments, using the challenging MPII single-person benchmark, show that the proposed EfficientPose models substantially outperform the widely-used OpenPose model both in terms of accuracy and computational efficiency. In particular, our top-performing model achieves state-of-the-art accuracy on single-person MPII, with low-complexity ConvNets.	翻訳日:2022-12-09 21:44:12 公開日:2020-12-04
# リアルタイム応用のためのヒト人工内耳力学の畳み込みニューラルネットワークモデルとフィルタチューニング A convolutional neural-network model of human cochlear mechanics and filter tuning for real-time applications ( http://arxiv.org/abs/2004.14832v4 ) ライセンス: Link先を確認	Deepak Baby, Arthur Van Den Broucke, Sarah Verhulst	(参考訳) 聴覚モデルは、自動音声認識システムのための特徴抽出器や、ロボット工学、機械聴取、補聴器のフロントエンドとして一般的に用いられている。聴覚モデルは人間の聴覚の生体物理学的および非線形的特性を非常に詳細に捉えることができるが、これらの生体物理学モデルは計算コストが高く、リアルタイム応用には使用できない。本稿では,畳み込みニューラルネットワークと計算神経科学を組み合わせることによって,レベル依存フィルタチューニング(connear)を含む人工内耳力学のリアルタイムエンドツーエンドモデルを実現するハイブリッドアプローチを提案する。 CoNNear モデルは音響音声材料で訓練され、その性能と適用性はコクラー力学研究でよく用いられる(見えない)音刺激を用いて評価された。 connearモデルは、人間の人工内耳周波数選択率と、その音響強度依存性を正確にシミュレートする。 CoNNearアーキテクチャは並列で微分可能な計算に基づいており、リアルタイムな人間のパフォーマンスを実現する能力を持っている。これらのユニークなCNNear機能は、次世代のヒューマンライクな機械学習アプリケーションを可能にする。 Auditory models are commonly used as feature extractors for automatic speech-recognition systems or as front-ends for robotics, machine-hearing and hearing-aid applications. Although auditory models can capture the biophysical and nonlinear properties of human hearing in great detail, these biophysical models are computationally expensive and cannot be used in real-time applications. We present a hybrid approach where convolutional neural networks are combined with computational neuroscience to yield a real-time end-to-end model for human cochlear mechanics, including level-dependent filter tuning (CoNNear). The CoNNear model was trained on acoustic speech material and its performance and applicability were evaluated using (unseen) sound stimuli commonly employed in cochlear mechanics research. The CoNNear model accurately simulates human cochlear frequency selectivity and its dependence on sound intensity, an essential quality for robust speech intelligibility at negative speech-to-background-noise ratios. The CoNNear architecture is based on parallel and differentiable computations and has the power to achieve real-time human performance. 
These unique CoNNear features will enable the next generation of human-like machine-hearing applications.	翻訳日:2022-12-08 05:43:49 公開日:2020-12-04
# TAVAT: 言語理解のための仮想敵訓練 TAVAT: Token-Aware Virtual Adversarial Training for Language Understanding ( http://arxiv.org/abs/2004.14543v3 ) ライセンス: Link先を確認	Linyang Li, Xipeng Qiu	(参考訳) ニューラルネットワークの堅牢性向上には、グラディエントベースの逆行訓練が広く用いられているが、埋め込み空間が離散的であるため、自然言語処理タスクに容易に適応することはできない。自然言語処理の分野では、テキストが離散的であり、勾配によって直接摂動できないため、仮想対位訓練が導入される。あるいは、NLPタスクでは、埋め込み空間上の摂動を生成する仮想敵トレーニングが導入される。その成功にもかかわらず、既存の仮想敵の訓練方法はフロベニウス正規化球によってほぼ制約された摂動を生成する。微粒な摂動を創り出すために,トークン認識型仮想敵訓練法を提案する。トークンレベルの蓄積摂動語彙を導入し、摂動をより早く初期化し、トークンレベルの正規化球を用いて摂動を連続的に制限する。実験の結果, BERT や ALBERT などの事前学習モデルの性能は, かなりの差で向上することがわかった。提案手法は,BERTモデルを用いてGLUEベンチマークのスコアを78.3から80.9に改善し,シーケンスラベリングやテキスト分類タスクの性能を向上させる。 Gradient-based adversarial training is widely used in improving the robustness of neural networks, while it cannot be easily adapted to natural language processing tasks since the embedding space is discrete. In natural language processing fields, virtual adversarial training is introduced since texts are discrete and cannot be perturbed by gradients directly. Alternatively, virtual adversarial training, which generates perturbations on the embedding space, is introduced in NLP tasks. Despite its success, existing virtual adversarial training methods generate perturbations roughly constrained by Frobenius normalization balls. To craft fine-grained perturbations, we propose a Token-Aware Virtual Adversarial Training method. We introduce a token-level accumulated perturbation vocabulary to initialize the perturbations better and use a token-level normalization ball to constrain these perturbations pertinently. Experiments show that our method improves the performance of pre-trained models such as BERT and ALBERT in various tasks by a considerable margin. The proposed method improves the score of the GLUE benchmark from 78.3 to 80.9 using BERT model and it also enhances the performance of sequence labeling and text classification tasks.	翻訳日:2022-12-08 03:57:05 公開日:2020-12-04
# 不規則サンプリング時系列における長期依存性の学習 Learning Long-Term Dependencies in Irregularly-Sampled Time Series ( http://arxiv.org/abs/2006.04418v4 ) ライセンス: Link先を確認	Mathias Lechner and Ramin Hasani	(参考訳) 連続時間隠れ状態を持つリカレントニューラルネットワーク(RNN)は、不規則サンプリング時系列のモデリングに自然に適合する。しかし、これらのモデルは、入力データが長期依存を持つ場合、困難に直面します。通常のRNNと同様、この問題の根底にある理由は、トレーニング中に勾配が消滅または爆発することである。この現象は、ODEソルバの選択に関係なく、隠蔽状態の常微分方程式(ODE)で表される。我々は,その時間連続状態からメモリを分離する長寿命メモリ(LSTM)に基づく新しいアルゴリズムを設計することで,解を提供する。これにより、rnn内の連続時間動的流れをエンコードし、メモリパスを通じて一定のエラー伝搬を確保しながら、任意のタイムラグに到着する入力に応答することができる。我々はこれらのRNNモデルをODE-LSTMと呼ぶ。 ODE-LSTMは, 長期依存性のある一様でないサンプルデータに対して, 高度なRNNベースのデータよりも優れていることを示す。すべてのコードとデータはhttps://github.com/mlech26l/ode-lstmsで入手できる。 Recurrent neural networks (RNNs) with continuous-time hidden states are a natural fit for modeling irregularly-sampled time series. These models, however, face difficulties when the input data possess long-term dependencies. We prove that similar to standard RNNs, the underlying reason for this issue is the vanishing or exploding of the gradient during training. This phenomenon is expressed by the ordinary differential equation (ODE) representation of the hidden state, regardless of the ODE solver's choice. We provide a solution by designing a new algorithm based on the long short-term memory (LSTM) that separates its memory from its time-continuous state. This way, we encode a continuous-time dynamical flow within the RNN, allowing it to respond to inputs arriving at arbitrary time-lags while ensuring a constant error propagation through the memory path. We call these RNN models ODE-LSTMs. We experimentally show that ODE-LSTMs outperform advanced RNN-based counterparts on non-uniformly sampled data with long-term dependencies. All code and data is available at https://github.com/mlech26l/ode-lstms.	翻訳日:2022-11-24 00:32:45 公開日:2020-12-04
# 疎データ依存雑音におけるPCAによる高速ロバスト部分空間追跡 Fast Robust Subspace Tracking via PCA in Sparse Data-Dependent Noise ( http://arxiv.org/abs/2006.08030v3 ) ライセンス: Link先を確認	Praneeth Narayanamurthy and Namrata Vaswani	(参考訳) 本研究はロバスト部分空間追跡(st)問題を研究する。ロバスト ST は、ロバストPCA の(ゆっくりとした)時変部分空間拡張として簡単に理解できる。真のデータは、時間とともに固定またはゆっくり変化する低次元部分空間にあると仮定する。目標は、加法的なスパースアウトレーヤの存在下で変化する部分空間を時間とともに追跡し、(短い遅延で)これを迅速に行うことである。軽度な仮定の下で確実に正しい「高速」ミニバッチロバストなstソリューションを提案する。速い」とは2つの意味を持つ。 (i)部分空間の変化を検出し、その部分空間を最適に近い遅延で追跡することができる。 (ii)これを行う時間の複雑さは、単純な(ロバストでない)pcaと同じである。我々の主な結果は、断片的に定数部分空間(識別可能性に依る)を仮定するが、同時に、各時間にわずかな変化がある場合の座標も提供する。第2の貢献は、線形データ依存ノイズにおけるPCAの非漸近的保証である。これが役立つ重要な設定は、時間とともに十分に変化するサポートと疎結合な線形データ依存ノイズに対してである。この結果を用いて,提案するロバストstソリューションのサブスペース更新ステップの解析を行う。 This work studies the robust subspace tracking (ST) problem. Robust ST can be simply understood as a (slow) time-varying subspace extension of robust PCA. It assumes that the true data lies in a low-dimensional subspace that is either fixed or changes slowly with time. The goal is to track the changing subspaces over time in the presence of additive sparse outliers and to do this quickly (with a short delay). We introduce a "fast" mini-batch robust ST solution that is provably correct under mild assumptions. Here "fast" means two things: (i) the subspace changes can be detected and the subspaces can be tracked with near-optimal delay, and (ii) the time complexity of doing this is the same as that of simple (non-robust) PCA. Our main result assumes piecewise constant subspaces (needed for identifiability), but we also provide a corollary for the case when there is a little change at each time. A second contribution is a novel non-asymptotic guarantee for PCA in linearly data-dependent noise. An important setting where this is useful is for linearly data dependent noise that is sparse with support that changes enough over time. The analysis of the subspace update step of our proposed robust ST solution uses this result.	翻訳日:2022-11-21 12:36:49 公開日:2020-12-04
# 直交グラディエントによる連続学習のための一般化保証 Generalisation Guarantees for Continual Learning with Orthogonal Gradient Descent ( http://arxiv.org/abs/2006.11942v4 ) ライセンス: Link先を確認	Mehdi Abbana Bennani, Thang Doan, Masashi Sugiyama	(参考訳) 継続的学習では、ディープニューラルネットワークは破滅的な忘れがちである。この課題に取り組むために直交勾配降下が提案された。しかし、理論的な保証はまだ証明されていない。本稿では,ニューラルタンジェントカーネルシステムにおける連続学習アルゴリズムの理論的枠組みを提案する。このフレームワークは、伝達学習、一般化、タスク類似性のためのタスクおよびプロキシを通じてモデルのクローズドフォーム表現を含む。この枠組みでは、OGDが破滅的フォーッティングに対して堅牢であることを証明し、SGD と OGD の連続学習に対する最初の一般化を導出する。最後に,本手法の限界について検討し,ogdを用いた連続学習における神経接核変動の重要性を強調する。 In Continual Learning settings, deep neural networks are prone to Catastrophic Forgetting. Orthogonal Gradient Descent was proposed to tackle the challenge. However, no theoretical guarantees have been proven yet. We present a theoretical framework to study Continual Learning algorithms in the Neural Tangent Kernel regime. This framework comprises closed form expression of the model through tasks and proxies for Transfer Learning, generalisation and tasks similarity. In this framework, we prove that OGD is robust to Catastrophic Forgetting then derive the first generalisation bound for SGD and OGD for Continual Learning. Finally, we study the limits of this framework in practice for OGD and highlight the importance of the Neural Tangent Kernel variation for Continual Learning with OGD.	翻訳日:2022-11-18 11:48:48 公開日:2020-12-04
# 効率的かつ透明なレビューのためのオープンソースソフトウェア Open Source Software for Efficient and Transparent Reviews ( http://arxiv.org/abs/2006.12166v3 ) ライセンス: Link先を確認	Rens van de Schoot, Jonathan de Bruin, Raoul Schram, Parisa Zahedi, Jan de Boer, Felix Weijdema, Bianca Kramer, Martijn Huijts, Maarten Hoogerwerf, Gerbrich Ferdinands, Albert Harkema, Joukje Willemsen, Yongchao Ma, Qixiang Fang, Sybren Hindriks, Lars Tummers, Daniel Oberski	(参考訳) 研究者は,系統的なレビューやメタアナリシスを可能な限り効果的かつ透過的に行うために,タイトルや要約のスクリーニングを高速化するツール (ASReview) を設計した。体系的なレビューやメタ分析を含む多くのタスクでは、科学的文献を体系的にチェックする必要があります。現在、学者や実践者は、レビューやメタ分析にどの研究を組み込むべきかを手作業で調査している。これは、極めて不均衡なデータのためにエラーを起こしやすく、非効率である。体系的なレビューの未来は、利用可能なテキストの膨大な増加に対応するために、機械学習アルゴリズムとのインタラクションになる。そこで我々は,アクティブラーニングを応用したオープンソースの機械学習支援パイプラインasreviewを開発した。シミュレーションにより,ASReviewは手作業によるレビューよりもはるかに効率的なレビューを実現するとともに,高品質なレビューを実現することができることを示す。さらに,フリーでオープンソースな研究ソフトウェアの選択肢を説明し,ユーザエクスペリエンステストの結果を紹介する。私たちはコミュニティに対して,現在のプラクティスよりも測定可能かつ再現可能な改善を提供する,私たち自身のオープンソースプロジェクトへのコントリビューションを呼びかけています。 To help researchers conduct a systematic review or meta-analysis as efficiently and transparently as possible, we designed a tool (ASReview) to accelerate the step of screening titles and abstracts. For many tasks - including but not limited to systematic reviews and meta-analyses - the scientific literature needs to be checked systematically. Currently, scholars and practitioners screen thousands of studies by hand to determine which studies to include in their review or meta-analysis. This is error prone and inefficient because of extremely imbalanced data: only a fraction of the screened studies is relevant. The future of systematic reviewing will be an interaction with machine learning algorithms to deal with the enormous increase of available text. We therefore developed an open source machine learning-aided pipeline applying active learning: ASReview. 
We demonstrate by means of simulation studies that ASReview can yield far more efficient reviewing than manual reviewing, while providing high quality. Furthermore, we describe the options of the free and open source research software and present the results from user experience tests. We invite the community to contribute to open source projects such as our own that provide measurable and reproducible improvements over current practice.	翻訳日:2022-11-18 06:49:39 公開日:2020-12-04
# ロバスト線形回帰:多項式時間における最適速度 Robust Linear Regression: Optimal Rates in Polynomial Time ( http://arxiv.org/abs/2007.01394v4 ) ライセンス: Link先を確認	Ainesh Bakshi and Adarsh Prasad	(参考訳) 最小分布仮定の下で統計的に最適収束率を達成する線形モデルを学習するための頑健で効率的な推定値を得る。具体的には、私たちのデータは、$k$-hypercontractive distributionから引き出され、$\epsilon$-fractionが逆向きに破損していると仮定する。次に、ノイズが共変量とは独立である場合、$\epsilon^{2-2/k}$に比例するレートで真の分布の最適最小二乗最小値に収束する推定器を記述する。このような推定器は我々の研究以前には知られていなかったが、非有界計算にもアクセスできた。私たちが達成したレートは情報理論上最適であり、klivans, kothari, meka [colt'18] の主要なオープン問題を解く。我々の重要な洞察は、確率変数の独立性の多項式緩和として働く解析条件を特定することである。特に,雑音のモーメントと共変量のモーメントが負の相関関係にある場合,独立雑音と同じ速度が得られることを示す。さらに、条件が満たされない場合、$\epsilon^{2-4/k}$に比例するレートを取得し、情報理論上の下限に再び一致する。我々の中心となる技術的貢献は、前述の多項式不等式として定式化することで、"sum-of-squares"フレームワークにおける確率変数の独立性をアルゴリズム的に活用することである。 We obtain robust and computationally efficient estimators for learning several linear models that achieve statistically optimal convergence rate under minimal distributional assumptions. Concretely, we assume our data is drawn from a $k$-hypercontractive distribution and an $\epsilon$-fraction is adversarially corrupted. We then describe an estimator that converges to the optimal least-squares minimizer for the true distribution at a rate proportional to $\epsilon^{2-2/k}$, when the noise is independent of the covariates. We note that no such estimator was known prior to our work, even with access to unbounded computation. The rate we achieve is information-theoretically optimal and thus we resolve the main open question in Klivans, Kothari and Meka [COLT'18]. Our key insight is to identify an analytic condition that serves as a polynomial relaxation of independence of random variables. In particular, we show that when the moments of the noise and covariates are negatively-correlated, we obtain the same rate as independent noise. Further, when the condition is not satisfied, we obtain a rate proportional to $\epsilon^{2-4/k}$, and again match the information-theoretic lower bound. 
Our central technical contribution is to algorithmically exploit independence of random variables in the "sum-of-squares" framework by formulating it as the aforementioned polynomial inequality.	翻訳日:2022-11-15 14:32:55 公開日:2020-12-04
# 連結車両による交通流の予測 Prediction of Traffic Flow via Connected Vehicles ( http://arxiv.org/abs/2007.05460v2 ) ライセンス: Link先を確認	Ranwa Al Mallah, Bilal Farooq, Alejandro Quintero	(参考訳) 我々は,交通当局がフロー制御と渋滞防止のために早期行動を取るための短期交通フロー予測(stp)フレームワークを提案する。我々は,過去の流れデータと,リアルタイムフィードやコネクテッドカー(cv)技術が提供する軌道データなどの革新的な特徴に基づいて,目標道路区間の将来の流れを予測する。既存の手法が交通の変動に適応しないという事実に対処するため,本手法が流れの予測に組み込むことによって高度なモデリングを可能にし,CVが現実的に遭遇する様々な事象が軌道に沿ったセグメントに与える影響を示す。 CVからの入力によって強化されたマルチタスク学習環境において,Deep Neural Networks (DNN) を用いてSTP問題を解く。その結果,MTL-CV,平均ルート平均二乗誤差(RMSE)が0.052であり,最先端のARIMA時系列(RMSE 0.255)とベースライン分類器(RMSE 0.122)を上回った。ニューラルネットワーク(ANN)による単一タスク学習と比較して、ANNはMTL-CVよりもパフォーマンスが0.113と低い。 MTL-CVは、測定において直接的な歴史的傾向を使用するのとは対照的に、セグメント間の歴史的類似性を学習した。 We propose a Short-term Traffic flow Prediction (STP) framework so that transportation authorities take early actions to control flow and prevent congestion. We anticipate flow at future time frames on a target road segment based on historical flow data and innovative features such as real time feeds and trajectory data provided by Connected Vehicles (CV) technology. To cope with the fact that existing approaches do not adapt to variation in traffic, we show how this novel approach allows advanced modelling by integrating into the forecasting of flow, the impact of the various events that CV realistically encountered on segments along their trajectory. We solve the STP problem with a Deep Neural Network (DNN) in a multitask learning setting augmented by input from CV. Results show that our approach, namely MTL-CV, with an average Root-Mean-Square Error (RMSE) of 0.052, outperforms state-of-the-art ARIMA time series (RMSE of 0.255) and baseline classifiers (RMSE of 0.122). Compared to single task learning with Artificial Neural Network (ANN), ANN had a lower performance, 0.113 for RMSE, than MTL-CV. 
MTL-CV learned historical similarities between segments, in contrast to using direct historical trends in the measure, because trends may not exist in the measure but do in the similarities.	翻訳日:2022-11-11 22:27:05 公開日:2020-12-04
# 条件付き生成逆ネットワークを用いた量子状態トモグラフィ Quantum State Tomography with Conditional Generative Adversarial Networks ( http://arxiv.org/abs/2008.03240v2 ) ライセンス: Link先を確認	Shahnawaz Ahmed, Carlos Sánchez Muñoz, Franco Nori, Anton Frisk Kockum	(参考訳) 量子状態トモグラフィ(QST)は、中間スケールの量子デバイスにおいて難しい課題である。本稿では,QSTに条件付き生成逆ネットワーク(CGAN)を適用する。 CGANフレームワークでは、2つのデュエルニューラルネットワーク、ジェネレータと識別器がデータからマルチモーダルモデルを学ぶ。我々は、任意の標準ニューラルネットワークから物理密度行列への出力変換を可能にするカスタムニューラルネットワーク層でcganを補強する。密度行列を再構築するために、ジェネレータと判別器ネットワークは標準勾配法を用いて互いにデータを訓練する。我々のQST-CGANは、標準最大化法よりもはるかに高速かつ少ないデータで、光量子状態を再構成することを示した。また、QST-CGANは、類似した量子状態で事前学習された場合、発電機ネットワークの単一評価において量子状態を再構築可能であることを示す。 Quantum state tomography (QST) is a challenging task in intermediate-scale quantum devices. Here, we apply conditional generative adversarial networks (CGANs) to QST. In the CGAN framework, two duelling neural networks, a generator and a discriminator, learn multi-modal models from data. We augment a CGAN with custom neural-network layers that enable conversion of output from any standard neural network into a physical density matrix. To reconstruct the density matrix, the generator and discriminator networks train each other on data using standard gradient-based methods. We demonstrate that our QST-CGAN reconstructs optical quantum states with high fidelity orders of magnitude faster, and from less data, than a standard maximum-likelihood method. We also show that the QST-CGAN can reconstruct a quantum state in a single evaluation of the generator network if it has been pre-trained on similar quantum states.	翻訳日:2022-11-02 01:56:31 公開日:2020-12-04
# 降雨発生から降雨除去へ From Rain Generation to Rain Removal ( http://arxiv.org/abs/2008.03580v2 ) ライセンス: Link先を確認	Hong Wang, Zongsheng Yue, Qi Xie, Qian Zhao, Yefeng Zheng, Deyu Meng	(参考訳) シングルイメージ雨除去(SIRR)タスクでは、ディープラーニング(DL)ベースの手法の性能は、主に設計済みのデラミニングモデルとトレーニングデータセットの影響を受けている。現在の最先端技術のほとんどは、より優れた評価結果を得るために強力な深層モデルの構築に重点を置いている。本稿では, 降雨画像のより効率的な合成方法を探ることで, トレーニングデータセットの観点からSIRRタスクの処理を新たに試みる。具体的には,降雨層を発生器としてパラメータ化し,物理的構造的雨因子(例えば方向,スケール,厚さなど)を表す潜在変数として入力する,降雨画像のベイズ生成モデルを構築する。このモデルを解くために,降雨画像の統計的分布をデータ駆動的に近似するために,変分推論フレームワークを用いた。学習したジェネレータでは、既存のベンチマークデータセットを効率的に強化し拡張するために、多種多様な非反復的なトレーニングペアを自動かつ十分に生成することができる。降雨画像の現実性を質的に定量的に評価する。包括的実験により,提案手法は,現在の深層単一画像レーダの流出性能を著しく向上させるだけでなく,sirタスクのための大規模トレーニングサンプルプリコレクションの必要性を大幅に緩和する,複雑な降雨分布を忠実に抽出できることがわかった。 For the single image rain removal (SIRR) task, the performance of deep learning (DL)-based methods is mainly affected by the designed deraining models and training datasets. Most of current state-of-the-art focus on constructing powerful deep models to obtain better deraining results. In this paper, to further improve the deraining performance, we novelly attempt to handle the SIRR task from the perspective of training datasets by exploring a more efficient way to synthesize rainy images. Specifically, we build a full Bayesian generative model for rainy image where the rain layer is parameterized as a generator with the input as some latent variables representing the physical structural rain factors, e.g., direction, scale, and thickness. To solve this model, we employ the variational inference framework to approximate the expected statistical distribution of rainy image in a data-driven manner. With the learned generator, we can automatically and sufficiently generate diverse and non-repetitive training pairs so as to efficiently enrich and augment the existing benchmark datasets. User study qualitatively and quantitatively evaluates the realism of generated rainy images. 
Comprehensive experiments substantiate that the proposed model can faithfully extract the complex rain distribution that not only helps significantly improve the deraining performance of current deep single image derainers, but also largely loosens the requirement of large training sample pre-collection for the SIRR task.	翻訳日:2022-11-01 12:15:52 公開日:2020-12-04
# 可視的人体再同定のためのパラメータ共有探索とヘテロセンターに基づくトリプルト損失 Parameter Sharing Exploration and Hetero-Center based Triplet Loss for Visible-Thermal Person Re-Identification ( http://arxiv.org/abs/2008.06223v2 ) ライセンス: Link先を確認	Haijun Liu, Xiaoheng Tan and Xichuan Zhou	(参考訳) 本稿では、昼間の可視光度と夜間の熱量との一致を目標とする視熱的クロスモーダル人物再識別(VT Re-ID)タスクに焦点を当てた。 VT Re-IDの最も難しい問題である相互モダリティの相違に対処するために、マルチモダリティの人的特徴を学習する。本稿では,既存の文献では十分に研究されていない2ストリームネットワークのパラメータの共有数について検討する。 ResNet50モデルを適切に分割し、モダリティ固有特徴抽出ネットワークとモダリティ共有特徴埋め込みネットワークを構築することにより、VT Re-IDのための2ストリームネットワークのパラメータ共有の効果を実験的に実証する。さらに,パートレベルの人型特徴学習の枠組みでは,アンカーセンターが他のすべてのセンターと比較し,従来の三重項損失の厳密な制約を緩和するために,ヘテロセンタに基づく三重項損失を提案する。非常に単純な方法により,提案手法はVT Re-IDの性能を大幅に向上させることができる。 2つのデータセットに対する実験結果から,提案手法は最先端の手法を大きなマージンで明らかに上回り,特に優れた性能を示すRegDBデータセットでは,ランク1/mAP/mINP 91.05%/83.28%/68.84%であることがわかった。 VT Re-IDの新しいベースラインであり、シンプルだが効果的な戦略である。 This paper focuses on the visible-thermal cross-modality person re-identification (VT Re-ID) task, whose goal is to match person images between the daytime visible modality and the nighttime thermal modality. The two-stream network is usually adopted to address the cross-modality discrepancy, the most challenging problem for VT Re-ID, by learning the multi-modality person features. In this paper, we explore how many parameters of two-stream network should share, which is still not well investigated in the existing literature. By well splitting the ResNet50 model to construct the modality-specific feature extracting network and modality-sharing feature embedding network, we experimentally demonstrate the effect of parameters sharing of two-stream network for VT Re-ID. Moreover, in the framework of part-level person feature learning, we propose the hetero-center based triplet loss to relax the strict constraint of traditional triplet loss through replacing the comparison of anchor to all the other samples by anchor center to all the other centers. 
With the extremely simple means, the proposed method can significantly improve the VT Re-ID performance. The experimental results on two datasets show that our proposed method distinctly outperforms the state-of-the-art methods by large margins, especially on RegDB dataset achieving superior performance, rank1/mAP/mINP 91.05%/83.28%/68.84%. It can be a new baseline for VT Re-ID, with a simple but effective strategy.	翻訳日:2022-10-30 17:19:15 公開日:2020-12-04
# エコー状態ネットワークにおける貯水池方程式の破断対称性 Breaking Symmetries of the Reservoir Equations in Echo State Networks ( http://arxiv.org/abs/2010.07103v2 ) ライセンス: Link先を確認	Joschka Herteux, Christoph Räth	(参考訳) 貯留層計算は非線形時系列の予測に非常に成功したことが繰り返し示されている。しかし、貯水池の適切な設計についてはまだ完全には理解されていない。最も一般的なセットアップは有害な対称性を持ち、ミラーアトラクタと呼ばれるものを予測することに繋がる。分析的に証明します。同様の問題は一般的な状況で発生し、いくつかの設計の成功や失敗を説明するのに使用します。対称性は双曲的接点活性化関数の直接の結果である。さらに、対称性を破る4つの方法は数値的に比較される:出力のバイアス、入力のシフト、読み出しの二次項、偶数と奇数の活性化関数の混合。まず, ミラーアトラクタに対する感受性を試験する。第2に,平均値が0にシフトしたLorenzデータを予測するタスクにおいて,その性能を評価する。短時間の予測は予測地平線で測定され、最大のリャプノフ指数と相関次元は気候を表すために用いられる。最後に、lorenzアトラクタとhalvorsenアトラクタの複合データセットで同じ解析を繰り返す。出力バイアスを除くすべてのメソッドは、入力シフトと二次読み出しによって、全体的なパフォーマンスを最大にする対称性を完全に破ることができることが分かりました。 Reservoir computing has repeatedly been shown to be extremely successful in the prediction of nonlinear time-series. However, there is no complete understanding of the proper design of a reservoir yet. We find that the simplest popular setup has a harmful symmetry, which leads to the prediction of what we call mirror-attractor. We prove this analytically. Similar problems can arise in a general context, and we use them to explain the success or failure of some designs. The symmetry is a direct consequence of the hyperbolic tangent activation function. Further, four ways to break the symmetry are compared numerically: A bias in the output, a shift in the input, a quadratic term in the readout, and a mixture of even and odd activation functions. Firstly, we test their susceptibility to the mirror-attractor. Secondly, we evaluate their performance on the task of predicting Lorenz data with the mean shifted to zero. The short-time prediction is measured with the forecast horizon while the largest Lyapunov exponent and the correlation dimension are used to represent the climate. Finally, the same analysis is repeated on a combined dataset of the Lorenz attractor and the Halvorsen attractor, which we designed to reveal potential problems with symmetry. 
We find that all methods except the output bias are able to fully break the symmetry with input shift and quadratic readout performing the best overall.	翻訳日:2022-10-16 05:00:48 公開日:2020-12-04
# 連続学習によるストリーミンググラフニューラルネットワーク Streaming Graph Neural Networks via Continual Learning ( http://arxiv.org/abs/2009.10951v2 ) ライセンス: Link先を確認	Junshan Wang, Guojie Song, Yi Wu, Liang Wang	(参考訳) グラフニューラルネットワーク(GNN)は様々なアプリケーションで高いパフォーマンスを実現している。実世界では通常、ネットワークデータはストリーミング形式で形成される。ノードの近傍情報を参照するパターンの分布は、時間とともに変化する可能性がある。 GNNモデルは、まだキャプチャできない新しいパターンを学ぶ必要がある。しかし、徐々に学習が進むと、歴史知識が新しく学んだ知識によって上書きされるという破滅的な問題を忘れてしまう。したがって、GNNモデルをトレーニングして新しいパターンを学び、既存のパターンを同時に維持することが重要である。本稿では,連続学習に基づくストリーミングGNNモデルを提案する。まず,情報伝達に基づく新しいパターンを効率的に検出する近似アルゴリズムを設計する。次に,既存のパターン統合のためのデータ再生とモデル正規化の2つの視点を組み合わせる。特に,ノードの階層-重要サンプリング戦略を設計し,GNNパラメータの重み付き正規化項を導出し,より安定性と知識統合の一般化を実現する。本モデルは,実データおよび合成データを用いて評価し,複数のベースラインと比較した。ノード分類の結果,モデルパラメータを効率的に更新でき,モデル再トレーニングに匹敵する性能が得られることがわかった。さらに,合成データに関するケーススタディを行い,モデルの各部分について特定の分析を行い,新しい知識を学習し,異なる視点から既存の知識を維持する能力を示す。 Graph neural networks (GNNs) have achieved strong performance in various applications. In the real world, network data is usually formed in a streaming fashion. The distributions of patterns that refer to neighborhood information of nodes may shift over time. The GNN model needs to learn the new patterns that cannot yet be captured. But learning incrementally leads to the catastrophic forgetting problem that historical knowledge is overwritten by newly learned knowledge. Therefore, it is important to train GNN model to learn new patterns and maintain existing patterns simultaneously, which few works focus on. In this paper, we propose a streaming GNN model based on continual learning so that the model is trained incrementally and up-to-date node representations can be obtained at each time step. Firstly, we design an approximation algorithm to detect new coming patterns efficiently based on information propagation. Secondly, we combine two perspectives of data replaying and model regularization for existing pattern consolidation. 
Specifically, a hierarchy-importance sampling strategy for nodes is designed and a weighted regularization term for GNN parameters is derived, achieving greater stability and generalization of knowledge consolidation. Our model is evaluated on real and synthetic data sets and compared with multiple baselines. The node classification results show that our model can efficiently update model parameters and achieve performance comparable to model retraining. In addition, we conduct a case study on the synthetic data and carry out a specific analysis of each part of our model, illustrating its ability to learn new knowledge and maintain existing knowledge from different perspectives.	翻訳日:2022-10-15 15:43:11 公開日:2020-12-04
# 介入的少数ショット学習 Interventional Few-Shot Learning ( http://arxiv.org/abs/2009.13000v2 ) ライセンス: Link先を確認	Zhongqi Yue and Hanwang Zhang and Qianru Sun and Xian-Sheng Hua	(参考訳) 一般的なFew-Shot Learning(FSL)メソッドでは、見過ごされがちな欠陥が明らかになりました。この発見は、事前学習された知識、サンプルの特徴、ラベルの因果関係に関する構造因果モデル(Structure Causal Model, SCM)という私たちの因果的仮定に根ざしている。そこで我々は,新たなFSLパラダイムであるIFSL(Interventional Few-Shot Learning)を提案する。具体的には、バックドア調整に基づく3つの効果的なIFSLアルゴリズムの実装を開発する。これは本質的に、多ショット学習のSCMに対する因果的介入である。 IFSLの貢献は、既存の微調整およびメタラーニングベースのFSLメソッドに直交しているため、IFSLは、新しい1/5ショットステート・オブ・ザ・アートを、 \textit{mini} ImageNet、 \textit{tiered} ImageNet、およびクロスドメインCUBで達成することで、そのすべてを改善することができる。コードはhttps://github.com/yue-zhongqi/ifslでリリースされる。 We uncover an ever-overlooked deficiency in the prevailing Few-Shot Learning (FSL) methods: the pre-trained knowledge is indeed a confounder that limits the performance. This finding is rooted from our causal assumption: a Structural Causal Model (SCM) for the causalities among the pre-trained knowledge, sample features, and labels. Thanks to it, we propose a novel FSL paradigm: Interventional Few-Shot Learning (IFSL). Specifically, we develop three effective IFSL algorithmic implementations based on the backdoor adjustment, which is essentially a causal intervention towards the SCM of many-shot learning: the upper-bound of FSL in a causal view. It is worth noting that the contribution of IFSL is orthogonal to existing fine-tuning and meta-learning based FSL methods, hence IFSL can improve all of them, achieving a new 1-/5-shot state-of-the-art on \textit{mini}ImageNet, \textit{tiered}ImageNet, and cross-domain CUB. Code is released at https://github.com/yue-zhongqi/ifsl.	翻訳日:2022-10-13 21:14:16 公開日:2020-12-04
# コントラスト学習のためのハード負混合 Hard Negative Mixing for Contrastive Learning ( http://arxiv.org/abs/2010.01028v2 ) ライセンス: Link先を確認	Yannis Kalantidis, Mert Bulent Sariyildiz, Noe Pion, Philippe Weinzaepfel, Diane Larlus	(参考訳) コントラスト学習は、コンピュータビジョンのための自己教師あり学習アプローチの重要な要素となっている。同じイメージの2つの拡張バージョンを互いに近接して埋め込み、異なるイメージの埋め込みを分離することで、高度に転送可能な視覚表現を訓練することができる。最近の研究で明らかになったように、重いデータ拡張と大きな負のセットは、どちらもそのような表現を学ぶ上で不可欠である。同時に、画像または特徴レベルのデータ混合戦略は、新しい例を合成することによって教師付き学習と半教師付き学習の両方を改善し、ネットワークにより堅牢な特徴を学習させる。本稿では,対照学習の重要な側面,すなわちハードネガティブの影響は,これまで無視されてきたと論じる。より意味のある負のサンプルを得るために、現在のトップコントラストの自己教師型学習アプローチは、バッチサイズを大幅に増加させるか、非常に大きなメモリバンクを保持するかのいずれかである。ですから私たちは、トップパフォーマンスのフレームワークを深く掘り下げて、より優れた、より高速な学習を促進するために、より厳しいネガティブが必要な証拠を示します。これらの観察に基づいて,データ混合の成功を動機とし,最小の計算オーバーヘッドでオンザフライで計算可能な特徴量レベルでのハードネガティブ混合戦略を提案する。我々は,線形分類,オブジェクト検出,インスタンスセグメンテーションに対するアプローチを徹底的に改善し,その手法を用いることで,最先端の自己教師型学習法で学習した視覚表現の質が向上することを示す。 Contrastive learning has become a key component of self-supervised learning approaches for computer vision. By learning to embed two augmented versions of the same image close to each other and to push the embeddings of different images apart, one can train highly transferable visual representations. As revealed by recent studies, heavy data augmentation and large sets of negatives are both crucial in learning such representations. At the same time, data mixing strategies either at the image or the feature level improve both supervised and semi-supervised learning by synthesizing novel examples, forcing networks to learn more robust features. In this paper, we argue that an important aspect of contrastive learning, i.e., the effect of hard negatives, has so far been neglected. To get more meaningful negative samples, current top contrastive self-supervised learning approaches either substantially increase the batch sizes, or keep very large memory banks; increasing the memory size, however, leads to diminishing returns in terms of performance. 
We therefore start by delving deeper into a top-performing framework and show evidence that harder negatives are needed to facilitate better and faster learning. Based on these observations, and motivated by the success of data mixing, we propose hard negative mixing strategies at the feature level, that can be computed on-the-fly with a minimal computational overhead. We exhaustively ablate our approach on linear classification, object detection and instance segmentation and show that employing our hard negative mixing procedure improves the quality of visual representations learned by a state-of-the-art self-supervised learning method.	翻訳日:2022-10-12 01:07:58 公開日:2020-12-04
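The feature-level mixing idea in the abstract above can be illustrated with a minimal sketch. The abstract does not give the exact recipe, so the specifics here are assumptions: the function names `l2_normalize` and `mix_hard_negatives`, the choice to rank negatives by dot-product similarity to the query, and the use of random convex combinations of pairs of the hardest negatives are all hypothetical illustrations rather than the authors' implementation.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit L2 norm (eps guards against division by zero)."""
    norms = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norms, eps)


def mix_hard_negatives(query, negatives, num_hard=16, num_synthetic=8, rng=None):
    """Synthesize extra negatives by mixing the hardest existing ones.

    query:     (d,) unit-norm embedding of the anchor image.
    negatives: (m, d) unit-norm embeddings of negative samples.
    Returns a (num_synthetic, d) array of unit-norm synthetic negatives.
    All hyper-parameter values are illustrative assumptions.
    """
    rng = np.random.default_rng() if rng is None else rng
    # "Hard" negatives are assumed to be those most similar to the query.
    sims = negatives @ query
    hard = negatives[np.argsort(-sims)[:num_hard]]
    # Pick random pairs of hard negatives and mix them convexly.
    i = rng.integers(0, len(hard), size=num_synthetic)
    j = rng.integers(0, len(hard), size=num_synthetic)
    alpha = rng.uniform(0.0, 1.0, size=(num_synthetic, 1))
    mixed = alpha * hard[i] + (1.0 - alpha) * hard[j]
    # Re-normalize so the synthetic points lie on the unit sphere again.
    return l2_normalize(mixed)
```

In a contrastive setup, the returned vectors would simply be appended to the negative set before computing the loss for that query; because they are built from embeddings that already exist, the extra cost is a sort and a few vector operations.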
# 変分動的混合 Variational Dynamic Mixtures ( http://arxiv.org/abs/2010.10403v2 ) ライセンス: Link先を確認	Chen Qiu, Stephan Mandt, Maja Rudolph	(参考訳) 深い確率的時系列予測モデルが機械学習の不可欠な部分となっている。いくつかの強力な生成モデルが提案されているが、それらの関連する推論モデルはしばしば制限されすぎており、生成モデルがモード平均ダイナミクスを予測している証拠を提供する。多くの実世界のシーケンスは高度にマルチモーダルであり、それらの平均的なダイナミクスは非物理的である(例えば、予測されたタクシー軌道は道路地図上の建物を通り抜けるかもしれない)。マルチモダリティをよりよく捉えるために、変分動的混合(VDM: variational dynamic mixtures)を開発した。それぞれの時間ステップにおけるVDM近似は混合密度ネットワークであり、そのパラメータは再帰的なアーキテクチャを通して複数のサンプルを伝播することに由来する。この結果, マルチモーダル後部近似が得られた。実証実験により、VDMは、異なるドメインの高度マルチモーダルデータセットにおいて競合するアプローチよりも優れていることを示す。 Deep probabilistic time series forecasting models have become an integral part of machine learning. While several powerful generative models have been proposed, we provide evidence that their associated inference models are oftentimes too limited and cause the generative model to predict mode-averaged dynamics. Mode averaging is problematic since many real-world sequences are highly multi-modal, and their averaged dynamics are unphysical (e.g., predicted taxi trajectories might run through buildings on the street map). To better capture multi-modality, we develop variational dynamic mixtures (VDM): a new variational family to infer sequential latent variables. The VDM approximate posterior at each time step is a mixture density network, whose parameters come from propagating multiple samples through a recurrent architecture. This results in an expressive multi-modal posterior approximation. In an empirical study, we show that VDM outperforms competing approaches on highly multi-modal datasets from different domains.	翻訳日:2022-10-05 07:20:56 公開日:2020-12-04
# Digital Twins: 最先端のアート理論と実践,課題,オープンリサーチに関する質問 Digital Twins: State of the Art Theory and Practice, Challenges, and Open Research Questions ( http://arxiv.org/abs/2011.02833v3 ) ライセンス: Link先を確認	Angira Sharma, Edward Kosasih, Jie Zhang, Alexandra Brintrup, Anisoara Calinescu	(参考訳) Digital Twinは10年以上前に、リアルタイムモニタリング、シミュレーション、予測などのメリットを享受して、革新的なオールエンコンパスツールとして紹介された。しかし、デジタル双生児(DT)の理論的枠組みと実践的実装は、まだこのビジョンには程遠い。実装は成功したが、十分な実装の詳細は公開されていないため、それらの効果を評価し、比較し、DT方法論を共同で進めることは困難である。この研究は、様々なDT機能と現在のアプローチ、デジタルツインの実装と導入の遅れの背景にある欠点と理由を探求する。機械学習、モノのインターネット、ビッグデータの進歩は、そのリアルタイム監視と予測特性に関するdtの改善に大きく貢献している。この進歩と個々の企業ベースの取り組みにもかかわらず、この分野にはある種の研究ギャップがあり、この概念の普及が遅れている。この遅延の主な理由は、共通参照フレームワークの欠如、ドメイン依存、共有データのセキュリティ懸念、他の技術へのデジタルツインの依存、定量的メトリクスの欠如である。我々は、普遍参照フレームワークに必要なデジタル双生児の必要な構成要素を定義し、シミュレーションや自律システムといった同様の概念と比較して、その一意性を概念として検証する。この研究は、異なるドメインにおけるデジタルツインアプリケーションと、その中の機械学習とビッグデータの現状をさらに評価する。これにより、デジタル双生児の理論と実践をよりよく理解し前進させ、新しい研究課題に答え、特定することができる。 Digital Twin was introduced over a decade ago, as an innovative all-encompassing tool, with perceived benefits including real-time monitoring, simulation and forecasting. However, the theoretical framework and practical implementations of digital twins (DT) are still far from this vision. Although successful implementations exist, sufficient implementation details are not publicly available, therefore it is difficult to assess their effectiveness, draw comparisons and jointly advance the DT methodology. This work explores the various DT features and current approaches, the shortcomings and reasons behind the delay in the implementation and adoption of digital twin. Advancements in machine learning, internet of things and big data have contributed hugely to the improvements in DT with regards to its real-time monitoring and forecasting properties. Despite this progress and individual company-based efforts, certain research gaps exist in the field, which have caused delay in the widespread adoption of this concept. 
We reviewed relevant works and identified that the major reasons for this delay are the lack of a universal reference framework, domain dependence, security concerns of shared data, reliance of digital twin on other technologies, and lack of quantitative metrics. We define the necessary components of a digital twin required for a universal reference framework, which also validate its uniqueness as a concept compared to similar concepts like simulation, autonomous systems, etc. This work further assesses the digital twin applications in different domains and the current state of machine learning and big data in it. It thus answers and identifies novel research questions, both of which will help to better understand and advance the theory and practice of digital twins.	翻訳日:2022-09-30 13:07:07 公開日:2020-12-04
# 複数の自己教師付き補助タスクを持つグラフベースニューラルネットワークモデル Graph-Based Neural Network Models with Multiple Self-Supervised Auxiliary Tasks ( http://arxiv.org/abs/2011.07267v2 ) ライセンス: Link先を確認	Franco Manessi, Alessandro Rozza	(参考訳) ニューラルネットワークは大量のラベルのないデータから堅牢な表現を学習できるため、自己教師付き学習が注目されている。さらに、マルチタスク学習は、関連するタスクを同時にトレーニングするネットワークによる表現学習をさらに改善し、パフォーマンスが大幅に向上する。本稿では,グラフベースのニューラルネットワークモデルをマルチタスクで学習するための3つの補助タスクを提案する。グラフ畳み込みネットワークは構造化データポイント間の関係を捉えるための最も有望な手法であるので,標準的な半教師付きグラフ分類タスクにおける競合結果を達成するための構築ブロックとして利用する。 Self-supervised learning is currently gaining a lot of attention, as it allows neural networks to learn robust representations from large quantities of unlabeled data. Additionally, multi-task learning can further improve representation learning by training networks simultaneously on related tasks, leading to significant performance improvements. In this paper, we propose three novel self-supervised auxiliary tasks to train graph-based neural network models in a multi-task fashion. Since Graph Convolutional Networks are among the most promising approaches for capturing relationships among structured data points, we use them as a building block to achieve competitive results on standard semi-supervised graph classification tasks.	翻訳日:2022-09-25 13:46:06 公開日:2020-12-04
# ボリュームセンサを用いた協調認識と知覚のための特徴共有と統合 Feature Sharing and Integration for Cooperative Cognition and Perception with Volumetric Sensors ( http://arxiv.org/abs/2011.08317v3 ) ライセンス: Link先を確認	Ehsan Emad Marvasti, Arash Raftari, Amir Emad Marvasti, Yaser P.Fallah, Rui Guo, Hongsheng Lu	(参考訳) 近年の計算・通信システムの進歩により、高性能ニューラルネットワークと高速無線車両通信ネットワークが導入されている。その結果、協調的知覚や認知といった新しい技術が登場し、部分的に遮蔽されたターゲットの検出とセンシング範囲の拡大のためのソリューションを提供することで、感覚デバイスの固有の制限に対処している。しかし、信頼できる協調認識システムを設計するには、限られたネットワークリソースと異なるソースが共有するデータ間の不一致に起因する課題に対処する必要がある。本稿では,異なる協調認識技術の要件,限界,性能について検討し,Deep Feature Sharing(DFS)の概念の詳細な分析を行う。我々は,異なる協調物体検出設計を探求し,その性能を平均精度で評価する。実験にはVolonyデータセットを使用します。その結果,DFS法はGPSノイズによる局所化誤差にかなり敏感であることがわかった。さらに,より協調的な参加者の追加によるDFS手法の検出ゲインは生の情報共有技術に匹敵するものであり,DFSは通信要求を満たすための設計の柔軟性を実現する。 The recent advancement in computational and communication systems has led to the introduction of high-performing neural networks and high-speed wireless vehicular communication networks. As a result, new technologies such as cooperative perception and cognition have emerged, addressing the inherent limitations of sensory devices by providing solutions for the detection of partially occluded targets and expanding the sensing range. However, designing a reliable cooperative cognition or perception system requires addressing the challenges caused by limited network resources and discrepancies between the data shared by different sources. In this paper, we examine the requirements, limitations, and performance of different cooperative perception techniques, and present an in-depth analysis of the notion of Deep Feature Sharing (DFS). We explore different cooperative object detection designs and evaluate their performance in terms of average precision. We use the Volony dataset for our experimental study. The results confirm that the DFS methods are significantly less sensitive to the localization error caused by GPS noise. 
Furthermore, the results attest that the detection gain of DFS methods from adding more cooperative participants to the scene is comparable to that of raw information sharing techniques, while DFS allows design flexibility toward satisfying communication requirements.	翻訳日:2022-09-24 23:21:18 公開日:2020-12-04
# FROST: より高速でロバストなワンショットセミ教師トレーニング FROST: Faster and more Robust One-shot Semi-supervised Training ( http://arxiv.org/abs/2011.09471v4 ) ライセンス: Link先を確認	Helena E. Liu and Leslie N. Smith	(参考訳) 半教師付き一発学習の最近の進歩は、新しい応用の深層学習の障壁を低くしている。しかしながら、半教師付き学習の最先端はトレーニングが遅く、ラベル付きデータとハイパーパラメータの値の選択に敏感である。本稿では,一対一の半教師付き学習手法を提案する。具体的には,半教師付き学習と,自己学習の1段階の単一ネットワークバージョンを組み合わせることで,より高速に学習し,ラベル付きサンプルの選択やハイパーパラメータの変更に対して頑健であることを示す。実験では,ラベルなしデータの構成が不明な場合,すなわちラベルなしデータが各クラスの不等数を含み,トレーニングクラスに属さない分布外例を含む場合において,frostがうまく機能することを示す。ハイパフォーマンス、トレーニング速度、ハイパーパラメータに対する感度はFROSTを一発半教師あり訓練の最も実用的な方法である。私たちのコードはhttps://github.com/helenaeliu/frostで利用可能です。 Recent advances in one-shot semi-supervised learning have lowered the barrier for deep learning of new applications. However, the state-of-the-art for semi-supervised learning is slow to train and the performance is sensitive to the choices of the labeled data and hyper-parameter values. In this paper, we present a one-shot semi-supervised learning method that trains up to an order of magnitude faster and is more robust than state-of-the-art methods. Specifically, we show that by combining semi-supervised learning with a one-stage, single network version of self-training, our FROST methodology trains faster and is more robust to choices for the labeled samples and changes in hyper-parameters. Our experiments demonstrate FROST's capability to perform well when the composition of the unlabeled data is unknown; that is when the unlabeled data contain unequal numbers of each class and can contain out-of-distribution examples that don't belong to any of the training classes. High performance, speed of training, and insensitivity to hyper-parameters make FROST the most practical method for one-shot semi-supervised training. Our code is available at https://github.com/HelenaELiu/FROST.	翻訳日:2022-09-24 03:20:03 公開日:2020-12-04
# 適合性問題に関するカテゴリー探索データ分析 Categorical exploratory data analysis on goodness-of-fit issues ( http://arxiv.org/abs/2011.09682v2 ) ライセンス: Link先を確認	Sabrina Enriquez, Fushing Hsieh	(参考訳) ジョージ・ボックス(george box)は、データ分析において、特に現実世界のデータ分析において、引き続き真であり続けるならば、この知恵を可視で説明可能なデータ駆動型パターンで注釈すべきである。このようなアノテーションは、データ分析アプローチとしての統計モデリングの限界だけでなく、妥当性にも価値ある光を当てることができる。実データを潜在的に到達不能あるいは非現実的な理論構造に保持することを避けるため、我々はカテゴリ探索データ分析(ceda)と呼ばれるデータ分析パラダイムを活用すべきである。本提案のメリットを,適合性の観点から,実世界の2つのデータセットを用いて説明する。どちらのデータセットでも、通常の分布のベル形状は一見するとかなりよく合っているように見える。 CEDAを適用して、各データがモデル形状に適合するか、どのようにずれるのかを、いくつかの重要な分布面を通して明らかにする。また、CEDA は木に基づく p-値のバージョンを利用できることを実証し、従来の統計的アプローチに基づく p-値と比較する。データ分析とともに、データサイエンス教育におけるデータ分析の第一手段としてcedaを使用する利点を照らし出すために、グラフィックディスプレイの作成に計算の努力を注ぐ。 If the aphorism "All models are wrong"- George Box, continues to be true in data analysis, particularly when analyzing real-world data, then we should annotate this wisdom with visible and explainable data-driven patterns. Such annotations can critically shed invaluable light on validity as well as limitations of statistical modeling as a data analysis approach. In an effort to avoid holding our real data to potentially unattainable or even unrealistic theoretical structures, we propose to utilize the data analysis paradigm called Categorical Exploratory Data Analysis (CEDA). We illustrate the merits of this proposal with two real-world data sets from the perspective of goodness-of-fit. In both data sets, the Normal distribution's bell shape seemingly fits rather well by first glance. We apply CEDA to bring out where and how each data fits or deviates from the model shape via several important distributional aspects. We also demonstrate that CEDA affords a version of tree-based p-value, and compare it with p-values based on traditional statistical approaches. Along our data analysis, we invest computational efforts in making graphic display to illuminate the advantages of using CEDA as one primary way of data analysis in Data Science education.	
翻訳日:2022-09-23 20:53:11 公開日:2020-12-04
# 四面体キラル性を有する分子のメッセージパッシングネットワーク Message Passing Networks for Molecules with Tetrahedral Chirality ( http://arxiv.org/abs/2012.00094v2 ) ライセンス: Link先を確認	Lagnajit Pattanaik, Octavian-Eugen Ganea, Ian Coley, Klavs F. Jensen, William H. Green, Connor W. Coley	(参考訳) 同一のグラフ接続を持つ分子でも、空間構造特性である立体化学を示す場合、物理的および生物学的性質が異なることがある。しかし、分子構造から構造とプロパティの関係を学ぶために設計された現代のニューラルネットワークは、分子をグラフ構造データとして扱うため、立体化学に不変である。そこで我々は, 四面体キラル性をもつ分子の性質を学習するための, メッセージパッシングニューラルネットワークのための2つのカスタムアグリゲーション関数を開発した。合成データと新規なタンパク質-リガンドドッキングデータセットの性能を薬物発見との関連性で評価した。その結果、ベースラインの総和アグリゲータよりも微妙な改善が見られ、さらなるアーキテクチャ開発の機会が浮かび上がっている。 Molecules with identical graph connectivity can exhibit different physical and biological properties if they exhibit stereochemistry, a spatial structural characteristic. However, modern neural architectures designed for learning structure-property relationships from molecular structures treat molecules as graph-structured data and therefore are invariant to stereochemistry. Here, we develop two custom aggregation functions for message passing neural networks to learn properties of molecules with tetrahedral chirality, one common form of stereochemistry. We evaluate performance on synthetic data as well as a newly-proposed protein-ligand docking dataset with relevance to drug discovery. Results show modest improvements over a baseline sum aggregator, highlighting opportunities for further architecture development.	翻訳日:2022-09-21 14:21:32 公開日:2020-12-04
# (参考訳) ディープラーニングのスケールダウン Scaling down Deep Learning ( http://arxiv.org/abs/2011.14439v3 ) ライセンス: CC BY 4.0	Sam Greydanus	(参考訳) 深層学習モデルは商業的・政治的に関係しているが、その訓練と運用の多くの側面はいまだに理解されていない。これは"深層学習の科学"プロジェクトへの関心を呼び起こし、その多くが大規模に実行され、膨大な時間、お金、電気を必要とする。しかし、この研究はどの程度大規模に行われる必要があるのか? 本稿では,従来のディープラーニングベンチマークに代わる最小限,低メモリ,低スループットのMNIST-1Dを提案する。トレーニングの例はMNISTの例の20倍小さいが、線形モデル、非線形モデル、畳み込みモデル、それぞれ32、68、94%の精度で区別する(これらのモデルはMNISTで94、99+、99+%を得る)。次に,宝くじの空間的インダクティブバイアスの測定,ディープダブル降下の観察,アクティベーション関数のメタラーニングといったユースケースを示す。 Though deep learning models have taken on commercial and political relevance, many aspects of their training and operation remain poorly understood. This has sparked interest in "science of deep learning" projects, many of which are run at scale and require enormous amounts of time, money, and electricity. But how much of this research really needs to occur at scale? In this paper, we introduce MNIST-1D: a minimalist, low-memory, and low-compute alternative to classic deep learning benchmarks. The training examples are 20 times smaller than MNIST examples yet they differentiate more clearly between linear, nonlinear, and convolutional models which attain 32, 68, and 94% accuracy respectively (these models obtain 94, 99+, and 99+% on MNIST). Then we present example use cases which include measuring the spatial inductive biases of lottery tickets, observing deep double descent, and metalearning an activation function.	翻訳日:2021-06-07 09:40:46 公開日:2020-12-04
# (参考訳) 遅延とエネルギー制約を考慮した非同期モバイルエッジ学習のためのタスク割当 Task Allocation for Asynchronous Mobile Edge Learning with Delay and Energy Constraints ( http://arxiv.org/abs/2012.00143v2 ) ライセンス: CC BY 4.0	Umair Mohammad, Sameh Sorour, Mohamed Hefeida	(参考訳) 本稿では、リソース制約された無線エッジネットワークを介して接続された複数のエッジノードまたは学習者間で非同期に機械学習モデルをトレーニングするための最適なタスク割り当てスキームを設計し、"モバイルエッジラーニング(MEL)"のパラダイムを拡張した。最適化は、各学習者に割り当てられたタスクの一部が、所定の大域的遅延制約と局所的な最大エネルギー消費限界内で完了するように行われる。消費される時間とエネルギーは、学習者の不均一なコミュニケーションと計算能力に直接関係している。提案するモデルは異種性認識(HA)である。結果の最適化はNP-hard quadratically-constrained integer linear program (QCILP) であるので、緩和された同期問題の解を用いて2段階のsuggest-and-improve (SAI) 解を提案し、非同期問題の解を求める。提案したHA非同期(HA-Asyn)アプローチは、HA同期(HA-Sync)スキームとHU同値バッチ割り当てスキームと比較する。 20名の学習者が様々な完了時間とエネルギー消費制約をテストした結果,提案手法はHU同期/非同期 (HU-Sync/Asyn) 法よりも優れており,HA-Sync法と比較して最大25%の利得が得られることがわかった。 This paper extends the paradigm of "mobile edge learning (MEL)" by designing an optimal task allocation scheme for training a machine learning model in an asynchronous manner across multiple edge nodes or learners connected via a resource-constrained wireless edge network. The optimization is done such that the portion of the task allotted to each learner is completed within a given global delay constraint and a local maximum energy consumption limit. The time and energy consumed are related directly to the heterogeneous communication and computational capabilities of the learners; i.e. the proposed model is heterogeneity aware (HA). Because the resulting optimization is an NP-hard quadratically-constrained integer linear program (QCILP), a two-step suggest-and-improve (SAI) solution is proposed based on using the solution of the relaxed synchronous problem to obtain the solution to the asynchronous problem. The proposed HA asynchronous (HA-Asyn) approach is compared against the HA synchronous (HA-Sync) scheme and the heterogeneity unaware (HU) equal batch allocation scheme. 
Results from a system of 20 learners tested for various completion time and energy consumption constraints show that the proposed HA-Asyn method works better than the HU synchronous/asynchronous (HU-Sync/Asyn) approach and can provide gains of up to 25% compared to the HA-Sync scheme.	翻訳日:2021-06-06 16:42:53 公開日:2020-12-04
# 顔の超解像に対する空間的注意の学習 Learning Spatial Attention for Face Super-Resolution ( http://arxiv.org/abs/2012.01211v2 ) ライセンス: Link先を確認	Chaofeng Chen, Dihong Gong, Hao Wang, Zhifeng Li, Kwan-Yee K. Wong	(参考訳) 一般画像超解像技術は、低解像度の顔画像に適用する場合、詳細な顔構造を復元することが困難である。近年,顔画像に適した深層学習手法は,顔解析やランドマーク予測などのタスクを共同で訓練することで,性能の向上を実現している。しかし、マルチタスク学習には追加のラベル付きデータが必要である。さらに、既存の作品の多くは比較的低解像度の顔画像しか生成できない(例えば、128\times128$)ため、その用途は限られている。本稿では,顔超解像のための新しい顔注意ユニット (FAU) 上に構築したSPatial Attention Residual Network (SPARNet) を紹介する。具体的には,バニラ残留ブロックに空間的注意機構を導入する。これにより、畳み込み層は、キーとなる顔構造に関連する機能を適応的にブートストラップし、より機能豊富な領域に注意を払わないことができる。これにより、キーとなる顔構造が顔画像のごく一部を占めるだけで、トレーニングはより効果的で効率的なものになる。注意マップの可視化は、非常に低解像度の顔であっても、我々の空間的注意ネットワークがキーフェイス構造をうまく捉えることができることを示している。各種指標(psnr, ssim, アイデンティティ類似性, ランドマーク検出など)の定量的比較により, 現状よりも優れた手法が得られた。さらにSPARNetをSPARNetHDと呼ばれるマルチスケールの識別器で拡張し、高解像度な結果(512\times512$)を生成する。合成データを用いて訓練したSPARNetHDは、合成劣化顔画像に対して高品質で高解像度な出力を生成するだけでなく、現実の低画質顔画像に対して優れた一般化能力を示すことを示す。 General image super-resolution techniques have difficulties in recovering detailed face structures when applying to low resolution face images. Recent deep learning based methods tailored for face images have achieved improved performance by jointly trained with additional task such as face parsing and landmark prediction. However, multi-task learning requires extra manually labeled data. Besides, most of the existing works can only generate relatively low resolution face images (e.g., $128\times128$), and their applications are therefore limited. In this paper, we introduce a novel SPatial Attention Residual Network (SPARNet) built on our newly proposed Face Attention Units (FAUs) for face super-resolution. Specifically, we introduce a spatial attention mechanism to the vanilla residual blocks. This enables the convolutional layers to adaptively bootstrap features related to the key face structures and pay less attention to those less feature-rich regions. 
This makes the training more effective and efficient as the key face structures only account for a very small portion of the face image. Visualization of the attention maps shows that our spatial attention network can capture the key face structures well even for very low resolution faces (e.g., $16\times16$). Quantitative comparisons on various metrics (including PSNR, SSIM, identity similarity, and landmark detection) demonstrate the superiority of our method over the current state of the art. We further extend SPARNet with multi-scale discriminators, named SPARNetHD, to produce high resolution results (i.e., $512\times512$). We show that SPARNetHD trained with synthetic data can not only produce high quality and high resolution outputs for synthetically degraded face images, but also generalize well to real world low quality face images.	翻訳日:2021-05-25 03:57:35 公開日:2020-12-04
# rotnet:畳み込みニューラルネットワークを用いた恒星回転周期の高速かつスケーラブルな推定 RotNet: Fast and Scalable Estimation of Stellar Rotation Periods Using Convolutional Neural Networks ( http://arxiv.org/abs/2012.01985v2 ) ライセンス: Link先を確認	J. Emmanuel Johnson, Sairam Sundaresan, Tansu Daylan, Lisseth Gavilan, Daniel K. Giles, Stela Ishitani Silva, Anna Jungbluth, Brett Morris, Andr\'es Mu\~noz-Jaramillo	(参考訳) 恒星の磁気活動は、望遠鏡が観測する明るさを調節する表面の暗い斑点として現れる。これらの光度曲線は恒星の回転に関する重要な情報を含んでいる。しかしながら、回転周期の正確な推定は、基底真理情報、ノイズデータ、そして縮退解につながる大きなパラメータ空間のために計算的に高価である。深層学習のパワーを活かし,ケプラー光曲線からの恒星回転周期の後退に畳み込みニューラルネットワークを応用した。光曲線の画像変換のための幾何学保存時系列は、転送学習によって訓練されたResNet-18アーキテクチャへの入力として機能する。 mcquillan catalog of published rotation periodsはansatz to groundtruthとして使われている。我々は,この手法の性能を,ランダムフォレスト回帰器,1次元CNN,自動相関関数(ACF)に対してベンチマークし,回転周期を推定する。入力を少ないデータポイント(1k)に制限したものの、モデルはより正確な結果をもたらし、同じ数のデータポイントでacfが動作し、acfが65kのデータポイントで実行するよりも10000倍高速に動作します。最小限の機能エンジニアリングだけで、我々のアプローチは印象的な精度を持ち、より大規模な恒星パラメータの回帰にディープラーニングの適用を動機付けます。 Magnetic activity in stars manifests as dark spots on their surfaces that modulate the brightness observed by telescopes. These light curves contain important information on stellar rotation. However, the accurate estimation of rotation periods is computationally expensive due to scarce ground truth information, noisy data, and large parameter spaces that lead to degenerate solutions. We harness the power of deep learning and successfully apply Convolutional Neural Networks to regress stellar rotation periods from Kepler light curves. Geometry-preserving time-series to image transformations of the light curves serve as inputs to a ResNet-18 based architecture which is trained through transfer learning. The McQuillan catalog of published rotation periods is used as ansatz to groundtruth. We benchmark the performance of our method against a random forest regressor, a 1D CNN, and the Auto-Correlation Function (ACF) - the current standard to estimate rotation periods. 
Despite limiting our input to fewer data points (1k), our model yields more accurate results and runs 350 times faster than ACF on the same number of data points and 10,000 times faster than ACF on 65k data points. With only minimal feature engineering, our approach achieves impressive accuracy, motivating the application of deep learning to regress stellar parameters on an even larger scale.	翻訳日:2021-05-25 03:43:01 公開日:2020-12-04
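The abstract above benchmarks against the Auto-Correlation Function (ACF) without describing it. For orientation, here is a minimal sketch of how an ACF-based period estimate is commonly formed: compute the autocorrelation of the mean-subtracted light curve and report the lag of its first positive local maximum. This is an assumed, simplified baseline with hypothetical function names (`autocorrelation`, `estimate_period`); it is not the pipeline used in the paper, and practical implementations typically also smooth the ACF and handle gaps in the data.

```python
import numpy as np


def autocorrelation(flux):
    """Normalized autocorrelation of a 1-D series for lags 0..n-1."""
    x = np.asarray(flux, dtype=float)
    x = x - x.mean()
    full = np.correlate(x, x, mode="full")
    acf = full[len(x) - 1:]          # keep non-negative lags only
    if acf[0] == 0.0:                # constant series: no signal to correlate
        return np.zeros_like(acf)
    return acf / acf[0]


def estimate_period(flux, cadence=1.0):
    """Return the lag (in time units) of the first positive ACF peak.

    cadence is the sampling interval; evenly spaced samples are assumed.
    Returns None when no peak is found.
    """
    acf = autocorrelation(flux)
    for lag in range(1, len(acf) - 1):
        is_peak = acf[lag] > acf[lag - 1] and acf[lag] >= acf[lag + 1]
        if is_peak and acf[lag] > 0.0:
            return lag * cadence
    return None
```

The direct `np.correlate` call costs on the order of n² operations for n samples, so a simple estimator of this kind slows down quickly as light curves get longer.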
# スパースイジングモデルのサンプル効率l0-l2制約構造学習 Sample-efficient L0-L2 constrained structure learning of sparse Ising models ( http://arxiv.org/abs/2012.01744v2 ) ライセンス: Link先を確認	Antoine Dedieu, Miguel L\'azaro-Gredilla, Dileep George	(参考訳) スパースイジングモデルの基盤となるグラフを$n$ i.i.dから$p$ノードで学習する問題を考察する。サンプル最新の最も優れた手法は、経験的損失(ロジスティック回帰損失または相互作用スクリーニング損失)と正規化器(L1ペナルティまたはL1制約)を組み合わせることである。これにより、グラフの各ノードごとに別々に解くことができる凸問題が発生する。本研究では, 濃度制約 L0 ノルムを利用して, 空間性を適切に誘導し, さらに L2 ノルムと組み合わせて非零係数をモデル化する。本稿では,論文で研究されているグラフトポロジの貧弱度と完全回復率の急激な相転移をL1系と比較することにより,理論上, (a) 回復保証のための新しい最先端の上限に達し, (b) 実験的に, サンプルの複雑さの向上を図っている。 We consider the problem of learning the underlying graph of a sparse Ising model with $p$ nodes from $n$ i.i.d. samples. The most recent and best performing approaches combine an empirical loss (the logistic regression loss or the interaction screening loss) with a regularizer (an L1 penalty or an L1 constraint). This results in a convex problem that can be solved separately for each node of the graph. In this work, we leverage the cardinality constraint L0 norm, which is known to properly induce sparsity, and further combine it with an L2 norm to better model the non-zero coefficients. We show that our proposed estimators achieve an improved sample complexity, both (a) theoretically -- by reaching new state-of-the-art upper bounds for recovery guarantees -- and (b) empirically -- by showing sharper phase transitions between poor and full recovery for graph topologies studied in the literature -- when compared to their L1-based counterparts.	翻訳日:2021-05-23 15:08:53 公開日:2020-12-04
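As a rough illustration of the per-node formulation described above, the sketch below fits a logistic regression for one node of the graph under an L0 (cardinality) and an L2 (norm-ball) constraint. The abstract does not state which solver the authors use; projected gradient descent with hard thresholding is swapped in here purely as a simple stand-in, and the names `project_l0_l2` and `fit_node`, as well as the step size, radius, and iteration count, are hypothetical choices.

```python
import numpy as np


def project_l0_l2(w, k, radius):
    """Keep the k largest-magnitude entries, then shrink into the L2 ball."""
    out = np.zeros_like(w)
    keep = np.argsort(-np.abs(w))[:k]
    out[keep] = w[keep]
    norm = np.linalg.norm(out)
    if norm > radius:
        out *= radius / norm
    return out


def fit_node(X, y, k, radius=5.0, step=0.1, iters=500):
    """Sparse logistic regression of one spin on all the others.

    X: (n, p-1) array with entries in {-1, +1} (the other nodes).
    y: (n,) array with entries in {-1, +1} (the node being regressed).
    Returns a weight vector with at most k non-zeros and L2 norm <= radius;
    its support is the estimated neighborhood of the node.
    """
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(iters):
        margins = y * (X @ w)
        # Gradient of the mean logistic loss log(1 + exp(-margin)).
        grad = -(X * (y / (1.0 + np.exp(margins)))[:, None]).mean(axis=0)
        w = project_l0_l2(w - step * grad, k, radius)
    return w
```

Running `fit_node` once per node and collecting the supports gives a candidate edge set; the two directions of an edge can disagree, so some symmetrization rule would still be needed, which this sketch leaves out.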
# 単発経路統合パンオプティカルセグメンテーション Single-shot Path Integrated Panoptic Segmentation ( http://arxiv.org/abs/2012.01632v2 ) ライセンス: Link先を確認	Sukjun Hwang, Seoung Wug Oh, Seon Joo Kim	(参考訳) インスタンスのセグメンテーションとセマンティックセグメンテーションを統一する新しいタスクであるpanoptic segmentationが最近注目を集めている。しかしながら、従来の手法のほとんどは、指定された分割タスクに特化した経路ごとに複数の経路から構成される。本稿では,実行フローを統合することで,単ショットでのパノプティカルセグメンテーションを解決することを提案する。統合された経路では、panoptic-featureと呼ばれる統合機能マップが生成され、物と物の両方の情報が含まれている。パノプティカル・フィーチャーは、同じインスタンスに属するクラスタピクセルを誘導し、異なるクラスのオブジェクトを区別する補助的な問題によってより洗練される。各フィルタが物または物を表す畳み込みフィルタのコレクションは、一度にパンオプティカル機能に適用され、シングルショットのパンオプティカルセグメンテーションが実現される。トップダウンとボトムアップの両方のアプローチの利点を生かして、SPINetと呼ばれる手法は、COCOとCityscapesという主要な汎視的セグメンテーションベンチマークにおいて、高い効率と精度を享受する。 Panoptic segmentation, which is a novel task of unifying instance segmentation and semantic segmentation, has attracted a lot of attention lately. However, most of the previous methods are composed of multiple pathways with each pathway specialized to a designated segmentation task. In this paper, we propose to resolve panoptic segmentation in single-shot by integrating the execution flows. With the integrated pathway, a unified feature map called Panoptic-Feature is generated, which includes the information of both things and stuffs. Panoptic-Feature becomes more sophisticated by auxiliary problems that guide to cluster pixels that belong to the same instance and differentiate between objects of different classes. A collection of convolutional filters, where each filter represents either a thing or stuff, is applied to Panoptic-Feature at once, materializing the single-shot panoptic segmentation. Taking the advantages of both top-down and bottom-up approaches, our method, named SPINet, enjoys high efficiency and accuracy on major panoptic segmentation benchmarks: COCO and Cityscapes.	翻訳日:2021-05-23 14:59:49 公開日:2020-12-04
# 教師なし3次元セグメンテーションのための双曲表現の学習 Learning Hyperbolic Representations for Unsupervised 3D Segmentation ( http://arxiv.org/abs/2012.01644v2 ) ライセンス: Link先を確認	Joy Hsu, Jeffrey Gu, Gong-Her Wu, Wah Chiu, Serena Yeung	(参考訳) 複雑なボリュームデータには教師なしの3Dセグメンテーションが必要であり、特にアノテーションの能力が制限されている場合や新しいカテゴリの発見が望まれている場合などである。 3次元ボリュームデータの多くは本質的に階層的であるという観察から,双曲型潜在空間を持つ変分型オートエンコーダ(VAE)と,3次元画像内の階層構造をより良くモデル化したジャイロプレーン畳み込み層を用いて,教師なしセグメンテーションのための3次元パッチの効果的な表現を学習することを提案する。また,階層的三重項損失とマルチスケールパッチサンプリングスキームを導入し,粒度の異なるレベル間の関係を埋め込む。階層型玩具データセット,BraTS全腫瘍データセット,低温電子顕微鏡データを用いた非教師なし3次元セグメンテーションにおけるハイパーボリック表現の有効性を実証した。 There exists a need for unsupervised 3D segmentation on complex volumetric data, particularly when annotation ability is limited or discovery of new categories is desired. Using the observation that much of 3D volumetric data is innately hierarchical, we propose learning effective representations of 3D patches for unsupervised segmentation through a variational autoencoder (VAE) with a hyperbolic latent space and a proposed gyroplane convolutional layer, which better models the underlying hierarchical structure within a 3D image. We also introduce a hierarchical triplet loss and multi-scale patch sampling scheme to embed relationships across varying levels of granularity. We demonstrate the effectiveness of our hyperbolic representations for unsupervised 3D segmentation on a hierarchical toy dataset, BraTS whole tumor dataset, and cryogenic electron microscopy data.	翻訳日:2021-05-23 14:58:17 公開日:2020-12-04
# 物体検出のためのデュアルリファインメント特徴ピラミッドネットワーク Dual Refinement Feature Pyramid Networks for Object Detection ( http://arxiv.org/abs/2012.01733v2 ) ライセンス: Link先を確認	Jialiang Ma, Bin Chen	(参考訳) FPNは、オブジェクト検出器で使われる一般的なコンポーネントであり、隣接するレベルの特徴の補間と和によって、マルチスケール情報を補う。しかし、非線形演算と異なる出力次元の畳み込み層が存在するため、異なるレベル間の関係ははるかに複雑であり、ピクセル単位の和は効率的なアプローチではない。本稿では,まず,画素レベルと特徴マップレベルから設計上の欠陥を分析する。次に、これらの問題に対処するため、Dual Refinement Feature Pyramid Networks (DRFPN) と呼ばれる新しいパラメータフリー特徴ピラミッドネットワークを設計する。具体的には、DRFPNはSRB(Spatial Refinement Block)とCRB(Channel Refinement Block)の2つのモジュールで構成される。 SRBは隣接するレベル間の文脈情報に基づいてサンプリングポイントの位置と内容を学習する。 CRBはアテンション機構に基づく適応的なチャネル統合法を学習する。提案するDRFPNは,既存のFPNベースのモデルに容易に組み込むことができる。特別な工夫を加えなくても、2段階検出器では、COCO検出ベンチマークで1.6から2.2AP、COCOセグメンテーションベンチマークで1.5から1.9AP、各種のFPNベースのモデルを上回る。 1段階検出器では、ResNet50をバックボーンとして使用する場合、DRFPNはアンカーベースのRetinaNetを1.9 AP、アンカーフリーのFCOSを1.3 AP改善する。広範な実験により、DRFPNの頑健性と汎化能力を検証した。コードは公開される予定である。 FPN is a common component used in object detectors; it supplements multi-scale information by interpolating and summing features from adjacent levels. However, due to the existence of nonlinear operations and convolutional layers with different output dimensions, the relationship between different levels is much more complex, and pixel-wise summation is not an efficient approach. In this paper, we first analyze the design defects at the pixel level and the feature map level. Then, we design a novel parameter-free feature pyramid network named Dual Refinement Feature Pyramid Networks (DRFPN) to address these problems. Specifically, DRFPN consists of two modules: Spatial Refinement Block (SRB) and Channel Refinement Block (CRB). SRB learns the location and content of sampling points based on contextual information between adjacent levels. CRB learns an adaptive channel merging method based on an attention mechanism. Our proposed DRFPN can be easily plugged into existing FPN-based models.
Without bells and whistles, for two-stage detectors, our model outperforms different FPN-based counterparts by 1.6 to 2.2 AP on the COCO detection benchmark, and 1.5 to 1.9 AP on the COCO segmentation benchmark. For one-stage detectors, DRFPN improves anchor-based RetinaNet by 1.9 AP and anchor-free FCOS by 1.3 AP when using ResNet50 as backbone. Extensive experiments verify the robustness and generalization ability of DRFPN. The code will be made publicly available.	翻訳日:2021-05-23 14:56:43 公開日:2020-12-04
# 多スーパービジョンによる歩行者軌道予測のための時間ピラミッドネットワーク Temporal Pyramid Network for Pedestrian Trajectory Prediction with Multi-Supervision ( http://arxiv.org/abs/2012.01884v2 ) ライセンス: Link先を確認	Rongqin Liang, Yuanman Li, Xia Li, yi tang, Jiantao Zhou, Wenbin Zou	(参考訳) 群衆の人間の動きを予測することは、自動運転車の自然なナビゲーションからビデオ監視のインテリジェントなセキュリティシステムまで、多くのアプリケーションにとって重要である。従来のすべての作業は、1つの解像度で軌道をモデル化し予測するが、これは比較的非効率であり、移動行動の長距離情報(例えば、軌道の目的地)と短距離情報(例えば、ある時点での歩行方向と速度)を同時に利用することが困難である。本稿では,スクイーズ変調と拡張変調による歩行者追跡予測のための時間的ピラミッドネットワークを提案する。我々の階層的なフレームワークは、上から下までよりリッチな時間情報を持つ特徴ピラミッドを構築し、様々なテンポでの動作をよりよく捉えます。さらに,マルチスーパービジョンを用いた粗大な核融合戦略を提案する。グローバルコンテキストの上位粗い特徴をリッチローカルコンテキストの下位細かい特徴に段階的にマージすることにより、この手法は軌道の長距離情報と短距離情報の両方を完全に活用することができる。いくつかのベンチマーク実験の結果,提案手法の優位性を示した。 Predicting human motion behavior in a crowd is important for many applications, ranging from the natural navigation of autonomous vehicles to intelligent security systems of video surveillance. All the previous works model and predict the trajectory with a single resolution, which is rather inefficient and difficult to simultaneously exploit the long-range information (e.g., the destination of the trajectory), and the short-range information (e.g., the walking direction and speed at a certain time) of the motion behavior. In this paper, we propose a temporal pyramid network for pedestrian trajectory prediction through a squeeze modulation and a dilation modulation. Our hierarchical framework builds a feature pyramid with increasingly richer temporal information from top to bottom, which can better capture the motion behavior at various tempos. Furthermore, we propose a coarse-to-fine fusion strategy with multi-supervision. By progressively merging the top coarse features of global context to the bottom fine features of rich local context, our method can fully exploit both the long-range and short-range information of the trajectory. Experimental results on several benchmarks demonstrate the superiority of our method.	翻訳日:2021-05-23 14:55:04 公開日:2020-12-04
# (参考訳) ドメイン間のガイド付き画像キャプション性能の理解 Understanding Guided Image Captioning Performance across Domains ( http://arxiv.org/abs/2012.02339v1 ) ライセンス: CC BY 4.0	Edwin G. Ng, Bo Pang, Piyush Sharma, Radu Soricut	(参考訳) 画像キャプションモデルは一般的に、ユーザの関心を考慮に入れられる能力がなく、通常は、可読性、情報提供性、情報過負荷のバランスをとるグローバルな記述にデフォルトがある。一方、VQAモデルは、テキスト質問がかなり正確であることを期待しながら、長い記述的な回答を提供する能力に欠ける。本稿では,画像中の接地可能な概念や非接地可能な概念を参照するガイドテキストと呼ばれる追加入力を用いて,画像キャプションが重視すべき概念を制御する方法を提案する。このモデルはトランスフォーマティブベースのマルチモーダルエンコーダで構成されており、ガイドテキストとグローバルおよびオブジェクトレベルの画像特徴を併用して、ガイドキャプションを生成するために使用される早期融合表現を導出する。 Visual Genomeデータでトレーニングされたモデルは、自動オブジェクトラベルでガイドされるときにドメイン内でうまく適合するが、概念キャプションでトレーニングされたガイド付きキャプションモデルは、ドメイン外の画像やガイドテキストをより一般化する。人手による評価結果から,非制限領域の大規模トレーニングデータセットへのアクセスが要求されるとともに,スタイルの多様性(語彙サイズを増大させることなく)が向上する要因であることが示唆された。 Image captioning models generally lack the capability to take into account user interest, and usually default to global descriptions that try to balance readability, informativeness, and information overload. On the other hand, VQA models generally lack the ability to provide long descriptive answers, while expecting the textual question to be quite precise. We present a method to control the concepts that an image caption should focus on, using an additional input called the guiding text that refers to either groundable or ungroundable concepts in the image. Our model consists of a Transformer-based multimodal encoder that uses the guiding text together with global and object-level image features to derive early-fusion representations used to generate the guided caption. While models trained on Visual Genome data have an in-domain advantage of fitting well when guided with automatic object labels, we find that guided captioning models trained on Conceptual Captions generalize better on out-of-domain images and guiding texts. 
Our human-evaluation results indicate that attempting in-the-wild guided image captioning requires access to large, unrestricted-domain training datasets, and that increased style diversity (even without increasing vocabulary size) is a key factor for improved performance.	翻訳日:2021-05-23 13:02:49 公開日:2020-12-04
# (参考訳) 分割と学習: 予測+最適化のための分割統治アプローチ Divide and Learn: A Divide and Conquer Approach for Predict+Optimize ( http://arxiv.org/abs/2012.02342v1 ) ライセンス: CC BY 4.0	Ali Ugur Guler, Emir Demirovic, Jeffrey Chan, James Bailey, Christopher Leckie, Peter J. Stuckey	(参考訳) 予測+最適化問題は、問題係数の機械学習と、予測された係数を用いる組合せ最適化問題を組み合わせたものである。この問題は2つの別々の段階で解くこともできるが、最適化損失を直接最小化する方がよい。しかし、そのためには離散的で微分不可能な組合せ関数を通して微分する必要がある。既存のアプローチの多くは何らかの代理勾配を用いる。 Demirovicらは、最適化問題の損失を予測係数に関する区分線形関数として直接表現する方法を示した。しかし、彼らのアプローチは、動的計画法による定式化を持つ最適化問題に限定されている。本研究では,この制約なしに最適化問題に取り組み,最適化損失を用いてその係数を予測する新しい分割統治アルゴリズムを提案する。また,より少ない計算量で同様の結果を得る,この手法の貪欲版も導入する。我々は,予測+最適化問題に対する他のアプローチと比較し,他の予測+最適化手法よりもいくつかの困難な組合せ問題にうまく対処できることを示す。 The predict+optimize problem combines machine learning of problem coefficients with a combinatorial optimization problem that uses the predicted coefficients. While this problem can be solved in two separate stages, it is better to directly minimize the optimization loss. However, this requires differentiating through a discrete, non-differentiable combinatorial function. Most existing approaches use some form of surrogate gradient. Demirovic et al. showed how to directly express the loss of the optimization problem in terms of the predicted coefficients as a piece-wise linear function. However, their approach is restricted to optimization problems with a dynamic programming formulation. In this work we propose a novel divide and conquer algorithm to tackle optimization problems without this restriction and predict their coefficients using the optimization loss. We also introduce a greedy version of this approach, which achieves similar results with less computation. We compare our approach with other approaches to the predict+optimize problem and show we can successfully tackle some hard combinatorial problems better than other predict+optimize methods.	翻訳日:2021-05-23 12:45:15 公開日:2020-12-04
# (参考訳) コピースペース:画像のどこに書き込むか? Copyspace: Where to Write on Images? ( http://arxiv.org/abs/2012.08933v1 ) ライセンス: CC BY 4.0	Jessica M. Lundin and Michael Sollami and Brian Lonsdorf and Alan Ross and Owen Schoppe and David Woodward and Sönke Rohde	(参考訳) 画像上のテキストの配置は、高品質なビジュアルデザインを生み出す上で重要な部分である。テキスト要素の適切な位置、向き、スタイルを決定することでこの作業を自動化するには、背景画像の内容を理解する必要がある。画像上に描画されるテキストの美的パラメータの探索を「コピースペース検出」と呼び、このタスクが前景と背景の分離とは異なることを指摘する。我々は、専門家がラベル付けしたデータで訓練した1段階および2段階のオブジェクト検出手法を用いて、ソリューションを開発した。このワークショップでは、コピースペース検出のためのそのようなアルゴリズムを検討し、Einstein Designerのような生成設計モデルやパイプラインへの応用を実証する。 The placement of text over an image is an important part of producing high-quality visual designs. Automating this work by determining appropriate position, orientation, and style for textual elements requires understanding the contents of the background image. We refer to the search for aesthetic parameters of text rendered over images as "copyspace detection", noting that this task is distinct from foreground-background separation. We have developed solutions using one- and two-stage object detection methodologies trained on expertly labeled data. This workshop will examine such algorithms for copyspace detection and demonstrate their application in generative design models and pipelines such as Einstein Designer.	翻訳日:2021-05-23 12:03:11 公開日:2020-12-04
# (参考訳) 単眼映像における仮想物体のスケールアウェア挿入 Scale-aware Insertion of Virtual Objects in Monocular Videos ( http://arxiv.org/abs/2012.02371v1 ) ライセンス: CC BY 4.0	Songhai Zhang and Xiangli Li and Yingtian Liu and Hongbo Fu	(参考訳) 本稿では,適切な大きさの仮想物体を単眼映像に挿入するスケールアウェア手法を提案する。単眼映像からの幾何復元のスケール曖昧性問題に取り組むため,映像中のグローバルスケールオブジェクトをベイズ的手法を用いて推定し,シーンオブジェクトのサイズは同一のグローバルスケールに厳密に準拠すべきであり,グローバルスケールの可能性は対象カテゴリのサイズ分布に応じて最大化する。そこで我々は,対象のカテゴリの大きさのデータセットを提案する。メートル法ツリー,対応する画像と900以上の対象カテゴリの階層表現である。ビデオから回収したオブジェクトの不完全性に対処するために,オブジェクトの可視次元を抽出してスケール最適化を行う,新しいスケール推定手法を提案する。実験により,本手法は最先端手法よりも優れた性能を示し,異なる映像シーンに対して高い妥当性とロバスト性を示した。 Metric-Tree は https://metric-tree.github.io で利用可能になった。 In this paper, we propose a scale-aware method for inserting virtual objects with proper sizes into monocular videos. To tackle the scale ambiguity problem of geometry recovery from monocular videos, we estimate the global scale objects in a video with a Bayesian approach incorporating the size priors of objects, where the scene objects sizes should strictly conform to the same global scale and the possibilities of global scales are maximized according to the size distribution of object categories. To do so, we propose a dataset of sizes of object categories: Metric-Tree, a hierarchical representation of sizes of more than 900 object categories with the corresponding images. To handle the incompleteness of objects recovered from videos, we propose a novel scale estimation method that extracts plausible dimensions of objects for scale optimization. Experiments have shown that our method for scale estimation performs better than the state-of-the-art methods, and has considerable validity and robustness for different video scenes. Metric-Tree has been made available at: https://metric-tree.github.io	翻訳日:2021-05-23 11:59:13 公開日:2020-12-04
# (参考訳) 放射線画像における画素分割解剖学的埋め込みの自己教師あり学習 Self-supervised Learning of Pixel-wise Anatomical Embeddings in Radiological Images ( http://arxiv.org/abs/2012.02383v1 ) ライセンス: CC BY 4.0	Ke Yan, Jinzheng Cai, Dakai Jin, Shun Miao, Adam P. Harrison, Dazhou Guo, Youbao Tang, Jing Xiao, Jingjing Lu, Le Lu	(参考訳) CT(Computed tomography)やX線などの放射線画像は、固有の構造を持つ解剖学を反映している。様々な画像にまたがる同じ解剖学的または意味的な構造を確実に特定できることは、医用画像解析の基本的な課題である。原則として、このタスクにランドマーク検出やセマンティックセグメンテーションを使用することは可能だが、うまく機能するためには、各解剖学的構造とサブ構造に対する大量のラベル付きデータが必要である。より普遍的なアプローチは、ラベルのない画像から本質的な構造を発見する。我々は,自制解剖学eMbedding (SAM) と呼ばれるアプローチを導入する。 SAMは、解剖学的位置または身体部分を記述する各画像ピクセルに対してセマンティック埋め込みを生成する。このような埋め込みを生成するために,画素レベルのコントラスト学習フレームワークを提案する。粗大な戦略により、グローバルとローカルの両方の解剖情報が符号化される。負のサンプル選択戦略は、異なる身体部位の識別性を高めるために設計されている。 SAMを使用すると、テンプレート画像に任意の関心点をラベル付けし、簡単な近接探索によって他の画像の同じ身体部分を見つけることができる。 2次元および3次元画像モダリティを持つ複数のタスクにおいてSAMの有効性を示す。 19のランドマークを持つ胸部CTデータセットでは、SAMは200倍高速で広く使われている登録アルゴリズムより優れている。 2つのx線データセット、samは1つのラベル付きテンプレートイメージを持つだけで、50のラベル付きイメージでトレーニングされた教師付きメソッドを上回っている。また,CTの全身追跡病変マッチングにもSAMを適用し,91%の精度を得た。 Radiological images such as computed tomography (CT) and X-rays render anatomy with intrinsic structures. Being able to reliably locate the same anatomical or semantic structure across varying images is a fundamental task in medical image analysis. In principle it is possible to use landmark detection or semantic segmentation for this task, but to work well these require large numbers of labeled data for each anatomical structure and sub-structure of interest. A more universal approach would discover the intrinsic structure from unlabeled images. We introduce such an approach, called Self-supervised Anatomical eMbedding (SAM). SAM generates semantic embeddings for each image pixel that describes its anatomical location or body part. To produce such embeddings, we propose a pixel-level contrastive learning framework. A coarse-to-fine strategy ensures both global and local anatomical information are encoded. 
Negative sample selection strategies are designed to enhance the discriminability among different body parts. Using SAM, one can label any point of interest on a template image, and then locate the same body part in other images by simple nearest neighbor searching. We demonstrate the effectiveness of SAM in multiple tasks with 2D and 3D image modalities. On a chest CT dataset with 19 landmarks, SAM outperforms widely-used registration algorithms while being 200 times faster. On two X-ray datasets, SAM, with only one labeled template image, outperforms supervised methods trained on 50 labeled images. We also apply SAM on whole-body follow-up lesion matching in CT and obtain an accuracy of 91%.	翻訳日:2021-05-23 11:44:36 公開日:2020-12-04
# (参考訳) 多目的最適化モデルを用いた不完全データセットを用いた機械学習 Machine learning with incomplete datasets using multi-objective optimization models ( http://arxiv.org/abs/2012.13352v1 ) ライセンス: CC BY 4.0	Hadi A. Khorshidi, Michael Kirley, Uwe Aickelin	(参考訳) 完全なデータから学習するために機械学習技術が開発されている。データセットに欠落した値が存在する場合、欠落した値やインプテーションでデータポイントを取り除くことで、不完全なデータを別々に前処理する必要がある。本稿では,分類モデルが学習されている間,不足値を扱うオンライン手法を提案する。この目的を達成するために,2つの目的関数を持つ多目的最適化モデルを構築した。また, 目的関数の定式化を3つ提案する。 NSGA IIに基づく進化的アルゴリズムを用いて、パレート解として最適解を求める。提案モデルの信頼性とロバスト性について実験を行い,欠落した値や分類のシナリオを定義した。また,提案モデルが医療情報学にどのように貢献できるかについても述べる。実験結果を用いて3種類の定式化の性能を比較した。提案したモデル結果は、同等の文献と比較することによって検証される。 Machine learning techniques have been developed to learn from complete data. When missing values exist in a dataset, the incomplete data should be preprocessed separately by removing data points with missing values or imputation. In this paper, we propose an online approach to handle missing values while a classification model is learnt. To reach this goal, we develop a multi-objective optimization model with two objective functions for imputation and model selection. We also propose three formulations for imputation objective function. We use an evolutionary algorithm based on NSGA II to find the optimal solutions as the Pareto solutions. We investigate the reliability and robustness of the proposed model using experiments by defining several scenarios in dealing with missing values and classification. We also describe how the proposed model can contribute to medical informatics. We compare the performance of three different formulations via experimental results. The proposed model results get validated by comparing with a comparable literature.	翻訳日:2021-05-23 11:14:36 公開日:2020-12-04
# (参考訳) 区間値データを用いた集計ファジィ数の類似度尺度 Similarity measure for aggregated fuzzy numbers from interval-valued data ( http://arxiv.org/abs/2012.03721v1 ) ライセンス: CC BY 4.0	Justin Kane Gunn, Hadi Akbarzadeh Khorshidi, Uwe Aickelin	(参考訳) 本稿では,区間一致法 (IAA) を用いて区間から集約された2つのファジィ数の類似度を計算する手法を提案する。本研究で提案される類似度尺度には, 集約ファジィ数に対して新規な特徴と属性がいくつか含まれている。この研究で完全に再定義または修正された属性には、面積、周囲長、重心、四分位数、および一致率が含まれる。各特徴に対する推奨重み付けは、主成分分析(PCA)を用いて学習した。さらに、類似度尺度の応用と将来的な利用について詳述する例を示す。 This paper presents a method to compute the degree of similarity between two aggregated fuzzy numbers from intervals using the Interval Agreement Approach (IAA). The similarity measure proposed within this study contains several features and attributes that are novel to aggregated fuzzy numbers. The attributes completely redefined or modified within this study include area, perimeter, centroids, quartiles and the agreement ratio. The recommended weighting for each feature has been learned using Principal Component Analysis (PCA). Furthermore, an illustrative example is provided to detail the application and potential future use of the similarity measure.	翻訳日:2021-05-23 10:21:12 公開日:2020-12-04
# (参考訳) 遺伝的プログラミングを用いた医学テキスト分類のためのデータ駆動正規表現進化 Data-Driven Regular Expressions Evolution for Medical Text Classification Using Genetic Programming ( http://arxiv.org/abs/2012.07515v1 ) ライセンス: CC BY 4.0	J Liu, R Bai, Z Lu, P Ge, D Liu, Uwe Aickelin	(参考訳) 医学分野において、テキスト分類は構造化情報デジタル化とインテリジェントな意思決定支援を通じて人的負担を大幅に削減できる最も重要なタスクの1つである。学習に基づくテキスト分類技術が普及しているにもかかわらず、学習のブラックボックスの性質から、分類結果の理解や手作業による微調整が困難である。そこで本研究では,遺伝子プログラミング(GP)アプローチを用いた新たな正規表現に基づくテキスト分類手法を提案する。正規表現の種数(専門家がランダムに初期化または手動で構築できる)が与えられた場合、本手法は、新しい正規表現構文と慎重に選択された一連の再生演算子を用いて、選択された適合関数に従って正規表現の集団を進化させる。本手法は,オンライン医療提供者からのリアルタイム医療用テキスト調査を用いて評価し,有望なパフォーマンスを示す。より重要なことに、この手法は医療関係者によって完全に理解され、チェックされ、更新される分類器を生成します。 In medical fields, text classification is one of the most important tasks that can significantly reduce human workload through structured information digitization and intelligent decision support. Despite the popularity of learning-based text classification techniques, it is hard for human to understand or manually fine-tune the classification results for better precision and recall, due to the black box nature of learning. This study proposes a novel regular expression-based text classification method making use of genetic programming (GP) approaches to evolve regular expressions that can classify a given medical text inquiry with satisfactory precision and recall while allow human to read the classifier and fine-tune accordingly if necessary. Given a seed population of regular expressions (can be randomly initialized or manually constructed by experts), our method evolves a population of regular expressions according to chosen fitness function, using a novel regular expression syntax and a series of carefully chosen reproduction operators. Our method is evaluated with real-life medical text inquiries from an online healthcare provider and shows promising performance. 
More importantly, our method generates classifiers that can be fully understood, checked and updated by medical doctors, which is crucial for medical practice.	翻訳日:2021-05-23 10:08:03 公開日:2020-12-04
# (参考訳) PeR-ViS:意味記述を用いたビデオサーベイランスの個人検索 PeR-ViS: Person Retrieval in Video Surveillance using Semantic Description ( http://arxiv.org/abs/2012.02408v1 ) ライセンス: CC BY 4.0	Parshwa Shah, Arpit Garg and Vandit Gajjar	(参考訳) 人は通常、年齢、性別、身長、服の種類、パターン、色などの記述子によって特徴づけられる。このような記述子は属性やソフトバイオメトリクスとして知られている。これらは、ビデオ監視における人物の記述と検索との間のセマンティックギャップを橋渡しする。意味記述のクエリで特定の人物を検索することは、ビデオ監視において重要な応用である。コンピュータビジョンを用いて人物検索作業を完全に自動化することは,研究コミュニティ内で関心を集めている。しかし、現在のトレンドは主に画像ベースのクエリによる人物検索に焦点を当てており、実用上の大きな制限がある。本稿では,画像クエリの代わりに意味記述を用いて,ビデオ監視における人物検索の問題を検討する。この問題を解決するために,Mask R-CNN [14](人物検出とインスタンスセグメンテーション)と DenseNet-161 [16](ソフトバイオメトリクス分類)を用いた深層学習に基づくカスケードフィルタリング手法 (PeR-ViS) を開発した。 SoftBioSearch [6] の標準人物検索データセットでは、0.566平均 IoU と 0.792 %w $IoU > 0.4$ を達成し、現在の最先端をはるかに上回っている。私たちのシンプルで再現可能で効果的なアプローチが、ビデオ監視における人物検索の領域における将来の研究を容易にすることを期待している。ソースコードとトレーニング済みの重みは https://parshwa1999.github.io/PeR-ViS/ で公開されている。 A person is usually characterized by descriptors like age, gender, height, clothing type, pattern, color, etc. Such descriptors are known as attributes and/or soft-biometrics. They bridge the semantic gap between a person's description and retrieval in video surveillance. Retrieving a specific person with the query of semantic description has an important application in video surveillance. Using computer vision to fully automate the person retrieval task has been gathering interest within the research community. However, the current trend mainly focuses on retrieving persons with image-based queries, which have major limitations for practical usage. Instead of using an image query, in this paper, we study the problem of person retrieval in video surveillance with a semantic description. To solve this problem, we develop a deep learning-based cascade filtering approach (PeR-ViS), which uses Mask R-CNN [14] (person detection and instance segmentation) and DenseNet-161 [16] (soft-biometric classification).
On the standard person retrieval dataset of SoftBioSearch [6], we achieve 0.566 Average IoU and 0.792 %w $IoU > 0.4$, surpassing the current state-of-the-art by a large margin. We hope our simple, reproducible, and effective approach will help ease future research in the domain of person retrieval in video surveillance. The source code and pretrained weights are available at https://parshwa1999.github.io/PeR-ViS/.	翻訳日:2021-05-23 09:54:46 公開日:2020-12-04
# (参考訳) 理解可能な医学用語翻訳のためのベンチマークデータセット A Benchmark Dataset for Understandable Medical Language Translation ( http://arxiv.org/abs/2012.02420v1 ) ライセンス: CC BY 4.0	Junyu Luo, Zifei Zheng, Hanzhong Ye, Muchao Ye, Yaqing Wang, Quanzeng You, Cao Xiao and Fenglong Ma	(参考訳) 本稿では,専門的な医学文と素人理解可能な表現を連携させるための,人間による新しい医学用語翻訳データセットであるmedlaneを紹介する。データセットには12,801のトレーニングサンプル、1,015の検証サンプル、1,016のテストサンプルが含まれている。次に,medlaneデータセットにおける1つのnaiveと6つのディープラーニングに基づくアプローチを評価する。直接コピー,統計機械翻訳アプローチモーゼ,4つのニューラルネットワーク翻訳アプローチ(提案するpmbert-mtモデル,seq2seqとその2つの変種),修正されたテキスト要約モデル pointernet などである。結果を比較するために,この課題に特化して設計された3つの新しい指標を含む11の指標を利用する。最後に,メドレーンとベースラインの限界を議論し,この課題に対する研究の方向性を指摘する。 In this paper, we introduce MedLane -- a new human-annotated Medical Language translation dataset, to align professional medical sentences with layperson-understandable expressions. The dataset contains 12,801 training samples, 1,015 validation samples, and 1,016 testing samples. We then evaluate one naive and six deep learning-based approaches on the MedLane dataset, including directly copying, a statistical machine translation approach Moses, four neural machine translation approaches (i.e., the proposed PMBERT-MT model, Seq2Seq and its two variants), and a modified text summarization model PointerNet. To compare the results, we utilize eleven metrics, including three new measures specifically designed for this task. Finally, we discuss the limitations of MedLane and baselines, and point out possible research directions for this task.	翻訳日:2021-05-23 09:44:35 公開日:2020-12-04
# (参考訳) 行動認識・検出のための空間時間アライメントネットワーク Spatial-Temporal Alignment Network for Action Recognition and Detection ( http://arxiv.org/abs/2012.02426v1 ) ライセンス: CC BY 4.0	Junwei Liang, Liangliang Cao, Xuehan Xiong, Ting Yu, Alexander Hauptmann	(参考訳) 本稿では,行動認識と検出を支援する視点不変な特徴表現の導入方法について検討する。過去10年間で行動認識は大きく進歩したが、大規模データセットにおける幾何学的変動を効率的にモデル化する方法は、依然として困難かつ興味深い課題である。本稿では,行動認識と行動検出のための幾何学的不変表現を学習する新しい空間-時間アライメントネットワーク(STAN)を提案する。 STANモデルは軽量で汎用的であり、ResNet3DやSlowFastのような既存の行動認識モデルに非常に低い追加計算コストで組み込むことができる。我々は、AVA、Kinetics-400、AVA-Kinetics、Charades、Charades-EgoのデータセットでSTANモデルを広範囲にテストした。実験の結果,STANモデルは行動検出タスクと行動認識タスクの両方において,最先端の性能を一貫して改善できることがわかった。私たちはデータ、モデル、コードを公開する予定である。 This paper studies how to introduce viewpoint-invariant feature representations that can help action recognition and detection. Although we have witnessed great progress in action recognition in the past decade, how to efficiently model the geometric variations in large-scale datasets remains a challenging yet interesting problem. This paper proposes a novel Spatial-Temporal Alignment Network (STAN) that aims to learn geometric invariant representations for action recognition and action detection. The STAN model is lightweight and generic, and can be plugged into existing action recognition models like ResNet3D and SlowFast at a very low extra computational cost. We test our STAN model extensively on the AVA, Kinetics-400, AVA-Kinetics, Charades, and Charades-Ego datasets. The experimental results show that the STAN model can consistently improve the state of the art in both action detection and action recognition tasks. We will release our data, models and code.	翻訳日:2021-05-23 09:30:47 公開日:2020-12-04
# (参考訳) 脳は相3次計算を使って量子位相コンピュータとして機能するのか? Does the brain function as a quantum phase computer using phase ternary computation? ( http://arxiv.org/abs/2012.06537v1 ) ライセンス: CC BY 4.0	Andrew Simon Johnson and William Winlow	(参考訳) 本稿では,神経伝達の基礎は,処理誤差を克服するのに十分な時間的精度で計算可能な圧力パルス/ソリトンであることを示す。神経系内のシグナル伝達と計算は複雑で異なる現象である。アクション電位は可塑性であり、アクションポテンシャルピークは神経計算の不適切な不動点となるが、アクションポテンシャル閾値はこの目的に適している。さらに、ニューロンをスパイクすることで時間をかける神経モデルは、処理エラーを克服するために必要な速度以下で動作する。本稿では, 網膜処理を例として, ケーブル理論に基づく現代の神経伝導理論は, 網膜の完全機能に必要な計算時間と脳の他の部分の含意を考慮に入れるのに不適切であることを示す。さらに、連続するイオンチャネルが静電気的に開放される活性化部位では、活性化閾値では電荷が不足するため、ケーブル理論は作用電位の伝播に役立てることができない。脳のニューラルネットのデコンストラクションは、チューリングマシンが最も単純な量子位相コンピュータのグループのメンバーであることを示唆している。しかし、チューリングベースの機構を使用する試みは、チューリングベースのコンピュータの技術が根本的に異なるため、網膜のコーディングや知能の計算を解決できない。脳のニューラルネットにおける符号化は量子ベースであり、量子は時間変数と位相ベース変数を持ち、網膜で以前に示されたように位相三元計算を可能にする。 Here we provide evidence that the fundamental basis of nervous communication is derived from a pressure pulse/soliton capable of computation with sufficient temporal precision to overcome any processing errors. Signalling and computing within the nervous system are complex and different phenomena. Action potentials are plastic and this makes the action potential peak an inappropriate fixed point for neural computation, but the action potential threshold is suitable for this purpose. Furthermore, neural models timed by spiking neurons operate below the rate necessary to overcome processing error. Using retinal processing as our example, we demonstrate that the contemporary theory of nerve conduction based on cable theory is inappropriate to account for the short computational time necessary for the full functioning of the retina and by implication the rest of the brain. Moreover, cable theory cannot be instrumental in the propagation of the action potential because at the activation-threshold there is insufficient charge at the activation site for successive ion channels to be electrostatically opened. 
Deconstruction of the brain neural network suggests that it is a member of a group of Quantum phase computers of which the Turing machine is the simplest: the brain is another based upon phase ternary computation. However, attempts to use Turing based mechanisms cannot resolve the coding of the retina or the computation of intelligence, as the technology of Turing based computers is fundamentally different. We demonstrate that coding in the brain neural network is quantum based, where the quanta have a temporal variable and a phase-base variable enabling phase ternary computation as previously demonstrated in the retina.	翻訳日:2021-05-23 09:01:14 公開日:2020-12-04
# (参考訳) 対向例に対する自然ロバスト性を目指して Towards Natural Robustness Against Adversarial Examples ( http://arxiv.org/abs/2012.02452v1 ) ライセンス: CC BY 4.0	Haoyu Chu, Shikui Wei, Yao Zhao	(参考訳) 近年の研究では、ディープニューラルネットワークは敵対的な例に弱いことが示されているが、敵対的な例を防ぐために提案された手法のほとんどは、この問題を根本的に解決できない。本稿では, 敵対的ノイズによる誤差を抑える上界が、恒等写像を持つニューラルネットワークに存在することを理論的に証明する。しかし、実際の計算では、この種のニューラルネットワークはもはや上界を持たないため、敵対的な例の影響を受けやすい。同様の手順に従って、敵対的な例がスキップ接続を持つ他のディープニューラルネットワークを欺くことができる理由を説明する。さらに,ディープニューラルネットワークの新たなファミリーであるNeural ODEs (Chen et al., 2018) が,より弱い上界を持つことを示す。このより弱い上界は、結果の変化量が大きくなりすぎることを防ぐ。したがって、Neural ODEsは敵対的な例に対して自然な堅牢性を持つ。我々は,3つのホワイトボックス敵対的攻撃(FGSM,PGD,DI2-FGSM)と1つのブラックボックス敵対的攻撃(Boundary Attack)の下で、ResNetと比較してNeural ODEsの性能を評価する。最後に,TRADES や YOPO などの敵対的訓練手法で訓練されたニューラルネットワークの堅牢性よりも,Neural ODEsの自然な堅牢性の方が優れていることを示す。 Recent studies have shown that deep neural networks are vulnerable to adversarial examples, but most of the methods proposed to defend against adversarial examples cannot solve this problem fundamentally. In this paper, we theoretically prove that there is an upper bound for neural networks with identity mappings to constrain the error caused by adversarial noises. However, in actual computations, this kind of neural network no longer holds any upper bound and is therefore susceptible to adversarial examples. Following similar procedures, we explain why adversarial examples can fool other deep neural networks with skip connections. Furthermore, we demonstrate that a new family of deep neural networks called Neural ODEs (Chen et al., 2018) holds a weaker upper bound. This weaker upper bound prevents the amount of change in the result from being too large. Thus, Neural ODEs have natural robustness against adversarial examples. We evaluate the performance of Neural ODEs compared with ResNet under three white-box adversarial attacks (FGSM, PGD, DI2-FGSM) and one black-box adversarial attack (Boundary Attack).
Finally, we show that the natural robustness of Neural ODEs is even better than the robustness of neural networks that are trained with adversarial training methods, such as TRADES and YOPO.	翻訳日:2021-05-23 08:46:14 公開日:2020-12-04
# (参考訳) Deep Learning from Demonstrationsのための変形可能な物体に針を貫通させるデータセット A data-set of piercing needle through deformable objects for Deep Learning from Demonstrations ( http://arxiv.org/abs/2012.02458v1 ) ライセンス: CC BY 4.0	Hamidreza Hashempour, Kiyanoush Nazari, Fangxun Zhong and Amir Ghalamzan E.	(参考訳) 自動化には非常に時間と費用がかかるため、多くのロボットタスクはいまだに遠隔操作されている。デモンストレーションからのロボット学習(RLfD)は、プログラミングの時間とコストを削減できる。しかし、従来のRLfDアプローチは、視覚情報から特徴を設計する時間のかかる工程を必要とするため、低侵襲ロボットによる縫合など、多くのロボットタスクに直接適用できない。ディープニューラルネットワーク(DNN)は、高次元の観測空間と低レベルの行動/状態空間の関係を捉える複雑なモデルを作成するための有用なツールとして登場した。それにもかかわらず、そのようなアプローチは適切なDNNモデルの訓練に適したデータセットを必要とする。本稿では,da Vinci Research Kitの2本のアームで軟組織に針を挿入・貫通させるデータセットを提案する。本データセットは,(1)ランダムに設定した目標出口点に対する60回の成功した針挿入試行を6台の高解像度校正済みカメラで記録したデータ,(2)対応するロボットデータと校正パラメータ,(3)指令されたロボット制御入力からなり、収集したすべてのデータは同期されている。データセットはDeep-RLfDアプローチ用に設計されている。また、単純なフィードフォワードCNNやさまざまなRCN(Recurrent Convolutional Networks)など、いくつかの深層RLfDアーキテクチャを実装した。本研究は,ベースラインのフィードフォワードCNNが視覚情報とロボットの次ステップの制御動作との関係をうまく学習する一方で,RCNがモデルの予測精度を向上させることを示す。データセットとRLfDのベースライン実装は、ベンチマーク用に https://github.com/imanlab/d-lfd で公開されている。 Many robotic tasks are still teleoperated since automating them is very time consuming and expensive. Robot Learning from Demonstrations (RLfD) can reduce programming time and cost. However, conventional RLfD approaches are not directly applicable to many robotic tasks, e.g. robotic suturing with minimally invasive robots, as they require a time-consuming process of designing features from visual information. Deep Neural Networks (DNN) have emerged as useful tools for creating complex models capturing the relationship between high-dimensional observation space and low-level action/state space. Nonetheless, such approaches require a dataset suitable for training appropriate DNN models. This paper presents a dataset of inserting/piercing a needle with two arms of da Vinci Research Kit in/through soft tissues.
The dataset consists of (1) 60 successful needle insertion trials with randomised desired exit points recorded by 6 high-resolution calibrated cameras, (2) the corresponding robot data, calibration parameters and (3) the commanded robot control input where all the collected data are synchronised. The dataset is designed for Deep-RLfD approaches. We also implemented several deep RLfD architectures, including simple feed-forward CNNs and different Recurrent Convolutional Networks (RCNs). Our study indicates that RCNs improve the prediction accuracy of the model, even though the baseline feed-forward CNNs successfully learn the relationship between the visual information and the next step control actions of the robot. The dataset, as well as our baseline implementations of RLfD, are publicly available for benchmarking at https://github.com/imanlab/d-lfd.	翻訳日:2021-05-23 08:34:30 公開日:2020-12-04
# (参考訳) アクティブラーニングによる低リソース自然言語理解のための微調整bert Fine-tuning BERT for Low-Resource Natural Language Understanding via Active Learning ( http://arxiv.org/abs/2012.02462v1 ) ライセンス: CC BY 4.0	Daniel Grie{\ss}haber, Johannes Maucher and Ngoc Thang Vu	(参考訳) 近年,事前学習されたトランスフォーマーに基づく言語モデルをダウンストリームで活用するタスク固有モデルは,自然言語理解タスクにおける技術結果の高度化を実現している。しかし、1000のトレーニングデータポイント未満のリソース設定で、このアプローチの適合性を調査する研究はほとんどない。本研究では、プールベースのアクティブラーニングを利用してトレーニングを高速化し、新しいデータのラベル付けコストを抑えながら、事前訓練されたTransformerベースの言語モデルであるBERTの微調整方法を検討する。 GLUEデータセットにおける実験結果から,ラベルなしデータのプールからクエリする際のモデルの知識獲得を最大化することにより,モデル性能の優位性を示す。最後に、訓練可能なパラメータの数を減らし、低リソース設定に適したものにするため、微調整中の言語モデルの凍結層の利点を実証し分析する。 Recently, leveraging pre-trained Transformer based language models in down stream, task specific models has advanced state of the art results in natural language understanding tasks. However, only a little research has explored the suitability of this approach in low resource settings with less than 1,000 training data points. In this work, we explore fine-tuning methods of BERT -- a pre-trained Transformer based language model -- by utilizing pool-based active learning to speed up training while keeping the cost of labeling new data constant. Our experimental results on the GLUE data set show an advantage in model performance by maximizing the approximate knowledge gain of the model when querying from the pool of unlabeled data. Finally, we demonstrate and analyze the benefits of freezing layers of the language model during fine-tuning to reduce the number of trainable parameters, making it more suitable for low-resource settings.	翻訳日:2021-05-23 08:19:12 公開日:2020-12-04
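The pool-based querying described in the abstract above can be illustrated with a minimal sketch. It assumes predictive entropy as a stand-in for the "approximate knowledge gain" criterion, which the abstract does not define; the function names `predictive_entropy` and `select_queries` are hypothetical and are not taken from the paper.

```python
import numpy as np

def predictive_entropy(probs):
    """Entropy of each row of class probabilities, shape (n_pool, n_classes)."""
    p = np.clip(probs, 1e-12, 1.0)  # avoid log(0)
    return -(p * np.log(p)).sum(axis=1)

def select_queries(probs, k):
    """Indices of the k pool items with the highest predictive entropy.

    Assumption: entropy approximates how much the model would learn
    from a label; the paper's actual acquisition function may differ.
    """
    scores = predictive_entropy(np.asarray(probs, dtype=float))
    return np.argsort(-scores)[:k]
```

In an active learning loop, the selected items would be labeled, moved from the pool to the training set, and the model fine-tuned again before the next query round.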
# (参考訳) FinCausal共有タスクのためのデータ処理とアノテーション方式 Data Processing and Annotation Schemes for FinCausal Shared Task ( http://arxiv.org/abs/2012.02498v1 ) ライセンス: CC BY 4.0	Dominique Mariko, Estelle Labidurie, Yagmur Ozturk, Hanna Abi Akl, Hugues de Mazancourt	(参考訳) この文書では、FinCausal Shared Task(Mariko et al., 2020)のデータをラベル付けするために使用されるアノテーションスキームを説明します。このタスクは、2020年12月12日に第28回計算言語学国際会議(coling'2020)で開催される金融ナラティブ・プロセッシング・マルチリング金融要約合同ワークショップ(fnp-fns 2020)に関連している。 This document explains the annotation schemes used to label the data for the FinCausal Shared Task (Mariko et al., 2020). This task is associated to the Joint Workshop on Financial Narrative Processing and MultiLing Financial Summarisation (FNP-FNS 2020), to be held at The 28th International Conference on Computational Linguistics (COLING'2020), on December 12, 2020.	翻訳日:2021-05-23 07:41:17 公開日:2020-12-04
# (参考訳) 財務文書因果検出共有タスク(FinCausal 2020) Financial Document Causality Detection Shared Task (FinCausal 2020) ( http://arxiv.org/abs/2012.02505v1 ) ライセンス: CC BY 4.0	Dominique Mariko, Hanna Abi Akl, Estelle Labidurie, Stéphane Durfort, Hugues de Mazancourt, Mahmoud El-Haj	(参考訳) 金融文書における因果関係検出に関するFinCausal 2020共有タスクと、関連するFinCausalデータセットを紹介し、参加システムと結果について考察する。二値分類タスク(Task 1)と関係抽出タスク(Task 2)の2つのサブタスクを設定した。2つのタスクを合わせて計16チームが結果を提出し、そのうち13チームがシステム記述論文を寄稿した。本共有タスクは、2020年9月12日にスペインのバルセロナで開催された第28回計算言語学国際会議(COLING'2020)のFNP-FNS 2020(Joint Workshop on Financial Narrative Processing and MultiLing Financial Summarisation)に関連付けられている。 We present the FinCausal 2020 Shared Task on Causality Detection in Financial Documents and the associated FinCausal dataset, and discuss the participating systems and results. Two sub-tasks are proposed: a binary classification task (Task 1) and a relation extraction task (Task 2). A total of 16 teams submitted runs across the two Tasks and 13 of them contributed with a system description paper. This workshop is associated to the Joint Workshop on Financial Narrative Processing and MultiLing Financial Summarisation (FNP-FNS 2020), held at The 28th International Conference on Computational Linguistics (COLING'2020), Barcelona, Spain on September 12, 2020.	翻訳日:2021-05-23 07:26:54 公開日:2020-12-04
# (参考訳) 逐次GANを用いたレコメンダシステムに対するデータ汚染攻撃の検出について On Detecting Data Pollution Attacks On Recommender Systems Using Sequential GANs ( http://arxiv.org/abs/2012.02509v1 ) ライセンス: CC BY 4.0	Behzad Shahrasbi, Venugopal Mani, Apoorv Reddy Arrabothu, Deepthi Sharma, Kannan Achan, Sushant Kumar	(参考訳) レコメンダシステムは、あらゆるeコマースプラットフォームの重要な部分である。推薦は通常、大量のユーザデータを集約することによって生成される。悪意のあるアクターは、悪意のあるデータポイントを注入してそのようなレコメンダシステムの出力を操作し、システムを利用して金銭的利益を得ようとする可能性がある。本研究では,悪意のあるデータポイントを識別する半教師付き攻撃検出アルゴリズムを提案する。汚染されている可能性が低いデータセットの一部を活用して、正規のデータポイントの分布を学習することで、これを実現する。提案手法は,ユーザ活動の文脈情報を考慮できるように敵対的生成ネットワーク(GAN)アーキテクチャを修正するものである。これにより、モデルは正規のデータポイントと注入されたデータポイントを区別できる。 Recommender systems are an essential part of any e-commerce platform. Recommendations are typically generated by aggregating large amounts of user data. A malicious actor may be motivated to sway the output of such recommender systems by injecting malicious datapoints to leverage the system for financial gain. In this work, we propose a semi-supervised attack detection algorithm to identify the malicious datapoints. We do this by leveraging a portion of the dataset that has a lower chance of being polluted to learn the distribution of genuine datapoints. Our proposed approach modifies the Generative Adversarial Network architecture to take into account the contextual information from user activity. This allows the model to distinguish legitimate datapoints from the injected ones.	翻訳日:2021-05-23 07:16:30 公開日:2020-12-04
# (参考訳) 半教師付き学習のための最適輸送によるマッチング分布 Matching Distributions via Optimal Transport for Semi-Supervised Learning ( http://arxiv.org/abs/2012.03790v1 ) ライセンス: CC BY 4.0	Fariborz Taherkhani, Hadi Kazemi, Ali Dabouei, Jeremy Dawson, Nasser M. Nasrabadi	(参考訳) トレーニング期間中に十分なラベル付きデータが得られていない場合、SSL(Semi-Supervised Learning)アプローチはラベルなしデータの使用に有効なフレームワークとなっている。畳み込みニューラルネットワーク(CNN)に基づくSSLメソッドは、画像分類などの標準ベンチマークタスクで成功した結果を提供している。本研究では、ラベル付きおよびラベルなしデータが同じ基礎となる確率分布から得られるSSL問題の一般的な設定について考察する。そこで本稿では,未ラベルデータに対して擬似ラベルを提供するために,離散的経験的確率測度間の類似性の指標として最適輸送(OT)手法を適用し,初期ラベル付きデータと併用してSSL方式でCNNモデルをトレーニングする手法を提案する。提案手法と最先端のSSLアルゴリズムを標準データセット上で評価・比較し,SSLアルゴリズムの優位性と有効性を示す。 Semi-Supervised Learning (SSL) approaches have been an influential framework for the usage of unlabeled data when there is not a sufficient amount of labeled data available over the course of training. SSL methods based on Convolutional Neural Networks (CNNs) have recently provided successful results on standard benchmark tasks such as image classification. In this work, we consider the general setting of SSL problem where the labeled and unlabeled data come from the same underlying probability distribution. We propose a new approach that adopts an Optimal Transport (OT) technique serving as a metric of similarity between discrete empirical probability measures to provide pseudo-labels for the unlabeled data, which can then be used in conjunction with the initial labeled data to train the CNN model in an SSL manner. We have evaluated and compared our proposed method with state-of-the-art SSL algorithms on standard datasets to demonstrate the superiority and effectiveness of our SSL algorithm.	翻訳日:2021-05-23 06:54:40 公開日:2020-12-04
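One way to picture how optimal transport can yield pseudo-labels, as the abstract above describes at a high level, is the sketch below. It is an illustration under stated assumptions, not the authors' algorithm: it uses entropy-regularised OT (Sinkhorn iterations), a squared Euclidean cost between unlabeled features and per-class prototypes, and balanced class proportions. The names `sinkhorn_plan` and `ot_pseudo_labels` are hypothetical.

```python
import numpy as np

def sinkhorn_plan(cost, a, b, reg=0.1, n_iters=200):
    """Entropy-regularised transport plan between histograms a (n,) and b (m,)."""
    K = np.exp(-cost / reg)              # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)                # match column marginals
        u = a / (K @ v)                  # match row marginals
    return u[:, None] * K * v[None, :]

def ot_pseudo_labels(unlabeled, prototypes, reg=0.1):
    """Assign each unlabeled point the class that receives most of its mass.

    Assumptions (not from the paper): squared Euclidean cost to class
    prototypes, uniform mass on samples, and balanced classes.
    """
    diff = unlabeled[:, None, :] - prototypes[None, :, :]
    cost = (diff ** 2).sum(axis=-1)
    cost = cost / cost.max()             # rescale so exp(-cost/reg) stays well-conditioned
    n, m = cost.shape
    a = np.full(n, 1.0 / n)
    b = np.full(m, 1.0 / m)
    plan = sinkhorn_plan(cost, a, b, reg)
    return plan.argmax(axis=1), plan
```

The pseudo-labeled points could then be combined with the labeled set for the next training round; the balanced-class marginal `b` is the main design choice to revisit for skewed data.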
# (参考訳) DeepSym: 計画のための教師なしの連続的ロボットインタラクションからの深層シンボル生成とルール学習 DeepSym: Deep Symbol Generation and Rule Learning from Unsupervised Continuous Robot Interaction for Planning ( http://arxiv.org/abs/2012.02532v1 ) ライセンス: CC BY 4.0	Alper Ahmetoglu, M. Yunus Seker, Aysu Sayin, Serkan Bugur, Justus Piater, Erhan Oztop, Emre Ugur	(参考訳) 連続的なインタラクション経験から離散的なシンボルとルールを自律的に発見することは、ロボットAIの重要な構成要素であるが、依然として難しい問題である。これを解決すれば、手作業で設計したシンボルやルールのスケーラビリティ、柔軟性、堅牢性の限界を克服でき、オープンエンドな環境において抽象レベルで学習・推論できる自律型ロボットへの大きな前進となる。この目的に向けて,行動に基づく(action-grounded)離散的なオブジェクトカテゴリと効果カテゴリを発見し,それらの上に複雑な行動計画に使用できる確率的ルールを構築する,新規で汎用的な手法を提案する。我々のロボットは、与えられた行動レパートリーを用いて単一または複数のオブジェクトと相互作用し、環境内に生じた効果を観察する。行動に基づくオブジェクト,効果,関係のカテゴリを形成するために,予測型の深層エンコーダ・デコーダネットワークの二値化ボトルネック層を用いる。このネットワークは,シーンの画像と適用された行動を入力とし,その結果生じるシーン内のオブジェクトの変位(行動の効果)をピクセル座標で生成する。二値の潜在ベクトルは、学習された行動駆動のオブジェクト分類を表す。ニューラルネットワークが表現する知識をシンボリック推論に有用なルールへと蒸留するために,決定木を訓練してデコーダ関数を再現する。その分岐から確率的ルールを抽出してPPDDLで表現し、既製のプランナーがロボットの感覚運動経験の上で動作できるようにする。本システムは物理ベースの3次元シミュレーション環境で検証され,ロボットのアーム・ハンドシステムは,押す動作と積む動作から「転がせる」「挿入できる」「より大きい」と解釈できるシンボルを学習し,既製の確率的プランナーを用いて,与えられたキューブ,ボール,カップからタワーを構築するといった目標を達成する効果的な計画を生成した。 Autonomous discovery of discrete symbols and rules from continuous interaction experience is a crucial building block of robot AI, but remains a challenging problem. Solving it will overcome the limitations in scalability, flexibility, and robustness of manually-designed symbols and rules, and will constitute a substantial advance towards autonomous robots that can learn and reason at abstract levels in open-ended environments. Towards this goal, we propose a novel and general method that finds action-grounded, discrete object and effect categories and builds probabilistic rules over them that can be used in complex action planning. Our robot interacts with single and multiple objects using a given action repertoire and observes the effects created in the environment.
In order to form action-grounded object, effect, and relational categories, we employ a binarized bottleneck layer of a predictive, deep encoder-decoder network that takes as input the image of the scene and the action applied, and generates the resulting object displacements in the scene (action effects) in pixel coordinates. The binary latent vector represents a learned, action-driven categorization of objects. To distill the knowledge represented by the neural network into rules useful for symbolic reasoning, we train a decision tree to reproduce its decoder function. From its branches we extract probabilistic rules and represent them in PPDDL, allowing off-the-shelf planners to operate on the robot's sensorimotor experience. Our system is verified in a physics-based 3d simulation environment where a robot arm-hand system learned symbols that can be interpreted as 'rollable', 'insertable', 'larger-than' from its push and stack actions; and generated effective plans to achieve goals such as building towers from given cubes, balls, and cups using off-the-shelf probabilistic planners.	翻訳日:2021-05-23 06:26:58 公開日:2020-12-04
# (参考訳) ニューラル常微分方程式を用いた雲被覆変化下の作物分類 Crop Classification under Varying Cloud Cover with Neural Ordinary Differential Equations ( http://arxiv.org/abs/2012.02542v1 ) ライセンス: CC BY 4.0	Nando Metzger, Mehmet Ozgur Turkoglu, Stefano D'Aronco, Jan Dirk Wegner, Konrad Schindler	(参考訳) 光学衛星センサーは雲を通して地球の表面を見ることができない。周期的な再観測サイクルにもかかわらず、地球観測衛星が取得した画像シーケンスは不規則に時間内にサンプリングされる。作物分類のための最先端の手法(および他の時系列分析タスク)は、リカレントニューラルネットワーク(RNN)のような観測間の通常の時間間隔を暗黙的に仮定する技術に依存している。本稿では,rnnと組み合わせたニューラル常微分方程式(ノード)を用いて不規則間隔画像列における作物種別を分類する。その結果得られたode-rnnモデルは、更新ステップ、再帰ユニットがモデルの隠れた状態に新しい入力データを同化するステップ、ノードが次の観測が到着するまで隠れた状態を伝播する予測ステップの2つのステップで構成される。予測ステップは、いくつかの利点がある潜在力学の連続的な表現に基づいている。概念レベルでは、現象論的サイクルを管理するメカニズムを記述するのがより自然な方法である。現実的な観点では、システムの状態を任意の時点にサンプリングすることが可能であり、必要な時に観測を統合することができ、最後の観測を超えて外挿することができる。実験の結果,ODE-RNNはLSTM,GRU,時間的畳み込みなどの共通ベースラインよりも分類精度が向上していることがわかった。この利得は、わずかしか観測できない(クラウドカバーの頻繁さ)困難なシナリオにおいて最も顕著である。さらに,外挿能力は季節の早い段階での分類性能の向上に寄与し,予測に重要であることを示す。 Optical satellite sensors cannot see the Earth's surface through clouds. Despite the periodic revisit cycle, image sequences acquired by Earth observation satellites are therefore irregularly sampled in time. State-of-the-art methods for crop classification (and other time series analysis tasks) rely on techniques that implicitly assume regular temporal spacing between observations, such as recurrent neural networks (RNNs). We propose to use neural ordinary differential equations (NODEs) in combination with RNNs to classify crop types in irregularly spaced image sequences. The resulting ODE-RNN models consist of two steps: an update step, where a recurrent unit assimilates new input data into the model's hidden state; and a prediction step, in which NODE propagates the hidden state until the next observation arrives. The prediction step is based on a continuous representation of the latent dynamics, which has several advantages. At the conceptual level, it is a more natural way to describe the mechanisms that govern the phenological cycle. 
From a practical point of view, it makes it possible to sample the system state at arbitrary points in time, such that one can integrate observations whenever they are available, and extrapolate beyond the last observation. Our experiments show that ODE-RNN indeed improves classification accuracy over common baselines such as LSTM, GRU, and temporal convolution. The gains are most prominent in the challenging scenario where only few observations are available (i.e., frequent cloud cover). Moreover, we show that the ability to extrapolate translates to better classification performance early in the season, which is important for forecasting.	翻訳日:2021-05-23 05:56:43 公開日:2020-12-04
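The two-step structure described in the abstract above (a prediction step that propagates the hidden state through time, and an update step that assimilates each new observation) can be sketched as follows. This is a minimal illustration under assumptions: fixed-step Euler integration replaces a proper ODE solver, a plain tanh RNN cell replaces the recurrent unit used in the paper, and the weights are random and untrained. All names (`dynamics`, `predict`, `update`, `ode_rnn`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
HIDDEN, INPUT = 8, 4
# Untrained, illustrative weights; a real model would learn these.
W_f = rng.normal(scale=0.1, size=(HIDDEN, HIDDEN))  # latent dynamics
W_h = rng.normal(scale=0.1, size=(HIDDEN, HIDDEN))  # recurrent weights
W_x = rng.normal(scale=0.1, size=(HIDDEN, INPUT))   # input weights

def dynamics(h):
    """Right-hand side dh/dt = f(h) of the latent ODE."""
    return np.tanh(W_f @ h)

def predict(h, dt, step=0.05):
    """Prediction step: integrate the ODE across a gap of length dt (Euler)."""
    if dt <= 0:
        return h
    n_steps = int(np.ceil(dt / step))
    h_step = dt / n_steps
    for _ in range(n_steps):
        h = h + h_step * dynamics(h)
    return h

def update(h, x):
    """Update step: a simple tanh RNN cell assimilates observation x."""
    return np.tanh(W_h @ h + W_x @ x)

def ode_rnn(times, observations):
    """Run over irregularly spaced observations and return the final state."""
    h = np.zeros(HIDDEN)
    t_prev = times[0]
    for t, x in zip(times, observations):
        h = predict(h, t - t_prev)   # evolve through the observation gap
        h = update(h, x)             # absorb the new observation
        t_prev = t
    return h
```

Because `predict` accepts any gap length, the same function can also extrapolate the state beyond the last observation, which is the property the abstract highlights for early-season classification.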
# (参考訳) EventKG+BT:知識グラフからインタラクティブな伝記タイムラインを生成する EventKG+BT: Generation of Interactive Biography Timelines from a Knowledge Graph ( http://arxiv.org/abs/2012.06306v1 ) ライセンス: CC BY 4.0	Simon Gottschalk and Elena Demidova	(参考訳) 公共の関心を持つ人々の生活における顕著な業績や重要な出来事の研究には、退屈で時間を要する長い百科事典や伝記資料の密読が必要である。 EventKGナレッジグラフのようなセマンティックリファレンスソースは関連する事実の構造化された表現を提供するが、数百のイベントと特定のエンティティの時間的関係を含んでいることが多い。本稿では,遠隔監視を用いた知識グラフからバイオグラフィーの簡潔かつインタラクティブな時空間表現を生成するタイムライン生成システムEventKG+BTを提案する。 Research on notable accomplishments and important events in the life of people of public interest usually requires close reading of long encyclopedic or biographical sources, which is a tedious and time-consuming task. Whereas semantic reference sources, such as the EventKG knowledge graph, provide structured representations of relevant facts, they often include hundreds of events and temporal relations for particular entities. In this paper, we present EventKG+BT - a timeline generation system that creates concise and interactive spatio-temporal representations of biographies from a knowledge graph using distant supervision.	翻訳日:2021-05-23 05:39:22 公開日:2020-12-04
# (参考訳) 長距離低品質赤外線ビデオにおける小型目標検出のための高性能手法 A high performance approach to detecting small targets in long range low quality infrared videos ( http://arxiv.org/abs/2012.02579v1 ) ライセンス: CC BY 4.0	Chiman Kwan and Bence Budavari	(参考訳) 長距離の赤外線(IR)ビデオではターゲットが小さいため、それらのビデオのターゲットを正確に検出することは困難である。本稿では,長距離・低品質の赤外線ビデオにおける小型目標検出のための高性能手法を提案する。提案手法は,ビデオ解像度向上モジュール,局所強度と勾配(LIG)に基づく実績のある小型目標検出器,連結成分(CC)分析モジュール,複数フレームからの検出を結び付けるトラックアソシエーションモジュールから構成される。ベンチマークデータセットに含まれる距離3500mから5000mの実際の中波赤外(MWIR)ビデオを用いた広範な実験により,提案手法の有効性が明確に示された。 Since targets are small in long range infrared (IR) videos, it is challenging to accurately detect targets in those videos. In this paper, we propose a high performance approach to detecting small targets in long range and low quality infrared videos. Our approach consists of a video resolution enhancement module, a proven small target detector based on local intensity and gradient (LIG), a connected component (CC) analysis module, and a track association module to connect detections from multiple frames. Extensive experiments using actual mid-wave infrared (MWIR) videos in ranges between 3500 m and 5000 m from a benchmark dataset clearly demonstrated the efficacy of the proposed approach.	翻訳日:2021-05-23 05:15:32 公開日:2020-12-04
# (参考訳) レーン検出結果を用いた車線数予測 Prediction of Lane Number Using Results From Lane Detection ( http://arxiv.org/abs/2012.02604v1 ) ライセンス: CC BY 4.0	Panumate Chetprayoon, Fumihiko Takahashi, Yusuke Uchida	(参考訳) 車両が走行する車線番号は、インテリジェントな車両分野において重要な要素である。多数の車線検出アルゴリズムが提案され,完全な車線検出が可能であれば,車線検出結果から直接車線数を計算することができる。しかし、実際にレーン検出アルゴリズムは時に性能が劣る。そこで本研究では,車線数を予測するために,ドライブレコーダ画像と車線検出結果を組み合わせた新しい車線数予測手法を提案する。実験の結果,提案手法は計算コストを大幅に増大させることなく,優れた結果が得られた。 The lane number that the vehicle is traveling in is a key factor in intelligent vehicle fields. Many lane detection algorithms were proposed and if we can perfectly detect the lanes, we can directly calculate the lane number from the lane detection results. However, in fact, lane detection algorithms sometimes underperform. Therefore, we propose a new approach for predicting the lane number, where we combine the drive recorder image with the lane detection results to predict the lane number. Experiments on our own dataset confirmed that our approach delivered outstanding results without significantly increasing computational cost.	翻訳日:2021-05-23 04:31:43 公開日:2020-12-04
# (参考訳) FinnSentiment - 感情極性アノテーションのためのフィンランド語ソーシャルメディアコーパス FinnSentiment -- A Finnish Social Media Corpus for Sentiment Polarity Annotation ( http://arxiv.org/abs/2012.02613v1 ) ライセンス: CC BY 4.0	Krister Lindén and Tommi Jauhiainen and Sam Hardwick	(参考訳) 感情分析と意見マイニングは、ヘイトスピーチやフェイクニュースの検出など、ソーシャルメディアに明らかな応用領域を持つ重要なタスクである。先行研究の調査では、フィンランド語の感情極性アノテーションを備えた大規模なソーシャルメディアデータセットは存在しないことがわかった。本稿は、3人のネイティブのアノテータが感情極性を独立にアノテートした27,000文のデータセットを導入することで、この欠点を解消することを目的としている。データセット全体を同じ3人のアノテータが担当しており、時間の経過に伴うアノテータの振る舞いをさらに研究するためのユニークな機会を提供する。アノテータ間の一致度を分析し,データセットの有用性を検証するための2つのベースラインを提供する。 Sentiment analysis and opinion mining is an important task with obvious application areas in social media, e.g. when indicating hate speech and fake news. In our survey of previous work, we note that there is no large-scale social media data set with sentiment polarity annotations for Finnish. This publication aims to remedy this shortcoming by introducing a 27,000 sentence data set annotated independently with sentiment polarity by three native annotators. We had the same three annotators for the whole data set, which provides a unique opportunity for further studies of annotator behaviour over time. We analyse their inter-annotator agreement and provide two baselines to validate the usefulness of the data set.	翻訳日:2021-05-23 04:05:31 公開日:2020-12-04
# (参考訳) 大規模河川流速推定への深層学習の適用 Application of deep learning to large scale riverine flow velocity estimation ( http://arxiv.org/abs/2012.02620v1 ) ライセンス: CC BY 4.0	Mojtaba Forghani, Yizhou Qian, Jonghyun Lee, Matthew W. Farthing, Tyler Hesser, Peter K. Kitanidis, and Eric F. Darve	(参考訳) 河川流速の高速で信頼性の高い予測は、洪水リスク管理を含む多くの応用において重要である。浅水方程式(SWE)は流速の予測に一般的に用いられる。しかし、標準的なSWEソルバによる正確かつ高速な予測は、多くの場合困難である。従来の手法は計算コストが高く、正確な予測には高解像度の河床地形(bathymetry)の測定が必要である。その結果、例えば境界条件(BC)が変化するために繰り返し評価する必要がある場合や、河床地形が確実にはわかっていない場合には適さない。本研究では,これらの問題に取り組む2段階のプロセスを提案する。まず,主成分地球統計学的手法(PCGA)を用いて,流速測定から河床地形の確率密度関数を推定し,次に,複数の機械学習アルゴリズムを用いて,河床地形の事後分布から拡張した実現値と所定範囲のBCを与えたSWEの高速ソルバを得る。第1段階により,河床地形を直接測定することなく流速を予測できる。さらに、第2段階における分布の拡張により、河床地形が時間とともに変化する場合でも、追加の河床地形情報を流速予測に組み込むことができ、精度と汎化性能が向上する。ここでは,PCA-DNN(主成分分析-ディープニューラルネットワーク),SE(教師付きエンコーダ),SVE(教師付き変分エンコーダ)と呼ぶ3つのソルバを用い,ジョージア州オーガスタ近郊のサバンナ川の区間で検証する。その結果,高速ソルバは,境界値問題全体を従来の手法で解く場合よりもはるかに低い計算コストで,流速を良好な精度で予測できることが示された。 Fast and reliable prediction of riverine flow velocities is important in many applications, including flood risk management. The shallow water equations (SWEs) are commonly used for prediction of the flow velocities. However, accurate and fast prediction with standard SWE solvers is challenging in many cases. Traditional approaches are computationally expensive and require high-resolution riverbed profile measurement (bathymetry) for accurate predictions. As a result, they are a poor fit in situations where they need to be evaluated repetitively due, for example, to varying boundary condition (BC), or when the bathymetry is not known with certainty. In this work, we propose a two-stage process that tackles these issues.
First, using the principal component geostatistical approach (PCGA) we estimate the probability density function of the bathymetry from flow velocity measurements, and then we use multiple machine learning algorithms to obtain a fast solver of the SWEs, given augmented realizations from the posterior bathymetry distribution and the prescribed range of BCs. The first step allows us to predict flow velocities without direct measurement of the bathymetry. Furthermore, the augmentation of the distribution in the second stage allows incorporation of the additional bathymetry information into the flow velocity prediction for improved accuracy and generalization, even if the bathymetry changes over time. Here, we use three solvers, referred to as PCA-DNN (principal component analysis-deep neural network), SE (supervised encoder), and SVE (supervised variational encoder), and validate them on a reach of the Savannah river near Augusta, GA. Our results show that the fast solvers are capable of predicting flow velocities with good accuracy, at a computational cost that is significantly lower than the cost of solving the full boundary value problem with traditional methods.	翻訳日:2021-05-23 04:04:39 公開日:2020-12-04
# (参考訳) 注意を理解する:心と機械の中で Understanding Attention: In Minds and Machines ( http://arxiv.org/abs/2012.02659v1 ) ライセンス: CC BY 4.0	Shriraj P. Sawant and Shruti Singh	(参考訳) 注意は複雑で広い概念であり、人工知能、認知科学、心理学、神経科学、関連する分野にまたがる複数の分野にわたって研究されている。注意に関する考えの多くはこれらの分野に大きく重なり合っていないが、限られた資源を適応的に制御する共通のテーマがある。本稿では,ニューラルネットワーク(ANN)における注意のコンセプトと変種について概説する。また、神経科学の観点から、ANNと平行する注意の起源についても論じる。様々な分野間の相互接続のように見える対話を行う代わりに、注意の体系的な分析と、AIや神経科学におけるアイデアの統一に向けて、共通の概念的枠組みに基づく考え方を提案する。 Attention is a complex and broad concept, studied across multiple disciplines spanning artificial intelligence, cognitive science, psychology, neuroscience, and related fields. Although many of the ideas regarding attention do not significantly overlap among these fields, there is a common theme of adaptive control of limited resources. In this work, we review the concept and variants of attention in artificial neural networks (ANNs). We also discuss the origin of attention from the neuroscience point of view parallel to that of ANNs. Instead of having seemingly disconnected dialogues between varied disciplines, we suggest grounding the ideas on common conceptual frameworks for a systematic analysis of attention and towards possible unification of ideas in AI and Neuroscience.	翻訳日:2021-05-23 03:26:28 公開日:2020-12-04
# (参考訳) 透明な対戦相手間の2人ゲームにおける学習 Learning in two-player games between transparent opponents ( http://arxiv.org/abs/2012.02671v1 ) ライセンス: CC BY 4.0	Adrian Hutter	(参考訳) 2つの強化学習エージェントが互いに行列ゲームを繰り返しプレイし、各ラウンドの後にパラメータを更新するシナリオを検討する。エージェントの意思決定は互いに透明であり、各エージェントは対戦相手が自分に対してどのようにプレイするかを予測できる。両エージェントが互いを再帰的に予測し続ける無限後退を防ぐため、各エージェントは少なくとも確率εで相手に依存しない応答を返す必要がある。透明性はまた、各エージェントが相手の勾配ステップを予測して方向づけること、すなわち、相手の勾配が自分に有利な方向を向くパラメータ空間の領域へ移動することを可能にする。本研究では,相手を考慮した学習のための先行研究の2つのアルゴリズム(LOLAとSOS)を用いて,結果として生じるダイナミクスを実験的に調べる。相互に透明な意思決定と相手を考慮した学習の組み合わせは,1回限りの囚人のジレンマにおいて相互協力を頑健にもたらすことがわかった。両エージェントが相手を自分の好む均衡へ誘導しようとするチキンゲームでは、相互に有益な結果への収束ははるかに難しく、相手を考慮した学習は両エージェントにとって最悪の結果をもたらすことさえある。これは、均衡選択問題を含む社会的ジレンマにおいて許容できる結果を達成する、相手を考慮した学習アルゴリズムを開発する必要性を強調している。 We consider a scenario in which two reinforcement learning agents repeatedly play a matrix game against each other and update their parameters after each round. The agents' decision-making is transparent to each other, which allows each agent to predict how their opponent will play against them. To prevent an infinite regress of both agents recursively predicting each other indefinitely, each agent is required to give an opponent-independent response with some probability at least epsilon. Transparency also allows each agent to anticipate and shape the other agent's gradient step, i.e. to move to regions of parameter space in which the opponent's gradient points in a direction favourable to them. We study the resulting dynamics experimentally, using two algorithms from previous literature (LOLA and SOS) for opponent-aware learning. We find that the combination of mutually transparent decision-making and opponent-aware learning robustly leads to mutual cooperation in a single-shot prisoner's dilemma. In a game of chicken, in which both agents try to manoeuvre their opponent towards their preferred equilibrium, converging to a mutually beneficial outcome turns out to be much harder, and opponent-aware learning can even lead to worst-case outcomes for both agents.
This highlights the need to develop opponent-aware learning algorithms that achieve acceptable outcomes in social dilemmas involving an equilibrium selection problem.	翻訳日:2021-05-23 03:17:38 公開日:2020-12-04
# (参考訳) 知識グラフと機械学習による道路標識のグラウンドトゥルース構築の高速化 Accelerating Road Sign Ground Truth Construction with Knowledge Graph and Machine Learning ( http://arxiv.org/abs/2012.02672v1 ) ライセンス: CC BY 4.0	Ji Eun Kim, Cory Henson, Kevin Huang, Tuan A. Tran, Wan-Yi Lin	(参考訳) 包括的で高品質な道路標識アノテーションデータセットを持つことは、AIベースの道路標識認識(RSR)システムの成功に不可欠である。実際には、アノテータは異なる国の道路標識体系を学ぶことの難しさに直面することが多く、そのため作業に時間がかかり、結果の質も低くなりがちである。本稿では,知識グラフと機械学習アルゴリズムである変分プロトタイピングエンコーダ(VPE)を用いて,人間のアノテータが道路標識を効果的に分類できるよう支援する新しい手法を提案する。アノテータは視覚属性を使って道路標識知識グラフを検索し、VPEモデルが提案する最も近い候補を受け取ることができる。VPEモデルは知識グラフからの候補と実際の標識画像パッチを入力として使用する。知識グラフのアプローチにより標識の検索空間を98.9%削減できることを示す。さらに,本システムはVPEを用いて,テストしたデータセットの標識の75%に対して正しい単一候補を提案でき,その場合は人手による検索の労力を完全に排除できる。 Having a comprehensive, high-quality dataset of road sign annotation is critical to the success of AI-based Road Sign Recognition (RSR) systems. In practice, annotators often face difficulties in learning road sign systems of different countries; hence, the tasks are often time-consuming and produce poor results. We propose a novel approach using knowledge graphs and a machine learning algorithm - variational prototyping-encoder (VPE) - to assist human annotators in classifying road signs effectively. Annotators can query the Road Sign Knowledge Graph using visual attributes and receive closest matching candidates suggested by the VPE model. The VPE model uses the candidates from the knowledge graph and a real sign image patch as inputs. We show that our knowledge graph approach can reduce sign search space by 98.9%. Furthermore, with VPE, our system can propose the correct single candidate for 75% of signs in the tested datasets, eliminating the human search effort entirely in those cases.	翻訳日:2021-05-23 02:55:50 公開日:2020-12-04
# (参考訳) ESCAPED: カーネルベースの機械学習アルゴリズムのための効率的でセキュアかつプライベートなドット積フレームワークと医療への応用 ESCAPED: Efficient Secure and Private Dot Product Framework for Kernel-based Machine Learning Algorithms with Applications in Healthcare ( http://arxiv.org/abs/2012.02688v1 ) ライセンス: CC BY 4.0	Ali Burak Ünal, Mete Akgün, Nico Pfeifer	(参考訳) 高度な機械学習モデルをトレーニングするには、通常多くのトレーニングサンプルが必要である。特に医療分野では、これらのサンプルは非常に高価になることがあり、1つの機関だけでは十分な数を確保できないことが多い。異なるソースからのプライバシーに敏感なデータのマージは通常、データセキュリティとデータ保護の措置によって制限される。これは、変数にノイズを加えたり(例えば$\epsilon$-差分プライバシー)、特定の値を省略したり(例えば$k$-匿名性)することで、データ品質を低下させるアプローチにつながる可能性がある。暗号手法に基づく他の手段は、非常に時間のかかる計算につながる可能性があり、特に大規模なマルチオミクスデータでは問題となる。我々は、ESCAPED(Efficient SeCure And PrivatE Dot product framework)を導入してこの問題に対処する。ESCAPEDは、プライバシを犠牲にすることもノイズを加えることもなく、複数のソースからのベクトルのドット積を第三者上で計算できるようにし、第三者はその後カーネルベースの機械学習アルゴリズムを訓練する。HIV感染者の薬剤耐性予測と、精密医療におけるマルチオミクスの次元削減およびクラスタリングの問題で本フレームワークを評価した。実行時間に関して、我々のフレームワークはアルゴリズムの性能を犠牲にすることなく、最も適した既存のアプローチを大幅に上回る。カーネルベースのアルゴリズムでの利点しか示していないが、我々のフレームワークは、複数のソースからのベクトルのドット積を必要とするさらなる機械学習モデルに新たな研究機会を開くことができる。 To train sophisticated machine learning models one usually needs many training samples. Especially in healthcare settings these samples can be very expensive, meaning that one institution alone usually does not have enough on its own. Merging privacy-sensitive data from different sources is usually restricted by data security and data protection measures. This can lead to approaches that reduce data quality by putting noise onto the variables (e.g., in $\epsilon$-differential privacy) or omitting certain values (e.g., for $k$-anonymity). Other measures based on cryptographic methods can lead to very time-consuming computations, which is especially problematic for larger multi-omics data.
We address this problem by introducing ESCAPED, which stands for Efficient SeCure And PrivatE Dot product framework, enabling the computation of the dot product of vectors from multiple sources on a third-party, which later trains kernel-based machine learning algorithms, while neither sacrificing privacy nor adding noise. We evaluated our framework on drug resistance prediction for HIV-infected people and multi-omics dimensionality reduction and clustering problems in precision medicine. In terms of execution time, our framework significantly outperforms the best-fitting existing approaches without sacrificing the performance of the algorithm. Even though we only show the benefit for kernel-based algorithms, our framework can open up new research opportunities for further machine learning models that require the dot product of vectors from multiple sources.	翻訳日:2021-05-23 02:45:15 公開日:2020-12-04
# (参考訳) 部分観測都市環境における物体探索のための空間言語理解 Spatial Language Understanding for Object Search in Partially Observed Cityscale Environments ( http://arxiv.org/abs/2012.02705v1 ) ライセンス: CC BY 4.0	Kaiyu Zheng, Deniz Bayazit, Rebecca Mathew, Ellie Pavlick, Stefanie Tellex	(参考訳) 本研究では,ロボットが空間言語をオブジェクト位置上の分布として解釈し,部分観測可能な都市環境における効率的な探索を可能にするシステムを提案する。本稿では,空間言語観測空間を紹介し,空間言語から抽出された情報をロボットの信念に取り入れた部分可観測マルコフ決定過程(pomdp)の枠組みに基づいて確率的観測モデルを作成する。曖昧で文脈に依存した前置詞(例えば~前置詞)を解釈するために,言語提供者の環境コンテキストに対する相対的参照フレーム(FoR)の予測を学習する畳み込みニューラルネットワークモデルを提案する。 4万m$^2$の足跡を持つ5都市間の相互評価を通じて,予測モデルと対象探索システムの一般化可能性を示す。シミュレーションによるエンド・ツー・エンド実験は,空間的前置詞理解を必要とせず,キーワードベースラインよりも検索速度が速く,高い成功率が得られることを示す。 We present a system that enables robots to interpret spatial language as a distribution over object locations for effective search in partially observable cityscale environments. We introduce the spatial language observation space and formulate a stochastic observation model under the framework of Partially Observable Markov Decision Process (POMDP) which incorporates information extracted from the spatial language into the robot's belief. To interpret ambiguous, context-dependent prepositions (e.g.~front), we propose a convolutional neural network model that learns to predict the language provider's relative frame of reference (FoR) given environment context. We demonstrate the generalizability of our FoR prediction model and object search system through cross-validation over areas of five cities, each with a 40,000m$^2$ footprint. End-to-end experiments in simulation show that our system achieves faster search and higher success rate compared to a keyword-based baseline without spatial preposition understanding.	翻訳日:2021-05-23 02:42:14 公開日:2020-12-04
# (参考訳) 機械学習アルゴリズムによる移動意図に及ぼす気象因子の影響 Impact of weather factors on migration intention using machine learning algorithms ( http://arxiv.org/abs/2012.02794v1 ) ライセンス: CC BY 4.0	John Aoga, Juhee Bae, Stefanija Veljanoska, Siegfried Nijssen, Pierre Schaus	(参考訳) 経験文学における注目度は、気候ショックの発生と移住決定の変化に向けられている。以前の文献は異なる結果をもたらし、多くの伝統的な経験的アプローチを用いる。本稿では,ブルキナファソ,アイボリーコースト,マリ,モーリタニア,ニジェール,セネガルの6つの農業依存型経済圏への移住を意図した,気象ショックの役割を分析するためのツリーベース機械学習(ML)アプローチを提案する。いくつかの木に基づくアルゴリズム(例えば、XGB、ランダムフォレスト)を列車検証テストワークフローを用いて実行し、堅牢で耐雑音性のあるアプローチを構築する。次に、移行意図に影響を与える方向を示す重要な特徴を決定する。このMLに基づく推定は、異なる時間スケールで標準降水-蒸発散指数(SPEI)が捉えた天候ショックや、様々な社会経済的特徴/共変量などの特徴を考慮に入れている。その結果,(i)社会経済特性が移動意図に影響を及ぼす一方で,天気特性が予測性能を向上させること,(ii)国内特化モデルが必要であること,(iii)国際移動はSPEIのより長い時間スケールに影響され,(内部移動を含む)一般移動はより短い時間スケールによって影響されることがわかった。 A growing attention in the empirical literature has been paid to the incidence of climate shocks and change in migration decisions. Previous literature leads to different results and uses a multitude of traditional empirical approaches. This paper proposes a tree-based Machine Learning (ML) approach to analyze the role of the weather shocks towards an individual's intention to migrate in the six agriculture-dependent-economy countries such as Burkina Faso, Ivory Coast, Mali, Mauritania, Niger, and Senegal. We perform several tree-based algorithms (e.g., XGB, Random Forest) using the train-validation-test workflow to build robust and noise-resistant approaches. Then we determine the important features showing in which direction they are influencing the migration intention. This ML-based estimation accounts for features such as weather shocks captured by the Standardized Precipitation-Evapotranspiration Index (SPEI) for different timescales and various socioeconomic features/covariates. 
We find that (i) weather features improve the prediction performance although socioeconomic characteristics have more influence on migration intentions, (ii) country-specific model is necessary, and (iii) international move is influenced more by the longer timescales of SPEIs while general move (which includes internal move) by that of shorter timescales.	翻訳日:2021-05-23 02:30:22 公開日:2020-12-04
# (参考訳) 多言語関係学習のためのイベント誘導型ノイズ除去 Event Guided Denoising for Multilingual Relation Learning ( http://arxiv.org/abs/2012.02721v1 ) ライセンス: CC BY 4.0	Amith Ananthram, Emily Allaway, Kathleen McKeown	(参考訳) 汎用的な関係抽出は、多くのベンチマークで最先端の結果を出している Soares et al. (2019) の大量のデータを要する遠隔教師あり学習(distant supervision)手法のおかげもあって、近年大きく向上している。本研究では,ラベルなしテキストから関係抽出のための高品質なトレーニングデータを収集する手法を提案し,彼らのゼロショットおよび少数ショットの結果を,トレーニングコストのごく一部でほぼ再現する。提案手法は,日付の付いたニュース記事の予測可能な分布構造を利用して,ノイズを除去したコーパスを構築する。抽出過程で低品質の事例は除外される。このコーパスで訓練されたより小さな多言語エンコーダは、はるかに少ない例(50k vs. 300mil+)しか使わないにもかかわらず、英語とスペイン語の少数ショットおよび標準的な関係ベンチマークにおいて、現在の最先端(どちらも微調整をほとんど、あるいは全く受けない場合)と同等の性能を示す。 General purpose relation extraction has recently seen considerable gains in part due to a massively data-intensive distant supervision technique from Soares et al. (2019) that produces state-of-the-art results across many benchmarks. In this work, we present a methodology for collecting high quality training data for relation extraction from unlabeled text that achieves a near-recreation of their zero-shot and few-shot results at a fraction of the training cost. Our approach exploits the predictable distributional structure of date-marked news articles to build a denoised corpus -- the extraction process filters out low quality examples. We show that a smaller multilingual encoder trained on this corpus performs comparably to the current state-of-the-art (when both receive little to no fine-tuning) on few-shot and standard relation benchmarks in English and Spanish despite using many fewer examples (50k vs. 300mil+).	翻訳日:2021-05-23 02:03:56 公開日:2020-12-04
# (参考訳) パッチ統計を利用した畳み込みニューラルネットワークを用いた超音波散乱体密度分類 Ultrasound Scatterer Density Classification Using Convolutional Neural Networks by Exploiting Patch Statistics ( http://arxiv.org/abs/2012.02738v1 ) ライセンス: CC BY 4.0	Ali K. Z. Tehrani, Mina Amiri, Ivan M. Rosado-Mendez, Timothy J. Hall, and Hassan Rivaz	(参考訳) 定量的超音波(QUS)は散乱体密度などの組織特性に関する重要な情報を明らかにすることができる。分解能セルあたりの散乱体密度が10を上回るか下回るかによって、組織はそれぞれ完全発達スペックル(FDS)または低密度散乱体(LDS)とみなされる。従来,散乱体密度は後方散乱エコーの振幅から推定した統計パラメータを用いて分類されてきた。しかし、パッチサイズが小さい場合、その推定は正確ではない。これらのパラメータは撮像設定にも強く依存する。本稿では,QUSのための畳み込みニューラルネットワーク(CNN)アーキテクチャを提案し,シミュレーションデータを用いて学習する。さらに,パッチ統計を追加の入力チャネルとして利用することで,ネットワーク性能を向上させる。シミュレーションデータ,実験ファントム,生体内データを用いてネットワークを評価する。また,提案するネットワークを古典的モデルおよび深層学習モデルと比較し,散乱体密度の異なる組織の分類において優れた性能を示す。また,提案したネットワークは,参照ファントムを必要とせずに,異なる撮像パラメータで動作可能であることを示す。本研究は超音波画像における散乱体密度の分類におけるCNNの可能性を示す。 Quantitative ultrasound (QUS) can reveal crucial information on tissue properties such as scatterer density. If the scatterer density per resolution cell is above or below 10, the tissue is considered as fully developed speckle (FDS) or low-density scatterers (LDS), respectively. Conventionally, the scatterer density has been classified using estimated statistical parameters of the amplitude of backscattered echoes. However, if the patch size is small, the estimation is not accurate. These parameters are also highly dependent on imaging settings. In this paper, we propose a convolutional neural network (CNN) architecture for QUS, and train it using simulation data. We further improve the network performance by utilizing patch statistics as additional input channels. We evaluate the network using simulation data, experimental phantoms and in vivo data. We also compare our proposed network with different classic and deep learning models, and demonstrate its superior performance in classification of tissues with different scatterer density values. 
The results also show that the proposed network is able to work with different imaging parameters with no need for a reference phantom. This work demonstrates the potential of CNNs in classifying scatterer density in ultrasound images.	翻訳日:2021-05-23 01:54:45 公開日:2020-12-04
# (参考訳) 特徴帰属説明における一般的な解釈可能性の仮定への挑戦 Challenging common interpretability assumptions in feature attribution explanations ( http://arxiv.org/abs/2012.02748v1 ) ライセンス: CC0 1.0	Jonathan Dinu (1), Jeffrey Bigham (2), J. Zico Kolter (2) ((1) Unaffiliated, (2) Carnegie Mellon University)	(参考訳) 機械学習とアルゴリズムによる意思決定システムが、ハイステークスなヒューマン・イン・ザ・ループの場面でますます活用されているため、その予測の根拠を理解することが急務となっている。研究者たちは説明可能なAI(XAI)でこのニーズに応えてきたが、しばしば評価なしに解釈可能性を公理のように主張する。これらのシステムが評価される場合も、解釈可能性の代理指標(モデルの複雑さなど)を用いたオフラインシミュレーションでテストされることが多い。本研究では,簡単な「プラセボ説明」を対照とした大規模な被験者実験により,3つの一般的な解釈可能性の仮定の妥当性を実証的に評価する。特徴帰属の説明は、我々のタスクにおいて人間の意思決定者にわずかな効用しかもたらさず、場合によっては認知的および文脈的な交絡因子のために意思決定を悪化させることがわかった。この結果は,これらの手法を適用すれば普遍的に利益が得られるという想定に疑問を投げかけるものであり,本研究がXAI研究における人間による評価の重要性を強調することを期待する。実験の匿名化データ、研究を再現するためのコード、実験のインタラクティブなデモ、分析で使用したモデルなどの補助資料は https://doi.pizza/challenging-xai で入手できる。 As machine learning and algorithmic decision making systems are increasingly being leveraged in high-stakes human-in-the-loop settings, there is a pressing need to understand the rationale of their predictions. Researchers have responded to this need with explainable AI (XAI), but often proclaim interpretability axiomatically without evaluation. When these systems are evaluated, they are often tested through offline simulations with proxy metrics of interpretability (such as model complexity). We empirically evaluate the veracity of three common interpretability assumptions through a large scale human-subjects experiment with a simple "placebo explanation" control. We find that feature attribution explanations provide marginal utility in our task for a human decision maker and in certain cases result in worse decisions due to cognitive and contextual confounders. This result challenges the assumed universal benefit of applying these methods and we hope this work will underscore the importance of human evaluation in XAI research. 
Supplemental materials -- including anonymized data from the experiment, code to replicate the study, an interactive demo of the experiment, and the models used in the analysis -- can be found at: https://doi.pizza/challenging-xai.	翻訳日:2021-05-23 01:38:03 公開日:2020-12-04
# (参考訳) 歩行者避難シミュレーションモデルの改良 An Improved Simulation Model for Pedestrian Crowd Evacuation ( http://arxiv.org/abs/2012.09135v1 ) ライセンス: CC BY 4.0	Danial A. Muhammed, Tarik A. Rashid, Abeer Alsadoon, Nebojsa Bacanin, Polla Fattah, Mokhtar Mohammadi and Indradip Banerjee	(参考訳) 本稿は,2019年後半に開発された,最新の歩行者避難モデルである「各種AI技術に基づく歩行者避難シミュレーションモデル」について論じる。本研究は,新しい手法を提案し,それをモデルに統合することで,開発したモデルに新たな機能を追加する。本手法により,提案する多くの場所の中から,最適な出口ドア位置を選択することによる安全性など,より適切な避難エリア設計が可能である。この方法は、選択されたモデルの出力、すなわち、避難プロセス内の各個人に対する避難時間に完全に依存する。新しい方法は避難者各避難所の避難所の避難時間の平均を求め,避難所の平均避難時間に基づいて避難所の避難所が避難所の避難所として最も適しているかを決定する。本手法を検証するために, 各種シナリオを用いた避難区域の設計を行った。その結果, 提案手法を用いたモデルでは, 提案位置の適切な出入口位置を予測できることがわかった。最後に, 本手法を統合した研究結果から, 安全の観点から選択したモデルに対して, 避難区域の優れた設計を選択する上で, 適切な判断を下すことができた。 This paper works on one of the most recent pedestrian crowd evacuation models, i.e., "a simulation model for pedestrian crowd evacuation based on various AI techniques", developed in late 2019. This study adds a new feature to the developed model by proposing a new method and integrating it with the model. This method enables the developed model to find a more appropriate evacuation area design, among others regarding safety due to selecting the best exit door location among many suggested locations. This method is completely dependent on the selected model's output, i.e., the evacuation time for each individual within the evacuation process. The new method finds an average of the evacuees' evacuation times of each exit door location; then, based on the average evacuation time, it decides which exit door location would be the best exit door to be used for evacuation by the evacuees. To validate the method, various designs for the evacuation area with various written scenarios were used. The results showed that the model with this new method could predict a proper exit door location among many suggested locations. 
Lastly, from the results of this research using the integration of this newly proposed method, a new capability for the selected model in terms of safety allowed the right decision in selecting the finest design for the evacuation area among other designs.	翻訳日:2021-05-23 01:14:33 公開日:2020-12-04
# (参考訳) delexicalized paraphrase generation Delexicalized Paraphrase Generation ( http://arxiv.org/abs/2012.02763v1 ) ライセンス: CC BY 4.0	Boya Yu, Konstantine Arkoudas, Wael Hamza	(参考訳) パラフレーズ化のためのニューラルモデルを提案し,デレクシカル化文を生成するよう訓練する。各入力に複数の参照パラフレーズをペアにしたトレーニングデータを作成することで、これを実現する。これらの参照パラフラスは、注釈付きスロットとインテントに基づく意味同値の弱いタイプを表す。スロットの匿名化以外の異なるタイプのスロットからのセマンティクスを理解するために、スロット値のプールの前に畳み込みニューラルネットワーク(cnn)を適用し、出力中のスロットを見つけるためにポインタを使用する。実験の結果,生成したパラフレーズは高品質であり,さらに1.29%の正確な一致が得られた。また,自然言語理解(nlu)タスク,例えばインテント分類や名前付きエンティティ認識は,自動生成パラフレーズを用いたデータ拡張の恩恵を受けることを示す。 We present a neural model for paraphrasing and train it to generate delexicalized sentences. We achieve this by creating training data in which each input is paired with a number of reference paraphrases. These sets of reference paraphrases represent a weak type of semantic equivalence based on annotated slots and intents. To understand semantics from different types of slots, other than anonymizing slots, we apply convolutional neural networks (CNN) prior to pooling on slot values and use pointers to locate slots in the output. We show empirically that the generated paraphrases are of high quality, leading to an additional 1.29% exact match on live utterances. We also show that natural language understanding (NLU) tasks, such as intent classification and named entity recognition, can benefit from data augmentation using automatically generated paraphrases.	翻訳日:2021-05-23 01:04:17 公開日:2020-12-04
# (参考訳) 深層学習における一般化予測のための表現に基づく複雑性尺度 Representation Based Complexity Measures for Predicting Generalization in Deep Learning ( http://arxiv.org/abs/2012.02775v1 ) ライセンス: CC BY 4.0	Parth Natekar, Manik Sharma	(参考訳) ディープニューラルネットワークは、非常に過度にパラメータ化されているにもかかわらず、一般化することができる。近年の研究では、この現象を様々な視点から検討し、ノルムベース、PACベイズベース、マージンベース分析など、これらの視点に基づく一般化誤差や一般化ギャップの予測値の境界について検討している。本研究では,人間の視覚系が不変かつアンタングル化された物体表現をいかに生成するかという神経科学的理論に基づいて,ディープニューラルネットワークの内部表現の品質の観点から一般化の解釈を行う。理論的な境界を与える代わりに、深層モデルにおける一般化の振る舞いを明らかにするためにアドホックに計算できる実用的な複雑性測度を示す。我々はまた、NeurIPS 2020で開催されているDeep Learningの予測一般化に関するNeurIPSコンペティションで優勝したソリューションの詳細な説明も提供している。このソリューションの実装はhttps://github.com/parthnatekar/pgdlで利用可能です。 Deep Neural Networks can generalize despite being significantly overparametrized. Recent research has tried to examine this phenomenon from various view points and to provide bounds on the generalization error or measures predictive of the generalization gap based on these viewpoints, such as norm-based, PAC-Bayes based, and margin-based analysis. In this work, we provide an interpretation of generalization from the perspective of quality of internal representations of deep neural networks, based on neuroscientific theories of how the human visual system creates invariant and untangled object representations. Instead of providing theoretical bounds, we demonstrate practical complexity measures which can be computed ad-hoc to uncover generalization behaviour in deep models. We also provide a detailed description of our solution that won the NeurIPS competition on Predicting Generalization in Deep Learning held at NeurIPS 2020. An implementation of our solution is available at https://github.com/parthnatekar/pgdl.	翻訳日:2021-05-23 00:50:36 公開日:2020-12-04
# (参考訳) 軌道の非教師的埋め込みは移動の潜在構造を捉える Unsupervised embedding of trajectories captures the latent structure of mobility ( http://arxiv.org/abs/2012.02785v1 ) ライセンス: CC BY 4.0	Dakota Murray, Jisung Yoon, Sadamori Kojaku, Rodrigo Costas, Woo-Sung Jung, Sta\v{s}a Milojevi\'c, Yong-Yeol Ahn	(参考訳) 人の移動と移住は、都市の成長と進化、疫病、経済、イノベーションといった社会的な大きな現象を駆動する。歴史的に、人間の移動は物理的分離(地理的距離)によって強い制約を受けてきた。しかし、地理的距離は、言語、文化、歴史的関係がより重要になっている間、物理的な障壁が縮小しているグローバル化の世界において、あまり重要ではない。モビリティを理解することは現代社会にとって重要になっているので、この複雑さを捉えられるフレームワークを見つけることは非常に重要です。本稿では、3つの異なる人間の軌道データセットを用いて、神経埋め込みモデルが位置間のニュアンス関係をベクトル空間にエンコードできることを実証し、人間のモビリティの多面構造を反映した効果的な距離尺度を提供する。科学的モビリティの事例に着目して,科学組織の組込みが,文化的・言語的関係,さらには学術的権威を,多段階の粒度で明らかにすることを示した。さらに, 組込みベクトルは, 科学モビリティのグローバルランドスケープにおける組織的特徴とその位置の普遍的関係を明らかにする。データから直接、スケーラブルで高密度で有意義なモビリティ表現を学習できることは、ドメイン間のモビリティを研究する新たな道を開くことができる。 Human mobility and migration drive major societal phenomena such as the growth and evolution of cities, epidemics, economies, and innovation. Historically, human mobility has been strongly constrained by physical separation -- geographic distance. However, geographic distance is becoming less relevant in the increasingly-globalized world in which physical barriers are shrinking while linguistic, cultural, and historical relationships are becoming more important. As understanding mobility is becoming critical for contemporary society, finding frameworks that can capture this complexity is of paramount importance. Here, using three distinct human trajectory datasets, we demonstrate that a neural embedding model can encode nuanced relationships between locations into a vector-space, providing an effective measure of distance that reflects the multi-faceted structure of human mobility. Focusing on the case of scientific mobility, we show that embeddings of scientific organizations uncover cultural and linguistic relations, and even academic prestige, at multiple levels of granularity. 
Furthermore, the embedding vectors reveal universal relationships between organizational characteristics and their place in the global landscape of scientific mobility. The ability to learn scalable, dense, and meaningful representations of mobility directly from the data can open up a new avenue of studying mobility across domains.	翻訳日:2021-05-23 00:06:03 公開日:2020-12-04
# (参考訳) Adaptive Explicit Kernel Minkowski Weighted K-means Adaptive Explicit Kernel Minkowski Weighted K-means ( http://arxiv.org/abs/2012.02805v1 ) ライセンス: CC BY 4.0	Amir Aradnia, Maryam Amir Haeri and Mohammad Mehdi Ebadzadeh	(参考訳) k-meansアルゴリズムは最も一般的なデータクラスタリング手法の一つである。しかし、正則 k-平均は入力空間にしか適用できず、クラスタが線形分離可能である場合にも適用できる。 k-平均を核空間に拡張するk-平均は、非線形構造をキャプチャし、任意の形状のクラスターを識別することができる。しかし、カーネルメソッドは、しばしばデータのカーネルマトリックス上で動作し、行列のサイズに悪影響を及ぼすか、カーネル値の繰り返し計算によって高いクラスタリングコストに悩まされる。もうひとつの問題は、アルゴリズムが$k(x_i, x_j)$の評価によってのみデータにアクセスすることだ。本稿では, スペクトル解析に基づく近似有限次元特徴写像を駆動することにより, 線形および非線形アプローチの利点を組み合わせる手法を提案する。近似有限次元特徴写像の適用は, サポートベクターマシン(svm)問題でのみ議論された。この手法をカーネルk-means時代において,巨大なカーネルマトリックスのメモリ保存を緩和し,クラスタ中心をより効率的に計算し,特徴空間で明示的にデータにアクセスすることを提案する。これらの明示的な特徴マップは、特徴空間内のデータに明示的にアクセスし、その空間のk-means拡張を利用することができます。 KMWK-mean(Explicit Kernel Minkowski Weighted K-mean)法は,ミンコフスキー指数と特徴量パラメータを付加することにより,新しい空間の最適値を求めることができることを示す。さらに、ユークリッドノルムではなく、ミンコフスキーノルムと分数ノルム(p<1)によるミンコフスキーノルムの拡張として)を含む他のノルムの調査を提案することにより、近隣探索に対する濃度の影響を低減できる。 The K-means algorithm is among the most commonly used data clustering methods. However, the regular K-means can only be applied in the input space and it is applicable when clusters are linearly separable. The kernel K-means, which extends K-means into the kernel space, is able to capture nonlinear structures and identify arbitrarily shaped clusters. However, kernel methods often operate on the kernel matrix of the data, which scale poorly with the size of the matrix or suffer from the high clustering cost due to the repetitive calculations of kernel values. Another issue is that algorithms access the data only through evaluations of $K(x_i, x_j)$, which limits many processes that can be done on data through the clustering task. This paper proposes a method to combine the advantages of the linear and nonlinear approaches by using driven corresponding approximate finite-dimensional feature maps based on spectral analysis. 
Applying approximate finite-dimensional feature maps were only discussed in the Support Vector Machines (SVM) problems before. We suggest using this method in kernel K-means era as alleviates storing huge kernel matrix in memory, further calculating cluster centers more efficiently and access the data explicitly in feature space. These explicit feature maps enable us to access the data in the feature space explicitly and take advantage of K-means extensions in that space. We demonstrate our Explicit Kernel Minkowski Weighted K-mean (Explicit KMWK-mean) method is able to be more adopted and find best-fitting values in new space by applying additional Minkowski exponent and feature weights parameter. Moreover, it can reduce the impact of concentration on nearest neighbour search by suggesting investigate among other norms instead of Euclidean norm, includes Minkowski norms and fractional norms (as an extension of the Minkowski norms with p<1).	翻訳日:2021-05-23 00:04:14 公開日:2020-12-04
# (参考訳) 尤度フリー推論のための時系列の要約特徴の学習 Learning summary features of time series for likelihood free inference ( http://arxiv.org/abs/2012.02807v1 ) ライセンス: CC BY 4.0	Pedro L. C. Rodrigues, Alexandre Gramfort	(参考訳) 与えられたシミュレータモデルのどのパラメータが実験データの集合を最もよく記述できるかを決定するために、尤度フリー推論(likelihood-free inference, LFI)を使うことに対する科学界の関心が高まっている。最近のエキサイティングな結果と幅広い応用可能性にもかかわらず、時系列データに適用する際のLFIの重要なボトルネックは、多くの場合ドメイン知識に基づいて手作業で調整される一連の要約特徴を定義する必要があることである。本研究では,単変量時系列から要約特徴を自動的に学習するデータ駆動戦略を提案し,自己回帰移動平均(ARMA)モデルとVan der Pol発振器から生成された信号に適用する。その結果,データから要約特徴を学習する手法は,線形の場合においても,自己相関係数などの手作りの値に基づくLFI手法に匹敵し,さらにはそれを上回りうることがわかった。 There has been an increasing interest from the scientific community in using likelihood-free inference (LFI) to determine which parameters of a given simulator model could best describe a set of experimental data. Despite exciting recent results and a wide range of possible applications, an important bottleneck of LFI when applied to time series data is the necessity of defining a set of summary features, often hand-tailored based on domain knowledge. In this work, we present a data-driven strategy for automatically learning summary features from univariate time series and apply it to signals generated from autoregressive-moving-average (ARMA) models and the Van der Pol Oscillator. Our results indicate that learning summary features from data can compete and even outperform LFI methods based on hand-crafted values such as autocorrelation coefficients even in the linear case.	翻訳日:2021-05-22 23:53:22 公開日:2020-12-04
# (参考訳) 共同配電系統状態推定のための階層的深部アクター・クリティカル学習法 A Hierarchical Deep Actor-Critic Learning Method for Joint Distribution System State Estimation ( http://arxiv.org/abs/2012.02880v1 ) ライセンス: CC BY 4.0	Yuxuan Yuan, Kaveh Dehghanpour, Zhaoyu Wang, Fankun Bu	(参考訳) 揮発性分散型太陽光発電(PV)リソースの普及により,グリッドエッジにおける顧客のリアルタイムモニタリングが重要な課題となっている。しかし、計算が複雑で大規模システムへの拡張性に欠ける分散グリッドの一次レベルと二次レベルの両方について、dsse(distribution system state estimation)を共同で解決する必要がある。 DSSEのほぼリアルタイムな解を実現するため,第1層では重み付き最小二乗法(WLS)アルゴリズムが一次中電圧供給装置よりもDSSEを解くとともに,第2層では,低電圧回路の状態を推定し,グリッドエッジにおけるPVの影響を捉えるために,各二次変圧器に対してディープアクタクリティカル(A-C)モジュールを訓練する。 A-Cパラメータ学習プロセスはオフラインで行われるが、トレーニングされたA-Cモジュールは高速な二次グリッド状態推定のためにオンラインでデプロイされる。監視精度を維持するために、2つのレベルは、トランス電圧(第1層から第2層)とアクティブ/反応性全電力注入(第2層から第1層)を含む二次ノードで境界情報を交換する。このインタラクティブな情報伝達戦略は、数回のイテレーションで両方の層で最適な解を追跡できるクローズドループ構造をもたらす。さらに,本モデルは第1層のヤコビ行列を用いてトポロジの変化を処理できる。提案手法の性能を検証するために,実効用データとフィードモデルを用いて数値実験を行った。 Due to increasing penetration of volatile distributed photovoltaic (PV) resources, real-time monitoring of customers at the grid-edge has become a critical task. However, this requires solving the distribution system state estimation (DSSE) jointly for both primary and secondary levels of distribution grids, which is computationally complex and lacks scalability to large systems. To achieve near real-time solutions for DSSE, we present a novel hierarchical reinforcement learning-aided framework: at the first layer, a weighted least squares (WLS) algorithm solves the DSSE over primary medium-voltage feeders; at the second layer, deep actor-critic (A-C) modules are trained for each secondary transformer using measurement residuals to estimate the states of low-voltage circuits and capture the impact of PVs at the grid-edge. While the A-C parameter learning process takes place offline, the trained A-C modules are deployed online for fast secondary grid state estimation; this is the key factor in scalability and computational efficiency of the framework. 
To maintain monitoring accuracy, the two levels exchange boundary information with each other at the secondary nodes, including transformer voltages (first layer to second layer) and active/reactive total power injection (second layer to first layer). This interactive information passing strategy results in a closed-loop structure that is able to track optimal solutions at both layers in few iterations. Moreover, our model can handle the topology changes using the Jacobian matrices of the first layer. We have performed numerical experiments using real utility data and feeder models to verify the performance of the proposed framework.	翻訳日:2021-05-22 22:36:43 公開日:2020-12-04
# (参考訳) サイバーセキュリティと侵入検知システムのための深層学習法 Review: Deep Learning Methods for Cybersecurity and Intrusion Detection Systems ( http://arxiv.org/abs/2012.02891v1 ) ライセンス: CC BY 4.0	Mayra Macas, Chunming Wu	(参考訳) サイバー攻撃の数が増えるにつれて、サイバーセキュリティはあらゆるビジネスにとって重要な懸念に発展しつつある。人工知能(AI)と機械学習(ML)(特にディープラーニング - DL)は、脅威検出に寄与し、サイバーアナリストに推奨アクションを提供することができるため、サイバー防衛の重要な技術として活用することができる。サイバーセキュリティへのAI/MLの採用を推進し、効率的なサイバー防衛システムを構築するためには、産業、学術、政府とのグローバルなパートナーシップが必要である。本稿では,ネットワーク侵入検出に使用される各種深層学習手法について検討し,サイバーセキュリティアプリケーションのためのdlフレームワークを提案する。 As the number of cyber-attacks is increasing, cybersecurity is evolving to a key concern for any business. Artificial Intelligence (AI) and Machine Learning (ML) (in particular Deep Learning - DL) can be leveraged as key enabling technologies for cyber-defense, since they can contribute in threat detection and can even provide recommended actions to cyber analysts. A partnership of industry, academia, and government on a global scale is necessary in order to advance the adoption of AI/ML to cybersecurity and create efficient cyber defense systems. In this paper, we are concerned with the investigation of the various deep learning techniques employed for network intrusion detection and we introduce a DL framework for cybersecurity applications.	翻訳日:2021-05-22 22:22:06 公開日:2020-12-04
# (参考訳) ファッションから地下マップを発見 Discovering Underground Maps from Fashion ( http://arxiv.org/abs/2012.02897v1 ) ライセンス: CC BY 4.0	Utkarsh Mall, Kavita Bala, Tamara Berg, Kristen Grauman	(参考訳) ある地理的地域におけるファッションセンス(人々が着ている服のスタイル)は、その地域に関する情報を明らかにすることができる。例えば、人々がそこで行う活動の種類や、その地域を頻繁に訪れる人々の種類(観光スポット、学生街、ビジネスセンターなど)を反映しうる。本研究では,人々の服装を分析することで,都市の地下マップ(地区の地図)を自動的に作成する手法を提案する。本手法は,都市全域の公開画像を用いて,類似のファッションセンスを持つ地区を見つけ出し,教師なしで地図を分割する。世界の37都市を対象に,人間の審査員による実験と非画像データから得られた地下マップのベンチマークを用いて評価し,良好な地下マップを作成できるという有望な結果を示す。我々のアプローチはさらに、特徴的な地区の検出(LAで最もユニークな地域はどこか?)や、都市間の類推質問(Bogotaの「Downtown LA」はどこか?)への回答も可能にする。 The fashion sense -- meaning the clothing styles people wear -- in a geographical region can reveal information about that region. For example, it can reflect the kind of activities people do there, or the type of crowds that frequently visit the region (e.g., tourist hot spot, student neighborhood, business center). We propose a method to automatically create underground neighborhood maps of cities by analyzing how people dress. Using publicly available images from across a city, our method finds neighborhoods with a similar fashion sense and segments the map without supervision. For 37 cities worldwide, we show promising results in creating good underground maps, as evaluated using experiments with human judges and underground map benchmarks derived from non-image data. Our approach further allows detecting distinct neighborhoods (what is the most unique region of LA?) and answering analogy questions between cities (what is the "Downtown LA" of Bogota?).	翻訳日:2021-05-22 22:12:26 公開日:2020-12-04
# (参考訳) メカニカルパイプの3次元再構成のための移動式カメラの自動校正 Automated Calibration of Mobile Cameras for 3D Reconstruction of Mechanical Pipes ( http://arxiv.org/abs/2012.02899v1 ) ライセンス: CC BY 4.0	Reza Maalek and Derek Lichti	(参考訳) この原稿は、大規模な円形の黒と白のターゲットフィールドを使用して、光学機器、特にモバイルカメラの校正のための新しいフレームワークを提供する。 i)画像間の目標のマッチング、(ii)目標中心の系統的偏心誤差の調整、(iii)自由ネットワーク自己調整によるキャリブレーションソリューションの反復的改善のための新しい方法が導入された。完全校正実験室から得られた270個の携帯電話画像において,提案したターゲットマッチングは,II型エラーに対するロバスト性を有する円形目標と効果的に一致した。 2つのビューからのカメラ投影行列のみを必要とする偏心調整は、事前にいくつかのオブジェクト空間ターゲット情報を必要とする利用可能なクローズドフォームソリューションと同義的に振る舞う。最後に, 携帯機器の場合, 機械管の3次元再構成半径を推定するためのその場キャリブレーションよりも, フレームワークを用いて得られたキャリブレーションパラメータが優れていることがわかった(約45%改良)。 This manuscript provides a new framework for calibration of optical instruments, in particular mobile cameras, using large-scale circular black and white target fields. New methods were introduced for (i) matching targets between images; (ii) adjusting the systematic eccentricity error of target centers; and (iii) iteratively improving the calibration solution through a free-network self-calibrating bundle adjustment. It was observed that the proposed target matching effectively matched circular targets in 270 mobile phone images from a complete calibration laboratory with robustness to Type II errors. The proposed eccentricity adjustment, which requires only camera projective matrices from two views, behaved synonymous to available closed-form solutions, which require several additional object space target information a priori. Finally, specifically for the case of the mobile devices, the calibration parameters obtained using our framework was found superior compared to in-situ calibration for estimating the 3D reconstructed radius of a mechanical pipe (approximately 45% improvement).	翻訳日:2021-05-22 21:58:46 公開日:2020-12-04
# (参考訳) 人間のフィードバックによる解釈可能な概念モデルの構築 Learning Interpretable Concept-Based Models with Human Feedback ( http://arxiv.org/abs/2012.02898v1 ) ライセンス: CC BY 4.0	Isaac Lage, Finale Doshi-Velez	(参考訳) 人間の理解可能な概念からドメインの表現を学習し、それを予測するために使用する機械学習モデルは、高次元データで訓練されたモデルとの解釈と相互作用を容易にするために提案されている。しかし、これらの方法には重要な制限がある: 概念を定義する方法は本質的に解釈可能ではなく、概念ラベルは個々のインスタンスに存在しているか、ユーザから容易に取得できると仮定する。これらの制限は特に高次元の表形式の特徴に対して急激である。個々のインスタンスではなく,概念特徴をラベル付けするユーザに依存する高次元表データにおいて,一連の透明概念定義を学習するためのアプローチを提案する。提案手法は,概念の意味をユーザの直感的に理解し,透過的な機械学習モデルにより下流ラベルの予測を容易にする概念である。これにより、完全なモデルは透過的で直感的であり、この制約を考慮すれば可能な限り予測可能である。臨床領域を含む実際の予測問題に対するユーザフィードバックをシミュレーションすることで、このような直接的なフィードバックは、類似した予測性能を維持しながら、レベリングインスタンスや他の既存のインタラクションメカニズムに依存する代替の透明なアプローチよりも、真実の概念定義に適合する学習ソリューションにおいてはるかに効率的であることを示す。 Machine learning models that first learn a representation of a domain in terms of human-understandable concepts, then use it to make predictions, have been proposed to facilitate interpretation and interaction with models trained on high-dimensional data. However these methods have important limitations: the way they define concepts are not inherently interpretable, and they assume that concept labels either exist for individual instances or can easily be acquired from users. These limitations are particularly acute for high-dimensional tabular features. We propose an approach for learning a set of transparent concept definitions in high-dimensional tabular data that relies on users labeling concept features instead of individual instances. Our method produces concepts that both align with users' intuitive sense of what a concept means, and facilitate prediction of the downstream label by a transparent machine learning model. This ensures that the full model is transparent and intuitive, and as predictive as possible given this constraint. 
We demonstrate with simulated user feedback on real prediction problems, including one in a clinical domain, that this kind of direct feedback is much more efficient at learning solutions that align with ground truth concept definitions than alternative transparent approaches that rely on labeling instances or other existing interaction mechanisms, while maintaining similar predictive performance.	翻訳日:2021-05-22 21:29:40 公開日:2020-12-04
# クロスモーダル一般化:メタアライメントによる低リソースモダリティ学習 Cross-Modal Generalization: Learning in Low Resource Modalities via Meta-Alignment ( http://arxiv.org/abs/2012.02813v1 ) ライセンス: Link先を確認	Paul Pu Liang, Peter Wu, Liu Ziyin, Louis-Philippe Morency, Ruslan Salakhutdinov	(参考訳) 自然界は視覚、音響、触覚、言語的モダリティを通じて表現される概念が豊富である。しかし、マルチモーダル学習の現在の進歩の多くは、訓練時とテスト時に同じモダリティの組が存在する問題に主に焦点を当てており、そのため低リソースモダリティでの学習は特に困難になっている。本研究では,クロスモーダル一般化のためのアルゴリズムを提案する。これは,(1)目標モダリティにおける新しいタスクを迅速に実行でき(すなわちメタ学習),(2)しかも異なるソースモダリティで訓練されたモデルを学習するためのパラダイムである。我々は、ソースとターゲットのモダリティに別々のエンコーダを使いながら、モダリティをまたいだ一般化をどのように確保するかという重要な研究課題を検討する。我々の解決策は,メタアライメント(meta-alignment)に基づく。これは,強く対応付けられたクロスモーダルデータと弱く対応付けられたクロスモーダルデータを用いて表現空間を整列させつつ,異なるモダリティにまたがる新しいタスクへの迅速な一般化を保証する新しい手法である。本稿では,テキストから画像,画像から音声,テキストから音声の3つの分類タスクでこの問題を検討する。その結果,新たな目標モダリティにわずか(1-10個)のラベル付きサンプルしかない場合や,低リソースモダリティで特によく見られるノイズラベルが存在する場合においても,高い性能が得られることがわかった。 The natural world is abundant with concepts expressed via visual, acoustic, tactile, and linguistic modalities. Much of the existing progress in multimodal learning, however, focuses primarily on problems where the same set of modalities are present at train and test time, which makes learning in low-resource modalities particularly difficult. In this work, we propose algorithms for cross-modal generalization: a learning paradigm to train a model that can (1) quickly perform new tasks in a target modality (i.e. meta-learning) and (2) doing so while being trained on a different source modality. We study a key research question: how can we ensure generalization across modalities despite using separate encoders for different source and target modalities? Our solution is based on meta-alignment, a novel method to align representation spaces using strongly and weakly paired cross-modal data while ensuring quick generalization to new tasks across different modalities. We study this problem on 3 classification tasks: text to image, image to audio, and text to speech. 
Our results demonstrate strong performance even when the new target modality has only a few (1-10) labeled samples and in the presence of noisy labels, a scenario particularly prevalent in low-resource modalities.	翻訳日:2021-05-22 20:55:30 公開日:2020-12-04
# エンド・ツー・エンドのセンサモレータ学習のためのニューラル・ダイナミック・ポリシー Neural Dynamic Policies for End-to-End Sensorimotor Learning ( http://arxiv.org/abs/2012.02788v1 ) ライセンス: Link先を確認	Shikhar Bahl, Mustafa Mukadam, Abhinav Gupta, Deepak Pathak	(参考訳) 感覚運動器制御における現在の支配的なパラダイムは、模倣や強化学習であっても、トルク、関節角、エンドエフェクタ位置といった生のアクション空間でポリシーを直接訓練することである。これにより、エージェントはトレーニングの各時間ステップで個別に決定し、従ってスケーラビリティを連続的、高次元、長距離のタスクに制限する。対照的に、古典ロボットの研究は、長い間、デモを通してロボットの振る舞いを学ぶための政策表現として、力学システムを利用してきた。しかし、これらの手法は深層学習や強化学習によって提供される柔軟性と一般化性に欠けており、そのような環境では未調査のままである。本研究では、このギャップを埋め、二階微分方程式を用いて作用空間を再パラメータ化することにより、動的システムの構造をディープニューラルネットワークベースのポリシーに組み込む。本稿では,行動が生の制御空間を表す事前の政策学習手法とは対照的に,軌道分布空間における予測を行う神経力学ポリシ(ndps)を提案する。組込み構造は、強化学習と模倣学習の両方のためのエンドツーエンドのポリシー学習を可能にする。 ndpsは,模倣学習と強化学習のいずれにおいても,複数のロボット制御タスクの効率や性能において,従来の最先端技術よりも優れていた。プロジェクトビデオとコードはhttps://shikharbahl.github.io/neural-dynamic-policies/で入手できる。 The current dominant paradigm in sensorimotor control, whether imitation or reinforcement learning, is to train policies directly in raw action spaces such as torque, joint angle, or end-effector position. This forces the agent to make decisions individually at each timestep in training, and hence, limits the scalability to continuous, high-dimensional, and long-horizon tasks. In contrast, research in classical robotics has, for a long time, exploited dynamical systems as a policy representation to learn robot behaviors via demonstrations. These techniques, however, lack the flexibility and generalizability provided by deep learning or reinforcement learning and have remained under-explored in such settings. In this work, we begin to close this gap and embed the structure of a dynamical system into deep neural network-based policies by reparameterizing action spaces via second-order differential equations. We propose Neural Dynamic Policies (NDPs) that make predictions in trajectory distribution space as opposed to prior policy learning methods where actions represent the raw control space. 
The embedded structure allows end-to-end policy learning for both reinforcement and imitation learning setups. We show that NDPs outperform the prior state-of-the-art in terms of either efficiency or performance across several robotic control tasks for both imitation and reinforcement learning setups. Project video and code are available at https://shikharbahl.github.io/neural-dynamic-policies/	翻訳日:2021-05-22 20:54:55 公開日:2020-12-04
# 自動車苦情分析のための知識ベースとしての事前学習言語モデル Pre-trained language models as knowledge bases for Automotive Complaint Analysis ( http://arxiv.org/abs/2012.02558v1 ) ライセンス: Link先を確認	V. D. Viellieber and M. A{\ss}enmacher	(参考訳) 最近、bert(devlin et al., 2018)のような大規模な事前学習された言語モデルが、事前学習コーパス(petroni et al., 2019)で取得した常識的事実知識を格納できることが示されている。本研究は,自動車産業における非構造的顧客からのフィードバックから得られた技術的品質問題を明らかにするための一連の調査を,産業からの応用に関してさらに評価する。タスクを満載した事前トレーニング済みモデルのアウト・オブ・ザ・ボックス版を探索した後、私たちは、Office of Defects Investigation (ODI) のデータセットで継続事前トレーニングを通じて、動的により多くの知識を提供する。実験では,ドメイン固有のトピックに関するクエリに関するパフォーマンスを,実際の知識そのものを問う場合と比較した。 (2019年)。評価されたほとんどのアーキテクチャでは、正しいトークンは60\%以上の$Precision@1$$(P@1$)で予測され、その一方、$P@5$と$P@10$は、それぞれ80\%以上、最大90%の値に達する。これらの結果は,顧客からのフィードバックを構造化分析するための知識基盤として言語モデルを用いる可能性を示している。 Recently it has been shown that large pre-trained language models like BERT (Devlin et al., 2018) are able to store commonsense factual knowledge captured in its pre-training corpus (Petroni et al., 2019). In our work we further evaluate this ability with respect to an application from industry creating a set of probes specifically designed to reveal technical quality issues captured as described incidents out of unstructured customer feedback in the automotive industry. After probing the out-of-the-box versions of the pre-trained models with fill-in-the-mask tasks we dynamically provide it with more knowledge via continual pre-training on the Office of Defects Investigation (ODI) Complaints data set. In our experiments the models exhibit performance regarding queries on domain-specific topics compared to when queried on factual knowledge itself, as Petroni et al. (2019) have done. For most of the evaluated architectures the correct token is predicted with a $Precision@1$ ($P@1$) of above 60\%, while for $P@5$ and $P@10$ even values of well above 80\% and up to 90\% respectively are reached. These results show the potential of using language models as a knowledge base for structured analysis of customer feedback.	
翻訳日:2021-05-22 20:54:33 公開日:2020-12-04
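The $P@1$, $P@5$ and $P@10$ figures quoted in the entry above can be illustrated with a minimal sketch. It assumes the usual probing convention that a fill-in-the-mask query counts as correct at rank k when the gold token appears among the model's top-k predictions; the function names and the toy probes are hypothetical and are not taken from the paper.

```python
def precision_at_k(ranked_tokens, gold, k):
    """Return 1.0 if the gold token is among the top-k ranked predictions, else 0.0."""
    return 1.0 if gold in ranked_tokens[:k] else 0.0


def mean_precision_at_k(probes, k):
    """Average P@k over a list of (ranked_tokens, gold_token) probes."""
    return sum(precision_at_k(ranked, gold, k) for ranked, gold in probes) / len(probes)


# Hypothetical probes: each holds the model's ranked guesses for a masked
# token and the expected (gold) token.
toy_probes = [
    (["brake", "engine", "tire"], "engine"),
    (["airbag", "seat", "belt"], "airbag"),
]

p_at_1 = mean_precision_at_k(toy_probes, 1)
p_at_2 = mean_precision_at_k(toy_probes, 2)
```

In this toy case only the second probe is correct at rank 1, while both are correct at rank 2.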
# メタ学習へのモデル非依存学習 Model-Agnostic Learning to Meta-Learn ( http://arxiv.org/abs/2012.02684v1 ) ライセンス: Link先を確認	Arnout Devos, Yatin Dandi	(参考訳) 本稿では,同一分布から特定のタスクに迅速に適応する前に,関連するタスク間の共通性を未認識のタスク分布から迅速に活用できる学習アルゴリズムを提案する。本稿では,タスク分布の異なる学習が,まずタスクのメタファインタニングによって適応性を向上させる方法を検討する。合成回帰実験は、メタ学習への学習が適応性と連続的な一般化を改善するという直感を検証する。本提案の方法論, 設定, 仮説は, 確定実験を行う前に, ピアレビューによって肯定的に評価された。 In this paper, we propose a learning algorithm that enables a model to quickly exploit commonalities among related tasks from an unseen task distribution, before quickly adapting to specific tasks from that same distribution. We investigate how learning with different task distributions can first improve adaptability by meta-finetuning on related tasks before improving goal task generalization with finetuning. Synthetic regression experiments validate the intuition that learning to meta-learn improves adaptability and consecutively generalization. The methodology, setup, and hypotheses in this proposal were positively evaluated by peer review before conclusive experiments were carried out.	翻訳日:2021-05-22 20:54:01 公開日:2020-12-04
# コモンセンスでテキストベースのゲームをする Playing Text-Based Games with Common Sense ( http://arxiv.org/abs/2012.02757v1 ) ライセンス: Link先を確認	Sahith Dambekodi, Spencer Frazier, Prithviraj Ammanabrolu, Mark O. Riedl	(参考訳) テキストベースのゲームは、エージェントが純粋に自然言語を通して世界と対話するシミュレーションである。それらは典型的には、一般的な日常の物体や場所と相互作用する多くのパズルで構成されている。深層強化学習エージェントはこれらのパズルを解くことができる。しかしながら、人間のプレイヤーにとって自明な環境との日常的な相互作用は、エージェントに新たなパズルとして提示される。エージェントに常識知識を組み込むための2つの手法を探索する。コモンセンス推論モデルCOMETまたは言語モデルBERTで世界状態の潜在的隠れた側面を推測する。言語モデルによって認識される共通パターンに従ってエージェントを探索する。 9to05ゲームはテキストベースのゲームの極端なバージョンであり、日常的な日常的なシナリオにおいて、共通の日常的なオブジェクトと多数のインタラクションを必要とする。我々は、コモンセンス推論によって世界状態に関する信念を補強するエージェントは、観察的エラーやテキスト記述からの共通要素の欠落に対してより頑健であると結論づける。 Text based games are simulations in which an agent interacts with the world purely through natural language. They typically consist of a number of puzzles interspersed with interactions with common everyday objects and locations. Deep reinforcement learning agents can learn to solve these puzzles. However, the everyday interactions with the environment, while trivial for human players, present as additional puzzles to agents. We explore two techniques for incorporating commonsense knowledge into agents. Inferring possibly hidden aspects of the world state with either a commonsense inference model COMET, or a language model BERT. Biasing an agents exploration according to common patterns recognized by a language model. We test our technique in the 9to05 game, which is an extreme version of a text based game that requires numerous interactions with common, everyday objects in common, everyday scenarios. We conclude that agents that augment their beliefs about the world state with commonsense inferences are more robust to observational errors and omissions of common elements from text descriptions.	翻訳日:2021-05-22 20:53:38 公開日:2020-12-04
# 自己監督型VQA:画像とキャプションを用いた視覚的質問への回答 Self-Supervised VQA: Answering Visual Questions using Images and Captions ( http://arxiv.org/abs/2012.02356v1 ) ライセンス: Link先を確認	Pratyay Banerjee, Tejas Gokhale, Yezhou Yang, Chitta Baral	(参考訳) VQAモデルのトレーニング方法は、トレーニングのために人間の注釈付きイメージクエスト・アンサー(I-Q-A)トリプルでデータセットを利用できると仮定する。これにより、データセットへの依存度が高くなり、新しいタイプの質問やシーンへの一般化が欠如している。さらに、これらのデータセットは、アノテータの主観性、偏見、誤り、および言語的先行性を示し、これらのサンプルで訓練されたVQAモデルにパーコレーションする。人間の注釈付きQ-Aペアを使わずにモデルをトレーニングできるかどうかを,説明的かつ主観的でない画像と関連するテキストキャプションのみを用いて検討する。本稿では,テンプレートやqasrlなどのアノテーションフレームワークを用いたキャプションから,手続き的に生成されたq-aペアを用いたモデル学習手法を提案する。多くのVQAモデルは、オブジェクト検出器から抽出された高密度でコストのかかるオブジェクトアノテーションに依存しているため、オブジェクト境界ボックスの単純かつ効果的な代替手段として、空間ピラミド画像パッチを提案する。ラベルシフトのソフトバージョンを含むVQA-v2,GQA,VQA-CPのベンチマークを行った。提案手法はvqa-cpの事前教師付きメソッドを上回っており,完全に教師付き設定のオブジェクト特徴のないメソッドと競合する。 Methodologies for training VQA models assume the availability of datasets with human-annotated Image-Question-Answer(I-Q-A) triplets for training. This has led to a heavy reliance and overfitting on datasets and a lack of generalization to new types of questions and scenes. Moreover, these datasets exhibit annotator subjectivity, biases, and errors, along with linguistic priors, which percolate into VQA models trained on such samples. We study whether models can be trained without any human-annotated Q-A pairs, but only with images and associated text captions which are descriptive and less subjective. We present a method to train models with procedurally generated Q-A pairs from captions using techniques, such as templates and annotation frameworks like QASRL. As most VQA models rely on dense and costly object annotations extracted from object detectors, we propose spatial-pyramid image patches as a simple but effective alternative to object bounding boxes, and demonstrate that our method uses fewer human annotations. We benchmark on VQA-v2, GQA, and on VQA-CP which contains a softer version of label shift. 
Our methods surpass prior supervised methods on VQA-CP and are competitive with methods without object features in the fully supervised setting.	翻訳日:2021-05-22 20:53:04 公開日:2020-12-04
# MPG:コンディショナルスタイルGANを用いた多機能ピザイメージジェネレータ MPG: A Multi-ingredient Pizza Image Generator with Conditional StyleGANs ( http://arxiv.org/abs/2012.02821v1 ) ライセンス: Link先を確認	Fangda Han, Guoyao Hao, Ricardo Guerrero, Vladimir Pavlovic	(参考訳) マルチラベル条件画像生成はコンピュータビジョンにおいて難しい問題である。本研究では,マルチラベル画像合成のための条件付き敵対的生成ネットワーク(GAN)フレームワークであるmulti-ingredient pizza generator (mpg)を提案する。そこで我々は,mpgをstylegan2と呼ばれる最先端のgan構造に基づいて設計し,中間的特徴マップを強制してスケールワイズラベル情報を学習する新しい条件付け手法を開発した。また, マルチラベル画像生成問題の複雑な性質から, 対応する成分を予測して合成画像を正規化するとともに, マッチング画像と不一致画像との区別を促す。 MPGの有効性を検証するために、慎重に注釈付けされた多言語ピザ画像データセットであるPizza10で試した。 MPGは、望まれる材料で、フォトリアリスティックなピザ画像を生成することができる。このフレームワークは他のマルチラベル画像生成シナリオにも容易に拡張できる。 Multilabel conditional image generation is a challenging problem in computer vision. In this work we propose Multi-ingredient Pizza Generator (MPG), a conditional Generative Adversarial Network (GAN) framework for synthesizing multilabel images. We design MPG based on a state-of-the-art GAN structure called StyleGAN2, in which we develop a new conditioning technique by enforcing intermediate feature maps to learn scalewise label information. Because of the complex nature of the multilabel image generation problem, we also regularize the synthetic images by predicting the corresponding ingredients and encourage the discriminator to distinguish between matched and mismatched images. To verify the efficacy of MPG, we test it on Pizza10, a carefully annotated multi-ingredient pizza image dataset. MPG can successfully generate photorealistic pizza images with the desired ingredients. The framework can be easily extended to other multilabel image generation scenarios.	翻訳日:2021-05-22 20:52:43 公開日:2020-12-04
# 逆ダイナミクスモデルを用いた画素からの計画 Planning from Pixels using Inverse Dynamics Models ( http://arxiv.org/abs/2012.02419v1 ) ライセンス: Link先を確認	Keiran Paster, Sheila A. McIlraith, Jimmy Ba	(参考訳) 高次元観測空間におけるタスク非依存力学モデルの学習は、モデルベースRLエージェントでは困難である。本稿では,タスク完了に条件づけられた将来の行動のシーケンスを予測し,潜在世界モデルを学ぶ新しい方法を提案する。これらのタスク条件付きモデルは、タスク関連力学のモデリング能力に適応的に焦点を合わせ、同時にスパース報酬を伴う計画のための効果的なヒューリスティックとして機能する。本手法は,視覚目標達成課題に対する課題評価を行い,従来のモデルフリーアプローチに比べて性能が大幅に向上することを示す。 Learning task-agnostic dynamics models in high-dimensional observation spaces can be challenging for model-based RL agents. We propose a novel way to learn latent world models by learning to predict sequences of future actions conditioned on task completion. These task-conditioned models adaptively focus modeling capacity on task-relevant dynamics, while simultaneously serving as an effective heuristic for planning with sparse rewards. We evaluate our method on challenging visual goal completion tasks and show a substantial increase in performance compared to prior model-free approaches.	翻訳日:2021-05-22 20:52:09 公開日:2020-12-04
# グラフ上の教師なし逆回転表現学習 Unsupervised Adversarially-Robust Representation Learning on Graphs ( http://arxiv.org/abs/2012.02486v1 ) ライセンス: Link先を確認	Jiarong Xu, Junru Chen, Yang Yang, Yizhou Sun, Chunping Wang, Jiangang Lu	(参考訳) 近年の研究では、グラフの深層学習が敵の攻撃に弱いことが示されており、入力データに対する知覚不能な摂動が劇的な性能劣化を引き起こす可能性がある。本稿では,グラフ上のロバスト表現を相互情報を通して学習する基礎的な問題に着目する。ラベル空間に基づいてタスク固有ロバスト性を測定する以前の研究とは対照的に、グラフトポロジとノード属性の合同入力空間を考慮に入れたタスク自由ロバスト性の測定には、表現空間を利用する。本稿では,この問題を制約付きサドル点最適化問題として定式化し,探索空間の縮小で効率よく解く。さらに,タスクフリーなロバストネス尺度と下流分類器のロバストネスとの理論的関係を確実に確立する。大規模な実験により,提案手法はグラフに対する敵攻撃に対する堅牢性を向上できるが,自然な精度も向上できることが示された。 Recent works have demonstrated that deep learning on graphs is vulnerable to adversarial attacks, in that imperceptible perturbations on input data can lead to dramatic performance deterioration. In this paper, we focus on the underlying problem of learning robust representations on graphs via mutual information. In contrast to previous works that measure task-specific robustness based on the label space, we here take advantage of the representation space to study a task-free robustness measure given the joint input space w.r.t. graph topology and node attributes. We formulate this problem as a constrained saddle point optimization problem and solve it efficiently in a reduced search space. Furthermore, we provably establish theoretical connections between our task-free robustness measure and the robustness of downstream classifiers. Extensive experiments demonstrate that our proposed method is able to enhance robustness against adversarial attacks on graphs, and even increases natural accuracy.	翻訳日:2021-05-22 20:52:01 公開日:2020-12-04
# 手続き的生成環境における実証効率の良い逆強化学習 Demonstration-efficient Inverse Reinforcement Learning in Procedurally Generated Environments ( http://arxiv.org/abs/2012.02527v1 ) ライセンス: Link先を確認	Alessandro Sestini, Alexander Kuhnle and Andrew D. Bagdanov	(参考訳) 深層強化学習は、報酬関数を手作業で設計できる領域において、非常に良い結果をもたらす。同時に、このタイプの環境はドメインシフト下でエージェントの過剰フィットと一般化を研究するのに最適であるため、pcg(procedurally content generation)に基づいたゲームをベンチマーク環境として使用するコミュニティの関心が高まっている。逆強化学習(IRL)は、専門家によるデモンストレーションから報酬関数を外挿する代わりに、高次元問題においても良い結果が得られるが、これらのテクニックを手続き的に生成された環境に適用する例はない。これは主に、良い報酬モデルを見つけるのに必要なデモの数のためです。そこで本研究では,pcgゲームにおける実演の必要性を大幅に減らすことができる逆強化学習に基づく手法を提案する。初期シードレベルが制限された環境と、トレーニングを安定させるためにいくつかの修正を加えることで、私たちのアプローチであるDE-AIRLは実証効率が高く、完全に手続き領域に一般化する報酬関数を外挿できることを示す。本手法は,MiniGridとDeepCrawlの2つの手続き環境において,様々なタスクに対して有効であることを示す。 Deep Reinforcement Learning achieves very good results in domains where reward functions can be manually engineered. At the same time, there is growing interest within the community in using games based on Procedurally Content Generation (PCG) as benchmark environments since this type of environment is perfect for studying overfitting and generalization of agents under domain shift. Inverse Reinforcement Learning (IRL) can instead extrapolate reward functions from expert demonstrations, with good results even on high-dimensional problems, however there are no examples of applying these techniques to procedurally-generated environments. This is mostly due to the number of demonstrations needed to find a good reward model. We propose a technique based on Adversarial Inverse Reinforcement Learning which can significantly decrease the need for expert demonstrations in PCG games. Through the use of an environment with a limited set of initial seed levels, plus some modifications to stabilize training, we show that our approach, DE-AIRL, is demonstration-efficient and still able to extrapolate reward functions which generalize to the fully procedural domain. 
We demonstrate the effectiveness of our technique on two procedural environments, MiniGrid and DeepCrawl, for a variety of tasks.	翻訳日:2021-05-22 20:51:46 公開日:2020-12-04
# DPM:外挿における物理情報ニューラルネットワークの新しいトレーニング手法 DPM: A Novel Training Method for Physics-Informed Neural Networks in Extrapolation ( http://arxiv.org/abs/2012.02681v1 ) ライセンス: Link先を確認	Jungeun Kim, Kookjin Lee, Dongeun Lee, Sheo Yon Jin, Noseong Park	(参考訳) 本稿では,時間依存非線形偏微分方程式 (pdes) によって記述される複雑な物理過程のダイナミクスを学ぶ手法を提案する。私たちの特に関心は、トレーニングで使用される時間領域の範囲を超えて、ソリューションを外挿することにあります。ベースライン手法の選び方は,物理インフォームドニューラルネットワーク (pinn) [raissi et al., j. comput] である。 phys., 378:686--707, 2019] この手法は、解だけでなく、物理過程のダイナミクスを記述する方程式もパラメタライズするためである。 PINNは,多くのベンチマーク問題において,外挿作業において不十分な性能を示す。そこで本研究では,新しいPINNのトレーニング手法を提案するとともに,拡張されたPINNが解の正確な外挿を時間内に行えることを示す。提案手法は,標準L2ノルム法において,既存の手法よりも最大72%小さい誤差を示す。 We present a method for learning dynamics of complex physical processes described by time-dependent nonlinear partial differential equations (PDEs). Our particular interest lies in extrapolating solutions in time beyond the range of temporal domain used in training. Our choice for a baseline method is physics-informed neural network (PINN) [Raissi et al., J. Comput. Phys., 378:686--707, 2019] because the method parameterizes not only the solutions but also the equations that describe the dynamics of physical processes. We demonstrate that PINN performs poorly on extrapolation tasks in many benchmark problems. To address this, we propose a novel method for better training PINN and demonstrate that our newly enhanced PINNs can accurately extrapolate solutions in time. Our method shows up to 72% smaller errors than existing methods in terms of the standard L2-norm metric.	翻訳日:2021-05-22 20:51:27 公開日:2020-12-04
# 安全なAI構築のための11の提案の概要 An overview of 11 proposals for building safe advanced AI ( http://arxiv.org/abs/2012.07532v1 ) ライセンス: Link先を確認	Evan Hubinger	(参考訳) 本稿では、反復増幅、議論によるai安全性、再帰的報酬モデリングなどを含む、現在の機械学習パラダイムの下で安全な高度なaiを構築するための11の異なる提案を分析し比較する。本論文では,各提案について,外方アライメント,内方アライメント,トレーニング競合性,パフォーマンス競争力の4つの要素について評価し,後者の2つを区別する。先行文献は主に個々の提案の分析に重点を置いているが、この分析は前述した4つのコンポーネントの比較分析を含む幅広い提案を比較検討することを目的としている。 This paper analyzes and compares 11 different proposals for building safe advanced AI under the current machine learning paradigm, including major contenders such as iterated amplification, AI safety via debate, and recursive reward modeling. Each proposal is evaluated on the four components of outer alignment, inner alignment, training competitiveness, and performance competitiveness, of which the distinction between the latter two is introduced in this paper. While prior literature has primarily focused on analyzing individual proposals, or primarily focused on outer alignment at the expense of inner alignment, this analysis seeks to take a comparative look at a wide range of proposals including a comparative analysis across all four previously mentioned components.	翻訳日:2021-05-22 20:51:12 公開日:2020-12-04
# ソフトウェア工学におけるチャットボットの自然言語理解プラットフォームの比較 A Comparison of Natural Language Understanding Platforms for Chatbots in Software Engineering ( http://arxiv.org/abs/2012.02640v1 ) ライセンス: Link先を確認	Ahmad Abdellatif, Khaled Badran, Diego Elias Costa, and Emad Shihab	(参考訳) チャットボットは、ソフトウェアエンジニアリングの未来を劇的に変え、実践者が彼らのソフトウェアプロジェクトについてチャットし、調査し、自然言語を使ってさまざまなサービスと対話できるようにする。すべてのチャットボットの中心には自然言語理解(NLU)コンポーネントがあり、チャットボットは自然言語入力を理解できる。近年、チャットボットの既製のNLUコンポーネントとして多くのNLUプラットフォームが提供されているが、ソフトウェアエンジニアリングチャットボットの最高のNLUを選択することはオープンな課題である。そこで本稿では,IBM Watson, Google Dialogflow, Rasa, Microsoft LUIS の4つの NLU を評価し,ソフトウェア工学ベースのチャットボットにおける NLU の使用方法を明らかにした。具体的には,NLUの性能を,意図の分類,信頼度スコアの安定性,実体抽出において検証する。 nlusを評価するには、ソフトウェアエンジニアリングの実践者が行う2つの共通タスクを反映した2つのデータセットを使用する。1) チャットボットとチャットしてソフトウェアリポジトリについて質問するタスク 2) q&aフォーラム(例えばstack overflow)で開発質問をするタスク。我々の発見によると、IBM Watsonは3つの側面(インテント分類、信頼性スコア、エンティティ抽出)を考慮すると、最高のNLUである。しかしながら、各側面の結果から、ibm watsonは、意図分類において、f1-measure > 84%で最高のパフォーマンスを示すが、信頼度スコアでは、rasaは、0.91よりも高い信頼度スコアで上位に来る。また,ダイアログフローを除くすべてのNLUが信頼できる信頼スコアを提供することを示した。エンティティ抽出では、Microsoft LUISとIBM Watsonが2つのSEタスクで他のNLUを上回っている。この結果は,チャットボットでどのNLUを使うかを決める際に,ソフトウェア工学の実践者にガイダンスを提供する。 Chatbots are envisioned to dramatically change the future of Software Engineering, allowing practitioners to chat and inquire about their software projects and interact with different services using natural language. At the heart of every chatbot is a Natural Language Understanding (NLU) component that enables the chatbot to understand natural language input. Recently, many NLU platforms were provided to serve as an off-the-shelf NLU component for chatbots, however, selecting the best NLU for Software Engineering chatbots remains an open challenge. Therefore, in this paper, we evaluate four of the most commonly used NLUs, namely IBM Watson, Google Dialogflow, Rasa, and Microsoft LUIS to shed light on which NLU should be used in Software Engineering based chatbots. 
Specifically, we examine the NLUs' performance in classifying intents, confidence scores stability, and extracting entities. To evaluate the NLUs, we use two datasets that reflect two common tasks performed by Software Engineering practitioners, 1) the task of chatting with the chatbot to ask questions about software repositories 2) the task of asking development questions on Q&A forums (e.g., Stack Overflow). According to our findings, IBM Watson is the best performing NLU when considering the three aspects (intents classification, confidence scores, and entity extraction). However, the results from each individual aspect show that, in intents classification, IBM Watson performs the best with an F1-measure > 84%, but in confidence scores, Rasa comes on top with a median confidence score higher than 0.91. Our results also show that all NLUs, except for Dialogflow, generally provide trustable confidence scores. For entity extraction, Microsoft LUIS and IBM Watson outperform other NLUs in the two SE tasks. Our results provide guidance to software engineering practitioners when deciding which NLU to use in their chatbots.	翻訳日:2021-05-22 20:50:59 公開日:2020-12-04
# 近似最適化平滑化アルゴリズム Proximal Policy Optimization Smoothed Algorithm ( http://arxiv.org/abs/2012.02439v1 ) ライセンス: Link先を確認	Wangshu Zhu and Andre Rosendo	(参考訳) PPO(Proximal Policy Optimization)は、強化学習のサブフィールドであるポリシーサーチにおいて、各ポリシー更新におけるステップサイズを制限するために代理目的関数を使用することによって、最先端の結果を得た。このような制限は有用であるが、このアルゴリズムは曲線の急激な平坦化による性能不安定性と最適化の非効率さに悩まされている。この問題に対処するために,近位政策最適化スムースアルゴリズム(proximal policy optimization smooth algorithm, ppo)と呼ばれるppo変種を提案する。我々は,ロールバッククリッピング方式を採用したPPOとPPORBを比較し,他のPPO法よりも各ステップでより正確な更新を行うことができることを示す。さらに, 連続制御タスクにおける性能, 安定性の両面で, 最新のPPO変種よりも優れていることを示す。 Proximal policy optimization (PPO) has yielded state-of-the-art results in policy search, a subfield of reinforcement learning, with one of its key points being the use of a surrogate objective function to restrict the step size at each policy update. Although such restriction is helpful, the algorithm still suffers from performance instability and optimization inefficiency from the sudden flattening of the curve. To address this issue we present a PPO variant, named Proximal Policy Optimization Smooth Algorithm (PPOS), and its critical improvement is the use of a functional clipping method instead of a flat clipping method. We compare our method with PPO and PPORB, which adopts a rollback clipping method, and prove that our method can conduct more accurate updates at each time step than other PPO methods. Moreover, we show that it outperforms the latest PPO variants on both performance and stability in challenging continuous control tasks.	翻訳日:2021-05-22 20:50:19 公開日:2020-12-04
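The "flat clipping" that the entry above contrasts with its functional clipping can be sketched as follows. This shows only the standard PPO clipped surrogate for a single sample, with an assumed clip range `eps=0.2`; the functional clipping used by PPOS is not specified in the abstract and is deliberately not reproduced here.

```python
def clipped_surrogate(ratio, advantage, eps=0.2):
    """Standard PPO clipped surrogate objective for one sample.

    ratio     : new-policy probability divided by old-policy probability
    advantage : estimated advantage of the sampled action
    eps       : clip range (0.2 is an assumed, commonly used value)
    """
    clipped_ratio = max(1.0 - eps, min(ratio, 1.0 + eps))
    # Taking the minimum makes the objective a pessimistic bound, so the
    # update gains nothing from pushing the ratio outside [1 - eps, 1 + eps].
    return min(ratio * advantage, clipped_ratio * advantage)


# With a positive advantage, the objective is flat once ratio exceeds 1 + eps:
# moving the ratio from 1.5 to 2.0 leaves the value unchanged.
flat_a = clipped_surrogate(1.5, 1.0)
flat_b = clipped_surrogate(2.0, 1.0)
```

The flat region beyond the clip range is presumably the sudden flattening the abstract refers to, since the objective stops changing with the ratio there.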
# 連合学習におけるバイアス緩和 Mitigating Bias in Federated Learning ( http://arxiv.org/abs/2012.02447v1 ) ライセンス: Link先を確認	Annie Abay, Yi Zhou, Nathalie Baracaldo, Shashank Rajamoni, Ebube Chuba, Heiko Ludwig	(参考訳) 差別を意識したモデルを作成する方法として、彼らは集中型MLに集中し、連邦学習(FL)は未探索のままである。 FLはコラボレーティブMLの上昇するアプローチであり、アグリゲータは複数のパーティを編成して、トレーニングデータを共有せずにグローバルモデルをトレーニングする。本稿では,flにおけるバイアスの原因について議論し,データプライバシを損なうことなくバイアスを軽減するための3つの前処理および内処理手法を提案する。当事者間のデータの不均一性はflの難解な特徴の1つであり,モデル性能,公平度指標,バイアス学習パターンへの影響を分析するために,複数のデータ分布について実験を行う。提案手法の包括的分析を行い,データ分布が歪んだり,20%の当事者がこの手法を用いていた場合でも,これらの手法が有効であることを示す。 As methods to create discrimination-aware models develop, they focus on centralized ML, leaving federated learning (FL) unexplored. FL is a rising approach for collaborative ML, in which an aggregator orchestrates multiple parties to train a global model without sharing their training data. In this paper, we discuss causes of bias in FL and propose three pre-processing and in-processing methods to mitigate bias, without compromising data privacy, a key FL requirement. As data heterogeneity among parties is one of the challenging characteristics of FL, we conduct experiments over several data distributions to analyze their effects on model performance, fairness metrics, and bias learning patterns. We conduct a comprehensive analysis of our proposed techniques, the results demonstrating that these methods are effective even when parties have skewed data distributions or as little as 20% of parties employ the methods.	翻訳日:2021-05-22 20:50:01 公開日:2020-12-04
# ウェアラブルストレスに対するベイズ能動学習と影響検出 Bayesian Active Learning for Wearable Stress and Affect Detection ( http://arxiv.org/abs/2012.02702v1 ) ライセンス: Link先を確認	Abhijith Ragav, Gautham Krishna Gudur	(参考訳) 近年,ヒトでは心理的ストレスが観察され,早期発見は健康リスクの予防に不可欠である。デバイス上での深層学習アルゴリズムによるストレス検出は、広汎なコンピューティングの進歩により増加傾向にある。しかし、対処すべき重要な課題は、適切な地上の真理化技術(アクティブラーニングなど)を通じて、ラベルのないデータをリアルタイムで処理することであり、これは、感情的な状態(ラベル)を確立するのに役立つと同時に、オラクルからクエリする最も情報に富むデータポイントのみを選択するのに役立つ。本稿では,モンテカルロ(mc)ドロップアウトを用いたベイジアンニューラルネットワークにおける近似によるモデル不確実性を表現する枠組みを提案する。これはアクティブラーニングに適した獲得関数と組み合わせられる。 raspberry pi 2で実験された一般的なストレス・インパクト検出データセットを用いた実験結果から,提案フレームワークは,様々な獲得関数を横断するアクティブラーニングにおいて,取得したプールポイントの数がかなり少なく,推論時の効率が大幅に向上することが示唆された。変動比は90.38%の精度を達成し、約40%少ないデータでトレーニング中に達成されるテストの最大精度に匹敵する。 In the recent past, psychological stress has been increasingly observed in humans, and early detection is crucial to prevent health risks. Stress detection using on-device deep learning algorithms has been on the rise owing to advancements in pervasive computing. However, an important challenge that needs to be addressed is handling unlabeled data in real-time via suitable ground truthing techniques (like Active Learning), which should help establish affective states (labels) while also selecting only the most informative data points to query from an oracle. In this paper, we propose a framework with capabilities to represent model uncertainties through approximations in Bayesian Neural Networks using Monte-Carlo (MC) Dropout. This is combined with suitable acquisition functions for active learning. Empirical results on a popular stress and affect detection dataset experimented on a Raspberry Pi 2 indicate that our proposed framework achieves a considerable efficiency boost during inference, with a substantially low number of acquired pool points during active learning across various acquisition functions. Variation Ratios achieves an accuracy of 90.38% which is comparable to the maximum test accuracy achieved while training on about 40% lesser data.	
翻訳日:2021-05-22 20:49:27 公開日:2020-12-04
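The Variation Ratios acquisition function named in the entry above can be sketched in a few lines. The sketch assumes that each unlabeled pool point has already been passed through the network T times with dropout left active (MC Dropout), and that only the T predicted class labels are kept; the helper names and the toy votes are hypothetical.

```python
from collections import Counter


def variation_ratio(predicted_labels):
    """Variation ratio of T stochastic predictions: 1 - (count of modal class) / T."""
    counts = Counter(predicted_labels)
    modal_count = max(counts.values())
    return 1.0 - modal_count / len(predicted_labels)


def select_queries(pool_predictions, budget):
    """Return indices of the `budget` pool points with the highest variation ratio."""
    scores = [variation_ratio(votes) for votes in pool_predictions]
    ranked = sorted(range(len(scores)), key=lambda i: -scores[i])
    return ranked[:budget]


# Hypothetical MC Dropout votes (T = 4 forward passes) for three pool points.
toy_pool = [
    [0, 0, 0, 0],  # all passes agree: no disagreement
    [0, 1, 2, 1],  # passes disagree strongly
    [1, 1, 1, 0],  # mild disagreement
]
chosen = select_queries(toy_pool, budget=2)
```

Points whose stochastic passes disagree most are sent to the oracle first, which is what keeps the number of acquired pool points low.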
# 1ビットフィードバックは高信頼境界ポリシーに十分である One-bit feedback is sufficient for upper confidence bound policies ( http://arxiv.org/abs/2012.02876v1 ) ライセンス: Link先を確認	Daniel Vial, Sanjay Shakkottai, R. Srikant	(参考訳) 従来のマルチアームバンディット問題の変種を考察し、各アームは、その過去の報酬履歴に基づいて、プル毎に1ビットのフィードバックしか提供できない。我々の主な結果は次のとおりである: フルリワードフィードバックを用いた高信頼バウンドポリシーが与えられると、1ビットフィードバックを生成するためのコーディングスキームと、我々のポリシーによって達成された後悔の比率とフルリワードフィードバックポリシーの後悔が漸近的に近づくような、対応するデコーディングスキームとアーム選択ポリシーが存在する。 We consider a variant of the traditional multi-armed bandit problem in which each arm is only able to provide one-bit feedback during each pull based on its past history of rewards. Our main result is the following: given an upper confidence bound policy which uses full-reward feedback, there exists a coding scheme for generating one-bit feedback, and a corresponding decoding scheme and arm selection policy, such that the ratio of the regret achieved by our policy and the regret of the full-reward feedback policy asymptotically approaches one.	翻訳日:2021-05-22 20:49:07 公開日:2020-12-04
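For readers unfamiliar with the full-reward baseline in the entry above, the following is a minimal sketch of one common upper confidence bound rule (UCB1). The choice of UCB1 and its exploration constant are assumptions made for illustration; the paper's one-bit coding and decoding schemes are not described in the abstract and are not sketched here.

```python
import math


def ucb_index(mean_reward, pulls, t):
    """UCB1 index: empirical mean plus an exploration bonus."""
    return mean_reward + math.sqrt(2.0 * math.log(t) / pulls)


def select_arm(mean_rewards, pull_counts, t):
    """Pick the arm to pull at round t (t >= 1)."""
    # Pull every arm once before the index is defined for it.
    for arm, pulls in enumerate(pull_counts):
        if pulls == 0:
            return arm
    indices = [ucb_index(m, n, t) for m, n in zip(mean_rewards, pull_counts)]
    return max(range(len(indices)), key=lambda a: indices[a])


# An arm that has never been pulled is tried first.
first_choice = select_arm([0.5, 0.4], [10, 0], t=11)
# With equal pull counts the bonuses match, so the higher mean wins.
second_choice = select_arm([0.5, 0.4], [100, 100], t=200)
```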
# 色は可塑性か? 画像カラー化のためのUCapsNet Is It a Plausible Colour? UCapsNet for Image Colourisation ( http://arxiv.org/abs/2012.02478v1 ) ライセンス: Link先を確認	Rita Pucci, Christian Micheloni, Gian Luca Foresti, Niki Martinel	(参考訳) 人間は、意味的特徴抽出の能力のおかげで、特に努力することなく、グレースケールの画像の色を想像することができる。自律システムはそれを達成できますか? 幻覚は可視で活気ある色にできるのか? これは色付けの問題です。事前学習した畳み込みニューラルネットワークモデルに依存する既存の作業とは違って,このような色分け問題を自己教師付き学習タスクとしてキャストした。逆学習パラダイムに従って学習したカプセルに基づく新しいアーキテクチャを導入することで,この問題に対処する。カプセルネットワークは、画像内のエンティティのセマンティック表現を抽出することができるが、その空間情報の詳細は緩く、グレースケール画像のカラー化には重要である。したがって、我々のucapsnet構造は、畳み込みニューラルネットワークを通してカプセルや空間的詳細を通して実体を抽出するエンコーディングフェーズを伴います。復号位相は、エンティティ特徴と空間特徴とを結合し、入力されたデータムの可算な色バージョンを暗示する。 ImageNetベンチマークの結果、我々のアプローチは出口ソリューションよりも鮮やかで可視な色を生成でき、監督下で事前訓練されたモデルよりも優れた性能を達成できることがわかった。 Human beings can imagine the colours of a grayscale image with no particular effort thanks to their ability of semantic feature extraction. Can an autonomous system achieve that? Can it hallucinate plausible and vibrant colours? This is the colourisation problem. Different from existing works relying on convolutional neural network models pre-trained with supervision, we cast such colourisation problem as a self-supervised learning task. We tackle the problem with the introduction of a novel architecture based on Capsules trained following the adversarial learning paradigm. Capsule networks are able to extract a semantic representation of the entities in the image but loose details about their spatial information, which is important for colourising a grayscale image. Thus our UCapsNet structure comes with an encoding phase that extracts entities through capsules and spatial details through convolutional neural networks. A decoding phase merges the entity features with the spatial features to hallucinate a plausible colour version of the input datum. 
Results on the ImageNet benchmark show that our approach is able to generate more vibrant and plausible colours than existing solutions and achieves better performance than models pre-trained with supervision.	翻訳日:2021-05-22 20:48:55 公開日:2020-12-04
# 生成モデルにおけるデータバイアスに関する一考察 A Note on Data Biases in Generative Models ( http://arxiv.org/abs/2012.02516v1 ) ライセンス: Link先を確認	Patrick Esser and Robin Rombach and Bj\"orn Ommer	(参考訳) 機械は不公平さや偏見の傾向が低いと考えるのは誘惑的だ。しかし、機械学習のアプローチはデータに基づいて出力を計算する。バイアスは開発パイプラインの任意の段階に入ることができるが、モデルは特にトレーニング対象のデータセットのバイアスを反映して受け入れられるため、必ずしも世界の真実を反映するものではなく、主にデータに関する真実を反映している。現代のアルゴリズムとそれらを形成するデータの関係性に関する認識を高めるために、条件付き可逆ニューラルネットワークを用いて、異なるデータセット間で共有される情報からデータセット固有の情報を分離する。このようにして、同じ画像を異なるデータセットに投影することで、それら固有のバイアスを明らかにすることができる。本手法は, 生成モデルの性能に及ぼすデータセット品質の影響, (ii) 生成モデルによってデータセットの社会的バイアスがどのように再現されるか, (iii) 写真, 油絵, アニメなどの多様なデータセット間の不適切な移動を通して, 創造的応用を示すために用いられる。私たちのコードとインタラクティブなデモはhttps://github.com/compvis/net2netで閲覧できます。 It is tempting to think that machines are less prone to unfairness and prejudice. However, machine learning approaches compute their outputs based on data. While biases can enter at any stage of the development pipeline, models are particularly receptive to mirror biases of the datasets they are trained on and therefore do not necessarily reflect truths about the world but, primarily, truths about the data. To raise awareness about the relationship between modern algorithms and the data that shape them, we use a conditional invertible neural network to disentangle the dataset-specific information from the information which is shared across different datasets. In this way, we can project the same image onto different datasets, thereby revealing their inherent biases. We use this methodology to (i) investigate the impact of dataset quality on the performance of generative models, (ii) show how societal biases of datasets are replicated by generative models, and (iii) present creative applications through unpaired transfer between diverse datasets such as photographs, oil portraits, and animes. Our code and an interactive demonstration are available at https://github.com/CompVis/net2net .	翻訳日:2021-05-22 20:48:37 公開日:2020-12-04
# 教師付き学習の再検討: 生物学習からその名前で呼ぶことへの洞察 Rethinking supervised learning: insights from biological learning and from calling it by its name ( http://arxiv.org/abs/2012.02526v1 ) ライセンス: Link先を確認	Alex Hernandez-Garcia	(参考訳) ニューラルネットワークのルネッサンスは、より広い用語の教師付き学習でコミュニティによってタグ付けされた分類モデルの成功によって触媒された。並外れた結果は、野心的な約束と過度に満ちた誇大宣伝を引き起こした。コミュニティはすぐに、この成功は何千ものラベル付きサンプルが利用可能になったことによるものだと気付いた。監督された学習は多くが栄光から恥へと変わりましたディープ・ラーニングを全体として批判する者もいれば、予測、教師なし、半監督、さらに最近では自己監督型ラーニングといった教師付きラーニングの方法が「代替的」でなければならないと宣言する者もいた。しかし、これらは理論的に根拠のある分類の実際の分類ではなく、すべてブランド名に思える。さらに、教師付き学習を追放するという呼びかけは、人間がほとんどあるいは全く監督せずに学ぶという疑わしい主張によって動機づけられた。ここでは,自然の学習と監督に関する洞察をレビューし,学習は監督なしでは不可能であるという考えを再検討し,単にその名前で呼ぶだけでは,よりよい進歩が期待できると論じる。 The renaissance of artificial neural networks was catalysed by the success of classification models, tagged by the community with the broader term supervised learning. The extraordinary results gave rise to a hype loaded with ambitious promises and overstatements. Soon the community realised that the success owed much to the availability of thousands of labelled examples. And supervised learning went, for many, from glory to shame. Some criticised deep learning as a whole and others proclaimed that the way forward had to be "alternatives" to supervised learning: predictive, unsupervised, semi-supervised and, more recently, self-supervised learning. However, these seem all brand names, rather than actual categories of a theoretically grounded taxonomy. Moreover, the call to banish supervised learning was motivated by the questionable claim that humans learn with little or no supervision. Here, we review insights about learning and supervision in nature, revisit the notion that learning is not possible without supervision and argue that we will make better progress if we just call it by its name.	翻訳日:2021-05-22 20:48:19 公開日:2020-12-04
# ディープネットワークにおける周辺性能劣化の定量化のための経験的手法 An Empirical Method to Quantify the Peripheral Performance Degradation in Deep Networks ( http://arxiv.org/abs/2012.02749v1 ) ライセンス: Link先を確認	Calden Wloka and John K. Tsotsos	(参考訳) 画像に畳み込みカーネルを適用する場合、出力が入力と同じサイズである場合、画像境界付近で何らかのパディングが要求される。つまり、畳み込みニューラルネットワーク(CNN)における畳み込みの各層に対して、カーネルサイズの半幅に相当する画素のストリップを、非正則表現で生成する。ほとんどのcnnカーネルはネットワークのパラメータ負荷を減らすために小さいが、この非バーティカル領域はそれぞれの畳み込み層を持つ。深層・深層ネットワークとストライドベースのダウンサンプリングを組み合わせる傾向は、この領域の伝播が画像の無視できない部分をカバーすることになることを意味する。この畳み込みに関する問題は長年にわたってよく認識されてきたが、現代のネットワーク行動に対する周辺表現の劣化の影響は十分に定量化されていない。翻訳の不変性の限界は何か? 画像パディングは問題を軽減するか、あるいは物体が画像境界と中心の間を移動するときに性能に影響するか? 実験モデルとしてMask R-CNNを用いて,ネットワーク性能の空間依存性を定量化するデータセットと手法を設計する。我々のデータセットは、高解像度の背景にオブジェクトを挿入することで構築され、画像境界に対してターゲットオブジェクトを特定の位置に配置するサブイメージを収穫することができる。対象位置の選択を通してマスクr-cnnの挙動を調べることにより,画像境界近傍,特に画像コーナー付近における性能低下パターンが明らかになる。ネットワーク性能におけるこの空間異方性の範囲と大きさの定量化は、被写体や関心領域の位置が所定の画像内で十分に局所化されることが保証されない制約のない現実的な環境にディープネットワークを配置する上で重要である。 When applying a convolutional kernel to an image, if the output is to remain the same size as the input then some form of padding is required around the image boundary, meaning that for each layer of convolution in a convolutional neural network (CNN), a strip of pixels equal to the half-width of the kernel size is produced with a non-veridical representation. Although most CNN kernels are small to reduce the parameter load of a network, this non-veridical area compounds with each convolutional layer. The tendency toward deeper and deeper networks combined with stride-based down-sampling means that the propagation of this region can end up covering a non-negligable portion of the image. Although this issue with convolutions has been well acknowledged over the years, the impact of this degraded peripheral representation on modern network behavior has not been fully quantified. What are the limits of translation invariance? 
Does image padding successfully mitigate the issue, or is performance affected as an object moves between the image border and center? Using Mask R-CNN as an experimental model, we design a dataset and methodology to quantify the spatial dependency of network performance. Our dataset is constructed by inserting objects into high resolution backgrounds, thereby allowing us to crop sub-images which place target objects at specific locations relative to the image border. By probing the behaviour of Mask R-CNN across a selection of target locations, we see clear patterns of performance degradation near the image boundary, and in particular in the image corners. Quantifying both the extent and magnitude of this spatial anisotropy in network performance is important for the deployment of deep networks into unconstrained and realistic environments in which the location of objects or regions of interest is not guaranteed to be well localized within a given image.	翻訳日:2021-05-22 20:48:01 公開日:2020-12-04
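The compounding of the padded border described in the entry above can be made concrete with a back-of-the-envelope sketch. It assumes odd kernel sizes, "same"-style padding, and that each layer widens the affected strip by half its kernel width, scaled by the product of the strides of the preceding layers; the function name and layer lists are hypothetical and do not describe Mask R-CNN itself.

```python
def padded_border_width(layers):
    """Width, in input pixels, of the image border influenced by padding.

    layers : list of (kernel_size, stride) tuples, in forward order.
    """
    border = 0  # strip width measured in input pixels
    jump = 1    # input pixels spanned by one step at the current layer
    for kernel_size, stride in layers:
        border += (kernel_size // 2) * jump
        jump *= stride
    return border


# Three 3x3 convolutions, the second with stride 2:
# 1 + 1 (jump is still 1) + 2 (jump has doubled) = 4 input pixels.
small_stack = padded_border_width([(3, 1), (3, 2), (3, 1)])

# A 7x7 stride-2 stem followed by a 3x3 stride-2 layer: 3 + 1 * 2 = 5.
stem_stack = padded_border_width([(7, 2), (3, 2)])
```

Under these assumptions the strip grows quickly once strides accumulate, which is consistent with the abstract's point that deep, down-sampling networks can leave a non-negligible part of the image affected.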
# 同変表現の学習 Learning Equivariant Representations ( http://arxiv.org/abs/2012.02771v1 ) ライセンス: Link先を確認	Carlos Esteves	(参考訳) 最先端のディープラーニングシステムは、しばしば大量のデータと計算を必要とする。このため、データの既知の構造や未知の構造を活用することが最重要となる。畳み込みニューラルネットワーク(CNN)はこの原理の成功例であり、その特性はシフト同変性である。フィルタを入力の上にスライドさせることで、入力がシフトすると、応答は同じ量にシフトし、意味コンテンツが絶対画素位置から独立している自然画像の構造を利用する。この性質は、音声、画像、ビデオ認識タスクにおけるCNNの成功に不可欠である。この論文では、回転やスケーリングといった他の種類の変換に同変性を拡張する。対称性の群で定義される異なる変換に対する同変モデルを提案する。主な貢献は、(i)平面上の相似変換群に対する同変性を達成する極座標変換ネットワーク、(ii)正二十面体の対称性群に対する同変性を達成する同変マルチビューネットワーク、(iii)連続3次元回転群に対する同変性を達成する球面CNN、(iv)2次元入力に対して3次元回転への同変性を達成するクロスドメイン画像埋め込み、(v)球面CNNを一般化し、球面ベクトル場に対して3次元回転への同変性を達成するスピン重み付き球面CNNである。用途としては、画像分類、3次元形状分類と検索、パノラマ画像分類とセグメンテーション、形状アライメント、ポーズ推定などがある。これらのモデルに共通しているのは、データの対称性を活用してサンプルとモデルの複雑さを減らし、一般化のパフォーマンスを向上させることだ。この利点は、データが制限されたり、任意の回転のような入力摂動が存在するような困難なタスクにおいてより重要である。 State-of-the-art deep learning systems often require large amounts of data and computation. For this reason, leveraging known or unknown structure of the data is paramount. Convolutional neural networks (CNNs) are successful examples of this principle, their defining characteristic being the shift-equivariance. By sliding a filter over the input, when the input shifts, the response shifts by the same amount, exploiting the structure of natural images where semantic content is independent of absolute pixel positions. This property is essential to the success of CNNs in audio, image and video recognition tasks. In this thesis, we extend equivariance to other kinds of transformations, such as rotation and scaling. We propose equivariant models for different transformations defined by groups of symmetries. 
The main contributions are (i) polar transformer networks, achieving equivariance to the group of similarities on the plane, (ii) equivariant multi-view networks, achieving equivariance to the group of symmetries of the icosahedron, (iii) spherical CNNs, achieving equivariance to the continuous 3D rotation group, (iv) cross-domain image embeddings, achieving equivariance to 3D rotations for 2D inputs, and (v) spin-weighted spherical CNNs, generalizing the spherical CNNs and achieving equivariance to 3D rotations for spherical vector fields. Applications include image classification, 3D shape classification and retrieval, panoramic image classification and segmentation, shape alignment and pose estimation. What these models have in common is that they leverage symmetries in the data to reduce sample and model complexity and improve generalization performance. The advantages are more significant on (but not limited to) challenging tasks where data is limited or input perturbations such as arbitrary rotations are present.	翻訳日:2021-05-22 20:47:31 公開日:2020-12-04
# 文書レベルの関係抽出のための粗いエンティティ表現 Coarse-to-Fine Entity Representations for Document-level Relation Extraction ( http://arxiv.org/abs/2012.02507v1 ) ライセンス: Link先を確認	Damai Dai, Jing Ren, Shuang Zeng, Baobao Chang, Zhifang Sui	(参考訳) 文書レベルの関係抽出(RE: Document-level Relation extract)は、文内および文間の関係を抽出する必要がある。最近の研究は、通常文書レベルの相互作用をキャプチャする文書レベルのグラフを構築するグラフベースの手法が有用なエンティティ表現を得ることができ、文書レベルのREに取り組むのに役立つことを示している。これらのメソッドは、グラフ全体にフォーカスするか、あるいは対象のエンティティペア間のパスなど、グラフの一部にもっと注意を払うかのどちらかです。しかし、ドキュメントレベルのREは、両方に同時にフォーカスすることの恩恵を受けるかもしれない。そこで,より包括的な実体表現を得るために,二つの相を含む粗大な戦略を取り入れた \textbf{C}oarse-to-\textbf{F}ine \textbf{E}ntity \textbf{R}epresentation model (\textbf{CFER}) を提案する。まず、CFERはグラフニューラルネットワークを使用して、グラフ全体のグローバル情報を粗いレベルで統合する。次に、cferは、グローバル情報をガイダンスとして使用し、ターゲットエンティティペア間のパス情報を細かなレベルで選択的に集約する。分類において、両階層の実体表現を関係抽出のためのより包括的な表現に結合する。大規模文書レベルのREデータセットによる実験結果から,CFERは従来のベースラインモデルよりも優れた性能を発揮することが示された。さらに,詳細なモデル解析により戦略の有効性を検証する。 Document-level Relation Extraction (RE) requires extracting relations expressed within and across sentences. Recent works show that graph-based methods, usually constructing a document-level graph that captures document-aware interactions, can obtain useful entity representations thus helping tackle document-level RE. These methods either focus more on the entire graph, or pay more attention to a part of the graph, e.g., paths between the target entity pair. However, we find that document-level RE may benefit from focusing on both of them simultaneously. Therefore, to obtain more comprehensive entity representations, we propose the \textbf{C}oarse-to-\textbf{F}ine \textbf{E}ntity \textbf{R}epresentation model (\textbf{CFER}) that adopts a coarse-to-fine strategy involving two phases. First, CFER uses graph neural networks to integrate global information in the entire graph at a coarse level. Next, CFER utilizes the global information as a guidance to selectively aggregate path information between the target entity pair at a fine level. 
In classification, we combine the entity representations from both levels into more comprehensive representations for relation extraction. Experimental results on a large-scale document-level RE dataset show that CFER achieves better performance than previous baseline models. Further, we verify the effectiveness of our strategy through elaborate model analysis.	翻訳日:2021-05-22 20:47:07 公開日:2020-12-04
# オフラインメタレベルモデルに基づくコールドスタート推薦のための強化学習手法 Offline Meta-level Model-based Reinforcement Learning Approach for Cold-Start Recommendation ( http://arxiv.org/abs/2012.02476v1 ) ライセンス: Link先を確認	Yanan Wang, Yong Ge, Li Li, Rui Chen, Tong Xu	(参考訳) 強化学習(Reinforcement Learning, RL)は、リコメンダシステムに対する長期的なユーザの関心を最適化する上で、非常に有望である。しかしながら、既存のrlベースのレコメンデーションメソッドでは、堅牢なレコメンデーションポリシを学ぶために、各ユーザが多数のインタラクションを必要とする。限られた数のインタラクションを持つ新規ユーザに推奨する場合には,この課題がより重要になります。そこで本稿では,高速ユーザ適応のためのメタレベルモデルに基づく強化学習手法を提案することで,rlベースのレコメンダシステムにおけるコールドスタート課題を解決する。提案手法では,ユーザの好みをユーザコンテキスト変数で推測することで,インタラクションの少ない新規ユーザに対して,レコメンデーションシステムによる適応性の向上を実現する。適応効率を向上させるために,メタレベルのレコメンデーションエージェントを支援する逆強化学習手法を用いて,少数のインタラクションからユーザポリシと報酬を回復することを学ぶ。さらに,情報理論的な観点から,ユーザモデルとレコメンデーションエージェントの相互作用関係をモデル化する。実験の結果,1つのインタラクションシーケンスのみで新規ユーザに対応する場合,提案手法の有効性が示された。さらに,推奨性能境界の理論的解析を行う。 Reinforcement learning (RL) has shown great promise in optimizing long-term user interest in recommender systems. However, existing RL-based recommendation methods need a large number of interactions for each user to learn a robust recommendation policy. The challenge becomes more critical when recommending to new users who have a limited number of interactions. To that end, in this paper, we address the cold-start challenge in the RL-based recommender systems by proposing a meta-level model-based reinforcement learning approach for fast user adaptation. In our approach, we learn to infer each user's preference with a user context variable that enables recommendation systems to better adapt to new users with few interactions. To improve adaptation efficiency, we learn to recover the user policy and reward from only a few interactions via an inverse reinforcement learning method to assist a meta-level recommendation agent. Moreover, we model the interaction relationship between the user model and recommendation agent from an information-theoretic perspective. 
Empirical results show the effectiveness of the proposed method when adapting to new users with only a single interaction sequence. We further provide a theoretical analysis of the recommendation performance bound.	翻訳日:2021-05-22 20:46:43 公開日:2020-12-04
# ヒューマンモビリティのためのディープラーニング:データとモデルに関する調査 Deep Learning for Human Mobility: a Survey on Data and Models ( http://arxiv.org/abs/2012.02825v1 ) ライセンス: Link先を確認	Massimiliano Luca, Gianni Barlacchi, Bruno Lepri, Luca Pappalardo	(参考訳) 人類の移動性に関する研究は、病気の普及、都市計画、幸福、汚染など、社会の様々な側面に影響を及ぼすため、非常に重要である。電話記録、GPSトレース、ソーシャルメディア投稿などのデジタルモビリティデータの拡散は、人工知能の卓越した予測力と相まって、深層学習を人間のモビリティに適用するきっかけとなった。特に、次の位置予測、すなわち個人の将来の位置を予測すること、群衆の流れ予測、すなわち地理的領域のフローを予測すること、軌道生成、すなわち現実的な個人軌道を生成することの3つのタスクに焦点を当てている。既存の調査では、シングルタスク、データソース、メカニカルあるいは従来の機械学習アプローチにフォーカスしているが、ディープラーニングソリューションの包括的な説明は欠落している。本調査は、(i)モビリティとディープラーニングに関する基本的な概念、(ii)データソースと公開データセットのレビュー、(iii)ディープラーニングモデルの説明、(iv)関連するオープンチャレンジに関する議論を提供する。我々の調査は、次の位置予測、群集の流れ予測、軌道生成に対する先進的なディープラーニングソリューションのガイドである。同時に、これは深層学習の科学者や実践者が人間のモビリティ研究の基本的な概念とオープンな課題を理解するのに役立つ。 The study of human mobility is crucial due to its impact on several aspects of our society, such as disease spreading, urban planning, well-being, pollution, and more. The proliferation of digital mobility data, such as phone records, GPS traces, and social media posts, combined with the outstanding predictive power of artificial intelligence, triggered the application of deep learning to human mobility. In particular, the literature is focusing on three tasks: next-location prediction, i.e., predicting an individual's future locations; crowd flow prediction, i.e., forecasting flows on a geographic region; and trajectory generation, i.e., generating realistic individual trajectories. Existing surveys focus on single tasks, data sources, mechanistic or traditional machine learning approaches, while a comprehensive description of deep learning solutions is missing. This survey provides: (i) basic notions on mobility and deep learning; (ii) a review of data sources and public datasets; (iii) a description of deep learning models and (iv) a discussion about relevant open challenges. 
Our survey is a guide to the leading deep learning solutions to next-location prediction, crowd flow prediction, and trajectory generation. At the same time, it helps deep learning scientists and practitioners understand the fundamental concepts and the open challenges of the study of human mobility.	翻訳日:2021-05-22 20:46:26 公開日:2020-12-04
# 神経常微分方程式の普遍近似特性 Universal Approximation Property of Neural Ordinary Differential Equations ( http://arxiv.org/abs/2012.02414v1 ) ライセンス: Link先を確認	Takeshi Teshima, Koichi Tojo, Masahiro Ikeda, Isao Ishikawa, Kenta Oono	(参考訳) ニューラル常微分方程式 (neural ordinary differential equation, nodes) は、その自由形式ヤコビアンと扱いやすいヤコビ行列式推定器が利用可能であることを保証する可逆ニューラルネットワークアーキテクチャである。最近、NODEの表現力は、ある条件下で連続写像に対する$L^p$-universal approximatorを形成することで部分的に明らかになった。しかし、l^p$-universalityは、近似器が入力空間の小さな領域の目標関数と大きく異なる場合でも、入力領域全体の近似を保証できない可能性がある。さらにノードのポテンシャルを明らかにするために、そのより強い近似特性、すなわち大きな微分同相写像のクラスを近似するための$\sup$-universality を示す。これは微分同相群の構造定理を利用して示され、その結果、ノードがより強い保証で近似できるかなり大きな写像集合を確立することによって、既存の文献を補完する。 Neural ordinary differential equations (NODEs) is an invertible neural network architecture promising for its free-form Jacobian and the availability of a tractable Jacobian determinant estimator. Recently, the representation power of NODEs has been partly uncovered: they form an $L^p$-universal approximator for continuous maps under certain conditions. However, the $L^p$-universality may fail to guarantee an approximation for the entire input domain as it may still hold even if the approximator largely differs from the target function on a small region of the input space. To further uncover the potential of NODEs, we show their stronger approximation property, namely the $\sup$-universality for approximating a large class of diffeomorphisms. It is shown by leveraging a structure theorem of the diffeomorphism group, and the result complements the existing literature by establishing a fairly large set of mappings that NODEs can approximate with a stronger guarantee.	翻訳日:2021-05-22 20:45:56 公開日:2020-12-04
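The distinction the abstract above draws between $L^p$-universality and $\sup$-universality can be made concrete with the two error measures involved. The following is a sketch under assumed notation (a compact set $K \subset \mathbb{R}^d$, a target map $f$, an approximator $g$, and a Euclidean norm $\|\cdot\|$); the paper's precise definitions may differ.

$$
\|f-g\|_{L^p(K)} = \Big(\int_K \|f(x)-g(x)\|^p \,dx\Big)^{1/p},
\qquad
\|f-g\|_{\sup,K} = \sup_{x\in K}\|f(x)-g(x)\| .
$$

$$
\|f-g\|_{L^p(K)} \;\le\; \operatorname{vol}(K)^{1/p}\,\|f-g\|_{\sup,K} .
$$

So a small $\sup$ error forces a small $L^p$ error, but not the other way round: if $g$ deviates from $f$ by $1$ on a subset of measure $\varepsilon$ and agrees elsewhere, the $L^p$ error is $\varepsilon^{1/p}$, which tends to $0$ as $\varepsilon \to 0$, while the $\sup$ error stays at $1$. This is the "small region" failure mode the abstract mentions.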
# トポロジーを考慮した3Dポイントクラウド生成のためのChartPointFlow ChartPointFlow for Topology-Aware 3D Point Cloud Generation ( http://arxiv.org/abs/2012.02346v1 ) ライセンス: Link先を確認	Takumi Kimura, Takashi Matsubara, Kuniaki Uehara	(参考訳) 点雲は三次元形状の表面の表現として機能する。深層生成モデルは、ボールのような潜伏変数の集合からの写像によって、そのバリエーションをモデル化するために適応されている。しかし、以前のアプローチでは点雲の位相構造にはあまり注意が払われておらず、連続写像は様々な数の穴や交点を表現できない。さらに、点雲は複数の部分からなることが多く、表現されることもほとんどない。本稿では,複数の潜在ラベルを持つフローベース生成モデルであるChartPointFlowを提案する。相互情報を最大化することにより、ラベルによって条件付けられた写像は、多様体のチャートのような与えられた点雲の連続部分集合に割り当てられる。これにより、従来のアプローチではぼやけや穴の発生に支障をきたす傾向があるのに対し、提案モデルでは明確な境界を持つトポロジカル構造を保存できる。実験の結果,ChartPointFlowはサンプリングベースポイントクラウドジェネレータ間の生成と再構築において,最先端の性能を実現していることがわかった。 A point cloud serves as a representation of the surface of a three-dimensional shape. Deep generative models have been adapted to model their variations typically by a map from a ball-like set of latent variables. However, previous approaches have not paid much attention to the topological structure of a point cloud; a continuous map cannot express the varying number of holes and intersections. Moreover, a point cloud is often composed of multiple subparts, and it is also hardly expressed. In this paper, we propose ChartPointFlow, which is a flow-based generative model with multiple latent labels. By maximizing the mutual information, a map conditioned by a label is assigned to a continuous subset of a given point cloud, like a chart of a manifold. This enables our proposed model to preserve the topological structure with clear boundaries, while previous approaches tend to suffer from blurs and to fail in generating holes. Experimental results demonstrate that ChartPointFlow achieves the state-of-the-art performance in generation and reconstruction among sampling-based point cloud generators.	翻訳日:2021-05-22 20:45:38 公開日:2020-12-04
# 銀河団リッチネス推定のための光波長誘導型自己教師付き特徴学習 Optical Wavelength Guided Self-Supervised Feature Learning For Galaxy Cluster Richness Estimate ( http://arxiv.org/abs/2012.02368v1 ) ライセンス: Link先を確認	Gongbo Liang, Yuanyuan Su, Sheng-Chieh Lin, Yu Zhang, Yuanyuan Zhang, Nathan Jacobs	(参考訳) 近くの宇宙のほとんどの銀河は、銀河団または銀河群に重力的に結合している。光学的豊かさなどの光学的内容は、現代の天文学や宇宙論における銀河と大規模構造の共同進化を理解する上で重要である。光豊かさの決定は困難である。マルチバンド光画像から光リッチ度を推定するための自己教師型アプローチを提案する。本手法では,マルチバンド光画像のデータ特性を事前学習に利用し,大規模かつ未ラベルのデータセットから特徴表現を学習する。提案手法をSloan Digital Sky Surveyに適用する。その結果、光学的豊かさの推定により、平均絶対誤差が11.84%、内在散乱が20.78%減少し、ラベル付きトレーニングデータの必要性が最大60%低下した。提案手法は,多数の未ラベルのマルチバンド画像が利用可能であるが,画像ラベルの取得にはコストがかかる天文学や宇宙論に有用であると考えている。 Most galaxies in the nearby Universe are gravitationally bound to a cluster or group of galaxies. Their optical contents, such as optical richness, are crucial for understanding the co-evolution of galaxies and large-scale structures in modern astronomy and cosmology. The determination of optical richness can be challenging. We propose a self-supervised approach for estimating optical richness from multi-band optical images. The method uses the data properties of the multi-band optical images for pre-training, which enables learning feature representations from a large but unlabeled dataset. We apply the proposed method to the Sloan Digital Sky Survey. The result shows our estimate of optical richness lowers the mean absolute error and intrinsic scatter by 11.84% and 20.78%, respectively, while reducing the need for labeled training data by up to 60%. We believe the proposed method will benefit astronomy and cosmology, where a large number of unlabeled multi-band images are available, but acquiring image labels is costly.	翻訳日:2021-05-22 20:45:21 公開日:2020-12-04
# 自動運転車のための32の歩行者属性の検出 Detecting 32 Pedestrian Attributes for Autonomous Vehicles ( http://arxiv.org/abs/2012.02647v1 ) ライセンス: Link先を確認	Taylor Mordan, Matthieu Cord, Patrick P\'erez and Alexandre Alahi	(参考訳) 歩行者は、都市部における自動運転車の安全性を最も重視する道路利用者の1つである。本稿では,歩行者を共同検出し,歩行者属性を32個認識する問題に対処する。これらは視覚的外観や行動を含み、道路横断の予測も含むが、これは主要な安全上の懸念である。そこで本稿では,複合フィールドフレームワークを利用したマルチタスク学習(MTL)モデルを提案する。各フィールドは、歩行者のインスタンスを空間的に特定し、属性予測を集約する。この定式化は自然に空間的文脈を活用し、自動運転のような低解像度シナリオに適している。共同で学習する属性の数を増やすことで、様々なタスクを伴うMTLで発生する勾配のスケールに関する問題を明らかにする。我々は,異なる目的関数から生じる勾配を,後方パスにおいてネットワークアーキテクチャの分岐点(fork)で合流する際に正規化することでこれを解決し,これをフォーク正規化(fork-normalization)と呼ぶ。実験による検証は、自動運転車からの歩行者分析のための多くの属性を提供するデータセットであるJAAD上で行い、競争力のある検出と属性認識の結果と、より安定したMTLトレーニングを示す。 Pedestrians are arguably one of the most safety-critical road users to consider for autonomous vehicles in urban areas. In this paper, we address the problem of jointly detecting pedestrians and recognizing 32 pedestrian attributes. These encompass visual appearance and behavior, and also include the forecasting of road crossing, which is a main safety concern. For this, we introduce a Multi-Task Learning (MTL) model relying on a composite field framework, which achieves both goals in an efficient way. Each field spatially locates pedestrian instances and aggregates attribute predictions over them. This formulation naturally leverages spatial context, making it well suited to low resolution scenarios such as autonomous driving. By increasing the number of attributes jointly learned, we highlight an issue related to the scales of gradients, which arises in MTL with numerous tasks. We solve it by normalizing the gradients coming from different objective functions when they join at the fork in the network architecture during the backward pass, referred to as fork-normalization. 
Experimental validation is performed on JAAD, a dataset providing numerous attributes for pedestrian analysis from autonomous vehicles, and shows competitive detection and attribute recognition results, as well as a more stable MTL training.	翻訳日:2021-05-22 20:45:06 公開日:2020-12-04
# 原型補正条件付ランダムフィールドを用いたFew-Shotイベント検出 Few-Shot Event Detection with Prototypical Amortized Conditional Random Field ( http://arxiv.org/abs/2012.02353v1 ) ライセンス: Link先を確認	Xin Cong, Shiyao Cui, Bowen Yu, Tingwen Liu, Yubin Wang, Bin Wang	(参考訳) 情報抽出の基本的なタスクであるイベント検出は、少数のサンプルで新しいイベントタイプを認識する必要がある場合、すなわちFew-Shot Event Detection (FSED)において苦労する傾向がある。従来の「識別してから分類する」(identify-then-classify)パラダイムは、パイプライン方式でこの問題を解決しようとするが、イベントタイプ間のトリガの相違を無視し、エラーの伝播に悩まされる。本稿では,タスクを二部分タグ付け方式で少数ショットのタグ付け問題に変換する,新しい統一ジョイントモデルを提案する。この目的のために,我々はまず,少数ショットのシナリオにおけるラベル依存をモデル化するために,原型的アモルティゼーション条件付き確率場 (PA-CRF) を設計した。これは,ラベルのプロトタイプに基づいてラベル間の遷移スコアを近似する原型的アモルティゼーションネットワークを構築するものである。次に、PA-CRFにおける遷移スコアのモデル化のためにガウス分布を導入し、データ不足による不確実な推定を緩和する。ベンチマークデータセットFewEventで実験を行い、実験結果から、タグ付けに基づく手法は既存のパイプラインやジョイントラーニング手法よりも優れていることが示された。さらに、提案したPA-CRFは、公開データセット上で最高の結果を得る。 Event Detection, a fundamental task of Information Extraction, tends to struggle when it needs to recognize novel event types with a few samples, i.e. Few-Shot Event Detection (FSED). The previous identify-then-classify paradigm attempts to solve this problem in a pipeline manner but ignores the trigger discrepancy between event types, thus suffering from error propagation. In this paper, we present a novel unified joint model which converts the task to a few-shot tagging problem with a double-part tagging scheme. To this end, we first design the Prototypical Amortized Conditional Random Field (PA-CRF) to model the label dependency in the few-shot scenario, which builds prototypical amortization networks to approximate the transition scores between labels based on the label prototypes. Then Gaussian distribution is introduced for the modeling of the transition scores in PA-CRF to alleviate the uncertain estimation resulting from insufficient data. We conduct experiments on the benchmark dataset FewEvent and the experimental results show that the tagging based methods are better than existing pipeline and joint learning methods. 
In addition, the proposed PA-CRF achieves the best results on the public dataset.	翻訳日:2021-05-22 20:44:49 公開日:2020-12-04
# DDRel: 二者間(dyadic)対話における対人関係分類のための新しいデータセット DDRel: A New Dataset for Interpersonal Relation Classification in Dyadic Dialogues ( http://arxiv.org/abs/2012.02553v1 ) ライセンス: Link先を確認	Qi Jia, Hongru Huang, Kenny Q. Zhu	(参考訳) 対話における対人的言語スタイルの変化は、人間の興味深く、ほとんど本能的な能力である。言語コンテンツから対人関係を理解することは、対話をさらに理解するための重要なステップである。先行研究は主にテキスト中の名前付きエンティティ間の関係抽出に焦点を当てている。本稿では,対話に基づく対話者の関係分類の課題を提案する。我々はIMSDbから映画スクリプトをクロールし、13の事前定義された関係に従って各セッションの関連ラベルを注釈付けした。注釈付きデータセットDDRelは、合計53,126発話の694対の話者による6300の二者間対話セッションで構成されている。また,セッションレベルおよびペアレベルの関係分類タスクを,広く受け入れられるベースラインで構築する。実験結果から,本課題は既存モデルでは困難な課題であり,将来の研究にはデータセットが有用であることが示唆された。 Interpersonal language style shifting in dialogues is an interesting and almost instinctive ability of humans. Understanding interpersonal relationships from language content is also a crucial step toward further understanding dialogues. Previous work mainly focuses on relation extraction between named entities in texts. In this paper, we propose the task of relation classification of interlocutors based on their dialogues. We crawled movie scripts from IMSDb, and annotated the relation labels for each session according to 13 pre-defined relationships. The annotated dataset DDRel consists of 6300 dyadic dialogue sessions between 694 pairs of speakers with 53,126 utterances in total. We also construct session-level and pair-level relation classification tasks with widely-accepted baselines. The experimental results show that this task is challenging for existing models and the dataset will be useful for future research.	翻訳日:2021-05-22 20:44:11 公開日:2020-12-04
# 女性・移民に対するサイバーいじめの自動検出とクロスドメイン適応性 Automated Detection of Cyberbullying Against Women and Immigrants and Cross-domain Adaptability ( http://arxiv.org/abs/2012.02565v1 ) ライセンス: Link先を確認	Thushari Atapattu, Mahen Herath, Georgia Zhang, Katrina Falkner	(参考訳) ソーシャルメディア技術の利用が急増しているため、サイバーいじめは社会問題として広まりつつある。少数派、女性、青年はサイバーいじめの一般的な犠牲者である。 nlp技術の進歩にもかかわらず、自動サイバーいじめ検出は依然として困難である。本稿では,最先端NLP技術を用いた技術の進歩に焦点を当てる。 SemEval 2019 - Task 5(HatEval)のTwitterデータセットを、女性や移民に対するヘイトスピーチに使用しています。ヘイトスピーチ(タスクA)とアグレッシブネス(タスクB)をそれぞれ分類する作業において,DistilBERTに基づくベストパフォーマンスアンサンブルモデルにおいて,F1スコアの0.73と0.74を達成している。タスクa用に開発されたアンサンブルモデルを用いて,外部データセットにおける攻撃的言語を分類し,3つのベンチマークデータセットを用いてf1スコアの0.7以下を達成した。我々は、将来のサイバーいじめ研究のための洞察に富んだレコメンデーションを提供するために、誤分類されたツイートの質的分析を行う。 Cyberbullying is a prevalent and growing social problem due to the surge of social media technology usage. Minorities, women, and adolescents are among the common victims of cyberbullying. Despite the advancement of NLP technologies, the automated cyberbullying detection remains challenging. This paper focuses on advancing the technology using state-of-the-art NLP techniques. We use a Twitter dataset from SemEval 2019 - Task 5(HatEval) on hate speech against women and immigrants. Our best performing ensemble model based on DistilBERT has achieved 0.73 and 0.74 of F1 score in the task of classifying hate speech (Task A) and aggressiveness and target (Task B) respectively. We adapt the ensemble model developed for Task A to classify offensive language in external datasets and achieved ~0.7 of F1 score using three benchmark datasets, enabling promising results for cross-domain adaptability. We conduct a qualitative analysis of misclassified tweets to provide insightful recommendations for future cyberbullying research.	翻訳日:2021-05-22 20:44:01 公開日:2020-12-04
# Ve'rdd 紙辞書と低リソースNLPとコミュニティ関与のギャップを狭める Ve'rdd. Narrowing the Gap between Paper Dictionaries, Low-Resource NLP and Community Involvement ( http://arxiv.org/abs/2012.02578v1 ) ライセンス: Link先を確認	Khalid Alnajjar, Mika H\"am\"al\"ainen, Jack Rueter, Niko Partanen	(参考訳) 本稿では,複数のアマチュア編集者に公開されている草の根辞書の再評価と編集の機会を提供する,オープンソースのオンライン辞書編集システムve'rddを提案する。コミュニティの活動は、深刻な絶滅危惧言語であるSkolt Samiの、最先端の有限状態言語記述に組み込むことが目的である。問題は、コミュニティが鉛筆と紙のレベル以上のものに参加することにある。時々、ネイティブスピーカーと辞書指向は、将来自分たちの仕事をより意味のあるものにするであろうインフラを利用するための技術的な理解を欠いているようです。すべての入力を複数回再利用する。そこで本システムは,ユーザフレンドリなUIを支える技術的複雑さを隠蔽するUralic言語のための既存のツールやインフラと統合する。 We present an open-source online dictionary editing system, Ve'rdd, that offers a chance to re-evaluate and edit grassroots dictionaries that have been exposed to multiple amateur editors. The idea is to incorporate community activities into a state-of-the-art finite-state language description of a seriously endangered minority language, Skolt Sami. Problems involve getting the community to take part in things above the pencil-and-paper level. At times, it seems that the native speakers and the dictionary oriented are lacking technical understanding to utilize the infrastructures which might make their work more meaningful in the future, i.e. multiple reuse of all of their input. Therefore, our system integrates with the existing tools and infrastructures for Uralic language masking the technical complexities behind a user-friendly UI.	翻訳日:2021-05-22 20:43:43 公開日:2020-12-04
# SMSデータセットのオンデバイス文類似性 On-Device Sentence Similarity for SMS Dataset ( http://arxiv.org/abs/2012.02819v1 ) ライセンス: Link先を確認	Arun D Prabhu, Nikhil Arora, Shubham Vatsal, Gopi Ramena, Sukumar Moharana, Naresh Purre	(参考訳) 短いメッセージサービス(SMS)テキスト/文間の文の類似性を決定することは、モバイルデバイス産業において重要な役割を果たす。したがって、SMSデータの類似性を評価するためには、検索やナビゲーションの強化、カスタムラベルやタグが送信者に関係なく提供される場合に、同様のタイプのSMSをまとめることなど、さまざまなアプリケーションで必要となる。 SMSデータで直面する問題は、その不完全構造と文法上の矛盾である。本稿では,SMSテキスト間のテキスト類似性を評価するためのユニークなパイプラインを提案する。 SMSテキストに埋め込まれた部分構造を利用してキーワード抽出に音声の一部(POS)モデルを用い,統計的手法を用いて類似度の比較を行った。提案したパイプラインは、SMSデータ間のセマンティックな大きなバリエーションを扱い、デバイス上でのアプリケーション(携帯電話)に有効である。我々の作業の能力を示すため、我々のパイプラインは、以下のセクションの1つで議論されているSMSテキスト類似性の可能性の1つに傾倒して設計されていますが、それでも他のアプリケーションにもスケーラビリティが保証されています。 Determining the sentence similarity between Short Message Service (SMS) texts/sentences plays a significant role in mobile device industry. Gauging the similarity between SMS data is thus necessary for various applications like enhanced searching and navigation, clubbing together SMS of similar type when given a custom label or tag is provided by user irrespective of their sender etc. The problem faced with SMS data is its incomplete structure and grammatical inconsistencies. In this paper, we propose a unique pipeline for evaluating the text similarity between SMS texts. We use Part of Speech (POS) model for keyword extraction by taking advantage of the partial structure embedded in SMS texts and similarity comparisons are carried out using statistical methods. The proposed pipeline deals with major semantic variations across SMS data as well as makes it effective for its application on-device (mobile phone). To showcase the capabilities of our work, our pipeline has been designed with an inclination towards one of the possible applications of SMS text similarity discussed in one of the following sections but nonetheless guarantees scalability for other applications as well.	翻訳日:2021-05-22 20:43:32 公開日:2020-12-04
# 創発的コミュニケーションにおける誘導バイアスと言語表現性 Inductive Bias and Language Expressivity in Emergent Communication ( http://arxiv.org/abs/2012.02875v1 ) ライセンス: Link先を確認	Shangmin Guo, Yi Ren, Agnieszka S{\l}owik, Kory Mathewson	(参考訳) レファレンシャルゲームとレコンストラクションゲームは、創発言語を研究するための最も一般的なゲームタイプである。言語ゲームの種類が創発言語にどのように影響するかを、i)言語の構成性、およびii)創発言語をその起源とは異なるタスクへ転移すること(これを言語表現性と呼ぶ)の観点から検討する。手作りのシンボリックデータセットを用いた実証実験により、異なるゲームから出現する言語は構成性が異なり、さらに表現性も異なることを示した。 Referential games and reconstruction games are the most common game types for studying emergent languages. We investigate how the type of the language game affects the emergent language in terms of: i) language compositionality and ii) transfer of an emergent language to a task different from its origin, which we refer to as language expressivity. With empirical experiments on a handcrafted symbolic dataset, we show that languages that emerge from different games have different compositionality and, further, different expressivity.	翻訳日:2021-05-22 20:43:14 公開日:2020-12-04
# CIT-GAN: 循環型画像変換敵対的生成ネットワークと虹彩提示攻撃検出への応用 CIT-GAN: Cyclic Image Translation Generative Adversarial Network With Application in Iris Presentation Attack Detection ( http://arxiv.org/abs/2012.02374v1 ) ライセンス: Link先を確認	Shivangi Yadav and Arun Ross	(参考訳) 本研究では,マルチドメイン・スタイル・トランスファーのためのCIT-GAN(Cyclic Image Translation Generative Adversarial Network)を提案する。そこで本研究では,トレーニングデータセットで表現される各ドメインのスタイル特性を学習する能力を有するスタイリングネットワークを提案する。スタイリングネットワークは、ジェネレータがソースドメインから参照ドメインへの画像の変換を駆動し、参照ドメインのスタイル特性を持つ合成画像を生成するのを支援する。各ドメインの学習スタイルの特徴は、スタイル損失とドメイン分類損失の両方に依存する。これにより、各ドメイン内のスタイル特性のばらつきが引き起こされる。提案したCIT-GANは、虹彩提示攻撃検出(PAD)の文脈において、トレーニングセットで十分に表現されていないクラスに対する合成プレゼンテーション攻撃(PA)サンプルを生成するために使用される。現在最先端の虹彩PAD法による評価は、PAD法をトレーニングするために合成されたPAサンプルを使用することの有効性を示す。さらに、Frechet Inception Distance(FID)スコアを用いて合成した試料の品質を評価する。提案手法により生成された合成画像の品質は,StarGANを含む他の競合する手法よりも優れていることを示す。 In this work, we propose a novel Cyclic Image Translation Generative Adversarial Network (CIT-GAN) for multi-domain style transfer. To facilitate this, we introduce a Styling Network that has the capability to learn style characteristics of each domain represented in the training dataset. The Styling Network helps the generator to drive the translation of images from a source domain to a reference domain and generate synthetic images with style characteristics of the reference domain. The learned style characteristics for each domain depend on both the style loss and domain classification loss. This induces variability in style characteristics within each domain. The proposed CIT-GAN is used in the context of iris presentation attack detection (PAD) to generate synthetic presentation attack (PA) samples for classes that are under-represented in the training set. Evaluation using current state-of-the-art iris PAD methods demonstrates the efficacy of using such synthetically generated PA samples for training PAD methods. Further, the quality of the synthetically generated samples is evaluated using Frechet Inception Distance (FID) score. 
Results show that the quality of synthetic images generated by the proposed method is superior to that of other competing methods, including StarGAN.	翻訳日:2021-05-22 20:42:49 公開日:2020-12-04
# DNNに対する実践的ノンボックス攻撃 Practical No-box Adversarial Attacks against DNNs ( http://arxiv.org/abs/2012.02525v1 ) ライセンス: Link先を確認	Qizhang Li, Yiwen Guo, Hao Chen	(参考訳) ディープニューラルネットワーク(DNN)の敵対的脆弱性の研究は急速に進んでいる。既存の攻撃は、内部アクセス(アーキテクチャ、パラメータ、または犠牲者モデルのトレーニングセット)または外部アクセス(モデルに問い合わせる)を必要とする。しかし、多くのシナリオではアクセスが不可能または高価になる可能性がある。我々は、攻撃者がモデル情報やトレーニングセットにアクセスしたり、モデルに問い合わせたりできないノンボックス逆行事例を調査した。その代わり、攻撃者は被害者モデルと同じ問題領域から少数のサンプルしか収集できない。このような強力な脅威モデルは、敵の攻撃の適用性を大きく広げる。非常に小さなデータセット(数十の例の順)でトレーニングを行うための3つのメカニズムを提案し、原型的再構成が最も効果的であることを示す。実験の結果,画像分類や顔認証モデルによく適合する原型的自動エンコーディングモデルに基づく逆例が得られた。 clarifai.comの商用セレブ認識システムにおいて,本手法は,事前学習されたアークフェイスモデルから敵の例を転送する攻撃と同等の15.40%の確率で,平均予測精度を著しく低下させる。 The study of adversarial vulnerabilities of deep neural networks (DNNs) has progressed rapidly. Existing attacks require either internal access (to the architecture, parameters, or training set of the victim model) or external access (to query the model). However, both the access may be infeasible or expensive in many scenarios. We investigate no-box adversarial examples, where the attacker can neither access the model information or the training set nor query the model. Instead, the attacker can only gather a small number of examples from the same problem domain as that of the victim model. Such a stronger threat model greatly expands the applicability of adversarial attacks. We propose three mechanisms for training with a very small dataset (on the order of tens of examples) and find that prototypical reconstruction is the most effective. Our experiments show that adversarial examples crafted on prototypical auto-encoding models transfer well to a variety of image classification and face verification models. On a commercial celebrity recognition system held by clarifai.com, our approach significantly diminishes the average prediction accuracy of the system to only 15.40%, which is on par with the attack that transfers adversarial examples from a pre-trained Arcface model.	翻訳日:2021-05-22 20:42:02 公開日:2020-12-04
# F2Net: 教師なしビデオオブジェクトセグメンテーションのための前景にフォーカスする学習 F2Net: Learning to Focus on the Foreground for Unsupervised Video Object Segmentation ( http://arxiv.org/abs/2012.02534v1 ) ライセンス: Link先を確認	Daizong Liu, Dongdong Yu, Changhu Wang, Pan Zhou	(参考訳) ディープラーニングベースの手法は教師なしのビデオオブジェクトのセグメンテーションにおいて大きな進歩を遂げているが、難しいシナリオ(視覚の類似性、オクルージョン、外観の変化など)はまだうまく処理されていない。そこで本研究では,Focus on Foreground Network(F2Net)を提案し,前景オブジェクトのフレーム内・フレーム間の詳細を掘り下げることで,セグメンテーション性能を効果的に向上させる。具体的には,Siamese Encoder Module,Center Guiding Appearance Diffusion Module,Dynamic Information Fusion Moduleの3つの主要部分から構成される。まず、シアムエンコーダを用いて、ペアフレーム(参照フレームと現在のフレーム)の特徴表現を抽出する。次に、フレーム間特徴(参照フレームとカレントフレーム間のデンス対応)、フレーム内特徴(現在のフレーム内のデンス対応)、および現在のフレームの本来の意味的特徴をキャプチャするCenter Guiding Appearance Diffusion Moduleを設計する。具体的には、現在のフレームにおける前景オブジェクトの中心位置を予測し、その中心点情報を空間的ガイダンスとして利用して、フレーム間特徴抽出とフレーム内特徴抽出を強化し、その特徴表現が前景オブジェクトにかなり焦点をあてる。最後に,上記の3つの異なるレベル特徴により,比較的重要な特徴を自動的に選択する動的情報融合モジュールを提案する。 DAVIS2016、Youtube-object、FBMSデータセットの大規模な実験により、提案したF2Netは、最先端のパフォーマンスを実現し、大幅な改善がなされた。 Although deep learning based methods have achieved great progress in unsupervised video object segmentation, difficult scenarios (e.g., visual similarity, occlusions, and appearance changing) are still not well-handled. To alleviate these issues, we propose a novel Focus on Foreground Network (F2Net), which delves into the intra-inter frame details for the foreground objects and thus effectively improves the segmentation performance. Specifically, our proposed network consists of three main parts: Siamese Encoder Module, Center Guiding Appearance Diffusion Module, and Dynamic Information Fusion Module. Firstly, we take a siamese encoder to extract the feature representations of paired frames (reference frame and current frame). Then, a Center Guiding Appearance Diffusion Module is designed to capture the inter-frame feature (dense correspondences between reference frame and current frame), intra-frame feature (dense correspondences in current frame), and original semantic feature of current frame. 
Specifically, we establish a Center Prediction Branch to predict the center location of the foreground object in current frame and leverage the center point information as spatial guidance prior to enhance the inter-frame and intra-frame feature extraction, and thus the feature representation considerably focus on the foreground objects. Finally, we propose a Dynamic Information Fusion Module to automatically select relatively important features through three aforementioned different level features. Extensive experiments on DAVIS2016, Youtube-object, and FBMS datasets show that our proposed F2Net achieves the state-of-the-art performance with significant improvement.	翻訳日:2021-05-22 20:41:45 公開日:2020-12-04
# Boosting offline handwritten text recognition in historical documents with few labeled lines ( http://arxiv.org/abs/2012.02544v1 ) License: check the linked page	José Carlos Aradillas, Juan José Murillo-Fuentes, Pablo M. Olmos	In this paper, we address the problem of offline handwritten text recognition (HTR) in historical documents when few labeled samples are available and some of them contain errors in the training set. We make three main contributions. First, we analyze how to perform transfer learning (TL) from a massive database to a smaller historical database, analyzing which layers of the model need a fine-tuning process. Second, we analyze methods to efficiently combine TL and data augmentation (DA). Finally, we propose an algorithm to mitigate the effects of incorrect labelings in the training set. The methods are analyzed over the ICFHR 2018 competition database, Washington and Parzival. Combining all these techniques, we demonstrate a remarkable reduction of CER (up to 6% in some cases) in the test set with little complexity overhead.	Translated: 2021-05-22 20:41:14 Published: 2020-12-04
# Towards Good Practices of U-Net for Traffic Forecasting ( http://arxiv.org/abs/2012.02598v1 ) License: check the linked page	Jingwei Xu, Jianjin Zhang, Zhiyu Yao, Yunbo Wang	This technical report presents a solution for the 2020 Traffic4Cast Challenge. We consider the traffic forecasting problem as a future frame prediction task with relatively weak temporal dependencies (possibly due to stochastic urban traffic dynamics) and strong prior knowledge, i.e., the roadmaps of the cities. For these reasons, we use the U-Net as the backbone model, and we propose a roadmap generation method to make the predicted traffic flows more rational. Meanwhile, we use a fine-tuning strategy based on the validation set to prevent overfitting, which effectively improves the prediction results. At the end of this report, we further discuss several approaches that we have considered or that could be explored in future work: (1) harnessing inherent data patterns, such as seasonality; (2) distilling and transferring common knowledge between different cities. We also analyze the validity of the evaluation metric.	Translated: 2021-05-22 20:41:01 Published: 2020-12-04
# Effective Label Propagation for Discriminative Semi-Supervised Domain Adaptation ( http://arxiv.org/abs/2012.02621v1 ) License: check the linked page	Zhiyong Huang, Kekai Sheng, Weiming Dong, Xing Mei, Chongyang Ma, Feiyue Huang, Dengwen Zhou, Changsheng Xu	Semi-supervised domain adaptation (SSDA) methods have demonstrated great potential in large-scale image classification tasks when massive labeled data are available in the source domain but very few labeled samples are provided in the target domain. Existing solutions usually focus on feature alignment between the two domains while paying little attention to the discrimination capability of learned representations in the target domain. In this paper, we present a novel and effective method, namely Effective Label Propagation (ELP), to tackle this problem by using effective inter-domain and intra-domain semantic information propagation. For inter-domain propagation, we propose a new cycle discrepancy loss to encourage consistency of semantic information between the two domains. For intra-domain propagation, we propose an effective self-training strategy to mitigate the noise in pseudo-labeled target domain data and improve the feature discriminability in the target domain. As a general method, our ELP can be easily applied to various domain adaptation approaches and can facilitate their feature discrimination in the target domain. Experiments on the Office-Home and DomainNet benchmarks show that ELP consistently improves the classification accuracy of mainstream SSDA methods by 2%~3%. Additionally, ELP improves the performance of UDA methods (81.5% vs 86.1%), based on UDA experiments on the VisDA-2017 benchmark. Our source code and pre-trained models will be released soon.	Translated: 2021-05-22 20:40:46 Published: 2020-12-04
# Global Context Aware RCNN for Object Detection ( http://arxiv.org/abs/2012.02637v1 ) License: check the linked page	Wenchao Zhang, Chong Fu, Haoyu Xie, Mai Zhu, Ming Tie, Junxin Chen	RoIPool/RoIAlign is an indispensable process in the typical two-stage object detection algorithm; it is used to rescale the object proposal cropped from the feature pyramid to generate a fixed-size feature map. However, these cropped feature maps of local receptive fields lose much of the global context information. To tackle this problem, we propose a novel end-to-end trainable framework, called Global Context Aware (GCA) RCNN, which aims to help the neural network strengthen the spatial correlation between the background and the foreground by fusing global context information. The core component of our GCA framework is a context-aware mechanism, in which a global feature pyramid and attention strategies are used for feature extraction and feature refinement, respectively. Specifically, we leverage dense connections to improve the information flow of the global context at different stages in the top-down process of FPN, and further use the attention mechanism to refine the global context at each level in the feature pyramid. Finally, we also present a lightweight version of our method, which only slightly increases model complexity and computational burden. Experimental results on the COCO benchmark dataset demonstrate the significant advantages of our approach.	Translated: 2021-05-22 20:40:24 Published: 2020-12-04
# Multi-Scale 2D Temporal Adjacent Networks for Moment Localization with Natural Language ( http://arxiv.org/abs/2012.02646v1 ) License: check the linked page	Songyang Zhang, Houwen Peng, Jianlong Fu, Yijuan Lu, Jiebo Luo	We address the problem of retrieving a specific moment from an untrimmed video by natural language. It is a challenging problem because a target moment may take place in the context of other temporal moments in the untrimmed video. Existing methods cannot tackle this challenge well since they do not fully consider the temporal contexts between temporal moments. In this paper, we model the temporal context between video moments by a set of predefined two-dimensional maps under different temporal scales. For each map, one dimension indicates the starting time of a moment and the other indicates the duration. These 2D temporal maps can cover diverse video moments with different lengths, while representing their adjacent contexts at different temporal scales. Based on the 2D temporal maps, we propose a Multi-Scale Temporal Adjacent Network (MS-2D-TAN), a single-shot framework for moment localization. It is capable of encoding the adjacent temporal contexts at each scale, while learning discriminative features for matching video moments with referring expressions. We evaluate the proposed MS-2D-TAN on three challenging benchmarks, i.e., Charades-STA, ActivityNet Captions, and TACoS, where our MS-2D-TAN outperforms the state of the art.	Translated: 2021-05-22 20:40:06 Published: 2020-12-04
# Isometric Multi-Shape Matching ( http://arxiv.org/abs/2012.02689v1 ) License: check the linked page	Maolin Gao, Zorah Lähner, Johan Thunberg, Daniel Cremers, Florian Bernard	Finding correspondences between shapes is a fundamental problem in computer vision and graphics, which is relevant for many applications, including 3D reconstruction, object tracking, and style transfer. The vast majority of correspondence methods aim to find a solution between pairs of shapes, even if multiple instances of the same class are available. While isometries are often studied in shape correspondence problems, they have not been considered explicitly in the multi-matching setting. This paper closes this gap by proposing a novel optimisation formulation for isometric multi-shape matching. We present a suitable optimisation algorithm for solving our formulation and provide a convergence and complexity analysis. Our algorithm obtains multi-matchings that are by construction provably cycle-consistent. We demonstrate the superior performance of our method on various datasets and set the new state of the art in isometric multi-shape matching.	Translated: 2021-05-22 20:39:44 Published: 2020-12-04
# SMPLy Benchmarking 3D Human Pose Estimation in the Wild ( http://arxiv.org/abs/2012.02743v1 ) License: check the linked page	Vincent Leroy, Philippe Weinzaepfel, Romain Brégier, Hadrien Combaluzier, Grégory Rogez	Predicting 3D human pose from images has seen great recent improvements. Novel approaches that can even predict both pose and shape from a single input image have been introduced, often relying on a parametric model of the human body such as SMPL. While qualitative results for such methods are often shown for images captured in the wild, a proper benchmark in such conditions is still missing, as it is cumbersome to obtain ground-truth 3D poses anywhere other than in a motion capture room. This paper presents a pipeline to easily produce and validate such a dataset with accurate ground truth, with which we benchmark recent 3D human pose estimation methods in the wild. We make use of the recently introduced Mannequin Challenge dataset, which contains in-the-wild videos of people frozen in action like statues, and leverage the fact that the people are static and the camera is moving to accurately fit the SMPL model on the sequences. A total of 24,428 frames with registered body models are then selected from 567 scenes at almost no cost, using only online RGB videos. We benchmark state-of-the-art SMPL-based human pose estimation methods on this dataset. Our results highlight that challenges remain, in particular for difficult poses or for scenes where the persons are partially truncated or occluded.	Translated: 2021-05-22 20:38:52 Published: 2020-12-04
# Few-shot Image Generation with Elastic Weight Consolidation ( http://arxiv.org/abs/2012.02780v1 ) License: check the linked page	Yijun Li, Richard Zhang, Jingwan Lu, Eli Shechtman	Few-shot image generation seeks to generate more data of a given domain, with only a few available training examples. As it is unreasonable to expect to fully infer the distribution from just a few observations (e.g., emojis), we seek to leverage a large, related source domain as pretraining (e.g., human faces). Thus, we wish to preserve the diversity of the source domain, while adapting to the appearance of the target. We adapt a pretrained model, without introducing any additional parameters, to the few examples of the target domain. Crucially, we regularize the changes of the weights during this adaptation, in order to best preserve the information of the source dataset, while fitting the target. We demonstrate the effectiveness of our algorithm by generating high-quality results of different target domains, including those with extremely few examples (e.g., <10). We also analyze the performance of our method with respect to some important factors, such as the number of examples and the dissimilarity between the source and target domain.	Translated: 2021-05-22 20:38:31 Published: 2020-12-04
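The weight regularization described in the entry above can be illustrated with a minimal sketch. It assumes the usual Elastic Weight Consolidation form, a quadratic penalty on the distance from the pretrained (source) weights, scaled per weight by a Fisher-information estimate. The function name `ewc_penalty`, the coefficient `lam`, the omission of any 1/2 factor, and the toy numbers are illustrative assumptions and are not taken from the paper.

```python
def ewc_penalty(weights, source_weights, fisher, lam=1.0):
    """Fisher-weighted quadratic penalty: lam * sum_i F_i * (w_i - w_src_i)^2.

    weights        -- current (adapted) weights, as a flat list of floats
    source_weights -- weights of the pretrained source model (same length)
    fisher         -- per-weight importance estimates (same length)
    lam            -- hypothetical regularization strength
    """
    return lam * sum(
        f * (w - w_src) ** 2
        for w, w_src, f in zip(weights, source_weights, fisher)
    )


# Toy usage: only the second weight moved, and it has importance 2.0.
penalty = ewc_penalty([1.0, 2.0], [1.0, 1.0], [5.0, 2.0])
```

In a training loop this term would be added to the adaptation loss on the few target examples, so that weights with a large importance estimate stay close to their source values while less important weights are free to change.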
# A novel multi-classifier information fusion based on Dempster-Shafer theory: application to vibration-based fault detection ( http://arxiv.org/abs/2012.02481v1 ) License: check the linked page	Vahid Yaghoubi, Liangliang Cheng, Wim Van Paepegem, Mathias Kersemans	Achieving a high prediction rate is a crucial task in fault detection. Although various classification procedures are available, none of them can give high accuracy in all applications. Therefore, in this paper, a novel multi-classifier fusion approach is developed to boost the performance of the individual classifiers. This is achieved by using Dempster-Shafer theory (DST). However, in cases with conflicting evidence, DST may give counter-intuitive results. In this regard, a preprocessing technique based on a new metric is devised in order to measure and mitigate the conflict between the pieces of evidence. To evaluate and validate the effectiveness of the proposed approach, the method is applied to 15 benchmark datasets from UCI and KEEL. Further, it is applied to classifying polycrystalline Nickel alloy first-stage turbine blades based on their broadband vibrational response. Through statistical analysis with different levels of noise-to-signal ratio, and by comparing with four state-of-the-art fusion techniques, it is shown that the proposed method improves the classification accuracy and outperforms the individual classifiers.	Translated: 2021-05-22 20:37:58 Published: 2020-12-04
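To make the fusion step and the conflict problem in the entry above concrete, here is a minimal sketch of the classical Dempster rule of combination for two mass functions. It is not the paper's method: the conflict-mitigating preprocessing and the new metric it mentions are not reproduced. The function name `dempster_combine` and the toy masses are illustrative assumptions.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of class labels (a focal element)
    to its mass; the masses of one function sum to 1.
    Returns (combined mass function, conflict K).
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Mass assigned to incompatible hypotheses counts as conflict.
            conflict += mass_a * mass_b
    norm = 1.0 - conflict
    if norm <= 1e-12:
        raise ValueError("sources are in total conflict; the rule is undefined")
    return {k: v / norm for k, v in combined.items()}, conflict


# Two classifiers that strongly disagree (a Zadeh-style example):
clf1 = {frozenset({"A"}): 0.99, frozenset({"B"}): 0.01}
clf2 = {frozenset({"C"}): 0.99, frozenset({"B"}): 0.01}
fused, k = dempster_combine(clf1, clf2)
```

In this toy case almost all mass is conflicting, and after normalization the fused belief goes entirely to class "B", which both classifiers considered very unlikely. This is the kind of counter-intuitive result under conflicting evidence that motivates measuring and reducing conflict before fusion.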
# Federated Learning with Heterogeneous Labels and Models for Mobile Activity Monitoring ( http://arxiv.org/abs/2012.02539v1 ) License: check the linked page	Gautham Krishna Gudur, Satheesh K. Perepu	Various health-care applications such as assisted living, fall detection, etc., require modeling of user behavior through Human Activity Recognition (HAR). Such applications demand characterization of insights from multiple resource-constrained user devices using machine learning techniques for effective personalized activity monitoring. On-device Federated Learning proves to be an effective approach for distributed and collaborative machine learning. However, there are a variety of challenges in addressing statistical (non-IID data) and model heterogeneities across users. In addition, in this paper, we explore a new challenge of interest: handling heterogeneities in labels (activities) across users during federated learning. To this end, we propose a framework for federated label-based aggregation, which leverages overlapping information gain across activities using Model Distillation Update. We also propose that federated transfer of model scores is sufficient, rather than model weight transfer from device to server. Empirical evaluation with the Heterogeneity Human Activity Recognition (HHAR) dataset (with four activities for effective elucidation of results) on Raspberry Pi 2 indicates an average deterministic accuracy increase of at least ~11.01%, thus demonstrating the on-device capabilities of our proposed framework.	Translated: 2021-05-22 20:37:01 Published: 2020-12-04
# Effect of the initial configuration of weights on the training and function of artificial neural networks ( http://arxiv.org/abs/2012.02550v1 ) License: check the linked page	R. J. Jesus, M. L. Antunes, R. A. da Costa, S. N. Dorogovtsev, J. F. F. Mendes, R. L. Aguiar	The function and performance of neural networks are largely determined by the evolution of their weights and biases in the process of training, starting from the initial configuration of these parameters and ending at one of the local minima of the loss function. We perform a quantitative statistical characterization of the deviation of the weights of two-hidden-layer ReLU networks of various sizes trained via Stochastic Gradient Descent (SGD) from their initial random configuration. We compare the evolution of the distribution function of this deviation with the evolution of the loss during training. We observed that successful training via SGD leaves the network in the close neighborhood of the initial configuration of its weights. For each initial weight of a link we measured the distribution function of the deviation from this value after training and found how the moments of this distribution and its peak depend on the initial weight. We explored the evolution of these deviations during training and observed an abrupt increase within the overfitting region. This jump occurs simultaneously with a similarly abrupt increase recorded in the evolution of the loss function. Our results suggest that SGD's ability to efficiently find local minima is restricted to the vicinity of the random initial configuration of weights.	Translated: 2021-05-22 20:36:39 Published: 2020-12-04
# Advocating for Multiple Defense Strategies against Adversarial Examples ( http://arxiv.org/abs/2012.02632v1 ) License: check the linked page	Alexandre Araujo, Laurent Meunier, Rafael Pinot, Benjamin Negrevergne	It has been empirically observed that defense mechanisms designed to protect neural networks against $\ell_\infty$ adversarial examples offer poor performance against $\ell_2$ adversarial examples and vice versa. In this paper we conduct a geometrical analysis that validates this observation. Then, we provide a number of empirical insights to illustrate the effect of this phenomenon in practice. Next, we review some of the existing defense mechanisms that attempt to defend against multiple attacks by mixing defense strategies. Based on our numerical experiments, we discuss the relevance of this method and state open questions for the adversarial examples community.	Translated: 2021-05-22 20:36:23 Published: 2020-12-04
# Hierarchical Clustering and Zeroth Persistent Homology ( http://arxiv.org/abs/2012.02655v1 ) License: check the linked page	İsmail Güzel and Atabey Kaygun	In this article, we show that hierarchical clustering and the zeroth persistent homology deliver the same topological information about a given data set. We show this fact using cophenetic matrices constructed out of the filtered Vietoris-Rips complex of the data set at hand. As with any cophenetic matrix, one can also display the inter-relations of zeroth homology classes via a rooted tree, also known as a dendrogram. Since homological cophenetic matrices can be calculated for higher homologies, one can also sketch similar dendrograms for higher persistent homology classes.	Translated: 2021-05-22 20:36:12 Published: 2020-12-04
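The correspondence stated in the entry above can be checked on a toy example. The sketch below is not the paper's cophenetic-matrix construction. It relies on the standard fact that the heights at which single-linkage clustering merges clusters coincide with the scales at which connected components (zeroth homology classes) die in the Vietoris-Rips filtration. It assumes the convention that an edge enters the filtration at a scale equal to the pairwise distance (some references use half of it). The function name is hypothetical.

```python
import math


def single_linkage_merge_heights(points):
    """Return the sorted heights at which single-linkage clustering merges clusters.

    Under the stated convention these are also the finite death times of the
    zeroth persistent homology classes of the Vietoris-Rips filtration.
    """
    n = len(points)
    # All pairwise distances, processed in increasing order (Kruskal-style).
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i in range(n)
        for j in range(i + 1, n)
    )
    parent = list(range(n))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    heights = []
    for distance, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            # Two components merge: one H0 class dies at this scale.
            parent[root_i] = root_j
            heights.append(distance)
    return heights


heights = single_linkage_merge_heights([(0.0,), (1.0,), (3.0,), (7.0,)])
```

For the four points on a line used here, the merges happen at distances 1, 2 and 4. One component never dies, which corresponds to the root of the dendrogram.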
# Modeling Voters in Multi-Winner Approval Voting ( http://arxiv.org/abs/2012.02811v1 ) License: check the linked page	Jaelle Scheuerman, Jason Harman, Nicholas Mattei, K. Brent Venable	In many real-world situations, collective decisions are made using voting and, in scenarios such as committee or board elections, employing voting rules that return multiple winners. In multi-winner approval voting (AV), an agent submits a ballot consisting of approvals for as many candidates as they wish, and winners are chosen by tallying up the votes and choosing the top-$k$ candidates receiving the most approvals. In many scenarios, an agent may manipulate the ballot they submit in order to achieve a better outcome by voting in a way that does not reflect their true preferences. In complex and uncertain situations, agents may use heuristics instead of incurring the additional effort required to compute the manipulation which most favors them. In this paper, we examine voting behavior in single-winner and multi-winner approval voting scenarios with varying degrees of uncertainty using behavioral data obtained from Mechanical Turk. We find that people generally manipulate their vote to obtain a better outcome, but often do not identify the optimal manipulation. There are a number of predictive models of agent behavior in the COMSOC and psychology literature that are based on cognitively plausible heuristic strategies. We show that the existing approaches do not adequately model real-world data. We propose a novel model that takes into account the size of the winning set and human cognitive constraints, and demonstrate that this model is more effective at capturing real-world behaviors in multi-winner approval voting scenarios.	Translated: 2021-05-22 20:35:57 Published: 2020-12-04
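The voting rule described in the entry above (tally approvals, then take the top-$k$ candidates) can be sketched as follows. The abstract does not specify how ties are resolved, so the alphabetical tie-break used here is an assumption, as are the function name and the toy ballots. The behavioral model proposed in the paper is not reproduced.

```python
from collections import Counter


def approval_winners(ballots, k):
    """Return the k candidates with the most approvals.

    ballots -- iterable of collections of approved candidate names
    k       -- size of the winning set
    Ties are broken alphabetically (an assumption made for this sketch).
    """
    tally = Counter()
    for ballot in ballots:
        # set() guards against a candidate being listed twice on one ballot.
        tally.update(set(ballot))
    ranked = sorted(tally, key=lambda candidate: (-tally[candidate], candidate))
    return ranked[:k]


ballots = [{"a", "b"}, {"b", "c"}, {"b"}, {"a"}]
winners = approval_winners(ballots, k=2)
```

With these ballots, "b" has three approvals, "a" has two and "c" has one, so the winning set of size 2 is "b" and "a". A voter who sincerely approves of "c" as well as "b" could consider changing their ballot depending on which outcomes they expect, which is the kind of manipulation the study examines.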
# Generator Pyramid for High-Resolution Image Inpainting ( http://arxiv.org/abs/2012.02381v1 ) License: check the linked page	Leilei Cao, Tong Yang, Yixu Wang, Bo Yan, Yandong Guo	Inpainting high-resolution images with large holes challenges existing deep learning based image inpainting methods. We present PyramidFill, a novel framework for the high-resolution image inpainting task, which explicitly disentangles content completion and texture synthesis. PyramidFill attempts to complete the content of unknown regions in a lower-resolution image, and to synthesize the textures of unknown regions in a higher-resolution image, progressively. Thus, our model consists of a pyramid of fully convolutional GANs, wherein the content GAN is responsible for completing contents in the lowest-resolution masked image, and each texture GAN is responsible for synthesizing textures in a higher-resolution image. Since completing contents and synthesizing textures demand different abilities from generators, we customize different architectures for the content GAN and texture GAN. Experiments on multiple datasets including CelebA-HQ, Places2 and a new natural scenery dataset (NSHQ) with different resolutions demonstrate that PyramidFill generates higher-quality inpainting results than the state-of-the-art methods. To better assess high-resolution image inpainting methods, we will release NSHQ, a set of high-quality natural scenery images at a resolution of 1920$\times$1080.	Translated: 2021-05-22 20:35:24 Published: 2020-12-04
# XraySyn: Realistic View Synthesis From a Single Radiograph Through CT Priors ( http://arxiv.org/abs/2012.02407v1 ) License: check the linked page	Cheng Peng, Haofu Liao, Gina Wong, Jiebo Luo, Shaohua Kevin Zhou, Rama Chellappa	A radiograph visualizes the internal anatomy of a patient through the use of X-ray, which projects 3D information onto a 2D plane. Hence, radiograph analysis naturally requires physicians to relate their prior knowledge of 3D human anatomy to 2D radiographs. Synthesizing novel radiographic views in a small range can assist physicians in interpreting anatomy more reliably; however, radiograph view synthesis is heavily ill-posed, lacking in paired data, and lacking in differentiable operations to leverage learning-based approaches. To address these problems, we use Computed Tomography (CT) for radiograph simulation and design a differentiable projection algorithm, which enables us to achieve geometrically consistent transformations between the radiography and CT domains. Our method, XraySyn, can synthesize novel views on real radiographs through a combination of realistic simulation and finetuning on real radiographs. To the best of our knowledge, this is the first work on radiograph view synthesis. We show that by gaining an understanding of radiography in 3D space, our method can be applied to radiograph bone extraction and suppression without groundtruth bone labels.	Translated: 2021-05-22 20:35:01 Published: 2020-12-04
# Multiscale Mesh Deformation Component Analysis with Attention-based Autoencoders ( http://arxiv.org/abs/2012.02459v1 ) License: check the linked page	Jie Yang, Lin Gao, Qingyang Tan, Yihua Huang, Shihong Xia and Yu-Kun Lai	Deformation component analysis is a fundamental problem in geometry processing and shape understanding. Existing approaches mainly extract deformation components in local regions at a similar scale, while deformations of real-world objects are usually distributed in a multi-scale manner. In this paper, we propose a novel method to extract multiscale deformation components automatically with a stacked attention-based autoencoder. The attention mechanism is designed to learn to softly weight multi-scale deformation components in active deformation regions, and the stacked attention-based autoencoder is learned to represent the deformation components at different scales. Quantitative and qualitative evaluations show that our method outperforms state-of-the-art methods. Furthermore, with the multiscale deformation components extracted by our method, the user can edit shapes in a coarse-to-fine fashion, which facilitates effective modeling of new shapes.	Translated: 2021-05-22 20:34:39 Published: 2020-12-04
# 医療セグメンテーションにおける不均衡問題に対するオフセット曲線の損失 Offset Curves Loss for Imbalanced Problem in Medical Segmentation ( http://arxiv.org/abs/2012.02463v1 ) ライセンス: Link先を確認	Ngan Le, Trung Le, Kashu Yamazaki, Toan Duc Bui, Khoa Luu, Marios Savides	(参考訳) 医用画像分割は医療分析において重要な役割を担い、多くの臨床応用で広く開発された。深層学習に基づくアプローチはセマンティックセグメンテーションにおいて高い性能を達成しているが、それはピクセルワイズ設定と不均衡なクラスデータ問題に限られている。本稿では,高い特徴レベル(輪郭内の領域),中間の特徴レベル(輪郭周りのオフセット曲線),低い特徴レベル(輪郭)のすべてを考慮した新しい深層学習モデルを開発することで,これらの限界に取り組む。提案するオフセット曲線(OsC)損失は,3つの主要な適合項からなる。第1のフィッティング項は画素単位のセグメンテーションに焦点を当て、第2のフィッティング項は境界周辺の領域(オフセット曲線)に注意を向ける注意モデルとして機能する。第三項は境界の長さを考慮した正規化用語としての役割を担っている。提案するOsC損失を2次元ネットワークと3次元ネットワークの両方で評価する。 2つの一般的な医療データセット、すなわち網膜DRIVEと脳腫瘍BRATS 2018データセットは、提案された損失性能のベンチマークに使用される。実験により,提案したOsC損失関数は,最も一般的なセグメンテーションネットワークUnet,FCN上でのクロスエントロピー,ディース,フォカルなどの他の主流損失関数よりも優れていた。 Medical image segmentation has played an important role in medical analysis and has been widely developed for many clinical applications. Deep learning-based approaches have achieved high performance in semantic segmentation but they are limited to pixel-wise setting and imbalanced classes data problem. In this paper, we tackle those limitations by developing a new deep learning-based model which takes into account both higher feature level i.e. region inside contour, intermediate feature level i.e. offset curves around the contour and lower feature level i.e. contour. Our proposed Offset Curves (OsC) loss consists of three main fitting terms. The first fitting term focuses on pixel-wise level segmentation whereas the second fitting term acts as attention model which pays attention to the area around the boundaries (offset curves). The third term acts as a regularization term which takes the length of boundaries into account. We evaluate our proposed OsC loss on both 2D network and 3D network. Two common medical datasets, i.e. retina DRIVE and brain tumor BRATS 2018 datasets are used to benchmark our proposed loss performance. The experiments have shown that our proposed OsC loss function outperforms other mainstream loss functions such as Cross-Entropy, Dice, Focal on the most common segmentation networks Unet, FCN.	翻訳日:2021-05-22 20:34:26 公開日:2020-12-04
# ノイズグラフ構造からのノード表現の学習 Learning Node Representations from Noisy Graph Structures ( http://arxiv.org/abs/2012.02434v1 ) ライセンス: Link先を確認	Junshan Wang, Ziyao Li, Qingqing Long, Weiyu Zhang, Guojie Song, Chuan Shi	(参考訳) グラフ上の低次元表現を学習することは、様々な下流タスクに有効であることが証明されている。しかし、ネットワークのエッジがノード自身ではなくネットワーク全体のノイズを伝搬するという点で、ネットワークを妥協する現実世界のネットワークではノイズが一般的である。既存の手法は構造特性の保存に重点を置いているが、学習されたノイズに対する表現の堅牢性は一般的に無視される。本稿では,ノイズのないノード表現を学習し,同時にノイズを排除する新しい枠組みを提案する。実グラフ上ではノイズはしばしば未知であるため、教師なし環境での正常な構造とノイズを特定するために、グラフ生成器とノイズ発生器という2つのジェネレータを設計する。一方、グラフ生成器は、正規構造を生成するのに有用なグラフ事前知識を組み込む統一スキームとして機能する。本稿では,コミュニティ構造と権限-法次分布による生成過程を例に挙げる。一方、ノイズ発生器は、基本特性を満足するだけでなく、適応的にグラフノイズを生成する。したがって、任意の分布を持つ実雑音をうまく処理することができる。最後に,ノイズを除去し,ノイズのないノード表現を得るためには,2つの生成器を協調して最適化する必要がある。本モデルは実世界データと合成データの両方で評価される。これは、ノード分類やグラフ再構成タスクの他の強力なベースラインよりも優れており、グラフノイズを取り除く能力を示している。 Learning low-dimensional representations on graphs has proved to be effective in various downstream tasks. However, noises prevail in real-world networks, which compromise networks to a large extent in that edges in networks propagate noises through the whole network instead of only the node itself. While existing methods tend to focus on preserving structural properties, the robustness of the learned representations against noises is generally ignored. In this paper, we propose a novel framework to learn noise-free node representations and eliminate noises simultaneously. Since noises are often unknown on real graphs, we design two generators, namely a graph generator and a noise generator, to identify normal structures and noises in an unsupervised setting. On the one hand, the graph generator serves as a unified scheme to incorporate any useful graph prior knowledge to generate normal structures. We illustrate the generative process with community structures and power-law degree distributions as examples. On the other hand, the noise generator generates graph noises not only satisfying some fundamental properties but also in an adaptive way. 
Thus, real noises with arbitrary distributions can be handled successfully. Finally, in order to eliminate noises and obtain noise-free node representations, two generators need to be optimized jointly, and through maximum likelihood estimation, we equivalently convert the model into imposing different regularization constraints on the true graph and noises respectively. Our model is evaluated on both real-world and synthetic data. It outperforms other strong baselines for node classification and graph reconstruction tasks, demonstrating its ability to eliminate graph noises.	翻訳日:2021-05-22 20:33:45 公開日:2020-12-04
# 機械学習による設計検証の最適化 - オープンソースソリューション Optimising Design Verification Using Machine Learning: An Open Source Solution ( http://arxiv.org/abs/2012.02453v1 ) ライセンス: Link先を確認	B. Samhita Varambally, Naman Sehgal	(参考訳) 集積回路の複雑さが増すにつれ、設計検証はASIC設計フローの最も時間を要する部分となった。 SoC設計サイクルの70%近くは検証によって消費される。すべてのコーナーケースをテストする最も一般的な方法は、制約付きランダム検証を使用することである。ランダムな刺激は、可能なすべての組み合わせにぶつかり、設計を徹底的にテストするために与えられる。しかしながら、このアプローチは、すべてのコーナーケースに到達するために、重要な人間の専門知識を必要とすることが多い。本稿では,機械学習を用いて入力刺激を生成する手法を提案する。これにより、人間の介入が少なく、設計の徹底的な検証を迅速に行える。さらに,オープンソースの検証環境であるCocotbの利用を提案する。 Pythonをベースとしており、シンプルで直感的で、機械学習アプリケーションのための膨大な関数ライブラリを持っている。これにより、System VerilogやSpecman Eといった従来のハードウェア検証言語を使用する場合よりも使いやすくなっている。 With the complexity of Integrated Circuits increasing, design verification has become the most time consuming part of the ASIC design flow. Nearly 70% of the SoC design cycle is consumed by verification. The most commonly used approach to test all corner cases is through the use of Constrained Random Verification. Random stimulus is given in order to hit all possible combinations and test the design thoroughly. However, this approach often requires significant human expertise to reach all corner cases. This paper presents an alternative using Machine Learning to generate the input stimulus. This will allow for faster thorough verification of the design with less human intervention. Furthermore, it is proposed to use the open source verification environment 'Cocotb'. Based on Python, it is simple, intuitive and has a vast library of functions for machine learning applications. This makes it more convenient to use than the bulkier approach using traditional Hardware Verification Languages such as System Verilog or Specman E.	翻訳日:2021-05-22 20:33:24 公開日:2020-12-04
# 実世界FMCWレーダ信号の深部干渉緩和と雑音化 Deep Interference Mitigation and Denoising of Real-World FMCW Radar Signals ( http://arxiv.org/abs/2012.02529v1 ) ライセンス: Link先を確認	Johanna Rock, Mate Toth, Paul Meissner, Franz Pernkopf	(参考訳) レーダセンサーは、運転支援システムや自動運転車の環境認識に不可欠である。主な性能要因は、細かな範囲の解像度と、直接速度を測定する可能性である。レーダーセンサーの数が増加し、これまでに規制されていない自動車レーダ周波数帯により、相互干渉は避けられず、対処されなければならない。センサーは、検出感度の低下を含む干渉の有害な影響を検知、または緩和する能力を持つ必要がある。本稿では,畳み込みニューラルネットワーク(CNN)を用いた干渉緩和手法について,実世界のレーダ計測で評価する。実測値とシミュレーション干渉を組み合わせることで,モデルのトレーニングに適した入出力データを生成する。本研究では,広範囲なパラメータ探索に基づいて,シミュレーションデータと計測データの複雑性関係をモデル化する性能解析を行う。さらに、有限サンプルサイズ性能比較により、シミュレーションデータと実データの両方でトレーニングされたモデルの有効性と、転送学習の有効性を示す。 state of the artによる比較パフォーマンス分析では、ハードウェアのリソース制約も考慮し、実世界の計測の干渉緩和とノイズ除去のためのcnnベースのモデルの可能性を強調している。 Radar sensors are crucial for environment perception of driver assistance systems as well as autonomous cars. Key performance factors are a fine range resolution and the possibility to directly measure velocity. With a rising number of radar sensors and the so far unregulated automotive radar frequency band, mutual interference is inevitable and must be dealt with. Sensors must be capable of detecting, or even mitigating the harmful effects of interference, which include a decreased detection sensitivity. In this paper, we evaluate a Convolutional Neural Network (CNN)-based approach for interference mitigation on real-world radar measurements. We combine real measurements with simulated interference in order to create input-output data suitable for training the model. We analyze the performance to model complexity relation on simulated and measurement data, based on an extensive parameter search. Further, a finite sample size performance comparison shows the effectiveness of the model trained on either simulated or real data as well as for transfer learning. 
A comparative performance analysis with the state of the art emphasizes the potential of CNN-based models for interference mitigation and denoising of real-world measurements, also considering resource constraints of the hardware.	翻訳日:2021-05-22 20:32:51 公開日:2020-12-04
# 高速低次半有限プログラムを用いたコミュニティ検出 Community detection using fast low-cardinality semidefinite programming ( http://arxiv.org/abs/2012.02676v1 ) ライセンス: Link先を確認	Po-Wei Wang, J. Zico Kolter	(参考訳) モジュラリティの最大化はネットワークのコミュニティ構造を理解するための基本的なツールであるが、基盤となる最適化問題は非凸であり、np困難である。 louvainやleidenといった最先端のアルゴリズムは、局所的なオプティマから逃れるために異なるヒューリスティックに焦点を合わせているが、それでもノードの割り当てをローカルに移動させ、罠にかかりやすいという欲張りなステップに依存している。本稿では,max-k-cutによる半定値緩和を最大化するために,局所更新を一般化した新しい低カージナリティアルゴリズムを提案する。提案アルゴリズムは拡張性があり、小規模なケースに対して大域半定最適性を実証的に達成し、実際のデータセットにおける最先端のアルゴリズムよりも、時間的コストがほとんどない。アルゴリズムの観点からは、ソリューションが低ランクではなくスパースである場合に、半定義型プログラミングをスケールアップするための新しい道を開く。 Modularity maximization has been a fundamental tool for understanding the community structure of a network, but the underlying optimization problem is nonconvex and NP-hard to solve. State-of-the-art algorithms like the Louvain or Leiden methods focus on different heuristics to help escape local optima, but they still depend on a greedy step that moves node assignment locally and is prone to getting trapped. In this paper, we propose a new class of low-cardinality algorithm that generalizes the local update to maximize a semidefinite relaxation derived from max-k-cut. This proposed algorithm is scalable, empirically achieves the global semidefinite optimality for small cases, and outperforms the state-of-the-art algorithms in real-world datasets with little additional time cost. From the algorithmic perspective, it also opens a new avenue for scaling-up semidefinite programming when the solutions are sparse instead of low-rank.	翻訳日:2021-05-22 20:32:20 公開日:2020-12-04
# Nimble: ディープラーニングのための軽量で並列なGPUタスクスケジューリング Nimble: Lightweight and Parallel GPU Task Scheduling for Deep Learning ( http://arxiv.org/abs/2012.02732v1 ) ライセンス: Link先を確認	Woosuk Kwon, Gyeong-In Yu, Eunji Jeong, Byung-Gon Chun	(参考訳) ディープラーニング(DL)フレームワークは、GPUを活用して、DL推論とトレーニングのスピードを改善する。理想的には、DLフレームワークはGPUの計算能力を完全に活用でき、実行時間はGPUに割り当てられた計算量に依存する。しかし、GPUタスクのスケジューリングにおいて、既存のDLフレームワークは、大きなスケジューリングオーバーヘッドや不要なシリアル実行などの非効率に悩まされている。そこで我々は,gpuタスクを最小限のスケジューリングオーバーヘッドで並列に実行するdl実行エンジンであるnimbleを提案する。 Nimble氏は、AoTスケジューリングと呼ばれる新しいテクニックを紹介している。ここで、スケジューリング手順はGPUカーネルを実行する前に終了し、実行中のスケジューリングオーバーヘッドの大部分を取り除く。さらに、Nimbleは単一のGPUで複数のGPUストリームを活用することで、GPUタスクの実行を自動的に並列化する。様々なニューラルネットワークの評価は、pytorchと比較して、nimbleは推論とトレーニングを最大22.34$\times$と3.61$\times$で高速化していることを示している。さらに、Nimbleは最先端の推論システムであるTensorRTとTVMを最大2.81$\times$と1.70$\times$で上回る。 Deep learning (DL) frameworks take advantage of GPUs to improve the speed of DL inference and training. Ideally, DL frameworks should be able to fully utilize the computation power of GPUs such that the running time depends on the amount of computation assigned to GPUs. Yet, we observe that in scheduling GPU tasks, existing DL frameworks suffer from inefficiencies such as large scheduling overhead and unnecessary serial execution. To this end, we propose Nimble, a DL execution engine that runs GPU tasks in parallel with minimal scheduling overhead. Nimble introduces a novel technique called ahead-of-time (AoT) scheduling. Here, the scheduling procedure finishes before executing the GPU kernel, thereby removing most of the scheduling overhead during run time. Furthermore, Nimble automatically parallelizes the execution of GPU tasks by exploiting multiple GPU streams in a single GPU. Evaluation on a variety of neural networks shows that compared to PyTorch, Nimble speeds up inference and training by up to 22.34$\times$ and 3.61$\times$, respectively. 
Moreover, Nimble outperforms state-of-the-art inference systems, TensorRT and TVM, by up to 2.81$\times$ and 1.70$\times$, respectively.	翻訳日:2021-05-22 20:32:01 公開日:2020-12-04
# 政策介入の効果測定におけるコンセプトドリフトの利用--COVID-19パンデミックの事例から Utilizing Concept Drift for Measuring the Effectiveness of Policy Interventions: The Case of the COVID-19 Pandemic ( http://arxiv.org/abs/2012.03728v1 ) ライセンス: Link先を確認	Lucas Baier, Niklas Kühl, Jakob Schöffer, Gerhard Satzger	(参考訳) 新型コロナウイルスの感染拡大と致死率の上昇を受け、世界各国は新型コロナウイルスの感染拡大を抑えるための徹底的な対策を講じている。しかし、これらの措置、いわゆる非医薬品介入(NPI)がウイルスの拡散にどのような影響を及ぼすかは不明である。本稿では、機械学習を用いて、政策介入の効果を測定する新しい方法でドリフト検出手法を適用した。我々は、ヨーロッパの9か国と米国の28州において、新型コロナウイルス(covid-19)の1日当たりのケースナンバーの開発にnpisが与える影響を分析した。解析の結果,NPIが新規症例数に有意な影響を及ぼすまで平均2週間以上かかることが明らかとなった。次に、NPIに関する決定性、気候、人口密度といった各国や国家の特徴が、NPIの有効性を示すまでの時間ラグに与える影響を分析する。分析では,特に学校閉鎖の時期がパンデミックの進展に重大な影響を与えていることが明らかとなった。この情報は、NPI救済でウイルスの厳格な封じ込めを解除する難しい決定に直面した政策当局者にとって極めて重要である。 As a reaction to the high infectiousness and lethality of the COVID-19 virus, countries around the world have adopted drastic policy measures to contain the pandemic. However, it remains unclear which effect these measures, so-called non-pharmaceutical interventions (NPIs), have on the spread of the virus. In this article, we use machine learning and apply drift detection methods in a novel way to measure the effectiveness of policy interventions: We analyze the effect of NPIs on the development of daily case numbers of COVID-19 across 9 European countries and 28 US states. Our analysis shows that it takes more than two weeks on average until NPIs show a significant effect on the number of new cases. We then analyze how characteristics of each country or state, e.g., decisiveness regarding NPIs, climate or population density, influence the time lag until NPIs show their effectiveness. In our analysis, especially the timing of school closures reveals a significant effect on the development of the pandemic. This information is crucial for policy makers confronted with difficult decisions to trade off strict containment of the virus with NPI relief.	翻訳日:2021-05-22 20:31:42 公開日:2020-12-04
# 音から知覚される感情の予測 Predicting Emotions Perceived from Sounds ( http://arxiv.org/abs/2012.02643v1 ) ライセンス: Link先を確認	Faranak Abri, Luis Felipe Gutiérrez, Akbar Siami Namin, David R. W. Sears, Keith S. Jones	(参考訳) 音化とは、音を通してユーザとデータやイベントを通信する科学である。聴覚アイコン、イヤコン、音声は、音化に使用される一般的な聴覚表示方式であり、より具体的には情報伝達に音声を使用する。キャプチャーされたデータが認識されると、その意味、さらに重要なことは、意図をより容易に解釈することができ、可視化技術の補完として利用することができる。聴覚知覚を通して、時間的、空間的、または他の文脈指向の情報を伝えることができる。重要な研究課題は、これらの聴覚アイコンやイヤコンから知覚される感情が、自動音化プラットフォームを構築するために予測可能であるかどうかである。本稿では,音から知覚される感情の予測を行うために,主流および従来の機械学習アルゴリズムを複数開発する実験を行う。そのため、音の主な特徴を捕捉し、特徴量削減技術を用いて機械学習アルゴリズムを用いてモデル化する。知覚された感情を高い精度で予測することが可能である。特にランダムフォレストに基づく回帰は、他の機械学習アルゴリズムと比較して優位性を示した。 Sonification is the science of communication of data and events to users through sounds. Auditory icons, earcons, and speech are the common auditory display schemes utilized in sonification, or more specifically in the use of audio to convey information. Once the captured data are perceived, their meanings, and more importantly, intentions can be interpreted more easily and thus can be employed as a complement to visualization techniques. Through auditory perception it is possible to convey information related to temporal, spatial, or some other context-oriented information. An important research question is whether the emotions perceived from these auditory icons or earcons are predictable in order to build an automated sonification platform. This paper conducts an experiment through which several mainstream and conventional machine learning algorithms are developed to study the prediction of emotions perceived from sounds. To do so, the key features of sounds are captured and then are modeled using machine learning algorithms using feature reduction techniques. We observe that it is possible to predict perceived emotions with high accuracy. In particular, the regression based on Random Forest demonstrated its superiority compared to other machine learning algorithms.	翻訳日:2021-05-22 20:31:22 公開日:2020-12-04
# モバイルデータによるマルチモーダルプライバシー保護モッド予測 : 予備研究 Multimodal Privacy-preserving Mood Prediction from Mobile Data: A Preliminary Study ( http://arxiv.org/abs/2012.02359v1 ) ライセンス: Link先を確認	Terrance Liu, Paul Pu Liang, Michal Muszynski, Ryo Ishii, David Brent, Randy Auerbach, Nicholas Allen, Louis-Philippe Morency	(参考訳) 精神的な健康状態は、先進医療に共通のアクセスを持つ国でも診断が不十分である。容易に収集できるデータから気分を正確にかつ効率的に予測できる能力は、精神疾患の早期発見と介入にいくつかの重要な意味を持つ。人間の行動を監視するための有望なデータソースのひとつは、毎日のスマートフォンの利用だ。しかし、個人(例えば、個人識別可能な情報)や保護属性(例えば、人種、性別)を通じてユーザーを特定することなく、行動の要約に注意する必要がある。本稿では,リスクの高い青年の移動行動のデータセットを用いて行動マーカーや日常の気分を調査する。計算モデルを用いて、テキストとアプリの使用状況のマルチモーダルなモデリングは、各モーダルのみに対する日々のムードを高い精度で予測することを発見した。さらに,日々の気分予測を継続しながら,ユーザのアイデンティティを確実に無視するアプローチを評価する。マルチモーダル表現とプライバシ保護学習を組み合わせることで、単調なアプローチに比べてパフォーマンスプライバシのフロンティアを推し進めることができます。 Mental health conditions remain under-diagnosed even in countries with common access to advanced medical care. The ability to accurately and efficiently predict mood from easily collectible data has several important implications towards the early detection and intervention of mental health disorders. One promising data source to help monitor human behavior is from daily smartphone usage. However, care must be taken to summarize behaviors without identifying the user through personal (e.g., personally identifiable information) or protected attributes (e.g., race, gender). In this paper, we study behavioral markers or daily mood using a recent dataset of mobile behaviors from high-risk adolescent populations. Using computational models, we find that multimodal modeling of both text and app usage features is highly predictive of daily mood over each modality alone. Furthermore, we evaluate approaches that reliably obfuscate user identity while remaining predictive of daily mood. By combining multimodal representations with privacy-preserving learning, we are able to push forward the performance-privacy frontier as compared to unimodal approaches.	翻訳日:2021-05-22 20:31:07 公開日:2020-12-04
# アナログmramニューロンとシナプスを用いた一サイクルmlp分類 A Single-Cycle MLP Classifier Using Analog MRAM-based Neurons and Synapses ( http://arxiv.org/abs/2012.02695v1 ) ライセンス: Link先を確認	Ramtin Zand	(参考訳) 本稿では、スピン軌道トルク(SOT)磁気抵抗型ランダムアクセスメモリ(MRAM)を用いて、単一サイクルアナログインメモリコンピューティング(IMC)アーキテクチャのためのシグモダルニューロンと双項化シナプスを実現する。まず,従来最も電力効率の良いアナログsgmoidalニューロン設計に比べてパワーエリア積を12倍削減できるアナログsot-mramベースのニューロンビットセルを提案する。次に、MNISTパターン認識アプリケーションのためのアナログIMCベースの多層パーセプトロン(MLP)アーキテクチャを形成するために、メモリサブアレイ内で提案されたニューロンとシナプスビット細胞を用いる。アーキテクチャレベルの結果から,我々のアナログICCアーキテクチャは,同一の分類精度を実現しつつ,混合信号アナログ/デジタルMCアーキテクチャとディジタルGPU実装と比較して,少なくとも2桁,4桁の性能向上を実現していることがわかった。 In this paper, spin-orbit torque (SOT) magnetoresistive random-access memory (MRAM) devices are leveraged to realize sigmoidal neurons and binarized synapses for a single-cycle analog in-memory computing (IMC) architecture. First, an analog SOT-MRAM-based neuron bitcell is proposed which achieves a 12x reduction in power-area-product compared to the previous most power- and area-efficient analog sigmoidal neuron design. Next, proposed neuron and synapse bit cells are used within memory subarrays to form an analog IMC-based multilayer perceptron (MLP) architecture for the MNIST pattern recognition application. The architecture-level results exhibit that our analog IMC architecture achieves at least two and four orders of magnitude performance improvement compared to a mixed-signal analog/digital IMC architecture and a digital GPU implementation, respectively while realizing a comparable classification accuracy.	翻訳日:2021-05-22 20:30:50 公開日:2020-12-04
# SensiX:エッジ上でのコラボレーション機械学習のためのプラットフォーム SensiX: A Platform for Collaborative Machine Learning on the Edge ( http://arxiv.org/abs/2012.06035v1 ) ライセンス: Link先を確認	Chulhong Min, Akhil Mathur, Alessandro Montanari, Utku Gunay Acer, Fahim Kawsar	(参考訳) 人体上または人体近傍に複数の感覚デバイスが出現することは、極端なエッジコンピューティングの新たなダイナミクスを明らかにする。これにより、スマートフォンやwi-fiゲートウェイなどの強力でリソースに富んだエッジデバイスがパーソナルエッジに変換され、複数のデバイスと連携して、局所性、可用性、近接性といったパワーを生かしながら、優れたセンシングアプリケーションを提供する。当然、この変革は、個人のエッジで正確で堅牢で効率的な感覚システムを構築する方法を再考させる。例えば、複数のIMU搭載デバイスを備えた信頼性の高いアクティビティトラッカーをどのように構築するか? センシングモデルの精度は向上しているが、特に新興のマルチデバイス、パーソナルエッジ環境において、ランタイムのパフォーマンスは依然として低下している。パフォーマンスに影響を及ぼす2つの主要な注意事項は、デバイス可用性、データ品質、デバイス配置など、いくつかのランタイム要因によって寄与されるデバイスとデータ変数である。そこで本研究では,センサデータとセンシングモデルの間を行き来するパーソナルエッジプラットフォームsensixを提案する。 SensiXは、アプリケーションからモデル実行を外部化し、デバイス間データの原則マッピングを行う変換演算子と、モデル精度の関数として正しい実行経路を体系的に選択する品質対応選択演算子とからなる。我々はsensixの設計と実装を報告し、モーションおよびオーディオベースのマルチデバイスセンシングシステムの開発におけるその効果を実証する。評価の結果,SensiXは3mWのオーバヘッドを犠牲にして,全体の精度が7～13%向上し,環境動態が最大30%向上した。 The emergence of multiple sensory devices on or near a human body is uncovering new dynamics of extreme edge computing. In this, a powerful and resource-rich edge device such as a smartphone or a Wi-Fi gateway is transformed into a personal edge, collaborating with multiple devices to offer remarkable sensory applications, while harnessing the power of locality, availability, and proximity. Naturally, this transformation pushes us to rethink how to construct accurate, robust, and efficient sensory systems at the personal edge. For instance, how do we build a reliable activity tracker with multiple on-body IMU-equipped devices? While the accuracy of sensing models is improving, their runtime performance still suffers, especially in these emerging multi-device, personal edge environments. Two prime caveats that impact their performance are device and data variabilities, contributed by several runtime factors, including device availability, data quality, and device placement. To this end, we present SensiX, a personal edge platform that stays between sensor data and sensing models, and ensures best-effort inference under any condition while coping with device and data variabilities without demanding model engineering. SensiX externalises model execution away from applications, and comprises two essential functions, a translation operator for principled mapping of device-to-device data and a quality-aware selection operator to systematically choose the right execution path as a function of model accuracy. We report the design and implementation of SensiX and demonstrate its efficacy in developing motion and audio-based multi-device sensing systems. Our evaluation shows that SensiX offers a 7-13% increase in overall accuracy and up to 30% increase across different environment dynamics at the expense of 3mW power overhead.	翻訳日:2021-05-22 20:30:35 公開日:2020-12-04
# 線形および非線形偏微分方程式を解くための局所極値学習機械と領域分解 Local Extreme Learning Machines and Domain Decomposition for Solving Linear and Nonlinear Partial Differential Equations ( http://arxiv.org/abs/2012.02895v1 ) ライセンス: Link先を確認	Suchuan Dong, Zongwei Li	(参考訳) 本稿では,エクストリーム・ラーニング・マシン(elm),ドメイン分割(domain decomposition),局所ニューラルネットワーク(local neural network)のアイデアを組み合わせた,線形および非線形偏微分方程式の解法を提案する。各サブドメインのフィールドソリューションは、ローカルフィードフォワードニューラルネットワークで表現され、サブドメイン境界に$c^k$連続性が課される。各ローカルニューラルネットワークは、少数の隠れレイヤで構成されているが、最後の隠れレイヤは幅がある。局所ニューラルネットワークのすべての隠蔽層における重み/バイアス係数はランダムな値に予め設定されており、出力層内の重み係数のみがトレーニングパラメータである。全体ニューラルネットワークは、バックプロパゲーション型アルゴリズムではなく、線形または非線形の最小二乗計算によって訓練される。本稿では,長期動的シミュレーションのためのブロック時間マーチング手法を提案する。本手法は、ニューラルネットワークにおける自由度に関して明確な収束感を示す。その数値誤差は通常、自由度が増加するにつれて指数関数的にまたはほぼ指数関数的に減少する。提案手法の計算性能を実証するために, 広範囲な数値実験を行った。本稿では,DGM法(Deep Galerkin Method)とPINN(Physical-informed Neural Network)を精度と計算コストの観点から比較する。現在の手法では,DGM や PINN に比べて,数値誤差やネットワークトレーニング時間(典型的には桁違い)がかなり小さいため,明らかな優位性を示す。また、現在の手法を古典有限要素法(FEM)と比較する。現在の手法の計算性能は、FEMの性能と同等であり、しばしば同等である。 We present a neural network-based method for solving linear and nonlinear partial differential equations, by combining the ideas of extreme learning machines (ELM), domain decomposition and local neural networks. The field solution on each sub-domain is represented by a local feed-forward neural network, and $C^k$ continuity is imposed on the sub-domain boundaries. Each local neural network consists of a small number of hidden layers, while its last hidden layer can be wide. The weight/bias coefficients in all hidden layers of the local neural networks are pre-set to random values and are fixed, and only the weight coefficients in the output layers are training parameters. The overall neural network is trained by a linear or nonlinear least squares computation, not by the back-propagation type algorithms. We introduce a block time-marching scheme together with the presented method for long-time dynamic simulations. 
The current method exhibits a clear sense of convergence with respect to the degrees of freedom in the neural network. Its numerical errors typically decrease exponentially or nearly exponentially as the number of degrees of freedom increases. Extensive numerical experiments have been performed to demonstrate the computational performance of the presented method. We compare the current method with the deep Galerkin method (DGM) and the physics-informed neural network (PINN) in terms of the accuracy and computational cost. The current method exhibits a clear superiority, with its numerical errors and network training time considerably smaller (typically by orders of magnitude) than those of DGM and PINN. We also compare the current method with the classical finite element method (FEM). The computational performance of the current method is on par with, and oftentimes exceeds, the FEM performance.	翻訳日:2021-05-22 20:30:05 公開日:2020-12-04
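
The extreme-learning-machine idea in the last abstract above (hidden-layer weights fixed at random values, only the output weights obtained by a least squares solve) can be illustrated with a minimal sketch. This is not the authors' implementation: it uses a single sub-domain with no domain decomposition, and the test problem u''(x) = f(x) on [0, 1] with zero boundary values, the tanh activation, the function name `solve_poisson_elm`, and all parameter values are assumptions chosen for illustration.

```python
import numpy as np

def solve_poisson_elm(f, n_hidden=80, n_colloc=120, scale=4.0, seed=0):
    """Approximate u with u''(x) = f(x) on [0, 1], u(0) = u(1) = 0 (toy problem)."""
    rng = np.random.default_rng(seed)
    # Hidden-layer weights and biases are drawn once and never trained.
    w = rng.uniform(-scale, scale, n_hidden)
    b = rng.uniform(-scale, scale, n_hidden)

    def features(x):
        # Hidden-layer outputs tanh(w*x + b), shape (len(x), n_hidden).
        return np.tanh(np.outer(x, w) + b)

    def features_xx(x):
        # Second x-derivative of tanh(w*x + b): -2 w^2 t (1 - t^2).
        t = np.tanh(np.outer(x, w) + b)
        return -2.0 * w**2 * t * (1.0 - t**2)

    x = np.linspace(0.0, 1.0, n_colloc)   # collocation points
    xb = np.array([0.0, 1.0])             # boundary points
    # Stack PDE rows and boundary rows into one linear least squares system.
    A = np.vstack([features_xx(x), features(xb)])
    rhs = np.concatenate([f(x), np.zeros(2)])
    beta, *_ = np.linalg.lstsq(A, rhs, rcond=None)  # output-layer weights
    return lambda xq: features(np.atleast_1d(xq)) @ beta

# Manufactured solution u(x) = sin(pi x), so f(x) = -pi^2 sin(pi x).
u = solve_poisson_elm(lambda x: -np.pi**2 * np.sin(np.pi * x))
xq = np.linspace(0.0, 1.0, 1001)
max_err = np.max(np.abs(u(xq) - np.sin(np.pi * xq)))
```

Because the unknowns enter linearly, no back-propagation is needed for this linear problem; a nonlinear equation would require a nonlinear least squares solver instead.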

# タイムビンとウェーブライクエンコーディングのハイブリッド絡み合いの発生法

Scheme for the generation of hybrid entanglement between time-bin and wavelike encodings ( http://arxiv.org/abs/2002.04450v3 )

ライセンス: Link先を確認

Élie Gouzien, Floriane Brunel, Sébastien Tanzilli, Virginia D'Auria

(参考訳) 本稿では,位相に符号化されたコヒーレント状態キュービットと単一光子時間ビンキュービットを絡むハイブリッド状態の生成法を提案する。他の報告されたソリューションと比較して、時間ビン符号化は、特に光ファイバの長距離伝搬を含む応用によく適合する。これにより,今後の量子通信に有望な資源となる。この観点から,本手法を現実的な実験資源を考慮した分析を行い,得られたハイブリッド状態の品質に対する不完全性の影響について考察する。

We propose a scheme for the generation of hybrid states entangling a single-photon time-bin qubit with a coherent-state qubit encoded on phases. Compared to other reported solutions, time-bin encoding makes hybrid entanglement particularly well adapted to applications involving long-distance propagation in optical fibers. This makes our proposal a promising resource for future out-of-the-laboratory quantum communication. In this perspective, we analyze our scheme by taking into account realistic experimental resources and discuss the impact of their imperfections on the quality of the obtained hybrid state.

翻訳日:2023-06-03 23:31:13 公開日:2020-12-04

# 双極子および高モーメント保存系における異常拡散

Anomalous Diffusion in Dipole- and Higher-Moment Conserving Systems ( http://arxiv.org/abs/2004.00635v2 )

ライセンス: Link先を確認

Johannes Feldmeier, Pablo Sala, Giuseppe de Tomasi, Frank Pollmann, Michael Knap

(参考訳) 相互作用系における大域的保存量の存在は、一般に後期における拡散輸送をもたらす。ここでは、関連する大域電荷の双極子モーメント、あるいはさらに高次のモーメントを保存する系がこのシナリオから外れ、代わりに亜拡散的な減衰を示すことを示す。双極子および四極子保存の特定の場合について時間発展をセルオートマトンとしてモデル化し、後期緩和における特有の異常指数を数値的に求める。これらの知見は、保存モーメントの数に依存する一連の指数を導く一般的な流体力学モデルを解析的に構築することで説明され、このモデルは電荷相関関数のスケーリング形式を正確に記述する。相関の空間的プロファイルを解析し、高次モーメント保存の実験的に関連しうる兆候について考察する。

The presence of global conserved quantities in interacting systems generically leads to diffusive transport at late times. Here, we show that systems conserving the dipole moment of an associated global charge, or even higher moment generalizations thereof, escape this scenario, displaying subdiffusive decay instead. Modelling the time evolution as cellular automata for specific cases of dipole- and quadrupole-conservation, we numerically find distinct anomalous exponents of the late time relaxation. We explain these findings by analytically constructing a general hydrodynamic model that results in a series of exponents depending on the number of conserved moments, yielding an accurate description of the scaling form of charge correlation functions. We analyze the spatial profile of the correlations and discuss potential experimentally relevant signatures of higher moment conservation.

翻訳日:2023-05-27 05:21:56 公開日:2020-12-04
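
The dependence of the late-time exponent on the number of conserved moments can be sketched numerically. The abstract above does not state the hydrodynamic equation; the sketch assumes the one-dimensional form in which each Fourier mode decays as exp(-D k^(2(m+1)) t) when all moments up to order m are conserved (m = 0 is ordinary diffusion), so the autocorrelation should decay as t^(-1/(2m+2)). The function names, grid sizes, and D = 1 are illustrative choices.

```python
import numpy as np

def autocorrelation(t, m, D=1.0, kmax=10.0, nk=200001):
    """C(0, t) = (1/2pi) * integral of exp(-D |k|^z t) dk, with z = 2(m + 1)."""
    z = 2 * (m + 1)
    k = np.linspace(-kmax, kmax, nk)
    dk = k[1] - k[0]
    # Plain Riemann sum; the integrand is negligible near the cutoff kmax.
    return np.sum(np.exp(-D * np.abs(k) ** z * t)) * dk / (2.0 * np.pi)

def fitted_exponent(m, times=None):
    """Log-log slope of C(0, t) over the given time window."""
    if times is None:
        times = np.logspace(1, 3, 20)
    c = np.array([autocorrelation(t, m) for t in times])
    slope, _ = np.polyfit(np.log(times), np.log(c), 1)
    return slope

slope_charge = fitted_exponent(0)      # charge only: diffusion
slope_dipole = fitted_exponent(1)      # charge + dipole
slope_quadrupole = fitted_exponent(2)  # charge + dipole + quadrupole
```

Under this assumed equation the fitted slopes should sit close to -1/2, -1/4 and -1/6, which is the sense in which more conserved moments give slower, subdiffusive relaxation.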

# 量子線形応答の文脈性による熱力学と計測学における量子シグネチャの証明

Certifying quantum signatures in thermodynamics and metrology via contextuality of quantum linear response ( http://arxiv.org/abs/2004.01213v3 )

ライセンス: Link先を確認

Matteo Lostaglio

(参考訳) 線形応答領域における古典力学と量子力学の基本的な違いを,後者が一般に文脈的であることを示すことによって同定する。これにより、有利な出力スケーリングが文脈性の形での非古典的効果を必然的に必要とする量子エンジンの例を提供することができる。さらに,局所的な計測(メトロロジー)における文脈的利点について述べる。線形応答理論の普遍性を考えると、これらのツールによって幅広い量子現象の非古典性が証明できることを期待している。

We identify a fundamental difference between classical and quantum dynamics in the linear response regime by showing that the latter is in general contextual. This allows us to provide an example of a quantum engine whose favorable power output scaling unavoidably requires nonclassical effects in the form of contextuality. Furthermore, we describe contextual advantages for local metrology. Given the ubiquity of linear response theory, we anticipate that these tools will allow one to certify the nonclassicality of a wide array of quantum phenomena.

翻訳日:2023-05-27 03:05:10 公開日:2020-12-04

# 量子スピンシステムによるフロケット工学と例外環のシミュレート

Floquet engineering and simulating exceptional rings with a quantum spin system ( http://arxiv.org/abs/2005.02703v3 )

ライセンス: Link先を確認

Peng He, Ze-Hao Huang

(参考訳) コヒーレント放射の形での時間周期駆動は、トポロジカル材料や合成量子物質の操作に強力なツールを提供する。本稿では、フロケット工学を通してスペクトルに例外環を示す非エルミート半金属を実現する手法を提案する。環の同心対から双極子対への遷移が観察される。同心対は量子化されたベリー相のみを持つが、双極子対はそれに加えて反対のチャーン数を持ち、フェルミ面のトポロジカルなリフシッツ転移を示す。システムの輸送特性に対処し, この遷移過程には非自明なホール伝導率の出現が伴うことがわかった。さらに, 量子スピン系を用いた非エルミート半金属の量子シミュレーションと, 長時間ダイナミクスによるトポロジーのキャラクタリゼーションについて検討した。

Time-periodic driving in the form of coherent radiation provides a powerful tool for the manipulation of topological materials or synthetic quantum matter. In this paper we propose a scheme to realize non-Hermitian semimetals exhibiting exceptional rings in the spectra through Floquet engineering. A transition from a concentric pair of the rings to a dipolar pair is observed. The concentric pair carries only a quantized Berry phase while the dipolar pair possesses opposite Chern numbers in addition, signaling a topological Lifshitz transition of the Fermi surface. The transport properties of the system are addressed, and we find that this transition process is accompanied by the emergence of a nontrivial Hall conductivity. Furthermore, we explore the quantum simulation of non-Hermitian semimetals with a quantum spin system and the characterization of the topology via the long-time dynamics.

翻訳日:2023-05-21 00:48:33 公開日:2020-12-04

# 有効次元最小化による量子多体スカーのダイナミクスへの影響強化

Enhancing the effect of quantum many-body scars on dynamics by minimising the effective dimension ( http://arxiv.org/abs/2006.03099v2 )

ライセンス: Link先を確認

Shane Dooley, Graham Kells

(参考訳) 量子多体スカーリングは、相互作用するリドバーグ原子鎖における長寿命コヒーレント振動のメカニズムであると考えられている。これらの持続的な振動は、特定の初期状態と多体スカーとの大きな重なりに起因する。「有効次元」は多体スカーを持つ系における熱化しない初期状態の同定に有用な指標であることを示す。有効次元を最小化することで、より顕著で長寿命な振動を生じさせる、物理的に妥当なリドバーグ鎖の初期状態が得られ、多体スカーが力学に与える影響が強調される。

Quantum many-body scarring is believed to be the mechanism behind long-lived coherent oscillations in interacting Rydberg atom chains. These persistent oscillations are due to the large overlap of the many-body scars with certain initial states. We show that the "effective dimension" is a useful measure for identifying non-thermalising initial states in many-body scarred systems. By minimising the effective dimension we find physically reasonable initial states of the Rydberg chain that lead to more pronounced and longer lived oscillations, accentuating the effect of the many-body scars on the dynamics.

翻訳日:2023-05-17 04:01:42 公開日:2020-12-04
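
The "effective dimension" named in the abstract above can be made concrete with a short sketch. The abstract does not spell out the formula; the sketch assumes the common definition as the inverse participation ratio of the initial state in the energy eigenbasis, D_eff = 1 / sum_n |<E_n|psi>|^4. The random symmetric matrix is a stand-in Hamiltonian for illustration only, not the Rydberg chain model, and `effective_dimension` is a hypothetical helper name.

```python
import numpy as np

def effective_dimension(hamiltonian, state):
    """Inverse participation ratio of `state` in the eigenbasis of `hamiltonian`."""
    _, eigvecs = np.linalg.eigh(hamiltonian)        # columns are eigenvectors
    state = state / np.linalg.norm(state)           # normalise the input state
    probs = np.abs(eigvecs.conj().T @ state) ** 2   # overlaps |<E_n|psi>|^2
    return 1.0 / np.sum(probs ** 2)

# Toy Hamiltonian: a random real symmetric 8 x 8 matrix (assumption).
rng = np.random.default_rng(0)
a = rng.normal(size=(8, 8))
h = (a + a.T) / 2.0
_, vecs = np.linalg.eigh(h)

d_eigenstate = effective_dimension(h, vecs[:, 0])            # one eigenstate
d_pair = effective_dimension(h, vecs[:, 0] + vecs[:, 3])     # two eigenstates, equal weight
d_uniform = effective_dimension(h, vecs.sum(axis=1))         # all eight, equal weight
```

A state supported on few eigenstates has a small effective dimension and cannot dephase into a thermal-looking state, which is why minimising this quantity is a plausible way to search for initial states with long-lived oscillations.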

# 動的アセット割り当てを伴うブロックチェーンマイナの平衡

Equilibrium of Blockchain Miners with Dynamic Asset Allocation ( http://arxiv.org/abs/2006.08016v2 )

ライセンス: Link先を確認

Go Yamamoto, Aron Laszka, Fuhito Kojima

(参考訳) マイニングビジネスの複合リターンを最大化しようとするブロックチェーンマイナーをモデル化し、分析する。最適戦略の分析により、マイナーとマイニングプールの間に新たな均衡点が見いだされ、これによって各マイナーやマイニングプールの市場シェアが予測される。マイニングのコストが、均衡における各マイナーまたはマイニングプールのシェアを決定する。マイニングコストがすべての参加者で同じ水準であれば、複合リターンを最大化しようとするマイナーもマイニングプールも、ハッシュレートの50%以上を占める経済的インセンティブを持たないと結論づける。しかし、コスト効率が際立って優れたマイナーが存在する場合、このマイナーの市場シェアは均衡において50%を超える可能性があり、エコシステム全体の存続を脅かしうる。

We model and analyze blockchain miners who seek to maximize the compound return of their mining businesses. The analysis of the optimal strategies finds a new equilibrium point among the miners and the mining pools, which predicts the market share of each miner or mining pool. The cost of mining determines the share of each miner or mining pool at equilibrium. We conclude that neither miners nor mining pools who seek to maximize their compound return will have a financial incentive to occupy more than 50% of the hash rate if the cost of mining is at the same level for all. However, if there is an outstandingly cost-efficient miner, then the market share of this miner may exceed 50% in the equilibrium, which can threaten the viability of the entire ecosystem.

翻訳日:2023-05-14 19:02:42 公開日:2020-12-04

# 高速捕捉イオン絡み込み動作のためのマルチGHz繰り返し、マルチワット平均電力、紫外レーザーパルス

Multi-GHz repetition rate, multi-watt average power, ultraviolet laser pulses for fast trapped-ion entanglement operations ( http://arxiv.org/abs/2007.03404v2 )

ライセンス: Link先を確認

M. I. Hussain, D. Heinrich, M. Guevara-Bertsch, E. Torrontegui, J. J. García-Ripoll, C. F. Roos, and R. Blatt

(参考訳) 捕捉イオンで2量子ビットゲート操作を行う従来のアプローチは、レーザー光による運動サイドバンド上のイオンの励起に依存しており、これは本質的に遅いプロセスである。高速エンタングリングゲートプロトコルを実装する1つの方法は、ゲート速度を桁違いに高めるための適切なパルスレーザーを必要とする。しかし、このような高速エンタングリングゲート動作の実現は、必要となるレーザー源が市販されていないため、大きな技術的課題となる。そこで我々は、周波数コムに基づく超高速エンタングリングゲート用光源を開発した。この光源は、393.3 nm、繰り返し周波数5 GHzで、パルスエネルギー$\sim$800 pJのモードロックパルス数百個からなるバーストを生成し、高速な2量子ビットゲート操作を実装するための全ての要求を満たす。単一のチャープ紫外パルスを用いて、Ca$^+$イオンにおける急速断熱通過を実証する。エンタングリングゲートを誘起するレーザーシステムの適用性と予測性能を検証するために、光源パラメータに基づいてシミュレーションを行う。ゲート時間はトラップ周期よりも短くでき、エラーは$10^{-4}$に近づく。

The conventional approach to perform two-qubit gate operations in trapped ions relies on exciting the ions on motional sidebands with laser light, which is an inherently slow process. One way to implement a fast entangling gate protocol requires a suitable pulsed laser to increase the gate speed by orders of magnitude. However, the realization of such a fast entangling gate operation presents a big technical challenge, as the required laser source is not available off the shelf. For this purpose, we have engineered an ultrafast entangling gate source based on a frequency comb. The source generates bursts of several hundred mode-locked pulses with pulse energy $\sim$800 pJ at 5 GHz repetition rate at 393.3 nm and complies with all requirements for implementing a fast two-qubit gate operation. Using a single, chirped ultraviolet pulse, we demonstrate a rapid adiabatic passage in a Ca$^+$ ion. To verify the applicability and projected performance of the laser system for inducing entangling gates, we run simulations based on our source parameters. The gate time can be faster than a trap period with an error approaching $10^{-4}$.

翻訳日:2023-05-11 02:00:41 公開日:2020-12-04

# 1次元物質波量子ブリーザーの測定

Measurement of one-dimensional matter-wave quantum breather ( http://arxiv.org/abs/2007.08365v2 )

ライセンス: Link先を確認

Piotr Staroń, Andrzej Syrwid, Krzysztof Sacha

(参考訳) Bethe ansatz法と粒子の位置測定の数値シミュレーションを用いて、平均場近似ではいわゆるブリーザー解に対応する、環上の引力相互作用するボソンのポストクエンチ多体ダイナミクスを調べる。初期多体基底状態が並進不変であるにもかかわらず、系の質量中心の量子揺らぎを取り除くと、測定によりブリーザーのダイナミクスが明らかになる。さらに、多体時間発展の解析は、ブリーザーを形成するソリトンが解離する兆候を示している。

Employing the Bethe ansatz approach and numerical simulations of measurements of particles' positions, we investigate the post-quench many-body dynamics of attractively interacting bosons on a ring, which in the mean-field approach corresponds to the so-called breather solution. Despite the fact that the initial many-body ground state is translationally invariant, the measurements reveal breather dynamics if quantum fluctuations of the center of mass of the system are extracted. Moreover, the analysis of the many-body evolution shows signatures of dissociation of the solitons that form the breather.

翻訳日:2023-05-09 07:01:32 公開日:2020-12-04

# レーザによるオンチップ導波路上のコヒーレント光メモリ

Coherent Optical Memory Based on A Laser-written On-chip Waveguide ( http://arxiv.org/abs/2008.12901v2 )

ライセンス: Link先を確認

Tian-Xiang Zhu, Chao Liu, Liang Zheng, Zong-Quan Zhou, Chuan-Feng Li, Guang-Can Guo

(参考訳) 量子メモリは、大規模量子ネットワークを構築するためのコアデバイスである。スケーラブルで便利な実用用途では、集積光メモリ、特にオンチップ光メモリは、他のオンチップデバイスと容易に統合できるため、重要な要件である。本稿では、希土類イオンドープ結晶(すなわち $\mathrm{Eu^{3+}}$:$\mathrm{Y_2SiO_5}$)の表面に作製されたIV型導波路に基づくコヒーレント光メモリについて報告する。表面導波路内部における $\mathrm{Eu^{3+}}$ イオンの光学遷移($\mathrm{{^7}F{_0}\rightarrow{^5}D{_0}}$)の性質は、バルク結晶と比較してよく保存されている。スピン波原子周波数コム貯蔵は、IV型導波路内で実証される。この装置の信頼性は、読み出しパルスと基準パルスとの間の ${97\pm 1\%}$ の高い干渉可視性によって確認される。開発したオンチップ光メモリは、集積量子ノードへの道を開く。

Quantum memory is the core device for the construction of large-scale quantum networks. For scalable and convenient practical applications, integrated optical memories, especially on-chip optical memories, are crucial requirements because they can be easily integrated with other on-chip devices. Here, we report the coherent optical memory based on a type-IV waveguide fabricated on the surface of a rare-earth ion-doped crystal (i.e. $\mathrm{Eu^{3+}}$:$\mathrm{Y_2SiO_5}$). The properties of the optical transition ($\mathrm{{^7}F{_0}\rightarrow{^5}D{_0}}$) of the $\mathrm{Eu^{3+}}$ ions inside the surface waveguide are well preserved compared to those of the bulk crystal. Spin-wave atomic frequency comb storage is demonstrated inside the type-IV waveguide. The reliability of this device is confirmed by the high interference visibility of ${97\pm 1\%}$ between the retrieval pulse and the reference pulse. The developed on-chip optical memory paves the way towards integrated quantum nodes.

翻訳日:2023-05-04 11:20:28 公開日:2020-12-04

# フィン電界効果トランジスタの共通ゲート構造を用いた小型スピン量子ビット

Compact spin qubits using the common gate structure of fin field-effect transistors ( http://arxiv.org/abs/2009.04620v2 )

ライセンス: Link先を確認

Tetsufumi Tanamoto, Keiji Ono

(参考訳) 商用トランジスタのサイズはナノメートルオーダーであり、従来の相補的金属酸化物半導体(cmos)トランジスタを用いたスピン量子ビットの多くの提案がある。しかし、以前に提案されたスピン量子ビットは、少数の量子ビットを制御するために多くのワイヤを必要とする。これにより、量子ビットをチップに組み込む際に重大な「ワイヤの接合」問題が発生する。ここでは、複雑な配線を減らすため、スピン量子ビットがフィンフィールド効果トランジスタ(FinFET)デバイスに埋め込まれ、スピン量子ビットがフィンFETの共通ゲート電極を共有することを理論的に検討する。クォービット間の相互作用は、Ruderman Kittel Kasuya Yosida (RKKY) 相互作用を介してFinFETのチャネルを介して起こる。コンパクトな実装の補償は、小さな空間で高密度の電流線を必要とする。現在提案されている量子コンピュータに加えて,量子アニーリングマシンの可能性についても論じる。

The sizes of commercial transistors are of nanometer order, and there have already been many proposals of spin qubits using conventional complementary metal oxide semiconductor (CMOS) transistors. However, the previously proposed spin qubits require many wires to control a small number of qubits. This causes a significant 'jungle of wires' problem when the qubits are integrated into a chip. Herein, to reduce the complicated wiring, we theoretically consider spin qubits embedded into fin field-effect transistor (FinFET) devices such that the spin qubits share the common gate electrode of the FinFET. The interactions between qubits are mediated by the Ruderman–Kittel–Kasuya–Yosida (RKKY) interaction through the channel of the FinFET. The price of this compact implementation is the need for high-density current lines in a small space. The possibility of a quantum annealing machine is discussed in addition to the quantum computers of the current proposals.

翻訳日:2023-05-03 00:55:31 公開日:2020-12-04

# 絡み合った状態の条件的弱値によって特徴づけられる量子ゆらぎの文脈性

Contextuality of quantum fluctuations characterized by conditional weak values of entangled states ( http://arxiv.org/abs/2009.06145v2 )

ライセンス: Link先を確認

Holger F. Hofmann

(参考訳) 物理的性質の量子揺らぎは、その物理的性質に少なくとも部分的に敏感な測定値の測定統計において観測することができる。量子理論は、物理特性によって取られる値の効果的な分布は、これらの値が決定される特定の測定コンテキストに依存し、弱い値が測定コンテキストにおける量子ゆらぎのこの依存性を記述する文脈値として特定されていることを示している。ここで、古典統計と量子文脈性の関係は、量子参照と絡み合う系を考えることによって検討される。システムの量子揺らぎは、基準の正確な射影測定によって制御することができ、その結果、基準の測定によって決定される有効な状態準備コンテキストに応じて、量子揺らぎのコンテキスト値が異なる。その結果、混合状態統計は幅広い潜在的な文脈と一致しており、状況の正確な定義には、状態準備と測定の両方において最大量子コヒーレンスが必要であることが示唆された。

The quantum fluctuations of a physical property can be observed in the measurement statistics of any measurement that is at least partially sensitive to that physical property. Quantum theory indicates that the effective distribution of values taken by the physical property depends on the specific measurement context based on which these values are determined, and weak values have been identified as the contextual values describing this dependence of quantum fluctuations on the measurement context. Here, the relation between classical statistics and quantum contextuality is explored by considering systems entangled with a quantum reference. The quantum fluctuations of the system can then be steered by precise projective measurements of the reference, resulting in different contextual values of the quantum fluctuations depending on the effective state preparation context determined by the measurement of the reference. The results show that mixed state statistics are consistent with a wide range of potential contexts, indicating that the precise definition of a context requires maximal quantum coherence in both state preparation and measurement.

翻訳日:2023-05-02 06:43:15 公開日:2020-12-04

# エフィモフ様状態と合成双曲曲面上の量子ファンネリング効果

Efimov-like states and quantum funneling effects on synthetic hyperbolic surfaces ( http://arxiv.org/abs/2010.05135v2 )

ライセンス: Link先を確認

Ren Zhang, Chenwei Lv, Yangqian Yan, and Qi Zhou

(参考訳) 調整されたサイト間トンネリングとオンサイトエネルギーを持つ格子モデルを設計することで、高度に調整可能な局所曲率を持つ本質的に任意のリーマン面を合成できる。ここでは、平坦な平面内の格子によって生成される離散的な合成ポアンカレ半平面とポアンカレ円板が、任意のゼロでない固有エネルギーに対して無限に縮退した固有状態を持つことを指摘する。このようなEfimov様状態は離散的スケーリング対称性を示し、双曲面を用いて量子異常を研究するための前例のない手段となることを示唆する。さらに、すべての固有状態は双曲座標において指数関数的に局在し、エルミート系における量子ファンネリング効果の最初の例を示す。このため、任意の初期波束はポアンカレ半平面の端、あるいはポアンカレ円板上のそれに相当する場所に向かって移動し、光と原子を2次元で収集する効率的なスキームを提供する。我々の発見は双曲空間の興味深い性質を明らかにし、Efimov状態が余剰次元を持つ曲がった空間からの射影と見なせることを示唆している。

Engineering lattice models with tailored inter-site tunnelings and onsite energies could synthesize essentially arbitrary Riemannian surfaces with highly tunable local curvatures. Here, we point out that discrete synthetic Poincaré half-planes and Poincaré disks, which are created by lattices in flat planes, support infinitely degenerate eigenstates for any nonzero eigenenergies. Such Efimov-like states exhibit a discrete scaling symmetry and imply an unprecedented apparatus for studying quantum anomaly using hyperbolic surfaces. Furthermore, all eigenstates are exponentially localized in the hyperbolic coordinates, signifying the first example of quantum funneling effects in Hermitian systems. As such, any initial wave packet travels towards the edge of the Poincaré half-plane or its equivalent on the Poincaré disk, delivering an efficient scheme to harvest light and atoms in two dimensions. Our findings unfold the intriguing properties of hyperbolic spaces and suggest that Efimov states may be regarded as a projection from a curved space with an extra dimension.

翻訳日:2023-04-29 11:13:51 公開日:2020-12-04

# AI倫理における理想理論

Ideal theory in AI ethics ( http://arxiv.org/abs/2011.02279v2 )

ライセンス: Link先を確認

Daniel Estrada

(参考訳) 本稿では、Mills (2005) が論じ、最近 Fazelpour & Lipton (2020) がAI倫理に適用した意味での理想理論のイデオロギーに基づいて、AI倫理研究がどのように営まれているかを論じる。AI倫理研究者を理想的な理論化に引きつける構造的・方法論的条件と、このアプローチが我々の研究コミュニティの質と未来にもたらす帰結を取り上げる。最後に、AI倫理における非理想的な未来の可能性について議論する。

This paper addresses the ways AI ethics research operates on an ideology of ideal theory, in the sense discussed by Mills (2005) and recently applied to AI ethics by Fazelpour & Lipton (2020). I address the structural and methodological conditions that attract AI ethics researchers to ideal theorizing, and the consequences this approach has for the quality and future of our research community. Finally, I discuss the possibilities for a nonideal future in AI ethics.

翻訳日:2023-04-26 05:41:23 公開日:2020-12-04

# オッド次元における局所的トポロジカルマーカー

Local Topological Markers in Odd Dimensions ( http://arxiv.org/abs/2011.04771v2 )

ライセンス: Link先を確認

Joseph Sykes and Ryan Barnett

(参考訳) 局所的トポロジカルマーカーは、トポロジカルに非自明なバンドを持つシステムを調べるための貴重なツールであることが証明されている。局所的な性質のため、そのようなマーカーは翻訳的不変系や空間的不均一系を等しい足場で扱うことができる。このうち最も一般的なものはチャーンマーカーと呼ばれるもので、2次元のシステムで利用できる。本稿では,このマーカーを 1d と 3d の系に一般化する方法について述べるとともに,関連する式が 1d と 3d のチャーン数によって与えられる位相ポンピング現象を正確に記述していることを示す。一般導出に加えて、モデルハミルトニアンを数値的に考慮してマーカーを検証する。これらの結果は、奇数次元系の位相ポンピングおよび位相相転移に対する障害の影響を含む将来の研究の扉を開く。

Local topological markers have proven to be a valuable tool for investigating systems with topologically non-trivial bands. Due to their local nature, such markers can treat translationally invariant systems and spatially inhomogeneous systems on an equal footing. Among the most prevalent of these is the so-called Chern marker, which is available for systems in two spatial dimensions. In this paper, we describe how to generalize this marker to 1d and 3d systems, by showing that the relevant expressions accurately describe the phenomenon of topological pumping given by the first and second Chern numbers in 1d and 3d respectively. In addition to providing general derivations, we verify the markers by numerically considering model Hamiltonians. These results will open the door for future studies including the influence of disorder on topological pumping and topological phase transitions in odd-dimensional systems.

翻訳日:2023-04-24 21:09:49 公開日:2020-12-04

# 非平衡量子自由エネルギーと有効温度、生成汎関数と影響作用

Nonequilibrium Quantum Free Energy and Effective Temperature, Generating Functional and Influence Action ( http://arxiv.org/abs/2011.10468v2 )

ライセンス: Link先を確認

Jen-Tsung Hsiang and B. L. Hu

(参考訳) 非平衡自由エネルギー $\mathcal{F}_{\textsc{s}}$ の定義は、熱浴に強く結合された動的ガウス量子開系に対して提案され、粗視化された有効作用と影響作用の観点から生成汎関数を用いて形式的な導出が与えられる。ここで研究された量子ブラウン運動モデルによって例示されるガウス的開量子系に対しては、時間変化する有効温度を自然な方法で導入することができ、それに伴い縮約系($S$)の非平衡自由エネルギー $\mathcal{F}_{\textsc{s}}$、フォン・ノイマンエントロピー $\mathcal{S}_{vN}$、内部エネルギー $\mathcal{U}_{\textsc{s}}$ が定義される。浴温度を参照する文献に見られる非平衡自由エネルギーとは対照的に、ここで見いだされる非平衡熱力学関数は、系の完全な非平衡進化の歴史におけるあらゆる時刻で、慣れ親しんだ関係 $\mathcal{F}_{\textsc{s}}(t)=\mathcal{U}_{\textsc{s}}(t)-T_{\textsc{eff}}(t)\,\mathcal{S}_{vN}(t)$ に従う。系が平衡化した後、それらは弱結合極限において、従来の平衡熱力学における対応物と一致する。有効温度は、系の状態と浴との相互作用の両方を捉えるので、系の平衡化に伴い、最初の浴温度よりもわずかに高い値に近づく。特に、ゼロ温度浴でもゼロにはならず、系と浴の間のエンタングルメントの存在を示している。当然のことながら、高い浴温度かつ超弱結合の下では、浴温度と区別がつかなくなる。ここで発見された動的ガウス量子系の非平衡熱力学関数と関係は、非平衡量子熱力学の有意義な理論を確立するための有用な道筋を開くはずである。

A definition of nonequilibrium free energy $\mathcal{F}_{\textsc{s}}$ is proposed for dynamical Gaussian quantum open systems strongly coupled to a heat bath and a formal derivation is provided by way of the generating functional in terms of the coarse-grained effective action and the influence action. For Gaussian open quantum systems exemplified by the quantum Brownian motion model studied here, a time-varying effective temperature can be introduced in a natural way, and with it, the nonequilibrium free energy $\mathcal{F}_{\textsc{s}}$, von Neumann entropy $\mathcal{S}_{vN}$ and internal energy $\mathcal{U}_{\textsc{s}}$ of the reduced system ($S$) can be defined accordingly. In contrast to the nonequilibrium free energy found in the literature which references the bath temperature, the nonequilibrium thermodynamic functions we find here obey the familiar relation $\mathcal{F}_{\textsc{s}}(t)=\mathcal{U}_{\textsc{s}}(t)- T_{\textsc{eff}} (t)\,\mathcal{S}_{vN}(t)$ at any and all moments of time in the system's fully nonequilibrium evolution history. After the system equilibrates they coincide, in the weak coupling limit, with their counterparts in conventional equilibrium thermodynamics. Since the effective temperature captures both the state of the system and its interaction with the bath, upon the system's equilibration, it approaches a value slightly higher than the initial bath temperature. Notably, it remains nonzero for a zero-temperature bath, signaling the existence of system-bath entanglement. Reasonably, at high bath temperatures and under ultra-weak couplings, it becomes indistinguishable from the bath temperature. The nonequilibrium thermodynamic functions and relations discovered here for dynamical Gaussian quantum systems should open up useful pathways toward establishing meaningful theories of nonequilibrium quantum thermodynamics.

翻訳日:2023-04-23 14:56:25 公開日:2020-12-04

# 新型コロナウイルスの接触追跡とプライバシー:世論の縦断的研究

COVID-19 Contact Tracing and Privacy: A Longitudinal Study of Public Opinion ( http://arxiv.org/abs/2012.01553v2 )

ライセンス: Link先を確認

Lucy Simko, Jack Lucas Chang, Maggie Jiang, Ryan Calo, Franziska Roesner, Tadayoshi Kohno

(参考訳) テクノロジーを活用した接触追跡、すなわち感染者の最近の接触者全員に通知することで、新型コロナウイルス感染症(COVID-19)に感染した可能性のある人を特定するプロセスの利用が拡大している。政府、テクノロジー企業、研究グループは、スマートフォンアプリのリリース、IoTデバイスの利用、ウェアラブル技術の配布に取り組み、「濃厚接触」を自動的に追跡して、ある個人が陽性と判定された場合に過去の接触者を特定しようとしている。しかし、効果的な技術ベースの接触追跡と個人のプライバシーとの緊張関係について、大きな議論が交わされている。この議論に資するため、接触追跡とプライバシーに焦点をあてた、各回100人の参加者による7ヶ月間のオンライン調査の結果を報告する。最初の調査は、米国でのウイルスの最初のピーク前の4月1日と3日に行い、その後10週間(6月まで)は毎週、11月までは隔週で調査を継続し、接触追跡と新型コロナウイルスに関するその時々の議論を反映した時事的な質問を加えた。我々の結果は世論の多様性を示しており、政策立案者、技術者、研究者、公衆衛生専門家に対して、潜在的なプライバシー上の懸念を考慮しつつ、新型コロナウイルスの感染拡大を抑えるためにテクノロジーを活用すべきかどうか、またどのように活用するかについての情報を提供しうる。我々は縦断的な測定を継続しており、このレポートを随時更新する。このバージョンのレポートを引用する場合は、Report Version 2.0(2020年12月4日)を参照されたい。

There is growing use of technology-enabled contact tracing, the process of identifying potentially infected COVID-19 patients by notifying all recent contacts of an infected person. Governments, technology companies, and research groups alike have been working towards releasing smartphone apps, using IoT devices, and distributing wearable technology to automatically track "close contacts" and identify prior contacts in the event an individual tests positive. However, there has been significant public discussion about the tensions between effective technology-based contact tracing and the privacy of individuals. To inform this discussion, we present the results of seven months of online surveys focused on contact tracing and privacy, each with 100 participants. Our first surveys were on April 1 and 3, before the first peak of the virus in the US, and we continued to conduct the surveys weekly for 10 weeks (through June), and then fortnightly through November, adding topical questions to reflect current discussions about contact tracing and COVID-19. Our results present the diversity of public opinion and can inform policy makers, technologists, researchers, and public health experts on whether and how to leverage technology to reduce the spread of COVID-19, while considering potential privacy concerns. We are continuing to conduct longitudinal measurements and will update this report over time; citations to this version of the report should reference Report Version 2.0, December 4, 2020.

翻訳日:2023-04-22 07:39:53 公開日:2020-12-04

# 低量子ビット数領域における変分固有解のための最適量子サンプリング回帰アルゴリズム

An optimal quantum sampling regression algorithm for variational eigensolving in the low qubit number regime ( http://arxiv.org/abs/2012.02338v1 )

ライセンス: Link先を確認

Pedro Rivero, Ian C. Cloët, Zack Sullivan

(参考訳) VQEアルゴリズムは、現在の量子プロセッサ(すなわちクラウド上)へのアクセス方法を考えると、非常に高価であることが判明した。この問題を軽減するために,代替ハイブリッド量子古典アルゴリズムである量子サンプリング回帰 (qsr) を導入し,低キュービット数領域における時間複雑性に基づくいくつかのユースケースを分析した。いくつかの古典的資源と引き換えに、この新しい戦略は量子プロセッサに必要なサンプルの数で最適であることが証明されている。我々は、このアルゴリズムがVQEよりも効率的であるかどうかを評価するための単純な解析モデルを構築し、同じ理論的考察から、量子的優位が生じる閾値を確立する。最後に,ベンチマーク問題に対するアルゴリズムの有効性を示す。

The VQE algorithm has turned out to be quite expensive to run given the way we currently access quantum processors (i.e. over the cloud). In order to alleviate this issue, we introduce Quantum Sampling Regression (QSR), an alternative hybrid quantum-classical algorithm, and analyze some of its use cases based on time complexity in the low qubit number regime. In exchange for some extra classical resources, this novel strategy is proved to be optimal in terms of the number of samples it requires from the quantum processor. We develop a simple analytical model to evaluate when this algorithm is more efficient than VQE, and, from the same theoretical considerations, establish a threshold above which quantum advantage can occur. Finally, we demonstrate the efficacy of our algorithm for a benchmark problem.
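The abstract does not describe how the sampling and regression steps work internally, so the following is only a toy sketch of one way a sample-then-fit strategy can replace an iterative optimization loop. It assumes (this is an assumption, not a statement from the paper) that the energy expectation value depends on a single parameter as a first-order trigonometric polynomial, $f(\theta) = a_0 + a_1\cos\theta + b_1\sin\theta$. In that case three equally spaced samples determine $f$ completely, and the minimum can then be found classically. `toy_expectation` is a hypothetical stand-in for a call to a quantum processor.

```python
import math


def toy_expectation(theta):
    """Hypothetical stand-in for an energy measured on a quantum device."""
    return 0.3 - 1.2 * math.cos(theta) + 0.5 * math.sin(theta)


def fit_single_harmonic(samples):
    """Recover (a0, a1, b1) of f(t) = a0 + a1*cos(t) + b1*sin(t).

    `samples` must be f evaluated at t_k = 2*pi*k/n for k = 0..n-1, n >= 3.
    Uses the discrete orthogonality of sines and cosines on that grid.
    """
    n = len(samples)
    if n < 3:
        raise ValueError("need at least 3 equally spaced samples")
    angles = [2 * math.pi * k / n for k in range(n)]
    a0 = sum(samples) / n
    a1 = 2 * sum(f * math.cos(t) for f, t in zip(samples, angles)) / n
    b1 = 2 * sum(f * math.sin(t) for f, t in zip(samples, angles)) / n
    return a0, a1, b1


def minimum_of_fit(a0, a1, b1):
    """Return (theta_min, f_min) of the fitted function on [0, 2*pi)."""
    amplitude = math.hypot(a1, b1)
    # f(t) = a0 + amplitude*cos(t - phi) with phi = atan2(b1, a1);
    # the minimum sits where cos(t - phi) = -1.
    theta_min = (math.atan2(b1, a1) + math.pi) % (2 * math.pi)
    return theta_min, a0 - amplitude


n_samples = 3
samples = [toy_expectation(2 * math.pi * k / n_samples) for k in range(n_samples)]
a0, a1, b1 = fit_single_harmonic(samples)
theta_min, energy_min = minimum_of_fit(a0, a1, b1)
print(theta_min, energy_min)
```

The point of the sketch is the cost model: the number of device evaluations is fixed in advance by the number of unknown coefficients, and all remaining work is classical.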

翻訳日:2023-04-22 03:18:10 公開日:2020-12-04

# 数保存演算による位相推定の改善

Improving phase estimation using the number-conserving operations ( http://arxiv.org/abs/2012.02441v1 )

ライセンス: Link先を確認

Huan Zhang, Wei Ye, Chaoping Wei, Cunjin Liu, Zeyang Liao, Liyun Hu

(参考訳) 本稿では、2モードスクイーズド真空(TMSV)状態に、数保存型の一般化積重ね合わせ(GSP)演算 $(s a a^{\dagger} + t a^{\dagger} a)^{m}$($s^2+t^2=1$)を適用して生成する非古典的な入力状態を用いて、マッハ・ツェンダー干渉計におけるパリティ検出による位相測定の分解能と精度を向上させる理論的手法を提案する。提案したGSP-TMSVの非古典的特性は、平均光子数(APN)、アンチバンチング効果、および2モードスクイージングの度合いによって調べられる。特に、より高次 $m$ のGSP演算とより小さいパラメータ $s$ の両方が全APNを増大させ、量子フィッシャー情報の改善につながることを示す。さらに、光子損失がある場合とない場合の位相測定精度を、従来の光子減算・加算法と比較する。提案手法は、特に $s=0$ の場合、光子損失が存在する場合においても、位相分解能と感度の向上により、従来法と比べて最高の性能を示すことが判明した。興味深いことに、損失がない場合、我々のスキームでは標準量子ノイズ限界(SQL)を常に超えることができ、全APNが小さいとき $s=0.5,1$ でハイゼンベルク限界(HL)にさえ到達できる。しかし、光子損失がある場合、HLを破ることはできないが、SQLは特に全APNが大きい領域で依然として克服できる。この結果は量子計測において重要な応用を見出すことができる。

We propose a theoretical scheme to improve the resolution and precision of phase measurement with parity detection in the Mach-Zehnder interferometer by using a nonclassical input state which is generated by applying a number-conserving generalized superposition of products (GSP) operation, $(s a a^{\dagger} + t a^{\dagger} a)^{m}$ with $s^2+t^2=1$, on a two-mode squeezed vacuum (TMSV) state. The nonclassical properties of the proposed GSP-TMSV are investigated via average photon number (APN), anti-bunching effect, and degrees of two-mode squeezing. Particularly, our results show that both a higher-order $m$ GSP operation and a smaller parameter $s$ can increase the total APN, which leads to the improvement of quantum Fisher information. In addition, we also compare the phase measurement precision with and without photon losses between our scheme and the previous photon subtraction/addition schemes. It is found that our scheme, especially for the case of $s=0$, has the best performance via the enhanced phase resolution and sensitivity compared to those previous schemes, even in the presence of photon losses. Interestingly, without losses, the standard quantum-noise limit (SQL) can always be surpassed in our scheme and the Heisenberg limit (HL) can even be achieved when $s=0.5,1$ with small total APNs. However, in the presence of photon losses, the HL cannot be beaten, but the SQL can still be overcome, particularly in the large total APN regimes. Our results here can find important applications in quantum metrology.
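For orientation, the two benchmarks named in the abstract are conventionally written in terms of the total average photon number $N$ as $\Delta\phi_{\mathrm{SQL}} = 1/\sqrt{N}$ and $\Delta\phi_{\mathrm{HL}} = 1/N$. The short sketch below only tabulates these textbook bounds; it does not reproduce the paper's sensitivity curves, and the function names and photon numbers are illustrative assumptions.

```python
import math


def standard_quantum_limit(mean_photons):
    """Phase uncertainty 1/sqrt(N) for total average photon number N > 0."""
    if mean_photons <= 0:
        raise ValueError("mean photon number must be positive")
    return 1.0 / math.sqrt(mean_photons)


def heisenberg_limit(mean_photons):
    """Phase uncertainty 1/N for total average photon number N > 0."""
    if mean_photons <= 0:
        raise ValueError("mean photon number must be positive")
    return 1.0 / mean_photons


# Illustrative photon numbers, not values taken from the paper.
for n in (1, 4, 16, 100):
    sql = standard_quantum_limit(n)
    hl = heisenberg_limit(n)
    print(f"N={n:4d}  SQL={sql:.4f}  HL={hl:.4f}  ratio={sql / hl:.2f}")
```

A scheme "surpasses the SQL" when its phase uncertainty falls below the first column at the same $N$, and "reaches the HL" when it touches the second; the gap between the two grows as $\sqrt{N}$.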

翻訳日:2023-04-22 03:14:23 公開日:2020-12-04

# 2ビット系における集団振幅減衰のための量子回路

Quantum Circuits for Collective Amplitude Damping in Two-Qubit Systems ( http://arxiv.org/abs/2012.02410v1 )

ライセンス: Link先を確認

Yusuke Hama

(参考訳) 量子コンピュータは今や我々の社会に登場し、科学と工学の研究に利用されている。現在、約50量子ビットの中間サイズのコンピュータとして構築されており、ノイズ効果に弱い。したがって、ノイズ・中間スケール量子デバイスと呼ばれる。これらの機械を用いて効率的な量子計算を実現するために、個々の量子ノイズと集合量子ノイズのコヒーレントな制御が鍵となる。本研究では、後者のタイプに着目し、量子回路として表される集合量子ノイズの定式化について検討する。議論の簡略化と具体化のために, 2量子系における集団振幅減衰過程を解析した。我々の形式と量子回路の検証として,6つの異なる初期条件を検証し,量子シミュレーションにおける全体の演算の実行数を変化させることで,集団振幅減衰のディジタル量子シミュレーションを実証する。その結果, 2量子ビット系に対する量子マスター方程式の解との数値マッチングが良好であることが確認された。さらに,より大きな量子ビット系における振幅減衰の集団減衰を解析するために,形式性を拡張する方法の本質について述べる。これらの結果は、量子ノイズを制御し、大規模量子コンピュータを設計するための体系的なアプローチを確立するための道を開いた。

Quantum computers have now appeared in our society and are utilized for the investigation of science and engineering. At present, they have been built as intermediate-size computers containing about fifty qubits and are weak against noise effects. Hence, they are called noisy intermediate-scale quantum devices. In order to accomplish efficient quantum computation using these machines, a key issue is going to be the coherent control of individual and collective quantum noises. In this work, we focus on the latter type and investigate formulations of the collective quantum noises represented as quantum circuits. To simplify our discussions and make them concrete, we analyze collective amplitude damping processes in two-qubit systems. To verify our formalisms and the quantum circuits, we demonstrate digital quantum simulations of the collective amplitude damping by examining six different initial conditions while varying the number of executions of an overall operation in our quantum simulations. We observe that our results show good numerical agreement with the solution of the quantum master equation for the two-qubit systems as we increase this number. In addition, we explain the essence of the way to extend our formalisms to analyze the collective amplitude damping in larger qubit systems. These results pave the way for establishing systematic approaches to control the quantum noises and designing large-scale quantum computers.
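The convergence behaviour described above (better agreement with the master equation as the number of repeated operations grows) can be illustrated with a purely classical toy model. The sketch below is not the authors' circuit. It assumes the standard collective-decay master equation $\dot\rho = \gamma\,(L\rho L^\dagger - \tfrac12\{L^\dagger L,\rho\})$ with $L = \sigma_1^- + \sigma_2^-$, under which the populations of $|ee\rangle$, the triplet $|T\rangle = (|eg\rangle+|ge\rangle)/\sqrt2$, $|gg\rangle$ and the singlet $|S\rangle$ obey simple rate equations. It then applies a small Euler step repeatedly. All names and the rate convention are assumptions made for illustration.

```python
import math


def stepped_populations(p_ee, p_t, p_gg, p_s, gamma, total_time, steps):
    """Repeat a small decay step `steps` times (explicit Euler).

    Rate equations assumed: |ee> -> |T> and |T> -> |gg>, each at rate
    2*gamma; the singlet |S> is dark and does not decay.
    Requires 2*gamma*total_time/steps < 1 for a sensible step.
    """
    dt = total_time / steps
    for _ in range(steps):
        flow_ee = 2 * gamma * p_ee * dt   # population leaving |ee>
        flow_t = 2 * gamma * p_t * dt     # population leaving |T>
        p_ee -= flow_ee
        p_t += flow_ee - flow_t
        p_gg += flow_t
    return p_ee, p_t, p_gg, p_s


def exact_from_doubly_excited(gamma, t):
    """Closed-form populations for the initial state |ee>."""
    p_ee = math.exp(-2 * gamma * t)
    p_t = 2 * gamma * t * math.exp(-2 * gamma * t)
    return p_ee, p_t, 1.0 - p_ee - p_t, 0.0


def max_error(steps, gamma=1.0, total_time=1.0):
    approx = stepped_populations(1.0, 0.0, 0.0, 0.0, gamma, total_time, steps)
    exact = exact_from_doubly_excited(gamma, total_time)
    return max(abs(a - e) for a, e in zip(approx, exact))


coarse_error = max_error(10)
fine_error = max_error(1000)
print(coarse_error, fine_error)
```

Increasing the number of steps shrinks the deviation from the closed-form solution, which is the same qualitative trend the abstract reports for its digital quantum simulations.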

翻訳日:2023-04-22 03:13:25 公開日:2020-12-04

# 偏りのあるプログラマ? あるいはバイアスデータ? AI倫理の操作に関するフィールド実験

Biased Programmers? Or Biased Data? A Field Experiment in Operationalizing AI Ethics ( http://arxiv.org/abs/2012.02394v1 )

ライセンス: Link先を確認

Bo Cowgill, Fabrizio Dell'Acqua, Samuel Deng, Daniel Hsu, Nakul Verma and Augustin Chaintreau

(参考訳) なぜバイアスのある予測が生じるのか? どのような介入でそれを防げるのか? 我々は、約400人のAIエンジニアによる、数学の成績に関する820万件のアルゴリズム予測を評価した。各エンジニアは、ランダムに割り当てられた実験条件の下でアルゴリズムを開発した。我々の処置群では、プログラマのインセンティブ、トレーニングデータ、意識、および/またはAI倫理に関する技術的知識を変更した。次に、アルゴリズム入力に対するランダム化された監査操作と、2万人の被験者の正解の数学成績を用いて、アルゴリズムのサンプル外予測を評価する。偏りのある予測は、主に偏りのあるトレーニングデータによって引き起こされることがわかった。しかし、より良いトレーニングデータの利点の3分の1は、新しい経済メカニズムを通じてもたらされる。すなわち、より良いトレーニングデータを与えられると、エンジニアはより多くの努力を払い、インセンティブにより敏感に反応する。また、プログラマの人口統計学的特性や、性別とキャリアに関する潜在的バイアスの心理テスト(IAT)の成績によって、パフォーマンスがどのように変化するかも評価する。女性、マイノリティ、低IATのエンジニアのコードでバイアスや差別が少ないという証拠は見つからなかった。しかし、予測誤差は人口統計グループ内で相関していることがわかり、これは人口統計グループをまたいだ平均化によって性能が向上することを意味する。最後に、技術的助言、単純なリマインダー、アルゴリズムバイアス低減のためのインセンティブ改善など、実践的な経営上・政策上の介入の利点とトレードオフを定量化する。

Why do biased predictions arise? What interventions can prevent them? We evaluate 8.2 million algorithmic predictions of math performance from $\approx$400 AI engineers, each of whom developed an algorithm under a randomly assigned experimental condition. Our treatment arms modified programmers' incentives, training data, awareness, and/or technical knowledge of AI ethics. We then assess out-of-sample predictions from their algorithms using randomized audit manipulations of algorithm inputs and ground-truth math performance for 20K subjects. We find that biased predictions are mostly caused by biased training data. However, one-third of the benefit of better training data comes through a novel economic mechanism: Engineers exert greater effort and are more responsive to incentives when given better training data. We also assess how performance varies with programmers' demographic characteristics, and their performance on a psychological test of implicit bias (IAT) concerning gender and careers. We find no evidence that female, minority and low-IAT engineers exhibit lower bias or discrimination in their code. However, we do find that prediction errors are correlated within demographic groups, which creates performance improvements through cross-demographic averaging. Finally, we quantify the benefits and tradeoffs of practical managerial or policy interventions such as technical advice, simple reminders, and improved incentives for decreasing algorithmic bias.

翻訳日:2023-04-22 03:13:07 公開日:2020-12-04

# アルゴリズム的公正行動主義の経営的効果

The Managerial Effects of Algorithmic Fairness Activism ( http://arxiv.org/abs/2012.02393v1 )

ライセンス: Link先を確認

Bo Cowgill, Fabrizio Dell'Acqua and Sandra Matz

(参考訳) 倫理的議論はビジネスにおけるAIの採用にどのように影響するのか? 我々は、AIフェアネスアクティビズムで使われる議論を、ビジネス上の意思決定者にランダムに提示する。アルゴリズムバイアスの不可避性を強調する議論は、マネージャにAIを放棄させて人間による手動レビューを選ばせ、訴訟やネガティブなPRに対する予想をより大きく報告させる。これらの効果は、AIが性別や人種の格差を縮小する場合や、AIフェアネスに対処するためのエンジニアリング投資が実現可能な場合でも持続する。現状との比較を強調すると、逆の効果が得られる。また、AI倫理の議論における「科学的な見せかけ(scientific veneer)」の効果も測定する。科学的な見せかけは経営者の行動を変えるが、好意的な(批判的なものに対して)AIアクティビズムを非対称に利するわけではない。

How do ethical arguments affect AI adoption in business? We randomly expose business decision-makers to arguments used in AI fairness activism. Arguments emphasizing the inescapability of algorithmic bias lead managers to abandon AI for manual review by humans and report greater expectations about lawsuits and negative PR. These effects persist even when AI lowers gender and racial disparities and when engineering investments to address AI fairness are feasible. Emphasis on status quo comparisons yields opposite effects. We also measure the effects of "scientific veneer" in AI ethics arguments. Scientific veneer changes managerial behavior but does not asymmetrically benefit favorable (versus critical) AI activism.

翻訳日:2023-04-22 03:12:44 公開日:2020-12-04

# 2重ボース-ハバード鎖における$\mathbb{Z}_2$相とMajorana分光

$\mathbb{Z}_2$ phases and Majorana spectroscopy in paired Bose-Hubbard chains ( http://arxiv.org/abs/2012.02380v1 )

ライセンス: Link先を確認

Smitha Vishveshwara and David M. Weld

(参考訳) 最寄り-近距離対存在下でボース-ハバード鎖を調べる。ペアリング項は、数ゆらぎを持つが外対角長距離順序を持たない、異常なギャップを持つ$\mathbb{Z}_2$イジング相をもたらす。この相は、ギャップ保護されたマクロキュービットである強相関多体二重縮退基底状態を有する。強い相互作用の極限において、系は異方性逆スピン鎖に写像され、この系は、ゼロエネルギーマヨラナ境界状態を持つボース=ハッバード鎖のよく知られたフェルミオン姉妹に写像される。フェルミオン系とボソニック系の対応する位相は極めて異なる波動関数を持つが、同じエネルギースペクトルを共有している。貯留層誘起対を持つバイアスドジグザグ光学格子におけるボース・ハバード模型の低温原子化の可能性について述べ、実験的キタエフ連鎖分光法への道を開く。

We investigate the Bose-Hubbard chain in the presence of nearest-neighbor pairing. The pairing term gives rise to an unusual gapped $\mathbb{Z}_2$ Ising phase that has number fluctuation but no off-diagonal long range order. This phase has a strongly correlated many-body doubly degenerate ground state which is effectively a gap-protected macroscopic qubit. In the strongly interacting limit, the system can be mapped onto an anisotropic transverse spin chain, which in turn can be mapped to the better-known fermionic sister of the paired Bose-Hubbard chain: the Kitaev chain which hosts zero-energy Majorana bound states. While corresponding phases in the fermionic and bosonic systems have starkly different wavefunctions, they share identical energy spectra. We describe a possible cold-atom realization of the paired Bose-Hubbard model in a biased zig-zag optical lattice with reservoir-induced pairing, opening a possible route towards experimental Kitaev chain spectroscopy.

翻訳日:2023-04-22 03:12:33 公開日:2020-12-04

# 量子チャネルの絡み合いのワンショット操作

One-shot manipulation of entanglement for quantum channels ( http://arxiv.org/abs/2012.02631v1 )

ライセンス: Link先を確認

Ho-Joon Kim, Soojoon Lee, Ludovico Lami, Martin B. Plenio

(参考訳) 量子エンタングルメントの動的資源理論は超チャネル理論を用いて定式化できることを示す。この定式化において,チャネル分離性を自由資源として保持する分離チャネルと自由スーパーチャネルのクラスを同定し,スワップチャネルを動的絡み合い黄金単位として選択する。最初の結果は、自由スーパーチャネルの下の2成分量子チャネルの1ショットの動的絡み合いコストが、チャネルの標準的なログロバストネスによって制限されることである。自由超チャネルの下でのバイパルタイト量子チャネルの1ショット蒸留可能な動的絡み合いは、分離可能なチャネルを最小化したチャネルの仮説テスト相対エントロピーから構築した資源単調によって境界づけられる。また、2部量子チャネルの1ショット触媒的動的絡み合わせコストを、漸近的に無視できない動的絡み合わせを生じさせるような、より大規模な自由超チャネルのクラスの下で解決する。

We show that the dynamic resource theory of quantum entanglement can be formulated using the superchannel theory. In this formulation, we identify the separable channels and the class of free superchannels that preserve channel separability as free resources, and choose the swap channels as dynamic entanglement golden units. Our first result is that the one-shot dynamic entanglement cost of a bipartite quantum channel under the free superchannels is bounded by the standard log-robustness of channels. The one-shot distillable dynamic entanglement of a bipartite quantum channel under the free superchannels is found to be bounded by a resource monotone that we construct from the hypothesis-testing relative entropy of channels with minimization over separable channels. We also address the one-shot catalytic dynamic entanglement cost of a bipartite quantum channel under a larger class of free superchannels that could generate the dynamic entanglement which is asymptotically negligible; it is bounded by the generalized log-robustness of channels.

翻訳日:2023-04-22 03:05:18 公開日:2020-12-04

# 原子間相互作用研究のための新しい一般化モース様ポテンシャル

New Generalized Morse-Like Potential for Studying the Atomic Interaction in Diatomic Molecules ( http://arxiv.org/abs/2012.02581v1 )

ライセンス: Link先を確認

C. M. Ekpo, Ephraim P. Inyang, P. O. Okoi, T. O. Magu, E. P. Agbo, K O Okorie, Etido P. Inyang

(参考訳) 本研究では、任意の次元における新しい一般化モース型ポテンシャルに対する動径シュレーディンガー方程式の近似解析解を、Nikiforov–Uvarov法を用いて求める。エネルギー固有値と対応する固有関数は解析的に得られる。いくつかの二原子分子の回転振動エネルギー固有値を、分光パラメータを用いて計算する。Attractive RadialポテンシャルやDeng-Fanポテンシャルなどのポテンシャルに対するエネルギー方程式も、ポテンシャルパラメータを変化させることで得られる。我々の結果は既存の文献と非常によく一致している。

In this study, we obtain approximate analytical solutions of the radial Schrödinger equation for the New Generalized Morse-Like Potential in arbitrary dimensions by using the Nikiforov–Uvarov method. Energy eigenvalues and the corresponding eigenfunctions are obtained analytically. The rotational-vibrational energy eigenvalues for some diatomic molecules are computed with the aid of some spectroscopic parameters. The energy equations for some potentials, such as the Attractive Radial and Deng-Fan potentials, have also been obtained by varying some potential parameters. Our results agree excellently with the existing literature.

翻訳日:2023-04-22 03:04:29 公開日:2020-12-04

# 積層導波路支援アレー検出チャネルを用いた磁力計統合プラットフォーム

An integrated magnetometry platform with stackable waveguide-assisted detection channels for sensing arrays ( http://arxiv.org/abs/2012.02560v1 )

ライセンス: Link先を確認

Michael Hoese, Michael K. Koch, Vibhav Bharadwaj, Johannes Lang, John P. Hadden, Reina Yoshizaki, Argyro N. Giakoumaki, Roberta Ramponi, Fedor Jelezko, Shane M. Eaton, Alexander Kubanek

(参考訳) ダイヤモンドの負電荷NV$^-$中心はナノスケール、高感度磁気メトリーにおいて大きな成功を収めた。感度向上には効率的な蛍光検出が不可欠である。さらに、統合デバイスは実用的なセンサーを可能にする。ここでは, ダイヤモンド表面下数ナノメートルのnv$^-$中心を生成できる新しいアーキテクチャを提案するとともに, フェムト秒レーザーによるtype-ii導波路のモード場最大値についても述べる。結合効率を実験的に検証し、導波路を介して磁気共鳴信号の検出を示し、磁場および温度センシングにおける第一原理実証実験を行う。センシングタスクは、試料を通して直接光を照射することなく導波路を介して操作することができ、これは光に弱い生体システムにおける磁気測定の重要なステップである。将来的には,空間的および時間的相関のある磁気計測を容易にする2次元センシングアレイの開発が期待できる。

The negatively-charged NV$^-$-center in diamond has shown great success in nanoscale, high-sensitivity magnetometry. Efficient fluorescence detection is crucial for improving the sensitivity. Furthermore, integrated devices enable practicable sensors. Here, we present a novel architecture which allows us to create NV$^-$-centers a few nanometers below the diamond surface, and at the same time in the mode field maximum of femtosecond-laser-written type-II waveguides. We experimentally verify the coupling efficiency, showcase the detection of magnetic resonance signals through the waveguides and perform first proof-of-principle experiments in magnetic field and temperature sensing. The sensing task can be operated via the waveguide without direct light illumination through the sample, which marks an important step for magnetometry in biological systems which are fragile to light. In the future, our approach will enable the development of two-dimensional sensing arrays facilitating spatially and temporally correlated magnetometry.

翻訳日:2023-04-22 03:04:21 公開日:2020-12-04

# 超ラジアントナノレーザーの量子ランジュバンアプローチ

Quantum Langevin approach for superradiant nanolasers ( http://arxiv.org/abs/2012.02533v1 )

ライセンス: Link先を確認

Igor Protsenko, Alexander Uskov, Emil C. André, Jesper Mørk and Martijn Wubs

(参考訳) 量子非線形ランジュバン方程式を解析的に解く新しい手法を提案し、集団効果が重要な役割を果たす超ラジアントレーザーのスペクトルの計算に適用した。任意のポンプレートの発振スペクトルを計算し、閾値領域をまたいでレーザーライン幅のポンプ依存性などのよく知られた結果を回収する。我々は,大きな緩和振動を持つ超ラジアントレーザーのスペクトルにおける新しいサイドバンドピークと,弱いポンプ速度に対する発振スペクトルの新しい非線形構造を予測する。提案手法は,レーザーライン幅の狭さ,発振スペクトルの構造,コヒーレント操作への移行における人口変動の重要性に新たな光を当てるものである。

A new approach for analytically solving quantum nonlinear Langevin equations is proposed and applied to calculations of spectra of superradiant lasers where collective effects play an important role. We calculate lasing spectra for arbitrary pump rates and recover well-known results such as the pump dependence of the laser linewidth across the threshold region. We predict new sideband peaks in the spectrum of superradiant lasers with large relaxation oscillations as well as new nonlinear structures in the lasing spectra for weak pump rates. Our approach sheds new light on the importance of population fluctuations in the narrowing of the laser linewidth, in the structure of the lasing spectrum, and in the transition to coherent operation.

翻訳日:2023-04-22 03:03:59 公開日:2020-12-04

# 有限サイズの半導体マイクロワイヤにおけるエキシトン-ポラリトンソリトン

Exciton-polariton solitons in a semiconductor microwire of finite size ( http://arxiv.org/abs/2012.02477v1 )

ライセンス: Link先を確認

E. Nji Nde Aboringong, I. Ngek Ndifon and Alain M. Dikandé

(参考訳) エクシトン-ポラリトンソリトン(Exciton- Polariton Soliton)は、光と物質との相互作用により結合した励起子-光子状態からなる強い非線形準粒子である。半導体マイクロワイヤやナノワイヤのような半導体マイクロキャビティシステムでは、ポラリトンは反発的な非線形励起子-励起子相互作用と結合すると明るいポラリトンソリトンを生成する負の質量によって特徴づけられる。本研究では, 外部ポンピングによる放射損失を仮定した有限サイズの微小導波路における励起子-偏光子ソリトンダイナミクスについて検討する。ポラリトンパルスの周期列からなるモデル運動方程式に対する正確な明るいソリトン解は、ジャコビ楕円関数の項で得られる。パルス列の光子成分と励起子成分の両方のエネルギーに対応する正確な解析式を見いだした。その結果、マイクロワイヤ導波路のサイズ(すなわち長さ)は、媒体中に伝播するポラリトンソリトンによって伝達されるエネルギーの定量的な推定を得る上で重要な役割を担っていることが示唆された。

Exciton-polariton solitons are strongly nonlinear quasiparticles composed of coupled exciton-photon states due to the interaction of light with matter. In semiconductor microcavity systems such as semiconductor microwires and nanowires, polaritons are characterized by a negative mass which, when combined with the repulsive nonlinear exciton-exciton interaction, leads to the generation of bright polariton solitons. In this work we investigate the dynamics of bright exciton-polariton solitons in a finite-size microcavity waveguide, for which radiative losses are assumed to be balanced by the external pumping. An exact bright-soliton solution to the model equations of motion, consisting of a periodic train of polariton pulses, is obtained in terms of Jacobi elliptic functions. Exact analytical expressions corresponding to the energies of both the photonic and excitonic components of the pulse train are found. Results suggest that the size (i.e. the length) of a microwire waveguide plays a relevant role in obtaining a quantitative estimate of the energy that could be conveyed by polariton solitons propagating in the medium.

翻訳日:2023-04-22 03:03:39 公開日:2020-12-04

# パラメータ化された$\phi^4$モデルにおけるkink-antikink散乱誘起呼吸境界状態とオシロン

Kink-antikink scattering-induced breathing bound states and oscillons in a parametrized $\phi^4$ model ( http://arxiv.org/abs/2012.02470v1 )

ライセンス: Link先を確認

F. Naha Nzoupe, Alain M. Dikandé and C. Tchawoua

(参考訳) 近年の研究では、標準の$\phi^4$フィールドと同じクラスのスカラーフィールドモデルの形状変形性が、オシロンと呼ばれる特定の種類の呼吸境界状態の生成を制御できる重要な役割を強調している。宇宙論の文脈では、オシロンの内蔵機構は、スカラー超軽量ダークマターの標準的な画像に影響を与えることを示唆している。本研究は,古典的$\phi^4$場を漸近的極限として認める双安定系のパラメトリゼーションモデルにおいて,真空近傍のスカラー場の長寿命低振幅ほぼ調和振動の形成に着目して検討した。パラメトリズドモデルの特徴は、電位壁の急勾配のみを変化させる形状変形パラメータを持つ二重ウェルポテンシャルであり、したがってポテンシャルバリアのハンプの平坦さが2つの縮退ミニマとバリア高さに影響を与えない点にある。変形性パラメータの変動は、キンク-フォノン散乱電位のいくつかの追加振動モードを促進し、キンク-アンティキンク散乱における2バウンスウィンドウの抑制とオシロンの生成を引き起こす。数値的な結果から,2重井戸系におけるオシロン生成の主要な要因は,フラットバリアハンプを特徴とする電位障壁の非調和性であることが示唆された。

Recent studies have emphasized the important role that the shape deformability of scalar-field models belonging to the same class as the standard $\phi^4$ field can play in controlling the production of a specific type of breathing bound state known as oscillons. In the context of cosmology, the built-in mechanism of oscillons suggests that they can affect the standard picture of scalar ultra-light dark matter. In the present work, kink scatterings are investigated in a parametrized model of a bistable system admitting the classical $\phi^4$ field as an asymptotic limit, with a focus on the formation of long-lived, low-amplitude, almost harmonic oscillations of the scalar field around a vacuum. The parametrized model is characterized by a double-well potential with a shape-deformation parameter that changes only the steepness of the potential walls, and hence the flatness of the hump of the potential barrier, leaving the two degenerate minima and the barrier height unaffected. It is found that the variation of the deformability parameter promotes several additional vibrational modes in the kink-phonon scattering potential, leading to suppression of the two-bounce windows in kink-antikink scatterings and the production of oscillons. Numerical results suggest that the anharmonicity of the potential barrier, characterized by a flat barrier hump, is the main determining factor for the production of oscillons in double-well systems.

翻訳日:2023-04-22 03:03:19 公開日:2020-12-04

# ワンショット動的資源理論

One-shot dynamical resource theory ( http://arxiv.org/abs/2012.02781v1 )

ライセンス: Link先を確認

Xiao Yuan, Pei Zeng, Minbo Gao and Qi Zhao

(参考訳) 資源理論の根本的な問題は資源の操作を研究することである。ここでは、量子チャネルの一般的な動的資源理論に着目し、資源の単一コピーによる1ショットの蒸留と希釈のタスクについて考察する。ユニタリチャネルや純粋な状態準備チャネルの任意のターゲットに対して、任意のリソースとターゲットの間で変換されるレートの上限と下限を決定するための普遍的な戦略を確立する。本研究では, チャネルのロバスト性およびチャネル仮説テストエントロピーに基づく資源対策と, 対象資源対策の正規化要因との関連性を示す。この戦略はチャネルロバスト性が有限であれば収束境界で最適となり、ターゲットリソースの測度は同じ値に崩壊する。シングルショットの結果はまた、漸近的リソース変換率を得るために、チャネルの漸近的並列操作にも適用される。我々は、純粋性、古典的容量、量子容量、非一様性、コヒーレンス、量子チャネルの絡み合いなど、いくつかの動的資源の例を示す。この結果は、量子通信、フォールトトレラント量子コンピューティング、量子熱力学に応用可能な一般的な動的資源理論に適用できる。

A fundamental problem in resource theory is to study the manipulation of the resource. Focusing on a general dynamical resource theory of quantum channels, here we consider tasks of one-shot resource distillation and dilution with a single copy of the resource. For any target of unitary channel or pure state preparation channel, we establish a universal strategy to determine upper and lower bounds on rates that convert between any given resource and the target. We show that the rates are related to resource measures based on the channel robustness and the channel hypothesis testing entropy, with regularization factors of the target resource measures. The strategy becomes optimal with converged bounds when the channel robustness is finite and measures of the target resource collapse to the same value. The single-shot result also applies to asymptotic parallel manipulation of channels to obtain asymptotic resource conversion rates. We give several examples of dynamical resources, including the purity, classical capacity, quantum capacity, non-uniformity, coherence, and entanglement of quantum channels. Our results are applicable to general dynamical resource theories with potential applications in quantum communication, fault-tolerant quantum computing, and quantum thermodynamics.

翻訳日:2023-04-22 02:55:24 公開日:2020-12-04

# 化学気相沈着グラフェン層および境界における等方性スピンダイナミクスを持つロバストなスピンインターコネクト

Robust Spin Interconnect with Isotropic Spin Dynamics in Chemical Vapour Deposited Graphene Layers and Boundaries ( http://arxiv.org/abs/2012.02674v1 )

ライセンス: Link先を確認

Dmitrii Khokhriakov, Bogdan Karpiak, Anamul Md. Hoque, Bing Zhao, Subir Parui, Saroj P. Dash

(参考訳) 化学気相沈着(CVD)により成長する大面積グラフェンの利用は、全スピンメモリおよび論理回路におけるスケーラブルなスピン配線の開発に不可欠である。しかし、多層グラフェンパッチの存在とその境界がスピンダイナミクスに与える影響は、まだ解決されていないため、ロバストなスピンインターコネクトの基本的な理解と応用には必要である。ここでは, 特別に考案された単一層, 二重層および三層グラフェンチャネルにおける普遍的なスピン輸送と動的特性と, CVDグラフェン試料中に存在するそれらの層の境界と折り畳みについて報告する。グラフェン層の配向が異なるスピンに対して等方性スピン緩和を施した均一スピン寿命を室温で観察した。すべての不均一グラフェンチャネルにおいて、スピン偏極した平面外および面内スピンのスピン寿命異方性比は、一様に近いと測定される。解析の結果,多層チャネルにおけるエリオット・ヤフェットとヤコノフ・ペルルの機構の重要性が示され,後者の役割が高まった。大規模不均質CVDグラフェンの多層パッチとその境界と室温での折り畳みによる普遍的および等方的スピン輸送は、そのスピン相互結合性を証明し、スケーラブルなスピントロニクス回路の開発に有用である。

The utilization of large-area graphene grown by chemical vapour deposition (CVD) is crucial for the development of scalable spin interconnects in all-spin-based memory and logic circuits. However, the fundamental influence of the presence of multilayer graphene patches and their boundaries on spin dynamics has not been addressed yet, which is necessary for basic understanding and application of robust spin interconnects. Here, we report universal spin transport and dynamic properties in specially devised single layer, bi-layer, and tri-layer graphene channels and their layer boundaries and folds that are usually present in CVD graphene samples. We observe uniform spin lifetime with isotropic spin relaxation for spins with different orientations in graphene layers and their boundaries at room temperature. In all the inhomogeneous graphene channels, the spin lifetime anisotropy ratios for spins polarized out-of-plane and in-plane are measured to be close to unity. Our analysis shows the importance of both Elliott-Yafet and Dyakonov-Perel mechanisms, with an increasing role of the latter mechanism in multilayer channels. These results of universal and isotropic spin transport on large-area inhomogeneous CVD graphene with multilayer patches and their boundaries and folds at room temperature prove its outstanding spin interconnect functionality, beneficial for the development of scalable spintronic circuits.

翻訳日:2023-04-22 02:53:46 公開日:2020-12-04

# 気象正常化を用いた大学寮の電力消費に及ぼすCOVID-19の影響調査

Investigation of the Impacts of COVID-19 on the Electricity Consumption of a University Dormitory Using Weather Normalization ( http://arxiv.org/abs/2012.07748v1 )

ライセンス: Link先を確認

Zhihong Pang, Fan Feng, Zheng O'Neill

(参考訳) 本研究では,米国南部にある大学寮ビルの電力消費に対する新型コロナウイルス(covid-19)パンデミックの影響を調査した。 2017年1月1日から2020年7月31日までに収集されたこの大学寮の歴史的電力消費データと、キャンパス内気象観測所の気象データを用いて分析を行った。 4つの逆データ駆動予測モデル、すなわち、人工ニューラルネットワーク、ロング短期メモリリカレントニューラルネットワーク、eXtreme Gradient Boosting、Light Gradient Boosting Machineを用いて、気象条件の影響を考慮した。その結果、新型コロナウイルスによるキャンパス閉鎖時の予測値と比較して、対象建物の総電力消費量は41%(約276,000 kWh (942 MMBtu))減少した。また, 日負荷比 (DLR) も有意に変化した。概して、DLRは2020年3月後半に80%から40%近くまで徐々に減少し、2020年4月、5月、6月に30%から60%まで比較的安定した水準を維持し、2020年7月には徐々に正常な能力の80%まで回復した。

This study investigated the impacts of the COVID-19 pandemic on the electricity consumption of a university dormitory building in the southern U.S. The historical electricity consumption data of this university dormitory building and weather data from an on-campus weather station, collected from January 1st, 2017 to July 31st, 2020, were used for analysis. Four inverse data-driven prediction models, i.e., Artificial Neural Network, Long Short-Term Memory Recurrent Neural Network, eXtreme Gradient Boosting, and Light Gradient Boosting Machine, were used to account for the influence of the weather conditions. The results suggested that the total electricity consumption of the target building decreased by nearly 41% (about 276,000 kWh (942 MMBtu)) compared with the predicted value during the campus shutdown due to COVID-19. In addition, the daily load ratio (DLR) varied significantly. In general, the DLR decreased gradually from 80% to nearly 40% in the second half of March 2020, remained at a relatively stable level between 30% and 60% in April, May, and June 2020, and then slowly recovered to 80% of the normal capacity in July 2020.

翻訳日:2023-04-22 02:47:24 公開日:2020-12-04
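
The energy figures quoted in the dormitory abstract above can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes the common conversion factor of about 3,412.14 Btu per kWh, and the helper names (`kwh_to_mmbtu`, `implied_baseline_kwh`) are hypothetical, not taken from the paper. The implied baseline is a back-of-the-envelope figure derived from the rounded 41% value, not a number the authors report.

```python
# Assumed conversion factor: roughly 3,412.14 Btu per kWh.
BTU_PER_KWH = 3412.14


def kwh_to_mmbtu(kwh: float) -> float:
    """Convert kilowatt-hours to million Btu (MMBtu)."""
    return kwh * BTU_PER_KWH / 1_000_000


def implied_baseline_kwh(saved_kwh: float, fractional_drop: float) -> float:
    """Predicted consumption implied by an absolute saving and its relative share."""
    return saved_kwh / fractional_drop


saved_kwh = 276_000      # reduction quoted in the abstract
fractional_drop = 0.41   # "nearly 41%" quoted in the abstract (rounded)

saved_mmbtu = kwh_to_mmbtu(saved_kwh)
baseline_kwh = implied_baseline_kwh(saved_kwh, fractional_drop)

print(f"saving: {saved_mmbtu:.0f} MMBtu")
print(f"implied weather-normalized baseline: {baseline_kwh:,.0f} kWh")
```

The converted saving agrees with the 942 MMBtu stated in the abstract, which suggests the two units describe the same quantity.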

# 都市内業務におけるオンライン食品配送データの利用と住宅移動度検出とキャラクタリゼーション

Exploring the Usage of Online Food Delivery Data for Intra-Urban Job and Housing Mobility Detection and Characterization ( http://arxiv.org/abs/2012.03739v1 )

ライセンス: Link先を確認

Yawen Zhang, Seth Spielman, Qi Liu, Si Shen, Jason Shuo Zhang, Qin Lv

(参考訳) ヒトのモビリティは都市計画や政策立案において重要な役割を担っている。しかし、ある空間的・時間的解像度では、例えば仕事や住宅の移動などの追跡は非常に困難である。本研究では,仕事や住宅の移動性を検出するために,オンラインフードデリバリーデータであるデータセットの新しいモダリティの利用について検討する。中国の北京で人気のオンライン食品注文・配送サービスから何百万もの注文を受け付けることで、従来のデータソースよりもはるかに高い空間的・時間的解像度で、仕事や住宅の移動を検出できるのです。一般的な動きの季節や起源・運命はよく特定できる。より重要なことは、検出された動きをマクロとマイクロレベルの両方の要素に合わせ、仕事と住宅のダイナミクスを特徴づける。以上の結果から,通勤距離は仕事や住宅移動の大きな要因であることが示唆された。また,(1)住宅移動者の場合,都市空間構造を考えると,住宅コストの低減と通勤距離の短縮との間にはトレードオフがあり,(2)就業ホッパーの場合,残業頻度が高い場合,仕事の切り替えによって労働時間を短縮する傾向が強い。この新しいデータセットのモダリティには制限があるが、異なる特徴を持つ複数のデータセットのマッシュアップがジョブやハウジングのダイナミクスをより包括的に表現できるような、アンサンブルアプローチは有望だと考えています。本研究は,雇用・住宅の移動性の検出・分析に食品配送データを活用することの有効性を実証し,アンサンブルに基づくアプローチの潜在可能性の実現に寄与する。

Human mobility plays a critical role in urban planning and policy-making. However, at certain spatial and temporal resolutions, it is very challenging to track, for example, job and housing mobility. In this study, we explore the usage of a new modality of dataset, online food delivery data, to detect job and housing mobility. By leveraging millions of meal orders from a popular online food ordering and delivery service in Beijing, China, we are able to detect job and housing moves at much higher spatial and temporal resolutions than using traditional data sources. Popular moving seasons and origins/destinations can be well identified. More importantly, we match the detected moves to both macro- and micro-level factors so as to characterize job and housing dynamics. Our findings suggest that commuting distance is a major factor for job and housing mobility. We also observe that: (1) For home movers, there is a trade-off between lower housing cost and shorter commuting distance given the urban spatial structure; (2) For job hoppers, those who frequently work overtime are more likely to reduce their working hours by switching jobs. While this new modality of dataset has its limitations, we believe that ensemble approaches would be promising, where a mash-up of multiple datasets with different characteristic limitations can provide a more comprehensive picture of job and housing dynamics. Our work demonstrates the effectiveness of utilizing food delivery data to detect and analyze job and housing mobility, and contributes to realizing the full potential of ensemble-based approaches.

翻訳日:2023-04-22 02:45:40 公開日:2020-12-04

# 公開鍵暗号を用いた健康診断の検証

Verifiable Proof of Health using Public Key Cryptography ( http://arxiv.org/abs/2012.02885v1 )

ライセンス: Link先を確認

Abhishek Singh, Ramesh Raskar

(参考訳) 現在のパンデミックでは、検疫や接触追跡などの健康関連の介入を行うため、検査は病気の拡散や早期診断を監視・抑制するための最も重要なツールであり続けている。したがって、公の場が安全にオープンする準備ができているため、テストステータスの検証能力が重要となる。暗号ツールの最近の進歩により、セキュアで弾力性のあるデジタルidシステムの構築が可能になった。本稿では,テスト結果検証システムの相互運用可能な層を設計する上で,プライバシや計算,その他の実用上の懸念を考慮に入れて,より厳格で選択的なロックダウンを可能にするためのエンドツーエンドの新型コロナウイルス結果検証プロトコルを構築することを提案する。また,提案システムのセキュリティ,プライバシ,倫理,公平性に関する様々な懸念についても論じる。

In the current pandemic, testing continues to be the most important tool for monitoring and curbing the spread of the disease and for early identification of the disease in order to perform health-related interventions such as quarantine and contact tracing. Therefore, the ability to verify testing status is pertinent as public places prepare to open safely. Recent advances in cryptographic tools have made it possible to build a secure and resilient digital-ID system. In this work, we propose to build an end-to-end COVID-19 results verification protocol that takes privacy, computation, and other practical concerns into account in designing an interoperable layer of a test-results verification system that could potentially enable less stringent and more selective lockdowns. We also discuss various concerns encompassing the security, privacy, ethics, and equity aspects of the proposed system.

翻訳日:2023-04-22 02:45:10 公開日:2020-12-04

# 動的偏極核環境と相互作用するNV中心の客観性

Appearance of objectivity for NV centers interacting with dynamically polarized nuclear environment ( http://arxiv.org/abs/2012.02855v1 )

ライセンス: Link先を確認

Damian Kwiatkowski, Łukasz Cywiński, Jarosław K. Korbicz

(参考訳) 量子から古典への遷移は、まだ完全に理解できない。その多面的側面から、最近注目を集めているのが、量子から客観的な世界の出現である。特に、客観性は、スペクトル放送構造(SBS)として知られる進化中の特定の量子状態構造の形成によって現れる。この最強かつ最も基本的な客観性に関する研究がすでに行われているにもかかわらず、具体的な物理媒体での実践的実現は今のところ分析されていない。本研究では, ダイヤモンド中の窒素空孔中心を用いて, SBS形成による客観化過程をシミュレートする可能性について検討した。動的偏極技術の達成可能な限界を仮定すると、高いが実験的に実現可能な核スピン偏極($p>0.5$)と$\approx 20$ガウス未満の磁場に対して、NV中心の状態と最も近い偏極環境が、SBS状態に十分よく近づくことを示す。

The quantum-to-classical transition still eludes full understanding. Of its many aspects, one has recently gained increased attention: the appearance of an objective world out of the quantum one. One particular idea is that objectivity appears thanks to the formation of specific quantum state structures during the evolution, known as Spectrum Broadcast Structures (SBS). Although considerable research has already been performed on this strongest and most fundamental form of objectivity, its practical realization in a concrete physical medium has not been analyzed so far. In this work, we study the possibility of simulating the objectivization process via SBS formation using widely studied nitrogen-vacancy centers in diamond. Assuming achievable limits of the dynamical polarization technique, we show that for high but experimentally viable polarizations ($p>0.5$) of nuclear spins and for magnetic fields lower than $\approx 20$ Gauss, the state of the NV center and its nearest polarized environment approaches an SBS state reasonably well.

翻訳日:2023-04-22 02:44:57 公開日:2020-12-04

# 拡散写像を用いた量子相転移の教師なし機械学習

Unsupervised machine learning of quantum phase transitions using diffusion maps ( http://arxiv.org/abs/2003.07399v2 )

ライセンス: Link先を確認

Alexander Lidiak and Zhexuan Gong

(参考訳) 実験的な量子シミュレータは巨大で複雑になり、膨大な量の計測データから新しい物理学を発見することは、特にシミュレーションモデルの理論的理解がほとんどない場合、非常に困難である。教師なしの機械学習手法はこの課題を克服する上で特に有望である。量子相転移を学習する特定のタスクのために、教師なし機械学習法は主に単純な順序パラメータによって特徴づけられる相転移のために開発されてきた。しかし、そのような方法はしばしば不連続相、原子価結合固体、位相次数、多体局在など、より複雑な相転移では失敗する。測定データの非線形次元減少とスペクトルクラスタリングを行う拡散写像法は,そのような複雑な位相遷移を教師なしで学習する上で有意なポテンシャルを有することを示す。本手法は、局所観測性の測定を単一基底で行うため、様々な量子位相や相転移を学習するための汎用ツールとして、多くの実験量子シミュレータに容易に適用できる。

Experimental quantum simulators have become large and complex enough that discovering new physics from the huge amount of measurement data can be quite challenging, especially when little theoretical understanding of the simulated model is available. Unsupervised machine learning methods are particularly promising in overcoming this challenge. For the specific task of learning quantum phase transitions, unsupervised machine learning methods have primarily been developed for phase transitions characterized by simple order parameters, typically linear in the measured observables. However, such methods often fail for more complicated phase transitions, such as those involving incommensurate phases, valence-bond solids, topological order, and many-body localization. We show that the diffusion map method, which performs nonlinear dimensionality reduction and spectral clustering of the measurement data, has significant potential for learning such complex phase transitions unsupervised. This method works for measurements of local observables in a single basis and is thus readily applicable to many experimental quantum simulators as a versatile tool for learning various quantum phases and phase transitions.

翻訳日:2022-12-23 03:58:41 公開日:2020-12-04
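
The diffusion map idea named in the abstract above (a Gaussian kernel on the data, normalized into a Markov matrix whose leading non-trivial eigenvector gives a low-dimensional coordinate) can be sketched in plain Python. This is a toy illustration under stated assumptions, not the authors' implementation: the data are made-up one-dimensional values standing in for measurement results, `diffusion_coordinate` is a hypothetical helper, and a simple deflated power iteration replaces a proper eigensolver.

```python
import math


def diffusion_coordinate(points, epsilon=1.0, iterations=200):
    """Return the first non-trivial diffusion coordinate of 1D data points."""
    n = len(points)
    # Gaussian kernel on pairwise squared distances.
    kernel = [[math.exp(-((a - b) ** 2) / epsilon) for b in points] for a in points]
    degree = [sum(row) for row in kernel]
    # Symmetric form S = D^{-1/2} K D^{-1/2}; it has the same eigenvalues
    # as the Markov matrix P = D^{-1} K.
    sym = [[kernel[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
           for i in range(n)]
    # The trivial eigenvector of S (eigenvalue 1) is proportional to sqrt(degree).
    total = math.sqrt(sum(degree))
    top = [math.sqrt(d) / total for d in degree]

    def deflate(vec):
        """Remove the trivial component and renormalize to unit length."""
        overlap = sum(x * t for x, t in zip(vec, top))
        rest = [x - overlap * t for x, t in zip(vec, top)]
        length = math.sqrt(sum(x * x for x in rest))
        return [x / length for x in rest]

    # Power iteration, re-orthogonalized against the trivial eigenvector each step.
    vec = deflate([float(i) for i in range(n)])
    for _ in range(iterations):
        vec = deflate([sum(sym[i][j] * vec[j] for j in range(n)) for i in range(n)])
    # Map back to a right eigenvector of the Markov matrix P.
    return [v / math.sqrt(d) for v, d in zip(vec, degree)]


# Toy data: two well-separated groups standing in for two "phases".
points = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
coords = diffusion_coordinate(points)
labels = [c > 0 for c in coords]
print(labels)
```

On this toy input the sign of the coordinate splits the two groups without any labels, which is the sense in which the method clusters unsupervised. Real measurement snapshots would be high-dimensional, and the kernel width `epsilon` would need tuning.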

# きめ細かい表情操作に向けて

Toward Fine-grained Facial Expression Manipulation ( http://arxiv.org/abs/2004.03132v2 )

ライセンス: Link先を確認

Jun Ling, Han Xue, Li Song, Shuhui Yang, Rong Xie, Xiao Gu

(参考訳) 表情操作は、所定の条件で表情を編集することを目的としている。従来の方法は、個別の感情ラベルまたは絶対状態(例えば、顔の動き単位)のガイダンスの下で入力画像を編集し、所望の表現を保持する。しかし、これらの手法は条件非関連領域の変更に悩まされるか、きめ細かい編集に非効率である。本研究では,これら2つの目的を考察し,新しい手法を提案する。まず,連続絶対条件を相対条件,特に相対作用単位に置き換える。相対作用単位を用いて、生成器は非ゼロ値の相対AUによって指定される関心領域のみを変換することを学ぶ。第2に、我々のジェネレータはU-Net上に構築されているが、高品質な表現編集のためのマルチスケール特徴融合(MSF)機構によって強化されている。定量的評価と定性評価の両面での広範囲な実験により,提案手法の改良が示された。コードは \url{https://github.com/junleen/expression-manipulator} で入手できる。

Facial expression manipulation aims at editing facial expression with a given condition. Previous methods edit an input image under the guidance of a discrete emotion label or absolute condition (e.g., facial action units) to possess the desired expression. However, these methods either suffer from changing condition-irrelevant regions or are inefficient for fine-grained editing. In this study, we take these two objectives into consideration and propose a novel method. First, we replace continuous absolute condition with relative condition, specifically, relative action units. With relative action units, the generator learns to only transform regions of interest which are specified by non-zero-valued relative AUs. Second, our generator is built on U-Net but strengthened by Multi-Scale Feature Fusion (MSF) mechanism for high-quality expression editing purposes. Extensive experiments on both quantitative and qualitative evaluation demonstrate the improvements of our proposed approach compared to the state-of-the-art expression editing methods. Code is available at \url{https://github.com/junleen/Expression-manipulator}.

翻訳日:2022-12-16 00:17:13 公開日:2020-12-04

# 混合密度条件生成逆ネットワークモデル(MD-CGAN)

Mixture Density Conditional Generative Adversarial Network Models (MD-CGAN) ( http://arxiv.org/abs/2004.03797v3 )

ライセンス: Link先を確認

Jaleh Zand and Stephen Roberts

(参考訳) 近年,GAN (Generative Adversarial Networks) が注目されている。しかし、そのような例と比較して、予測を含む時系列モデリングへのGANの応用はより限られている。本稿では,時系列予測に着目した混合密度条件付き生成逆解析モデル(md-cgan)を提案する。本モデルでは,予測よりも確率的後続分布を推定できることを示すとともに,一連のベンチマーク手法と比較して,特にノイズが観測時系列の重要な成分である状況において,MD-CGANモデルが良好に動作することを示す。さらに、出力分布としてガウス混合モデルを用いることで、MD-CGANは非ガウス的な後続予測を提供する。

Generative Adversarial Networks (GANs) have gained significant attention in recent years, with impressive applications highlighted in computer vision in particular. Compared to such examples, however, there have been more limited applications of GANs to time series modelling, including forecasting. In this work, we present the Mixture Density Conditional Generative Adversarial Model (MD-CGAN), with a focus on time series forecasting. We show that our model is capable of estimating a probabilistic posterior distribution over forecasts and that, in comparison to a set of benchmark methods, the MD-CGAN model performs well, particularly in situations where noise is a significant component of the observed time series. Further, by using a Gaussian mixture model as the output distribution, MD-CGAN offers posterior predictions that are non-Gaussian.

翻訳日:2022-12-15 08:19:25 公開日:2020-12-04
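
The last point of the MD-CGAN abstract above, that a Gaussian mixture output yields non-Gaussian predictive distributions, can be illustrated with a small sketch. The mixture parameters below are invented for illustration, and the functions are hypothetical helpers. In the paper's setting such parameters would be produced by the model, which this sketch does not attempt to reproduce.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a univariate normal distribution at x."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))


def mixture_pdf(x, weights, means, stds):
    """Density of a Gaussian mixture: weighted sum of component densities."""
    return sum(w * gaussian_pdf(x, m, s) for w, m, s in zip(weights, means, stds))


def mixture_mean(weights, means):
    """Point forecast implied by the mixture (its expected value)."""
    return sum(w * m for w, m in zip(weights, means))


# Made-up forecast distribution with two equally likely outcomes.
weights, means, stds = [0.5, 0.5], [-1.0, 1.0], [0.3, 0.3]

density_at_modes = (mixture_pdf(-1.0, weights, means, stds),
                    mixture_pdf(1.0, weights, means, stds))
density_at_mean = mixture_pdf(mixture_mean(weights, means), weights, means, stds)
print(density_at_modes, density_at_mean)
```

Here the expected value sits in a low-density valley between the two modes, something a single Gaussian forecast could not express.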

# 深層学習を用いた道路網上の温室効果ガス排出予測

Greenhouse Gas Emission Prediction on Road Network using Deep Sequence Learning ( http://arxiv.org/abs/2004.08286v2 )

ライセンス: Link先を確認

Lama Alfaseeh, Ran Tu, Bilal Farooq, and Marianne Hatzopoulou

(参考訳) 交通システムの環境への影響を緩和することが最重要課題である。したがって、温室効果ガス(GHG)排出量の予測は、特に知的輸送システム(ITS)の出現において重要なトピックの1つである。本研究では,従来の時間ステップの速度,密度,GHG ERなど,最も代表的な予測値に基づいて,リンクレベルのGHG排出率(ER)を予測するディープラーニングフレームワークを開発する。特に,外因性変数を持つlong-short term memory(lstm)ネットワークの諸仕様を,クラスタリングおよび外因性変数を用いた自己回帰的統合移動平均(arima)モデルと比較した。トロント中心街の道路網はケーススタディとして利用され、校正交通マイクロシミュレーションとMOVESを用いて詳細なデータを合成する。 LSTM仕様では,3分間の速度,密度,GHG ER,リンク内速度が2層を隠蔽し,過度パラメータを体系的に調整した場合に最適であることがわかった。 30秒の更新間隔を採用すると、真のGHG ERと予測されたGHG ERとの相関はわずかに改善されるが、増大したルート平均二乗誤差(RMSE)値に反映されるように予測精度に悪影響を及ぼす。データ要求の少ない高頻度でのGHG排出量の効率的な予測は,地球温暖化への悪影響を軽減するための,大規模道路網における非近視眼的エコルーティングへの道を開く。

Mitigating the substantial undesirable impact of transportation systems on the environment is paramount. Thus, predicting Greenhouse Gas (GHG) emissions is a key topic, especially with the emergence of intelligent transportation systems (ITS). We develop a deep learning framework to predict link-level GHG emission rate (ER) (in CO2eq gram/second) based on the most representative predictors, such as speed, density, and the GHG ER of previous time steps. In particular, various specifications of long short-term memory (LSTM) networks with exogenous variables are examined and compared with clustering and the autoregressive integrated moving average (ARIMA) model with exogenous variables. The downtown Toronto road network is used as the case study, and highly detailed data are synthesized using a calibrated traffic microsimulation and MOVES. It is found that the LSTM specification with speed, density, GHG ER, and in-links speed from the three previous minutes performs best when 2 hidden layers are adopted and the hyper-parameters are systematically tuned. Adopting a 30-second updating interval slightly improves the correlation between true and predicted GHG ERs, but contributes negatively to the prediction accuracy, as reflected in the increased root mean square error (RMSE) value. Efficiently predicting GHG emissions at a higher frequency with lower data requirements will pave the way to non-myopic eco-routing on large-scale road networks to alleviate the adverse impact on global warming.

翻訳日:2022-12-12 21:18:30 公開日:2020-12-04

# Twitterの談話におけるバイアスによる目標情報操作の自動評価

Automatically Characterizing Targeted Information Operations Through Biases Present in Discourse on Twitter ( http://arxiv.org/abs/2004.08726v3 )

ライセンス: Link先を確認

Autumn Toney, Akshat Pandey, Wei Guo, David Broniatowski, Aylin Caliskan

(参考訳) 本稿では、人工知能による新たな情報操作と関連する可能性のある全体的な態度やバイアスを自動的に特徴付ける問題を検討する。これらの新興トピックの正確な分析には、新しいトピックのバイアスを特定するために何百万ものツイートを注釈付けするために、専門家による精巧な手動分析が必要である。本稿では,CaliskanらによるWord Embedding Association Testの新たなドメインへの拡張について紹介する(Caliskan, 2017)。本手法は,情報操作におけるバイアスの定量化に有効である。本手法は,Twitterの透明性レポートからの既知の情報操作関連ツイートを用いて検証する。我々は、新型コロナウイルスパンデミックに関するケーススタディを行い、未ラベルのTwitterデータ上での方法のパフォーマンスを評価し、新興ドメインにおけるそのユーザビリティを実証した。

This paper considers the problem of automatically characterizing overall attitudes and biases that may be associated with emerging information operations via artificial intelligence. Accurate analysis of these emerging topics usually requires laborious, manual analysis by experts to annotate millions of tweets to identify biases in new topics. We introduce extensions of the Word Embedding Association Test from Caliskan et al. to a new domain (Caliskan, 2017). Our practical and unsupervised method is used to quantify biases promoted in information operations. We validate our method using known information operation-related tweets from Twitter's Transparency Report. We perform a case study on the COVID-19 pandemic to evaluate our method's performance on non-labeled Twitter data, demonstrating its usability in emerging domains.

翻訳日:2022-12-12 05:08:21 公開日:2020-12-04
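
The abstract above builds on the Word Embedding Association Test (WEAT). As a rough orientation, the sketch below computes the original WEAT effect size of Caliskan et al., not the extension proposed in this paper. The two-dimensional "embeddings" are invented toy vectors, the function names are hypothetical, and the use of the sample standard deviation is an implementation choice.

```python
import math
import statistics


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def association(w, attrs_a, attrs_b):
    """s(w, A, B): mean similarity of w to set A minus mean similarity to set B."""
    return (statistics.mean([cosine(w, a) for a in attrs_a])
            - statistics.mean([cosine(w, b) for b in attrs_b]))


def weat_effect_size(targets_x, targets_y, attrs_a, attrs_b):
    """Difference of mean associations of X and Y, scaled by their pooled spread."""
    s_x = [association(x, attrs_a, attrs_b) for x in targets_x]
    s_y = [association(y, attrs_a, attrs_b) for y in targets_y]
    return (statistics.mean(s_x) - statistics.mean(s_y)) / statistics.stdev(s_x + s_y)


# Toy vectors: target set X leans toward attribute set A, target set Y toward B.
X = [[1.0, 0.1], [0.9, 0.2]]
Y = [[0.1, 1.0], [0.2, 0.9]]
A = [[1.0, 0.0]]
B = [[0.0, 1.0]]

effect = weat_effect_size(X, Y, A, B)
print(effect)
```

A positive effect size indicates that the X targets sit closer to the A attributes than the Y targets do. With real tweet-derived embeddings the word sets would have to be chosen per topic.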

# 価値に敏感なデザインは持続可能な消費をいかに後押しできるか

How Value-Sensitive Design Can Empower Sustainable Consumption ( http://arxiv.org/abs/2004.09180v4 )

ライセンス: Link先を確認

Thomas Asikis, Johannes Klinglmayr, Dirk Helbing, Evangelos Pournaras

(参考訳) いわゆる人口過多の世界では,持続的消費は存在的に重要である。しかしながら,製品選択のスペクトルの拡大と生産の複雑さは,消費者に情報と価値に敏感な意思決定を迫られる。最近の(個人化された)心理的操作に基づくアプローチは、しばしば不透明で、プライバシーを侵害し、(情報的な)自己決定と矛盾する。対照的に、情報的選択に基づく責任ある消費は、人間の認知能力に圧倒されがちな程度に推論を必要とする。その結果、持続可能な消費への集団的シフトは依然として大きな課題である。ここでは,価値に敏感なデザインをサポートし,サステナビリティ意識を活用する,透明な製品情報と説明可能な製品評価に専門家の知識と“群衆の知恵”を用いて,スマートフォンアプリとして実装された新しいパーソナルショッピングアシスタントを紹介する。 2つのスーパーマーケットにおける実世界のフィールド実験は、より高い持続可能性意識とボトムアップによるより持続可能な消費への行動シフトを確認している。これらの結果は、消費者の好みとより高い持続可能性と倫理的に一致した、小売業者と生産者のための新しいビジネスモデルを奨励する。

In a so-called overpopulated world, sustainable consumption is of existential importance. However, the expanding spectrum of product choices and their production complexity challenge consumers to make informed and value-sensitive decisions. Recent approaches based on (personalized) psychological manipulation are often non-transparent, potentially privacy-invasive and inconsistent with (informational) self-determination. In contrast, responsible consumption based on informed choices currently requires reasoning to an extent that tends to overwhelm human cognitive capacity. As a result, a collective shift towards sustainable consumption remains a grand challenge. Here we demonstrate a novel personal shopping assistant implemented as a smartphone app that supports a value-sensitive design and leverages sustainability awareness, using experts' knowledge and the "wisdom of the crowd" for transparent product information and explainable product ratings. Real-world field experiments in two supermarkets confirm higher sustainability awareness and a bottom-up behavioral shift towards more sustainable consumption. These results encourage novel business models for retailers and producers, ethically aligned with consumer preferences and with higher sustainability.

翻訳日:2022-12-11 19:29:19 公開日:2020-12-04

# EfficientPose: スケーラブルなシングルパーソンポーズ推定

EfficientPose: Scalable single-person pose estimation ( http://arxiv.org/abs/2004.12186v2 )

ライセンス: Link先を確認

Daniel Groos, Heri Ramampiaro, Espen A. F. Ihlen

(参考訳) 一人称人間のポーズ推定は、スポーツにおけるマーカーレス運動分析と臨床応用を促進する。それでも、人間のポーズ推定の最先端モデルは、一般に実際の応用の要件を満たしていない。深層学習技術の普及は、多くの先進的なアプローチを生み出した。しかし、この分野の進展に伴い、より複雑で非効率なモデルも導入され、計算要求が大幅に増加した。このような複雑で非効率な課題に対処するため,我々は,最近提案されている効率性ネットを活用し,効率的かつスケーラブルな一人称ポーズ推定を実現する,新しい畳み込みニューラルネットワークアーキテクチャである efficientpose を提案する。 efficientposeは、モバイル逆ボトルネック畳み込みを用いた効果的なマルチスケール特徴抽出器と計算効率の高い検出ブロックを活用したモデル群であると同時に、ポーズ構成の精度も向上している。複雑さと効率が低いため、EfficientPoseはメモリフットプリントと計算コストを制限し、エッジデバイス上の現実世界のアプリケーションを可能にする。実験の結果,MPIIシングルパーソンベンチマークを用いた結果,提案したEfficientPoseモデルは,精度と計算効率の両面で広く使用されているOpenPoseモデルより大幅に優れていることがわかった。特に,我々のトップパフォーマンスモデルでは,低複雑さのConvNetを用いて,シングルパーソンMPIIにおける最先端の精度を実現している。

Single-person human pose estimation facilitates markerless movement analysis in sports, as well as in clinical applications. Still, state-of-the-art models for human pose estimation generally do not meet the requirements of real-life applications. The proliferation of deep learning techniques has resulted in the development of many advanced approaches. However, with progress in the field, more complex and inefficient models have also been introduced, which have caused tremendous increases in computational demands. To cope with these complexity and inefficiency challenges, we propose a novel convolutional neural network architecture, called EfficientPose, which exploits recently proposed EfficientNets in order to deliver efficient and scalable single-person pose estimation. EfficientPose is a family of models harnessing an effective multi-scale feature extractor and computationally efficient detection blocks using mobile inverted bottleneck convolutions, while at the same time ensuring that the precision of the pose configurations is still improved. Due to its low complexity and efficiency, EfficientPose enables real-world applications on edge devices by limiting the memory footprint and computational cost. The results from our experiments, using the challenging MPII single-person benchmark, show that the proposed EfficientPose models substantially outperform the widely used OpenPose model in terms of both accuracy and computational efficiency. In particular, our top-performing model achieves state-of-the-art accuracy on single-person MPII, with low-complexity ConvNets.

翻訳日:2022-12-09 21:44:12 公開日:2020-12-04

# A convolutional neural-network model of human cochlear mechanics and filter tuning for real-time applications ( http://arxiv.org/abs/2004.14832v4 )

License: see the linked page

Deepak Baby, Arthur Van Den Broucke, Sarah Verhulst

Auditory models are commonly used as feature extractors for automatic speech-recognition systems or as front-ends for robotics, machine-hearing and hearing-aid applications. Although auditory models can capture the biophysical and nonlinear properties of human hearing in great detail, these biophysical models are computationally expensive and cannot be used in real-time applications. We present a hybrid approach where convolutional neural networks are combined with computational neuroscience to yield a real-time end-to-end model for human cochlear mechanics, including level-dependent filter tuning (CoNNear). The CoNNear model was trained on acoustic speech material and its performance and applicability were evaluated using (unseen) sound stimuli commonly employed in cochlear mechanics research. The CoNNear model accurately simulates human cochlear frequency selectivity and its dependence on sound intensity, an essential quality for robust speech intelligibility at negative speech-to-background-noise ratios. The CoNNear architecture is based on parallel and differentiable computations and has the power to achieve real-time human performance. These unique CoNNear features will enable the next generation of human-like machine-hearing applications.

Translated: 2022-12-08 05:43:49 Published: 2020-12-04

# TAVAT: Token-Aware Virtual Adversarial Training for Language Understanding ( http://arxiv.org/abs/2004.14543v3 )

License: see the linked page

Linyang Li, Xipeng Qiu

Gradient-based adversarial training is widely used to improve the robustness of neural networks, but it cannot be easily adapted to natural language processing tasks because text is discrete and cannot be perturbed by gradients directly. Virtual adversarial training, which instead generates perturbations in the embedding space, has therefore been introduced for NLP tasks. Despite its success, existing virtual adversarial training methods generate perturbations roughly constrained by Frobenius normalization balls. To craft fine-grained perturbations, we propose a Token-Aware Virtual Adversarial Training method. We introduce a token-level accumulated perturbation vocabulary to initialize the perturbations better and use a token-level normalization ball to constrain these perturbations accordingly. Experiments show that our method improves the performance of pre-trained models such as BERT and ALBERT in various tasks by a considerable margin. The proposed method improves the score of the GLUE benchmark from 78.3 to 80.9 using the BERT model, and it also enhances the performance of sequence labeling and text classification tasks.

Translated: 2022-12-08 03:57:05 Published: 2020-12-04
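
The TAVAT abstract above contrasts a single Frobenius-norm ball over the whole perturbation with a token-level normalization ball. The difference can be illustrated with a minimal Python sketch. The function names, the list-of-lists representation (one row per token embedding) and the radius `eps` are assumptions made for illustration; this is not the authors' implementation.

```python
import math


def project_frobenius(delta, eps):
    """Scale the whole perturbation so that its Frobenius norm is at most eps."""
    norm = math.sqrt(sum(x * x for row in delta for x in row))
    if norm <= eps or norm == 0.0:
        return [row[:] for row in delta]
    scale = eps / norm
    return [[x * scale for x in row] for row in delta]


def project_token_level(delta, eps):
    """Scale each token's perturbation (one row) so that its L2 norm is at most eps."""
    projected = []
    for row in delta:
        norm = math.sqrt(sum(x * x for x in row))
        scale = 1.0 if (norm <= eps or norm == 0.0) else eps / norm
        projected.append([x * scale for x in row])
    return projected


# Hypothetical perturbation for a two-token sentence with 2-d embeddings.
delta = [[3.0, 4.0], [0.3, 0.4]]
per_token = project_token_level(delta, eps=1.0)   # only the large first row is shrunk
whole = project_frobenius(delta, eps=1.0)         # every row is shrunk by the same factor
```

With the per-token ball, a token whose perturbation is already small is left untouched, whereas the global ball rescales all tokens together.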

# Learning Long-Term Dependencies in Irregularly-Sampled Time Series ( http://arxiv.org/abs/2006.04418v4 )

License: see the linked page

Mathias Lechner and Ramin Hasani

Recurrent neural networks (RNNs) with continuous-time hidden states are a natural fit for modeling irregularly-sampled time series. These models, however, face difficulties when the input data possess long-term dependencies. We prove that, as with standard RNNs, the underlying reason for this issue is the vanishing or exploding of the gradient during training. This phenomenon is expressed by the ordinary differential equation (ODE) representation of the hidden state, regardless of the choice of ODE solver. We provide a solution by designing a new algorithm based on the long short-term memory (LSTM) that separates its memory from its time-continuous state. This way, we encode a continuous-time dynamical flow within the RNN, allowing it to respond to inputs arriving at arbitrary time-lags while ensuring a constant error propagation through the memory path. We call these RNN models ODE-LSTMs. We experimentally show that ODE-LSTMs outperform advanced RNN-based counterparts on non-uniformly sampled data with long-term dependencies. All code and data are available at https://github.com/mlech26l/ode-lstms.

Translated: 2022-11-24 00:32:45 Published: 2020-12-04

# Fast Robust Subspace Tracking via PCA in Sparse Data-Dependent Noise ( http://arxiv.org/abs/2006.08030v3 )

License: see the linked page

Praneeth Narayanamurthy and Namrata Vaswani

This work studies the robust subspace tracking (ST) problem. Robust ST can be simply understood as a (slow) time-varying subspace extension of robust PCA. It assumes that the true data lies in a low-dimensional subspace that is either fixed or changes slowly with time. The goal is to track the changing subspaces over time in the presence of additive sparse outliers and to do this quickly (with a short delay). We introduce a "fast" mini-batch robust ST solution that is provably correct under mild assumptions. Here "fast" means two things: (i) the subspace changes can be detected and the subspaces can be tracked with near-optimal delay, and (ii) the time complexity of doing this is the same as that of simple (non-robust) PCA. Our main result assumes piecewise constant subspaces (needed for identifiability), but we also provide a corollary for the case when the subspace changes a little at each time. A second contribution is a novel non-asymptotic guarantee for PCA in linearly data-dependent noise. An important setting where this is useful is linearly data-dependent noise that is sparse, with support that changes enough over time. The analysis of the subspace update step of our proposed robust ST solution uses this result.

Translated: 2022-11-21 12:36:49 Published: 2020-12-04

# Generalisation Guarantees for Continual Learning with Orthogonal Gradient Descent ( http://arxiv.org/abs/2006.11942v4 )

License: see the linked page

Mehdi Abbana Bennani, Thang Doan, Masashi Sugiyama

In Continual Learning settings, deep neural networks are prone to Catastrophic Forgetting. Orthogonal Gradient Descent (OGD) was proposed to tackle the challenge. However, no theoretical guarantees have been proven yet. We present a theoretical framework to study Continual Learning algorithms in the Neural Tangent Kernel regime. This framework comprises a closed-form expression of the model across tasks, and proxies for Transfer Learning, generalisation and task similarity. In this framework, we prove that OGD is robust to Catastrophic Forgetting and then derive the first generalisation bound for SGD and OGD for Continual Learning. Finally, we study the limits of this framework in practice for OGD and highlight the importance of the Neural Tangent Kernel variation for Continual Learning with OGD.

Translated: 2022-11-18 11:48:48 Published: 2020-12-04
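
The abstract above does not spell out the OGD update itself. The idea usually associated with Orthogonal Gradient Descent is to keep each update for a new task orthogonal to gradient directions stored from earlier tasks. The sketch below shows only that projection step on plain Python lists; the helper names and the toy vectors are assumptions for illustration, not code from the paper.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def remove_components(basis, g):
    """Subtract from g its components along each orthonormal vector in basis."""
    r = list(g)
    for b in basis:
        c = dot(r, b)
        r = [ri - c * bi for ri, bi in zip(r, b)]
    return r


def add_direction(basis, g, tol=1e-12):
    """Gram-Schmidt step: store the part of g that is new, normalised to unit length."""
    r = remove_components(basis, g)
    norm = math.sqrt(dot(r, r))
    if norm > tol:
        basis.append([ri / norm for ri in r])
    return basis


def project_orthogonal(basis, g):
    """Return the gradient restricted to the orthogonal complement of the stored directions."""
    return remove_components(basis, g)


# Hypothetical directions stored after an earlier task.
basis = []
add_direction(basis, [1.0, 0.0, 0.0])
add_direction(basis, [1.0, 1.0, 0.0])

# Gradient for the new task; only the component outside the stored span is kept.
update = project_orthogonal(basis, [2.0, 3.0, 4.0])
```

A descent step along `update` leaves the model's first-order behaviour along the stored directions unchanged, which is the intuition behind robustness to forgetting.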

# Open Source Software for Efficient and Transparent Reviews ( http://arxiv.org/abs/2006.12166v3 )

License: see the linked page

Rens van de Schoot, Jonathan de Bruin, Raoul Schram, Parisa Zahedi, Jan de Boer, Felix Weijdema, Bianca Kramer, Martijn Huijts, Maarten Hoogerwerf, Gerbrich Ferdinands, Albert Harkema, Joukje Willemsen, Yongchao Ma, Qixiang Fang, Sybren Hindriks, Lars Tummers, Daniel Oberski

To help researchers conduct a systematic review or meta-analysis as efficiently and transparently as possible, we designed a tool (ASReview) to accelerate the step of screening titles and abstracts. For many tasks - including but not limited to systematic reviews and meta-analyses - the scientific literature needs to be checked systematically. Currently, scholars and practitioners screen thousands of studies by hand to determine which studies to include in their review or meta-analysis. This is error-prone and inefficient because of extremely imbalanced data: only a fraction of the screened studies is relevant. The future of systematic reviewing will be an interaction with machine learning algorithms to deal with the enormous increase of available text. We therefore developed an open source machine learning-aided pipeline applying active learning: ASReview. We demonstrate by means of simulation studies that ASReview can yield far more efficient reviewing than manual reviewing, while providing high quality. Furthermore, we describe the options of the free and open source research software and present the results from user experience tests. We invite the community to contribute to open source projects such as our own that provide measurable and reproducible improvements over current practice.

Translated: 2022-11-18 06:49:39 Published: 2020-12-04

# Robust Linear Regression: Optimal Rates in Polynomial Time ( http://arxiv.org/abs/2007.01394v4 )

License: see the linked page

Ainesh Bakshi and Adarsh Prasad

We obtain robust and computationally efficient estimators for learning several linear models that achieve statistically optimal convergence rates under minimal distributional assumptions. Concretely, we assume our data is drawn from a $k$-hypercontractive distribution and an $\epsilon$-fraction is adversarially corrupted. We then describe an estimator that converges to the optimal least-squares minimizer for the true distribution at a rate proportional to $\epsilon^{2-2/k}$, when the noise is independent of the covariates. We note that no such estimator was known prior to our work, even with access to unbounded computation. The rate we achieve is information-theoretically optimal and thus we resolve the main open question in Klivans, Kothari and Meka [COLT'18]. Our key insight is to identify an analytic condition that serves as a polynomial relaxation of independence of random variables. In particular, we show that when the moments of the noise and covariates are negatively correlated, we obtain the same rate as independent noise. Further, when the condition is not satisfied, we obtain a rate proportional to $\epsilon^{2-4/k}$, and again match the information-theoretic lower bound. Our central technical contribution is to algorithmically exploit independence of random variables in the "sum-of-squares" framework by formulating it as the aforementioned polynomial inequality.

Translated: 2022-11-15 14:32:55 Published: 2020-12-04
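
To get a feel for the two rates quoted in the abstract above, the following Python sketch evaluates $\epsilon^{2-2/k}$ and $\epsilon^{2-4/k}$ for chosen values. The abstract states the rates only up to proportionality, so the constants are omitted here, and the function names and sample values are assumptions for illustration.

```python
def rate_independent_noise(eps, k):
    """Rate eps^(2 - 2/k), quoted for noise independent of the covariates."""
    return eps ** (2 - 2 / k)


def rate_general_noise(eps, k):
    """Rate eps^(2 - 4/k), quoted for the case where the condition does not hold."""
    return eps ** (2 - 4 / k)


# Hypothetical setting: 1% corruption and a 4-hypercontractive distribution.
eps, k = 0.01, 4
independent = rate_independent_noise(eps, k)   # exponent 1.5
general = rate_general_noise(eps, k)           # exponent 1.0
```

For a fixed corruption fraction below 1, the independent-noise exponent is larger, so the corresponding error term is smaller; both exponents approach 2 as `k` grows.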

# Prediction of Traffic Flow via Connected Vehicles ( http://arxiv.org/abs/2007.05460v2 )

License: see the linked page

Ranwa Al Mallah, Bilal Farooq, Alejandro Quintero

We propose a Short-term Traffic flow Prediction (STP) framework so that transportation authorities can take early action to control flow and prevent congestion. We anticipate flow at future time frames on a target road segment based on historical flow data and innovative features such as real-time feeds and trajectory data provided by Connected Vehicles (CV) technology. To cope with the fact that existing approaches do not adapt to variation in traffic, we show how this novel approach allows advanced modelling by integrating into the forecasting of flow the impact of the various events that CVs realistically encountered on segments along their trajectory. We solve the STP problem with a Deep Neural Network (DNN) in a multitask learning setting augmented by input from CVs. Results show that our approach, namely MTL-CV, with an average Root-Mean-Square Error (RMSE) of 0.052, outperforms state-of-the-art ARIMA time series (RMSE of 0.255) and baseline classifiers (RMSE of 0.122). Single-task learning with an Artificial Neural Network (ANN) performed worse than MTL-CV, with an RMSE of 0.113. MTL-CV learns historical similarities between segments rather than relying on direct historical trends in the measure, because trends may be absent from the measure yet present in the similarities.

Translated: 2022-11-11 22:27:05 Published: 2020-12-04
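
The comparison in the abstract above is made in terms of Root-Mean-Square Error. As a reminder of what that metric computes, here is a minimal Python sketch. The flow values are invented for illustration and are unrelated to the paper's data or its normalisation.

```python
import math


def rmse(predicted, observed):
    """Root-mean-square error between two equally long sequences of numbers."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical predicted and observed flows on one road segment.
predicted_flow = [0.40, 0.55, 0.61]
observed_flow = [0.42, 0.50, 0.65]
error = rmse(predicted_flow, observed_flow)
```

Because the errors are squared before averaging, a few large misses raise RMSE more than many small ones.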

# Quantum State Tomography with Conditional Generative Adversarial Networks ( http://arxiv.org/abs/2008.03240v2 )

License: see the linked page

Shahnawaz Ahmed, Carlos Sánchez Muñoz, Franco Nori, Anton Frisk Kockum

Quantum state tomography (QST) is a challenging task in intermediate-scale quantum devices. Here, we apply conditional generative adversarial networks (CGANs) to QST. In the CGAN framework, two duelling neural networks, a generator and a discriminator, learn multi-modal models from data. We augment a CGAN with custom neural-network layers that enable conversion of output from any standard neural network into a physical density matrix. To reconstruct the density matrix, the generator and discriminator networks train each other on data using standard gradient-based methods. We demonstrate that our QST-CGAN reconstructs optical quantum states with high fidelity, orders of magnitude faster and from less data than a standard maximum-likelihood method. We also show that the QST-CGAN can reconstruct a quantum state in a single evaluation of the generator network if it has been pre-trained on similar quantum states.

Translated: 2022-11-02 01:56:31 Published: 2020-12-04
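
The abstract above mentions layers that turn arbitrary network output into a physical density matrix but does not say how. One common way to guarantee a Hermitian, positive semidefinite, unit-trace matrix is to form $\rho = T^\dagger T / \mathrm{Tr}(T^\dagger T)$ from an unconstrained complex matrix $T$. The Python sketch below assumes that parameterisation; it is an illustration and not necessarily the construction used in the paper.

```python
def density_matrix_from_raw(t):
    """Map an unconstrained square complex matrix T to rho = T^dagger T / Tr(T^dagger T)."""
    n = len(t)
    # (T^dagger T)[i][j] = sum_k conj(T[k][i]) * T[k][j]
    gram = [
        [sum(t[k][i].conjugate() * t[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]
    trace = sum(gram[i][i] for i in range(n)).real
    if trace == 0.0:
        raise ValueError("T must not be the zero matrix")
    return [[gram[i][j] / trace for j in range(n)] for i in range(n)]


# Hypothetical raw network output reshaped into a 2x2 complex matrix.
raw = [[1 + 0j, 2 - 1j], [0.5j, -1 + 0j]]
rho = density_matrix_from_raw(raw)
```

Because the constraints hold by construction, a gradient-based optimiser can update `raw` freely without ever leaving the set of valid quantum states.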

# From Rain Generation to Rain Removal ( http://arxiv.org/abs/2008.03580v2 )

License: see the linked page

Hong Wang, Zongsheng Yue, Qi Xie, Qian Zhao, Yefeng Zheng, Deyu Meng

For the single image rain removal (SIRR) task, the performance of deep learning (DL)-based methods is mainly affected by the designed deraining models and training datasets. Most current state-of-the-art methods focus on constructing powerful deep models to obtain better deraining results. In this paper, to further improve deraining performance, we instead approach the SIRR task from the perspective of training datasets by exploring a more efficient way to synthesize rainy images. Specifically, we build a full Bayesian generative model for rainy images where the rain layer is parameterized as a generator whose input is a set of latent variables representing the physical structural rain factors, e.g., direction, scale, and thickness. To solve this model, we employ the variational inference framework to approximate the expected statistical distribution of rainy images in a data-driven manner. With the learned generator, we can automatically and sufficiently generate diverse and non-repetitive training pairs so as to efficiently enrich and augment the existing benchmark datasets. A user study qualitatively and quantitatively evaluates the realism of the generated rainy images. Comprehensive experiments substantiate that the proposed model can faithfully extract the complex rain distribution, which not only helps significantly improve the deraining performance of current deep single image derainers, but also largely loosens the requirement of large training sample pre-collection for the SIRR task.

Translated: 2022-11-01 12:15:52 Published: 2020-12-04

# Parameter Sharing Exploration and Hetero-Center based Triplet Loss for Visible-Thermal Person Re-Identification ( http://arxiv.org/abs/2008.06223v2 )

License: see the linked page

Haijun Liu, Xiaoheng Tan and Xichuan Zhou

This paper focuses on the visible-thermal cross-modality person re-identification (VT Re-ID) task, whose goal is to match person images between the daytime visible modality and the nighttime thermal modality. The two-stream network is usually adopted to address the cross-modality discrepancy, the most challenging problem for VT Re-ID, by learning the multi-modality person features. In this paper, we explore how many parameters of the two-stream network should be shared, which is still not well investigated in the existing literature. By appropriately splitting the ResNet50 model to construct the modality-specific feature extracting network and the modality-sharing feature embedding network, we experimentally demonstrate the effect of parameter sharing in the two-stream network for VT Re-ID. Moreover, in the framework of part-level person feature learning, we propose the hetero-center based triplet loss to relax the strict constraint of the traditional triplet loss by replacing the comparison of an anchor to all the other samples with the comparison of an anchor center to all the other centers. By this extremely simple means, the proposed method can significantly improve VT Re-ID performance. The experimental results on two datasets show that our proposed method distinctly outperforms the state-of-the-art methods by large margins, especially on the RegDB dataset, where it achieves rank-1/mAP/mINP of 91.05%/83.28%/68.84%. It can be a new baseline for VT Re-ID, with a simple but effective strategy.

Translated: 2022-10-30 17:19:15 Published: 2020-12-04
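
The hetero-center idea in the abstract above, comparing an anchor center with other centers instead of comparing individual samples, can be sketched in a few lines of Python. The dictionary layout, the hardest-negative choice and the default margin are assumptions made for this illustration, and the paper's exact formulation may differ.

```python
import math


def center(features):
    """Mean feature vector of one identity in one modality."""
    n = len(features)
    return [sum(column) / n for column in zip(*features)]


def distance(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def hetero_center_triplet_loss(visible, thermal, margin=0.3):
    """Triplet loss over per-identity, per-modality centers.

    visible and thermal map each identity to a list of feature vectors.
    At least two identities are required so that negatives exist.
    """
    ids = sorted(visible)
    if len(ids) < 2:
        raise ValueError("need at least two identities")
    cv = {i: center(visible[i]) for i in ids}
    ct = {i: center(thermal[i]) for i in ids}
    loss = 0.0
    for i in ids:
        # Centers of every other identity, in both modalities, act as negatives.
        negatives = [c for j in ids if j != i for c in (cv[j], ct[j])]
        for anchor, positive in ((cv[i], ct[i]), (ct[i], cv[i])):
            d_pos = distance(anchor, positive)
            d_neg = min(distance(anchor, c) for c in negatives)
            loss += max(0.0, margin + d_pos - d_neg)
    return loss


# Hypothetical 2-d features for two identities.
visible = {"a": [[0.0, 0.0], [0.0, 2.0]], "b": [[10.0, 0.0], [10.0, 2.0]]}
thermal = {"a": [[1.0, 1.0]], "b": [[9.0, 1.0]]}
loss = hetero_center_triplet_loss(visible, thermal)
```

Working with centers reduces the number of comparisons from one per sample pair to one per identity and modality, which is the relaxation the abstract describes.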

# Breaking Symmetries of the Reservoir Equations in Echo State Networks ( http://arxiv.org/abs/2010.07103v2 )

License: see the linked page

Joschka Herteux, Christoph Räth

Reservoir computing has repeatedly been shown to be extremely successful in the prediction of nonlinear time-series. However, there is no complete understanding of the proper design of a reservoir yet. We find that the simplest popular setup has a harmful symmetry, which leads to the prediction of what we call a mirror-attractor. We prove this analytically. Similar problems can arise in a general context, and we use them to explain the success or failure of some designs. The symmetry is a direct consequence of the hyperbolic tangent activation function. Further, four ways to break the symmetry are compared numerically: a bias in the output, a shift in the input, a quadratic term in the readout, and a mixture of even and odd activation functions. Firstly, we test their susceptibility to the mirror-attractor. Secondly, we evaluate their performance on the task of predicting Lorenz data with the mean shifted to zero. The short-time prediction is measured with the forecast horizon, while the largest Lyapunov exponent and the correlation dimension are used to represent the climate. Finally, the same analysis is repeated on a combined dataset of the Lorenz attractor and the Halvorsen attractor, which we designed to reveal potential problems with symmetry. We find that all methods except the output bias are able to fully break the symmetry, with input shift and quadratic readout performing the best overall.

Translated: 2022-10-16 05:00:48 Published: 2020-12-04
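
The symmetry described in the abstract above follows from tanh being an odd function: in a bias-free update $r' = \tanh(A r + W_{in} u)$, negating the input sequence negates every reservoir state. The Python sketch below checks this numerically on a tiny reservoir and shows that a constant input shift removes the mirror relation. The matrices, the inputs and the reading of "shift in the input" as an added constant are assumptions for illustration.

```python
import math


def reservoir_step(a, w_in, state, u, input_shift=0.0):
    """One bias-free echo-state update: r' = tanh(A r + W_in (u + shift))."""
    n = len(state)
    return [
        math.tanh(sum(a[i][j] * state[j] for j in range(n)) + w_in[i] * (u + input_shift))
        for i in range(n)
    ]


def run_reservoir(a, w_in, inputs, input_shift=0.0):
    """Drive the reservoir from the zero state and return the final state."""
    state = [0.0] * len(a)
    for u in inputs:
        state = reservoir_step(a, w_in, state, u, input_shift)
    return state


# Hypothetical two-node reservoir and a short scalar input sequence.
A = [[0.1, -0.2], [0.3, 0.05]]
W_IN = [0.5, -0.3]
INPUTS = [1.0, -0.5, 0.25]
MIRRORED = [-u for u in INPUTS]

plain = run_reservoir(A, W_IN, INPUTS)
plain_mirror = run_reservoir(A, W_IN, MIRRORED)        # equals -plain, element by element
shifted = run_reservoir(A, W_IN, INPUTS, input_shift=0.5)
shifted_mirror = run_reservoir(A, W_IN, MIRRORED, input_shift=0.5)  # no longer -shifted
```

With a linear readout and no output bias, the mirrored states produce mirrored predictions, which is how a mirror-attractor can arise.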

# Streaming Graph Neural Networks via Continual Learning ( http://arxiv.org/abs/2009.10951v2 )

License: see the linked page

Junshan Wang, Guojie Song, Yi Wu, Liang Wang

Graph neural networks (GNNs) have achieved strong performance in various applications. In the real world, network data is usually formed in a streaming fashion. The distributions of patterns that refer to neighborhood information of nodes may shift over time. The GNN model needs to learn the new patterns that cannot yet be captured. But learning incrementally leads to the catastrophic forgetting problem, in which historical knowledge is overwritten by newly learned knowledge. Therefore, it is important to train the GNN model to learn new patterns and maintain existing patterns simultaneously, which few works focus on. In this paper, we propose a streaming GNN model based on continual learning so that the model is trained incrementally and up-to-date node representations can be obtained at each time step. Firstly, we design an approximation algorithm to detect newly arriving patterns efficiently based on information propagation. Secondly, we combine two perspectives of data replaying and model regularization for existing pattern consolidation. Specifically, a hierarchy-importance sampling strategy for nodes is designed and a weighted regularization term for GNN parameters is derived, achieving greater stability and generalization of knowledge consolidation. Our model is evaluated on real and synthetic data sets and compared with multiple baselines. The results of node classification show that our model can efficiently update model parameters and achieve comparable performance to model retraining. In addition, we also conduct a case study on the synthetic data, and carry out some specific analysis for each part of our model, illustrating its ability to learn new knowledge and maintain existing knowledge from different perspectives.

Translated: 2022-10-15 15:43:11 Published: 2020-12-04

# Interventional Few-Shot Learning ( http://arxiv.org/abs/2009.13000v2 )

License: see the linked page

Zhongqi Yue and Hanwang Zhang and Qianru Sun and Xian-Sheng Hua

We uncover a long-overlooked deficiency in the prevailing Few-Shot Learning (FSL) methods: the pre-trained knowledge is indeed a confounder that limits the performance. This finding is rooted in our causal assumption: a Structural Causal Model (SCM) for the causalities among the pre-trained knowledge, sample features, and labels. Building on it, we propose a novel FSL paradigm: Interventional Few-Shot Learning (IFSL). Specifically, we develop three effective IFSL algorithmic implementations based on the backdoor adjustment, which is essentially a causal intervention towards the SCM of many-shot learning: the upper-bound of FSL in a causal view. It is worth noting that the contribution of IFSL is orthogonal to existing fine-tuning and meta-learning based FSL methods, hence IFSL can improve all of them, achieving a new 1-/5-shot state-of-the-art on miniImageNet, tieredImageNet, and cross-domain CUB. Code is released at https://github.com/yue-zhongqi/ifsl.

Translated: 2022-10-13 21:14:16 Published: 2020-12-04
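
The backdoor adjustment named in the abstract above is the standard causal formula $P(Y \mid do(X)) = \sum_d P(Y \mid X, d)\,P(d)$, which averages stratum-wise predictions over the prior of the confounder rather than over its distribution given the sample. The Python sketch below applies that formula to made-up numbers. How the confounder strata are defined is left open here, and the function name and probabilities are assumptions rather than any of the paper's three implementations.

```python
def backdoor_adjust(p_y_given_x_d, p_d):
    """Compute P(Y | do(X)) = sum_d P(Y | X, d) * P(d) for one sample X.

    p_y_given_x_d: one class-probability list per confounder stratum d.
    p_d: prior probability of each stratum, summing to 1.
    """
    if len(p_y_given_x_d) != len(p_d):
        raise ValueError("one probability vector is needed per stratum")
    n_classes = len(p_y_given_x_d[0])
    return [
        sum(p_d[d] * p_y_given_x_d[d][c] for d in range(len(p_d)))
        for c in range(n_classes)
    ]


# Hypothetical two-class predictions within two strata of pre-trained knowledge.
per_stratum = [[0.9, 0.1], [0.2, 0.8]]
prior = [0.5, 0.5]
adjusted = backdoor_adjust(per_stratum, prior)
```

Weighting by the prior `p_d` rather than by a sample-dependent weight is what blocks the path from the confounder to the sample.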

# Hard Negative Mixing for Contrastive Learning ( http://arxiv.org/abs/2010.01028v2 )

License: see the linked page

Yannis Kalantidis, Mert Bulent Sariyildiz, Noe Pion, Philippe Weinzaepfel, Diane Larlus

Contrastive learning has become a key component of self-supervised learning approaches for computer vision. By learning to embed two augmented versions of the same image close to each other and to push the embeddings of different images apart, one can train highly transferable visual representations. As revealed by recent studies, heavy data augmentation and large sets of negatives are both crucial in learning such representations. At the same time, data mixing strategies either at the image or the feature level improve both supervised and semi-supervised learning by synthesizing novel examples, forcing networks to learn more robust features. In this paper, we argue that an important aspect of contrastive learning, i.e., the effect of hard negatives, has so far been neglected. To get more meaningful negative samples, current top contrastive self-supervised learning approaches either substantially increase the batch sizes, or keep very large memory banks; increasing the memory size, however, leads to diminishing returns in terms of performance. We therefore start by delving deeper into a top-performing framework and show evidence that harder negatives are needed to facilitate better and faster learning. Based on these observations, and motivated by the success of data mixing, we propose hard negative mixing strategies at the feature level, that can be computed on-the-fly with a minimal computational overhead. We exhaustively ablate our approach on linear classification, object detection and instance segmentation and show that employing our hard negative mixing procedure improves the quality of visual representations learned by a state-of-the-art self-supervised learning method.

Translated: 2022-10-12 01:07:58 Published: 2020-12-04
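
Feature-level mixing of hard negatives, as described in the abstract above, can be sketched as follows: rank the negatives by similarity to the query, keep the hardest few, and synthesize new negatives as normalised convex combinations of random pairs. This is a minimal Python illustration under those assumptions. The function names, the parameters `top_n`, `n_synth` and `seed`, and the uniform mixing coefficient are hypothetical and not taken from the authors' code.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    if norm == 0.0:
        raise ValueError("cannot normalise the zero vector")
    return [x / norm for x in v]


def hardest_negatives(query, negatives, top_n):
    """Return the top_n negatives most similar to the query (largest dot product)."""
    return sorted(negatives, key=lambda n: dot(query, n), reverse=True)[:top_n]


def mix_hard_negatives(query, negatives, top_n=4, n_synth=2, seed=0):
    """Synthesize n_synth negatives by mixing random pairs of the hardest ones."""
    rng = random.Random(seed)
    hard = hardest_negatives(query, negatives, top_n)
    if len(hard) < 2:
        raise ValueError("need at least two hard negatives to mix")
    synthetic = []
    for _ in range(n_synth):
        a, b = rng.sample(hard, 2)
        alpha = rng.random()  # mixing coefficient in [0, 1)
        mixed = [alpha * x + (1.0 - alpha) * y for x, y in zip(a, b)]
        synthetic.append(normalize(mixed))
    return synthetic


# Hypothetical unit-norm query and negative embeddings.
query = [1.0, 0.0]
negatives = [normalize(v) for v in [[1.0, 0.1], [0.9, 0.3], [-1.0, 0.0], [0.0, 1.0]]]
extra = mix_hard_negatives(query, negatives, top_n=2, n_synth=3)
```

The synthetic vectors can be appended to the negative set of the contrastive loss for that query, adding harder negatives without enlarging the batch or the memory bank.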

# Variational Dynamic Mixtures ( http://arxiv.org/abs/2010.10403v2 )

License: see the linked page

Chen Qiu, Stephan Mandt, Maja Rudolph

Deep probabilistic time series forecasting models have become an integral part of machine learning. While several powerful generative models have been proposed, we provide evidence that their associated inference models are oftentimes too limited and cause the generative model to predict mode-averaged dynamics. Mode averaging is problematic since many real-world sequences are highly multi-modal, and their averaged dynamics are unphysical (e.g., predicted taxi trajectories might run through buildings on the street map). To better capture multi-modality, we develop variational dynamic mixtures (VDM): a new variational family to infer sequential latent variables. The VDM approximate posterior at each time step is a mixture density network, whose parameters come from propagating multiple samples through a recurrent architecture. This results in an expressive multi-modal posterior approximation. In an empirical study, we show that VDM outperforms competing approaches on highly multi-modal datasets from different domains.

Translated: 2022-10-05 07:20:56 Published: 2020-12-04

# Digital Twins: 最先端のアート理論と実践,課題,オープンリサーチに関する質問

Digital Twins: State of the Art Theory and Practice, Challenges, and Open Research Questions ( http://arxiv.org/abs/2011.02833v3 )

ライセンス: Link先を確認

Angira Sharma, Edward Kosasih, Jie Zhang, Alexandra Brintrup, Anisoara Calinescu

(参考訳) Digital Twinは10年以上前に、リアルタイムモニタリング、シミュレーション、予測などのメリットを享受して、革新的なオールエンコンパスツールとして紹介された。しかし、デジタル双生児(DT)の理論的枠組みと実践的実装は、まだこのビジョンには程遠い。実装は成功したが、十分な実装の詳細は公開されていないため、それらの効果を評価し、比較し、DT方法論を共同で進めることは困難である。この研究は、様々なDT機能と現在のアプローチ、デジタルツインの実装と導入の遅れの背景にある欠点と理由を探求する。機械学習、モノのインターネット、ビッグデータの進歩は、そのリアルタイム監視と予測特性に関するdtの改善に大きく貢献している。この進歩と個々の企業ベースの取り組みにもかかわらず、この分野にはある種の研究ギャップがあり、この概念の普及が遅れている。この遅延の主な理由は、共通参照フレームワークの欠如、ドメイン依存、共有データのセキュリティ懸念、他の技術へのデジタルツインの依存、定量的メトリクスの欠如である。我々は、普遍参照フレームワークに必要なデジタル双生児の必要な構成要素を定義し、シミュレーションや自律システムといった同様の概念と比較して、その一意性を概念として検証する。この研究は、異なるドメインにおけるデジタルツインアプリケーションと、その中の機械学習とビッグデータの現状をさらに評価する。これにより、デジタル双生児の理論と実践をよりよく理解し前進させ、新しい研究課題に答え、特定することができる。

Digital Twin was introduced over a decade ago, as an innovative all-encompassing tool, with perceived benefits including real-time monitoring, simulation and forecasting. However, the theoretical framework and practical implementations of digital twins (DT) are still far from this vision. Although successful implementations exist, sufficient implementation details are not publicly available, therefore it is difficult to assess their effectiveness, draw comparisons and jointly advance the DT methodology. This work explores the various DT features and current approaches, the shortcomings and reasons behind the delay in the implementation and adoption of digital twin. Advancements in machine learning, internet of things and big data have contributed hugely to the improvements in DT with regards to its real-time monitoring and forecasting properties. Despite this progress and individual company-based efforts, certain research gaps exist in the field, which have caused delay in the widespread adoption of this concept. We reviewed relevant works and identified that the major reasons for this delay are the lack of a universal reference framework, domain dependence, security concerns of shared data, reliance of digital twin on other technologies, and lack of quantitative metrics. We define the necessary components of a digital twin required for a universal reference framework, which also validate its uniqueness as a concept compared to similar concepts like simulation, autonomous systems, etc. This work further assesses the digital twin applications in different domains and the current state of machine learning and big data in it. It thus answers and identifies novel research questions, both of which will help to better understand and advance the theory and practice of digital twins.

翻訳日:2022-09-30 13:07:07 公開日:2020-12-04

# 複数の自己教師付き補助タスクを持つグラフベースニューラルネットワークモデル

Graph-Based Neural Network Models with Multiple Self-Supervised Auxiliary Tasks ( http://arxiv.org/abs/2011.07267v2 )

ライセンス: Link先を確認

Franco Manessi, Alessandro Rozza

(参考訳) ニューラルネットワークは大量のラベルのないデータから堅牢な表現を学習できるため、自己教師付き学習が注目されている。さらに、マルチタスク学習は、関連するタスクを同時にトレーニングするネットワークによる表現学習をさらに改善し、パフォーマンスが大幅に向上する。本稿では,グラフベースのニューラルネットワークモデルをマルチタスクで学習するための3つの補助タスクを提案する。グラフ畳み込みネットワークは構造化データポイント間の関係を捉えるための最も有望な手法であるので,標準的な半教師付きグラフ分類タスクにおける競合結果を達成するための構築ブロックとして利用する。

Self-supervised learning is currently gaining a lot of attention, as it allows neural networks to learn robust representations from large quantities of unlabeled data. Additionally, multi-task learning can further improve representation learning by training networks simultaneously on related tasks, leading to significant performance improvements. In this paper, we propose three novel self-supervised auxiliary tasks to train graph-based neural network models in a multi-task fashion. Since Graph Convolutional Networks are among the most promising approaches for capturing relationships among structured data points, we use them as a building block to achieve competitive results on standard semi-supervised graph classification tasks.

翻訳日:2022-09-25 13:46:06 公開日:2020-12-04

# ボリュームセンサを用いた協調認識と知覚のための特徴共有と統合

Feature Sharing and Integration for Cooperative Cognition and Perception with Volumetric Sensors ( http://arxiv.org/abs/2011.08317v3 )

ライセンス: Link先を確認

Ehsan Emad Marvasti, Arash Raftari, Amir Emad Marvasti, Yaser P.Fallah, Rui Guo, Hongsheng Lu

(参考訳) 近年の計算・通信システムの進歩により、高性能ニューラルネットワークと高速無線車両通信ネットワークが導入されている。その結果、協調的知覚や認知といった新しい技術が登場し、部分的に遮蔽されたターゲットの検出とセンシング範囲の拡大のためのソリューションを提供することで、感覚デバイスの固有の制限に対処している。しかし、信頼できる協調認識システムを設計するには、限られたネットワークリソースと異なるソースが共有するデータ間の不一致に起因する課題に対処する必要がある。本稿では,異なる協調認識技術の要件,限界,性能について検討し,Deep Feature Sharing(DFS)の概念の詳細な分析を行う。我々は,異なる協調物体検出設計を探求し,その性能を平均精度で評価する。実験にはVolonyデータセットを使用します。その結果,DFS法はGPSノイズによる局所化誤差にかなり敏感であることがわかった。さらに,より協調的な参加者の追加によるDFS手法の検出ゲインは生の情報共有技術に匹敵するものであり,DFSは通信要求を満たすための設計の柔軟性を実現する。

The recent advancement in computational and communication systems has led to the introduction of high-performing neural networks and high-speed wireless vehicular communication networks. As a result, new technologies such as cooperative perception and cognition have emerged, addressing the inherent limitations of sensory devices by providing solutions for the detection of partially occluded targets and expanding the sensing range. However, designing a reliable cooperative cognition or perception system requires addressing the challenges caused by limited network resources and discrepancies between the data shared by different sources. In this paper, we examine the requirements, limitations, and performance of different cooperative perception techniques, and present an in-depth analysis of the notion of Deep Feature Sharing (DFS). We explore different cooperative object detection designs and evaluate their performance in terms of average precision. We use the Volony dataset for our experimental study. The results confirm that the DFS methods are significantly less sensitive to the localization error caused by GPS noise. Furthermore, the results attest that detection gain of DFS methods caused by adding more cooperative participants in the scenes is comparable to raw information sharing technique while DFS enables flexibility in design toward satisfying communication requirements.

翻訳日:2022-09-24 23:21:18 公開日:2020-12-04

# FROST: より高速でロバストなワンショットセミ教師トレーニング

FROST: Faster and more Robust One-shot Semi-supervised Training ( http://arxiv.org/abs/2011.09471v4 )

ライセンス: Link先を確認

Helena E. Liu and Leslie N. Smith

(参考訳) 半教師付き一発学習の最近の進歩は、新しい応用の深層学習の障壁を低くしている。しかしながら、半教師付き学習の最先端はトレーニングが遅く、ラベル付きデータとハイパーパラメータの値の選択に敏感である。本稿では,一対一の半教師付き学習手法を提案する。具体的には,半教師付き学習と,自己学習の1段階の単一ネットワークバージョンを組み合わせることで,より高速に学習し,ラベル付きサンプルの選択やハイパーパラメータの変更に対して頑健であることを示す。実験では,ラベルなしデータの構成が不明な場合,すなわちラベルなしデータが各クラスの不等数を含み,トレーニングクラスに属さない分布外例を含む場合において,frostがうまく機能することを示す。ハイパフォーマンス、トレーニング速度、ハイパーパラメータに対する感度はFROSTを一発半教師あり訓練の最も実用的な方法である。私たちのコードはhttps://github.com/helenaeliu/frostで利用可能です。

Recent advances in one-shot semi-supervised learning have lowered the barrier for deep learning of new applications. However, the state-of-the-art for semi-supervised learning is slow to train and the performance is sensitive to the choices of the labeled data and hyper-parameter values. In this paper, we present a one-shot semi-supervised learning method that trains up to an order of magnitude faster and is more robust than state-of-the-art methods. Specifically, we show that by combining semi-supervised learning with a one-stage, single network version of self-training, our FROST methodology trains faster and is more robust to choices for the labeled samples and changes in hyper-parameters. Our experiments demonstrate FROST's capability to perform well when the composition of the unlabeled data is unknown; that is when the unlabeled data contain unequal numbers of each class and can contain out-of-distribution examples that don't belong to any of the training classes. High performance, speed of training, and insensitivity to hyper-parameters make FROST the most practical method for one-shot semi-supervised training. Our code is available at https://github.com/HelenaELiu/FROST.

翻訳日:2022-09-24 03:20:03 公開日:2020-12-04

# 適合性問題に関するカテゴリー探索データ分析

Categorical exploratory data analysis on goodness-of-fit issues ( http://arxiv.org/abs/2011.09682v2 )

ライセンス: Link先を確認

Sabrina Enriquez, Fushing Hsieh

(参考訳) ジョージ・ボックス(george box)は、データ分析において、特に現実世界のデータ分析において、引き続き真であり続けるならば、この知恵を可視で説明可能なデータ駆動型パターンで注釈すべきである。このようなアノテーションは、データ分析アプローチとしての統計モデリングの限界だけでなく、妥当性にも価値ある光を当てることができる。実データを潜在的に到達不能あるいは非現実的な理論構造に保持することを避けるため、我々はカテゴリ探索データ分析(ceda)と呼ばれるデータ分析パラダイムを活用すべきである。本提案のメリットを,適合性の観点から,実世界の2つのデータセットを用いて説明する。どちらのデータセットでも、通常の分布のベル形状は一見するとかなりよく合っているように見える。 CEDAを適用して、各データがモデル形状に適合するか、どのようにずれるのかを、いくつかの重要な分布面を通して明らかにする。また、CEDA は木に基づく p-値のバージョンを利用できることを実証し、従来の統計的アプローチに基づく p-値と比較する。データ分析とともに、データサイエンス教育におけるデータ分析の第一手段としてcedaを使用する利点を照らし出すために、グラフィックディスプレイの作成に計算の努力を注ぐ。

If the aphorism "All models are wrong" (George Box) continues to be true in data analysis, particularly when analyzing real-world data, then we should annotate this wisdom with visible and explainable data-driven patterns. Such annotations can shed invaluable light on the validity as well as the limitations of statistical modeling as a data analysis approach. In an effort to avoid holding our real data to potentially unattainable or even unrealistic theoretical structures, we propose to utilize the data analysis paradigm called Categorical Exploratory Data Analysis (CEDA). We illustrate the merits of this proposal with two real-world data sets from the perspective of goodness-of-fit. In both data sets, the Normal distribution's bell shape seemingly fits rather well at first glance. We apply CEDA to bring out where and how each data set fits or deviates from the model shape via several important distributional aspects. We also demonstrate that CEDA affords a version of tree-based p-value, and compare it with p-values based on traditional statistical approaches. Throughout our data analysis, we invest computational effort in making graphic displays to illuminate the advantages of using CEDA as one primary way of data analysis in Data Science education.

翻訳日:2022-09-23 20:53:11 公開日:2020-12-04

# テトラアドラルキラル性を有する分子のメッセージパッシングネットワーク

Message Passing Networks for Molecules with Tetrahedral Chirality ( http://arxiv.org/abs/2012.00094v2 )

ライセンス: Link先を確認

Lagnajit Pattanaik, Octavian-Eugen Ganea, Ian Coley, Klavs F. Jensen, William H. Green, Connor W. Coley

(参考訳) 同一のグラフ接続を持つ分子は、立体化学、空間構造特性を示すと、物理的および生物学的性質が異なる。しかし、分子構造から構造とプロパティの関係を学ぶために設計された現代のニューラルネットワークは、分子をグラフ構造データとして扱うため、立体化学に不変である。そこで我々は, 四面体キラル性をもつ分子の性質を学習するための, メッセージパッシングニューラルネットワークのための2つのカスタムアグリゲーション関数を開発した。合成データと新規なタンパク質-リガンドドッキングデータセットの性能を薬物発見との関連性で評価した。その結果、ベースラインの総和アグリゲータよりも微妙な改善が見られ、さらなるアーキテクチャ開発の機会が浮かび上がっている。

Molecules with identical graph connectivity can exhibit different physical and biological properties if they exhibit stereochemistry, a spatial structural characteristic. However, modern neural architectures designed for learning structure-property relationships from molecular structures treat molecules as graph-structured data and therefore are invariant to stereochemistry. Here, we develop two custom aggregation functions for message passing neural networks to learn properties of molecules with tetrahedral chirality, one common form of stereochemistry. We evaluate performance on synthetic data as well as a newly-proposed protein-ligand docking dataset with relevance to drug discovery. Results show modest improvements over a baseline sum aggregator, highlighting opportunities for further architecture development.
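To illustrate why a sum aggregator cannot see chirality, here is a small sketch. It is not one of the paper's two aggregation functions: the feature vectors are made up, and `permutation_parity` is only one hypothetical example of a signal that changes when two substituents around a stereocenter are swapped.

```python
def sum_aggregate(neighbor_features):
    # Element-wise sum over neighbors; the neighbor order has no effect.
    return [sum(column) for column in zip(*neighbor_features)]


def permutation_parity(order):
    # +1 for an even permutation of 0..n-1, -1 for an odd one,
    # computed by counting inversions.
    inversions = sum(
        1
        for i in range(len(order))
        for j in range(i + 1, len(order))
        if order[i] > order[j]
    )
    return 1 if inversions % 2 == 0 else -1


# Four hypothetical substituent feature vectors around one stereocenter.
neighbors = [[1.0, 0.0], [0.0, 1.0], [2.0, 1.0], [0.5, 0.5]]
reference_order = [0, 1, 2, 3]
mirrored_order = [1, 0, 2, 3]  # swapping two substituents inverts the center

ref = sum_aggregate([neighbors[i] for i in reference_order])
mir = sum_aggregate([neighbors[i] for i in mirrored_order])

print(ref == mir)  # the sum aggregator gives the same message for both
print(permutation_parity(reference_order), permutation_parity(mirrored_order))
```

The two orderings yield identical summed messages but opposite parities, so any chirality-aware aggregator has to depend on neighbor order in some way that a plain sum does not.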

翻訳日:2022-09-21 14:21:32 公開日:2020-12-04

# (参考訳) ディープラーニングのスケールダウン

Scaling down Deep Learning ( http://arxiv.org/abs/2011.14439v3 )

ライセンス: CC BY 4.0

Sam Greydanus

(参考訳) 深層学習モデルは商業的・政治的に関係しているが、その訓練と運用の多くの側面はいまだに理解されていない。これは"深層学習の科学"プロジェクトへの関心を呼び起こし、その多くが大規模に実行され、膨大な時間、お金、電気を必要とする。しかし、この研究はどの程度大規模に行われる必要があるのか? 本稿では,従来のディープラーニングベンチマークに代わる最小限,低メモリ,低スループットのMNIST-1Dを提案する。トレーニングの例はMNISTの例の20倍小さいが、線形モデル、非線形モデル、畳み込みモデル、それぞれ32、68、94%の精度で区別する(これらのモデルはMNISTで94、99+、99+%を得る)。次に,宝くじの空間的インダクティブバイアスの測定,ディープダブル降下の観察,アクティベーション関数のメタラーニングといったユースケースを示す。

Though deep learning models have taken on commercial and political relevance, many aspects of their training and operation remain poorly understood. This has sparked interest in "science of deep learning" projects, many of which are run at scale and require enormous amounts of time, money, and electricity. But how much of this research really needs to occur at scale? In this paper, we introduce MNIST-1D: a minimalist, low-memory, and low-compute alternative to classic deep learning benchmarks. The training examples are 20 times smaller than MNIST examples yet they differentiate more clearly between linear, nonlinear, and convolutional models which attain 32, 68, and 94% accuracy respectively (these models obtain 94, 99+, and 99+% on MNIST). Then we present example use cases which include measuring the spatial inductive biases of lottery tickets, observing deep double descent, and metalearning an activation function.

翻訳日:2021-06-07 09:40:46 公開日:2020-12-04

# (参考訳) 遅延とエネルギー制約を考慮した非同期モバイルエッジ学習のためのタスク割当

Task Allocation for Asynchronous Mobile Edge Learning with Delay and Energy Constraints ( http://arxiv.org/abs/2012.00143v2 )

ライセンス: CC BY 4.0

Umair Mohammad, Sameh Sorour, Mohamed Hefeida

(参考訳) 本稿では、リソース制約された無線エッジネットワークを介して接続された複数のエッジノードまたは学習者間で非同期に機械学習モデルをトレーニングするための最適なタスク割り当てスキームを設計し、"モバイルエッジラーニング(MEL)"のパラダイムを拡張した。最適化は、各学習者に割り当てられたタスクの一部が、所定の大域的遅延制約と局所的な最大エネルギー消費限界内で完了するように行われる。消費される時間とエネルギーは、学習者の不均一なコミュニケーションと計算能力に直接関係している。提案するモデルは異種性認識(HA)である。結果の最適化はNP-hard quadratically-constrained integer linear program (QCILP) であるので、緩和された同期問題の解を用いて2段階のsugg-and-improve (SAI) 解を提案し、非同期問題の解を求める。提案したHA非同期(HA-Asyn)アプローチは、HA同期(HA-Sync)スキームとHU同値バッチ割り当てスキームと比較する。 20名の学習者が様々な完了時間とエネルギー消費制約をテストした結果,提案手法はhu同期/asynchronous (hu-sync/asyn) 法よりも優れており,ha-sync法と比較して最大25\%の利得が得られることがわかった。

This paper extends the paradigm of "mobile edge learning (MEL)" by designing an optimal task allocation scheme for training a machine learning model in an asynchronous manner across multiple edge nodes or learners connected via a resource-constrained wireless edge network. The optimization is done such that the portion of the task allotted to each learner is completed within a given global delay constraint and a local maximum energy consumption limit. The time and energy consumed are related directly to the heterogeneous communication and computational capabilities of the learners; i.e. the proposed model is heterogeneity aware (HA). Because the resulting optimization is an NP-hard quadratically-constrained integer linear program (QCILP), a two-step suggest-and-improve (SAI) solution is proposed based on using the solution of the relaxed synchronous problem to obtain the solution to the asynchronous problem. The proposed HA asynchronous (HA-Asyn) approach is compared against the HA synchronous (HA-Sync) scheme and the heterogeneity unaware (HU) equal batch allocation scheme. Results from a system of 20 learners tested for various completion time and energy consumption constraints show that the proposed HA-Asyn method works better than the HU synchronous/asynchronous (HU-Sync/Asyn) approach and can provide gains of up to 25% compared to the HA-Sync scheme.

翻訳日:2021-06-06 16:42:53 公開日:2020-12-04

# 顔の超解像に対する空間的注意の学習

Learning Spatial Attention for Face Super-Resolution ( http://arxiv.org/abs/2012.01211v2 )

ライセンス: Link先を確認

Chaofeng Chen, Dihong Gong, Hao Wang, Zhifeng Li, Kwan-Yee K. Wong

(参考訳) 一般画像超解像技術は、低解像度の顔画像に適用する場合、詳細な顔構造を復元することが困難である。近年,顔画像に適した深層学習手法は,顔解析やランドマーク予測などのタスクを共同で訓練することで,性能の向上を実現している。しかし、マルチタスク学習には追加のラベル付きデータが必要である。さらに、既存の作品の多くは比較的低解像度の顔画像しか生成できない(例えば、128\times128$)ため、その用途は限られている。本稿では,顔超解像のための新しい顔注意ユニット (FAU) 上に構築したSPatial Attention Residual Network (SPARNet) を紹介する。具体的には,バニラ残留ブロックに空間的注意機構を導入する。これにより、畳み込み層は、キーとなる顔構造に関連する機能を適応的にブートストラップし、より機能豊富な領域に注意を払わないことができる。これにより、キーとなる顔構造が顔画像のごく一部を占めるだけで、トレーニングはより効果的で効率的なものになる。注意マップの可視化は、非常に低解像度の顔であっても、我々の空間的注意ネットワークがキーフェイス構造をうまく捉えることができることを示している。各種指標(psnr, ssim, アイデンティティ類似性, ランドマーク検出など)の定量的比較により, 現状よりも優れた手法が得られた。さらにSPARNetをSPARNetHDと呼ばれるマルチスケールの識別器で拡張し、高解像度な結果(512\times512$)を生成する。合成データを用いて訓練したSPARNetHDは、合成劣化顔画像に対して高品質で高解像度な出力を生成するだけでなく、現実の低画質顔画像に対して優れた一般化能力を示すことを示す。

General image super-resolution techniques have difficulties in recovering detailed face structures when applied to low resolution face images. Recent deep learning based methods tailored for face images have achieved improved performance by being jointly trained with additional tasks such as face parsing and landmark prediction. However, multi-task learning requires extra manually labeled data. Besides, most of the existing works can only generate relatively low resolution face images (e.g., $128\times128$), and their applications are therefore limited. In this paper, we introduce a novel SPatial Attention Residual Network (SPARNet) built on our newly proposed Face Attention Units (FAUs) for face super-resolution. Specifically, we introduce a spatial attention mechanism to the vanilla residual blocks. This enables the convolutional layers to adaptively bootstrap features related to the key face structures and pay less attention to those less feature-rich regions. This makes the training more effective and efficient as the key face structures only account for a very small portion of the face image. Visualization of the attention maps shows that our spatial attention network can capture the key face structures well even for very low resolution faces (e.g., $16\times16$). Quantitative comparisons on various kinds of metrics (including PSNR, SSIM, identity similarity, and landmark detection) demonstrate the superiority of our method over the current state of the art. We further extend SPARNet with multi-scale discriminators, named SPARNetHD, to produce high resolution results (i.e., $512\times512$). We show that SPARNetHD trained with synthetic data can not only produce high quality and high resolution outputs for synthetically degraded face images, but also show good generalization ability to real world low quality face images.

翻訳日:2021-05-25 03:57:35 公開日:2020-12-04

# rotnet:畳み込みニューラルネットワークを用いた恒星回転周期の高速かつスケーラブルな推定

RotNet: Fast and Scalable Estimation of Stellar Rotation Periods Using Convolutional Neural Networks ( http://arxiv.org/abs/2012.01985v2 )

ライセンス: Link先を確認

J. Emmanuel Johnson, Sairam Sundaresan, Tansu Daylan, Lisseth Gavilan, Daniel K. Giles, Stela Ishitani Silva, Anna Jungbluth, Brett Morris, Andr\'es Mu\~noz-Jaramillo

(参考訳) 恒星の磁気活動は、望遠鏡が観測する明るさを調節する表面の暗い斑点として現れる。これらの光度曲線は恒星の回転に関する重要な情報を含んでいる。しかしながら、回転周期の正確な推定は、基底真理情報、ノイズデータ、そして縮退解につながる大きなパラメータ空間のために計算的に高価である。深層学習のパワーを活かし,ケプラー光曲線からの恒星回転周期の後退に畳み込みニューラルネットワークを応用した。光曲線の画像変換のための幾何学保存時系列は、転送学習によって訓練されたResNet-18アーキテクチャへの入力として機能する。 mcquillan catalog of published rotation periodsはansatz to groundtruthとして使われている。我々は,この手法の性能を,ランダムフォレスト回帰器,1次元CNN,自動相関関数(ACF)に対してベンチマークし,回転周期を推定する。入力を少ないデータポイント(1k)に制限したものの、モデルはより正確な結果をもたらし、同じ数のデータポイントでacfが動作し、acfが65kのデータポイントで実行するよりも10000倍高速に動作します。最小限の機能エンジニアリングだけで、我々のアプローチは印象的な精度を持ち、より大規模な恒星パラメータの回帰にディープラーニングの適用を動機付けます。

Magnetic activity in stars manifests as dark spots on their surfaces that modulate the brightness observed by telescopes. These light curves contain important information on stellar rotation. However, the accurate estimation of rotation periods is computationally expensive due to scarce ground truth information, noisy data, and large parameter spaces that lead to degenerate solutions. We harness the power of deep learning and successfully apply Convolutional Neural Networks to regress stellar rotation periods from Kepler light curves. Geometry-preserving time-series to image transformations of the light curves serve as inputs to a ResNet-18 based architecture which is trained through transfer learning. The McQuillan catalog of published rotation periods is used as an ansatz for the ground truth. We benchmark the performance of our method against a random forest regressor, a 1D CNN, and the Auto-Correlation Function (ACF), the current standard to estimate rotation periods. Despite limiting our input to fewer data points (1k), our model yields more accurate results and runs 350 times faster than ACF runs on the same number of data points and 10,000 times faster than ACF runs on 65k data points. With only minimal feature engineering our approach has impressive accuracy, motivating the application of deep learning to regress stellar parameters on an even larger scale.
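For context on the ACF baseline named in the abstract above, the following is a minimal sketch of period estimation by autocorrelation. It is an illustration under strong assumptions (a noise-free synthetic sinusoid standing in for a light curve, and a hand-picked lag window), not the pipeline benchmarked in the paper; `estimate_period`, `min_lag` and `max_lag` are hypothetical names.

```python
import math


def autocorrelation(series, lag):
    # Normalized (biased) sample autocorrelation at a given lag.
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    denom = sum(c * c for c in centered)
    num = sum(centered[i] * centered[i + lag] for i in range(n - lag))
    return num / denom


def estimate_period(series, min_lag, max_lag):
    # Return the lag in [min_lag, max_lag] with the highest autocorrelation.
    return max(
        range(min_lag, max_lag + 1),
        key=lambda lag: autocorrelation(series, lag),
    )


# Synthetic "light curve": a pure sinusoid with a 25-sample period.
true_period = 25
flux = [math.sin(2 * math.pi * t / true_period) for t in range(500)]
estimated = estimate_period(flux, min_lag=5, max_lag=40)
print(estimated)
```

Each lag costs a pass over the series, so scanning many lags on long light curves grows expensive, which is the kind of cost a learned regressor avoids at inference time.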

翻訳日:2021-05-25 03:43:01 公開日:2020-12-04

# スパースイジングモデルのサンプル効率l0-l2制約構造学習

Sample-efficient L0-L2 constrained structure learning of sparse Ising models ( http://arxiv.org/abs/2012.01744v2 )

ライセンス: Link先を確認

Antoine Dedieu, Miguel L\'azaro-Gredilla, Dileep George

(参考訳) スパースイジングモデルの基盤となるグラフを$n$ i.i.dから$p$ノードで学習する問題を考察する。サンプル最新の最も優れた手法は、経験的損失(ロジスティック回帰損失または相互作用スクリーニング損失)と正規化器(L1ペナルティまたはL1制約)を組み合わせることである。これにより、グラフの各ノードごとに別々に解くことができる凸問題が発生する。本研究では, 濃度制約 L0 ノルムを利用して, 空間性を適切に誘導し, さらに L2 ノルムと組み合わせて非零係数をモデル化する。本稿では,論文で研究されているグラフトポロジの貧弱度と完全回復率の急激な相転移をL1系と比較することにより,理論上, (a) 回復保証のための新しい最先端の上限に達し, (b) 実験的に, サンプルの複雑さの向上を図っている。

We consider the problem of learning the underlying graph of a sparse Ising model with $p$ nodes from $n$ i.i.d. samples. The most recent and best performing approaches combine an empirical loss (the logistic regression loss or the interaction screening loss) with a regularizer (an L1 penalty or an L1 constraint). This results in a convex problem that can be solved separately for each node of the graph. In this work, we leverage the cardinality constraint L0 norm, which is known to properly induce sparsity, and further combine it with an L2 norm to better model the non-zero coefficients. We show that our proposed estimators achieve an improved sample complexity, both (a) theoretically -- by reaching new state-of-the-art upper bounds for recovery guarantees -- and (b) empirically -- by showing sharper phase transitions between poor and full recovery for graph topologies studied in the literature -- when compared to their L1-based counterparts.
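The two constraints named in the abstract above can be made concrete with a short sketch. This only illustrates the constraint set (at most `k` non-zero coefficients, bounded L2 norm) and hard thresholding as the Euclidean projection onto the L0 part; it is not the authors' estimator or solver, and `project_l0`, `satisfies_constraints`, `k` and `radius` are hypothetical names.

```python
import math


def project_l0(weights, k):
    # Keep the k entries of largest magnitude and zero out the rest
    # (hard thresholding).
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    keep = set(order[:k])
    return [w if i in keep else 0.0 for i, w in enumerate(weights)]


def satisfies_constraints(weights, k, radius):
    # True when the vector has at most k non-zeros and L2 norm <= radius.
    nonzeros = sum(1 for w in weights if w != 0.0)
    l2_norm = math.sqrt(sum(w * w for w in weights))
    return nonzeros <= k and l2_norm <= radius


# Hypothetical coupling estimates for one node of the graph.
weights = [0.1, -2.0, 0.05, 1.5, -0.3]
sparse = project_l0(weights, k=2)
print(sparse)
print(satisfies_constraints(weights, k=2, radius=3.0))
print(satisfies_constraints(sparse, k=2, radius=3.0))
```

In a per-node formulation like the one the abstract describes, the surviving non-zero entries would correspond to the estimated neighbors of that node.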

翻訳日:2021-05-23 15:08:53 公開日:2020-12-04

# 単発経路統合パンオプティカルセグメンテーション

Single-shot Path Integrated Panoptic Segmentation ( http://arxiv.org/abs/2012.01632v2 )

ライセンス: Link先を確認

Sukjun Hwang, Seoung Wug Oh, Seon Joo Kim

(参考訳) インスタンスのセグメンテーションとセマンティックセグメンテーションを統一する新しいタスクであるpanoptic segmentationが最近注目を集めている。しかしながら、従来の手法のほとんどは、指定された分割タスクに特化した経路ごとに複数の経路から構成される。本稿では,実行フローを統合することで,単ショットでのパノプティカルセグメンテーションを解決することを提案する。統合された経路では、panoptic-featureと呼ばれる統合機能マップが生成され、物と物の両方の情報が含まれている。パノプティカル・フィーチャーは、同じインスタンスに属するクラスタピクセルを誘導し、異なるクラスのオブジェクトを区別する補助的な問題によってより洗練される。各フィルタが物または物を表す畳み込みフィルタのコレクションは、一度にパンオプティカル機能に適用され、シングルショットのパンオプティカルセグメンテーションが実現される。トップダウンとボトムアップの両方のアプローチの利点を生かして、SPINetと呼ばれる手法は、COCOとCityscapesという主要な汎視的セグメンテーションベンチマークにおいて、高い効率と精度を享受する。

Panoptic segmentation, which is a novel task of unifying instance segmentation and semantic segmentation, has attracted a lot of attention lately. However, most of the previous methods are composed of multiple pathways with each pathway specialized to a designated segmentation task. In this paper, we propose to resolve panoptic segmentation in single-shot by integrating the execution flows. With the integrated pathway, a unified feature map called Panoptic-Feature is generated, which includes the information of both things and stuffs. Panoptic-Feature becomes more sophisticated by auxiliary problems that guide to cluster pixels that belong to the same instance and differentiate between objects of different classes. A collection of convolutional filters, where each filter represents either a thing or stuff, is applied to Panoptic-Feature at once, materializing the single-shot panoptic segmentation. Taking the advantages of both top-down and bottom-up approaches, our method, named SPINet, enjoys high efficiency and accuracy on major panoptic segmentation benchmarks: COCO and Cityscapes.

翻訳日:2021-05-23 14:59:49 公開日:2020-12-04

# 教師なし3次元セグメンテーションのための双曲表現の学習

Learning Hyperbolic Representations for Unsupervised 3D Segmentation ( http://arxiv.org/abs/2012.01644v2 )

ライセンス: Link先を確認

Joy Hsu, Jeffrey Gu, Gong-Her Wu, Wah Chiu, Serena Yeung

(参考訳) 複雑なボリュームデータには教師なしの3Dセグメンテーションが必要であり、特にアノテーションの能力が制限されている場合や新しいカテゴリの発見が望まれている場合などである。 3次元ボリュームデータの多くは本質的に階層的であるという観察から,双曲型潜在空間を持つ変分型オートエンコーダ(VAE)と,3次元画像内の階層構造をより良くモデル化したジャイロプレーン畳み込み層を用いて,教師なしセグメンテーションのための3次元パッチの効果的な表現を学習することを提案する。また,階層的三重項損失とマルチスケールパッチサンプリングスキームを導入し,粒度の異なるレベル間の関係を埋め込む。階層型玩具データセット,BraTS全腫瘍データセット,低温電子顕微鏡データを用いた非教師なし3次元セグメンテーションにおけるハイパーボリック表現の有効性を実証した。

There exists a need for unsupervised 3D segmentation on complex volumetric data, particularly when annotation ability is limited or discovery of new categories is desired. Using the observation that much of 3D volumetric data is innately hierarchical, we propose learning effective representations of 3D patches for unsupervised segmentation through a variational autoencoder (VAE) with a hyperbolic latent space and a proposed gyroplane convolutional layer, which better models the underlying hierarchical structure within a 3D image. We also introduce a hierarchical triplet loss and multi-scale patch sampling scheme to embed relationships across varying levels of granularity. We demonstrate the effectiveness of our hyperbolic representations for unsupervised 3D segmentation on a hierarchical toy dataset, BraTS whole tumor dataset, and cryogenic electron microscopy data.

翻訳日:2021-05-23 14:58:17 公開日:2020-12-04

# 物体検出のためのデュアルリファインメント特徴ピラミッドネットワーク

Dual Refinement Feature Pyramid Networks for Object Detection ( http://arxiv.org/abs/2012.01733v2 )

ライセンス: Link先を確認

Jialiang Ma, Bin Chen

(参考訳) FPNは、オブジェクト検出器で使われる一般的なコンポーネントであり、隣り合うレベルの補間と和によって、マルチスケール情報を補う。しかし、非線形演算と異なる出力次元の畳み込み層が存在するため、異なるレベル間の関係はより複雑であり、ピクセルワイズ和は効率的なアプローチではない。本稿では,まず,画素レベルと特徴マップレベルからの設計欠陥を分析する。そこで我々はDual Refinement Feature Pyramid Networks (DRFPN) と呼ばれる新しいパラメータフリー特徴ピラミッドネットワークを設計した。具体的には、DRFPNはSRB(Spatial Refinement Block)とCRB(Channel Refinement Block)の2つのモジュールで構成される。 srbは隣接するレベル間の文脈情報に基づいてサンプリングポイントの位置と内容を学ぶ。 CRBはアテンション機構に基づく適応チャネルマージ法を学習する。提案するRFPNは,既存のFPNモデルに容易に接続できる。ベルとホイッスルがなければ、2段階検出器では、COCO検出ベンチマークでは1.6から2.2AP、COCOセグメンテーションベンチマークでは1.5から1.9APで異なるFPNベースのモデルよりも優れている。 1段階検出器では、ResNet50をバックボーンとして使用する場合、DRFPNはアンカーベースRetinaNetを1.9 AP、アンカーフリーFCOSを1.3 AP改善する。 DRFPNの強靭性と一般化能力を検証する。コードは公開される予定だ。

FPN is a common component used in object detectors; it supplements multi-scale information by interpolating and summing features from adjacent levels. However, due to the existence of nonlinear operations and convolutional layers with different output dimensions, the relationship between different levels is much more complex, and pixel-wise summation is not an efficient approach. In this paper, we first analyze the design defects at the pixel level and the feature map level. Then, we design a novel parameter-free feature pyramid network named Dual Refinement Feature Pyramid Networks (DRFPN) to address these problems. Specifically, DRFPN consists of two modules: Spatial Refinement Block (SRB) and Channel Refinement Block (CRB). SRB learns the location and content of sampling points based on contextual information between adjacent levels. CRB learns an adaptive channel merging method based on an attention mechanism. Our proposed DRFPN can be easily plugged into existing FPN-based models. Without bells and whistles, for two-stage detectors, our model outperforms different FPN-based counterparts by 1.6 to 2.2 AP on the COCO detection benchmark, and 1.5 to 1.9 AP on the COCO segmentation benchmark. For one-stage detectors, DRFPN improves anchor-based RetinaNet by 1.9 AP and anchor-free FCOS by 1.3 AP when using ResNet50 as backbone. Extensive experiments verify the robustness and generalization ability of DRFPN. The code will be made publicly available.
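The baseline FPN merge that the abstract criticizes (interpolate the coarser level, then add it pixel-wise to the finer one) can be sketched in one dimension. This is a deliberately simplified illustration: real FPNs operate on 2-D multi-channel maps with lateral convolutions, and the names `upsample_nearest` and `top_down_merge` are made up for the example. It shows the plain summation, not DRFPN's SRB or CRB modules.

```python
def upsample_nearest(feature, factor=2):
    # Nearest-neighbor interpolation: repeat every value `factor` times.
    return [value for value in feature for _ in range(factor)]


def top_down_merge(pyramid):
    # `pyramid` is ordered fine -> coarse, each level half as long as the
    # previous one. Starting from the coarsest level, upsample the merged
    # result and add it element-wise to the next finer level.
    merged = [pyramid[-1]]
    for lateral in reversed(pyramid[:-1]):
        upsampled = upsample_nearest(merged[0])
        merged.insert(0, [a + b for a, b in zip(lateral, upsampled)])
    return merged


pyramid = [
    [1, 1, 1, 1, 1, 1, 1, 1],  # finest level
    [2, 2, 2, 2],
    [3, 4],                    # coarsest level
]
merged = top_down_merge(pyramid)
for level in merged:
    print(level)
```

Every position in a fine level receives exactly the value of its coarse parent regardless of content, which is the fixed sampling and fixed merging that the abstract argues against.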

翻訳日:2021-05-23 14:56:43 公開日:2020-12-04

# 多スーパービジョンによる歩行者軌道予測のための時間ピラミッドネットワーク

Temporal Pyramid Network for Pedestrian Trajectory Prediction with Multi-Supervision ( http://arxiv.org/abs/2012.01884v2 )

ライセンス: Link先を確認

Rongqin Liang, Yuanman Li, Xia Li, yi tang, Jiantao Zhou, Wenbin Zou

(参考訳) 群衆の人間の動きを予測することは、自動運転車の自然なナビゲーションからビデオ監視のインテリジェントなセキュリティシステムまで、多くのアプリケーションにとって重要である。従来のすべての作業は、1つの解像度で軌道をモデル化し予測するが、これは比較的非効率であり、移動行動の長距離情報(例えば、軌道の目的地)と短距離情報(例えば、ある時点での歩行方向と速度)を同時に利用することが困難である。本稿では,スクイーズ変調と拡張変調による歩行者追跡予測のための時間的ピラミッドネットワークを提案する。我々の階層的なフレームワークは、上から下までよりリッチな時間情報を持つ特徴ピラミッドを構築し、様々なテンポでの動作をよりよく捉えます。さらに,マルチスーパービジョンを用いた粗大な核融合戦略を提案する。グローバルコンテキストの上位粗い特徴をリッチローカルコンテキストの下位細かい特徴に段階的にマージすることにより、この手法は軌道の長距離情報と短距離情報の両方を完全に活用することができる。いくつかのベンチマーク実験の結果,提案手法の優位性を示した。

Predicting human motion behavior in a crowd is important for many applications, ranging from the natural navigation of autonomous vehicles to intelligent security systems of video surveillance. All the previous works model and predict the trajectory with a single resolution, which is rather inefficient and makes it difficult to simultaneously exploit the long-range information (e.g., the destination of the trajectory) and the short-range information (e.g., the walking direction and speed at a certain time) of the motion behavior. In this paper, we propose a temporal pyramid network for pedestrian trajectory prediction through a squeeze modulation and a dilation modulation. Our hierarchical framework builds a feature pyramid with increasingly richer temporal information from top to bottom, which can better capture the motion behavior at various tempos. Furthermore, we propose a coarse-to-fine fusion strategy with multi-supervision. By progressively merging the top coarse features of global context to the bottom fine features of rich local context, our method can fully exploit both the long-range and short-range information of the trajectory. Experimental results on several benchmarks demonstrate the superiority of our method.

翻訳日:2021-05-23 14:55:04 公開日:2020-12-04

# (参考訳) ドメイン間のガイド付き画像キャプション性能の理解

Understanding Guided Image Captioning Performance across Domains ( http://arxiv.org/abs/2012.02339v1 )

ライセンス: CC BY 4.0

Edwin G. Ng, Bo Pang, Piyush Sharma, Radu Soricut

(参考訳) 画像キャプションモデルは一般的に、ユーザの関心を考慮に入れられる能力がなく、通常は、可読性、情報提供性、情報過負荷のバランスをとるグローバルな記述にデフォルトがある。一方、VQAモデルは、テキスト質問がかなり正確であることを期待しながら、長い記述的な回答を提供する能力に欠ける。本稿では,画像中の接地可能な概念や非接地可能な概念を参照するガイドテキストと呼ばれる追加入力を用いて,画像キャプションが重視すべき概念を制御する方法を提案する。このモデルはトランスフォーマティブベースのマルチモーダルエンコーダで構成されており、ガイドテキストとグローバルおよびオブジェクトレベルの画像特徴を併用して、ガイドキャプションを生成するために使用される早期融合表現を導出する。 Visual Genomeデータでトレーニングされたモデルは、自動オブジェクトラベルでガイドされるときにドメイン内でうまく適合するが、概念キャプションでトレーニングされたガイド付きキャプションモデルは、ドメイン外の画像やガイドテキストをより一般化する。人手による評価結果から,非制限領域の大規模トレーニングデータセットへのアクセスが要求されるとともに,スタイルの多様性(語彙サイズを増大させることなく)が向上する要因であることが示唆された。

Image captioning models generally lack the capability to take into account user interest, and usually default to global descriptions that try to balance readability, informativeness, and information overload. On the other hand, VQA models generally lack the ability to provide long descriptive answers, while expecting the textual question to be quite precise. We present a method to control the concepts that an image caption should focus on, using an additional input called the guiding text that refers to either groundable or ungroundable concepts in the image. Our model consists of a Transformer-based multimodal encoder that uses the guiding text together with global and object-level image features to derive early-fusion representations used to generate the guided caption. While models trained on Visual Genome data have an in-domain advantage of fitting well when guided with automatic object labels, we find that guided captioning models trained on Conceptual Captions generalize better on out-of-domain images and guiding texts. Our human-evaluation results indicate that attempting in-the-wild guided image captioning requires access to large, unrestricted-domain training datasets, and that increased style diversity (even without increasing vocabulary size) is a key factor for improved performance.

翻訳日:2021-05-23 13:02:49 公開日:2020-12-04

# (参考訳) 分割と学習: 予測+最適化のための分割統治アプローチ

Divide and Learn: A Divide and Conquer Approach for Predict+Optimize ( http://arxiv.org/abs/2012.02342v1 )

ライセンス: CC BY 4.0

Ali Ugur Guler, Emir Demirovic, Jeffrey Chan, James Bailey, Christopher Leckie, Peter J. Stuckey

(参考訳) 予測+最適化問題は、問題係数の機械学習と、予測された係数を用いる組合せ最適化問題を組み合わせたものである。この問題は2つの異なる段階で解くこともできるが、最適化損失を直接最小化する方がよい。しかし、これには離散的で微分不可能な組合せ関数を通した微分が必要となる。既存のアプローチの多くは何らかの代理勾配を用いる。 Demirovicらは、最適化問題の損失を、予測された係数に関する区分線形関数として直接表現する方法を示した。しかし、彼らのアプローチは動的計画法による定式化を持つ最適化問題に限定されている。本研究では、この制約なしに最適化問題に取り組み、最適化損失を用いて係数を予測する新しい分割統治アルゴリズムを提案する。また、より少ない計算量で同様の結果を達成する、この手法の貪欲版も導入する。提案手法を予測+最適化問題に対する他のアプローチと比較し、いくつかの困難な組合せ問題に他の予測+最適化手法よりもうまく対処できることを示す。

The predict+optimize problem combines machine learning of problem coefficients with a combinatorial optimization problem that uses the predicted coefficients. While this problem can be solved in two separate stages, it is better to directly minimize the optimization loss. However, this requires differentiating through a discrete, non-differentiable combinatorial function. Most existing approaches use some form of surrogate gradient. Demirovic et al. showed how to directly express the loss of the optimization problem in terms of the predicted coefficients as a piece-wise linear function. However, their approach is restricted to optimization problems with a dynamic programming formulation. In this work we propose a novel divide and conquer algorithm to tackle optimization problems without this restriction and predict their coefficients using the optimization loss. We also introduce a greedy version of this approach, which achieves similar results with less computation. We compare our approach with other approaches to the predict+optimize problem and show we can successfully tackle some hard combinatorial problems better than other predict+optimize methods.
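The "optimization loss" mentioned in the abstract is often measured as regret: the value lost by optimizing with predicted coefficients instead of the true ones. The sketch below illustrates that quantity on a tiny 0/1 knapsack solved by brute force. It is not the divide and conquer algorithm of the paper; the function names, the knapsack setting, and the numbers are assumptions chosen for illustration.

```python
from itertools import product


def best_selection(values, weights, capacity):
    """Brute-force 0/1 knapsack: return the feasible selection with maximal value."""
    best, best_value = None, float("-inf")
    for choice in product((0, 1), repeat=len(values)):
        weight = sum(w for w, c in zip(weights, choice) if c)
        if weight > capacity:
            continue  # infeasible selection
        value = sum(v for v, c in zip(values, choice) if c)
        if value > best_value:
            best, best_value = choice, value
    return best


def regret(true_values, predicted_values, weights, capacity):
    """Value lost (under the true values) by optimizing with predicted values."""
    chosen = best_selection(predicted_values, weights, capacity)
    optimal = best_selection(true_values, weights, capacity)
    realised = sum(v for v, c in zip(true_values, chosen) if c)
    ideal = sum(v for v, c in zip(true_values, optimal) if c)
    return ideal - realised


# Hypothetical item values: the prediction overrates item 0.
print(regret([10, 6, 5], [12, 4, 4], weights=[4, 3, 3], capacity=6))
```

Because the chosen selection changes only at certain thresholds of the predicted values, regret is piecewise constant in the predictions, which is why gradient-based training needs either a surrogate or a direct treatment of the transition points.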

翻訳日:2021-05-23 12:45:15 公開日:2020-12-04

# (参考訳) コピースペース:どこに画像を書き込むか?

Copyspace: Where to Write on Images? ( http://arxiv.org/abs/2012.08933v1 )

ライセンス: CC BY 4.0

Jessica M. Lundin and Michael Sollami and Brian Lonsdorf and Alan Ross and Owen Schoppe and David Woodward and Sönke Rohde

(参考訳) 画像上のテキストの配置は、高品質なビジュアルデザインを生み出す上で重要な部分である。テキスト要素の適切な位置、向き、スタイルを決定することで、この作業を自動化するには、背景画像の内容を理解する必要がある。画像上に描画されたテキストの美的パラメータを「コピースペース検出」と呼び、このタスクが前景と背景の分離とは異なることを指摘する。我々は、専門ラベル付きデータに基づいて訓練された1段階と2段階のオブジェクト検出手法を用いて、ソリューションを開発した。このワークショップでは、コピースペース検出のためのそのようなアルゴリズムを検証し、Einstein Designerのような生成設計モデルやパイプラインへの応用を実証する。

The placement of text over an image is an important part of producing high-quality visual designs. Automating this work by determining appropriate position, orientation, and style for textual elements requires understanding the contents of the background image. We refer to the search for aesthetic parameters of text rendered over images as "copyspace detection", noting that this task is distinct from foreground-background separation. We have developed solutions using one- and two-stage object detection methodologies trained on expertly labeled data. This workshop will examine such algorithms for copyspace detection and demonstrate their application in generative design models and pipelines such as Einstein Designer.

翻訳日:2021-05-23 12:03:11 公開日:2020-12-04

# (参考訳) 単眼映像における仮想物体のスケールアウェア挿入

Scale-aware Insertion of Virtual Objects in Monocular Videos ( http://arxiv.org/abs/2012.02371v1 )

ライセンス: CC BY 4.0

Songhai Zhang and Xiangli Li and Yingtian Liu and Hongbo Fu

(参考訳) 本稿では,適切な大きさの仮想物体を単眼映像に挿入するスケールアウェア手法を提案する。単眼映像からの幾何復元のスケール曖昧性問題に取り組むため,映像中のグローバルスケールオブジェクトをベイズ的手法を用いて推定し,シーンオブジェクトのサイズは同一のグローバルスケールに厳密に準拠すべきであり,グローバルスケールの可能性は対象カテゴリのサイズ分布に応じて最大化する。そこで我々は,対象のカテゴリの大きさのデータセットを提案する。メートル法ツリー,対応する画像と900以上の対象カテゴリの階層表現である。ビデオから回収したオブジェクトの不完全性に対処するために,オブジェクトの可視次元を抽出してスケール最適化を行う,新しいスケール推定手法を提案する。実験により,本手法は最先端手法よりも優れた性能を示し,異なる映像シーンに対して高い妥当性とロバスト性を示した。 Metric-Tree は https://metric-tree.github.io で利用可能になった。

In this paper, we propose a scale-aware method for inserting virtual objects with proper sizes into monocular videos. To tackle the scale ambiguity problem of geometry recovery from monocular videos, we estimate the global scale of objects in a video with a Bayesian approach incorporating the size priors of objects, where the scene objects' sizes should strictly conform to the same global scale and the possibilities of global scales are maximized according to the size distribution of object categories. To do so, we propose a dataset of sizes of object categories: Metric-Tree, a hierarchical representation of sizes of more than 900 object categories with the corresponding images. To handle the incompleteness of objects recovered from videos, we propose a novel scale estimation method that extracts plausible dimensions of objects for scale optimization. Experiments have shown that our method for scale estimation performs better than the state-of-the-art methods, and has considerable validity and robustness for different video scenes. Metric-Tree has been made available at: https://metric-tree.github.io
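The idea of one global scale shared by all scene objects can be sketched as a small maximum-likelihood problem. The example assumes each category's real-world size follows a Gaussian prior (mean and standard deviation in metres), which is a simplification; the paper draws its size distributions from Metric-Tree, and the function name and numbers here are hypothetical.

```python
def estimate_global_scale(objects):
    """Return the scale s maximizing the Gaussian likelihood of s * reconstructed_size.

    `objects` is a list of (reconstructed_size, prior_mean, prior_std) tuples,
    where reconstructed_size is in arbitrary reconstruction units.
    Maximizing sum_i log N(s * d_i; mu_i, sigma_i) has the closed form below.
    """
    numerator = sum(d * mu / sigma ** 2 for d, mu, sigma in objects)
    denominator = sum(d * d / sigma ** 2 for d, mu, sigma in objects)
    return numerator / denominator


# Hypothetical scene: two objects whose priors both suggest a scale near 2.
scene = [(1.0, 2.0, 0.1), (2.0, 4.0, 0.5)]
scale = estimate_global_scale(scene)
```

Objects with a tight size prior (small standard deviation) dominate the estimate, which matches the intuition that categories with little size variation are the most reliable scale references.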

翻訳日:2021-05-23 11:59:13 公開日:2020-12-04

# (参考訳) 放射線画像における画素分割解剖学的埋め込みの自己教師あり学習

Self-supervised Learning of Pixel-wise Anatomical Embeddings in Radiological Images ( http://arxiv.org/abs/2012.02383v1 )

ライセンス: CC BY 4.0

Ke Yan, Jinzheng Cai, Dakai Jin, Shun Miao, Adam P. Harrison, Dazhou Guo, Youbao Tang, Jing Xiao, Jingjing Lu, Le Lu

(参考訳) CT(Computed tomography)やX線などの放射線画像は、固有の構造を持つ解剖学を反映している。様々な画像にまたがる同じ解剖学的または意味的な構造を確実に特定できることは、医用画像解析の基本的な課題である。原則として、このタスクにランドマーク検出やセマンティックセグメンテーションを使用することは可能だが、うまく機能するためには、各解剖学的構造とサブ構造に対する大量のラベル付きデータが必要である。より普遍的なアプローチは、ラベルのない画像から本質的な構造を発見する。我々は,自制解剖学eMbedding (SAM) と呼ばれるアプローチを導入する。 SAMは、解剖学的位置または身体部分を記述する各画像ピクセルに対してセマンティック埋め込みを生成する。このような埋め込みを生成するために,画素レベルのコントラスト学習フレームワークを提案する。粗大な戦略により、グローバルとローカルの両方の解剖情報が符号化される。負のサンプル選択戦略は、異なる身体部位の識別性を高めるために設計されている。 SAMを使用すると、テンプレート画像に任意の関心点をラベル付けし、簡単な近接探索によって他の画像の同じ身体部分を見つけることができる。 2次元および3次元画像モダリティを持つ複数のタスクにおいてSAMの有効性を示す。 19のランドマークを持つ胸部CTデータセットでは、SAMは200倍高速で広く使われている登録アルゴリズムより優れている。 2つのx線データセット、samは1つのラベル付きテンプレートイメージを持つだけで、50のラベル付きイメージでトレーニングされた教師付きメソッドを上回っている。また,CTの全身追跡病変マッチングにもSAMを適用し,91%の精度を得た。

Radiological images such as computed tomography (CT) and X-rays render anatomy with intrinsic structures. Being able to reliably locate the same anatomical or semantic structure across varying images is a fundamental task in medical image analysis. In principle it is possible to use landmark detection or semantic segmentation for this task, but to work well these require large amounts of labeled data for each anatomical structure and sub-structure of interest. A more universal approach would discover the intrinsic structure from unlabeled images. We introduce such an approach, called Self-supervised Anatomical eMbedding (SAM). SAM generates semantic embeddings for each image pixel that describe its anatomical location or body part. To produce such embeddings, we propose a pixel-level contrastive learning framework. A coarse-to-fine strategy ensures that both global and local anatomical information are encoded. Negative sample selection strategies are designed to enhance the discriminability among different body parts. Using SAM, one can label any point of interest on a template image, and then locate the same body part in other images by simple nearest neighbor searching. We demonstrate the effectiveness of SAM in multiple tasks with 2D and 3D image modalities. On a chest CT dataset with 19 landmarks, SAM outperforms widely-used registration algorithms while being 200 times faster. On two X-ray datasets, SAM, with only one labeled template image, outperforms supervised methods trained on 50 labeled images. We also apply SAM on whole-body follow-up lesion matching in CT and obtain an accuracy of 91%.
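The nearest neighbor lookup described in the abstract can be sketched as follows, assuming per-pixel embeddings have already been computed by some model. The dictionary layout, the cosine similarity choice, and the toy vectors are assumptions for illustration and not details taken from the paper.

```python
import math


def normalize(vector):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def locate(template_embedding, target_embeddings):
    """Return the target pixel whose embedding is most similar to the template's.

    `target_embeddings` maps pixel coordinates to embedding vectors.
    Similarity is cosine similarity between unit-normalized vectors.
    """
    query = normalize(template_embedding)
    best_pixel, best_similarity = None, -math.inf
    for pixel, embedding in target_embeddings.items():
        similarity = sum(a * b for a, b in zip(query, normalize(embedding)))
        if similarity > best_similarity:
            best_pixel, best_similarity = pixel, similarity
    return best_pixel


# Toy 2-dimensional embeddings for three pixels of a target image.
target = {(0, 0): [1.0, 0.0], (0, 1): [0.0, 1.0], (1, 1): [0.7, 0.7]}
match = locate([0.1, 0.9], target)
```

A real volume has millions of voxels, so an exhaustive loop like this one would normally be replaced by a batched matrix product or a coarse-to-fine search.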

翻訳日:2021-05-23 11:44:36 公開日:2020-12-04

# (参考訳) 多目的最適化モデルを用いた不完全データセットを用いた機械学習

Machine learning with incomplete datasets using multi-objective optimization models ( http://arxiv.org/abs/2012.13352v1 )

ライセンス: CC BY 4.0

Hadi A. Khorshidi, Michael Kirley, Uwe Aickelin

(参考訳) 完全なデータから学習するために機械学習技術が開発されている。データセットに欠落した値が存在する場合、欠落した値やインプテーションでデータポイントを取り除くことで、不完全なデータを別々に前処理する必要がある。本稿では,分類モデルが学習されている間,不足値を扱うオンライン手法を提案する。この目的を達成するために,2つの目的関数を持つ多目的最適化モデルを構築した。また, 目的関数の定式化を3つ提案する。 NSGA IIに基づく進化的アルゴリズムを用いて、パレート解として最適解を求める。提案モデルの信頼性とロバスト性について実験を行い,欠落した値や分類のシナリオを定義した。また,提案モデルが医療情報学にどのように貢献できるかについても述べる。実験結果を用いて3種類の定式化の性能を比較した。提案したモデル結果は、同等の文献と比較することによって検証される。

Machine learning techniques have been developed to learn from complete data. When missing values exist in a dataset, the incomplete data should be preprocessed separately by removing data points with missing values or by imputation. In this paper, we propose an online approach to handle missing values while a classification model is learnt. To reach this goal, we develop a multi-objective optimization model with two objective functions for imputation and model selection. We also propose three formulations for the imputation objective function. We use an evolutionary algorithm based on NSGA-II to find the optimal solutions as the Pareto solutions. We investigate the reliability and robustness of the proposed model using experiments by defining several scenarios in dealing with missing values and classification. We also describe how the proposed model can contribute to medical informatics. We compare the performance of three different formulations via experimental results. The results of the proposed model are validated by comparison with comparable literature.
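With two objectives there is usually no single best solution, only a set of non-dominated trade-offs (the Pareto solutions). The sketch below extracts that set from candidate objective pairs, assuming both objectives are minimized. The pairs are made-up numbers, and this is only the dominance test; NSGA-II additionally ranks successive fronts and uses crowding distance during selection.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Return the points that no other point dominates (minimization)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (imputation error, classification error) pairs.
candidates = [(0.1, 0.9), (0.2, 0.5), (0.3, 0.6), (0.5, 0.2), (0.6, 0.6)]
front = pareto_front(candidates)
```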

翻訳日:2021-05-23 11:14:36 公開日:2020-12-04

# (参考訳) 区間値データを用いた集計ファジィ数の類似度尺度

Similarity measure for aggregated fuzzy numbers from interval-valued data ( http://arxiv.org/abs/2012.03721v1 )

ライセンス: CC BY 4.0

Justin Kane Gunn, Hadi Akbarzadeh Khorshidi, Uwe Aickelin

(参考訳) 本稿では,2つのファジィ数間の類似度を区間間一致法 (IAA) を用いて計算する手法を提案する。本研究で提案される類似度尺度には, ファジィ数に対する新規な特徴と属性がいくつか含まれている。この研究で完全に再定義または修正された属性には、面積、周囲、センチロイド、石英、および合意比率が含まれる。各機能に対する推奨重み付けは、principal component analysis(pca)を使って学んだ。さらに、類似度測定の応用と将来的な利用について詳述する図示的な例を示す。

This paper presents a method to compute the degree of similarity between two aggregated fuzzy numbers from intervals using the Interval Agreement Approach (IAA). The similarity measure proposed within this study contains several features and attributes, some of which are novel to aggregated fuzzy numbers. The attributes completely redefined or modified within this study include area, perimeter, centroids, quartiles and the agreement ratio. The recommended weighting for each feature has been learned using Principal Component Analysis (PCA). Furthermore, an illustrative example is provided to detail the application and potential future use of the similarity measure.
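As a rough sketch of how intervals become a fuzzy number, the code below assumes the common reading of IAA in which the membership at a point is the fraction of collected intervals that contain it. The similarity shown is a plain Jaccard ratio on a sampling grid, used here as a stand-in; it is not the feature-based measure (area, perimeter, centroids, quartiles, agreement ratio) proposed in the paper.

```python
def iaa_membership(intervals, x):
    """Fraction of intervals (lo, hi) that contain x: the agreement at x."""
    return sum(1 for lo, hi in intervals if lo <= x <= hi) / len(intervals)


def jaccard_similarity(intervals_a, intervals_b, grid):
    """Stand-in similarity: sum of pointwise minima over sum of pointwise maxima."""
    memberships = [
        (iaa_membership(intervals_a, x), iaa_membership(intervals_b, x)) for x in grid
    ]
    numerator = sum(min(a, b) for a, b in memberships)
    denominator = sum(max(a, b) for a, b in memberships)
    return numerator / denominator if denominator else 1.0


# Hypothetical interval-valued responses from two groups.
group_a = [(1, 3), (2, 4)]
group_b = [(2, 4), (3, 5)]
grid = [i * 0.5 for i in range(13)]  # sample points 0.0, 0.5, ..., 6.0
similarity = jaccard_similarity(group_a, group_b, grid)
```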

翻訳日:2021-05-23 10:21:12 公開日:2020-12-04

# (参考訳) 遺伝的プログラミングを用いた医学テキスト分類のためのデータ駆動正規表現進化

Data-Driven Regular Expressions Evolution for Medical Text Classification Using Genetic Programming ( http://arxiv.org/abs/2012.07515v1 )

ライセンス: CC BY 4.0

J Liu, R Bai, Z Lu, P Ge, D Liu, Uwe Aickelin

(参考訳) 医学分野において、テキスト分類は構造化情報デジタル化とインテリジェントな意思決定支援を通じて人的負担を大幅に削減できる最も重要なタスクの1つである。学習に基づくテキスト分類技術が普及しているにもかかわらず、学習のブラックボックスの性質から、分類結果の理解や手作業による微調整が困難である。そこで本研究では,遺伝子プログラミング(GP)アプローチを用いた新たな正規表現に基づくテキスト分類手法を提案する。正規表現の種数(専門家がランダムに初期化または手動で構築できる)が与えられた場合、本手法は、新しい正規表現構文と慎重に選択された一連の再生演算子を用いて、選択された適合関数に従って正規表現の集団を進化させる。本手法は,オンライン医療提供者からのリアルタイム医療用テキスト調査を用いて評価し,有望なパフォーマンスを示す。より重要なことに、この手法は医療関係者によって完全に理解され、チェックされ、更新される分類器を生成します。

In medical fields, text classification is one of the most important tasks that can significantly reduce human workload through structured information digitization and intelligent decision support. Despite the popularity of learning-based text classification techniques, it is hard for humans to understand or manually fine-tune the classification results for better precision and recall, due to the black box nature of learning. This study proposes a novel regular expression-based text classification method making use of genetic programming (GP) approaches to evolve regular expressions that can classify a given medical text inquiry with satisfactory precision and recall while allowing humans to read the classifier and fine-tune it accordingly if necessary. Given a seed population of regular expressions (which can be randomly initialized or manually constructed by experts), our method evolves a population of regular expressions according to a chosen fitness function, using a novel regular expression syntax and a series of carefully chosen reproduction operators. Our method is evaluated with real-life medical text inquiries from an online healthcare provider and shows promising performance. More importantly, our method generates classifiers that can be fully understood, checked and updated by medical doctors, which is fundamentally crucial for medical related practices.
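Any evolutionary search over regular expressions needs a fitness function that scores a candidate pattern against labeled texts. The sketch below uses the F1 score with Python's standard `re` syntax. The paper defines its own regular expression syntax and does not state its fitness function in the abstract, so both choices are assumptions, and the example texts are invented.

```python
import re


def f1_fitness(pattern, texts, labels):
    """Score a candidate regex: F1 of 'pattern matches' against boolean labels."""
    try:
        regex = re.compile(pattern)
    except re.error:
        return 0.0  # invalid individuals get the worst fitness
    predictions = [regex.search(text) is not None for text in texts]
    true_pos = sum(1 for p, l in zip(predictions, labels) if p and l)
    false_pos = sum(1 for p, l in zip(predictions, labels) if p and not l)
    false_neg = sum(1 for p, l in zip(predictions, labels) if not p and l)
    if true_pos == 0:
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Invented inquiries; the label marks whether the inquiry is about fever.
texts = [
    "I have a headache and fever",
    "my head hurts",
    "how to book an appointment",
    "fever since monday",
]
labels = [True, False, False, True]
scores = {p: f1_fitness(p, texts, labels) for p in ("fever", "head", "(")}
```

Returning 0.0 for patterns that fail to compile lets mutation and crossover produce invalid offspring without crashing the search.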

翻訳日:2021-05-23 10:08:03 公開日:2020-12-04

# (参考訳) PeR-ViS:意味記述を用いたビデオサーベイランスの個人検索

PeR-ViS: Person Retrieval in Video Surveillance using Semantic Description ( http://arxiv.org/abs/2012.02408v1 )

ライセンス: CC BY 4.0

Parshwa Shah, Arpit Garg and Vandit Gajjar

(参考訳) 人は通常、年齢、性別、身長、布の種類、パターン、色などの記述者によって特徴づけられる。このような記述子は属性やソフトバイオメトリックスとして知られている。ビデオ監視において、人の記述と検索のセマンティックなギャップをリンクする。セマンティック記述のクエリで特定の人物を取得することは、ビデオ監視において重要な応用である。コンピュータビジョンを用いて人検索作業を完全に自動化し,研究コミュニティ内で関心を集めている。しかし、現在のトレンドは、主に画像ベースのクエリを持つ人物の検索に焦点を当てているため、実用上の大きな制限がある。本稿では,画像クエリーの代わりに,映像監視における人物検索の問題点を意味的記述を用いて検討する。この問題を解決するために,Mask R-CNN [14] と DenseNet-161 [16] を用いた深層学習に基づくカスケードフィルタリング手法 (PeR-ViS) を開発した。 SoftBioSearch [6] の標準人物検索データセットでは、0.566平均 IoU と 0.792 %w $IoU > 0.4$ を達成し、現在の最先端をはるかに上回っている。私たちのシンプルで再現可能で効果的なアプローチが、ビデオ監視における人物検索の領域における将来の研究を容易にしてくれることを期待しています。ソースコードとトレーニング済みのウェイトはhttps://parshwa1999.github.io/per-vis/。

A person is usually characterized by descriptors like age, gender, height, cloth type, pattern, color, etc. Such descriptors are known as attributes and/or soft-biometrics. They link the semantic gap between a person's description and retrieval in video surveillance. Retrieving a specific person with the query of semantic description has an important application in video surveillance. Using computer vision to fully automate the person retrieval task has been gathering interest within the research community. However, the current trend mainly focuses on retrieving persons with image-based queries, which have major limitations for practical usage. Instead of using an image query, in this paper, we study the problem of person retrieval in video surveillance with a semantic description. To solve this problem, we develop a deep learning-based cascade filtering approach (PeR-ViS), which uses Mask R-CNN [14] (person detection and instance segmentation) and DenseNet-161 [16] (soft-biometric classification). On the standard person retrieval dataset of SoftBioSearch [6], we achieve 0.566 Average IoU and 0.792 %w $IoU > 0.4$, surpassing the current state-of-the-art by a large margin. We hope our simple, reproducible, and effective approach will help ease future research in the domain of person retrieval in video surveillance. The source code and pretrained weights are available at https://parshwa1999.github.io/PeR-ViS/.
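Both reported metrics build on intersection over union (IoU) between a predicted and a ground-truth bounding box. The sketch below computes IoU for axis-aligned boxes given as (x1, y1, x2, y2) and the fraction of predictions whose IoU exceeds a threshold. The box format and helper names are assumptions, and the numbers are illustrative rather than taken from the paper.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - intersection
    return intersection / union if union > 0 else 0.0


def fraction_above(ious, threshold=0.4):
    """Share of predictions whose IoU is strictly greater than the threshold."""
    return sum(1 for value in ious if value > threshold) / len(ious)


average_iou = sum([0.5, 0.3, 0.9, 0.41]) / 4
share = fraction_above([0.5, 0.3, 0.9, 0.41])
```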

翻訳日:2021-05-23 09:54:46 公開日:2020-12-04

# (参考訳) 理解可能な医学用語翻訳のためのベンチマークデータセット

A Benchmark Dataset for Understandable Medical Language Translation ( http://arxiv.org/abs/2012.02420v1 )

ライセンス: CC BY 4.0

Junyu Luo, Zifei Zheng, Hanzhong Ye, Muchao Ye, Yaqing Wang, Quanzeng You, Cao Xiao and Fenglong Ma

(参考訳) 本稿では,専門的な医学文と素人理解可能な表現を連携させるための,人間による新しい医学用語翻訳データセットであるmedlaneを紹介する。データセットには12,801のトレーニングサンプル、1,015の検証サンプル、1,016のテストサンプルが含まれている。次に,medlaneデータセットにおける1つのnaiveと6つのディープラーニングに基づくアプローチを評価する。直接コピー,統計機械翻訳アプローチモーゼ,4つのニューラルネットワーク翻訳アプローチ(提案するpmbert-mtモデル,seq2seqとその2つの変種),修正されたテキスト要約モデル pointernet などである。結果を比較するために,この課題に特化して設計された3つの新しい指標を含む11の指標を利用する。最後に,メドレーンとベースラインの限界を議論し,この課題に対する研究の方向性を指摘する。

In this paper, we introduce MedLane -- a new human-annotated Medical Language translation dataset, to align professional medical sentences with layperson-understandable expressions. The dataset contains 12,801 training samples, 1,015 validation samples, and 1,016 testing samples. We then evaluate one naive and six deep learning-based approaches on the MedLane dataset, including directly copying, a statistical machine translation approach Moses, four neural machine translation approaches (i.e., the proposed PMBERT-MT model, Seq2Seq and its two variants), and a modified text summarization model PointerNet. To compare the results, we utilize eleven metrics, including three new measures specifically designed for this task. Finally, we discuss the limitations of MedLane and baselines, and point out possible research directions for this task.

翻訳日:2021-05-23 09:44:35 公開日:2020-12-04

# (参考訳) 行動認識・検出のための空間時間アライメントネットワーク

Spatial-Temporal Alignment Network for Action Recognition and Detection ( http://arxiv.org/abs/2012.02426v1 )

ライセンス: CC BY 4.0

Junwei Liang, Liangliang Cao, Xuehan Xiong, Ting Yu, Alexander Hauptmann

(参考訳) 本稿では,行動認識と検出を支援する視点不変特徴表現の導入方法について検討する。過去10年間のアクション認識の大きな進歩を目の当たりにしてきたが、大規模データセットにおける幾何学的バリエーションを効率的にモデル化する方法は、いまだに興味深い。本稿では,行動認識と行動検出のための幾何学的不変表現を学習する新しい空間-時間アライメントネットワーク(stan)を提案する。 stanモデルは軽量で汎用的で、resnet3dやslowfastのような既存のアクション認識モデルに非常に低い計算コストで接続できる。我々は、AVA、Kinetics-400、AVA-Kinetics、Charades、Charades-EgoのデータセットでSTANモデルを広範囲にテストした。実験の結果,STANモデルは動作検出タスクと動作認識タスクの両方において,一貫して芸術の状態を改善できることがわかった。私たちはデータ、モデル、コードを公開します。

This paper studies how to introduce viewpoint-invariant feature representations that can help action recognition and detection. Although we have witnessed great progress of action recognition in the past decade, it remains challenging yet interesting how to efficiently model the geometric variations in large scale datasets. This paper proposes a novel Spatial-Temporal Alignment Network (STAN) that aims to learn geometric invariant representations for action recognition and action detection. The STAN model is lightweight and generic, and can be plugged into existing action recognition models like ResNet3D and SlowFast with a very low extra computational cost. We test our STAN model extensively on AVA, Kinetics-400, AVA-Kinetics, Charades, and Charades-Ego datasets. The experimental results show that the STAN model can consistently improve the state of the art in both action detection and action recognition tasks. We will release our data, models and code.

翻訳日:2021-05-23 09:30:47 公開日:2020-12-04

# (参考訳) 脳は相3次計算を使って量子位相コンピュータとして機能するのか?

Does the brain function as a quantum phase computer using phase ternary computation? ( http://arxiv.org/abs/2012.06537v1 )

ライセンス: CC BY 4.0

Andrew Simon Johnson and William Winlow

(参考訳) 本稿では,神経伝達の基礎は,処理誤差を克服するのに十分な時間的精度で計算可能な圧力パルス/ソリトンであることを示す。神経系内のシグナル伝達と計算は複雑で異なる現象である。アクション電位は可塑性であり、アクションポテンシャルピークは神経計算の不適切な不動点となるが、アクションポテンシャル閾値はこの目的に適している。さらに、ニューロンをスパイクすることで時間をかける神経モデルは、処理エラーを克服するために必要な速度以下で動作する。本稿では, 網膜処理を例として, ケーブル理論に基づく現代の神経伝導理論は, 網膜の完全機能に必要な計算時間と脳の他の部分の含意を考慮に入れるのに不適切であることを示す。さらに、連続するイオンチャネルが静電気的に開放される活性化部位では、活性化閾値では電荷が不足するため、ケーブル理論は作用電位の伝播に役立てることができない。脳のニューラルネットのデコンストラクションは、チューリングマシンが最も単純な量子位相コンピュータのグループのメンバーであることを示唆している。しかし、チューリングベースの機構を使用する試みは、チューリングベースのコンピュータの技術が根本的に異なるため、網膜のコーディングや知能の計算を解決できない。脳のニューラルネットにおける符号化は量子ベースであり、量子は時間変数と位相ベース変数を持ち、網膜で以前に示されたように位相三元計算を可能にする。

Here we provide evidence that the fundamental basis of nervous communication is derived from a pressure pulse/soliton capable of computation with sufficient temporal precision to overcome any processing errors. Signalling and computing within the nervous system are complex and different phenomena. Action potentials are plastic and this makes the action potential peak an inappropriate fixed point for neural computation, but the action potential threshold is suitable for this purpose. Furthermore, neural models timed by spiking neurons operate below the rate necessary to overcome processing error. Using retinal processing as our example, we demonstrate that the contemporary theory of nerve conduction based on cable theory is inappropriate to account for the short computational time necessary for the full functioning of the retina and by implication the rest of the brain. Moreover, cable theory cannot be instrumental in the propagation of the action potential because at the activation-threshold there is insufficient charge at the activation site for successive ion channels to be electrostatically opened. Deconstruction of the brain neural network suggests that it is a member of a group of quantum phase computers of which the Turing machine is the simplest: the brain is another based upon phase ternary computation. However, attempts to use Turing-based mechanisms cannot resolve the coding of the retina or the computation of intelligence, as the technology of Turing-based computers is fundamentally different. We demonstrate that coding in the brain neural network is quantum based, where the quanta have a temporal variable and a phase-base variable enabling phase ternary computation as previously demonstrated in the retina.

翻訳日:2021-05-23 09:01:14 公開日:2020-12-04

# (参考訳) 対向例に対する自然ロバスト性を目指して

Towards Natural Robustness Against Adversarial Examples ( http://arxiv.org/abs/2012.02452v1 )

ライセンス: CC BY 4.0

Haoyu Chu, Shikui Wei, Yao Zhao

(参考訳) 近年の研究では、ディープニューラルネットワークは敵対的サンプルに弱いことが示されているが、敵対的サンプルを防御するために提案された手法のほとんどは、この問題を根本的に解決できない。本稿では, 対向雑音による誤差を抑えるために, 恒等写像を持つニューラルネットワークに上界が存在することを理論的に証明する。しかし、実際の計算では、この種のニューラルネットワークはもはや上界を持たないため、敵対的サンプルの影響を受けやすい。同様の手順に従って、敵対的サンプルがスキップ接続を持つ他のディープニューラルネットワークを騙すことができる理由を説明する。さらに,ディープニューラルネットワークの新たなファミリーであるNeural ODEs (Chen et al., 2018)が,より弱い上界を持つことを示す。このより弱い上界は、結果の変化量が大きくなりすぎることを防ぐ。このように、ニューラルODEは敵対的サンプルに対して自然な堅牢性を持つ。我々は,3つのホワイトボックス対向攻撃(FGSM,PGD,DI2-FGSM)と1つのブラックボックス対向攻撃(Boundary Attack)の下で,ResNetと比較してニューラルODEの性能を評価する。最後に,TRADES や YOPO など,敵対的訓練手法で訓練されたニューラルネットワークの頑健性よりも,ニューラルODEの自然な堅牢性の方が優れていることを示す。

Recent studies have shown that deep neural networks are vulnerable to adversarial examples, but most of the methods proposed to defend against adversarial examples cannot solve this problem fundamentally. In this paper, we theoretically prove that there is an upper bound for neural networks with identity mappings to constrain the error caused by adversarial noises. However, in actual computations, this kind of neural network no longer holds any upper bound and is therefore susceptible to adversarial examples. Following similar procedures, we explain why adversarial examples can fool other deep neural networks with skip connections. Furthermore, we demonstrate that a new family of deep neural networks called Neural ODEs (Chen et al., 2018) holds a weaker upper bound. This weaker upper bound prevents the amount of change in the result from being too large. Thus, Neural ODEs have natural robustness against adversarial examples. We evaluate the performance of Neural ODEs compared with ResNet under three white-box adversarial attacks (FGSM, PGD, DI2-FGSM) and one black-box adversarial attack (Boundary Attack). Finally, we show that the natural robustness of Neural ODEs is even better than the robustness of neural networks that are trained with adversarial training methods, such as TRADES and YOPO.
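FGSM, the simplest of the white-box attacks listed, perturbs the input by a fixed step in the direction of the sign of the loss gradient. The sketch below applies it to a two-feature logistic regression model so that the gradient can be written by hand; the model, weights, and step size are assumptions for illustration and unrelated to the networks evaluated in the paper.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(x, w, b):
    """Probability of class 1 under logistic regression."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def fgsm(x, y, w, b, eps):
    """One FGSM step: x + eps * sign(d loss / d x) for cross-entropy loss.

    For logistic regression the input gradient is (p - y) * w.
    """
    p = predict(x, w, b)
    gradient = [(p - y) * wi for wi in w]
    signs = [(g > 0) - (g < 0) for g in gradient]
    return [xi + eps * s for xi, s in zip(x, signs)]


# Hypothetical model and input; the true label is 1.
w, b = [2.0, -1.0], 0.0
x = [1.0, 1.0]
x_adv = fgsm(x, 1, w, b, eps=0.5)
clean_prob, adv_prob = predict(x, w, b), predict(x_adv, w, b)
```

In this toy case the clean input is classified as class 1 (probability above 0.5) while the perturbed input falls below 0.5, so a single step is enough to flip the decision.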

翻訳日:2021-05-23 08:46:14 公開日:2020-12-04

# (参考訳) Deep Learning from Demonstrationsのための変形可能な物体からのピアス針のデータセット

A data-set of piercing needle through deformable objects for Deep Learning from Demonstrations ( http://arxiv.org/abs/2012.02458v1 )

ライセンス: CC BY 4.0

Hamidreza Hashempour, Kiyanoush Nazari, Fangxun Zhong and Amir Ghalamzan E.

(参考訳) 自動化は非常に時間がかかり、費用がかかるため、多くのロボットタスクはまだ遠隔操作されている。デモから学ぶロボット(RLfD)は、プログラミングの時間とコストを削減する。しかし、従来のRLfDアプローチは、視覚情報から特徴を設計するための時間のかかる工程を必要とするため、例えば低侵襲ロボットによるロボット縫合など、多くのロボットタスクに直接適用できない。ディープニューラルネットワーク(DNN)は、高次元の観測空間と低レベルのアクション/状態空間の関係を捉える複雑なモデルを作成するための有用なツールとして登場した。それにもかかわらず、そのようなアプローチは適切なDNNモデルのトレーニングに適したデータセットを必要とする。本稿では,da Vinci Research Kitの2本の腕で軟組織に針を挿入・貫通させるデータセットを提案する。本データセットは,(1)6台の高解像度キャリブレーション済みカメラで記録された、無作為な所望の出口点を持つ60回の成功した針挿入試行と,(2)対応するロボットデータ,キャリブレーションパラメータ,(3)指令されたロボット制御入力とからなり、収集された全データは同期されている。データセットはDeep-RLfDアプローチ用に設計されている。また、単純なフィードフォワードCNNやRCN(Recurrent Convolutional Networks)など、いくつかの深いRLfDアーキテクチャを実装した。本研究は,ベースラインフィードフォワードCNNが視覚情報とロボットの次のステップ制御動作との関係をうまく学習するにも関わらず,RCNがモデルの予測精度を向上させることを示す。データセットは、RLfDのベースライン実装と同様に、ベンチマーク用に https://github.com/imanlab/d-lfd で公開されている。

Many robotic tasks are still teleoperated since automating them is very time consuming and expensive. Robot Learning from Demonstrations (RLfD) can reduce programming time and cost. However, conventional RLfD approaches are not directly applicable to many robotic tasks, e.g. robotic suturing with minimally invasive robots, as they require a time-consuming process of designing features from visual information. Deep Neural Networks (DNN) have emerged as useful tools for creating complex models capturing the relationship between high-dimensional observation space and low-level action/state space. Nonetheless, such approaches require a dataset suitable for training appropriate DNN models. This paper presents a dataset of inserting/piercing a needle with two arms of the da Vinci Research Kit in/through soft tissues. The dataset consists of (1) 60 successful needle insertion trials with randomised desired exit points recorded by 6 high-resolution calibrated cameras, (2) the corresponding robot data, calibration parameters and (3) the commanded robot control input where all the collected data are synchronised. The dataset is designed for Deep-RLfD approaches. We also implemented several deep RLfD architectures, including simple feed-forward CNNs and different Recurrent Convolutional Networks (RCNs). Our study indicates that RCNs improve the prediction accuracy of the model, even though the baseline feed-forward CNNs successfully learn the relationship between the visual information and the next step control actions of the robot. The dataset, as well as our baseline implementations of RLfD, are publicly available for bench-marking at https://github.com/imanlab/d-lfd.

翻訳日:2021-05-23 08:34:30 公開日:2020-12-04

# (参考訳) アクティブラーニングによる低リソース自然言語理解のための微調整bert

Fine-tuning BERT for Low-Resource Natural Language Understanding via Active Learning ( http://arxiv.org/abs/2012.02462v1 )

ライセンス: CC BY 4.0

Daniel Grießhaber, Johannes Maucher and Ngoc Thang Vu

(参考訳) 近年,事前学習されたトランスフォーマーに基づく言語モデルをダウンストリームで活用するタスク固有モデルは,自然言語理解タスクにおける技術結果の高度化を実現している。しかし、1000のトレーニングデータポイント未満のリソース設定で、このアプローチの適合性を調査する研究はほとんどない。本研究では、プールベースのアクティブラーニングを利用してトレーニングを高速化し、新しいデータのラベル付けコストを抑えながら、事前訓練されたTransformerベースの言語モデルであるBERTの微調整方法を検討する。 GLUEデータセットにおける実験結果から,ラベルなしデータのプールからクエリする際のモデルの知識獲得を最大化することにより,モデル性能の優位性を示す。最後に、訓練可能なパラメータの数を減らし、低リソース設定に適したものにするため、微調整中の言語モデルの凍結層の利点を実証し分析する。

Recently, leveraging pre-trained Transformer-based language models in downstream, task-specific models has advanced state-of-the-art results in natural language understanding tasks. However, only a little research has explored the suitability of this approach in low-resource settings with fewer than 1,000 training data points. In this work, we explore fine-tuning methods of BERT -- a pre-trained Transformer-based language model -- by utilizing pool-based active learning to speed up training while keeping the cost of labeling new data constant. Our experimental results on the GLUE data set show an advantage in model performance by maximizing the approximate knowledge gain of the model when querying from the pool of unlabeled data. Finally, we demonstrate and analyze the benefits of freezing layers of the language model during fine-tuning to reduce the number of trainable parameters, making it more suitable for low-resource settings.
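In pool-based active learning, the model scores every unlabeled example and the highest-scoring ones are sent for labeling. The sketch below ranks a pool by predictive entropy, a common uncertainty measure used here as a stand-in; the abstract's "approximate knowledge gain" may be a different acquisition function, and the probabilities are invented.

```python
import math


def entropy(probabilities):
    """Shannon entropy of a predicted class distribution (natural log)."""
    return -sum(p * math.log(p) for p in probabilities if p > 0)


def select_queries(pool_probabilities, k):
    """Return indices of the k pool examples with the most uncertain predictions."""
    ranked = sorted(
        range(len(pool_probabilities)),
        key=lambda i: entropy(pool_probabilities[i]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical class probabilities predicted for four unlabeled sentences.
pool = [[0.9, 0.1], [0.5, 0.5], [0.6, 0.4], [0.99, 0.01]]
to_label = select_queries(pool, k=2)
```

After the selected examples are labeled, they are moved from the pool to the training set, the model is fine-tuned again, and the loop repeats until the labeling budget is spent.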

翻訳日:2021-05-23 08:19:12 公開日:2020-12-04

# (参考訳) FinCausal共有タスクのためのデータ処理とアノテーション方式

Data Processing and Annotation Schemes for FinCausal Shared Task ( http://arxiv.org/abs/2012.02498v1 )

ライセンス: CC BY 4.0

Dominique Mariko, Estelle Labidurie, Yagmur Ozturk, Hanna Abi Akl, Hugues de Mazancourt

(参考訳) この文書では、FinCausal Shared Task(Mariko et al., 2020)のデータをラベル付けするために使用されるアノテーションスキームを説明します。このタスクは、2020年12月12日に第28回計算言語学国際会議(coling'2020)で開催される金融ナラティブ・プロセッシング・マルチリング金融要約合同ワークショップ(fnp-fns 2020)に関連している。

This document explains the annotation schemes used to label the data for the FinCausal Shared Task (Mariko et al., 2020). This task is associated with the Joint Workshop on Financial Narrative Processing and MultiLing Financial Summarisation (FNP-FNS 2020), to be held at The 28th International Conference on Computational Linguistics (COLING'2020), on December 12, 2020.

翻訳日:2021-05-23 07:41:17 公開日:2020-12-04

# (参考訳) 財務文書因果検出共有タスク(FinCausal 2020)

Financial Document Causality Detection Shared Task (FinCausal 2020) ( http://arxiv.org/abs/2012.02505v1 )

ライセンス: CC BY 4.0

Dominique Mariko, Hanna Abi Akl, Estelle Labidurie, Stéphane Durfort, Hugues de Mazancourt, Mahmoud El-Haj

(参考訳) 金融文書における因果性検出に関するFinCausal 2020共有タスクと、関連するFinCausalデータセットを紹介し、参加システムと結果について考察する。二項分類タスク(Task 1)と関係抽出タスク(Task 2)の2つのサブタスクを提案する。合計16チームが2つのタスクにわたって実行結果を提出し、そのうち13チームがシステム記述の論文を寄稿した。このワークショップは、2020年9月12日にスペインのバルセロナで開催された第28回計算言語学国際会議(COLING'2020)におけるFNP-FNS 2020(Joint Workshop on Financial Narrative Processing and MultiLing Financial Summarisation)に関連付けられている。

We present the FinCausal 2020 Shared Task on Causality Detection in Financial Documents and the associated FinCausal dataset, and discuss the participating systems and results. Two sub-tasks are proposed: a binary classification task (Task 1) and a relation extraction task (Task 2). A total of 16 teams submitted runs across the two tasks and 13 of them contributed a system description paper. This workshop is associated with the Joint Workshop on Financial Narrative Processing and MultiLing Financial Summarisation (FNP-FNS 2020), held at The 28th International Conference on Computational Linguistics (COLING'2020), Barcelona, Spain on September 12, 2020.

翻訳日:2021-05-23 07:26:54 公開日:2020-12-04

# (参考訳) 逐次GANを用いたレコメンダシステムにおけるデータ汚染検出について

On Detecting Data Pollution Attacks On Recommender Systems Using Sequential GANs ( http://arxiv.org/abs/2012.02509v1 )

ライセンス: CC BY 4.0

Behzad Shahrasbi, Venugopal Mani, Apoorv Reddy Arrabothu, Deepthi Sharma, Kannan Achan, Sushant Kumar

(参考訳) レコメンダシステムは、あらゆるeコマースプラットフォームの重要な部分だ。勧告は通常、大量のユーザデータを集約することによって生成される。悪意のあるアクターは、悪意のあるデータポイントを注入することで、そのようなレコメンデーションシステムの出力を減らし、システムを利用して財務的な利益を得る。本研究では,悪意のあるデータポイントを識別する半教師付き攻撃検出アルゴリズムを提案する。実際のデータポイントの分布を学習するために汚染される可能性が低いデータセットの一部を活用することで、これを実現します。提案手法は,ユーザ活動の文脈情報を考慮した生成型逆ネットワークアーキテクチャを修飾するものである。これにより、モデルが正しいデータポイントと注入されたデータポイントを区別することができる。

Recommender systems are an essential part of any e-commerce platform. Recommendations are typically generated by aggregating large amounts of user data. A malicious actor may be motivated to sway the output of such recommender systems by injecting malicious datapoints to leverage the system for financial gain. In this work, we propose a semi-supervised attack detection algorithm to identify the malicious datapoints. We do this by leveraging a portion of the dataset that has a lower chance of being polluted to learn the distribution of genuine datapoints. Our proposed approach modifies the Generative Adversarial Network architecture to take into account the contextual information from user activity. This allows the model to distinguish legitimate datapoints from the injected ones.

翻訳日:2021-05-23 07:16:30 公開日:2020-12-04

# (参考訳) 半教師付き学習のための最適輸送によるマッチング分布

Matching Distributions via Optimal Transport for Semi-Supervised Learning ( http://arxiv.org/abs/2012.03790v1 )

ライセンス: CC BY 4.0

Fariborz Taherkhani, Hadi Kazemi, Ali Dabouei, Jeremy Dawson, Nasser M. Nasrabadi

(参考訳) トレーニング期間中に十分なラベル付きデータが得られていない場合、SSL(Semi-Supervised Learning)アプローチはラベルなしデータの使用に有効なフレームワークとなっている。畳み込みニューラルネットワーク(CNN)に基づくSSLメソッドは、画像分類などの標準ベンチマークタスクで成功した結果を提供している。本研究では、ラベル付きおよびラベルなしデータが同じ基礎となる確率分布から得られるSSL問題の一般的な設定について考察する。そこで本稿では,未ラベルデータに対して擬似ラベルを提供するために,離散的経験的確率測度間の類似性の指標として最適輸送(OT)手法を適用し,初期ラベル付きデータと併用してSSL方式でCNNモデルをトレーニングする手法を提案する。提案手法と最先端のSSLアルゴリズムを標準データセット上で評価・比較し,SSLアルゴリズムの優位性と有効性を示す。

Semi-Supervised Learning (SSL) approaches have been an influential framework for the usage of unlabeled data when there is not a sufficient amount of labeled data available over the course of training. SSL methods based on Convolutional Neural Networks (CNNs) have recently provided successful results on standard benchmark tasks such as image classification. In this work, we consider the general setting of the SSL problem where the labeled and unlabeled data come from the same underlying probability distribution. We propose a new approach that adopts an Optimal Transport (OT) technique serving as a metric of similarity between discrete empirical probability measures to provide pseudo-labels for the unlabeled data, which can then be used in conjunction with the initial labeled data to train the CNN model in an SSL manner. We have evaluated and compared our proposed method with state-of-the-art SSL algorithms on standard datasets to demonstrate the superiority and effectiveness of our SSL algorithm.
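One way optimal transport can yield pseudo-labels is to transport the mass of unlabeled samples onto classes and assign each sample to the class that receives most of its mass. The sketch below uses entropy-regularized OT solved with Sinkhorn iterations. The cost matrix, class proportions, and regularization strength are assumptions, and the paper's actual formulation may differ.

```python
import math


def sinkhorn(cost, row_mass, col_mass, reg=0.1, iterations=200):
    """Entropy-regularized OT plan between two discrete distributions."""
    kernel = [[math.exp(-c / reg) for c in row] for row in cost]
    n, m = len(row_mass), len(col_mass)
    u, v = [1.0] * n, [1.0] * m
    for _ in range(iterations):
        # Alternately rescale rows and columns to match the marginals.
        u = [row_mass[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [col_mass[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


def pseudo_labels(plan):
    """Assign each sample (row) to the class (column) receiving most of its mass."""
    return [max(range(len(row)), key=row.__getitem__) for row in plan]


# Hypothetical distances from three unlabeled samples to two class prototypes.
cost = [[0.1, 0.9], [0.8, 0.2], [0.2, 0.7]]
plan = sinkhorn(cost, [1 / 3, 1 / 3, 1 / 3], [2 / 3, 1 / 3])
labels = pseudo_labels(plan)
```

The column marginals act as a prior on class proportions, which keeps the pseudo-labels from collapsing onto a single class.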

翻訳日:2021-05-23 06:54:40 公開日:2020-12-04

# (参考訳) DeepSym: 計画のための教師なし連続ロボットインタラクションによる深層シンボル生成とルール学習

DeepSym: Deep Symbol Generation and Rule Learning from Unsupervised Continuous Robot Interaction for Planning ( http://arxiv.org/abs/2012.02532v1 )

ライセンス: CC BY 4.0

Alper Ahmetoglu, M. Yunus Seker, Aysu Sayin, Serkan Bugur, Justus Piater, Erhan Oztop, Emre Ugur

(参考訳) 連続的なインタラクション経験から離散的なシンボルとルールを自律的に発見することは、ロボットAIの重要な構成要素であるが、依然として難しい問題である。これを解決すれば、手作業で設計したシンボルやルールが抱えるスケーラビリティ、柔軟性、堅牢性の限界が克服され、オープンエンドな環境において抽象的なレベルで学習・推論できる自律ロボットへの大きな前進となる。この目標に向けて,行動に基づく(action-grounded)離散的な物体カテゴリと効果カテゴリを見いだし,それらの上に複雑な行動計画に利用できる確率的ルールを構築する,新規で汎用的な手法を提案する。我々のロボットは、与えられた行動レパートリーを用いて単一および複数の物体と相互作用し、環境内に生じた効果を観察する。行動に基づく物体・効果・関係カテゴリを形成するために,予測型の深層エンコーダ・デコーダネットワークの二値化ボトルネック層を用いる。このネットワークは,シーンの画像と適用された行動を入力とし,シーン内の物体の変位(行動の効果)をピクセル座標で出力する。二値の潜在ベクトルは、学習された行動駆動の物体分類を表す。ニューラルネットワークが表現する知識を記号推論に有用なルールへ蒸留するために,デコーダの関数を再現するよう決定木を学習させる。その分岐から確率的ルールを抽出してPPDDLで表現することで、既製のプランナーがロボットの感覚運動経験の上で動作できるようにする。本システムは物理ベースの3次元シミュレーション環境で検証され,ロボットのアーム・ハンド系は押す・積むという行動から「転がせる」「挿入できる」「より大きい」と解釈できるシンボルを学習し,既製の確率的プランナーを用いて,与えられた立方体,ボール,カップからタワーを建てるといった目標を達成する有効な計画を生成した。

Autonomous discovery of discrete symbols and rules from continuous interaction experience is a crucial building block of robot AI, but remains a challenging problem. Solving it will overcome the limitations in scalability, flexibility, and robustness of manually-designed symbols and rules, and will constitute a substantial advance towards autonomous robots that can learn and reason at abstract levels in open-ended environments. Towards this goal, we propose a novel and general method that finds action-grounded, discrete object and effect categories and builds probabilistic rules over them that can be used in complex action planning. Our robot interacts with single and multiple objects using a given action repertoire and observes the effects created in the environment. In order to form action-grounded object, effect, and relational categories, we employ a binarized bottleneck layer of a predictive, deep encoder-decoder network that takes as input the image of the scene and the action applied, and generates the resulting object displacements in the scene (action effects) in pixel coordinates. The binary latent vector represents a learned, action-driven categorization of objects. To distill the knowledge represented by the neural network into rules useful for symbolic reasoning, we train a decision tree to reproduce its decoder function. From its branches we extract probabilistic rules and represent them in PPDDL, allowing off-the-shelf planners to operate on the robot's sensorimotor experience. Our system is verified in a physics-based 3d simulation environment where a robot arm-hand system learned symbols that can be interpreted as 'rollable', 'insertable', 'larger-than' from its push and stack actions; and generated effective plans to achieve goals such as building towers from given cubes, balls, and cups using off-the-shelf probabilistic planners.
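
To make the step from binary symbols to probabilistic rules more concrete, here is a minimal sketch that estimates effect probabilities from interaction records and prints them in a PPDDL-like layout. It is a simplification under stated assumptions: the abstract describes training a decision tree to reproduce the decoder, whereas this sketch substitutes a plain frequency table, and the symbol strings, action names, effect names, and the output syntax are all hypothetical and not validated PPDDL.

```python
from collections import Counter, defaultdict


def extract_probabilistic_rules(experiences):
    """Estimate P(effect | object symbol, action) from interaction records."""
    counts = defaultdict(Counter)
    for symbol, action, effect in experiences:
        counts[(symbol, action)][effect] += 1
    rules = {}
    for precondition, effect_counts in counts.items():
        total = sum(effect_counts.values())
        rules[precondition] = {e: n / total for e, n in effect_counts.items()}
    return rules


def format_rule(precondition, effects):
    """Render one rule in a PPDDL-like layout (illustrative syntax only)."""
    symbol, action = precondition
    body = " ".join(f"{p:.2f} ({e})" for e, p in sorted(effects.items()))
    return (f"(:action {action} :precondition (sym-{symbol}) "
            f":effect (probabilistic {body}))")


# Hypothetical records: (binary object symbol, action, observed effect).
experiences = (
    [("10", "push", "rolled")] * 3
    + [("10", "push", "stayed")]
    + [("01", "push", "stayed")] * 2
)
rules = extract_probabilistic_rules(experiences)
for precondition in sorted(rules):
    print(format_rule(precondition, rules[precondition]))
```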

翻訳日:2021-05-23 06:26:58 公開日:2020-12-04

# (参考訳) ニューラル常微分方程式を用いた雲被覆変化下の作物分類

Crop Classification under Varying Cloud Cover with Neural Ordinary Differential Equations ( http://arxiv.org/abs/2012.02542v1 )

ライセンス: CC BY 4.0

Nando Metzger, Mehmet Ozgur Turkoglu, Stefano D'Aronco, Jan Dirk Wegner, Konrad Schindler

(参考訳) 光学衛星センサーは雲を通して地表を見ることができない。そのため、再訪周期が一定であるにもかかわらず、地球観測衛星が取得する画像系列は時間的に不規則にサンプリングされる。作物分類(およびその他の時系列解析タスク)の最先端手法は、リカレントニューラルネットワーク(RNN)のように、観測間の時間間隔が規則的であることを暗黙に仮定する技術に依存している。本稿では,ニューラル常微分方程式(NODE)をRNNと組み合わせて,不規則な間隔の画像系列における作物種を分類することを提案する。得られるODE-RNNモデルは2つのステップからなる。更新ステップでは、再帰ユニットが新しい入力データをモデルの隠れ状態に同化する。予測ステップでは、次の観測が到着するまでNODEが隠れ状態を伝播させる。予測ステップは潜在ダイナミクスの連続的な表現に基づいており、いくつかの利点がある。概念的なレベルでは、生物季節(フェノロジー)サイクルを支配するメカニズムをより自然に記述できる。実用的な観点では、任意の時点でシステムの状態をサンプリングできるため、観測が得られるたびにそれを取り込むことができ、最後の観測を超えて外挿することもできる。実験の結果,ODE-RNNはLSTM,GRU,時間的畳み込みなどの一般的なベースラインよりも確かに分類精度を向上させることがわかった。この向上は、観測がわずかしか得られない(すなわち雲被覆が頻繁な)困難なシナリオで最も顕著である。さらに,外挿能力が季節の早い段階での分類性能の向上につながることを示す。これは予測にとって重要である。

Optical satellite sensors cannot see the Earth's surface through clouds. Despite the periodic revisit cycle, image sequences acquired by Earth observation satellites are therefore irregularly sampled in time. State-of-the-art methods for crop classification (and other time series analysis tasks) rely on techniques that implicitly assume regular temporal spacing between observations, such as recurrent neural networks (RNNs). We propose to use neural ordinary differential equations (NODEs) in combination with RNNs to classify crop types in irregularly spaced image sequences. The resulting ODE-RNN models consist of two steps: an update step, where a recurrent unit assimilates new input data into the model's hidden state; and a prediction step, in which NODE propagates the hidden state until the next observation arrives. The prediction step is based on a continuous representation of the latent dynamics, which has several advantages. At the conceptual level, it is a more natural way to describe the mechanisms that govern the phenological cycle. From a practical point of view, it makes it possible to sample the system state at arbitrary points in time, such that one can integrate observations whenever they are available, and extrapolate beyond the last observation. Our experiments show that ODE-RNN indeed improves classification accuracy over common baselines such as LSTM, GRU, and temporal convolution. The gains are most prominent in the challenging scenario where only few observations are available (i.e., frequent cloud cover). Moreover, we show that the ability to extrapolate translates to better classification performance early in the season, which is important for forecasting.
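
The two-step structure described above (a prediction step that evolves the hidden state continuously between observations, followed by an update step at each observation) can be sketched as follows. This is a toy illustration, not the authors' model: the explicit Euler integrator, the hand-written `decay` dynamics, and the `tanh_update` cell are stand-ins for the learned ODE function and recurrent unit, and all names are assumptions.

```python
import math


def ode_rnn(observations, times, hidden_size, dynamics, update, n_euler_steps=10):
    """Run an ODE-RNN style loop over irregularly spaced observations."""
    h = [0.0] * hidden_size
    t_prev = times[0]
    states = []
    for x, t in zip(observations, times):
        # Prediction step: integrate dh/dt = dynamics(h) from t_prev to t.
        dt = (t - t_prev) / n_euler_steps
        for _ in range(n_euler_steps):
            h = [hi + dt * di for hi, di in zip(h, dynamics(h))]
        # Update step: assimilate the new observation into the hidden state.
        h = update(h, x)
        states.append(h)
        t_prev = t
    return states


def decay(h):
    """Toy latent dynamics (stand-in for a learned network): exponential decay."""
    return [-0.5 * hi for hi in h]


def tanh_update(h, x):
    """Toy recurrent cell (stand-in for a GRU/LSTM update)."""
    return [math.tanh(0.5 * hi + xi) for hi, xi in zip(h, x)]


# Irregular acquisition times, e.g. cloud-free days of the season.
times = [0.0, 1.0, 4.0]
observations = [[1.0], [0.5], [-0.2]]
states = ode_rnn(observations, times, 1, decay, tanh_update)
print(states)
```

Because the time gap enters only through the integration interval, a long cloudy period simply lets the hidden state evolve for longer before the next update.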

翻訳日:2021-05-23 05:56:43 公開日:2020-12-04

# (参考訳) EventKG+BT:知識グラフからインタラクティブな伝記タイムラインを生成する

EventKG+BT: Generation of Interactive Biography Timelines from a Knowledge Graph ( http://arxiv.org/abs/2012.06306v1 )

ライセンス: CC BY 4.0

Simon Gottschalk and Elena Demidova

(参考訳) 公共の関心を集める人物の生涯における顕著な業績や重要な出来事を調べるには、通常、長い百科事典的資料や伝記資料を精読する必要があり、これは退屈で時間のかかる作業である。EventKGナレッジグラフのようなセマンティックな参照ソースは関連する事実の構造化された表現を提供するが、特定のエンティティについて数百ものイベントや時間的関係を含んでいることが多い。本稿では,遠隔教師あり学習(distant supervision)を用いて,知識グラフから伝記の簡潔かつインタラクティブな時空間表現を生成するタイムライン生成システムEventKG+BTを提案する。

Research on notable accomplishments and important events in the life of people of public interest usually requires close reading of long encyclopedic or biographical sources, which is a tedious and time-consuming task. Whereas semantic reference sources, such as the EventKG knowledge graph, provide structured representations of relevant facts, they often include hundreds of events and temporal relations for particular entities. In this paper, we present EventKG+BT - a timeline generation system that creates concise and interactive spatio-temporal representations of biographies from a knowledge graph using distant supervision.

翻訳日:2021-05-23 05:39:22 公開日:2020-12-04

# (参考訳) 長距離低品質赤外線ビデオにおける小型目標検出のための高性能手法

A high performance approach to detecting small targets in long range low quality infrared videos ( http://arxiv.org/abs/2012.02579v1 )

ライセンス: CC BY 4.0

Chiman Kwan and Bence Budavari

(参考訳) 長距離赤外線(IR)ビデオではターゲットが小さいため、それらを正確に検出することは困難である。本稿では,長距離・低品質の赤外線ビデオにおける小型目標検出のための高性能な手法を提案する。提案手法は,ビデオ解像度向上モジュール,局所強度と勾配(LIG)に基づく実績のある小型目標検出器,連結成分(CC)解析モジュール,および複数フレームにわたる検出結果を結び付けるトラックアソシエーションモジュールから構成される。ベンチマークデータセットに含まれる3500mから5000mの距離の実際の中波赤外線(MWIR)ビデオを用いた広範な実験により,提案手法の有効性が明確に示された。

Since targets are small in long range infrared (IR) videos, it is challenging to accurately detect targets in those videos. In this paper, we propose a high performance approach to detecting small targets in long range and low quality infrared videos. Our approach consists of a video resolution enhancement module, a proven small target detector based on local intensity and gradient (LIG), a connected component (CC) analysis module, and a track association module to connect detections from multiple frames. Extensive experiments using actual mid-wave infrared (MWIR) videos in ranges between 3500 m and 5000 m from a benchmark dataset clearly demonstrated the efficacy of the proposed approach.
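
Of the modules listed above, the connected component (CC) analysis step is the easiest to illustrate in isolation. The sketch below labels 8-connected regions in a binary detection mask and keeps only small regions as target candidates. The 8-connectivity, the `max_area` threshold, and the function names are assumptions for illustration; the LIG detector, the resolution enhancement, and the track association are not reproduced here.

```python
from collections import deque


def connected_components(mask):
    """Return the 8-connected foreground regions of a binary mask as pixel lists."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components


def small_target_centroids(mask, max_area=3):
    """Keep components of at most max_area pixels and return their centroids."""
    centroids = []
    for pixels in connected_components(mask):
        if len(pixels) <= max_area:
            cy = sum(p[0] for p in pixels) / len(pixels)
            cx = sum(p[1] for p in pixels) / len(pixels)
            centroids.append((cy, cx))
    return centroids


mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 0, 0],
]
print(small_target_centroids(mask))
```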

翻訳日:2021-05-23 05:15:32 公開日:2020-12-04

# (参考訳) 車線検出結果を用いた車線番号の予測

Prediction of Lane Number Using Results From Lane Detection ( http://arxiv.org/abs/2012.02604v1 )

ライセンス: CC BY 4.0

Panumate Chetprayoon, Fumihiko Takahashi, Yusuke Uchida

(参考訳) 車両が走行している車線の番号は、インテリジェント車両の分野において重要な要素である。多数の車線検出アルゴリズムが提案されており、車線を完全に検出できれば、その検出結果から車線番号を直接計算できる。しかし実際には、車線検出アルゴリズムの性能が不十分な場合がある。そこで本研究では,ドライブレコーダ画像と車線検出結果を組み合わせて車線番号を予測する新しい手法を提案する。独自のデータセットを用いた実験により,提案手法は計算コストを大幅に増やすことなく優れた結果を達成することを確認した。

The lane number that the vehicle is traveling in is a key factor in intelligent vehicle fields. Many lane detection algorithms were proposed and if we can perfectly detect the lanes, we can directly calculate the lane number from the lane detection results. However, in fact, lane detection algorithms sometimes underperform. Therefore, we propose a new approach for predicting the lane number, where we combine the drive recorder image with the lane detection results to predict the lane number. Experiments on our own dataset confirmed that our approach delivered outstanding results without significantly increasing computational cost.

翻訳日:2021-05-23 04:31:43 公開日:2020-12-04

# (参考訳) FinnSentiment - 感情極性アノテーションのためのフィンランド語ソーシャルメディアコーパス

FinnSentiment -- A Finnish Social Media Corpus for Sentiment Polarity Annotation ( http://arxiv.org/abs/2012.02613v1 )

ライセンス: CC BY 4.0

Krister Lind\'en and Tommi Jauhiainen and Sam Hardwick

(参考訳) 感情分析と意見マイニングは、ヘイトスピーチやフェイクニュースの検出など、ソーシャルメディアに明らかな応用領域を持つ重要なタスクである。先行研究を調査したところ、感情極性アノテーションを備えたフィンランド語の大規模なソーシャルメディアデータセットは存在しなかった。本論文は、3人のネイティブアノテータがそれぞれ独立に感情極性を付与した27,000文のデータセットを導入することで、この不足を解消することを目的とする。データセット全体を同じ3人のアノテータが担当しており、時間の経過に伴うアノテータの振る舞いをさらに研究するためのまたとない機会を提供する。アノテータ間の一致度を分析し,データセットの有用性を検証するための2つのベースラインを提供する。

Sentiment analysis and opinion mining is an important task with obvious application areas in social media, e.g. when indicating hate speech and fake news. In our survey of previous work, we note that there is no large-scale social media data set with sentiment polarity annotations for Finnish. This publication aims to remedy this shortcoming by introducing a 27,000 sentence data set annotated independently with sentiment polarity by three native annotators. We had the same three annotators for the whole data set, which provides a unique opportunity for further studies of annotator behaviour over time. We analyse their inter-annotator agreement and provide two baselines to validate the usefulness of the data set.

翻訳日:2021-05-23 04:05:31 公開日:2020-12-04

# (参考訳) 大規模河川流速推定への深層学習の適用

Application of deep learning to large scale riverine flow velocity estimation ( http://arxiv.org/abs/2012.02620v1 )

ライセンス: CC BY 4.0

Mojtaba Forghani, Yizhou Qian, Jonghyun Lee, Matthew W. Farthing, Tyler Hesser, Peter K. Kitanidis, and Eric F. Darve

(参考訳) 河川流速の高速で信頼性の高い予測は、洪水リスク管理を含む多くの応用において重要である。流速の予測には浅水方程式(SWE)が一般的に用いられる。しかし、標準的なSWEソルバで正確かつ高速に予測することは、多くの場合困難である。従来の手法は計算コストが高く、正確な予測には高解像度の河床形状の測定(bathymetry、水深測量)が必要である。その結果、例えば境界条件(BC)を変えながら繰り返し評価する必要がある場合や、河床形状が確実には分かっていない場合には適さない。本研究では,これらの問題に取り組む2段階のプロセスを提案する。まず,主成分地球統計学的手法(PCGA)を用いて流速の測定値から河床形状の確率密度関数を推定する。次に,河床形状の事後分布から得た拡張された実現値(サンプル)と所定の範囲のBCを与えて,複数の機械学習アルゴリズムによりSWEの高速ソルバを得る。第1段階により,河床形状を直接測定することなく流速を予測できる。さらに,第2段階で分布を拡張することで,追加の河床形状情報を流速予測に取り込むことができ,河床形状が時間とともに変化する場合でも精度と汎化性能が向上する。ここでは,PCA-DNN(主成分分析・ディープニューラルネットワーク),SE(教師付きエンコーダ),SVE(教師付き変分エンコーダ)と呼ぶ3つのソルバを用い,ジョージア州オーガスタ近郊のサバンナ川の一区間で検証する。その結果,高速ソルバは,従来手法で境界値問題全体を解く場合よりも大幅に低い計算コストで,流速を良好な精度で予測できることが示された。

Fast and reliable prediction of riverine flow velocities is important in many applications, including flood risk management. The shallow water equations (SWEs) are commonly used for prediction of the flow velocities. However, accurate and fast prediction with standard SWE solvers is challenging in many cases. Traditional approaches are computationally expensive and require high-resolution riverbed profile measurement (bathymetry) for accurate predictions. As a result, they are a poor fit in situations where they need to be evaluated repetitively due, for example, to varying boundary condition (BC), or when the bathymetry is not known with certainty. In this work, we propose a two-stage process that tackles these issues. First, using the principal component geostatistical approach (PCGA) we estimate the probability density function of the bathymetry from flow velocity measurements, and then we use multiple machine learning algorithms to obtain a fast solver of the SWEs, given augmented realizations from the posterior bathymetry distribution and the prescribed range of BCs. The first step allows us to predict flow velocities without direct measurement of the bathymetry. Furthermore, the augmentation of the distribution in the second stage allows incorporation of the additional bathymetry information into the flow velocity prediction for improved accuracy and generalization, even if the bathymetry changes over time. Here, we use three solvers, referred to as PCA-DNN (principal component analysis-deep neural network), SE (supervised encoder), and SVE (supervised variational encoder), and validate them on a reach of the Savannah river near Augusta, GA. Our results show that the fast solvers are capable of predicting flow velocities with good accuracy, at a computational cost that is significantly lower than the cost of solving the full boundary value problem with traditional methods.

翻訳日:2021-05-23 04:04:39 公開日:2020-12-04

# (参考訳) 注意を理解する:心と機械の中で

Understanding Attention: In Minds and Machines ( http://arxiv.org/abs/2012.02659v1 )

ライセンス: CC BY 4.0

Shriraj P. Sawant and Shruti Singh

(参考訳) 注意(アテンション)は複雑で広範な概念であり、人工知能、認知科学、心理学、神経科学、および関連分野にまたがって研究されている。注意に関する考え方の多くはこれらの分野間であまり重なり合っていないが、限られた資源の適応的制御という共通のテーマがある。本稿では,人工ニューラルネットワーク(ANN)における注意の概念とその変種を概観する。また,ANNと対比しながら,神経科学の観点から注意の起源についても論じる。分野間で一見つながりのない議論を別々に続けるのではなく,注意の体系的な分析と,AIと神経科学における考え方の統一の可能性に向けて,共通の概念的枠組みに基づいて考え方を基礎づけることを提案する。

Attention is a complex and broad concept, studied across multiple disciplines spanning artificial intelligence, cognitive science, psychology, neuroscience, and related fields. Although many of the ideas regarding attention do not significantly overlap among these fields, there is a common theme of adaptive control of limited resources. In this work, we review the concept and variants of attention in artificial neural networks (ANNs). We also discuss the origin of attention from the neuroscience point of view parallel to that of ANNs. Instead of having seemingly disconnected dialogues between varied disciplines, we suggest grounding the ideas on common conceptual frameworks for a systematic analysis of attention and towards possible unification of ideas in AI and Neuroscience.

翻訳日:2021-05-23 03:26:28 公開日:2020-12-04

# (参考訳) 透明な対戦相手間の2人プレイゲームにおける学習

Learning in two-player games between transparent opponents ( http://arxiv.org/abs/2012.02671v1 )

ライセンス: CC BY 4.0

Adrian Hutter

(参考訳) 2つの強化学習エージェントが互いに行列ゲームを繰り返しプレイし、各ラウンドの後にパラメータを更新するシナリオを考える。エージェントの意思決定は互いに透明であり、各エージェントは相手が自分に対してどのようにプレイするかを予測できる。両エージェントが互いを再帰的に予測し続ける無限後退を防ぐため、各エージェントは少なくとも確率epsilonで相手に依存しない応答を返すことが求められる。透明性によって、各エージェントは相手の勾配ステップを予測して形作ること、すなわち相手の勾配が自分に有利な方向を向くパラメータ空間の領域へ移動することも可能になる。本研究では,相手を考慮した学習(opponent-aware learning)のための先行研究の2つのアルゴリズム(LOLAとSOS)を用いて,その結果生じるダイナミクスを実験的に調べる。相互に透明な意思決定と相手を考慮した学習の組み合わせは,1回限りの囚人のジレンマにおいて頑健に相互協力をもたらすことがわかった。両エージェントが自分の好む均衡へ相手を誘導しようとするチキンゲームでは,相互に有益な結果に収束することははるかに難しく,相手を考慮した学習は両エージェントにとって最悪の結果をもたらすことさえある。このことは,均衡選択問題を伴う社会的ジレンマにおいて許容できる結果を達成する,相手を考慮した学習アルゴリズムを開発する必要性を浮き彫りにしている。

We consider a scenario in which two reinforcement learning agents repeatedly play a matrix game against each other and update their parameters after each round. The agents' decision-making is transparent to each other, which allows each agent to predict how their opponent will play against them. To prevent an infinite regress of both agents recursively predicting each other indefinitely, each agent is required to give an opponent-independent response with some probability at least epsilon. Transparency also allows each agent to anticipate and shape the other agent's gradient step, i.e. to move to regions of parameter space in which the opponent's gradient points in a direction favourable to them. We study the resulting dynamics experimentally, using two algorithms from previous literature (LOLA and SOS) for opponent-aware learning. We find that the combination of mutually transparent decision-making and opponent-aware learning robustly leads to mutual cooperation in a single-shot prisoner's dilemma. In a game of chicken, in which both agents try to manoeuvre their opponent towards their preferred equilibrium, converging to a mutually beneficial outcome turns out to be much harder, and opponent-aware learning can even lead to worst-case outcomes for both agents. This highlights the need to develop opponent-aware learning algorithms that achieve acceptable outcomes in social dilemmas involving an equilibrium selection problem.

翻訳日:2021-05-23 03:17:38 公開日:2020-12-04

# (参考訳) 知識グラフと機械学習による道路標識地中真理構築の高速化

Accelerating Road Sign Ground Truth Construction with Knowledge Graph and Machine Learning ( http://arxiv.org/abs/2012.02672v1 )

ライセンス: CC BY 4.0

Ji Eun Kim, Cory Henson, Kevin Huang, Tuan A. Tran, Wan-Yi Lin

(参考訳) 包括的で高品質な道路標識アノテーションデータセットを持つことは、AIベースの道路標識認識(RSR)システムの成功に不可欠である。実際には、アノテータは各国の道路標識体系を学ぶことの難しさにしばしば直面するため、作業に時間がかかり、結果の質も低くなりがちである。本稿では,知識グラフと機械学習アルゴリズムである変分プロトタイピングエンコーダ(VPE)を用いて,人間のアノテータによる道路標識の分類を効果的に支援する新しい手法を提案する。アノテータは視覚的属性を使って道路標識知識グラフを検索し、VPEモデルが提案する最も近い候補を受け取ることができる。VPEモデルは、知識グラフから得た候補と実際の標識画像パッチを入力として使用する。知識グラフによるアプローチは標識の探索空間を98.9%削減できることを示す。さらに,VPEを用いることで,本システムはテストしたデータセット内の標識の75%に対して正しい単一候補を提案でき,それらの場合には人手による探索の労力を完全に排除できる。

Having a comprehensive, high-quality dataset of road sign annotation is critical to the success of AI-based Road Sign Recognition (RSR) systems. In practice, annotators often face difficulties in learning road sign systems of different countries; hence, the tasks are often time-consuming and produce poor results. We propose a novel approach using knowledge graphs and a machine learning algorithm - variational prototyping-encoder (VPE) - to assist human annotators in classifying road signs effectively. Annotators can query the Road Sign Knowledge Graph using visual attributes and receive closest matching candidates suggested by the VPE model. The VPE model uses the candidates from the knowledge graph and a real sign image patch as inputs. We show that our knowledge graph approach can reduce sign search space by 98.9%. Furthermore, with VPE, our system can propose the correct single candidate for 75% of signs in the tested datasets, eliminating the human search effort entirely in those cases.
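
The attribute-based narrowing described above can be pictured with a very small sketch: signs are stored with visual attributes, a query filters them, and the search-space reduction is the fraction of signs ruled out. Everything here is hypothetical, including the sign names, the attribute schema, and the use of a plain dictionary in place of a real knowledge graph and query language; the VPE ranking of the remaining candidates is not shown.

```python
# Hypothetical miniature "knowledge graph": sign name -> visual attributes.
ROAD_SIGNS = {
    "stop": {"shape": "octagon", "color": "red", "has_text": True},
    "yield": {"shape": "triangle", "color": "red", "has_text": False},
    "speed_limit_50": {"shape": "circle", "color": "red", "has_text": True},
    "no_entry": {"shape": "circle", "color": "red", "has_text": False},
    "pedestrian_crossing": {"shape": "square", "color": "blue", "has_text": False},
}


def query_signs(graph, **attributes):
    """Return the names of all signs whose attributes match the query."""
    return sorted(
        name for name, props in graph.items()
        if all(props.get(key) == value for key, value in attributes.items())
    )


def search_space_reduction(graph, candidates):
    """Fraction of signs the annotator no longer has to look through."""
    return 1.0 - len(candidates) / len(graph)


candidates = query_signs(ROAD_SIGNS, shape="circle", color="red")
print(candidates, search_space_reduction(ROAD_SIGNS, candidates))
```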

翻訳日:2021-05-23 02:55:50 公開日:2020-12-04

# (参考訳) ESCAPED: カーネルベースの機械学習アルゴリズムのための効率的でセキュアかつプライベートなドット積フレームワークと医療への応用

ESCAPED: Efficient Secure and Private Dot Product Framework for Kernel-based Machine Learning Algorithms with Applications in Healthcare ( http://arxiv.org/abs/2012.02688v1 )

ライセンス: CC BY 4.0

Ali Burak \"Unal, Mete Akg\"un, Nico Pfeifer

(参考訳) 高度な機械学習モデルを訓練するには、通常多くの訓練サンプルが必要である。特に医療分野ではこれらのサンプルは非常に高価であり、1つの機関だけでは十分な数を確保できないことが多い。異なるソースからのプライバシーに敏感なデータの統合は、通常、データセキュリティとデータ保護のための措置によって制限される。その結果、変数にノイズを加えたり(例えば$\epsilon$-差分プライバシー)、特定の値を省略したり(例えば$k$-匿名性)して、データ品質を低下させるアプローチがとられることがある。暗号手法に基づくその他の手段は非常に時間のかかる計算を招くことがあり、これは大規模なマルチオミクスデータでは特に問題となる。我々はこの問題に対処するため、ESCAPED(Efficient SeCure And PrivatE Dot product framework)を導入する。ESCAPEDは、複数のソースに由来するベクトルのドット積を第三者上で計算することを可能にし、その第三者は後にカーネルベースの機械学習アルゴリズムを訓練する。その際、プライバシーを犠牲にすることも、ノイズを加えることもない。HIV感染者の薬剤耐性予測と、精密医療におけるマルチオミクスの次元削減およびクラスタリング問題で本フレームワークを評価した。実行時間に関して、本フレームワークは、アルゴリズムの性能を犠牲にすることなく、最も条件の近い既存のアプローチを大幅に上回る。カーネルベースのアルゴリズムに対する利点のみを示したが、本フレームワークは、複数のソースからのベクトルのドット積を必要とするさらなる機械学習モデルに対して、新たな研究機会を開く可能性がある。

To train sophisticated machine learning models one usually needs many training samples. Especially in healthcare settings these samples can be very expensive, meaning that one institution alone usually does not have enough on its own. Merging privacy-sensitive data from different sources is usually restricted by data security and data protection measures. This can lead to approaches that reduce data quality by putting noise onto the variables (e.g., in $\epsilon$-differential privacy) or omitting certain values (e.g., for $k$-anonymity). Other measures based on cryptographic methods can lead to very time-consuming computations, which is especially problematic for larger multi-omics data. We address this problem by introducing ESCAPED, which stands for Efficient SeCure And PrivatE Dot product framework, enabling the computation of the dot product of vectors from multiple sources on a third-party, which later trains kernel-based machine learning algorithms, while neither sacrificing privacy nor adding noise. We evaluated our framework on drug resistance prediction for HIV-infected people and multi-omics dimensionality reduction and clustering problems in precision medicine. In terms of execution time, our framework significantly outperforms the best-fitting existing approaches without sacrificing the performance of the algorithm. Even though we only show the benefit for kernel-based algorithms, our framework can open up new research opportunities for further machine learning models that require the dot product of vectors from multiple sources.
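
One reason a dot product service is enough for kernel-based learning is that common kernels can be computed from the Gram matrix of dot products alone. The sketch below shows this for the RBF kernel, using the identity ||x - y||^2 = x·x + y·y - 2 x·y. It does not implement any secure protocol and makes no claim about how ESCAPED computes the dot products; the function names and the `gamma` value are illustrative assumptions.

```python
import math


def gram_matrix(samples):
    """Matrix of pairwise dot products (what a dot product service would return)."""
    return [[sum(a * b for a, b in zip(x, y)) for y in samples] for x in samples]


def rbf_from_gram(gram, gamma=0.5):
    """RBF kernel computed only from dot products, without the raw vectors."""
    n = len(gram)
    return [
        [math.exp(-gamma * (gram[i][i] + gram[j][j] - 2.0 * gram[i][j]))
         for j in range(n)]
        for i in range(n)
    ]


def rbf_direct(samples, gamma=0.5):
    """Reference RBF kernel computed from the raw vectors, for comparison."""
    return [
        [math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))
         for y in samples]
        for x in samples
    ]


samples = [(1.0, 0.0, 2.0), (0.5, 1.5, -1.0), (2.0, 2.0, 0.0)]
kernel = rbf_from_gram(gram_matrix(samples))
reference = rbf_direct(samples)
print(kernel)
```

The party that trains the kernel method therefore never needs the input vectors themselves, only their pairwise dot products.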

翻訳日:2021-05-23 02:45:15 公開日:2020-12-04

# (参考訳) 部分観測都市環境における物体探索のための空間言語理解

Spatial Language Understanding for Object Search in Partially Observed Cityscale Environments ( http://arxiv.org/abs/2012.02705v1 )

ライセンス: CC BY 4.0

Kaiyu Zheng, Deniz Bayazit, Rebecca Mathew, Ellie Pavlick, Stefanie Tellex

(参考訳) 本研究では,ロボットが空間言語を物体位置上の分布として解釈し,部分観測可能な都市規模の環境で効果的に探索できるようにするシステムを提案する。空間言語観測空間を導入し,部分観測マルコフ決定過程(POMDP)の枠組みのもとで,空間言語から抽出した情報をロボットの信念に取り込む確率的観測モデルを定式化する。曖昧で文脈に依存する前置詞(例えば「front」)を解釈するために,環境コンテキストが与えられたときに言語提供者の相対的な参照フレーム(FoR)を予測することを学習する畳み込みニューラルネットワークモデルを提案する。それぞれ40,000m$^2$の広さを持つ5都市の領域にわたる交差検証により,FoR予測モデルと物体探索システムの汎化可能性を示す。シミュレーションでのエンドツーエンド実験により,本システムは,空間前置詞の理解を持たないキーワードベースのベースラインと比較して,より高速な探索とより高い成功率を達成することが示された。

We present a system that enables robots to interpret spatial language as a distribution over object locations for effective search in partially observable cityscale environments. We introduce the spatial language observation space and formulate a stochastic observation model under the framework of Partially Observable Markov Decision Process (POMDP) which incorporates information extracted from the spatial language into the robot's belief. To interpret ambiguous, context-dependent prepositions (e.g., front), we propose a convolutional neural network model that learns to predict the language provider's relative frame of reference (FoR) given environment context. We demonstrate the generalizability of our FoR prediction model and object search system through cross-validation over areas of five cities, each with a 40,000m$^2$ footprint. End-to-end experiments in simulation show that our system achieves faster search and higher success rate compared to a keyword-based baseline without spatial preposition understanding.
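
The idea of folding a spatial-language observation into a belief over object locations can be sketched as a single Bayes update on a grid. The example below is only a toy under stated assumptions: the grid, the landmark position, and the smooth directional likelihood for "front" are invented for illustration, and the frame-of-reference angle is passed in by hand rather than predicted by a network as in the paper.

```python
import math


def front_likelihood(cell, landmark, for_angle, sharpness=2.0):
    """Unnormalized likelihood that `cell` lies in front of `landmark`.

    `for_angle` is the direction (radians) in which the assumed frame of
    reference says "front" points; cells along it score highest.
    """
    dx, dy = cell[0] - landmark[0], cell[1] - landmark[1]
    if dx == 0 and dy == 0:
        return math.exp(-sharpness)  # the landmark cell itself is least likely
    angle = math.atan2(dy, dx)
    return math.exp(sharpness * math.cos(angle - for_angle))


def update_belief(belief, likelihood):
    """Bayes rule: posterior is proportional to prior times likelihood."""
    posterior = {cell: p * likelihood(cell) for cell, p in belief.items()}
    total = sum(posterior.values())
    return {cell: p / total for cell, p in posterior.items()}


# Uniform prior over a 5x5 map; the landmark sits in the middle.
cells = [(x, y) for x in range(5) for y in range(5)]
prior = {cell: 1.0 / len(cells) for cell in cells}
landmark = (2, 2)

# Observation: "the object is in front of the landmark", front pointing along +x.
posterior = update_belief(
    prior, lambda cell: front_likelihood(cell, landmark, for_angle=0.0)
)
print(posterior[(4, 2)], posterior[(2, 4)], posterior[(0, 2)])
```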

翻訳日:2021-05-23 02:42:14 公開日:2020-12-04

# (参考訳) 機械学習アルゴリズムを用いた移住意図に対する気象要因の影響

Impact of weather factors on migration intention using machine learning algorithms ( http://arxiv.org/abs/2012.02794v1 )

ライセンス: CC BY 4.0

John Aoga, Juhee Bae, Stefanija Veljanoska, Siegfried Nijssen, Pierre Schaus

(参考訳) 実証研究の文献では、気候ショックや気候変動が移住の意思決定に与える影響への注目が高まっている。先行研究は異なる結果を示しており、多様な伝統的実証手法を用いている。本稿では,農業に依存した経済を持つ6か国(ブルキナファソ,コートジボワール,マリ,モーリタニア,ニジェール,セネガル)において,気象ショックが個人の移住意図に果たす役割を分析するための,木ベースの機械学習(ML)アプローチを提案する。訓練・検証・テストのワークフローを用いて複数の木ベースのアルゴリズム(例えばXGB、ランダムフォレスト)を実行し、堅牢でノイズに強いアプローチを構築する。次に、重要な特徴を特定し、それらが移住意図にどの方向に影響しているかを示す。このMLに基づく推定では、異なる時間スケールの標準化降水蒸発散指数(SPEI)で捉えた気象ショックや、様々な社会経済的特徴・共変量などの特徴を考慮する。その結果,(i)移住意図への影響は社会経済的特性のほうが大きいものの,気象特徴は予測性能を向上させること,(ii)国別のモデルが必要であること,(iii)国際移住はSPEIのより長い時間スケールの影響を強く受ける一方,(国内移住を含む)一般的な移住はより短い時間スケールの影響を受けること,がわかった。

A growing attention in the empirical literature has been paid to the incidence of climate shocks and change in migration decisions. Previous literature leads to different results and uses a multitude of traditional empirical approaches. This paper proposes a tree-based Machine Learning (ML) approach to analyze the role of the weather shocks towards an individual's intention to migrate in the six agriculture-dependent-economy countries such as Burkina Faso, Ivory Coast, Mali, Mauritania, Niger, and Senegal. We perform several tree-based algorithms (e.g., XGB, Random Forest) using the train-validation-test workflow to build robust and noise-resistant approaches. Then we determine the important features showing in which direction they are influencing the migration intention. This ML-based estimation accounts for features such as weather shocks captured by the Standardized Precipitation-Evapotranspiration Index (SPEI) for different timescales and various socioeconomic features/covariates. We find that (i) weather features improve the prediction performance although socioeconomic characteristics have more influence on migration intentions, (ii) country-specific model is necessary, and (iii) international move is influenced more by the longer timescales of SPEIs while general move (which includes internal move) by that of shorter timescales.

翻訳日:2021-05-23 02:30:22 公開日:2020-12-04

# (参考訳) 多言語関係学習のためのイベントガイドによるDenoising

Event Guided Denoising for Multilingual Relation Learning ( http://arxiv.org/abs/2012.02721v1 )

ライセンス: CC BY 4.0

Amith Ananthram, Emily Allaway, Kathleen McKeown

(参考訳) 汎用的な関係抽出は近年大きく進歩しており、その一因は、多くのベンチマークで最先端の結果をもたらすSoaresら(2019)の非常にデータ集約的な遠隔教師あり学習(distant supervision)手法にある。本研究では,ラベルなしテキストから関係抽出のための高品質な訓練データを収集する方法論を提案する。この方法論は,わずかな訓練コストで彼らのゼロショットおよび少数ショットの結果をほぼ再現する。提案手法は,日付が付されたニュース記事の予測可能な分布構造を利用してノイズ除去済みのコーパスを構築する。抽出処理の過程で低品質の例が除外される。このコーパスで訓練した小規模な多言語エンコーダは,はるかに少ない例(50k対300mil+)しか使用しないにもかかわらず,英語とスペイン語の少数ショットおよび標準的な関係ベンチマークにおいて,(どちらもほとんど,あるいは全く微調整を受けない場合に)現在の最先端と同等の性能を示す。

General purpose relation extraction has recently seen considerable gains in part due to a massively data-intensive distant supervision technique from Soares et al. (2019) that produces state-of-the-art results across many benchmarks. In this work, we present a methodology for collecting high quality training data for relation extraction from unlabeled text that achieves a near-recreation of their zero-shot and few-shot results at a fraction of the training cost. Our approach exploits the predictable distributional structure of date-marked news articles to build a denoised corpus -- the extraction process filters out low quality examples. We show that a smaller multilingual encoder trained on this corpus performs comparably to the current state-of-the-art (when both receive little to no fine-tuning) on few-shot and standard relation benchmarks in English and Spanish despite using many fewer examples (50k vs. 300mil+).

翻訳日:2021-05-23 02:03:56 公開日:2020-12-04

# (参考訳) パッチ統計を利用した畳み込みニューラルネットワークを用いた超音波散乱体密度分類

Ultrasound Scatterer Density Classification Using Convolutional Neural Networks by Exploiting Patch Statistics ( http://arxiv.org/abs/2012.02738v1 )

ライセンス: CC BY 4.0

Ali K. Z. Tehrani, Mina Amiri, Ivan M. Rosado-Mendez, Timothy J. Hall, and Hassan Rivaz

(参考訳) 定量的超音波(QUS)は、散乱体密度などの組織特性に関する重要な情報を明らかにできる。分解能セルあたりの散乱体密度が10より高いか低いかに応じて、組織はそれぞれ完全に発達したスペックル(FDS)または低密度散乱体(LDS)と見なされる。従来、散乱体密度は後方散乱エコーの振幅から推定した統計パラメータを用いて分類されてきた。しかし、パッチサイズが小さい場合、この推定は正確ではない。また、これらのパラメータは撮像設定にも強く依存する。本稿では,QUSのための畳み込みニューラルネットワーク(CNN)アーキテクチャを提案し,シミュレーションデータを用いて学習する。さらに,パッチ統計量を追加の入力チャネルとして利用することで,ネットワークの性能を向上させる。シミュレーションデータ,実験用ファントム,およびin vivoデータを用いてネットワークを評価する。また,提案ネットワークを複数の古典的モデルおよび深層学習モデルと比較し,散乱体密度の異なる組織の分類において優れた性能を示す。結果はまた,提案ネットワークが参照ファントムを必要とせずに異なる撮像パラメータで動作できることを示している。本研究は,超音波画像における散乱体密度の分類におけるCNNの可能性を示す。

Quantitative ultrasound (QUS) can reveal crucial information on tissue properties such as scatterer density. If the scatterer density per resolution cell is above or below 10, the tissue is considered as fully developed speckle (FDS) or low-density scatterers (LDS), respectively. Conventionally, the scatterer density has been classified using estimated statistical parameters of the amplitude of backscattered echoes. However, if the patch size is small, the estimation is not accurate. These parameters are also highly dependent on imaging settings. In this paper, we propose a convolutional neural network (CNN) architecture for QUS, and train it using simulation data. We further improve the network performance by utilizing patch statistics as additional input channels. We evaluate the network using simulation data, experimental phantoms and in vivo data. We also compare our proposed network with different classic and deep learning models, and demonstrate its superior performance in classification of tissues with different scatterer density values. The results also show that the proposed network is able to work with different imaging parameters with no need for a reference phantom. This work demonstrates the potential of CNNs in classifying scatterer density in ultrasound images.

翻訳日:2021-05-23 01:54:45 公開日:2020-12-04

# (参考訳) 特徴帰属による説明における一般的な解釈可能性の仮定を問い直す

Challenging common interpretability assumptions in feature attribution explanations ( http://arxiv.org/abs/2012.02748v1 )

ライセンス: CC0 1.0

Jonathan Dinu (1), Jeffrey Bigham (2), J. Zico Kolter (2) ((1) Unaffiliated, (2) Carnegie Mellon University)

(参考訳) 機械学習やアルゴリズムによる意思決定システムが、重大な影響を伴う(ハイステークスな)ヒューマン・イン・ザ・ループの場面でますます活用されるようになり、その予測の根拠を理解することが急務となっている。研究者たちは説明可能なAI(XAI)でこのニーズに応えてきたが、評価を行わないまま解釈可能性を公理のように主張することが多い。これらのシステムが評価される場合でも、解釈可能性の代理指標(モデルの複雑さなど)を用いたオフラインシミュレーションでテストされることが多い。我々は,単純な「プラセボ説明」を対照群とする大規模な被験者実験を通じて,3つの一般的な解釈可能性の仮定の妥当性を実証的に評価する。その結果,特徴帰属による説明は,我々のタスクにおいて人間の意思決定者にわずかな効用しかもたらさず,場合によっては認知的・文脈的な交絡因子のために意思決定を悪化させることがわかった。この結果は,これらの手法の適用が普遍的に有益であるという想定に疑問を投げかけるものであり,本研究がXAI研究における人間による評価の重要性を強調することを期待する。実験の匿名化データ,研究を再現するためのコード,実験のインタラクティブなデモ,分析で使用したモデルなどの補足資料は https://doi.pizza/challenging-xai で公開されている。

As machine learning and algorithmic decision making systems are increasingly being leveraged in high-stakes human-in-the-loop settings, there is a pressing need to understand the rationale of their predictions. Researchers have responded to this need with explainable AI (XAI), but often proclaim interpretability axiomatically without evaluation. When these systems are evaluated, they are often tested through offline simulations with proxy metrics of interpretability (such as model complexity). We empirically evaluate the veracity of three common interpretability assumptions through a large scale human-subjects experiment with a simple "placebo explanation" control. We find that feature attribution explanations provide marginal utility in our task for a human decision maker and in certain cases result in worse decisions due to cognitive and contextual confounders. This result challenges the assumed universal benefit of applying these methods and we hope this work will underscore the importance of human evaluation in XAI research. Supplemental materials -- including anonymized data from the experiment, code to replicate the study, an interactive demo of the experiment, and the models used in the analysis -- can be found at: https://doi.pizza/challenging-xai.

翻訳日:2021-05-23 01:38:03 公開日:2020-12-04

# (参考訳) 歩行者避難シミュレーションモデルの改良

An Improved Simulation Model for Pedestrian Crowd Evacuation ( http://arxiv.org/abs/2012.09135v1 )

ライセンス: CC BY 4.0

Danial A. Muhammed, Tarik A. Rashid, Abeer Alsadoon, Nebojsa Bacanin, Polla Fattah, Mokhtar Mohammadi and Indradip Banerjee

(参考訳) 本稿は,2019年後半に開発された最新の歩行者群集避難モデルの一つである「各種AI技術に基づく歩行者群集避難シミュレーションモデル」を対象とする。本研究は,新しい手法を提案してこのモデルに統合することで,新たな機能を追加する。この手法により,モデルは,提案された多数の候補位置の中から最適な出口ドア位置を選択することで,安全性の面でより適切な避難エリア設計を見つけられるようになる。この手法は,対象モデルの出力,すなわち避難プロセスにおける各個人の避難時間に完全に依存する。新しい手法は,出口ドア位置ごとに避難者の避難時間の平均を求め,その平均避難時間に基づいて,どの出口ドア位置が避難者の避難に最も適しているかを決定する。本手法を検証するために,様々なシナリオを記述した複数の避難エリア設計を用いた。その結果,本手法を組み込んだモデルは,提案された多数の位置の中から適切な出口ドア位置を予測できることが示された。最後に,本研究の結果から,新たに提案した手法を統合することで対象モデルに安全性の面での新しい能力が加わり,複数の設計の中から最良の避難エリア設計を選択する際に適切な判断を下せるようになった。

This paper works on one of the most recent pedestrian crowd evacuation models, i.e., "a simulation model for pedestrian crowd evacuation based on various AI techniques", developed in late 2019. This study adds a new feature to the developed model by proposing a new method and integrating it with the model. This method enables the developed model to find a more appropriate evacuation area design, among others regarding safety due to selecting the best exit door location among many suggested locations. This method is completely dependent on the selected model's output, i.e., the evacuation time for each individual within the evacuation process. The new method finds an average of the evacuees' evacuation times of each exit door location; then, based on the average evacuation time, it decides which exit door location would be the best exit door to be used for evacuation by the evacuees. To validate the method, various designs for the evacuation area with various written scenarios were used. The results showed that the model with this new method could predict a proper exit door location among many suggested locations. Lastly, from the results of this research using the integration of this newly proposed method, a new capability for the selected model in terms of safety allowed the right decision in selecting the finest design for the evacuation area among other designs.
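
The selection rule described above (average the evacuees' evacuation times for each candidate exit door location, then pick the location with the lowest average) is simple enough to sketch directly. The location names and times below are invented for illustration, and the simulation model that would produce the per-person evacuation times is not reproduced.

```python
from statistics import mean


def best_exit_location(evacuation_times):
    """Pick the exit door location with the lowest average evacuation time.

    `evacuation_times` maps each candidate location to the per-person
    evacuation times (in seconds) produced by a simulation run.
    """
    averages = {location: mean(times) for location, times in evacuation_times.items()}
    best = min(averages, key=averages.get)
    return best, averages


# Hypothetical simulation outputs for three candidate door locations.
simulated = {
    "north wall": [40, 55, 61],
    "east wall": [32, 45, 49],
    "corner": [50, 70, 66],
}
best, averages = best_exit_location(simulated)
print(best, averages)
```

Using only the mean ignores the slowest evacuees; a safety-focused variant might compare the maximum or a high percentile instead, which is a design choice outside what the abstract states.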

翻訳日:2021-05-23 01:14:33 公開日:2020-12-04
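The selection rule described in the abstract above (average the per-evacuee evacuation times for each candidate exit door location, then choose a location based on that average) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name `best_exit_location`, the dictionary layout, and the sample times are hypothetical, and treating the lowest mean as "best" is an assumption.

```python
from statistics import mean


def best_exit_location(times_by_location):
    """Return (location, average_time) for the lowest average evacuation time.

    times_by_location maps a candidate exit door location (hypothetical labels)
    to the per-evacuee evacuation times, in seconds, produced by one simulation
    run with the door placed at that location.
    """
    averages = {loc: mean(times) for loc, times in times_by_location.items()}
    # Assumption: "best" means the smallest mean evacuation time.
    best = min(averages, key=averages.get)
    return best, averages[best]


# Made-up sample data, for illustration only.
simulated = {
    "north_wall": [42.0, 55.5, 61.0],
    "east_wall": [38.0, 47.5, 52.5],
    "south_wall": [50.0, 58.0, 66.0],
}
location, average = best_exit_location(simulated)
print(location, average)
```

A real study would presumably repeat the simulation for several scenarios per location before comparing averages; the sketch compares a single run per location.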

# Delexicalized Paraphrase Generation ( http://arxiv.org/abs/2012.02763v1 )

License: CC BY 4.0

Boya Yu, Konstantine Arkoudas, Wael Hamza

We present a neural model for paraphrasing and train it to generate delexicalized sentences. We achieve this by creating training data in which each input is paired with a number of reference paraphrases. These sets of reference paraphrases represent a weak type of semantic equivalence based on annotated slots and intents. To capture the semantics of different slot types beyond simply anonymizing slots, we apply convolutional neural networks (CNNs) prior to pooling on slot values and use pointers to locate slots in the output. We show empirically that the generated paraphrases are of high quality, leading to an additional 1.29% exact match on live utterances. We also show that natural language understanding (NLU) tasks, such as intent classification and named entity recognition, can benefit from data augmentation using automatically generated paraphrases.

Translated: 2021-05-23 01:04:17 Published: 2020-12-04
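To make "delexicalized sentences" concrete, the sketch below replaces annotated slot values with placeholders and fills a delexicalized paraphrase back in. It only illustrates the data representation; the neural model from the abstract above is not shown. The helper names, the `{slot}` placeholder syntax, and the example utterance are hypothetical.

```python
def delexicalize(utterance, slots):
    """Replace each annotated slot value with a {slot_name} placeholder.

    slots maps slot names to the surface strings found in the utterance.
    """
    # Replace longer values first so a short value cannot clobber part of a longer one.
    for name, value in sorted(slots.items(), key=lambda kv: -len(kv[1])):
        utterance = utterance.replace(value, "{" + name + "}")
    return utterance


def relexicalize(template, slots):
    """Fill the placeholders of a delexicalized paraphrase back in."""
    return template.format(**slots)


slots = {"song": "yesterday", "artist": "the beatles"}
template = delexicalize("play yesterday by the beatles", slots)
print(template)
print(relexicalize("i want to hear {song} from {artist}", slots))
```

Plain string replacement is the simplest possible choice here; a production pipeline would more likely use the character spans from the slot annotations.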

# Representation Based Complexity Measures for Predicting Generalization in Deep Learning ( http://arxiv.org/abs/2012.02775v1 )

License: CC BY 4.0

Parth Natekar, Manik Sharma

Deep neural networks can generalize despite being significantly overparametrized. Recent research has examined this phenomenon from various viewpoints and has sought bounds on the generalization error, or measures predictive of the generalization gap, based on these viewpoints, such as norm-based, PAC-Bayes-based, and margin-based analyses. In this work, we provide an interpretation of generalization from the perspective of the quality of the internal representations of deep neural networks, based on neuroscientific theories of how the human visual system creates invariant and untangled object representations. Instead of providing theoretical bounds, we demonstrate practical complexity measures that can be computed ad hoc to uncover generalization behaviour in deep models. We also provide a detailed description of our solution that won the NeurIPS competition on Predicting Generalization in Deep Learning held at NeurIPS 2020. An implementation of our solution is available at https://github.com/parthnatekar/pgdl.

Translated: 2021-05-23 00:50:36 Published: 2020-12-04

# Unsupervised embedding of trajectories captures the latent structure of mobility ( http://arxiv.org/abs/2012.02785v1 )

License: CC BY 4.0

Dakota Murray, Jisung Yoon, Sadamori Kojaku, Rodrigo Costas, Woo-Sung Jung, Staša Milojević, Yong-Yeol Ahn

Human mobility and migration drive major societal phenomena such as the growth and evolution of cities, epidemics, economies, and innovation. Historically, human mobility has been strongly constrained by physical separation, that is, geographic distance. However, geographic distance is becoming less relevant in an increasingly globalized world in which physical barriers are shrinking while linguistic, cultural, and historical relationships are becoming more important. As understanding mobility is becoming critical for contemporary society, finding frameworks that can capture this complexity is of paramount importance. Here, using three distinct human trajectory datasets, we demonstrate that a neural embedding model can encode nuanced relationships between locations into a vector space, providing an effective measure of distance that reflects the multi-faceted structure of human mobility. Focusing on the case of scientific mobility, we show that embeddings of scientific organizations uncover cultural and linguistic relations, and even academic prestige, at multiple levels of granularity. Furthermore, the embedding vectors reveal universal relationships between organizational characteristics and their place in the global landscape of scientific mobility. The ability to learn scalable, dense, and meaningful representations of mobility directly from the data can open up a new avenue for studying mobility across domains.

Translated: 2021-05-23 00:06:03 Published: 2020-12-04

# Adaptive Explicit Kernel Minkowski Weighted K-means ( http://arxiv.org/abs/2012.02805v1 )

License: CC BY 4.0

Amir Aradnia, Maryam Amir Haeri and Mohammad Mehdi Ebadzadeh

The K-means algorithm is among the most commonly used data clustering methods. However, regular K-means can only be applied in the input space and is applicable only when clusters are linearly separable. Kernel K-means, which extends K-means into the kernel space, can capture nonlinear structures and identify arbitrarily shaped clusters. However, kernel methods often operate on the kernel matrix of the data, which scales poorly with the size of the matrix, or suffer from a high clustering cost due to repeated calculations of kernel values. Another issue is that the algorithms access the data only through evaluations of $K(x_i, x_j)$, which limits many of the operations that could be applied to the data during clustering. This paper proposes a method that combines the advantages of the linear and nonlinear approaches by deriving corresponding approximate finite-dimensional feature maps based on spectral analysis. Approximate finite-dimensional feature maps have previously been discussed only for Support Vector Machine (SVM) problems. We suggest using this method in the kernel K-means setting, as it avoids storing a huge kernel matrix in memory, computes cluster centers more efficiently, and accesses the data explicitly in the feature space. These explicit feature maps also let us take advantage of K-means extensions in that space. We demonstrate that our Explicit Kernel Minkowski Weighted K-means (Explicit KMWK-mean) method is more adaptive and finds best-fitting values in the new space by applying an additional Minkowski exponent and feature weight parameters. Moreover, it can reduce the impact of concentration on nearest-neighbour search by investigating norms other than the Euclidean norm, including Minkowski norms and fractional norms (an extension of the Minkowski norms with p<1).

Translated: 2021-05-23 00:04:14 Published: 2020-12-04
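The abstract above names a Minkowski exponent and feature weights but does not give the distance formula. The sketch below assumes the weighted form common in the Minkowski weighted K-means literature, $d_p(x, c) = \sum_v w_v^p |x_v - c_v|^p$, and applies it to plain Python lists. In the paper's setting the vectors would be the explicit feature-map images of the data, which are not constructed here. All function names are hypothetical.

```python
def minkowski_weighted_distance(x, center, weights, p):
    """Weighted Minkowski dissimilarity: sum_v (w_v ** p) * |x_v - c_v| ** p.

    Values 0 < p < 1 give a fractional norm. No p-th root is taken, which
    does not change which center is nearest.
    """
    return sum((w ** p) * abs(a - b) ** p for a, b, w in zip(x, center, weights))


def assign(x, centers, weights_per_cluster, p):
    """Index of the nearest center under cluster-specific feature weights."""
    distances = [
        minkowski_weighted_distance(x, c, w, p)
        for c, w in zip(centers, weights_per_cluster)
    ]
    return distances.index(min(distances))


centers = [[0.0, 0.0], [4.0, 4.0]]
weights = [[0.5, 0.5], [0.5, 0.5]]  # equal feature weights, for illustration
print(minkowski_weighted_distance([1.0, 3.0], centers[0], weights[0], 2))
print(assign([1.0, 1.0], centers, weights, 2))
print(assign([3.5, 3.0], centers, weights, 0.5))
```

A full algorithm would also update the centers and the feature weights in each iteration; only the assignment step is shown.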

# Learning summary features of time series for likelihood free inference ( http://arxiv.org/abs/2012.02807v1 )

License: CC BY 4.0

Pedro L. C. Rodrigues, Alexandre Gramfort

There has been increasing interest from the scientific community in using likelihood-free inference (LFI) to determine which parameters of a given simulator model best describe a set of experimental data. Despite exciting recent results and a wide range of possible applications, an important bottleneck of LFI when applied to time series data is the need to define a set of summary features, which are often hand-tailored based on domain knowledge. In this work, we present a data-driven strategy for automatically learning summary features from univariate time series and apply it to signals generated from autoregressive-moving-average (ARMA) models and the Van der Pol oscillator. Our results indicate that learning summary features from data can compete with and even outperform LFI methods based on hand-crafted values such as autocorrelation coefficients, even in the linear case.

Translated: 2021-05-22 23:53:22 Published: 2020-12-04
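For reference, the hand-crafted baseline mentioned in the abstract above, autocorrelation coefficients used as summary features, can be computed as in the sketch below. It uses the standard biased sample estimator; the function name and the toy series are hypothetical, and the learned summary features proposed in the paper are not shown.

```python
def autocorrelation(series, max_lag):
    """Sample autocorrelation coefficients for lags 0..max_lag.

    Uses the standard biased estimator: the lagged covariance sum divided by
    the lag-0 sum. Raises ZeroDivisionError for a constant series.
    """
    n = len(series)
    mu = sum(series) / n
    centered = [v - mu for v in series]
    denom = sum(c * c for c in centered)
    return [
        sum(centered[t] * centered[t + lag] for t in range(n - lag)) / denom
        for lag in range(max_lag + 1)
    ]


# Toy series; the vector of coefficients would serve as the summary statistic.
features = autocorrelation([1.0, 2.0, 3.0, 4.0, 5.0], max_lag=2)
print(features)
```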

# A Hierarchical Deep Actor-Critic Learning Method for Joint Distribution System State Estimation ( http://arxiv.org/abs/2012.02880v1 )

License: CC BY 4.0

Yuxuan Yuan, Kaveh Dehghanpour, Zhaoyu Wang, Fankun Bu

Due to the increasing penetration of volatile distributed photovoltaic (PV) resources, real-time monitoring of customers at the grid edge has become a critical task. However, this requires solving the distribution system state estimation (DSSE) jointly for both the primary and secondary levels of distribution grids, which is computationally complex and lacks scalability to large systems. To achieve near real-time solutions for DSSE, we present a novel hierarchical reinforcement learning-aided framework: at the first layer, a weighted least squares (WLS) algorithm solves the DSSE over primary medium-voltage feeders; at the second layer, deep actor-critic (A-C) modules are trained for each secondary transformer using measurement residuals to estimate the states of low-voltage circuits and capture the impact of PVs at the grid edge. While the A-C parameter learning process takes place offline, the trained A-C modules are deployed online for fast secondary grid state estimation; this is the key factor in the scalability and computational efficiency of the framework. To maintain monitoring accuracy, the two levels exchange boundary information with each other at the secondary nodes, including transformer voltages (first layer to second layer) and active/reactive total power injection (second layer to first layer). This interactive information-passing strategy results in a closed-loop structure that is able to track optimal solutions at both layers in a few iterations. Moreover, our model can handle topology changes using the Jacobian matrices of the first layer. We have performed numerical experiments using real utility data and feeder models to verify the performance of the proposed framework.

Translated: 2021-05-22 22:36:43 Published: 2020-12-04

# Review: Deep Learning Methods for Cybersecurity and Intrusion Detection Systems ( http://arxiv.org/abs/2012.02891v1 )

License: CC BY 4.0

Mayra Macas, Chunming Wu

As the number of cyber-attacks increases, cybersecurity is evolving into a key concern for any business. Artificial Intelligence (AI) and Machine Learning (ML), in particular Deep Learning (DL), can be leveraged as key enabling technologies for cyber defense, since they can contribute to threat detection and can even provide recommended actions to cyber analysts. A partnership of industry, academia, and government on a global scale is necessary in order to advance the adoption of AI/ML in cybersecurity and create efficient cyber defense systems. In this paper, we investigate the various deep learning techniques employed for network intrusion detection and introduce a DL framework for cybersecurity applications.

Translated: 2021-05-22 22:22:06 Published: 2020-12-04

# Discovering Underground Maps from Fashion ( http://arxiv.org/abs/2012.02897v1 )

License: CC BY 4.0

Utkarsh Mall, Kavita Bala, Tamara Berg, Kristen Grauman

The fashion sense of a geographical region, meaning the clothing styles people wear there, can reveal information about that region. For example, it can reflect the kind of activities people do there, or the type of crowds that frequently visit the region (e.g., tourist hot spot, student neighborhood, business center). We propose a method to automatically create underground neighborhood maps of cities by analyzing how people dress. Using publicly available images from across a city, our method finds neighborhoods with a similar fashion sense and segments the map without supervision. For 37 cities worldwide, we show promising results in creating good underground maps, as evaluated using experiments with human judges and underground map benchmarks derived from non-image data. Our approach further allows detecting distinct neighborhoods (what is the most unique region of LA?) and answering analogy questions between cities (what is the "Downtown LA" of Bogota?).

Translated: 2021-05-22 22:12:26 Published: 2020-12-04

# Automated Calibration of Mobile Cameras for 3D Reconstruction of Mechanical Pipes ( http://arxiv.org/abs/2012.02899v1 )

License: CC BY 4.0

Reza Maalek and Derek Lichti

This manuscript provides a new framework for the calibration of optical instruments, in particular mobile cameras, using large-scale circular black-and-white target fields. New methods are introduced for (i) matching targets between images; (ii) adjusting the systematic eccentricity error of target centers; and (iii) iteratively improving the calibration solution through a free-network self-calibrating bundle adjustment. The proposed target matching effectively matched circular targets in 270 mobile phone images from a complete calibration laboratory, with robustness to Type II errors. The proposed eccentricity adjustment, which requires only camera projective matrices from two views, behaved comparably to available closed-form solutions, which require additional object-space target information a priori. Finally, specifically for mobile devices, the calibration parameters obtained using our framework were found to be superior to in-situ calibration for estimating the 3D reconstructed radius of a mechanical pipe (approximately 45% improvement).

Translated: 2021-05-22 21:58:46 Published: 2020-12-04

# Learning Interpretable Concept-Based Models with Human Feedback ( http://arxiv.org/abs/2012.02898v1 )

License: CC BY 4.0

Isaac Lage, Finale Doshi-Velez

Machine learning models that first learn a representation of a domain in terms of human-understandable concepts, then use it to make predictions, have been proposed to facilitate interpretation of and interaction with models trained on high-dimensional data. However, these methods have important limitations: the way they define concepts is not inherently interpretable, and they assume that concept labels either exist for individual instances or can easily be acquired from users. These limitations are particularly acute for high-dimensional tabular features. We propose an approach for learning a set of transparent concept definitions in high-dimensional tabular data that relies on users labeling concept features instead of individual instances. Our method produces concepts that both align with users' intuitive sense of what a concept means and facilitate prediction of the downstream label by a transparent machine learning model. This ensures that the full model is transparent and intuitive, and as predictive as possible given this constraint. We demonstrate with simulated user feedback on real prediction problems, including one in a clinical domain, that this kind of direct feedback is much more efficient at learning solutions that align with ground-truth concept definitions than alternative transparent approaches that rely on labeling instances or other existing interaction mechanisms, while maintaining similar predictive performance.

Translated: 2021-05-22 21:29:40 Published: 2020-12-04

# Cross-Modal Generalization: Learning in Low Resource Modalities via Meta-Alignment ( http://arxiv.org/abs/2012.02813v1 )

License: see the linked page

Paul Pu Liang, Peter Wu, Liu Ziyin, Louis-Philippe Morency, Ruslan Salakhutdinov

The natural world is abundant with concepts expressed via visual, acoustic, tactile, and linguistic modalities. Much of the existing progress in multimodal learning, however, focuses primarily on problems where the same set of modalities is present at train and test time, which makes learning in low-resource modalities particularly difficult. In this work, we propose algorithms for cross-modal generalization: a learning paradigm to train a model that can (1) quickly perform new tasks in a target modality (i.e., meta-learning) and (2) do so while being trained on a different source modality. We study a key research question: how can we ensure generalization across modalities despite using separate encoders for different source and target modalities? Our solution is based on meta-alignment, a novel method to align representation spaces using strongly and weakly paired cross-modal data while ensuring quick generalization to new tasks across different modalities. We study this problem on 3 classification tasks: text to image, image to audio, and text to speech. Our results demonstrate strong performance even when the new target modality has only a few (1-10) labeled samples and in the presence of noisy labels, a scenario particularly prevalent in low-resource modalities.

Translated: 2021-05-22 20:55:30 Published: 2020-12-04

# Neural Dynamic Policies for End-to-End Sensorimotor Learning ( http://arxiv.org/abs/2012.02788v1 )

License: see the linked page

Shikhar Bahl, Mustafa Mukadam, Abhinav Gupta, Deepak Pathak

The current dominant paradigm in sensorimotor control, whether imitation or reinforcement learning, is to train policies directly in raw action spaces such as torque, joint angle, or end-effector position. This forces the agent to make decisions individually at each timestep in training and hence limits scalability to continuous, high-dimensional, and long-horizon tasks. In contrast, research in classical robotics has long exploited dynamical systems as a policy representation to learn robot behaviors via demonstrations. These techniques, however, lack the flexibility and generalizability provided by deep learning or reinforcement learning and have remained under-explored in such settings. In this work, we begin to close this gap and embed the structure of a dynamical system into deep neural network-based policies by reparameterizing action spaces via second-order differential equations. We propose Neural Dynamic Policies (NDPs) that make predictions in trajectory distribution space, as opposed to prior policy learning methods where actions represent the raw control space. The embedded structure allows end-to-end policy learning for both reinforcement and imitation learning setups. We show that NDPs outperform the prior state of the art in terms of either efficiency or performance across several robotic control tasks for both imitation and reinforcement learning setups. Project video and code are available at https://shikharbahl.github.io/neural-dynamic-policies/

Translated: 2021-05-22 20:54:55 Published: 2020-12-04
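The abstract above only says that actions are reparameterized "via second-order differential equations". As a rough illustration of what such a system looks like, the sketch below integrates a goal-directed spring-damper of the kind used in the dynamic movement primitive literature, $\ddot{y} = \alpha(\beta(g - y) - \dot{y}) + f(t)$. The exact equation, the constants, and the zero default forcing term are assumptions for illustration, not the authors' formulation; in an NDP-style policy a neural network would presumably output quantities such as the goal and the forcing term.

```python
def rollout(y0, goal, forcing=None, alpha=25.0, beta=6.25, dt=0.01, steps=300):
    """Integrate y'' = alpha * (beta * (goal - y) - y') + f(t) with Euler steps.

    beta = alpha / 4 makes the unforced system critically damped, so the
    trajectory approaches the goal without oscillating.
    """
    y, yd = y0, 0.0
    trajectory = [y]
    for k in range(steps):
        f = forcing(k * dt) if forcing is not None else 0.0
        ydd = alpha * (beta * (goal - y) - yd) + f
        yd += ydd * dt  # update the velocity first (semi-implicit Euler)
        y += yd * dt
        trajectory.append(y)
    return trajectory


trajectory = rollout(y0=0.0, goal=1.0)
print(trajectory[-1])
```

The point of the reparameterization is that the policy emits a whole smooth trajectory per decision instead of one raw action per timestep.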

# Pre-trained language models as knowledge bases for Automotive Complaint Analysis ( http://arxiv.org/abs/2012.02558v1 )

License: see the linked page

V. D. Viellieber and M. Aßenmacher

Recently it has been shown that large pre-trained language models like BERT (Devlin et al., 2018) are able to store commonsense factual knowledge captured in their pre-training corpora (Petroni et al., 2019). In our work, we further evaluate this ability with respect to an application from industry, creating a set of probes specifically designed to reveal technical quality issues captured as described incidents in unstructured customer feedback in the automotive industry. After probing the out-of-the-box versions of the pre-trained models with fill-in-the-mask tasks, we dynamically provide them with more knowledge via continual pre-training on the Office of Defects Investigation (ODI) Complaints data set. In our experiments, we compare the models' performance on queries about domain-specific topics with their performance when queried on factual knowledge itself, as Petroni et al. (2019) have done. For most of the evaluated architectures, the correct token is predicted with a $Precision@1$ ($P@1$) of above 60%, while for $P@5$ and $P@10$ values of well above 80% and up to 90%, respectively, are reached. These results show the potential of using language models as a knowledge base for structured analysis of customer feedback.

Translated: 2021-05-22 20:54:33 Published: 2020-12-04
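The $P@1$, $P@5$, and $P@10$ figures in the abstract above are rank-based scores for fill-in-the-mask probes. The sketch below assumes the usual definition from that probing setup: a probe counts as a hit if the gold token appears among the model's top-k predictions, and $P@k$ is the fraction of hits. The function name and the toy predictions are hypothetical and are not taken from the paper's data.

```python
def precision_at_k(ranked_predictions, gold_tokens, k):
    """Fraction of probes whose gold token is among the top-k predicted tokens.

    ranked_predictions[i] is the model's ranked token list for probe i
    (best first); gold_tokens[i] is the expected filler for that probe.
    """
    hits = sum(
        1 for preds, gold in zip(ranked_predictions, gold_tokens) if gold in preds[:k]
    )
    return hits / len(gold_tokens)


# Toy ranked outputs for four masked probes (made up for illustration).
predictions = [
    ["brake", "engine", "door"],
    ["airbag", "seat", "belt"],
    ["tire", "wheel", "rim"],
    ["radio", "engine", "light"],
]
gold = ["brake", "belt", "wheel", "horn"]
for k in (1, 2, 3):
    print(k, precision_at_k(predictions, gold, k))
```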

# Model-Agnostic Learning to Meta-Learn ( http://arxiv.org/abs/2012.02684v1 )

License: see the linked page

Arnout Devos, Yatin Dandi

In this paper, we propose a learning algorithm that enables a model to quickly exploit commonalities among related tasks from an unseen task distribution before quickly adapting to specific tasks from that same distribution. We investigate how learning with different task distributions can first improve adaptability by meta-finetuning on related tasks before improving goal-task generalization with finetuning. Synthetic regression experiments validate the intuition that learning to meta-learn improves adaptability and, subsequently, generalization. The methodology, setup, and hypotheses in this proposal were positively evaluated by peer review before conclusive experiments were carried out.

Translated: 2021-05-22 20:54:01 Published: 2020-12-04

# Playing Text-Based Games with Common Sense ( http://arxiv.org/abs/2012.02757v1 )

License: see the linked page

Sahith Dambekodi, Spencer Frazier, Prithviraj Ammanabrolu, Mark O. Riedl

Text-based games are simulations in which an agent interacts with the world purely through natural language. They typically consist of a number of puzzles interspersed with interactions with common everyday objects and locations. Deep reinforcement learning agents can learn to solve these puzzles. However, the everyday interactions with the environment, while trivial for human players, present as additional puzzles to agents. We explore two techniques for incorporating commonsense knowledge into agents: (1) inferring possibly hidden aspects of the world state with either a commonsense inference model, COMET, or a language model, BERT; and (2) biasing an agent's exploration according to common patterns recognized by a language model. We test our techniques in the 9to05 game, an extreme version of a text-based game that requires numerous interactions with common, everyday objects in common, everyday scenarios. We conclude that agents that augment their beliefs about the world state with commonsense inferences are more robust to observational errors and to omissions of common elements from text descriptions.

Translated: 2021-05-22 20:53:38 Published: 2020-12-04

# Self-Supervised VQA: Answering Visual Questions using Images and Captions ( http://arxiv.org/abs/2012.02356v1 )

License: see the linked page

Pratyay Banerjee, Tejas Gokhale, Yezhou Yang, Chitta Baral

Methodologies for training VQA models assume the availability of datasets with human-annotated Image-Question-Answer (I-Q-A) triplets. This has led to heavy reliance on and overfitting to these datasets, and to a lack of generalization to new types of questions and scenes. Moreover, these datasets exhibit annotator subjectivity, biases, and errors, along with linguistic priors, which percolate into VQA models trained on such samples. We study whether models can be trained without any human-annotated Q-A pairs, using only images and associated text captions, which are descriptive and less subjective. We present a method to train models with Q-A pairs that are procedurally generated from captions using techniques such as templates and annotation frameworks like QASRL. As most VQA models rely on dense and costly object annotations extracted from object detectors, we propose spatial-pyramid image patches as a simple but effective alternative to object bounding boxes, and demonstrate that our method uses fewer human annotations. We benchmark on VQA-v2, GQA, and VQA-CP, which contains a softer version of label shift. Our methods surpass prior supervised methods on VQA-CP and are competitive with methods without object features in the fully supervised setting.

Translated: 2021-05-22 20:53:04 Published: 2020-12-04

# MPG: A Multi-ingredient Pizza Image Generator with Conditional StyleGANs ( http://arxiv.org/abs/2012.02821v1 )

License: see the linked page

Fangda Han, Guoyao Hao, Ricardo Guerrero, Vladimir Pavlovic

Multilabel conditional image generation is a challenging problem in computer vision. In this work, we propose the Multi-ingredient Pizza Generator (MPG), a conditional Generative Adversarial Network (GAN) framework for synthesizing multilabel images. We design MPG based on a state-of-the-art GAN structure called StyleGAN2, in which we develop a new conditioning technique by enforcing intermediate feature maps to learn scalewise label information. Because of the complex nature of the multilabel image generation problem, we also regularize the synthetic image by predicting the corresponding ingredients and encourage the discriminator to distinguish between matched and mismatched images. To verify the efficacy of MPG, we test it on Pizza10, a carefully annotated multi-ingredient pizza image dataset. MPG can successfully generate photorealistic pizza images with the desired ingredients. The framework can easily be extended to other multilabel image generation scenarios.

Translated: 2021-05-22 20:52:43 Published: 2020-12-04

# 逆ダイナミクスモデルを用いた画素からの計画

Planning from Pixels using Inverse Dynamics Models ( http://arxiv.org/abs/2012.02419v1 )

ライセンス: Link先を確認

Keiran Paster, Sheila A. McIlraith, Jimmy Ba

(参考訳) 高次元観測空間におけるタスク非依存力学モデルの学習は、モデルベースRLエージェントでは困難である。本稿では,タスク完了に条件づけられた将来の行動のシーケンスを予測し,潜在世界モデルを学ぶ新しい方法を提案する。これらのタスク条件付きモデルは、タスク関連力学のモデリング能力に適応的に焦点を合わせ、同時にスパース報酬を伴う計画のための効果的なヒューリスティックとして機能する。本手法は,視覚目標達成課題に対する課題評価を行い,従来のモデルフリーアプローチに比べて性能が大幅に向上することを示す。

Learning task-agnostic dynamics models in high-dimensional observation spaces can be challenging for model-based RL agents. We propose a novel way to learn latent world models by learning to predict sequences of future actions conditioned on task completion. These task-conditioned models adaptively focus modeling capacity on task-relevant dynamics, while simultaneously serving as an effective heuristic for planning with sparse rewards. We evaluate our method on challenging visual goal completion tasks and show a substantial increase in performance compared to prior model-free approaches.

翻訳日:2021-05-22 20:52:09 公開日:2020-12-04

# グラフ上の教師なし逆回転表現学習

Unsupervised Adversarially-Robust Representation Learning on Graphs ( http://arxiv.org/abs/2012.02486v1 )

ライセンス: Link先を確認

Jiarong Xu, Junru Chen, Yang Yang, Yizhou Sun, Chunping Wang, Jiangang Lu

(参考訳) 近年の研究では、グラフの深層学習が敵の攻撃に弱いことが示されており、入力データに対する知覚不能な摂動が劇的な性能劣化を引き起こす可能性がある。本稿では,グラフ上のロバスト表現を相互情報を通して学習する基礎的な問題に着目する。ラベル空間に基づいてタスク固有ロバスト性を測定する以前の研究とは対照的に、グラフトポロジとノード属性の合同入力空間を考慮に入れたタスク自由ロバスト性の測定には、表現空間を利用する。本稿では,この問題を制約付きサドル点最適化問題として定式化し,探索空間の縮小で効率よく解く。さらに,タスクフリーなロバストネス尺度と下流分類器のロバストネスとの理論的関係を確実に確立する。大規模な実験により,提案手法はグラフに対する敵攻撃に対する堅牢性を向上できるが,自然な精度も向上できることが示された。

Recent works have demonstrated that deep learning on graphs is vulnerable to adversarial attacks, in that imperceptible perturbations on input data can lead to dramatic performance deterioration. In this paper, we focus on the underlying problem of learning robust representations on graphs via mutual information. In contrast to previous works that measure task-specific robustness based on the label space, we here take advantage of the representation space to study a task-free robustness measure given the joint input space w.r.t. graph topology and node attributes. We formulate this problem as a constrained saddle point optimization problem and solve it efficiently in a reduced search space. Furthermore, we provably establish theoretical connections between our task-free robustness measure and the robustness of downstream classifiers. Extensive experiments demonstrate that our proposed method is able to enhance robustness against adversarial attacks on graphs while even increasing natural accuracy.

翻訳日:2021-05-22 20:52:01 公開日:2020-12-04

# 手続き的生成環境における実証効率の良い逆強化学習

Demonstration-efficient Inverse Reinforcement Learning in Procedurally Generated Environments ( http://arxiv.org/abs/2012.02527v1 )

ライセンス: Link先を確認

Alessandro Sestini, Alexander Kuhnle and Andrew D. Bagdanov

(参考訳) 深層強化学習は、報酬関数を手作業で設計できる領域において、非常に良い結果をもたらす。同時に、このタイプの環境はドメインシフト下でエージェントの過剰フィットと一般化を研究するのに最適であるため、pcg(procedurally content generation)に基づいたゲームをベンチマーク環境として使用するコミュニティの関心が高まっている。逆強化学習(IRL)は、専門家によるデモンストレーションから報酬関数を外挿する代わりに、高次元問題においても良い結果が得られるが、これらのテクニックを手続き的に生成された環境に適用する例はない。これは主に、良い報酬モデルを見つけるのに必要なデモの数のためです。そこで本研究では,pcgゲームにおける実演の必要性を大幅に減らすことができる逆強化学習に基づく手法を提案する。初期シードレベルが制限された環境と、トレーニングを安定させるためにいくつかの修正を加えることで、私たちのアプローチであるDE-AIRLは実証効率が高く、完全に手続き領域に一般化する報酬関数を外挿できることを示す。本手法は,MiniGridとDeepCrawlの2つの手続き環境において,様々なタスクに対して有効であることを示す。

Deep Reinforcement Learning achieves very good results in domains where reward functions can be manually engineered. At the same time, there is growing interest within the community in using games based on Procedural Content Generation (PCG) as benchmark environments, since this type of environment is well suited to studying overfitting and generalization of agents under domain shift. Inverse Reinforcement Learning (IRL) can instead extrapolate reward functions from expert demonstrations, with good results even on high-dimensional problems; however, there are no examples of applying these techniques to procedurally generated environments. This is mostly due to the number of demonstrations needed to find a good reward model. We propose a technique based on Adversarial Inverse Reinforcement Learning which can significantly decrease the need for expert demonstrations in PCG games. Through the use of an environment with a limited set of initial seed levels, plus some modifications to stabilize training, we show that our approach, DE-AIRL, is demonstration-efficient and still able to extrapolate reward functions which generalize to the fully procedural domain. We demonstrate the effectiveness of our technique on two procedural environments, MiniGrid and DeepCrawl, for a variety of tasks.

翻訳日:2021-05-22 20:51:46 公開日:2020-12-04

# DPM:外挿における物理情報ニューラルネットワークの新しいトレーニング手法

DPM: A Novel Training Method for Physics-Informed Neural Networks in Extrapolation ( http://arxiv.org/abs/2012.02681v1 )

ライセンス: Link先を確認

Jungeun Kim, Kookjin Lee, Dongeun Lee, Sheo Yon Jin, Noseong Park

(参考訳) 本稿では,時間依存非線形偏微分方程式 (PDEs) によって記述される複雑な物理過程のダイナミクスを学ぶ手法を提案する。私たちの特に関心は、トレーニングで使用される時間領域の範囲を超えて、ソリューションを外挿することにあります。ベースライン手法として物理インフォームドニューラルネットワーク (PINN) [Raissi et al., J. Comput. Phys., 378:686--707, 2019] を選ぶのは、この手法が解だけでなく、物理過程のダイナミクスを記述する方程式もパラメタライズするためである。 PINNは,多くのベンチマーク問題において,外挿作業において不十分な性能を示す。そこで本研究では,新しいPINNのトレーニング手法を提案するとともに,拡張されたPINNが解の正確な外挿を時間内に行えることを示す。提案手法は,標準L2ノルム法において,既存の手法よりも最大72%小さい誤差を示す。

We present a method for learning the dynamics of complex physical processes described by time-dependent nonlinear partial differential equations (PDEs). Our particular interest lies in extrapolating solutions in time beyond the range of the temporal domain used in training. Our choice of baseline method is the physics-informed neural network (PINN) [Raissi et al., J. Comput. Phys., 378:686--707, 2019], because the method parameterizes not only the solutions but also the equations that describe the dynamics of physical processes. We demonstrate that PINN performs poorly on extrapolation tasks in many benchmark problems. To address this, we propose a novel method for better training PINNs and demonstrate that our newly enhanced PINNs can accurately extrapolate solutions in time. Our method shows up to 72% smaller errors than existing methods in terms of the standard L2-norm metric.

翻訳日:2021-05-22 20:51:27 公開日:2020-12-04

# 安全なAI構築のための11の提案の概要

An overview of 11 proposals for building safe advanced AI ( http://arxiv.org/abs/2012.07532v1 )

ライセンス: Link先を確認

Evan Hubinger

(参考訳) 本稿では、反復増幅、議論によるai安全性、再帰的報酬モデリングなどを含む、現在の機械学習パラダイムの下で安全な高度なaiを構築するための11の異なる提案を分析し比較する。本論文では,各提案について,外方アライメント,内方アライメント,トレーニング競合性,パフォーマンス競争力の4つの要素について評価し,後者の2つを区別する。先行文献は主に個々の提案の分析に重点を置いているが、この分析は前述した4つのコンポーネントの比較分析を含む幅広い提案を比較検討することを目的としている。

This paper analyzes and compares 11 different proposals for building safe advanced AI under the current machine learning paradigm, including major contenders such as iterated amplification, AI safety via debate, and recursive reward modeling. Each proposal is evaluated on the four components of outer alignment, inner alignment, training competitiveness, and performance competitiveness, of which the distinction between the latter two is introduced in this paper. While prior literature has primarily focused on analyzing individual proposals, or primarily focused on outer alignment at the expense of inner alignment, this analysis seeks to take a comparative look at a wide range of proposals including a comparative analysis across all four previously mentioned components.

翻訳日:2021-05-22 20:51:12 公開日:2020-12-04

# ソフトウェア工学におけるチャットボットの自然言語理解プラットフォームの比較

A Comparison of Natural Language Understanding Platforms for Chatbots in Software Engineering ( http://arxiv.org/abs/2012.02640v1 )

ライセンス: Link先を確認

Ahmad Abdellatif, Khaled Badran, Diego Elias Costa, and Emad Shihab

(参考訳) チャットボットは、ソフトウェアエンジニアリングの未来を劇的に変え、実践者が彼らのソフトウェアプロジェクトについてチャットし、調査し、自然言語を使ってさまざまなサービスと対話できるようにする。すべてのチャットボットの中心には自然言語理解(NLU)コンポーネントがあり、チャットボットは自然言語入力を理解できる。近年、チャットボットの既製のNLUコンポーネントとして多くのNLUプラットフォームが提供されているが、ソフトウェアエンジニアリングチャットボットの最高のNLUを選択することはオープンな課題である。そこで本稿では,IBM Watson, Google Dialogflow, Rasa, Microsoft LUIS の4つの NLU を評価し,ソフトウェア工学ベースのチャットボットにおける NLU の使用方法を明らかにした。具体的には,NLUの性能を,意図の分類,信頼度スコアの安定性,実体抽出において検証する。 nlusを評価するには、ソフトウェアエンジニアリングの実践者が行う2つの共通タスクを反映した2つのデータセットを使用する。1) チャットボットとチャットしてソフトウェアリポジトリについて質問するタスク 2) q&aフォーラム(例えばstack overflow)で開発質問をするタスク。我々の発見によると、IBM Watsonは3つの側面(インテント分類、信頼性スコア、エンティティ抽出)を考慮すると、最高のNLUである。しかしながら、各側面の結果から、ibm watsonは、意図分類において、f1-measure > 84%で最高のパフォーマンスを示すが、信頼度スコアでは、rasaは、0.91よりも高い信頼度スコアで上位に来る。また,ダイアログフローを除くすべてのNLUが信頼できる信頼スコアを提供することを示した。エンティティ抽出では、Microsoft LUISとIBM Watsonが2つのSEタスクで他のNLUを上回っている。この結果は,チャットボットでどのNLUを使うかを決める際に,ソフトウェア工学の実践者にガイダンスを提供する。

Chatbots are envisioned to dramatically change the future of Software Engineering, allowing practitioners to chat and inquire about their software projects and interact with different services using natural language. At the heart of every chatbot is a Natural Language Understanding (NLU) component that enables the chatbot to understand natural language input. Recently, many NLU platforms have been provided to serve as off-the-shelf NLU components for chatbots; however, selecting the best NLU for Software Engineering chatbots remains an open challenge. Therefore, in this paper, we evaluate four of the most commonly used NLUs, namely IBM Watson, Google Dialogflow, Rasa, and Microsoft LUIS, to shed light on which NLU should be used in Software Engineering based chatbots. Specifically, we examine the NLUs' performance in classifying intents, the stability of their confidence scores, and extracting entities. To evaluate the NLUs, we use two datasets that reflect two common tasks performed by Software Engineering practitioners: 1) the task of chatting with the chatbot to ask questions about software repositories, and 2) the task of asking development questions on Q&A forums (e.g., Stack Overflow). According to our findings, IBM Watson is the best performing NLU when considering the three aspects (intent classification, confidence scores, and entity extraction). However, the results from each individual aspect show that, in intent classification, IBM Watson performs the best with an F1-measure > 84%, but in confidence scores, Rasa comes out on top with a median confidence score higher than 0.91. Our results also show that all NLUs, except for Dialogflow, generally provide trustworthy confidence scores. For entity extraction, Microsoft LUIS and IBM Watson outperform the other NLUs in the two SE tasks. Our results provide guidance to software engineering practitioners when deciding which NLU to use in their chatbots.
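
The F1-measure quoted above for intent classification can be illustrated with a short sketch. This is only an illustration under assumptions: the function name, the toy intent labels, and the per-intent (one-vs-rest) computation are hypothetical, and the abstract does not state how the authors aggregate F1 across intents.

```python
def f1_for_intent(true_intents, predicted_intents, intent):
    """Per-intent F1: harmonic mean of precision and recall for one intent."""
    pairs = list(zip(true_intents, predicted_intents))
    true_pos = sum(1 for t, p in pairs if t == intent and p == intent)
    false_pos = sum(1 for t, p in pairs if t != intent and p == intent)
    false_neg = sum(1 for t, p in pairs if t == intent and p != intent)
    if true_pos == 0:
        # No correct predictions for this intent: define F1 as zero.
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy labels, not taken from the paper's datasets.
true_labels = ["repo_query", "repo_query", "dev_question", "dev_question"]
predicted = ["repo_query", "dev_question", "dev_question", "dev_question"]
score = f1_for_intent(true_labels, predicted, "dev_question")
```

Here `dev_question` has two true positives and one false positive, so precision is 2/3 and recall is 1.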

翻訳日:2021-05-22 20:50:59 公開日:2020-12-04

# 近似最適化平滑化アルゴリズム

Proximal Policy Optimization Smoothed Algorithm ( http://arxiv.org/abs/2012.02439v1 )

ライセンス: Link先を確認

Wangshu Zhu and Andre Rosendo

(参考訳) PPO(Proximal Policy Optimization)は、強化学習のサブフィールドであるポリシーサーチにおいて、各ポリシー更新におけるステップサイズを制限するために代理目的関数を使用することによって、最先端の結果を得た。このような制限は有用であるが、このアルゴリズムは曲線の急激な平坦化による性能不安定性と最適化の非効率さに悩まされている。この問題に対処するために,近位政策最適化スムースアルゴリズム(proximal policy optimization smooth algorithm, ppo)と呼ばれるppo変種を提案する。我々は,ロールバッククリッピング方式を採用したPPOとPPORBを比較し,他のPPO法よりも各ステップでより正確な更新を行うことができることを示す。さらに, 連続制御タスクにおける性能, 安定性の両面で, 最新のPPO変種よりも優れていることを示す。

Proximal policy optimization (PPO) has yielded state-of-the-art results in policy search, a subfield of reinforcement learning, with one of its key points being the use of a surrogate objective function to restrict the step size at each policy update. Although such a restriction is helpful, the algorithm still suffers from performance instability and optimization inefficiency caused by the sudden flattening of the curve. To address this issue, we present a PPO variant, named Proximal Policy Optimization Smooth Algorithm (PPOS), whose critical improvement is the use of a functional clipping method instead of a flat clipping method. We compare our method with PPO and PPORB, which adopts a rollback clipping method, and prove that our method can conduct more accurate updates at each time step than other PPO methods. Moreover, we show that it outperforms the latest PPO variants on both performance and stability in challenging continuous control tasks.
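
The flat clipping that the abstract contrasts with can be made concrete with a small sketch of the standard PPO clipped surrogate. The function names and the default `epsilon=0.2` are illustrative assumptions; the functional clipping used by PPOS is not specified in the abstract and is deliberately not reproduced here.

```python
def clipped_surrogate(ratio, advantage, epsilon=0.2):
    """Standard PPO term: min(r * A, clip(r, 1 - eps, 1 + eps) * A)."""
    clipped_ratio = max(1.0 - epsilon, min(ratio, 1.0 + epsilon))
    return min(ratio * advantage, clipped_ratio * advantage)


def ppo_clip_objective(ratios, advantages, epsilon=0.2):
    """Average of the clipped surrogate over a batch of samples."""
    terms = [clipped_surrogate(r, a, epsilon) for r, a in zip(ratios, advantages)]
    return sum(terms) / len(terms)


# With a positive advantage, the term stops growing once the probability
# ratio exceeds 1 + epsilon: this is the "flat" region of the objective.
inside = clipped_surrogate(1.1, 1.0)
at_edge = clipped_surrogate(1.5, 1.0)
far_out = clipped_surrogate(2.0, 1.0)
```

In the flat region the objective no longer depends on the ratio, so its gradient with respect to the policy is zero there, which is the flattening the abstract refers to.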

翻訳日:2021-05-22 20:50:19 公開日:2020-12-04

# 連合学習におけるバイアス緩和

Mitigating Bias in Federated Learning ( http://arxiv.org/abs/2012.02447v1 )

ライセンス: Link先を確認

Annie Abay, Yi Zhou, Nathalie Baracaldo, Shashank Rajamoni, Ebube Chuba, Heiko Ludwig

(参考訳) 差別を意識したモデルを作成する方法として、彼らは集中型MLに集中し、連邦学習(FL)は未探索のままである。 FLはコラボレーティブMLの上昇するアプローチであり、アグリゲータは複数のパーティを編成して、トレーニングデータを共有せずにグローバルモデルをトレーニングする。本稿では,flにおけるバイアスの原因について議論し,データプライバシを損なうことなくバイアスを軽減するための3つの前処理および内処理手法を提案する。当事者間のデータの不均一性はflの難解な特徴の1つであり,モデル性能,公平度指標,バイアス学習パターンへの影響を分析するために,複数のデータ分布について実験を行う。提案手法の包括的分析を行い,データ分布が歪んだり,20%の当事者がこの手法を用いていた場合でも,これらの手法が有効であることを示す。

As methods to create discrimination-aware models develop, they focus on centralized ML, leaving federated learning (FL) unexplored. FL is a rising approach for collaborative ML, in which an aggregator orchestrates multiple parties to train a global model without sharing their training data. In this paper, we discuss causes of bias in FL and propose three pre-processing and in-processing methods to mitigate bias without compromising data privacy, a key FL requirement. As data heterogeneity among parties is one of the challenging characteristics of FL, we conduct experiments over several data distributions to analyze their effects on model performance, fairness metrics, and bias learning patterns. We conduct a comprehensive analysis of our proposed techniques, and the results demonstrate that these methods are effective even when parties have skewed data distributions or when as few as 20% of parties employ the methods.

翻訳日:2021-05-22 20:50:01 公開日:2020-12-04

# ウェアラブルストレスに対するベイズ能動学習と影響検出

Bayesian Active Learning for Wearable Stress and Affect Detection ( http://arxiv.org/abs/2012.02702v1 )

ライセンス: Link先を確認

Abhijith Ragav, Gautham Krishna Gudur

(参考訳) 近年,ヒトでは心理的ストレスが観察され,早期発見は健康リスクの予防に不可欠である。デバイス上での深層学習アルゴリズムによるストレス検出は、広汎なコンピューティングの進歩により増加傾向にある。しかし、対処すべき重要な課題は、適切な地上の真理化技術(アクティブラーニングなど)を通じて、ラベルのないデータをリアルタイムで処理することであり、これは、感情的な状態(ラベル)を確立するのに役立つと同時に、オラクルからクエリする最も情報に富むデータポイントのみを選択するのに役立つ。本稿では,モンテカルロ(mc)ドロップアウトを用いたベイジアンニューラルネットワークにおける近似によるモデル不確実性を表現する枠組みを提案する。これはアクティブラーニングに適した獲得関数と組み合わせられる。 raspberry pi 2で実験された一般的なストレス・インパクト検出データセットを用いた実験結果から,提案フレームワークは,様々な獲得関数を横断するアクティブラーニングにおいて,取得したプールポイントの数がかなり少なく,推論時の効率が大幅に向上することが示唆された。変動比は90.38%の精度を達成し、約40%少ないデータでトレーニング中に達成されるテストの最大精度に匹敵する。

In the recent past, psychological stress has been increasingly observed in humans, and early detection is crucial to prevent health risks. Stress detection using on-device deep learning algorithms has been on the rise owing to advancements in pervasive computing. However, an important challenge that needs to be addressed is handling unlabeled data in real time via suitable ground truthing techniques (like Active Learning), which should help establish affective states (labels) while also selecting only the most informative data points to query from an oracle. In this paper, we propose a framework with capabilities to represent model uncertainties through approximations in Bayesian Neural Networks using Monte-Carlo (MC) Dropout. This is combined with suitable acquisition functions for active learning. Empirical results on a popular stress and affect detection dataset, obtained on a Raspberry Pi 2, indicate that our proposed framework achieves a considerable efficiency boost during inference, with a substantially lower number of acquired pool points during active learning across various acquisition functions. Variation Ratios achieves an accuracy of 90.38%, which is comparable to the maximum test accuracy achieved, while training on about 40% less data.
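
As a rough illustration of the Variation Ratios acquisition function named above, the sketch below scores pool points from the class labels predicted over several stochastic forward passes (as MC Dropout would produce with dropout left active at inference). The input lists are hypothetical stand-ins for model outputs; no network is trained here, and the function names are assumptions rather than the authors' code.

```python
from collections import Counter


def variation_ratio(sampled_labels):
    """1 - (count of the modal label / number of stochastic passes)."""
    counts = Counter(sampled_labels)
    mode_count = counts.most_common(1)[0][1]
    return 1.0 - mode_count / len(sampled_labels)


def select_queries(pool_predictions, k):
    """Indices of the k pool points with the highest variation ratio."""
    scores = [variation_ratio(labels) for labels in pool_predictions]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


# Hypothetical labels from 4 stochastic passes for each of 3 pool points.
pool = [
    [0, 0, 0, 0],  # passes agree: low uncertainty
    [0, 1, 2, 1],  # passes disagree: high uncertainty
    [1, 1, 1, 0],
]
to_label = select_queries(pool, k=1)
```

A point on which the passes disagree gets a higher score and is sent to the oracle first, which is how the acquisition function limits the number of labels requested.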

翻訳日:2021-05-22 20:49:27 公開日:2020-12-04

# 1ビットフィードバックは高信頼境界ポリシーに十分である

One-bit feedback is sufficient for upper confidence bound policies ( http://arxiv.org/abs/2012.02876v1 )

ライセンス: Link先を確認

Daniel Vial, Sanjay Shakkottai, R. Srikant

(参考訳) 従来のマルチアームバンディット問題の変種を考察し、各アームは、その過去の報酬履歴に基づいて、プル毎に1ビットのフィードバックしか提供できない。我々の主な結果は次のとおりである: フルリワードフィードバックを用いた高信頼バウンドポリシーが与えられると、1ビットフィードバックを生成するためのコーディングスキームと、我々のポリシーによって達成された後悔の比率とフルリワードフィードバックポリシーの後悔が漸近的に近づくような、対応するデコーディングスキームとアーム選択ポリシーが存在する。

We consider a variant of the traditional multi-armed bandit problem in which each arm is only able to provide one-bit feedback during each pull based on its past history of rewards. Our main result is the following: given an upper confidence bound policy which uses full-reward feedback, there exists a coding scheme for generating one-bit feedback, and a corresponding decoding scheme and arm selection policy, such that the ratio of the regret achieved by our policy and the regret of the full-reward feedback policy asymptotically approaches one.
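
For readers unfamiliar with upper confidence bound policies, here is a minimal sketch of UCB1 with full-reward feedback, one standard instance of the kind of policy the result starts from. The paper's one-bit coding and decoding schemes are not described in the abstract and are not sketched here; the function names are illustrative assumptions.

```python
import math


def ucb1_index(mean_reward, pulls, t):
    """Empirical mean plus an exploration bonus that shrinks with more pulls."""
    return mean_reward + math.sqrt(2.0 * math.log(t) / pulls)


def select_arm(reward_sums, pulls, t):
    """Choose the arm to pull at round t (t counts rounds from 1)."""
    # Pull every arm once before using the index.
    for arm, count in enumerate(pulls):
        if count == 0:
            return arm
    indices = [ucb1_index(s / n, n, t) for s, n in zip(reward_sums, pulls)]
    return max(range(len(indices)), key=lambda arm: indices[arm])


# Arm 1 has been pulled only once, so its exploration bonus dominates.
choice = select_arm([9.0, 1.0], [10, 1], t=11)
```

Note that the index uses the full history of real-valued rewards for each arm; the abstract's claim is that a suitably coded single bit per pull suffices to match this asymptotically.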

翻訳日:2021-05-22 20:49:07 公開日:2020-12-04

# 色は可塑性か? 画像カラー化のためのUCapsNet

Is It a Plausible Colour? UCapsNet for Image Colourisation ( http://arxiv.org/abs/2012.02478v1 )

ライセンス: Link先を確認

Rita Pucci, Christian Micheloni, Gian Luca Foresti, Niki Martinel

(参考訳) 人間は、意味的特徴抽出の能力のおかげで、特に努力することなく、グレースケールの画像の色を想像することができる。自律システムはそれを達成できますか? 幻覚は可視で活気ある色にできるのか? これは色付けの問題です。事前学習した畳み込みニューラルネットワークモデルに依存する既存の作業とは違って,このような色分け問題を自己教師付き学習タスクとしてキャストした。逆学習パラダイムに従って学習したカプセルに基づく新しいアーキテクチャを導入することで,この問題に対処する。カプセルネットワークは、画像内のエンティティのセマンティック表現を抽出することができるが、その空間情報の詳細は緩く、グレースケール画像のカラー化には重要である。したがって、我々のucapsnet構造は、畳み込みニューラルネットワークを通してカプセルや空間的詳細を通して実体を抽出するエンコーディングフェーズを伴います。復号位相は、エンティティ特徴と空間特徴とを結合し、入力されたデータムの可算な色バージョンを暗示する。 ImageNetベンチマークの結果、我々のアプローチは出口ソリューションよりも鮮やかで可視な色を生成でき、監督下で事前訓練されたモデルよりも優れた性能を達成できることがわかった。

Human beings can imagine the colours of a grayscale image with no particular effort thanks to their ability of semantic feature extraction. Can an autonomous system achieve that? Can it hallucinate plausible and vibrant colours? This is the colourisation problem. Different from existing works relying on convolutional neural network models pre-trained with supervision, we cast the colourisation problem as a self-supervised learning task. We tackle the problem with the introduction of a novel architecture based on Capsules trained following the adversarial learning paradigm. Capsule networks are able to extract a semantic representation of the entities in the image but lose details about their spatial information, which is important for colourising a grayscale image. Thus our UCapsNet structure comes with an encoding phase that extracts entities through capsules and spatial details through convolutional neural networks. A decoding phase merges the entity features with the spatial features to hallucinate a plausible colour version of the input datum. Results on the ImageNet benchmark show that our approach is able to generate more vibrant and plausible colours than existing solutions and achieves performance superior to models pre-trained with supervision.

翻訳日:2021-05-22 20:48:55 公開日:2020-12-04

# 生成モデルにおけるデータバイアスに関する一考察

A Note on Data Biases in Generative Models ( http://arxiv.org/abs/2012.02516v1 )

ライセンス: Link先を確認

Patrick Esser and Robin Rombach and Björn Ommer

(参考訳) 機械は不公平さや偏見の傾向が低いと考えるのは誘惑的だ。しかし、機械学習のアプローチはデータに基づいて出力を計算する。バイアスは開発パイプラインの任意の段階に入ることができるが、モデルは特にトレーニング対象のデータセットのバイアスを反映して受け入れられるため、必ずしも世界の真実を反映するものではなく、主にデータに関する真実を反映している。現代のアルゴリズムとそれらを形成するデータの関係性に関する認識を高めるために、条件付き可逆ニューラルネットワークを用いて、異なるデータセット間で共有される情報からデータセット固有の情報を分離する。このようにして、同じ画像を異なるデータセットに投影することで、それら固有のバイアスを明らかにすることができる。本手法は, 生成モデルの性能に及ぼすデータセット品質の影響, (ii) 生成モデルによってデータセットの社会的バイアスがどのように再現されるか, (iii) 写真, 油絵, アニメなどの多様なデータセット間の不適切な移動を通して, 創造的応用を示すために用いられる。私たちのコードとインタラクティブなデモはhttps://github.com/compvis/net2netで閲覧できます。

It is tempting to think that machines are less prone to unfairness and prejudice. However, machine learning approaches compute their outputs based on data. While biases can enter at any stage of the development pipeline, models are particularly prone to mirroring biases of the datasets they are trained on and therefore do not necessarily reflect truths about the world but, primarily, truths about the data. To raise awareness about the relationship between modern algorithms and the data that shape them, we use a conditional invertible neural network to disentangle the dataset-specific information from the information which is shared across different datasets. In this way, we can project the same image onto different datasets, thereby revealing their inherent biases. We use this methodology to (i) investigate the impact of dataset quality on the performance of generative models, (ii) show how societal biases of datasets are replicated by generative models, and (iii) present creative applications through unpaired transfer between diverse datasets such as photographs, oil portraits, and anime. Our code and an interactive demonstration are available at https://github.com/CompVis/net2net.

翻訳日:2021-05-22 20:48:37 公開日:2020-12-04

# 教師付き学習の再検討: 生物学習からその名前で呼ぶことへの洞察

Rethinking supervised learning: insights from biological learning and from calling it by its name ( http://arxiv.org/abs/2012.02526v1 )

ライセンス: Link先を確認

Alex Hernandez-Garcia

(参考訳) ニューラルネットワークのルネッサンスは、より広い用語の教師付き学習でコミュニティによってタグ付けされた分類モデルの成功によって触媒された。並外れた結果は、野心的な約束と過度に満ちた誇大宣伝を引き起こした。コミュニティはすぐに、この成功は何千ものラベル付きサンプルが利用可能になったことによるものだと気付いた。監督された学習は多くが栄光から恥へと変わりましたディープ・ラーニングを全体として批判する者もいれば、予測、教師なし、半監督、さらに最近では自己監督型ラーニングといった教師付きラーニングの方法が「代替的」でなければならないと宣言する者もいた。しかし、これらは理論的に根拠のある分類の実際の分類ではなく、すべてブランド名に思える。さらに、教師付き学習を追放するという呼びかけは、人間がほとんどあるいは全く監督せずに学ぶという疑わしい主張によって動機づけられた。ここでは,自然の学習と監督に関する洞察をレビューし,学習は監督なしでは不可能であるという考えを再検討し,単にその名前で呼ぶだけでは,よりよい進歩が期待できると論じる。

The renaissance of artificial neural networks was catalysed by the success of classification models, tagged by the community with the broader term supervised learning. The extraordinary results gave rise to a hype loaded with ambitious promises and overstatements. Soon the community realised that the success owed much to the availability of thousands of labelled examples. And supervised learning went, for many, from glory to shame. Some criticised deep learning as a whole and others proclaimed that the way forward had to be "alternatives" to supervised learning: predictive, unsupervised, semi-supervised and, more recently, self-supervised learning. However, these all seem to be brand names rather than actual categories of a theoretically grounded taxonomy. Moreover, the call to banish supervised learning was motivated by the questionable claim that humans learn with little or no supervision. Here, we review insights about learning and supervision in nature, revisit the notion that learning is not possible without supervision and argue that we will make better progress if we just call it by its name.

翻訳日:2021-05-22 20:48:19 公開日:2020-12-04

# ディープネットワークにおける周辺性能劣化の定量化のための経験的手法

An Empirical Method to Quantify the Peripheral Performance Degradation in Deep Networks ( http://arxiv.org/abs/2012.02749v1 )

ライセンス: Link先を確認

Calden Wloka and John K. Tsotsos

(参考訳) 画像に畳み込みカーネルを適用する場合、出力が入力と同じサイズである場合、画像境界付近で何らかのパディングが要求される。つまり、畳み込みニューラルネットワーク(CNN)における畳み込みの各層に対して、カーネルサイズの半幅に相当する画素のストリップを、非正則表現で生成する。ほとんどのcnnカーネルはネットワークのパラメータ負荷を減らすために小さいが、この非バーティカル領域はそれぞれの畳み込み層を持つ。深層・深層ネットワークとストライドベースのダウンサンプリングを組み合わせる傾向は、この領域の伝播が画像の無視できない部分をカバーすることになることを意味する。この畳み込みに関する問題は長年にわたってよく認識されてきたが、現代のネットワーク行動に対する周辺表現の劣化の影響は十分に定量化されていない。翻訳の不変性の限界は何か? 画像パディングは問題を軽減するか、あるいは物体が画像境界と中心の間を移動するときに性能に影響するか? 実験モデルとしてMask R-CNNを用いて,ネットワーク性能の空間依存性を定量化するデータセットと手法を設計する。我々のデータセットは、高解像度の背景にオブジェクトを挿入することで構築され、画像境界に対してターゲットオブジェクトを特定の位置に配置するサブイメージを収穫することができる。対象位置の選択を通してマスクr-cnnの挙動を調べることにより,画像境界近傍,特に画像コーナー付近における性能低下パターンが明らかになる。ネットワーク性能におけるこの空間異方性の範囲と大きさの定量化は、被写体や関心領域の位置が所定の画像内で十分に局所化されることが保証されない制約のない現実的な環境にディープネットワークを配置する上で重要である。

When applying a convolutional kernel to an image, if the output is to remain the same size as the input, then some form of padding is required around the image boundary. This means that for each layer of convolution in a convolutional neural network (CNN), a strip of pixels equal to the half-width of the kernel is produced with a non-veridical representation. Although most CNN kernels are small to reduce the parameter load of a network, this non-veridical area compounds with each convolutional layer. The tendency toward deeper and deeper networks combined with stride-based down-sampling means that the propagation of this region can end up covering a non-negligible portion of the image. Although this issue with convolutions has been well acknowledged over the years, the impact of this degraded peripheral representation on modern network behavior has not been fully quantified. What are the limits of translation invariance? Does image padding successfully mitigate the issue, or is performance affected as an object moves between the image border and center? Using Mask R-CNN as an experimental model, we design a dataset and methodology to quantify the spatial dependency of network performance. Our dataset is constructed by inserting objects into high resolution backgrounds, thereby allowing us to crop sub-images which place target objects at specific locations relative to the image border. By probing the behaviour of Mask R-CNN across a selection of target locations, we see clear patterns of performance degradation near the image boundary, and in particular in the image corners. Quantifying both the extent and magnitude of this spatial anisotropy in network performance is important for the deployment of deep networks into unconstrained and realistic environments in which the location of objects or regions of interest is not guaranteed to be well localized within a given image.
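
The compounding of the padded strip described above can be estimated with simple arithmetic. The sketch below is a back-of-the-envelope approximation, not the paper's methodology: it assumes odd kernel sizes, "same" padding, and a square image, and it measures the border band in input pixels by scaling each layer's kernel half-width by the product of the preceding strides. All names are illustrative.

```python
def border_affected_width(layers):
    """Approximate width (in input pixels) of the padding-influenced band.

    layers: list of (kernel_size, stride) tuples, ordered from input to output.
    """
    width = 0
    jump = 1  # input pixels covered by one step on the current layer's input grid
    for kernel_size, stride in layers:
        width += (kernel_size - 1) // 2 * jump
        jump *= stride
    return width


def affected_fraction(image_side, width):
    """Fraction of a square image's area lying inside the border band."""
    inner = max(image_side - 2 * width, 0)
    return 1.0 - (inner / image_side) ** 2


# Three 3x3 layers, the first two with stride 2: half-widths 1, 2 and 4 pixels.
band = border_affected_width([(3, 2), (3, 2), (3, 1)])
```

Because the band grows with the cumulative stride, a handful of strided layers already pushes it several pixels into the image, and `affected_fraction` shows how quickly that becomes a large share of a small input.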

翻訳日:2021-05-22 20:48:01 公開日:2020-12-04

# 等価表現の学習

Learning Equivariant Representations ( http://arxiv.org/abs/2012.02771v1 )

ライセンス: Link先を確認

Carlos Esteves

(参考訳) 最先端のディープラーニングシステムは、しばしば大量のデータと計算を必要とする。このため、データの既知の構造や未知の構造を活用することが最重要となる。畳み込みニューラルネットワーク(CNN)はこの原理の成功例であり、その特性はシフト等価性である。フィルタを入力の上にスライドさせることで、入力がシフトすると、応答は同じ量にシフトし、意味コンテンツが絶対画素位置から独立している自然画像の構造を利用する。この性質は、音声、画像、ビデオ認識タスクにおけるCNNの成功に不可欠である。この論文では、回転やスケーリングといった他の種類の変換に同値性を拡張する。対称性の群で定義される異なる変換に対する同変モデルを提案する。 The main contributions are (i) polar transformer networks, achieving equivariance to the group of similarities on the plane, (ii) equivariant multi-view networks, achieving equivariance to the group of symmetries of the icosahedron, (iii) spherical CNNs, achieving equivariance to the continuous 3D rotation group, (iv) cross-domain image embeddings, achieving equivariance to 3D rotations for 2D inputs, and (v) spin-weighted spherical CNNs, generalizing the spherical CNNs and achieving equivariance to 3D rotations for spherical vector fields. 用途としては、画像分類、3次元形状分類と検索、パノラマ画像分類とセグメンテーション、形状アライメント、ポーズ推定などがある。これらのモデルに共通しているのは、データの対称性を活用してサンプルとモデルの複雑さを減らし、一般化のパフォーマンスを向上させることだ。この利点は、データが制限されたり、任意の回転のような入力摂動が存在するような困難なタスクにおいてより重要である。

State-of-the-art deep learning systems often require large amounts of data and computation. For this reason, leveraging known or unknown structure of the data is paramount. Convolutional neural networks (CNNs) are successful examples of this principle, their defining characteristic being the shift-equivariance. By sliding a filter over the input, when the input shifts, the response shifts by the same amount, exploiting the structure of natural images where semantic content is independent of absolute pixel positions. This property is essential to the success of CNNs in audio, image and video recognition tasks. In this thesis, we extend equivariance to other kinds of transformations, such as rotation and scaling. We propose equivariant models for different transformations defined by groups of symmetries. The main contributions are (i) polar transformer networks, achieving equivariance to the group of similarities on the plane, (ii) equivariant multi-view networks, achieving equivariance to the group of symmetries of the icosahedron, (iii) spherical CNNs, achieving equivariance to the continuous 3D rotation group, (iv) cross-domain image embeddings, achieving equivariance to 3D rotations for 2D inputs, and (v) spin-weighted spherical CNNs, generalizing the spherical CNNs and achieving equivariance to 3D rotations for spherical vector fields. Applications include image classification, 3D shape classification and retrieval, panoramic image classification and segmentation, shape alignment and pose estimation. What these models have in common is that they leverage symmetries in the data to reduce sample and model complexity and improve generalization performance. The advantages are more significant on (but not limited to) challenging tasks where data is limited or input perturbations such as arbitrary rotations are present.
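
Shift equivariance, the defining property mentioned above, can be checked numerically on a toy 1D signal. This sketch uses circular (wrap-around) boundaries so that the property holds exactly; that boundary choice and the helper names are assumptions made for illustration, not details from the thesis.

```python
def circular_shift(signal, shift):
    """Shift a list to the right by `shift` positions, wrapping around."""
    n = len(signal)
    return [signal[(i - shift) % n] for i in range(n)]


def circular_correlate(signal, kernel):
    """Slide `kernel` over `signal` with wrap-around boundaries."""
    n = len(signal)
    return [
        sum(kernel[j] * signal[(i + j) % n] for j in range(len(kernel)))
        for i in range(n)
    ]


signal = [1, 2, 3, 4, 5]
kernel = [1, 0, -1]

# Filtering a shifted input gives the same result as shifting the filtered output.
shift_then_filter = circular_correlate(circular_shift(signal, 2), kernel)
filter_then_shift = circular_shift(circular_correlate(signal, kernel), 2)
```

The two results coincide for any shift, which is the sense in which the response "shifts by the same amount" as the input; the models in the thesis extend this idea from shifts to rotations and other groups.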

翻訳日:2021-05-22 20:47:31 公開日:2020-12-04

# 文書レベルの関係抽出のための粗いエンティティ表現

Coarse-to-Fine Entity Representations for Document-level Relation Extraction ( http://arxiv.org/abs/2012.02507v1 )

ライセンス: Link先を確認

Damai Dai, Jing Ren, Shuang Zeng, Baobao Chang, Zhifang Sui

(参考訳) 文書レベルの関係抽出(RE: Document-level Relation extract)は、文内および文間の関係を抽出する必要がある。最近の研究は、通常文書レベルの相互作用をキャプチャする文書レベルのグラフを構築するグラフベースの手法が有用なエンティティ表現を得ることができ、文書レベルのREに取り組むのに役立つことを示している。これらのメソッドは、グラフ全体にフォーカスするか、あるいは対象のエンティティペア間のパスなど、グラフの一部にもっと注意を払うかのどちらかです。しかし、ドキュメントレベルのREは、両方に同時にフォーカスすることの恩恵を受けるかもしれない。そこで,より包括的な実体表現を得るために,二つの相を含む粗大な戦略を取り入れた Coarse-to-Fine Entity Representation model (CFER) を提案する。まず、CFERはグラフニューラルネットワークを使用して、グラフ全体のグローバル情報を粗いレベルで統合する。次に、CFERは、グローバル情報をガイダンスとして使用し、ターゲットエンティティペア間のパス情報を細かなレベルで選択的に集約する。分類において、両階層の実体表現を関係抽出のためのより包括的な表現に結合する。大規模文書レベルのREデータセットによる実験結果から,CFERは従来のベースラインモデルよりも優れた性能を発揮することが示された。さらに,詳細なモデル解析により戦略の有効性を検証する。

Document-level Relation Extraction (RE) requires extracting relations expressed within and across sentences. Recent works show that graph-based methods, which usually construct a document-level graph that captures document-aware interactions, can obtain useful entity representations and thus help tackle document-level RE. These methods either focus more on the entire graph, or pay more attention to a part of the graph, e.g., paths between the target entity pair. However, we find that document-level RE may benefit from focusing on both of them simultaneously. Therefore, to obtain more comprehensive entity representations, we propose the Coarse-to-Fine Entity Representation model (CFER), which adopts a coarse-to-fine strategy involving two phases. First, CFER uses graph neural networks to integrate global information in the entire graph at a coarse level. Next, CFER utilizes the global information as guidance to selectively aggregate path information between the target entity pair at a fine level. In classification, we combine the entity representations from both levels into more comprehensive representations for relation extraction. Experimental results on a large-scale document-level RE dataset show that CFER achieves better performance than previous baseline models. Further, we verify the effectiveness of our strategy through elaborate model analysis.

翻訳日:2021-05-22 20:47:07 公開日:2020-12-04

# オフラインメタレベルモデルに基づくコールドスタート推薦のための強化学習手法

Offline Meta-level Model-based Reinforcement Learning Approach for Cold-Start Recommendation ( http://arxiv.org/abs/2012.02476v1 )

ライセンス: Link先を確認

Yanan Wang, Yong Ge, Li Li, Rui Chen, Tong Xu

(参考訳) 強化学習(Reinforcement Learning, RL)は、リコメンダシステムに対する長期的なユーザの関心を最適化する上で、非常に有望である。しかしながら、既存のrlベースのレコメンデーションメソッドでは、堅牢なレコメンデーションポリシを学ぶために、各ユーザが多数のインタラクションを必要とする。限られた数のインタラクションを持つ新規ユーザに推奨する場合には,この課題がより重要になります。そこで本稿では,高速ユーザ適応のためのメタレベルモデルに基づく強化学習手法を提案することで,rlベースのレコメンダシステムにおけるコールドスタート課題を解決する。提案手法では,ユーザの好みをユーザコンテキスト変数で推測することで,インタラクションの少ない新規ユーザに対して,レコメンデーションシステムによる適応性の向上を実現する。適応効率を向上させるために,メタレベルのレコメンデーションエージェントを支援する逆強化学習手法を用いて,少数のインタラクションからユーザポリシと報酬を回復することを学ぶ。さらに,情報理論的な観点から,ユーザモデルとレコメンデーションエージェントの相互作用関係をモデル化する。実験の結果,1つのインタラクションシーケンスのみで新規ユーザに対応する場合,提案手法の有効性が示された。さらに,推奨性能境界の理論的解析を行う。

Reinforcement learning (RL) has shown great promise in optimizing long-term user interest in recommender systems. However, existing RL-based recommendation methods need a large number of interactions for each user to learn a robust recommendation policy. The challenge becomes more critical when recommending to new users who have a limited number of interactions. To that end, in this paper, we address the cold-start challenge in the RL-based recommender systems by proposing a meta-level model-based reinforcement learning approach for fast user adaptation. In our approach, we learn to infer each user's preference with a user context variable that enables recommendation systems to better adapt to new users with few interactions. To improve adaptation efficiency, we learn to recover the user policy and reward from only a few interactions via an inverse reinforcement learning method to assist a meta-level recommendation agent. Moreover, we model the interaction relationship between the user model and recommendation agent from an information-theoretic perspective. Empirical results show the effectiveness of the proposed method when adapting to new users with only a single interaction sequence. We further provide a theoretical analysis of the recommendation performance bound.

翻訳日:2021-05-22 20:46:43 公開日:2020-12-04

# ヒューマンモビリティのためのディープラーニング:データとモデルに関する調査

Deep Learning for Human Mobility: a Survey on Data and Models ( http://arxiv.org/abs/2012.02825v1 )

ライセンス: Link先を確認

Massimiliano Luca, Gianni Barlacchi, Bruno Lepri, Luca Pappalardo

(参考訳) 人類の移動性に関する研究は、病気の普及、都市計画、幸福、汚染など、社会の様々な側面に影響を及ぼすため、非常に重要である。電話記録、GPSトレース、ソーシャルメディア投稿などのデジタルモビリティデータの拡散は、人工知能の卓越した予測力と相まって、深層学習を人間のモビリティに適用するきっかけとなった。特に、次の位置予測、すなわち個人の将来の位置を予測すること、群衆の流れ予測、すなわち地理的領域のフローを予測すること、軌道生成、すなわち現実的な個人軌道を生成することの3つのタスクに焦点を当てている。既存の調査では、シングルタスク、データソース、メカニカルあるいは従来の機械学習アプローチにフォーカスしているが、ディープラーニングソリューションの包括的な説明は欠落している。 i)モビリティとディープラーニングに関する基本的な概念、(ii)データソースと公開データセットのレビュー、(iii)ディープラーニングモデルの説明、(iv)関連するオープンチャレンジに関する議論。我々の調査は、次の位置予測、群集の流れ予測、軌道生成に対する先進的なディープラーニングソリューションのガイドである。同時に、これは深層学習の科学者や実践者が人間のモビリティ研究の基本的な概念とオープンな課題を理解するのに役立つ。

The study of human mobility is crucial due to its impact on several aspects of our society, such as disease spreading, urban planning, well-being, pollution, and more. The proliferation of digital mobility data, such as phone records, GPS traces, and social media posts, combined with the outstanding predictive power of artificial intelligence, triggered the application of deep learning to human mobility. In particular, the literature is focusing on three tasks: next-location prediction, i.e., predicting an individual's future locations; crowd flow prediction, i.e., forecasting flows on a geographic region; and trajectory generation, i.e., generating realistic individual trajectories. Existing surveys focus on single tasks, data sources, mechanistic or traditional machine learning approaches, while a comprehensive description of deep learning solutions is missing. This survey provides: (i) basic notions on mobility and deep learning; (ii) a review of data sources and public datasets; (iii) a description of deep learning models and (iv) a discussion about relevant open challenges. Our survey is a guide to the leading deep learning solutions to next-location prediction, crowd flow prediction, and trajectory generation. At the same time, it helps deep learning scientists and practitioners understand the fundamental concepts and the open challenges of the study of human mobility.

翻訳日:2021-05-22 20:46:26 公開日:2020-12-04

# 神経常微分方程式の普遍近似特性

Universal Approximation Property of Neural Ordinary Differential Equations ( http://arxiv.org/abs/2012.02414v1 )

ライセンス: Link先を確認

Takeshi Teshima, Koichi Tojo, Masahiro Ikeda, Isao Ishikawa, Kenta Oono

(参考訳) ニューラル常微分方程式 (neural ordinary differential equations, NODEs) は、自由形式のヤコビアンを持ち、扱いやすいヤコビ行列式推定器が利用できる点で有望な可逆ニューラルネットワークアーキテクチャである。最近、NODEの表現力は、ある条件下で連続写像に対する$L^p$-universal approximatorを形成することで部分的に明らかになった。しかし、$L^p$-universalityは、近似器が入力空間の小さな領域で目標関数と大きく異なる場合でも成立しうるため、入力領域全体での近似を保証できない可能性がある。NODEのポテンシャルをさらに明らかにするために、そのより強い近似特性、すなわち大きな微分同相写像のクラスを近似するための$\sup$-universalityを示す。これは微分同相群の構造定理を利用して示され、その結果、NODEがより強い保証で近似できるかなり大きな写像集合を確立することによって、既存の文献を補完する。

Neural ordinary differential equations (NODEs) are an invertible neural network architecture that is promising for its free-form Jacobian and the availability of a tractable Jacobian determinant estimator. Recently, the representation power of NODEs has been partly uncovered: they form an $L^p$-universal approximator for continuous maps under certain conditions. However, $L^p$-universality may fail to guarantee an approximation over the entire input domain, because it may still hold even if the approximator differs substantially from the target function on a small region of the input space. To further uncover the potential of NODEs, we show their stronger approximation property, namely $\sup$-universality for approximating a large class of diffeomorphisms. This is shown by leveraging a structure theorem of the diffeomorphism group, and the result complements the existing literature by establishing a fairly large set of mappings that NODEs can approximate with a stronger guarantee.
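
The gap between $L^p$- and $\sup$-universality mentioned in the abstract can be seen in a one-dimensional toy case. The bump function $g_n$ below is an illustrative assumption and is not taken from the paper: it is a perturbation that vanishes in $L^p$ for every finite $p$ while its largest pointwise error stays at 1.

```latex
% Illustrative perturbation on [-1, 1] (an assumption, not from the paper):
g_n(x) = \max\bigl(0,\; 1 - n\lvert x\rvert\bigr), \qquad x \in [-1, 1],\; n \ge 1.

% Its L^p norm shrinks with n for every finite p >= 1:
\lVert g_n \rVert_{L^p}^p
  = \int_{-1/n}^{1/n} \bigl(1 - n\lvert x\rvert\bigr)^p \, dx
  = \frac{2}{n\,(p+1)} \;\xrightarrow[n \to \infty]{}\; 0.

% Its sup norm does not shrink:
\sup_{x \in [-1,1]} \lvert g_n(x) \rvert = g_n(0) = 1 \quad \text{for all } n.
```

So an approximator of the form $f + g_n$ would converge to a target $f$ in $L^p$ but not uniformly, which is the kind of failure on a small region that a $\sup$-norm guarantee rules out.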

翻訳日:2021-05-22 20:45:56 公開日:2020-12-04

# トポロジーを考慮した3Dポイントクラウド生成のためのChartPointFlow

ChartPointFlow for Topology-Aware 3D Point Cloud Generation ( http://arxiv.org/abs/2012.02346v1 )

ライセンス: Link先を確認

Takumi Kimura, Takashi Matsubara, Kuniaki Uehara

(参考訳) 点雲は三次元形状の表面の表現として機能する。深層生成モデルは、ボールのような潜伏変数の集合からの写像によって、そのバリエーションをモデル化するために適応されている。しかし、以前のアプローチでは点雲の位相構造にはあまり注意が払われておらず、連続写像は様々な数の穴や交点を表現できない。さらに、点雲は複数の部分からなることが多く、表現されることもほとんどない。本稿では,複数の潜在ラベルを持つフローベース生成モデルであるChartPointFlowを提案する。相互情報を最大化することにより、ラベルによって条件付けられた写像は、多様体のチャートのような与えられた点雲の連続部分集合に割り当てられる。これにより、従来のアプローチではぼやけや穴の発生に支障をきたす傾向があるのに対し、提案モデルでは明確な境界を持つトポロジカル構造を保存できる。実験の結果,ChartPointFlowはサンプリングベースポイントクラウドジェネレータ間の生成と再構築において,最先端の性能を実現していることがわかった。

A point cloud serves as a representation of the surface of a three-dimensional shape. Deep generative models have been adapted to model their variations typically by a map from a ball-like set of latent variables. However, previous approaches have not paid much attention to the topological structure of a point cloud; a continuous map cannot express the varying number of holes and intersections. Moreover, a point cloud is often composed of multiple subparts, and it is also hardly expressed. In this paper, we propose ChartPointFlow, which is a flow-based generative model with multiple latent labels. By maximizing the mutual information, a map conditioned by a label is assigned to a continuous subset of a given point cloud, like a chart of a manifold. This enables our proposed model to preserve the topological structure with clear boundaries, while previous approaches tend to suffer from blurs and to fail in generating holes. Experimental results demonstrate that ChartPointFlow achieves the state-of-the-art performance in generation and reconstruction among sampling-based point cloud generators.

翻訳日:2021-05-22 20:45:38 公開日:2020-12-04

# 銀河団リッチネス推定のための光波長誘導型自己教師付き特徴学習

Optical Wavelength Guided Self-Supervised Feature Learning For Galaxy Cluster Richness Estimate ( http://arxiv.org/abs/2012.02368v1 )

ライセンス: Link先を確認

Gongbo Liang, Yuanyuan Su, Sheng-Chieh Lin, Yu Zhang, Yuanyuan Zhang, Nathan Jacobs

(参考訳) 近くの宇宙のほとんどの銀河は、銀河団または銀河群に重力的に結合している。光学的豊かさなどの光学的内容は、現代の天文学や宇宙論における銀河と大規模構造の共同進化を理解する上で重要である。光豊かさの決定は困難である。マルチバンド光画像から光リッチ度を推定するための自己教師型アプローチを提案する。本手法では,マルチバンド光画像のデータ特性を事前学習に利用し,大規模かつ未ラベルのデータセットから特徴表現を学習する。提案手法をSloan Digital Sky Surveyに適用する。その結果、光学的豊かさの推定により、平均絶対誤差が11.84%、内在散乱が20.78%減少し、ラベル付きトレーニングデータの必要性が最大60%低下した。提案手法は,多数の未ラベルのマルチバンド画像が利用可能であるが,画像ラベルの取得にはコストがかかる天文学や宇宙論に有用であると考えている。

Most galaxies in the nearby Universe are gravitationally bound to a cluster or group of galaxies. Their optical contents, such as optical richness, are crucial for understanding the co-evolution of galaxies and large-scale structures in modern astronomy and cosmology. The determination of optical richness can be challenging. We propose a self-supervised approach for estimating optical richness from multi-band optical images. The method uses the data properties of the multi-band optical images for pre-training, which enables learning feature representations from a large but unlabeled dataset. We apply the proposed method to the Sloan Digital Sky Survey. The result shows our estimate of optical richness lowers the mean absolute error and intrinsic scatter by 11.84% and 20.78%, respectively, while reducing the need for labeled training data by up to 60%. We believe the proposed method will benefit astronomy and cosmology, where a large number of unlabeled multi-band images are available, but acquiring image labels is costly.

翻訳日:2021-05-22 20:45:21 公開日:2020-12-04

# 自動運転車のための32種類の歩行者属性の検出

Detecting 32 Pedestrian Attributes for Autonomous Vehicles ( http://arxiv.org/abs/2012.02647v1 )

ライセンス: Link先を確認

Taylor Mordan, Matthieu Cord, Patrick Pérez and Alexandre Alahi

(参考訳) 歩行者は、都市部における自動運転車にとって、安全性の面で最も重要な道路利用者の1つである。本稿では、歩行者の検出と32個の歩行者属性の認識を同時に行う問題に取り組む。これらの属性は視覚的な外観や行動を含み、主要な安全上の懸念である道路横断の予測も含む。そこで本稿では、複合フィールドフレームワークを利用したマルチタスク学習(MTL)モデルを導入し、両方の目標を効率的に達成する。各フィールドは、歩行者のインスタンスを空間的に特定し、その上で属性予測を集約する。この定式化は自然に空間的文脈を活用するため、自動運転のような低解像度シナリオに適している。共同で学習する属性の数を増やすことで、多数のタスクを伴うMTLで発生する勾配のスケールに関する問題を明らかにする。我々は、逆伝播時にネットワークアーキテクチャの分岐点(fork)で合流する、異なる目的関数からの勾配を正規化することでこの問題を解決し、これをフォーク正規化(fork-normalization)と呼ぶ。実験による検証は、自動運転車からの歩行者分析のための多数の属性を提供するデータセットであるJAADで行い、競争力のある検出・属性認識結果と、より安定したMTLトレーニングを示す。

Pedestrians are arguably one of the most safety-critical road users to consider for autonomous vehicles in urban areas. In this paper, we address the problem of jointly detecting pedestrians and recognizing 32 pedestrian attributes. These encompass visual appearance and behavior, and also include the forecasting of road crossing, which is a main safety concern. For this, we introduce a Multi-Task Learning (MTL) model relying on a composite field framework, which achieves both goals in an efficient way. Each field spatially locates pedestrian instances and aggregates attribute predictions over them. This formulation naturally leverages spatial context, making it well suited to low resolution scenarios such as autonomous driving. By increasing the number of attributes jointly learned, we highlight an issue related to the scales of gradients, which arises in MTL with numerous tasks. We solve it by normalizing the gradients coming from different objective functions when they join at the fork in the network architecture during the backward pass, referred to as fork-normalization. Experimental validation is performed on JAAD, a dataset providing numerous attributes for pedestrian analysis from autonomous vehicles, and shows competitive detection and attribute recognition results, as well as a more stable MTL training.
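
The gradient-scale issue described above can be illustrated with a small sketch. The abstract does not give the exact normalization rule, so the version below is an assumption: each task gradient arriving at the shared fork is rescaled to unit L2 norm and the results are averaged. The function names and the two toy gradients are hypothetical.

```python
import math


def l2_norm(vec):
    """Euclidean norm of a flat list of floats."""
    return math.sqrt(sum(x * x for x in vec))


def fork_normalize(branch_grads, eps=1e-12):
    """Combine per-task gradients that meet at a shared fork.

    Assumed rule (not the paper's exact formula): rescale every branch
    gradient to unit L2 norm, then average, so that no single task
    dominates the update of the shared layers.
    """
    if not branch_grads:
        raise ValueError("need at least one branch gradient")
    dim = len(branch_grads[0])
    combined = [0.0] * dim
    for grad in branch_grads:
        scale = 1.0 / (l2_norm(grad) + eps)
        for i, g in enumerate(grad):
            combined[i] += scale * g
    return [c / len(branch_grads) for c in combined]


# Two hypothetical task gradients whose magnitudes differ by a factor of 1000.
detection_grad = [3.0, 4.0]      # L2 norm 5
attribute_grad = [0.0, 0.005]    # L2 norm 0.005
merged = fork_normalize([detection_grad, attribute_grad])
```

Without the rescaling, a plain sum would be almost identical to `detection_grad`; with it, both tasks contribute a direction of equal weight.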

翻訳日:2021-05-22 20:45:06 公開日:2020-12-04

# 原型補正条件付ランダムフィールドを用いたFew-Shotイベント検出

Few-Shot Event Detection with Prototypical Amortized Conditional Random Field ( http://arxiv.org/abs/2012.02353v1 )

ライセンス: Link先を確認

Xin Cong, Shiyao Cui, Bowen Yu, Tingwen Liu, Yubin Wang, Bin Wang

(参考訳) 情報抽出の基本的なタスクであるイベント検出は、少数のサンプルで新しいイベントタイプを認識する必要がある場合、すなわちFew-Shot Event Detection (FSED) において苦戦する傾向がある。従来の「識別してから分類する」(identify-then-classify) パラダイムは、パイプライン方式でこの問題を解決しようとするが、イベントタイプ間のトリガの相違を無視するため、エラー伝播に悩まされる。本稿では、二部構成のタグ付け方式によってこのタスクをfew-shotタグ付け問題に変換する、新しい統一ジョイントモデルを提案する。この目的のために、まずfew-shotシナリオにおけるラベル依存性をモデル化するPrototypical Amortized Conditional Random Field (PA-CRF) を設計する。PA-CRFは、ラベルのプロトタイプに基づいてラベル間の遷移スコアを近似するprototypical amortization networkを構築する。次に、PA-CRFにおける遷移スコアのモデル化にガウス分布を導入し、データ不足に起因する不確実な推定を緩和する。ベンチマークデータセットFewEventで実験を行い、実験結果から、タグ付けに基づく手法は既存のパイプライン手法やジョイントラーニング手法よりも優れていることが示された。さらに、提案したPA-CRFは、この公開データセット上で最高の結果を達成する。

Event Detection, a fundamental task of Information Extraction, tends to struggle when it needs to recognize novel event types with a few samples, i.e. Few-Shot Event Detection (FSED). Previous identify-then-classify paradigm attempts to solve this problem in the pipeline manner but ignores the trigger discrepancy between event types, thus suffering from the error propagation. In this paper, we present a novel unified joint model which converts the task to a few-shot tagging problem with a double-part tagging scheme. To this end, we first design the Prototypical Amortized Conditional Random Field (PA-CRF) to model the label dependency in the few-shot scenario, which builds prototypical amortization networks to approximate the transition scores between labels based on the label prototypes. Then Gaussian distribution is introduced for the modeling of the transition scores in PA-CRF to alleviate the uncertain estimation resulting from insufficient data. We conduct experiments on the benchmark dataset FewEvent and the experimental results show that the tagging based methods are better than existing pipeline and joint learning methods. In addition, the proposed PA-CRF achieves the best results on the public dataset.

翻訳日:2021-05-22 20:44:49 公開日:2020-12-04

# DDRel: dyadic対話における対人関係分類のための新しいデータセット

DDRel: A New Dataset for Interpersonal Relation Classification in Dyadic Dialogues ( http://arxiv.org/abs/2012.02553v1 )

ライセンス: Link先を確認

Qi Jia, Hongru Huang, Kenny Q. Zhu

(参考訳) 対話における対人的な言語スタイルの切り替えは、人間の興味深く、ほとんど本能的な能力である。言語内容から対人関係を理解することは、対話をさらに理解するための重要なステップでもある。先行研究は主にテキスト中の名前付きエンティティ間の関係抽出に焦点を当てている。本稿では、対話に基づいて対話者間の関係を分類するタスクを提案する。我々はIMSDbから映画スクリプトをクロールし、13の事前定義された関係に従って各セッションに関係ラベルを注釈付けした。注釈付きデータセットDDRelは、694組の話者による6300のdyadic対話セッションで構成され、発話数は合計53,126である。また、広く受け入れられているベースラインを用いて、セッションレベルおよびペアレベルの関係分類タスクを構築する。実験結果から、本タスクは既存モデルにとって困難であり、データセットが将来の研究に有用であることが示唆された。

Interpersonal language style shifting in dialogues is an interesting and almost instinctive human ability. Understanding interpersonal relationships from language content is also a crucial step toward further understanding dialogues. Previous work mainly focuses on relation extraction between named entities in texts. In this paper, we propose the task of relation classification of interlocutors based on their dialogues. We crawled movie scripts from IMSDb and annotated the relation labels for each session according to 13 pre-defined relationships. The annotated dataset DDRel consists of 6300 dyadic dialogue sessions between 694 pairs of speakers with 53,126 utterances in total. We also construct session-level and pair-level relation classification tasks with widely accepted baselines. The experimental results show that this task is challenging for existing models and the dataset will be useful for future research.

翻訳日:2021-05-22 20:44:11 公開日:2020-12-04

# 女性・移民に対するサイバーいじめの自動検出とクロスドメイン適応性

Automated Detection of Cyberbullying Against Women and Immigrants and Cross-domain Adaptability ( http://arxiv.org/abs/2012.02565v1 )

ライセンス: Link先を確認

Thushari Atapattu, Mahen Herath, Georgia Zhang, Katrina Falkner

(参考訳) ソーシャルメディア技術の利用が急増しているため、サイバーいじめは広く蔓延し、拡大しつつある社会問題となっている。少数派、女性、青年はサイバーいじめの一般的な被害者である。NLP技術の進歩にもかかわらず、サイバーいじめの自動検出は依然として困難である。本稿では、最先端のNLP技術を用いてこの技術を前進させることに焦点を当てる。女性や移民に対するヘイトスピーチを対象としたSemEval 2019 Task 5 (HatEval) のTwitterデータセットを使用する。DistilBERTに基づく最高性能のアンサンブルモデルは、ヘイトスピーチの分類(タスクA)と攻撃性および標的の分類(タスクB)において、それぞれ0.73と0.74のF1スコアを達成した。タスクA用に開発したアンサンブルモデルを外部データセットにおける攻撃的言語の分類に適応させ、3つのベンチマークデータセットで約0.7のF1スコアを達成し、クロスドメイン適応性について有望な結果を得た。我々は、将来のサイバーいじめ研究に向けた洞察に富む提言を行うために、誤分類されたツイートの質的分析を行う。

Cyberbullying is a prevalent and growing social problem due to the surge of social media technology usage. Minorities, women, and adolescents are among the common victims of cyberbullying. Despite the advancement of NLP technologies, the automated cyberbullying detection remains challenging. This paper focuses on advancing the technology using state-of-the-art NLP techniques. We use a Twitter dataset from SemEval 2019 - Task 5(HatEval) on hate speech against women and immigrants. Our best performing ensemble model based on DistilBERT has achieved 0.73 and 0.74 of F1 score in the task of classifying hate speech (Task A) and aggressiveness and target (Task B) respectively. We adapt the ensemble model developed for Task A to classify offensive language in external datasets and achieved ~0.7 of F1 score using three benchmark datasets, enabling promising results for cross-domain adaptability. We conduct a qualitative analysis of misclassified tweets to provide insightful recommendations for future cyberbullying research.
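
For readers unfamiliar with the metric quoted above, here is a minimal sketch of how an F1 score is computed from predictions. The abstract does not say whether the reported numbers are per-class or macro-averaged, so both are shown; the label lists are invented toy data and do not come from the HatEval dataset.

```python
def f1_for_label(y_true, y_pred, label):
    """F1 score when `label` is treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    if tp == 0:
        return 0.0  # precision and recall are zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-label F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_label(y_true, y_pred, lab) for lab in labels) / len(labels)


# Toy data: 1 = hateful, 0 = not hateful (illustrative only).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
positive_f1 = f1_for_label(y_true, y_pred, 1)
overall_macro_f1 = macro_f1(y_true, y_pred)
```

Macro averaging weights the minority class as heavily as the majority class, which matters on imbalanced hate-speech data.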

翻訳日:2021-05-22 20:44:01 公開日:2020-12-04

# Ve'rdd 紙辞書と低リソースNLPとコミュニティ関与のギャップを狭める

Ve'rdd. Narrowing the Gap between Paper Dictionaries, Low-Resource NLP and Community Involvement ( http://arxiv.org/abs/2012.02578v1 )

ライセンス: Link先を確認

Khalid Alnajjar, Mika Hämäläinen, Jack Rueter, Niko Partanen

(参考訳) 本稿では,複数のアマチュア編集者に公開されている草の根辞書の再評価と編集の機会を提供する,オープンソースのオンライン辞書編集システムve'rddを提案する。コミュニティの活動は、深刻な絶滅危惧言語であるSkolt Samiの、最先端の有限状態言語記述に組み込むことが目的である。問題は、コミュニティが鉛筆と紙のレベル以上のものに参加することにある。時々、ネイティブスピーカーと辞書指向は、将来自分たちの仕事をより意味のあるものにするであろうインフラを利用するための技術的な理解を欠いているようです。すべての入力を複数回再利用する。そこで本システムは,ユーザフレンドリなUIを支える技術的複雑さを隠蔽するUralic言語のための既存のツールやインフラと統合する。

We present an open-source online dictionary editing system, Ve'rdd, that offers a chance to re-evaluate and edit grassroots dictionaries that have been exposed to multiple amateur editors. The idea is to incorporate community activities into a state-of-the-art finite-state language description of a seriously endangered minority language, Skolt Sami. Problems involve getting the community to take part in things above the pencil-and-paper level. At times, it seems that the native speakers and the dictionary oriented are lacking technical understanding to utilize the infrastructures which might make their work more meaningful in the future, i.e. multiple reuse of all of their input. Therefore, our system integrates with the existing tools and infrastructures for Uralic language masking the technical complexities behind a user-friendly UI.

翻訳日:2021-05-22 20:43:43 公開日:2020-12-04

# SMSデータセットのオンデバイス文類似性

On-Device Sentence Similarity for SMS Dataset ( http://arxiv.org/abs/2012.02819v1 )

ライセンス: Link先を確認

Arun D Prabhu, Nikhil Arora, Shubham Vatsal, Gopi Ramena, Sukumar Moharana, Naresh Purre

(参考訳) 短いメッセージサービス(SMS)テキスト/文間の文の類似性を決定することは、モバイルデバイス産業において重要な役割を果たす。したがって、SMSデータの類似性を評価するためには、検索やナビゲーションの強化、カスタムラベルやタグが送信者に関係なく提供される場合に、同様のタイプのSMSをまとめることなど、さまざまなアプリケーションで必要となる。 SMSデータで直面する問題は、その不完全構造と文法上の矛盾である。本稿では,SMSテキスト間のテキスト類似性を評価するためのユニークなパイプラインを提案する。 SMSテキストに埋め込まれた部分構造を利用してキーワード抽出に音声の一部(POS)モデルを用い,統計的手法を用いて類似度の比較を行った。提案したパイプラインは、SMSデータ間のセマンティックな大きなバリエーションを扱い、デバイス上でのアプリケーション(携帯電話)に有効である。我々の作業の能力を示すため、我々のパイプラインは、以下のセクションの1つで議論されているSMSテキスト類似性の可能性の1つに傾倒して設計されていますが、それでも他のアプリケーションにもスケーラビリティが保証されています。

Determining the sentence similarity between Short Message Service (SMS) texts/sentences plays a significant role in the mobile device industry. Gauging the similarity between SMS data is thus necessary for various applications, such as enhanced searching and navigation, or grouping together SMS messages of a similar type, irrespective of their sender, when a custom label or tag is provided by the user. The problem faced with SMS data is its incomplete structure and grammatical inconsistencies. In this paper, we propose a unique pipeline for evaluating the text similarity between SMS texts. We use a Part of Speech (POS) model for keyword extraction by taking advantage of the partial structure embedded in SMS texts, and similarity comparisons are carried out using statistical methods. The proposed pipeline deals with major semantic variations across SMS data and is effective for on-device (mobile phone) application. To showcase the capabilities of our work, our pipeline has been designed with an inclination towards one of the possible applications of SMS text similarity discussed in one of the following sections but nonetheless guarantees scalability for other applications as well.
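
The keyword-then-compare idea in the abstract can be sketched in a few lines. This is a simplified stand-in and not the authors' pipeline: a small stopword filter replaces the POS model, and Jaccard overlap is assumed as the statistical similarity measure, since the abstract names neither. The stopword list and the sample messages are hypothetical.

```python
import re

# Tiny stopword list standing in for a real POS-based keyword filter (assumption).
STOPWORDS = frozenset({
    "a", "an", "the", "is", "are", "your", "you", "to", "of", "for",
    "on", "in", "and", "has", "been", "with", "at", "by", "will", "be",
})


def extract_keywords(text):
    """Lowercase the text, split into alphanumeric tokens, drop stopwords."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return {tok for tok in tokens if tok not in STOPWORDS}


def jaccard_similarity(text_a, text_b):
    """Size of the keyword overlap divided by the size of the keyword union."""
    keys_a, keys_b = extract_keywords(text_a), extract_keywords(text_b)
    if not keys_a and not keys_b:
        return 1.0  # two empty keyword sets are treated as identical
    return len(keys_a & keys_b) / len(keys_a | keys_b)


sms_a = "Your OTP for login is 482913"
sms_b = "OTP for your bank login is 771204"
sms_c = "Flat 50% off on pizza today"

otp_pair_score = jaccard_similarity(sms_a, sms_b)
unrelated_score = jaccard_similarity(sms_a, sms_c)
```

Set-based measures like this need no model weights, which is one reason they are attractive on a phone; a real system would still want the POS step to cope with noisy SMS spelling.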

翻訳日:2021-05-22 20:43:32 公開日:2020-12-04

# 創発的コミュニケーションにおける誘導バイアスと言語表現性

Inductive Bias and Language Expressivity in Emergent Communication ( http://arxiv.org/abs/2012.02875v1 )

ライセンス: Link先を確認

Shangmin Guo, Yi Ren, Agnieszka Słowik, Kory Mathewson

(参考訳) レファレンシャルゲームとレコンストラクションゲームは、創発言語を研究するための最も一般的なゲームタイプである。言語ゲームの種類が創発言語にどのように影響するかを、i) 言語の構成性、および ii) 創発言語をその起源とは異なるタスクへ転移できるかどうか（これを言語の表現性と呼ぶ）の観点から検討する。手作りのシンボリックデータセットを用いた実証実験により、異なるゲームから出現する言語は構成性が異なり、さらに表現性も異なることを示す。

Referential games and reconstruction games are the most common game types for studying emergent languages. We investigate how the type of the language game affects the emergent language in terms of: i) language compositionality and ii) transfer of an emergent language to a task different from its origin, which we refer to as language expressivity. With empirical experiments on a handcrafted symbolic dataset, we show that languages emerged from different games have different compositionality and further different expressivity.

翻訳日:2021-05-22 20:43:14 公開日:2020-12-04

# CIT-GAN: 循環型画像変換敵対的生成ネットワークと虹彩提示攻撃検出への応用

CIT-GAN: Cyclic Image Translation Generative Adversarial Network With Application in Iris Presentation Attack Detection ( http://arxiv.org/abs/2012.02374v1 )

ライセンス: Link先を確認

Shivangi Yadav and Arun Ross

(参考訳) 本研究では,マルチドメイン・スタイル・トランスファーのためのCIT-GAN(Cyclic Image Translation Generative Adversarial Network)を提案する。そこで本研究では,トレーニングデータセットで表現される各ドメインのスタイル特性を学習する能力を有するスタイリングネットワークを提案する。スタイリングネットワークは、ジェネレータがソースドメインから参照ドメインへの画像の変換を駆動し、参照ドメインのスタイル特性を持つ合成画像を生成するのを支援する。各ドメインの学習スタイルの特徴は、スタイル損失とドメイン分類損失の両方に依存する。これにより、各ドメイン内のスタイル特性のばらつきが引き起こされる。提案したCIT-GANは、アイリス提示攻撃検出(PAD)の文脈において、トレーニングセットに表現されていないクラスに対する合成プレゼンテーション攻撃(PA)サンプルを生成するために使用される。現在最先端のアイリスPAD法による評価は、PAD法をトレーニングするために合成されたPAサンプルを使用することの有効性を示す。さらに、Frechet Inception Distance(FID)スコアを用いて合成した試料の品質を評価する。提案手法により生成された合成画像の品質は,StarGanを含む他の競合する手法よりも優れていることを示す。

In this work, we propose a novel Cyclic Image Translation Generative Adversarial Network (CIT-GAN) for multi-domain style transfer. To facilitate this, we introduce a Styling Network that has the capability to learn style characteristics of each domain represented in the training dataset. The Styling Network helps the generator to drive the translation of images from a source domain to a reference domain and generate synthetic images with style characteristics of the reference domain. The learned style characteristics for each domain depend on both the style loss and domain classification loss. This induces variability in style characteristics within each domain. The proposed CIT-GAN is used in the context of iris presentation attack detection (PAD) to generate synthetic presentation attack (PA) samples for classes that are under-represented in the training set. Evaluation using current state-of-the-art iris PAD methods demonstrates the efficacy of using such synthetically generated PA samples for training PAD methods. Further, the quality of the synthetically generated samples is evaluated using Frechet Inception Distance (FID) score. Results show that the quality of synthetic images generated by the proposed method is superior to that of other competing methods, including StarGan.

翻訳日:2021-05-22 20:42:49 公開日:2020-12-04

# DNNに対する実践的ノンボックス攻撃

Practical No-box Adversarial Attacks against DNNs ( http://arxiv.org/abs/2012.02525v1 )

ライセンス: Link先を確認

Qizhang Li, Yiwen Guo, Hao Chen

(参考訳) ディープニューラルネットワーク(DNN)の敵対的脆弱性の研究は急速に進んでいる。既存の攻撃は、内部アクセス(被害者モデルのアーキテクチャ、パラメータ、またはトレーニングセットへのアクセス)または外部アクセス(モデルへの問い合わせ)のいずれかを必要とする。しかし、多くのシナリオではどちらのアクセスも不可能または高コストになりうる。我々は、攻撃者がモデル情報やトレーニングセットにアクセスすることも、モデルに問い合わせることもできないノーボックス(no-box)の敵対的事例を調査する。その代わり、攻撃者は被害者モデルと同じ問題領域から少数のサンプルしか収集できない。このようなより強い脅威モデルは、敵対的攻撃の適用範囲を大きく広げる。非常に小さなデータセット(数十例程度)でトレーニングを行うための3つのメカニズムを提案し、原型的再構成(prototypical reconstruction)が最も効果的であることを示す。実験の結果、原型的自己符号化モデル上で作成した敵対的事例は、さまざまな画像分類モデルや顔認証モデルによく転移することが示された。clarifai.comの商用セレブ認識システムにおいて、本手法はシステムの平均予測精度をわずか15.40%にまで大きく低下させ、これは事前学習されたArcfaceモデルから敵対的事例を転移させる攻撃と同等である。

The study of adversarial vulnerabilities of deep neural networks (DNNs) has progressed rapidly. Existing attacks require either internal access (to the architecture, parameters, or training set of the victim model) or external access (to query the model). However, both kinds of access may be infeasible or expensive in many scenarios. We investigate no-box adversarial examples, where the attacker can neither access the model information or the training set nor query the model. Instead, the attacker can only gather a small number of examples from the same problem domain as that of the victim model. Such a stronger threat model greatly expands the applicability of adversarial attacks. We propose three mechanisms for training with a very small dataset (on the order of tens of examples) and find that prototypical reconstruction is the most effective. Our experiments show that adversarial examples crafted on prototypical auto-encoding models transfer well to a variety of image classification and face verification models. On a commercial celebrity recognition system held by clarifai.com, our approach significantly diminishes the average prediction accuracy of the system to only 15.40%, which is on par with the attack that transfers adversarial examples from a pre-trained Arcface model.

翻訳日:2021-05-22 20:42:02 公開日:2020-12-04

# F2Net: 教師なしビデオオブジェクトセグメンテーションのための前景にフォーカスする学習

F2Net: Learning to Focus on the Foreground for Unsupervised Video Object Segmentation ( http://arxiv.org/abs/2012.02534v1 )

ライセンス: Link先を確認

Daizong Liu, Dongdong Yu, Changhu Wang, Pan Zhou

(参考訳) ディープラーニングベースの手法は教師なしビデオオブジェクトセグメンテーションにおいて大きな進歩を遂げているが、難しいシナリオ(視覚的類似性、オクルージョン、外観の変化など)はまだうまく処理されていない。これらの問題を緩和するために、前景オブジェクトのフレーム内・フレーム間の詳細を掘り下げることでセグメンテーション性能を効果的に向上させる、新しいFocus on Foreground Network (F2Net) を提案する。具体的には、提案するネットワークは、Siamese Encoder Module、Center Guiding Appearance Diffusion Module、Dynamic Information Fusion Moduleの3つの主要部分から構成される。まず、シャムエンコーダを用いて、ペアフレーム(参照フレームと現在のフレーム)の特徴表現を抽出する。次に、フレーム間特徴(参照フレームと現在のフレームの間の密な対応)、フレーム内特徴(現在のフレーム内の密な対応)、および現在のフレームの元の意味的特徴を捉えるCenter Guiding Appearance Diffusion Moduleを設計する。具体的には、Center Prediction Branchを設けて現在のフレームにおける前景オブジェクトの中心位置を予測し、その中心点情報を空間的ガイダンスの事前情報として利用することで、フレーム間およびフレーム内の特徴抽出を強化し、特徴表現が前景オブジェクトに強く焦点を当てるようにする。最後に、上記の3つの異なるレベルの特徴から、相対的に重要な特徴を自動的に選択するDynamic Information Fusion Moduleを提案する。DAVIS2016、Youtube-object、FBMSデータセットでの広範な実験により、提案したF2Netが大幅な改善を伴って最先端の性能を達成することが示された。

Although deep learning based methods have achieved great progress in unsupervised video object segmentation, difficult scenarios (e.g., visual similarity, occlusions, and appearance changing) are still not well-handled. To alleviate these issues, we propose a novel Focus on Foreground Network (F2Net), which delves into the intra-inter frame details for the foreground objects and thus effectively improve the segmentation performance. Specifically, our proposed network consists of three main parts: Siamese Encoder Module, Center Guiding Appearance Diffusion Module, and Dynamic Information Fusion Module. Firstly, we take a siamese encoder to extract the feature representations of paired frames (reference frame and current frame). Then, a Center Guiding Appearance Diffusion Module is designed to capture the inter-frame feature (dense correspondences between reference frame and current frame), intra-frame feature (dense correspondences in current frame), and original semantic feature of current frame. Specifically, we establish a Center Prediction Branch to predict the center location of the foreground object in current frame and leverage the center point information as spatial guidance prior to enhance the inter-frame and intra-frame feature extraction, and thus the feature representation considerably focus on the foreground objects. Finally, we propose a Dynamic Information Fusion Module to automatically select relatively important features through three aforementioned different level features. Extensive experiments on DAVIS2016, Youtube-object, and FBMS datasets show that our proposed F2Net achieves the state-of-the-art performance with significant improvement.

翻訳日:2021-05-22 20:41:45 公開日:2020-12-04

# ラベル付き行数少ない歴史文書におけるオフライン手書き文字認識の促進

Boosting offline handwritten text recognition in historical documents with few labeled lines ( http://arxiv.org/abs/2012.02544v1 )

ライセンス: Link先を確認

José Carlos Aradillas, Juan José Murillo-Fuentes, Pablo M. Olmos

(参考訳) 本稿では、ラベル付きサンプルがほとんど利用できず、かつ訓練セットの一部に誤りが含まれている場合の、歴史文書におけるオフライン手書き文字認識(HTR)の問題に取り組む。主な貢献は3つある。第1に、大規模データベースからより小さな歴史文書データベースへの転移学習(TL)の実施方法を分析し、モデルのどの層に微調整が必要かを分析する。第2に、TLとデータ拡張(DA)を効率的に組み合わせる手法を分析する。最後に、訓練セットにおける誤ったラベルの影響を軽減するアルゴリズムを提案する。これらの手法は、ICFHR 2018コンペティションデータベース、WashingtonおよびParzivalで分析される。これらすべての手法を組み合わせることで、わずかな複雑性のオーバーヘッドで、テストセットにおけるCERの顕著な削減(場合によっては最大6%)を実証する。

In this paper, we face the problem of offline handwritten text recognition (HTR) in historical documents when few labeled samples are available and some of them contain errors in the train set. Three main contributions are developed. First we analyze how to perform transfer learning (TL) from a massive database to a smaller historical database, analyzing which layers of the model need a fine-tuning process. Second, we analyze methods to efficiently combine TL and data augmentation (DA). Finally, an algorithm to mitigate the effects of incorrect labelings in the training set is proposed. The methods are analyzed over the ICFHR 2018 competition database, Washington and Parzival. Combining all these techniques, we demonstrate a remarkable reduction of CER (up to 6% in some cases) in the test set with little complexity overhead.
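
The CER (character error rate) quoted above is conventionally the character-level edit distance between the recognized text and the reference, divided by the reference length. The sketch below follows that common definition; whether the paper normalizes in exactly this way is an assumption, and the sample strings are invented.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution_cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,                      # delete ref_char
                current[j - 1] + 1,                   # insert hyp_char
                previous[j - 1] + substitution_cost,  # match or substitute
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference, hypothesis):
    """Edit distance normalized by the number of reference characters."""
    if not reference:
        raise ValueError("reference transcription must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Hypothetical line transcription with one missing character.
cer = character_error_rate("handwritten", "handwriten")
```

Because the denominator is the reference length, CER can exceed 1.0 when the recognizer outputs far more characters than the ground truth contains.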

翻訳日:2021-05-22 20:41:14 公開日:2020-12-04

# 交通予報におけるU-Netの実践に向けて

Towards Good Practices of U-Net for Traffic Forecasting ( http://arxiv.org/abs/2012.02598v1 )

ライセンス: Link先を確認

Jingwei Xu, Jianjin Zhang, Zhiyu Yao, Yunbo Wang

(参考訳) この技術レポートは、2020 Traffic4Cast Challengeの解決策を提示する。我々はトラフィック予測問題を、比較的弱い時間的依存性（確率的な都市交通力学によるものと考えられる）と強い事前知識、すなわち都市のロードマップを持つ将来フレーム予測タスクとみなす。これらの理由から、バックボーンモデルとしてU-Netを用い、予測される交通流をより合理的にするためのロードマップ生成手法を提案する。また、検証セットに基づく微調整戦略を用いて過剰適合を防ぎ、予測結果を効果的に改善する。本報告の最後には、検討済みまたは今後の研究で探索しうるアプローチとして、(1) 季節性などデータ固有のパターンを活用すること、(2) 異なる都市間で共通知識を蒸留・転移すること、についてさらに議論する。また、評価指標の妥当性も分析する。

This technical report presents a solution for the 2020 Traffic4Cast Challenge. We consider the traffic forecasting problem as a future frame prediction task with relatively weak temporal dependencies (which might be due to stochastic urban traffic dynamics) and strong prior knowledge, i.e., the roadmaps of the cities. For these reasons, we use the U-Net as the backbone model, and we propose a roadmap generation method to make the predicted traffic flows more rational. Meanwhile, we use a fine-tuning strategy based on the validation set to prevent overfitting, which effectively improves the prediction results. At the end of this report, we further discuss several approaches that we have considered or could be explored in future work: (1) harnessing inherent data patterns, such as seasonality; (2) distilling and transferring common knowledge between different cities. We also analyze the validity of the evaluation metric.

翻訳日:2021-05-22 20:41:01 公開日:2020-12-04

# 識別的半教師付きドメイン適応のための効果的なラベル伝播

Effective Label Propagation for Discriminative Semi-Supervised Domain Adaptation ( http://arxiv.org/abs/2012.02621v1 )

ライセンス: Link先を確認

Zhiyong Huang, Kekai Sheng, Weiming Dong, Xing Mei, Chongyang Ma, Feiyue Huang, Dengwen Zhou, Changsheng Xu

(参考訳) 半教師付きドメイン適応(SSDA)法は,大規模なラベル付きデータがソースドメインで利用可能であるが,ターゲットドメインではほとんどラベル付きサンプルが提供されない大規模画像分類タスクにおいて,大きな可能性を示している。既存のソリューションは通常、2つのドメイン間の機能アライメントに重点を置いているが、ターゲットドメインで学習された表現の識別能力にはほとんど注意を払っていない。本稿では,ドメイン間の効果的な情報伝達とドメイン内セマンティック情報伝達によってこの問題に対処する,新しい効果的なラベル伝搬法を提案する。ドメイン間伝播のために,2つのドメイン間の意味情報の一貫性を促進するために,新しいサイクル不一致損失を提案する。ドメイン内伝搬のために,擬似ラベル付き対象領域データのノイズを緩和し,対象領域の特徴識別性を向上する効果的な自己学習戦略を提案する。汎用的な手法として,様々な領域適応アプローチに容易に適用でき,対象領域における特徴識別を容易にすることができる。 Office-HomeとDomainNetベンチマークの実験では、ELPは主流のSSDAメソッドの分類精度を2%～3%改善している。さらに、ELPは、VisDA-2017ベンチマークでのUDA実験に基づいて、UDAメソッドのパフォーマンスも改善した(81.5%対86.1%)。ソースコードと事前トレーニングされたモデルは間もなくリリースされる予定です。

Semi-supervised domain adaptation (SSDA) methods have demonstrated great potential in large-scale image classification tasks when massive labeled data are available in the source domain but very few labeled samples are provided in the target domain. Existing solutions usually focus on feature alignment between the two domains while paying little attention to the discrimination capability of learned representations in the target domain. In this paper, we present a novel and effective method, namely Effective Label Propagation (ELP), to tackle this problem by using effective inter-domain and intra-domain semantic information propagation. For inter-domain propagation, we propose a new cycle discrepancy loss to encourage consistency of semantic information between the two domains. For intra-domain propagation, we propose an effective self-training strategy to mitigate the noises in pseudo-labeled target domain data and improve the feature discriminability in the target domain. As a general method, our ELP can be easily applied to various domain adaptation approaches and can facilitate their feature discrimination in the target domain. Experiments on Office-Home and DomainNet benchmarks show ELP consistently improves the classification accuracy of mainstream SSDA methods by 2%~3%. Additionally, ELP also improves the performance of UDA methods as well (81.5% vs 86.1%), based on UDA experiments on the VisDA-2017 benchmark. Our source code and pre-trained models will be released soon.
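
Self-training on pseudo-labeled target data, as mentioned in the abstract, usually begins by keeping only confident predictions. The sketch below shows that generic filtering step with a fixed threshold. It is an assumption for illustration: the abstract does not describe how ELP actually mitigates pseudo-label noise, and the threshold value and probability rows are hypothetical.

```python
def select_pseudo_labels(probabilities, threshold=0.9):
    """Return (sample_index, predicted_class) for confident predictions.

    `probabilities` is a list of per-sample class-probability lists.
    Samples whose top probability falls below `threshold` are skipped,
    which is a common (assumed) way to limit pseudo-label noise.
    """
    selected = []
    for index, class_probs in enumerate(probabilities):
        confidence = max(class_probs)
        if confidence >= threshold:
            selected.append((index, class_probs.index(confidence)))
    return selected


# Hypothetical softmax outputs for three unlabeled target-domain samples.
target_probs = [
    [0.95, 0.03, 0.02],  # confident: class 0
    [0.40, 0.35, 0.25],  # ambiguous: dropped
    [0.05, 0.92, 0.03],  # confident: class 1
]
pseudo_labeled = select_pseudo_labels(target_probs)
```

A higher threshold trades coverage for label purity; the selected pairs would then be added to the training set for the next round.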

翻訳日:2021-05-22 20:40:46 公開日:2020-12-04

# オブジェクト検出のためのグローバルコンテキスト認識RCNN

Global Context Aware RCNN for Object Detection ( http://arxiv.org/abs/2012.02637v1 )

ライセンス: Link先を確認

Wenchao Zhang, Chong Fu, Haoyu Xie, Mai Zhu, Ming Tie, Junxin Chen

(参考訳) RoIPool/RoIAlignは、典型的な2段階オブジェクト検出アルゴリズムに不可欠なプロセスであり、特徴ピラミッドから切り出したオブジェクト提案を再スケールして固定サイズの特徴マップを生成するために使用される。しかし、これらの局所受容領域の特徴マップは、グローバルな文脈情報を著しく失うことになる。この問題に対処するため,グローバルな文脈情報を融合することで背景と前景の空間的相関を強化することを目的とした,GCA (Global Context Aware) RCNNと呼ばれる新しいエンドツーエンドのトレーニング可能なフレームワークを提案する。GCAフレームワークの中核となるコンポーネントは、グローバルな特徴ピラミッドとアテンション戦略をそれぞれ特徴抽出と特徴改善に使用するコンテキスト認識メカニズムである。具体的には、FPNのトップダウンプロセスにおけるグローバルコンテキストの情報フローを改善するために密な接続を活用し、さらにアテンション機構を使用して、特徴ピラミッドの各レベルにおけるグローバルコンテキストを洗練する。最後に,モデルの複雑さと計算負荷をわずかに増やすだけの,本手法の軽量版も提示する。COCOベンチマークデータセットの実験結果は、我々のアプローチの大きな利点を示している。

RoIPool/RoIAlign is an indispensable process in typical two-stage object detection algorithms; it rescales the object proposals cropped from the feature pyramid to generate fixed-size feature maps. However, these cropped feature maps of local receptive fields lose much of the global context information. To tackle this problem, we propose a novel end-to-end trainable framework, called Global Context Aware (GCA) RCNN, which aims to help the neural network strengthen the spatial correlation between the background and the foreground by fusing global context information. The core component of our GCA framework is a context-aware mechanism, in which a global feature pyramid and attention strategies are used for feature extraction and feature refinement, respectively. Specifically, we leverage dense connections to improve the information flow of the global context at different stages in the top-down process of FPN, and further use the attention mechanism to refine the global context at each level of the feature pyramid. Finally, we also present a lightweight version of our method, which only slightly increases model complexity and computational burden. Experimental results on the COCO benchmark dataset demonstrate the significant advantages of our approach.

翻訳日:2021-05-22 20:40:24 公開日:2020-12-04

# 自然言語を用いたモーメントローカライゼーションのためのマルチスケール2次元時間隣接ネットワーク

Multi-Scale 2D Temporal Adjacent Networks for Moment Localization with Natural Language ( http://arxiv.org/abs/2012.02646v1 )

ライセンス: Link先を確認

Songyang Zhang, Houwen Peng, Jianlong Fu, Yijuan Lu, Jiebo Luo

(参考訳) 自然言語を用いて未トリミングの映像から特定の瞬間を検索する問題に対処する。ターゲットモーメントは、未トリミングビデオの他の時間モーメントの文脈で発生する可能性があるため、これは難しい問題である。既存の手法では、時間的モーメント間の時間的コンテキストを十分に考慮していないため、この課題にうまく取り組めない。本稿では,ビデオモーメント間の時間的文脈を,時間スケールの異なる事前定義された2次元マップのセットでモデル化する。各マップについて、1つの次元はモーメントの開始時刻を示し、もう1つの次元は持続時間を示す。これらの2次元時間マップは、異なる長さの様々なビデオモーメントをカバーでき、隣接するコンテキストを異なる時間スケールで表現することができる。この2次元時間マップに基づき,モーメントローカライゼーションのためのシングルショットフレームワークであるMS-2D-TAN(Multi-Scale Temporal Adjacent Network)を提案する。ビデオモーメントと参照表現をマッチングする識別特徴を学習しながら、隣接する時間的文脈を各スケールで符号化することができる。提案したMS-2D-TANを,Charades-STA,ActivityNet Captions,TACoSの3つの挑戦的ベンチマークで評価し,最先端手法を上回る性能を示した。

We address the problem of retrieving a specific moment from an untrimmed video by natural language. It is a challenging problem because a target moment may take place in the context of other temporal moments in the untrimmed video. Existing methods cannot tackle this challenge well since they do not fully consider the temporal contexts between temporal moments. In this paper, we model the temporal context between video moments by a set of predefined two-dimensional maps under different temporal scales. For each map, one dimension indicates the starting time of a moment and the other indicates the duration. These 2D temporal maps can cover diverse video moments with different lengths, while representing their adjacent contexts at different temporal scales. Based on the 2D temporal maps, we propose a Multi-Scale Temporal Adjacent Network (MS-2D-TAN), a single-shot framework for moment localization. It is capable of encoding the adjacent temporal contexts at each scale, while learning discriminative features for matching video moments with referring expressions. We evaluate the proposed MS-2D-TAN on three challenging benchmarks, i.e., Charades-STA, ActivityNet Captions, and TACoS, where our MS-2D-TAN outperforms the state of the art.

翻訳日:2021-05-22 20:40:06 公開日:2020-12-04
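
The 2D temporal map described in this abstract can be illustrated with a minimal Python sketch. It assumes a video split into `num_clips` equal-length clips and an indexing convention (hypothetical, not taken from the paper) in which cell `(start, d)` stands for the moment covering clips `start` through `start + d`; cells that would run past the end of the video are marked invalid. Coarser temporal scales are imitated here by grouping clips with a stride, which is only one possible reading of "different temporal scales". All function names are illustrative.

```python
def build_2d_temporal_map(num_clips):
    """Return a num_clips x num_clips grid of (start, end) spans or None.

    Row index    = start clip of the moment.
    Column index = duration offset d, so the moment spans d + 1 clips.
    Cells whose span would exceed the video are invalid (None).
    """
    grid = []
    for start in range(num_clips):
        row = []
        for d in range(num_clips):
            end = start + d  # inclusive index of the last clip
            row.append((start, end) if end < num_clips else None)
        grid.append(row)
    return grid


def multi_scale_maps(num_clips, strides=(1, 2, 4)):
    """Build one map per temporal scale by merging `stride` clips into one unit."""
    maps = {}
    for stride in strides:
        units = num_clips // stride  # number of coarse units at this scale
        if units > 0:
            maps[stride] = build_2d_temporal_map(units)
    return maps


def count_valid(grid):
    """Number of candidate moments represented by a map."""
    return sum(cell is not None for row in grid for cell in row)


if __name__ == "__main__":
    for stride, grid in multi_scale_maps(8).items():
        print(stride, len(grid), count_valid(grid))
```

Only the upper-left triangle of each map is valid, so a map over `n` units holds `n * (n + 1) / 2` candidate moments; coarser scales cover long moments with far fewer cells.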

# 等尺多形マッチング

Isometric Multi-Shape Matching ( http://arxiv.org/abs/2012.02689v1 )

ライセンス: Link先を確認

Maolin Gao, Zorah Lähner, Johan Thunberg, Daniel Cremers, Florian Bernard

(参考訳) 形状の対応を見つけることはコンピュータビジョンとグラフィックスの基本的な問題であり、3D再構成、オブジェクト追跡、スタイル転送など多くのアプリケーションに関係している。対応メソッドの大部分は、たとえ同じクラスの複数のインスタンスが利用可能であっても、形状のペア間の解を見つけることを目的としている。アイソメトリーは形状対応問題においてしばしば研究されるが、マルチマッチング環境では明確には考慮されていない。本稿では,等尺的マルチ形状マッチングの新しい最適化式を提案することにより,このギャップを埋める。定式化を解くのに適した最適化アルゴリズムを提案し,コンバージェンスと複雑性解析を提供する。提案アルゴリズムは, 確実にサイクル一貫性のあるマルチマッチングを実現する。提案手法の各種データセット上での優れた性能を実証し,等尺的マルチ形状マッチングにおける新しい最先端技術の設定を行う。

Finding correspondences between shapes is a fundamental problem in computer vision and graphics, which is relevant for many applications, including 3D reconstruction, object tracking, and style transfer. The vast majority of correspondence methods aim to find a solution between pairs of shapes, even if multiple instances of the same class are available. While isometries are often studied in shape correspondence problems, they have not been considered explicitly in the multi-matching setting. This paper closes this gap by proposing a novel optimisation formulation for isometric multi-shape matching. We present a suitable optimisation algorithm for solving our formulation and provide a convergence and complexity analysis. Our algorithm obtains multi-matchings that are by construction provably cycle-consistent. We demonstrate the superior performance of our method on various datasets and set the new state-of-the-art in isometric multi-shape matching.

翻訳日:2021-05-22 20:39:44 公開日:2020-12-04

# SMPLyによる野生における3次元人物位置推定のベンチマーク

SMPLy Benchmarking 3D Human Pose Estimation in the Wild ( http://arxiv.org/abs/2012.02743v1 )

ライセンス: Link先を確認

Vincent Leroy, Philippe Weinzaepfel, Romain Brégier, Hadrien Combaluzier, Grégory Rogez

(参考訳) 画像から3d人間のポーズを予測することは、最近非常に改善されている。単一の入力画像からポーズと形状の両方を予測できる新しいアプローチが導入されており、しばしばsmplのような人体のパラメトリックモデルに依存している。このような方法の質的な結果はしばしば、撮影中の画像に対して示されるが、モーションキャプチャー室よりも地上の3Dポーズを得るのが難しいため、そのような条件下での適切なベンチマークはいまだに欠落している。本稿では,これらのデータセットを正確な地上構造で容易に生成し,検証するためのパイプラインを提案する。我々は、最近導入されたMannequin Challengeデータセットを利用して、彫像のようなアクションで凍った人々の野生のビデオを収録し、人々が静的であり、カメラがSMPLモデルに正確に適合するように動いているという事実を活用する。登録されたボディモデルを持つ合計24,428フレームは、オンラインRGBビデオのみを使用して、ほぼ無償で567シーンから選択される。我々は,このデータセット上で,最先端のSMPLに基づく人間のポーズ推定手法をベンチマークする。以上の結果から,課題は,特に難易度の高いポーズや,人が部分的に行き詰まったり隠されたりした場面に残ることが示唆された。

Predicting 3D human pose from images has seen great recent improvements. Novel approaches that can even predict both pose and shape from a single input image have been introduced, often relying on a parametric model of the human body such as SMPL. While qualitative results for such methods are often shown for images captured in-the-wild, a proper benchmark in such conditions is still missing, as it is cumbersome to obtain ground-truth 3D poses outside a motion capture room. This paper presents a pipeline to easily produce and validate such a dataset with accurate ground-truth, with which we benchmark recent 3D human pose estimation methods in-the-wild. We make use of the recently introduced Mannequin Challenge dataset, which contains in-the-wild videos of people frozen in action like statues, and leverage the fact that the people are static and the camera is moving to accurately fit the SMPL model to the sequences. A total of 24,428 frames with registered body models are then selected from 567 scenes at almost no cost, using only online RGB videos. We benchmark state-of-the-art SMPL-based human pose estimation methods on this dataset. Our results highlight that challenges remain, in particular for difficult poses or for scenes where the persons are partially truncated or occluded.

翻訳日:2021-05-22 20:38:52 公開日:2020-12-04

# 弾性重み強化による少数ショット画像生成

Few-shot Image Generation with Elastic Weight Consolidation ( http://arxiv.org/abs/2012.02780v1 )

ライセンス: Link先を確認

Yijun Li, Richard Zhang, Jingwan Lu, Eli Shechtman

(参考訳) 少数ショット画像生成は、利用可能なトレーニング例がほとんどなく、所定のドメインのより多くのデータを生成することを目指している。少数の観測結果(絵文字など)から分布を完全に推測することは理にかなわないため、我々は大規模な関連するソースドメインを事前訓練(人間の顔など)として活用しようと試みている。したがって、ターゲットの外観に適応しながら、ソースドメインの多様性を保ちたいと考えています。対象ドメインのいくつかの例に、追加のパラメータを導入することなく、事前訓練されたモデルを適用する。重要なことは、この適応の際の重みの変化を規則化し、ターゲットを適合させながら、ソースデータセットの情報を最もよく保存する。極めて少ない例(例: <10)を含む,異なる対象領域の高品質な結果を生成することで,アルゴリズムの有効性を実証する。また,サンプル数やソースとターゲットドメインの相違点など,いくつかの重要な要因について,本手法の性能分析を行った。

Few-shot image generation seeks to generate more data of a given domain, with only a few available training examples. As it is unreasonable to expect to fully infer the distribution from just a few observations (e.g., emojis), we seek to leverage a large, related source domain as pretraining (e.g., human faces). Thus, we wish to preserve the diversity of the source domain, while adapting to the appearance of the target. We adapt a pretrained model, without introducing any additional parameters, to the few examples of the target domain. Crucially, we regularize the changes of the weights during this adaptation, in order to best preserve the information of the source dataset, while fitting the target. We demonstrate the effectiveness of our algorithm by generating high-quality results of different target domains, including those with extremely few examples (e.g., <10). We also analyze the performance of our method with respect to some important factors, such as the number of examples and the dissimilarity between the source and target domain.

翻訳日:2021-05-22 20:38:31 公開日:2020-12-04
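
The idea of regularizing weight changes during adaptation can be sketched with the standard Elastic Weight Consolidation penalty, $\lambda \sum_i F_i (\theta_i - \theta_i^{*})^2$, where $\theta^{*}$ are the pretrained weights and $F_i$ is a Fisher-information estimate of how important weight $i$ is for the source domain. The plain-Python sketch below uses toy values; the function names, the default `lam`, and the Fisher estimates are hypothetical and not taken from the paper, which may weight or estimate these terms differently.

```python
def ewc_penalty(weights, source_weights, fisher, lam=1.0):
    """Quadratic penalty that discourages moving important weights.

    weights        : current (adapted) parameter values
    source_weights : parameter values of the pretrained source model
    fisher         : per-parameter importance estimates (assumed given)
    lam            : regularization strength (hypothetical default)
    """
    if not (len(weights) == len(source_weights) == len(fisher)):
        raise ValueError("all parameter lists must have the same length")
    return lam * sum(
        f * (w - w0) ** 2
        for w, w0, f in zip(weights, source_weights, fisher)
    )


def total_loss(adaptation_loss, weights, source_weights, fisher, lam=1.0):
    """Target-domain loss plus the EWC regularizer."""
    return adaptation_loss + ewc_penalty(weights, source_weights, fisher, lam)


if __name__ == "__main__":
    theta_src = [0.5, -1.0, 2.0]
    theta_new = [0.5, -0.5, 1.0]
    fisher = [10.0, 2.0, 0.0]  # third weight is treated as unimportant for the source
    print(ewc_penalty(theta_new, theta_src, fisher))
```

A weight with a zero importance estimate can move freely toward the target domain, while a weight with a large estimate is held close to its pretrained value.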

# Dempster-Shafer理論に基づく新しいマルチ分類器情報融合:振動に基づく故障検出への応用

A novel multi-classifier information fusion based on Dempster-Shafer theory: application to vibration-based fault detection ( http://arxiv.org/abs/2012.02481v1 )

ライセンス: Link先を確認

Vahid Yaghoubi, Liangliang Cheng, Wim Van Paepegem, Mathias Kersemans

(参考訳) 高い予測率を達成することは、障害検出において重要なタスクである。様々な分類手順が利用できるが、それらが全てのアプリケーションに高い精度を与えることはない。そこで本稿では,個別の分類器の性能を高めるために,新しいマルチ分類器融合手法を開発した。これは Dempster-Shafer theory (DST) を用いて得られる。しかし、矛盾する証拠がある場合、DSTは反直感的な結果を与える可能性がある。この点において、証拠間の衝突を計測・緩和するために、新しい計量に基づく前処理技術が考案された。提案手法の有効性を評価し検証するために,uciとkeelの15のベンチマークデータセットに適用した。さらに、その広帯域振動応答に基づいて、多結晶ニッケル合金第一段タービンブレードを分類する。ノイズ-信号比の異なる統計解析と4つの最先端融合技術との比較により,提案手法は分類精度を向上し,個々の分類器よりも優れていることを示す。

Achieving a high prediction rate is a crucial task in fault detection. Although various classification procedures are available, none of them can give high accuracy in all applications. Therefore, in this paper, a novel multi-classifier fusion approach is developed to boost the performance of the individual classifiers. This is achieved by using Dempster-Shafer theory (DST). However, in cases with conflicting evidence, DST may give counter-intuitive results. In this regard, a preprocessing technique based on a new metric is devised in order to measure and mitigate the conflict between the pieces of evidence. To evaluate and validate the effectiveness of the proposed approach, the method is applied to 15 benchmark datasets from UCI and KEEL. Further, it is applied to classifying polycrystalline Nickel alloy first-stage turbine blades based on their broadband vibrational response. Through statistical analysis with different levels of noise-to-signal ratio, and by comparing with four state-of-the-art fusion techniques, it is shown that the proposed method improves the classification accuracy and outperforms the individual classifiers.

翻訳日:2021-05-22 20:37:58 公開日:2020-12-04
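
The fusion step and the counter-intuitive behavior under conflict can be illustrated with the classical Dempster rule of combination. The sketch below is a minimal Python version of that textbook rule only; the paper's new conflict metric and preprocessing technique are not described in the abstract and are not reproduced here. The class labels and mass values are hypothetical.

```python
from itertools import product


def combine_dempster(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of class labels (a focal element)
    to its mass; the masses of one function are assumed to sum to 1.
    Returns (combined_mass_function, conflict), where conflict is the total
    mass that the two sources assign to incompatible hypotheses.
    """
    joint = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            joint[common] = joint.get(common, 0.0) + mass_a * mass_b
        else:
            conflict += mass_a * mass_b
    agreement = sum(joint.values())
    if agreement == 0.0:
        raise ValueError("sources are in total conflict; the rule is undefined")
    # Renormalize by the non-conflicting mass (equal to 1 - conflict).
    return {k: v / agreement for k, v in joint.items()}, conflict


if __name__ == "__main__":
    # Two hypothetical classifiers that strongly disagree (a Zadeh-style example).
    clf1 = {frozenset({"healthy"}): 0.99, frozenset({"cracked"}): 0.01}
    clf2 = {frozenset({"porous"}): 0.99, frozenset({"cracked"}): 0.01}
    fused, conflict = combine_dempster(clf1, clf2)
    print(fused, conflict)
```

In the example, both classifiers consider "cracked" very unlikely, yet after renormalization it receives all of the fused mass because it is the only hypothesis they do not contradict each other on. This is the kind of result that motivates measuring and reducing conflict before fusion.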

# モバイル活動モニタリングのための不均一なラベルとモデルを用いたフェデレーション学習

Federated Learning with Heterogeneous Labels and Models for Mobile Activity Monitoring ( http://arxiv.org/abs/2012.02539v1 )

ライセンス: Link先を確認

Gautham Krishna Gudur, Satheesh K. Perepu

(参考訳) 生活支援,転倒検出などの様々な医療応用には,HAR(Human Activity Recognition)によるユーザ行動のモデル化が必要である。このようなアプリケーションは、効果的なパーソナライズアクティビティモニタリングのために機械学習技術を使用して、複数のリソースに制約されたユーザーデバイスからの洞察のキャラクタリゼーションを要求する。デバイス上の連合学習は、分散および協調機械学習にとって効果的なアプローチであることが証明されている。しかし、統計(非IIDデータ)とユーザ間の不均一性をモデル化する上で、さまざまな課題がある。さらに,本論文では,連合学習中にユーザ間のラベル(アクティビティ)の不均一性を扱うための,新たな関心課題について検討する。そこで本稿では, モデル蒸留更新を用いて, 重なり合う情報ゲインを利用する, ラベルに基づくアグリゲーションのためのフレームワークを提案する。また,デバイスからサーバへのモデルウェイト転送よりも,モデルスコアのフェデレーション転送が十分であることを示す。 raspberry pi 2のhhar(hetergeneity human activity recognition)データセットによる経験的評価は、平均決定論的精度が少なくとも11.01%上昇していることを示し、提案フレームワークのオンデバイス能力を示している。

Various health-care applications such as assisted living, fall detection, etc., require modeling of user behavior through Human Activity Recognition (HAR). Such applications demand characterization of insights from multiple resource-constrained user devices using machine learning techniques for effective personalized activity monitoring. On-device Federated Learning proves to be an effective approach for distributed and collaborative machine learning. However, there are a variety of challenges in addressing statistical (non-IID data) and model heterogeneities across users. In addition, in this paper, we explore a new challenge of interest -- to handle heterogeneities in labels (activities) across users during federated learning. To this end, we propose a framework for federated label-based aggregation, which leverages overlapping information gain across activities using Model Distillation Update. We also propose that federated transfer of model scores is sufficient rather than model weight transfer from device to server. Empirical evaluation with the Heterogeneity Human Activity Recognition (HHAR) dataset (with four activities for effective elucidation of results) on Raspberry Pi 2 indicates an average deterministic accuracy increase of at least ~11.01%, thus demonstrating the on-device capabilities of our proposed framework.

翻訳日:2021-05-22 20:37:01 公開日:2020-12-04

# 重みの初期設定が人工ニューラルネットワークのトレーニングと機能に及ぼす影響

Effect of the initial configuration of weights on the training and function of artificial neural networks ( http://arxiv.org/abs/2012.02550v1 )

ライセンス: Link先を確認

R. J. Jesus, M. L. Antunes, R. A. da Costa, S. N. Dorogovtsev, J. F. F. Mendes, R. L. Aguiar

(参考訳) ニューラルネットワークの機能と性能は、トレーニングの過程における重みとバイアスの進化によって決定される。本研究では,SGD(Stochastic Gradient Descent)を用いて学習した2層ReLUネットワークの重みの偏りを,初期ランダムな構成から定量的に評価する。この偏差の分布関数の進化とトレーニング中の損失の進化を比較した。我々は,SGDによるトレーニングを成功させることで,初期重量設定の近辺にネットワークを置き去りにすることを発見した。リンクの初期重みごとに、トレーニング後のこの値から偏差の分布関数を測定し、この分布とそのピークのモーメントが初期重みに依存するかを見出した。トレーニング中,これらの偏差の進化を探究し,オーバーフィット領域内での急激な増加を観察した。このジャンプは、損失関数の進化で記録された同様の急上昇と同時に起こる。以上の結果から,SGDが局所最小値を効率的に検出できる能力は,重量のランダムな初期配置の近傍に限られていることが示唆された。

The function and performance of neural networks are largely determined by the evolution of their weights and biases in the process of training, starting from the initial configuration of these parameters and ending at one of the local minima of the loss function. We perform a quantitative statistical characterization of the deviation of the weights of two-hidden-layer ReLU networks of various sizes trained via Stochastic Gradient Descent (SGD) from their initial random configuration. We compare the evolution of the distribution function of this deviation with the evolution of the loss during training. We observed that successful training via SGD leaves the network in the close neighborhood of the initial configuration of its weights. For each initial weight of a link we measured the distribution function of the deviation from this value after training and found how the moments of this distribution and its peak depend on the initial weight. We explored the evolution of these deviations during training and observed an abrupt increase within the overfitting region. This jump occurs simultaneously with a similarly abrupt increase recorded in the evolution of the loss function. Our results suggest that SGD's ability to efficiently find local minima is restricted to the vicinity of the random initial configuration of weights.

翻訳日:2021-05-22 20:36:39 公開日:2020-12-04

# 敵対的事例に対する多重防衛戦略の提唱

Advocating for Multiple Defense Strategies against Adversarial Examples ( http://arxiv.org/abs/2012.02632v1 )

ライセンス: Link先を確認

Alexandre Araujo, Laurent Meunier, Rafael Pinot, Benjamin Negrevergne

(参考訳) ニューラルネットワークを$\ell_\infty$敵の例から保護するために設計された防御機構が、$\ell_2$敵の例に対して性能が劣っていることを実証的に観察されている。本稿では,この観察を検証する幾何学的解析を行う。そこで本研究では,この現象の実際的影響を説明するための実証的な知見を多数提示する。次に,防衛戦略を混合することにより,複数の攻撃に対して防御しようとする既存の防御機構について検討する。本稿の数値実験により,本手法の妥当性を議論し,実例コミュニティに対してオープン質問を提示する。

It has been empirically observed that defense mechanisms designed to protect neural networks against $\ell_\infty$ adversarial examples offer poor performance against $\ell_2$ adversarial examples and vice versa. In this paper we conduct a geometrical analysis that validates this observation. We then provide a number of empirical insights to illustrate the effect of this phenomenon in practice. Next, we review some of the existing defense mechanisms that attempt to defend against multiple attacks by mixing defense strategies. Based on our numerical experiments, we discuss the relevance of this method and state open questions for the adversarial examples community.

翻訳日:2021-05-22 20:36:23 公開日:2020-12-04

# 階層的クラスタリングとゼロ永久ホモロジー

Hierarchical Clustering and Zeroth Persistent Homology ( http://arxiv.org/abs/2012.02655v1 )

ライセンス: Link先を確認

İsmail Güzel and Atabey Kaygun

(参考訳) 本稿では,階層的クラスタリングと第0次永続ホモロジーが,与えられたデータセットに関する同じ位相情報を提供することを示す。この事実は、手元にあるデータセットのフィルター付きVietoris-Rips複体から構築されたコフェネティック行列を用いて示される。任意のコフェネティック行列と同様に、根付き木(デンドログラムとも呼ばれる)を通して第0次ホモロジー類の相互関係を表示することもできる。ホモロジー的コフェネティック行列は高次のホモロジーに対しても計算できるので、高次の永続ホモロジー類に対しても同様のデンドログラムを描くことができる。

In this article, we show that hierarchical clustering and the zeroth persistent homology deliver the same topological information about a given data set. We show this fact using cophenetic matrices constructed out of the filtered Vietoris-Rips complex of the data set at hand. As in any cophenetic matrix, one can also display the inter-relations of zeroth homology classes via a rooted tree, also known as a dendrogram. Since homological cophenetic matrices can be calculated for higher homologies, one can also sketch similar dendrograms for higher persistent homology classes.

翻訳日:2021-05-22 20:36:12 公開日:2020-12-04
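
The correspondence stated in this abstract can be checked on small examples with a short Python sketch. It assumes single-linkage clustering (the abstract only says "hierarchical clustering") and a Vietoris-Rips convention in which an edge enters the filtration at a scale equal to the distance between its endpoints; under those assumptions, the merge heights of the dendrogram are the finite death times of the zeroth persistent homology classes. The function name is illustrative.

```python
import math


def single_linkage_h0(points):
    """Single-linkage merge heights and cophenetic matrix via Kruskal's algorithm.

    points: list of coordinate tuples.
    Returns (merge_heights, cophenetic), where merge_heights lists the scales
    at which two connected components of the Vietoris-Rips filtration merge
    (the finite death times in zeroth persistent homology), and
    cophenetic[i][j] is the scale at which points i and j first share a component.
    """
    n = len(points)
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i in range(n)
        for j in range(i + 1, n)
    )
    component = list(range(n))            # component label of each point
    members = {i: [i] for i in range(n)}  # label -> points in that component
    cophenetic = [[0.0] * n for _ in range(n)]
    merge_heights = []
    for dist, i, j in edges:
        ci, cj = component[i], component[j]
        if ci == cj:
            continue  # edge closes a cycle; no component dies here
        merge_heights.append(dist)
        for a in members[ci]:
            for b in members[cj]:
                cophenetic[a][b] = cophenetic[b][a] = dist
        for b in members[cj]:
            component[b] = ci
        members[ci].extend(members.pop(cj))
    return merge_heights, cophenetic


if __name__ == "__main__":
    heights, coph = single_linkage_h0([(0.0,), (1.0,), (3.0,), (7.0,)])
    print(heights)
    for row in coph:
        print(row)
```

The accepted edges form a minimum spanning tree of the point cloud, and the cophenetic matrix records the same merge structure that a dendrogram would draw.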

# 複数当選者の承認投票における有権者のモデリング

Modeling Voters in Multi-Winner Approval Voting ( http://arxiv.org/abs/2012.02811v1 )

ライセンス: Link先を確認

Jaelle Scheuerman, Jason Harman, Nicholas Mattei, K. Brent Venable

(参考訳) 多くの現実の状況では、集団的な意思決定は投票によって行われ、委員会や理事会の選挙のようなシナリオでは、複数の勝者を返す投票規則が採用される。複数当選者の承認投票(AV)では、エージェントは希望する数の候補者に対する承認からなる票を提出し、票を集計して最も多くの承認を得た上位$k$人の候補者が勝者に選ばれる。多くのシナリオでは、エージェントは真の選好を反映しない方法で投票することで、提出する票を操作し、より良い結果を達成することができる。複雑で不確実な状況では、エージェントは自分に最も有利な操作を計算するための追加の労力を払う代わりに、ヒューリスティックを使用することがある。本稿では,Mechanical Turkから得られた行動データを用いて,不確実性の度合いの異なる単一当選者および複数当選者の承認投票シナリオにおける投票行動を検討する。一般的に、人々はより良い結果を得るために投票を操作するが、しばしば最適な操作を特定できない。COMSOCや心理学の文献には、認知的に妥当なヒューリスティック戦略に基づくエージェント行動の予測モデルが多数存在する。既存の手法では実世界のデータを適切にモデル化できないことを示す。本稿では,当選者集合の大きさと人間の認知的制約を考慮した新しいモデルを提案し,このモデルが複数当選者の承認投票シナリオにおける実世界の行動の把握により有効であることを示す。

In many real world situations, collective decisions are made using voting and, in scenarios such as committee or board elections, employing voting rules that return multiple winners. In multi-winner approval voting (AV), an agent submits a ballot consisting of approvals for as many candidates as they wish, and winners are chosen by tallying up the votes and choosing the top-$k$ candidates receiving the most approvals. In many scenarios, an agent may manipulate the ballot they submit in order to achieve a better outcome by voting in a way that does not reflect their true preferences. In complex and uncertain situations, agents may use heuristics instead of incurring the additional effort required to compute the manipulation which most favors them. In this paper, we examine voting behavior in single-winner and multi-winner approval voting scenarios with varying degrees of uncertainty using behavioral data obtained from Mechanical Turk. We find that people generally manipulate their vote to obtain a better outcome, but often do not identify the optimal manipulation. There are a number of predictive models of agent behavior in the COMSOC and psychology literature that are based on cognitively plausible heuristic strategies. We show that the existing approaches do not adequately model real-world data. We propose a novel model that takes into account the size of the winning set and human cognitive constraints, and demonstrate that this model is more effective at capturing real-world behaviors in multi-winner approval voting scenarios.

翻訳日:2021-05-22 20:35:57 公開日:2020-12-04
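
The multi-winner approval voting rule described in this abstract (tally the approvals, pick the top-$k$ candidates) can be sketched in a few lines of Python. The alphabetical tie-breaking rule and the example ballots are assumptions made for illustration; the abstract does not say how ties are resolved, and the behavioral model proposed in the paper is not reproduced here.

```python
from collections import Counter


def approval_winners(ballots, k):
    """Return the k candidates with the most approvals.

    ballots: iterable of sets, each holding the candidates one voter approves.
    Ties are broken alphabetically, which is an assumption of this sketch.
    """
    tally = Counter()
    for ballot in ballots:
        tally.update(set(ballot))  # each voter approves a candidate at most once
    ranked = sorted(tally, key=lambda c: (-tally[c], c))
    return ranked[:k]


if __name__ == "__main__":
    ballots = [{"A", "B"}, {"B", "C"}, {"B"}, {"A", "C"}, {"A"}]
    print(approval_winners(ballots, 2))
```

A manipulation in this setting amounts to a voter submitting a ballot different from the set of candidates they truly approve of, for example dropping an approval for a strong rival of their favorite candidate.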

# 高分解能画像インパインティング用ジェネレータピラミッド

Generator Pyramid for High-Resolution Image Inpainting ( http://arxiv.org/abs/2012.02381v1 )

ライセンス: Link先を確認

Leilei Cao, Tong Yang, Yixu Wang, Bo Yan, Yandong Guo

(参考訳) 大きな穴を持つ高解像度画像のインペインティングは、既存のディープラーニングベースのイメージインペインティング手法にとって難題である。本稿では,コンテンツ補完とテクスチャ合成を明示的に分離する,高解像度画像インペインティングタスクのための新しいフレームワークであるPyramidFillを提案する。PyramidFillは、低解像度画像で未知の領域の内容を補完し、高解像度画像で未知の領域のテクスチャを徐々に合成しようとする。したがって,本モデルは完全畳み込み型GANのピラミッドで構成されており,コンテンツGANが最低解像度のマスク画像における内容の補完を担い,各テクスチャGANがより高解像度の画像におけるテクスチャの合成を担う。コンテンツの補完とテクスチャの合成はジェネレータに異なる能力を要求するため、コンテンツGANとテクスチャGANには異なるアーキテクチャを設計する。CelebA-HQ、Places2、および新しい自然景観データセット(NSHQ)を含む、解像度の異なる複数のデータセットでの実験は、PyramidFillが最先端の手法よりも高品質なインペインティング結果を生成することを示した。高解像度画像インペインティング手法をより適切に評価するため,解像度1920$\times$1080の高品質な自然景観画像であるNSHQを公開する。

Inpainting high-resolution images with large holes challenges existing deep learning based image inpainting methods. We present a novel framework, PyramidFill, for the high-resolution image inpainting task, which explicitly disentangles content completion and texture synthesis. PyramidFill attempts to complete the content of unknown regions in a lower-resolution image, and to synthesize the textures of unknown regions in a higher-resolution image, progressively. Thus, our model consists of a pyramid of fully convolutional GANs, wherein the content GAN is responsible for completing contents in the lowest-resolution masked image, and each texture GAN is responsible for synthesizing textures in a higher-resolution image. Since completing contents and synthesizing textures demand different abilities from generators, we customize different architectures for the content GAN and texture GAN. Experiments on multiple datasets including CelebA-HQ, Places2 and a new natural scenery dataset (NSHQ) with different resolutions demonstrate that PyramidFill generates higher-quality inpainting results than the state-of-the-art methods. To better assess high-resolution image inpainting methods, we will release NSHQ, a set of high-quality natural scenery images at a resolution of 1920$\times$1080.

翻訳日:2021-05-22 20:35:24 公開日:2020-12-04

# XraySyn:CTによる1枚のX線写真からのリアルなビュー合成

XraySyn: Realistic View Synthesis From a Single Radiograph Through CT Priors ( http://arxiv.org/abs/2012.02407v1 )

ライセンス: Link先を確認

Cheng Peng, Haofu Liao, Gina Wong, Jiebo Luo, Shaohua Kevin Zhou, Rama Chellappa

(参考訳) 放射線写真は、X線を用いて患者の内部解剖を視覚化し、3D情報を2次元平面に投影する。そのため、ラジオグラフィー分析では、医師が3Dヒト解剖学と2Dラジオグラフィーを関連付ける必要がある。少ない範囲で新しいラジオグラフィックビューを合成することは、医師が解剖学をより確実に解釈するのに役立つが、ラジオグラフビューの合成は非常に不適切であり、ペアデータに欠けており、学習に基づくアプローチを活用するために微分可能な操作が欠如している。これらの問題に対処するために,CT(Computerd Tomography)をラジオグラフィシミュレーションに使用し,識別可能なプロジェクションアルゴリズムを設計することにより,ラジオグラフィとCTドメイン間の幾何学的一貫した変換を実現する。 XraySynはリアルなシミュレーションとリアルなラジオグラフィーの微調整を組み合わせることで、リアルなラジオグラフィーの新たなビューを合成することができる。私たちの知る限りでは、ラジオグラフィビューの合成に関する最初の研究である。また, 3次元空間におけるx線撮影の理解を得ることにより, 接地骨ラベルを使わずに, 放射線画像の抽出と抑制に応用できることを示した。

A radiograph visualizes the internal anatomy of a patient through the use of X-ray, which projects 3D information onto a 2D plane. Hence, radiograph analysis naturally requires physicians to relate their prior knowledge of 3D human anatomy to 2D radiographs. Synthesizing novel radiographic views in a small range can assist physicians in interpreting anatomy more reliably; however, radiograph view synthesis is heavily ill-posed, lacks paired data, and lacks differentiable operations to leverage learning-based approaches. To address these problems, we use Computed Tomography (CT) for radiograph simulation and design a differentiable projection algorithm, which enables us to achieve geometrically consistent transformations between the radiography and CT domains. Our method, XraySyn, can synthesize novel views on real radiographs through a combination of realistic simulation and finetuning on real radiographs. To the best of our knowledge, this is the first work on radiograph view synthesis. We show that by gaining an understanding of radiography in 3D space, our method can be applied to radiograph bone extraction and suppression without ground-truth bone labels.

翻訳日:2021-05-22 20:35:01 公開日:2020-12-04

# アテンションベースオートエンコーダを用いたマルチスケールメッシュ変形成分分析

Multiscale Mesh Deformation Component Analysis with Attention-based Autoencoders ( http://arxiv.org/abs/2012.02459v1 )

ライセンス: Link先を確認

Jie Yang, Lin Gao, Qingyang Tan, Yihua Huang, Shihong Xia and Yu-Kun Lai

(参考訳) 変形成分分析は幾何学処理と形状理解の基本的な問題である。既存のアプローチでは、主に局所的な変形成分を同様のスケールで抽出するが、実世界の物体の変形は通常マルチスケールで分散する。本稿では,注目型オートエンコーダを用いたマルチスケール変形成分の自動推定手法を提案する。このアテンション機構は、アクティブな変形領域におけるマルチスケール変形成分の軟重化を学習するために設計され、スタック化されたアテンションベースのオートエンコーダは、変形成分を異なるスケールで表現する。定量的および定性的評価は,本手法が最先端手法より優れていることを示す。また, 本手法で抽出した多スケール変形成分により, 形状を粗視的に編集でき, 新たな形状のモデル化が容易になる。

Deformation component analysis is a fundamental problem in geometry processing and shape understanding. Existing approaches mainly extract deformation components in local regions at a similar scale, while deformations of real-world objects are usually distributed in a multi-scale manner. In this paper, we propose a novel method to extract multiscale deformation components automatically with a stacked attention-based autoencoder. The attention mechanism is designed to learn to softly weight multi-scale deformation components in active deformation regions, and the stacked attention-based autoencoder is trained to represent the deformation components at different scales. Quantitative and qualitative evaluations show that our method outperforms state-of-the-art methods. Furthermore, with the multiscale deformation components extracted by our method, the user can edit shapes in a coarse-to-fine fashion, which facilitates effective modeling of new shapes.

翻訳日:2021-05-22 20:34:39 公開日:2020-12-04

# 医療セグメンテーションにおける不均衡問題に対するオフセット曲線の損失

Offset Curves Loss for Imbalanced Problem in Medical Segmentation ( http://arxiv.org/abs/2012.02463v1 )

ライセンス: Link先を確認

Ngan Le, Trung Le, Kashu Yamazaki, Toan Duc Bui, Khoa Luu, Marios Savides

(参考訳) 医用画像分割は医療分析において重要な役割を担い、多くの臨床応用に向けて広く開発されてきた。深層学習に基づくアプローチはセマンティックセグメンテーションにおいて高い性能を達成しているが、ピクセル単位の設定に限定され、クラス不均衡データの問題を抱えている。本稿では,高い特徴レベル(輪郭内の領域),中間の特徴レベル(輪郭周りのオフセット曲線),低い特徴レベル(輪郭)のすべてを考慮した新しい深層学習モデルを開発することにより,これらの限界に取り組む。提案するオフセット曲線(OsC)損失は,3つの主要なフィッティング項からなる。第1のフィッティング項は画素単位のセグメンテーションに焦点を当て、第2のフィッティング項は境界周辺の領域(オフセット曲線)に注意を向ける注意モデルとして機能する。第3の項は境界の長さを考慮した正則化項としての役割を担う。提案するOsC損失を2次元ネットワークと3次元ネットワークの両方で評価する。2つの一般的な医療データセット、すなわち網膜のDRIVEと脳腫瘍のBRATS 2018データセットを、提案する損失の性能のベンチマークに使用する。実験により,提案したOsC損失関数は,最も一般的なセグメンテーションネットワークであるUnetとFCN上で,Cross-Entropy,Dice,Focalなどの他の主流の損失関数よりも優れていることが示された。

Medical image segmentation plays an important role in medical analysis and has been widely developed for many clinical applications. Deep learning-based approaches have achieved high performance in semantic segmentation, but they are limited to the pixel-wise setting and suffer from the imbalanced-class data problem. In this paper, we tackle those limitations by developing a new deep learning-based model which takes into account the higher feature level (the region inside the contour), the intermediate feature level (offset curves around the contour), and the lower feature level (the contour). Our proposed Offset Curves (OsC) loss consists of three main fitting terms. The first fitting term focuses on pixel-wise segmentation, whereas the second fitting term acts as an attention model which pays attention to the area around the boundaries (offset curves). The third term acts as a regularization term which takes the length of the boundaries into account. We evaluate our proposed OsC loss on both 2D and 3D networks. Two common medical datasets, i.e. the retina DRIVE and brain tumor BRATS 2018 datasets, are used to benchmark the proposed loss. The experiments show that our proposed OsC loss function outperforms other mainstream loss functions such as Cross-Entropy, Dice, and Focal on the most common segmentation networks, Unet and FCN.

翻訳日:2021-05-22 20:34:26 公開日:2020-12-04

# ノイズグラフ構造からのノード表現の学習

Learning Node Representations from Noisy Graph Structures ( http://arxiv.org/abs/2012.02434v1 )

ライセンス: Link先を確認

Junshan Wang, Ziyao Li, Qingqing Long, Weiyu Zhang, Guojie Song, Chuan Shi

(参考訳) グラフ上の低次元表現を学習することは、様々な下流タスクに有効であることが証明されている。しかし、ネットワークのエッジがノード自身ではなくネットワーク全体のノイズを伝搬するという点で、ネットワークを妥協する現実世界のネットワークではノイズが一般的である。既存の手法は構造特性の保存に重点を置いているが、学習されたノイズに対する表現の堅牢性は一般的に無視される。本稿では,ノイズのないノード表現を学習し,同時にノイズを排除する新しい枠組みを提案する。実グラフ上ではノイズはしばしば未知であるため、教師なし環境での正常な構造とノイズを特定するために、グラフ生成器とノイズ発生器という2つのジェネレータを設計する。一方、グラフ生成器は、正規構造を生成するのに有用なグラフ事前知識を組み込む統一スキームとして機能する。本稿では,コミュニティ構造と権限-法次分布による生成過程を例に挙げる。一方、ノイズ発生器は、基本特性を満足するだけでなく、適応的にグラフノイズを生成する。したがって、任意の分布を持つ実雑音をうまく処理することができる。最後に,ノイズを除去し,ノイズのないノード表現を得るためには,2つの生成器を協調して最適化する必要がある。本モデルは実世界データと合成データの両方で評価される。これは、ノード分類やグラフ再構成タスクの他の強力なベースラインよりも優れており、グラフノイズを取り除く能力を示している。

Learning low-dimensional representations on graphs has proved to be effective in various downstream tasks. However, noise is prevalent in real-world networks and compromises them to a large extent, because edges propagate noise through the whole network rather than confining it to a single node. While existing methods tend to focus on preserving structural properties, the robustness of the learned representations against noise is generally ignored. In this paper, we propose a novel framework to learn noise-free node representations and eliminate noise simultaneously. Since noise is often unknown on real graphs, we design two generators, namely a graph generator and a noise generator, to identify normal structures and noise in an unsupervised setting. On the one hand, the graph generator serves as a unified scheme to incorporate any useful prior knowledge about graphs to generate normal structures. We illustrate the generative process with community structures and power-law degree distributions as examples. On the other hand, the noise generator generates graph noise that not only satisfies some fundamental properties but is also produced in an adaptive way. Thus, real noise with arbitrary distributions can be handled successfully. Finally, in order to eliminate noise and obtain noise-free node representations, the two generators need to be optimized jointly, and through maximum likelihood estimation, we equivalently convert the model into imposing different regularization constraints on the true graph and the noise respectively. Our model is evaluated on both real-world and synthetic data. It outperforms other strong baselines for node classification and graph reconstruction tasks, demonstrating its ability to eliminate graph noise.

翻訳日:2021-05-22 20:33:45 公開日:2020-12-04

# 機械学習による設計検証の最適化 - オープンソースソリューション

Optimising Design Verification Using Machine Learning: An Open Source Solution ( http://arxiv.org/abs/2012.02453v1 )

ライセンス: Link先を確認

B. Samhita Varambally, Naman Sehgal

(参考訳) 集積回路の複雑さが増すにつれ、設計検証はASIC設計フローの最も時間を要する部分となった。 SoC設計サイクルの70%近くは検証によって消費される。すべてのコーナーケースをテストする最も一般的な方法は、制約付きランダム検証を使用することである。ランダムな刺激は、可能なすべての組み合わせにぶつかり、設計を徹底的にテストするために与えられる。しかしながら、このアプローチは、すべてのコーナーケースに到達するために、重要な人間の専門知識を必要とすることが多い。本稿では,機械学習を用いて入力刺激を生成する手法を提案する。これにより、人間の介入が少なく、設計の徹底的な検証を迅速に行える。さらに,オープンソースの検証環境であるCocotbの利用を提案する。 Pythonをベースとしており、シンプルで直感的で、機械学習アプリケーションのための膨大な関数ライブラリを持っている。これにより、System VerilogやSpecman Eといった従来のハードウェア検証言語を使用する場合よりも使いやすくなっている。

With the complexity of Integrated Circuits increasing, design verification has become the most time consuming part of the ASIC design flow. Nearly 70% of the SoC design cycle is consumed by verification. The most commonly used approach to test all corner cases is through the use of Constrained Random Verification. Random stimulus is given in order to hit all possible combinations and test the design thoroughly. However, this approach often requires significant human expertise to reach all corner cases. This paper presents an alternative using Machine Learning to generate the input stimulus. This will allow for faster thorough verification of the design with less human intervention. Furthermore, it is proposed to use the open source verification environment 'Cocotb'. Based on Python, it is simple, intuitive and has a vast library of functions for machine learning applications. This makes it more convenient to use than the bulkier approach using traditional Hardware Verification Languages such as System Verilog or Specman E.

Translation date: 2021-05-22 20:33:24 Publication date: 2020-12-04

# Deep Interference Mitigation and Denoising of Real-World FMCW Radar Signals ( http://arxiv.org/abs/2012.02529v1 )

License: see the linked page

Johanna Rock, Mate Toth, Paul Meissner, Franz Pernkopf

Radar sensors are crucial for environment perception of driver assistance systems as well as autonomous cars. Key performance factors are a fine range resolution and the possibility to directly measure velocity. With a rising number of radar sensors and the so far unregulated automotive radar frequency band, mutual interference is inevitable and must be dealt with. Sensors must be capable of detecting, or even mitigating, the harmful effects of interference, which include a decreased detection sensitivity. In this paper, we evaluate a Convolutional Neural Network (CNN)-based approach for interference mitigation on real-world radar measurements. We combine real measurements with simulated interference in order to create input-output data suitable for training the model. We analyze the relation between performance and model complexity on simulated and measurement data, based on an extensive parameter search. Further, a finite sample size performance comparison shows the effectiveness of the model trained on either simulated or real data as well as for transfer learning. A comparative performance analysis with the state of the art emphasizes the potential of CNN-based models for interference mitigation and denoising of real-world measurements, also considering resource constraints of the hardware.

Translation date: 2021-05-22 20:32:51 Publication date: 2020-12-04

# Community detection using fast low-cardinality semidefinite programming ( http://arxiv.org/abs/2012.02676v1 )

License: see the linked page

Po-Wei Wang, J. Zico Kolter

Modularity maximization has been a fundamental tool for understanding the community structure of a network, but the underlying optimization problem is nonconvex and NP-hard to solve. State-of-the-art algorithms like the Louvain or Leiden methods focus on different heuristics to help escape local optima, but they still depend on a greedy step that moves node assignments locally and is prone to getting trapped. In this paper, we propose a new class of low-cardinality algorithms that generalizes the local update to maximize a semidefinite relaxation derived from max-k-cut. The proposed algorithm is scalable, empirically achieves the global semidefinite optimality for small cases, and outperforms the state-of-the-art algorithms on real-world datasets with little additional time cost. From the algorithmic perspective, it also opens a new avenue for scaling up semidefinite programming when the solutions are sparse instead of low-rank.

Translation date: 2021-05-22 20:32:20 Publication date: 2020-12-04
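
The community-detection entry above refers to modularity maximization without stating the objective. As a minimal sketch, and not the authors' algorithm, the following computes the standard Newman modularity of a given partition for an undirected, unweighted graph. The function name `modularity`, the toy graph `EDGES`, and the two partitions are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman modularity Q of an undirected, unweighted graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    community: dict mapping every node to its community label.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("graph has no edges")
    internal = defaultdict(int)    # edges with both endpoints in community c
    degree_sum = defaultdict(int)  # total degree of the nodes in community c
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    # Q = sum over communities of (internal edge fraction - expected fraction)
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


# Toy graph (an assumption): two triangles joined by one bridge edge.
EDGES = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
TWO_TRIANGLES = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
ONE_BLOCK = {n: "a" for n in range(6)}
```

Splitting the toy graph into its two triangles gives a positive modularity, while placing every node in one community gives zero. Greedy methods such as Louvain move single nodes to increase this quantity, which is the local step the abstract says can get trapped.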

# Nimble: Lightweight and Parallel GPU Task Scheduling for Deep Learning ( http://arxiv.org/abs/2012.02732v1 )

License: see the linked page

Woosuk Kwon, Gyeong-In Yu, Eunji Jeong, Byung-Gon Chun

Deep learning (DL) frameworks take advantage of GPUs to improve the speed of DL inference and training. Ideally, DL frameworks should be able to fully utilize the computation power of GPUs such that the running time depends on the amount of computation assigned to GPUs. Yet, we observe that in scheduling GPU tasks, existing DL frameworks suffer from inefficiencies such as large scheduling overhead and unnecessary serial execution. To this end, we propose Nimble, a DL execution engine that runs GPU tasks in parallel with minimal scheduling overhead. Nimble introduces a novel technique called ahead-of-time (AoT) scheduling. Here, the scheduling procedure finishes before executing the GPU kernel, thereby removing most of the scheduling overhead during run time. Furthermore, Nimble automatically parallelizes the execution of GPU tasks by exploiting multiple GPU streams in a single GPU. Evaluation on a variety of neural networks shows that compared to PyTorch, Nimble speeds up inference and training by up to 22.34$\times$ and 3.61$\times$, respectively. Moreover, Nimble outperforms state-of-the-art inference systems, TensorRT and TVM, by up to 2.81$\times$ and 1.70$\times$, respectively.

Translation date: 2021-05-22 20:32:01 Publication date: 2020-12-04

# Utilizing Concept Drift for Measuring the Effectiveness of Policy Interventions: The Case of the COVID-19 Pandemic ( http://arxiv.org/abs/2012.03728v1 )

License: see the linked page

Lucas Baier, Niklas Kühl, Jakob Schöffer, Gerhard Satzger

As a reaction to the high infectiousness and lethality of the COVID-19 virus, countries around the world have adopted drastic policy measures to contain the pandemic. However, it remains unclear what effect these measures, so-called non-pharmaceutical interventions (NPIs), have on the spread of the virus. In this article, we use machine learning and apply drift detection methods in a novel way to measure the effectiveness of policy interventions: we analyze the effect of NPIs on the development of daily case numbers of COVID-19 across 9 European countries and 28 US states. Our analysis shows that it takes more than two weeks on average until NPIs show a significant effect on the number of new cases. We then analyze how characteristics of each country or state, e.g., decisiveness regarding NPIs, climate or population density, influence the time lag until NPIs show their effectiveness. In our analysis, the timing of school closures in particular reveals a significant effect on the development of the pandemic. This information is crucial for policy makers confronted with difficult decisions to trade off strict containment of the virus with NPI relief.

Translation date: 2021-05-22 20:31:42 Publication date: 2020-12-04
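
The entry above applies drift detection to daily case numbers but does not name the detectors used. As a loosely related sketch, the following implements the Page-Hinkley test, a standard change detector that is swapped in here as an assumption and may differ from the paper's method. The function name `page_hinkley`, the parameters `delta` and `threshold`, and the synthetic series are hypothetical.

```python
def page_hinkley(values, delta=0.05, threshold=5.0):
    """Return the 0-based index at which an upward mean shift is flagged.

    Returns None if no drift is detected. `delta` is the tolerated
    deviation per step; `threshold` controls how much accumulated
    evidence is required before drift is reported.
    """
    mean = 0.0      # running mean of the observations seen so far
    cum = 0.0       # cumulative deviation from the running mean
    min_cum = 0.0   # smallest cumulative deviation seen so far
    for t, x in enumerate(values, start=1):
        mean += (x - mean) / t
        cum += x - mean - delta
        min_cum = min(min_cum, cum)
        if cum - min_cum > threshold:
            return t - 1
    return None


# Synthetic series (an assumption): the level jumps from 0 to 1 at index 20.
SHIFTED = [0.0] * 20 + [1.0] * 20
FLAT = [0.0] * 40
```

On the synthetic series the detector fires a few steps after the jump, which mirrors the idea of measuring the lag between an intervention and a detectable change in the case-number series. Detecting a decrease instead would require negating the series or tracking the maximum of the cumulative deviation.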

# Predicting Emotions Perceived from Sounds ( http://arxiv.org/abs/2012.02643v1 )

License: see the linked page

Faranak Abri, Luis Felipe Gutiérrez, Akbar Siami Namin, David R. W. Sears, Keith S. Jones

Sonification is the science of communicating data and events to users through sounds. Auditory icons, earcons, and speech are the common auditory display schemes utilized in sonification, or more specifically in the use of audio to convey information. Once the captured data are perceived, their meanings and, more importantly, intentions can be interpreted more easily, so sonification can be employed as a complement to visualization techniques. Through auditory perception it is possible to convey temporal, spatial, or other context-oriented information. An important research question is whether the emotions perceived from these auditory icons or earcons are predictable, so that an automated sonification platform can be built. This paper conducts an experiment in which several mainstream and conventional machine learning algorithms are developed to study the prediction of emotions perceived from sounds. To do so, the key features of sounds are captured and then modeled by machine learning algorithms combined with feature reduction techniques. We observe that it is possible to predict perceived emotions with high accuracy. In particular, regression based on Random Forest demonstrated its superiority over the other machine learning algorithms.

Translation date: 2021-05-22 20:31:22 Publication date: 2020-12-04

# Multimodal Privacy-preserving Mood Prediction from Mobile Data: A Preliminary Study ( http://arxiv.org/abs/2012.02359v1 )

License: see the linked page

Terrance Liu, Paul Pu Liang, Michal Muszynski, Ryo Ishii, David Brent, Randy Auerbach, Nicholas Allen, Louis-Philippe Morency

Mental health conditions remain under-diagnosed even in countries with common access to advanced medical care. The ability to accurately and efficiently predict mood from easily collectible data has several important implications for the early detection of, and intervention in, mental health disorders. One promising data source to help monitor human behavior is daily smartphone usage. However, care must be taken to summarize behaviors without identifying the user through personal (e.g., personally identifiable information) or protected attributes (e.g., race, gender). In this paper, we study behavioral markers of daily mood using a recent dataset of mobile behaviors from high-risk adolescent populations. Using computational models, we find that multimodal modeling of both text and app usage features is more predictive of daily mood than either modality alone. Furthermore, we evaluate approaches that reliably obfuscate user identity while remaining predictive of daily mood. By combining multimodal representations with privacy-preserving learning, we are able to push forward the performance-privacy frontier as compared to unimodal approaches.

Translation date: 2021-05-22 20:31:07 Publication date: 2020-12-04

# A Single-Cycle MLP Classifier Using Analog MRAM-based Neurons and Synapses ( http://arxiv.org/abs/2012.02695v1 )

License: see the linked page

Ramtin Zand

In this paper, spin-orbit torque (SOT) magnetoresistive random-access memory (MRAM) devices are leveraged to realize sigmoidal neurons and binarized synapses for a single-cycle analog in-memory computing (IMC) architecture. First, an analog SOT-MRAM-based neuron bitcell is proposed which achieves a 12x reduction in power-area product compared to the previous most power- and area-efficient analog sigmoidal neuron design. Next, the proposed neuron and synapse bitcells are used within memory subarrays to form an analog IMC-based multilayer perceptron (MLP) architecture for the MNIST pattern recognition application. The architecture-level results show that our analog IMC architecture achieves at least two and four orders of magnitude performance improvement compared to a mixed-signal analog/digital IMC architecture and a digital GPU implementation, respectively, while realizing a comparable classification accuracy.

Translation date: 2021-05-22 20:30:50 Publication date: 2020-12-04

# SensiX: A Platform for Collaborative Machine Learning on the Edge ( http://arxiv.org/abs/2012.06035v1 )

License: see the linked page

Chulhong Min, Akhil Mathur, Alessandro Montanari, Utku Gunay Acer, Fahim Kawsar

The emergence of multiple sensory devices on or near a human body is uncovering new dynamics of extreme edge computing. In this setting, a powerful and resource-rich edge device such as a smartphone or a Wi-Fi gateway is transformed into a personal edge, collaborating with multiple devices to offer remarkable sensory applications, while harnessing the power of locality, availability, and proximity. Naturally, this transformation pushes us to rethink how to construct accurate, robust, and efficient sensory systems at the personal edge. For instance, how do we build a reliable activity tracker with multiple on-body IMU-equipped devices? While the accuracy of sensing models is improving, their runtime performance still suffers, especially in these emerging multi-device, personal edge environments. Two prime caveats that impact their performance are device and data variabilities, contributed by several runtime factors, including device availability, data quality, and device placement. To this end, we present SensiX, a personal edge platform that sits between sensor data and sensing models, and ensures best-effort inference under any condition while coping with device and data variabilities without demanding model engineering. SensiX externalises model execution away from applications, and comprises two essential functions: a translation operator for principled mapping of device-to-device data, and a quality-aware selection operator to systematically choose the right execution path as a function of model accuracy. We report the design and implementation of SensiX and demonstrate its efficacy in developing motion and audio-based multi-device sensing systems. Our evaluation shows that SensiX offers a 7-13% increase in overall accuracy and up to a 30% increase across different environment dynamics at the expense of 3 mW of power overhead.

Translation date: 2021-05-22 20:30:35 Publication date: 2020-12-04

# Local Extreme Learning Machines and Domain Decomposition for Solving Linear and Nonlinear Partial Differential Equations ( http://arxiv.org/abs/2012.02895v1 )

License: see the linked page

Suchuan Dong, Zongwei Li

We present a neural network-based method for solving linear and nonlinear partial differential equations, by combining the ideas of extreme learning machines (ELM), domain decomposition and local neural networks. The field solution on each sub-domain is represented by a local feed-forward neural network, and $C^k$ continuity is imposed on the sub-domain boundaries. Each local neural network consists of a small number of hidden layers, while its last hidden layer can be wide. The weight/bias coefficients in all hidden layers of the local neural networks are pre-set to random values and are fixed, and only the weight coefficients in the output layers are training parameters. The overall neural network is trained by a linear or nonlinear least squares computation, not by back-propagation-type algorithms. We introduce a block time-marching scheme together with the presented method for long-time dynamic simulations. The current method exhibits a clear sense of convergence with respect to the degrees of freedom in the neural network. Its numerical errors typically decrease exponentially or nearly exponentially as the number of degrees of freedom increases. Extensive numerical experiments have been performed to demonstrate the computational performance of the presented method. We compare the current method with the deep Galerkin method (DGM) and the physics-informed neural network (PINN) in terms of accuracy and computational cost. The current method exhibits a clear superiority, with its numerical errors and network training time considerably smaller (typically by orders of magnitude) than those of DGM and PINN. We also compare the current method with the classical finite element method (FEM). The computational performance of the current method is on par with, and oftentimes exceeds, the FEM performance.

Translation date: 2021-05-22 20:30:05 Publication date: 2020-12-04
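
The training step described in the entry above (hidden-layer coefficients fixed at random values, only the output weights fitted by least squares) can be written out for a single sub-domain and a linear problem. This is a sketch under stated assumptions: the operator $\mathcal{L}$, source term $f$, boundary data $g$, collocation points $x_i$, boundary points $x_b$, and basis size $M$ are illustrative symbols and not notation confirmed by the abstract.

$$
u(x) \approx \sum_{j=1}^{M} \beta_j \, \phi_j(x)
$$

Here $\phi_j$ is the output of the $j$-th node of the last hidden layer, which is fixed once the random weights are drawn, and $\beta_j$ are the trainable output-layer weights. For a linear problem $\mathcal{L}u = f$ inside the sub-domain with $u = g$ on its boundary, enforcing the equation at collocation points gives conditions that are linear in $\beta$:

$$
\sum_{j=1}^{M} \beta_j \, (\mathcal{L}\phi_j)(x_i) = f(x_i), \quad i = 1, \dots, N, \qquad
\sum_{j=1}^{M} \beta_j \, \phi_j(x_b) = g(x_b).
$$

Stacking these rows into a matrix $H$ and right-hand side $r$, the output weights follow from one linear least-squares solve:

$$
\beta^{*} = \arg\min_{\beta} \, \lVert H\beta - r \rVert_2^2 .
$$

With several sub-domains, the $C^k$ continuity conditions across shared boundaries would add further rows that are also linear in the output weights. For a nonlinear equation the residual is no longer linear in $\beta$, which is where a nonlinear least-squares solver takes the place of the single linear solve.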

PDF registration status (publication date: 20201204)